December 19, 2017 ryanjuckett No comments

Improving the Font Pipeline

This post will be a bit different than my normal technical posts. Rather than covering a successful finished system, this will be a glimpse into the process of (hopefully) getting to that point in the future.

With INVERSUS finally out on all platforms, I can clean up some of the engine’s rough edges, and one of the bigger pain points has been the font pipeline. When it comes to refactoring workflows and engine architecture, I think it’s important to find tractable small steps you can take on the path towards your desired goal. I have a fuzzy picture of how I’d like things to be, and I make small improvements while at the same time getting a better focus on the destination.

Problems Everywhere

In my experience, the complexity of properly rendering text is an underappreciated problem by engineers that have never worked on it. It seems so simple! There are really only three steps:

  • Choose the next glyph to draw
  • Choose where to draw the glyph
  • Draw the glyph

So which of those steps is the hard one? Unfortunately, if you want to render crisp text in multiple languages across the world, all of them are the hard one.

If you want to do everything right, you need to consider the following (and I wouldn’t call this an exhaustive list):

  • Choose the next glyph to draw
    • To get anywhere, you’re going to want a way to convert your text into Unicode code points.
      • Your best bet is probably storing it all encoded in UTF-8.
      • Getting this far is probably the simplest thing you’ll find mentioned here.
    • How are you storing the mapping from code point to glyph?
      • Did you know that a full CJK font might have 65,535 glyphs in it?
      • Are you loading all of this data into memory at once or streaming sub sections at a time?
      • Do you want to strip out any glyph data not related to text actually used by your game?
        • Do you have any user generated text? Is there in game chat?
        • Are you aware that multiple platforms allow users to put just about any Unicode character in their usernames? Are you showing usernames on leaderboards or during multiplayer?
        • Are you okay with having some of this stuff just look wrong and falling back to solid boxes for unsupported glyphs?
    • Don’t forget about ligatures because multiple characters might actually be represented by a single glyph.
    • Were you aware that different languages can need different glyphs for the same code point? That means that when changing from Chinese to Japanese you’ll probably want to swap to a different (but still giant) CJK font that renders adapted Chinese characters appropriately for a Japanese audience.
  • Choose where to draw the glyph
    • You’re probably aware of pairwise kerning and it’s the simplest step to take.
      • Find the right compromise between memory efficiency and computation efficiency when maping from a pair of Unicode code points to a kerning value.
    • Are you aware that modern fonts support positioning characters relative to other characters in both X and Y dimensions?
    • Because you are rendering your font to a pixel display, should you add support for hinting?
  • Draw the glyph
    • Do you want to render the vector glyphs into a texture atlas at build-time?
      • This will simplify rendering the glyph textures at runtime.
      • Do you have enough memory to store all the glyphs you need?
        • Is a single texture even big enough? Multiple textures means more draw calls.
      • How will you support multiple sizes?
        • Scaling the glyphs down with mip maps won’t look that terrible, but it’s certainly not optimal.
        • Scaling them up will be a blurry mess, but you could try to mitigate it by rendering signed distance fields of the glyphs.
        • Would it be better to just render the font out at multiple sizes and select the best one at runtime?
      • How are you going to handle sub-pixel positioning? Whether you are scrolling text on screen, projecting it in 3d space or just supporting basic kerning, you will probably want to place a glyph at a sub-pixel location.
    • Do you want to render the vector glyphs into a texture atlas at runtime?
      • You can save memory by only rendering the glyphs currently used on screen.
      • You will need to worry about how much time you have to render the glyphs. Did opening the leaderboard screen require you to render 200 new Chinese characters before drawing the current frame?
      • How does your caching system even function?
        • Is it going to split glyphs across multiple textures creating more draw calls?
        • How well does it handle large and small fonts?
        • How do old glyphs get discarded from the cache?
        • When is it safe to write to your font cache textures?
        • If text is scaling dynamically, does that constantly update the cache or do you just deal with all the artifacts I listed above about scaling font textures?
        • Are you caching glyphs with the appropriate sub-pixel rendering offsets or just letting them smear under the texture filtering?
    • Do you want to render directly from the vector data?
      • This is going to fix almost all of the problems above and create a whole set of new ones.
      • You need to pack your vector data in a manner that you can interpret on the GPU.
      • The glyph shader is going to be more complicated and more expensive. You have gone from stamping textures to evaluating pixel coverage of (potentially complex) vector shapes.
      • That said, if done right, the results will look great.

As far as I’m aware, somewhere near zero games do everything right. I certainly don’t.

How Things Were

The font system for INVERSUS started very simple and then had features tacked on to better accommodate the 10 supported languages: English, French, Italian, German, Brazilian Portuguese, Russian, Simplified Chinese, Japanese, and Korean.

A game font is specified in text files using the C-like format supported by my data reflection system. Font definitions contain sets of font sources assigned to specific languages and a default set for any unspecified languages.  Each set of sources is a prioritized list of font assets to extract glyphs from. This could allow for Emoji to come from one font, English glyphs from another font and esoteric punctuation glyphs from a third. After building the data, everything is collapsed into a binary format for fast glyph and kerning queries at runtime.

The simplest font in the game is the one used by the big numbers on the score bar. It has no custom language support, only uses a single font asset, and only contains a small number of glyphs. The font definition looks like this:

It is requesting that glyphs be pulled from a file named “Fonts/ScoreBar/ScoreBar.fnt”. If you’re wondering why the extension, m_assetType, is split from the path, m_assetName, it’s just some historical garbage that I never got around to cleaning it up.

FNT files are a format output by a free program called BMFont that generates texture atlases from TTF and OTF fonts. I would use the program to select input fonts, line heights and glyphs to generate TGA images and FNT files from. The FNT file describes the glyph atlas locations and rendering instructions. The ScoreBar.fnt file only encodes a glyphs for ” 0123456789’+,.x” and looks like so:

Interfacing with the BMFont tool was one of the most cumbersome parts of the font workflow. There are numerous ways this could be automated/improved, but it never became a high enough priority. The ScoreBar font didn’t really expose these workflow problems, though. The issues would appear with the fonts that rendered localized text. Every time a new batch of translations came in, I needed to make sure that the textures contained any new glyphs. For example, rather than exporting every Chinese character by default, I would just export the ones used by the game. This involved manually copying the appropriate column of text from an Excel spreadsheet into a text file and then loading up all Chinese localized fonts in BMFont where I could tell them to select glyphs from my text file. Finally I saved the BMFont configuration file, and exported the font atlas with FNT file. This would also be performed for Japanese and Korean and involved lots of praying that I didn’t screw it up because validating it all worked in game was not fun either.

Here is the more interesting definition for the font used by localized menu items:

At the top of the file you’ll notice that it lists five sources.

  • The English source encodes every character supported by the primary font of the game, Glacial Indifference (basically the ASCII set).
  • The three CJK fonts point to atlases baked from the appropriate version of Noto Sans and the respective translated game text.
    • These are rendered out at a larger font height than Glacial Indifference to make the actual English glyph sizes match up between the two. During the compilation step, fonts of different heights are all aligned at the glyph base line. This might actually make the resulting game font height larger than that of any sources (maximal extent below base line to maximal extent above base line across all sources).
    • The Japanese source also includes a full set of hiragana and katakana as those are likely to appear in user names.
    • The Simplified Chinese source also pulls in the full set of characters used by Russian, English, Spanish, etc because it functions as the fallback for all characters missing from other fonts.
  • The Custom source points to a file of a different asset type: rj_glyph_set. This is a proprietary (and unfortunately hand edited) file format for specifying an atlas of custom glyphs made specifically for the game (e.g. button icons).

These sources are then placed in prioritized list (lowest to highest) for the different languages. For Japanese (language code “ja”), it searches the custom glyph source first and the Chinese source last. The “source_name : {}” syntax is just telling an empty structure to overlay on top of the “source_name” structure data. Because these are reflected arrays of structures and not arrays of pointers, this is a hacky way for me to share data across each array.

Initially, the list was simpler and only contained three sources,  Japanese, English and Custom. After launch, I realized that meant some usernames with Korean and Chinese characters which were supported by other languages might not render at all when viewing the Japanese language. In an early patch, I fixed this by inserting the other language fonts as lower priority fallbacks. Optimally those glyphs would be rendered with any tweaks from the Japanese font (if there even are any), but having something show up at all was far better than nothing.

Once all is said and done, the compiled font has separate glyph and kerning lookup tables for Japanese, Korean and Default. When looking up a glyph, it points you to the appropriate texture atlas – the above font would create 5 texture atlases. Having glyphs potentially bounce back and forth between textures at runtime is a bit of a disaster, but for INVERSUS it was shippable.

How Things Are Now

There are a million things I’d like to improve, but as a first step I wanted to focus on the build pipeline. More specifically, I wanted my tools to take TTF and OTF files as inputs instead of FNT files. I also intentionally decided to keep my changes focused on the transformation from source data to the binary format loaded by the game. If I could keep the binary format unchanged (and I mostly did), I would have a nice test bed because everything at runtime remains untouched.

One option would be to run the BMFont tool from the command line as part of my build precess, but I wasn’t a fan of that for a few reasons.

  • I’d like to eventually support all glyphs which means eventually moving towards some form of vector to bitmap conversion at runtime. Keeping that step locked behind a closed source tool doesn’t help.
  • There is a lot of font positioning data that gets lost in the BMFont tool and I would like to someday support better positioning above and beyond pairwise kerning.
  • Running a separate process for BMFont from within my font data compiler is a bit messy. It creates a great spot for things to fail in a way that’s hard to debug, and isn’t the best for performance.

Instead, I decided to render and pack my own texture atlas from the source vector data. Because I would be generating my own atlas, this also led to packing the glyphs from all sources into a single atlas!

In order to render the glyphs a few options came to mind.

  • Use Windows to render the fonts with GDI or some more modern interface.
    • Currently, my whole tool chain can technically run on any platform. In practice, I only develop on Windows, but this didn’t feel like a great reason to lock out the possibility of actually building data from a Mac in the future. I also wouldn’t want to port the font rasterizer to use different libraries (and thus get different results) when run on different platforms.
  • Use FreeType to render the fonts
    • FreeType is a huge dependency to add to the code base. I tend to keep my stuff relatively simple and don’t want to be wrangling something that massive.
    • FreeType also has a licensing clause about mentioning it in closed binary builds that use it. For the tool that’s not a big deal, but if I do end up moving this stuff to runtime in the future, it’s an annoyance I’d rather not bring into every project. I don’t currently have any legal baggage for crediting libraries in the runtime engine. INVERSUS does credit the one external library it uses, but I like that it can be forgotten if I’m ever making some dumb little app or were to open source the code base.
  • Use stb_truetype to render the fonts.
    • This is only a single file so it’s not unwieldy.
    • The license is very permissive (public domain).
    • While some of the code can be a bit cryptic, the logic is generally simple enough that I could extend it without going insane.

I ended up choosing stb_truetype and got rid of all my old code for parsing the FNT files. I also updated my font definition format to allow specifying some new fields per source:

  • Line Height is used to convert the vector fonts into pixel data.
    • This is currently ignored by the custom texture atlas source because scaling the bitmaps didn’t seem like a great idea. Were I to store the source image as a vector art file instead of a PSD, I think this would be worth doing.
  • Character Set specifies the set of glyphs that should be extracted from the font source (assuming they weren’t already found in a higher priority source). The set is built from a combination of multiple representations and each source can use whatever is most convenient.
    • Inline string data
    • Referenced text file data
    • Code point ranges
    • Code point list

Let’s take a look at the new menu font file! You might also notice that I’ve reversed the prioritization order of source lists – highest priority items are now at the top.

The combined set of glyphs used by each language gets packed into a single texture!

Unfortunately, it’s not perfect. I avoid packing the same glyph twice (e.g. if it is shared across languages) by checking if anything was already packed from the same source and code point. However, I don’t detect cases where the same pixel data is rendered from multiple sources. This can easily happen for characters that don’t change between the three CJK fonts. You can see an example in the top left of the atlas where the first two glyphs are identical. While this should be fixed someday, it is not the highest priority issue and once I start running out of texture space, I’ll do something about it. In the meantime, there is a much higher priority issue to worry about: none of the INVERSUS fonts actually work properly with stb_truetype.

Initially, I had an issue with the English font, Glacial Indifference, and I sort of glossed over it earlier. The font is distributed as an OTF file, but if you look in the above definition, I reference it as a TTF file. OTF files are a superset of TTF (TrueType) that supports additional features like CFF (Compressed Font Format) and advanced glyph positioning data. If you can’t guess from the name, stb_truetype was initially built for TrueType support. OpenType support for CFF was later added by a contributor and what ever parts are missing are the parts needed for Glacial Indifference to render glyphs. The Noto fonts are also CFF based, but their glyphs have rendered fine. Luckily, I was able to use an online font converter to create a TTF Glacial Indifference and life was good – or so it seemed.

I came to learn that none of fonts were getting kerning information. For Glacial Indifference, I assume it was lost in the OTF -> TTF conversion. For the Noto fonts, I discovered that stb_truetype only supports kerning for TTF encoded fonts. While TrueType fonts use a ‘kern’ table, OpenType supports a GPOS (Glyph Positioning) table which is more feature rich, but also more complicated. GPOS is where you can find all of the sophisticated 2D glyph positioning instructions, but can also encode simple pairwise kerning. At this point, I either need to go with a different font interpreter or make some improvements to stb_truetype. The later option seems more in line with my goals.

I read through the GPOS docs and it seems reasonable to parse. My current hope is that extracting the pairwise kerning data at minimum will get the Noto fonts working. If I’m lucky, that will also put me in a more comfortable place for figuring out why the OTF version of Glacial Indifference wouldn’t render. Wish me luck!

How Things Should Be

Once I get basic kerning working again, I’ll probably move onto tasks unrelated to fonts, but I’ve got a better picture of where I want the font architecture to head.

  • The next large goal is to building the font atlas on demand at runtime.
    • The actual implementation of the cache is up in the air, but the details aren’t important until I can actually start writing it.
  • I’d like to split stb_truetype into two parts.
    • The TTF/OTF parser should output to an intermediate format that is designed for faster lookup and rendering of code points. This is the format that gets stored to disk.
    • The glyph renderer should operate on the accelerated intermediate format loaded from disk instead of the actual source font files.
    • If I later moved to font rendering on the GPU this architectural split would still hold, but with a modified intermediate format.
  • I’d like to stick with the font merging system that I’m currently using to allow support for a robust CJK font (or at least a much larger portion of one) while also overriding a subset of glyphs from a smaller stylized font.
  • It sounds like the robust positioning support provided by the GPOS table could be transformed into a format more appropriate for real time evaluation. At minimum, pairwise 2D data could be stored instead of 1D. Even better might be to store some sort of Trie per relevant code point for evaluating the rules involving three or more characters.
    • The alternative is to use HarfBuzz but that is another giant dependency and my current guess (and it is very much a guess and not a well informed statement) is that it wouldn’t be terribly hard to find a better middle ground on complexity, performance and compatibility.

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *