What are the character codes in the cmap table of a TrueType font?

Time: 2018-09-18 08:54:59

Tags: unicode fonts character-codes

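Wondering what the "character codes" are in the cmap table of a TrueType font. Microsoft talks about the Character to Glyph Index Mapping Table, but I don't see what the character code or the glyph index means.

Wondering if there is an encoding specified somewhere in the font file, such as Unicode 11.0, so that the character codes equal the Unicode code points, such as U+0061. Or if the character code is the "browser" character code (which I guess is the decimal code), such as 97 for a.

Basically wondering how to map a keyboard character to a glyph, and what that actually means. I would think that you don't really want to map a keyboard code to a glyph, but rather a Unicode code such as U+0061 to a glyph, so that in JavaScript (for example) you could refer to code 97 and, if your font supports it, get a.

Trying to understand the structure of a font file from the perspective of how it maps a mathematical glyph, as a vector/path, to some kind of character or code.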

1 answer:

Answer 0 (score: 3)

The short, though perhaps not the desired, answer is of course "read the OpenType spec; it takes a while". A slightly longer, but easier and less detailed, answer would be http://pomax.github.io/CFF-glyphlet-fonts, although that skips over TTF, so let's look at that here:

Your input code gets run through whichever cmap subtable applies in the context where the font is being used. That subtable maps the computer's character code (ASCII code, Unicode code point, ISO-2022-jp, what have you) to a glyph id. For TTF specifically, that id is then used as an array index into the "loca" table, which is the "glyph index to data location" table and specifies the byte offset in the "glyf" table for each glyph that the font contains. You then consult the glyf table at that byte offset and start parsing the bytes as specified by https://docs.microsoft.com/en-us/typography/opentype/spec/glyf
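To make that chain a bit more concrete, here is a minimal Python sketch of the two lookups. It is not a font parser: `TOY_CMAP` and `TOY_LOCA` are made-up stand-ins for tables you would normally read out of the font file, and the function names are purely illustrative. The only parts taken from the spec are that "loca" stores big-endian offsets with one more entry than there are glyphs, that the short format stores each offset divided by 2, and that glyph id 0 is the missing glyph.

```python
import struct

# Hypothetical stand-in for a parsed Unicode cmap subtable: code point -> glyph id.
# A real font stores this in a binary subtable format rather than as a dict.
TOY_CMAP = {0x0061: 1, 0x0062: 2}


def glyph_id_for(char, cmap=TOY_CMAP):
    """Map a character to a glyph id; glyph 0 is the missing glyph (.notdef)."""
    return cmap.get(ord(char), 0)


def glyf_range(loca, glyph_id, index_to_loc_format):
    """Return (offset, length) of a glyph's data inside the 'glyf' table.

    loca: raw bytes of the 'loca' table (big-endian, numGlyphs + 1 entries).
    index_to_loc_format: 0 = short (uint16, stored as offset / 2),
                         1 = long (uint32, stored as the offset itself).
    """
    if index_to_loc_format == 0:
        start, end = struct.unpack_from(">2H", loca, glyph_id * 2)
        start, end = start * 2, end * 2
    else:
        start, end = struct.unpack_from(">2I", loca, glyph_id * 4)
    return start, end - start


# Made-up short-format 'loca' for three glyphs: .notdef, a, b.
TOY_LOCA = struct.pack(">4H", 0, 10, 30, 42)

gid = glyph_id_for("a")                        # U+0061, i.e. decimal 97
offset, length = glyf_range(TOY_LOCA, gid, 0)
print(gid, offset, length)
```

Note that `ord("a")` is 97, which is the same number as 0x61: "U+0061" and "97" are two spellings of one code point, so for a Unicode cmap subtable there is no separate "browser" code to worry about.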