As a customer, I would like a function similar to ImGearPDEFont.GetOneByteEncoding(), but one that works with multi-byte character encodings.
A Basic Example:
The ImGearPDEText class may return a character with code 000 (a non-printable control character), but ImGearPDEFont.GetOneByteEncoding() lets me look up how this font stores its characters (which can vary from document to document) and map that code to the correct character, code 065 or 'A'.
A very similar thing happens with multi-byte fonts. Inside one of my sample PDFs I can see that there is a lookup table for a multi-byte font. The table looks like this:
/CIDInit /ProcSet findresource begin
12 dict begin
begincmap
/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def
/CMapName /Adobe-Identity-UCS def
/CMapType 2 def
1 begincodespacerange
<0000> <FFFF>
endcodespacerange
2 beginbfchar
<0003> <0020>
<043E> <2212>
endbfchar
endcmap
CMapName currentdict /CMap defineresource pop
end
end
This defines a few things, but the important one is that it provides mappings between how each character is stored and what its real character code is (e.g., the table above defines that code 043E maps to 2212, which is U+2212, the minus sign, and that code 0003 maps to 0020, a space).
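To illustrate the kind of lookup I am asking for, here is a minimal Python sketch that reads the bfchar entries from a table like the one above and uses them to translate stored codes into real characters. It is not ImageGear code: the names parse_bfchar and decode_codes are hypothetical, it assumes the destination values are UTF-16BE hex strings, and it ignores bfrange sections and the codespace range, which a complete implementation would also need to handle.

```python
import re

# The lookup table from the sample PDF, copied as text.
CMAP = """
/CIDInit /ProcSet findresource begin 12 dict begin begincmap
/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def
/CMapName /Adobe-Identity-UCS def /CMapType 2 def
1 begincodespacerange <0000> <FFFF> endcodespacerange
2 beginbfchar <0003> <0020> <043E> <2212> endbfchar
endcmap CMapName currentdict /CMap defineresource pop end end
"""

def parse_bfchar(cmap_text):
    """Return {stored code: real character(s)} from all bfchar sections."""
    mapping = {}
    # Each bfchar section holds pairs of <source code> <destination>.
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.DOTALL):
        for src, dst in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", block):
            # Assumption: the destination is UTF-16BE encoded text.
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping

def decode_codes(codes, mapping):
    """Translate stored codes to text; unknown codes become U+FFFD."""
    return "".join(mapping.get(code, "\ufffd") for code in codes)

mapping = parse_bfchar(CMAP)
text = decode_codes([0x043E, 0x0003, 0x043E], mapping)
print([f"U+{ord(ch):04X}" for ch in text])
```

A function on ImGearPDEFont that returned an equivalent mapping for multi-byte fonts would let me do the same translation on the codes that ImGearPDEText returns.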