2.3 Language support, fonts and character sets
The PDF File Writer library supports most of the fonts installed on your computer. The only exception is device fonts. The supported fonts follow the OpenType font specifications. More information is available at Microsoft Typography - OpenType Specification.

The text to be drawn is stored in a String made of Unicode characters. The library accepts any character code from 0 to 65535 except control codes 0 to 31 and 128 to 159. Every character is translated into a glyph. The glyphs are drawn on the page left to right, in the same order as they are stored in the string.

Most font files support only a subset of all possible Unicode characters. In other words, you must select a font that supports the language of your project or the symbols you are trying to display. If the input String contains characters that the font does not support, the PDF reader will display the "undefined glyph", normally a small rectangle. The test program attached to this article has a "Font Families" button. If you click it, you can see all available fonts on your computer and, within each font, all available characters.

If the language of your project is a left to right language, each character is translated into one glyph, and the glyph is defined in the font, the result should be what you expect. If the result is not what you expect, here are some additional notes:
Unicode control characters. Unicode control characters are used to control the interpretation or display of text, but these characters themselves have no visual or spatial representation. The PDF File Writer does not identify these characters. The library assumes that every character is a display character, so they will be displayed as the undefined glyph.
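One way to avoid both the rejected control codes (0 to 31 and 128 to 159) and the invisible formatting characters is to filter the string before handing it to the library. The following is a minimal sketch, not part of the library: the class and method names are hypothetical, and it assumes that silently dropping these characters is acceptable for your text. It treats Unicode category Cf (for example the left-to-right mark U+200E) as the formatting characters in question.

```csharp
using System.Globalization;
using System.Text;

// Hypothetical helper, not part of PDF File Writer.
public static class PdfTextFilter
{
    // Control codes the article says the library does not accept.
    public static bool IsRejectedControlCode(char c)
    {
        return c <= 31 || (c >= 128 && c <= 159);
    }

    // Unicode "format" characters (category Cf) have no glyph of their own.
    public static bool IsFormatChar(char c)
    {
        return char.GetUnicodeCategory(c) == UnicodeCategory.Format;
    }

    // Returns a copy of the text without rejected control codes
    // and without format characters.
    public static string Clean(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (!IsRejectedControlCode(c) && !IsFormatChar(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

Note that this also removes tabs and line breaks, so split the text into lines first if your layout depends on them.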
Right to left language. Normally the order of characters in a text string is the order a person would read them. Since the library draws left to right, right to left text will be written backwards. The ReverseString method reverses the character order. This solves the problem if the text is made only of right to left characters. If the text is a mix of right to left, left to right, numbers and some characters such as brackets ()[]<>{}, it will not produce the desired results. Another limitation is that the TextBox class cannot break long right to left text into lines.
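To illustrate what reversing the character order does, and why it breaks down for mixed text, here is a small standalone sketch. `RtlHelper.Reverse` is a hypothetical stand-in written for this example; it is not the library's ReverseString method, whose exact signature is not shown in this section.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class RtlHelper
{
    // Reverses the order of the UTF-16 characters in the text.
    public static string Reverse(string text)
    {
        char[] chars = text.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }
}
```

For a purely Hebrew word such as "\u05E9\u05DC\u05D5\u05DD" the reversed string draws correctly left to right. For mixed input such as "\u05D0\u05D1 12" the result is "21 \u05D1\u05D0": the digits are reversed as well, which is the limitation described above.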
Ligature. In some languages a sequence of two or more characters is grouped together to display one glyph. Your software can identify these sequences and replace them with the proper glyph.
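A substitution of this kind can be sketched with a simple lookup table. The example below uses the Latin ligatures from the Unicode Alphabetic Presentation Forms block (U+FB00 to U+FB04) only because they are easy to read; the class name and the table are assumptions for illustration, and the approach only works if the selected font actually defines glyphs for the replacement characters.

```csharp
// Hypothetical helper, not part of PDF File Writer.
public static class LigatureHelper
{
    // Longer sequences come first so that "ffi" is not consumed as "ff" + "i".
    private static readonly string[,] Table =
    {
        { "ffi", "\uFB03" },
        { "ffl", "\uFB04" },
        { "ff",  "\uFB00" },
        { "fi",  "\uFB01" },
        { "fl",  "\uFB02" }
    };

    // Replaces each character sequence in the table with its single
    // ligature character, in table order.
    public static string ApplyLigatures(string text)
    {
        for (int i = 0; i < Table.GetLength(0); i++)
        {
            text = text.Replace(Table[i, 0], Table[i, 1]);
        }
        return text;
    }
}
```

The same pattern applies to other scripts: fill the table with the sequences and replacement characters that your language and font require.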
Dotted circle. If you look at the Glyph column of the Glyph Metrics screen, you can see that some glyphs have a small dotted circle (e.g. character codes 2364 and 2367). These characters are part of a sequence of characters. The dotted circle is not displayed. If the advance width is zero and the bounding box is on the left side of the Y axis, this glyph will be drawn correctly. It will be displayed on top of the previous character. If the advance width is not zero, this glyph should be displayed before the previous character. Your software can achieve this by reversing the two characters.
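The character swap described in this note can be sketched as follows. The helper is hypothetical and does not read font metrics itself: the caller supplies a `mustPrecede` test that returns true for a dotted-circle glyph whose advance width is not zero. Which characters qualify depends on the font, so the usage below simply assumes that character code 2367 (U+093F) is one of them.

```csharp
using System;

// Hypothetical helper, not part of PDF File Writer.
public static class CombiningReorder
{
    // Swaps every character accepted by mustPrecede with the character
    // before it, so that it is drawn first when drawing left to right.
    public static string MoveBeforePrevious(string text, Func<char, bool> mustPrecede)
    {
        char[] chars = text.ToCharArray();
        for (int i = 1; i < chars.Length; i++)
        {
            if (mustPrecede(chars[i]))
            {
                char previous = chars[i - 1];
                chars[i - 1] = chars[i];
                chars[i] = previous;
            }
        }
        return new string(chars);
    }
}
```

For example, `CombiningReorder.MoveBeforePrevious("\u0915\u093F", c => c == '\u093F')` returns "\u093F\u0915", placing character 2367 ahead of the character it follows in the input. Characters with zero advance width, such as 2364 in the note above, need no reordering and should not be matched by the test.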
This page is a copy from https://www.codeproject.com/Articles/570682/PDF-File-Writer-Csharp-Class-Library by Uzi Granot. The article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL). All rights to the texts and source code remain with Uzi Granot.