Tamil character tables

This document lists the per-character shaping information needed to shape Tamil text.

Table of Contents

Tamil character table
Tamil Supplement character table
Grantha marks character table
Vedic Extensions character table
Miscellaneous character table

Tamil character table

Tamil glyphs should be classified as in the following table. Codepoints in the Tamil block with no assigned meaning are designated as unassigned in the Unicode category column.

Assigned codepoints with a null in the Shaping class column evoke no special behavior from the shaping engine. Note that this includes some valid codepoints, such as currency marks, punctuation, and other symbols.

Note: the NUMBER and SYMBOL Shaping classes are important during syllable identification, but generally evoke no further special behavior during the rest of the shaping process.

The Mark-placement subclass column indicates mark-placement positioning for codepoints in the Mark category. Assigned, non-mark codepoints have a null in this column and evoke no special mark-placement behavior. Marks tagged with [Mn] in the Unicode category column are categorized as non-spacing; marks tagged with [Mc] are categorized as spacing-combining.
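
As a rough cross-check, the [Mn] and [Mc] tags can be compared against the General Category reported by a Unicode library. The sketch below uses Python's standard `unicodedata` module; the helper name `mark_kind` is hypothetical and is not part of any shaping engine.

```python
import unicodedata
from typing import Optional


def mark_kind(ch: str) -> Optional[str]:
    """Return the mark kind for a single character, or None for non-marks."""
    category = unicodedata.category(ch)
    if category == "Mn":
        return "non-spacing"
    if category == "Mc":
        return "spacing-combining"
    return None  # enclosing marks (Me) and all non-mark characters land here


print(mark_kind("\u0BCD"))  # Tamil virama
print(mark_kind("\u0BBE"))  # Tamil vowel sign Aa
print(mark_kind("\u0B95"))  # Tamil letter Ka
```

The result depends on the Unicode version bundled with the Python interpreter, so very recent additions may be reported differently on older installations.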

Some codepoints in the following table use a Shaping class that differs from the codepoint's Unicode General Category. The Shaping class takes precedence during OpenType shaping, as it captures more specific, script-aware behavior.

Codepoint	Unicode category	Shaping class	Mark-placement subclass	Glyph
`U+0B80`	unassigned
`U+0B81`	unassigned
`U+0B82`	Mark [Mn]	BINDU	TOP_POSITION	ஂ Anusvara
`U+0B83`	Letter	MODIFYING_LETTER	null	ஃ Visarga
`U+0B84`	unassigned
`U+0B85`	Letter	VOWEL_INDEPENDENT	null	அ A
`U+0B86`	Letter	VOWEL_INDEPENDENT	null	ஆ Aa
`U+0B87`	Letter	VOWEL_INDEPENDENT	null	இ I
`U+0B88`	Letter	VOWEL_INDEPENDENT	null	ஈ Ii
`U+0B89`	Letter	VOWEL_INDEPENDENT	null	உ U
`U+0B8A`	Letter	VOWEL_INDEPENDENT	null	ஊ Uu
`U+0B8B`	unassigned
`U+0B8C`	unassigned
`U+0B8D`	unassigned
`U+0B8E`	Letter	VOWEL_INDEPENDENT	null	எ E
`U+0B8F`	Letter	VOWEL_INDEPENDENT	null	ஏ Ee

`U+0B90`	Letter	VOWEL_INDEPENDENT	null	ஐ Ai
`U+0B91`	unassigned
`U+0B92`	Letter	VOWEL_INDEPENDENT	null	ஒ O
`U+0B93`	Letter	VOWEL_INDEPENDENT	null	ஓ Oo
`U+0B94`	Letter	VOWEL_INDEPENDENT	null	ஔ Au
`U+0B95`	Letter	CONSONANT	null	க Ka
`U+0B96`	unassigned
`U+0B97`	unassigned
`U+0B98`	unassigned
`U+0B99`	Letter	CONSONANT	null	ங Nga
`U+0B9A`	Letter	CONSONANT	null	ச Ca
`U+0B9B`	unassigned
`U+0B9C`	Letter	CONSONANT	null	ஜ Ja
`U+0B9D`	unassigned
`U+0B9E`	Letter	CONSONANT	null	ஞ Nya
`U+0B9F`	Letter	CONSONANT	null	ட Tta

`U+0BA0`	unassigned
`U+0BA1`	unassigned
`U+0BA2`	unassigned
`U+0BA3`	Letter	CONSONANT	null	ண Nna
`U+0BA4`	Letter	CONSONANT	null	த Ta
`U+0BA5`	unassigned
`U+0BA6`	unassigned
`U+0BA7`	unassigned
`U+0BA8`	Letter	CONSONANT	null	ந Na
`U+0BA9`	Letter	CONSONANT	null	ன Nnna
`U+0BAA`	Letter	CONSONANT	null	ப Pa
`U+0BAB`	unassigned
`U+0BAC`	unassigned
`U+0BAD`	unassigned
`U+0BAE`	Letter	CONSONANT	null	ம Ma
`U+0BAF`	Letter	CONSONANT	null	ய Ya

`U+0BB0`	Letter	CONSONANT	null	ர Ra
`U+0BB1`	Letter	CONSONANT	null	ற Rra
`U+0BB2`	Letter	CONSONANT	null	ல La
`U+0BB3`	Letter	CONSONANT	null	ள Lla
`U+0BB4`	Letter	CONSONANT	null	ழ Llla
`U+0BB5`	Letter	CONSONANT	null	வ Va
`U+0BB6`	Letter	CONSONANT	null	ஶ Sha
`U+0BB7`	Letter	CONSONANT	null	ஷ Ssa
`U+0BB8`	Letter	CONSONANT	null	ஸ Sa
`U+0BB9`	Letter	CONSONANT	null	ஹ Ha
`U+0BBA`	unassigned
`U+0BBB`	unassigned
`U+0BBC`	unassigned
`U+0BBD`	unassigned
`U+0BBE`	Mark [Mc]	VOWEL_DEPENDENT	RIGHT_POSITION	ா Sign Aa
`U+0BBF`	Mark [Mc]	VOWEL_DEPENDENT	RIGHT_POSITION	ி Sign I

`U+0BC0`	Mark [Mn]	VOWEL_DEPENDENT	TOP_POSITION	ீ Sign Ii
`U+0BC1`	Mark [Mc]	VOWEL_DEPENDENT	RIGHT_POSITION	ு Sign U
`U+0BC2`	Mark [Mc]	VOWEL_DEPENDENT	RIGHT_POSITION	ூ Sign Uu
`U+0BC3`	unassigned
`U+0BC4`	unassigned
`U+0BC5`	unassigned
`U+0BC6`	Mark [Mc]	VOWEL_DEPENDENT	LEFT_POSITION	ெ Sign E
`U+0BC7`	Mark [Mc]	VOWEL_DEPENDENT	LEFT_POSITION	ே Sign Ee
`U+0BC8`	Mark [Mc]	VOWEL_DEPENDENT	LEFT_POSITION	ை Sign Ai
`U+0BC9`	unassigned
`U+0BCA`	Mark [Mc]	VOWEL_DEPENDENT	LEFT_AND_RIGHT_POSITION	ொ Sign O
`U+0BCB`	Mark [Mc]	VOWEL_DEPENDENT	LEFT_AND_RIGHT_POSITION	ோ Sign Oo
`U+0BCC`	Mark [Mc]	VOWEL_DEPENDENT	LEFT_AND_RIGHT_POSITION	ௌ Sign Au
`U+0BCD`	Mark [Mn]	VIRAMA	TOP_POSITION	் Virama
`U+0BCE`	unassigned
`U+0BCF`	unassigned

`U+0BD0`	Letter	null	null	ௐ Om
`U+0BD1`	unassigned
`U+0BD2`	unassigned
`U+0BD3`	unassigned
`U+0BD4`	unassigned
`U+0BD5`	unassigned
`U+0BD6`	unassigned
`U+0BD7`	Mark [Mc]	VOWEL_DEPENDENT	RIGHT_POSITION	ௗ Au Length Mark
`U+0BD8`	unassigned
`U+0BD9`	unassigned
`U+0BDA`	unassigned
`U+0BDB`	unassigned
`U+0BDC`	unassigned
`U+0BDD`	unassigned
`U+0BDE`	unassigned
`U+0BDF`	unassigned

`U+0BE0`	unassigned
`U+0BE1`	unassigned
`U+0BE2`	unassigned
`U+0BE3`	unassigned
`U+0BE4`	unassigned
`U+0BE5`	unassigned
`U+0BE6`	Number	NUMBER	null	௦ Digit Zero
`U+0BE7`	Number	NUMBER	null	௧ Digit One
`U+0BE8`	Number	NUMBER	null	௨ Digit Two
`U+0BE9`	Number	NUMBER	null	௩ Digit Three
`U+0BEA`	Number	NUMBER	null	௪ Digit Four
`U+0BEB`	Number	NUMBER	null	௫ Digit Five
`U+0BEC`	Number	NUMBER	null	௬ Digit Six
`U+0BED`	Number	NUMBER	null	௭ Digit Seven
`U+0BEE`	Number	NUMBER	null	௮ Digit Eight
`U+0BEF`	Number	NUMBER	null	௯ Digit Nine

`U+0BF0`	Number	NUMBER	null	௰ Number Ten
`U+0BF1`	Number	NUMBER	null	௱ Number One Hundred
`U+0BF2`	Number	NUMBER	null	௲ Number One Thousand
`U+0BF3`	Symbol	SYMBOL	null	௳ Day Sign
`U+0BF4`	Symbol	SYMBOL	null	௴ Month Sign
`U+0BF5`	Symbol	SYMBOL	null	௵ Year Sign
`U+0BF6`	Symbol	SYMBOL	null	௶ Debit Sign
`U+0BF7`	Symbol	SYMBOL	null	௷ Credit Sign
`U+0BF8`	Symbol	SYMBOL	null	௸ As Above Sign
`U+0BF9`	Symbol	SYMBOL	null	௹ Tamil Rupee Sign
`U+0BFA`	Symbol	SYMBOL	null	௺ Number Sign
`U+0BFB`	unassigned
`U+0BFC`	unassigned
`U+0BFD`	unassigned
`U+0BFE`	unassigned
`U+0BFF`	unassigned
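
One possible way to encode the table above is as a short list of codepoint ranges. The following is a minimal sketch, not a reference implementation: the names `ShapingInfo`, `TAMIL_RANGES`, and `classify` are hypothetical, and unassigned codepoints are detected with Python's `unicodedata` module rather than being listed row by row. Codepoints covered by the other tables in this document are not handled here.

```python
import unicodedata
from typing import NamedTuple, Optional


class ShapingInfo(NamedTuple):
    shaping_class: Optional[str]
    mark_placement: Optional[str]


# (first, last, shaping class, mark-placement subclass), transcribed from the
# Tamil table. Ranges may span unassigned codepoints; classify() filters those.
TAMIL_RANGES = [
    (0x0B82, 0x0B82, "BINDU", "TOP_POSITION"),
    (0x0B83, 0x0B83, "MODIFYING_LETTER", None),
    (0x0B85, 0x0B94, "VOWEL_INDEPENDENT", None),
    (0x0B95, 0x0BB9, "CONSONANT", None),
    (0x0BBE, 0x0BBF, "VOWEL_DEPENDENT", "RIGHT_POSITION"),
    (0x0BC0, 0x0BC0, "VOWEL_DEPENDENT", "TOP_POSITION"),
    (0x0BC1, 0x0BC2, "VOWEL_DEPENDENT", "RIGHT_POSITION"),
    (0x0BC6, 0x0BC8, "VOWEL_DEPENDENT", "LEFT_POSITION"),
    (0x0BCA, 0x0BCC, "VOWEL_DEPENDENT", "LEFT_AND_RIGHT_POSITION"),
    (0x0BCD, 0x0BCD, "VIRAMA", "TOP_POSITION"),
    (0x0BD0, 0x0BD0, None, None),
    (0x0BD7, 0x0BD7, "VOWEL_DEPENDENT", "RIGHT_POSITION"),
    (0x0BE6, 0x0BF2, "NUMBER", None),
    (0x0BF3, 0x0BFA, "SYMBOL", None),
]


def classify(cp: int) -> Optional[ShapingInfo]:
    """Return shaping info for a codepoint, or None if it is unassigned."""
    if unicodedata.category(chr(cp)) == "Cn":
        return None
    for first, last, shaping_class, placement in TAMIL_RANGES:
        if first <= cp <= last:
            return ShapingInfo(shaping_class, placement)
    return ShapingInfo(None, None)  # assigned, but not listed in this sketch


print(classify(0x0B95))  # Ka
print(classify(0x0BCB))  # Sign Oo
print(classify(0x0B96))  # unassigned
```

The lookup deliberately returns the Shaping class from the table and never derives it from the General Category, which is used only to recognize unassigned codepoints.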

Tamil Supplement character table

Tamil text runs may also include historical symbols and fractions from the Tamil Supplement block. These characters should be classified as follows.

Codepoint	Unicode category	Shaping class	Mark-placement subclass	Glyph
`U+11FC0`	Number	NUMBER	null	𑿀 Fraction One Three-Hundred-And-Twentieth
`U+11FC1`	Number	NUMBER	null	𑿁 Fraction One One-Hundred-And-Sixtieth
`U+11FC2`	Number	NUMBER	null	𑿂 Fraction One Eightieth
`U+11FC3`	Number	NUMBER	null	𑿃 Fraction One Sixty-Fourth
`U+11FC4`	Number	NUMBER	null	𑿄 Fraction One Fortieth
`U+11FC5`	Number	NUMBER	null	𑿅 Fraction One Thirty-Second
`U+11FC6`	Number	NUMBER	null	𑿆 Fraction Three Eightieths
`U+11FC7`	Number	NUMBER	null	𑿇 Fraction Three Sixty-Fourths
`U+11FC8`	Number	NUMBER	null	𑿈 Fraction One Twentieth
`U+11FC9`	Number	NUMBER	null	𑿉 Fraction One Sixteenth-1
`U+11FCA`	Number	NUMBER	null	𑿊 Fraction One Sixteenth-2
`U+11FCB`	Number	NUMBER	null	𑿋 Fraction One Tenth
`U+11FCC`	Number	NUMBER	null	𑿌 Fraction One Eighth
`U+11FCD`	Number	NUMBER	null	𑿍 Fraction Three Twentieths
`U+11FCE`	Number	NUMBER	null	𑿎 Fraction Three Sixteenths
`U+11FCF`	Number	NUMBER	null	𑿏 Fraction One Fifth

`U+11FD0`	Number	NUMBER	null	𑿐 Fraction One Quarter
`U+11FD1`	Number	NUMBER	null	𑿑 Fraction One Half-1
`U+11FD2`	Number	NUMBER	null	𑿒 Fraction One Half-2
`U+11FD3`	Number	NUMBER	null	𑿓 Fraction Three Quarters
`U+11FD4`	Number	NUMBER	null	𑿔 Fraction Downscaling Factor Kiizh
`U+11FD5`	Symbol	SYMBOL	null	𑿕 Sign Nel
`U+11FD6`	Symbol	SYMBOL	null	𑿖 Sign Cevitu
`U+11FD7`	Symbol	SYMBOL	null	𑿗 Sign Aazhaakku
`U+11FD8`	Symbol	SYMBOL	null	𑿘 Sign Uzhakku
`U+11FD9`	Symbol	SYMBOL	null	𑿙 Sign Muuvuzhakku
`U+11FDA`	Symbol	SYMBOL	null	𑿚 Sign Kuruni
`U+11FDB`	Symbol	SYMBOL	null	𑿛 Sign Pathakku
`U+11FDC`	Symbol	SYMBOL	null	𑿜 Sign Mukkuruni
`U+11FDD`	Symbol	SYMBOL	null	𑿝 Sign Kaacu
`U+11FDE`	Symbol	SYMBOL	null	𑿞 Sign Panam
`U+11FDF`	Symbol	SYMBOL	null	𑿟 Sign Pon

`U+11FE0`	Symbol	SYMBOL	null	𑿠 Sign Varaakan
`U+11FE1`	Symbol	SYMBOL	null	𑿡 Sign Paaram
`U+11FE2`	Symbol	SYMBOL	null	𑿢 Sign Kuzhi
`U+11FE3`	Symbol	SYMBOL	null	𑿣 Sign Veli
`U+11FE4`	Symbol	SYMBOL	null	𑿤 Wet Cultivation Sign
`U+11FE5`	Symbol	SYMBOL	null	𑿥 Dry Cultivation Sign
`U+11FE6`	Symbol	SYMBOL	null	𑿦 Land Sign
`U+11FE7`	Symbol	SYMBOL	null	𑿧 Salt Pan Sign
`U+11FE8`	Symbol	SYMBOL	null	𑿨 Traditional Credit Sign
`U+11FE9`	Symbol	SYMBOL	null	𑿩 Traditional Number Sign
`U+11FEA`	Symbol	SYMBOL	null	𑿪 Current Sign
`U+11FEB`	Symbol	SYMBOL	null	𑿫 And Odd Sign
`U+11FEC`	Symbol	SYMBOL	null	𑿬 Spent Sign
`U+11FED`	Symbol	SYMBOL	null	𑿭 Total Sign
`U+11FEE`	Symbol	SYMBOL	null	𑿮 In Possession Sign
`U+11FEF`	Symbol	SYMBOL	null	𑿯 Starting From Sign

`U+11FF0`	Symbol	SYMBOL	null	𑿰 Sign Muthaliya
`U+11FF1`	Symbol	SYMBOL	null	𑿱 Sign Vakaiyaraa
`U+11FF2`	unassigned
`U+11FF3`	unassigned
`U+11FF4`	unassigned
`U+11FF5`	unassigned
`U+11FF6`	unassigned
`U+11FF7`	unassigned
`U+11FF8`	unassigned
`U+11FF9`	unassigned
`U+11FFA`	unassigned
`U+11FFB`	unassigned
`U+11FFC`	unassigned
`U+11FFD`	unassigned
`U+11FFE`	unassigned
`U+11FFF`	Punctuation	null	null	𑿿 End Of Text

Grantha marks character table

Tamil text runs may also include diacritical and syllable-modifier marks from the Grantha block. These characters should be classified as follows.

Codepoint	Unicode category	Shaping class	Mark-placement subclass	Glyph
`U+11301`	Mark [Mn]	BINDU	TOP_POSITION	𑌁 Grantha Candrabindu
`U+11303`	Mark [Mc]	VISARGA	RIGHT_POSITION	𑌃 Grantha Visarga
`U+1133B`	Mark [Mn]	NUKTA	BOTTOM_POSITION	𑌻 Combining Bindu Below
`U+1133C`	Mark [Mn]	NUKTA	BOTTOM_POSITION	𑌼 Grantha Nukta

Vedic Extensions character table

Sanskrit runs written in the Tamil script may also include characters from the Vedic Extensions block. These characters should be classified as follows.

Note: See the Vedic Extensions document for additional information.

Codepoint	Unicode category	Shaping class	Mark-placement subclass	Glyph
`U+1CD0`	Mark [Mn]	CANTILLATION	TOP_POSITION	᳐ Tone Karshana
`U+1CD1`	Mark [Mn]	CANTILLATION	TOP_POSITION	᳑ Tone Shara
`U+1CD2`	Mark [Mn]	CANTILLATION	TOP_POSITION	᳒ Tone Prenkha
`U+1CD3`	Punctuation	null	null	᳓ Sign Nihshvasa
`U+1CD4`	Mark [Mn]	CANTILLATION	OVERSTRUCK	᳔ Tone Midline Svarita
`U+1CD5`	Mark [Mn]	CANTILLATION	BOTTOM_POSITION	᳕ Tone Aggravated Independent Svarita
`U+1CD6`	Mark [Mn]	CANTILLATION	BOTTOM_POSITION	᳖ Tone Independent Svarita
`U+1CD7`	Mark [Mn]	CANTILLATION	BOTTOM_POSITION	᳗ Tone Kathaka Independent Svarita
`U+1CD8`	Mark [Mn]	CANTILLATION	BOTTOM_POSITION	᳘ Tone Candra Below
`U+1CD9`	Mark [Mn]	CANTILLATION	BOTTOM_POSITION	᳙ Tone Kathaka Independent Svarita Schroeder
`U+1CDA`	Mark [Mn]	CANTILLATION	TOP_POSITION	᳚ Tone Double Svarita
`U+1CDB`	Mark [Mn]	CANTILLATION	TOP_POSITION	᳛ Tone Triple Svarita
`U+1CDC`	Mark [Mn]	CANTILLATION	BOTTOM_POSITION	᳜ Tone Kathaka Anudatta
`U+1CDD`	Mark [Mn]	CANTILLATION	BOTTOM_POSITION	᳝ Tone Dot Below
`U+1CDE`	Mark [Mn]	CANTILLATION	BOTTOM_POSITION	᳞ Tone Two Dots Below
`U+1CDF`	Mark [Mn]	CANTILLATION	BOTTOM_POSITION	᳟ Tone Three Dots Below

`U+1CE0`	Mark [Mn]	CANTILLATION	TOP_POSITION	᳠ Tone Rigvedic Kashmiri Independent Svarita
`U+1CE1`	Mark [Mc]	CANTILLATION	RIGHT_POSITION	᳡ Tone Atharvavedic Independent Svarita
`U+1CE2`	Mark [Mn]	AVAGRAHA	OVERSTRUCK	᳢ Sign Visarga Svarita
`U+1CE3`	Mark [Mn]	null	OVERSTRUCK	᳣ Sign Visarga Udatta
`U+1CE4`	Mark [Mn]	null	OVERSTRUCK	᳤ Sign Reversed Visarga Udatta
`U+1CE5`	Mark [Mn]	null	OVERSTRUCK	᳥ Sign Visarga Anudatta
`U+1CE6`	Mark [Mn]	null	OVERSTRUCK	᳦ Sign Reversed Visarga Anudatta
`U+1CE7`	Mark [Mn]	null	OVERSTRUCK	᳧ Sign Visarga Udatta With Tail
`U+1CE8`	Mark [Mn]	AVAGRAHA	OVERSTRUCK	᳨ Sign Visarga Anudatta With Tail
`U+1CE9`	Letter	SYMBOL	null	ᳩ Sign Anusvara Antargomukha
`U+1CEA`	Letter	null	null	ᳪ Sign Anusvara Bahirgomukha
`U+1CEB`	Letter	null	null	ᳫ Sign Anusvara Vamagomukha
`U+1CEC`	Letter	SYMBOL	null	ᳬ Sign Anusvara Vamagomukha With Tail
`U+1CED`	Mark [Mn]	AVAGRAHA	BOTTOM_POSITION	᳭ Sign Tiryak
`U+1CEE`	Letter	SYMBOL	null	ᳮ Sign Hexiform Long Anusvara
`U+1CEF`	Letter	null	null	ᳯ Sign Long Anusvara

`U+1CF0`	Letter	null	null	ᳰ Sign Rthang Long Anusvara
`U+1CF2`	Letter	CONSONANT_DEAD	null	ᳲ Sign Ardhavisarga
`U+1CF3`	Letter	CONSONANT_DEAD	null	ᳳ Sign Rotated Ardhavisarga
`U+1CF4`	Mark [Mn]	CANTILLATION	TOP_POSITION	᳴ Tone Candra Above
`U+1CF5`	Letter	CONSONANT_WITH_STACKER	null	ᳵ Sign Jihvamuliya
`U+1CF6`	Letter	CONSONANT_WITH_STACKER	null	ᳶ Sign Upadhmaniya
`U+1CF7`	Mark [Mc]	null	null	᳷ Sign Atikrama
`U+1CF8`	Mark [Mn]	CANTILLATION	null	᳸ Tone Ring Above
`U+1CF9`	Mark [Mn]	CANTILLATION	null	᳹ Tone Double Ring Above
`U+1CFA`	Letter	PLACEHOLDER	null	ᳺ Sign Double Anusvara Antargomukha
`U+1CFB`	unassigned
`U+1CFC`	unassigned
`U+1CFD`	unassigned
`U+1CFE`	unassigned
`U+1CFF`	unassigned

Miscellaneous character table

In addition to general punctuation, runs of Tamil text often use the danda (U+0964) and double danda (U+0965) punctuation marks from the Devanagari block. Tamil text can also incorporate the udatta (U+0951) and anudatta (U+0952) signs from the Devanagari block.

Codepoint	Unicode category	Shaping class	Mark-placement subclass	Glyph
`U+0951`	Mark [Mn]	CANTILLATION	TOP_POSITION	॑ Udatta
`U+0952`	Mark [Mn]	CANTILLATION	BOTTOM_POSITION	॒ Anudatta
`U+0964`	Punctuation	null	null	। Danda
`U+0965`	Punctuation	null	null	॥ Double Danda

Other important characters that may be encountered when shaping runs of Tamil text include the dotted-circle placeholder (U+25CC), the zero-width joiner (U+200D) and zero-width non-joiner (U+200C), and the no-break space (U+00A0).

The dotted-circle placeholder is frequently used when displaying a dependent vowel (matra) or a combining mark in isolation. Real-world text syllables may also use other characters, such as hyphens or dashes, in a similar placeholder fashion; shaping engines should cope with this situation gracefully.

Codepoint	Unicode category	Shaping class	Mark-placement subclass	Glyph
`U+00A0`	Separator	PLACEHOLDER	null	No-break space
`U+00B2`	Number	SYLLABLE_MODIFIER	TOP_POSITION	² Superscript Two
`U+00B3`	Number	SYLLABLE_MODIFIER	TOP_POSITION	³ Superscript Three
`U+200C`	Other	NON_JOINER	null	‌ Zero-width non-joiner
`U+200D`	Other	JOINER	null	‍ Zero-width joiner
`U+2010`	Punctuation	PLACEHOLDER	null	‐ Hyphen
`U+2011`	Punctuation	PLACEHOLDER	null	‑ No-break hyphen
`U+2012`	Punctuation	PLACEHOLDER	null	‒ Figure dash
`U+2013`	Punctuation	PLACEHOLDER	null	– En dash
`U+2014`	Punctuation	PLACEHOLDER	null	— Em dash
`U+2074`	Number	SYLLABLE_MODIFIER	TOP_POSITION	⁴ Superscript Four
`U+2082`	Number	SYLLABLE_MODIFIER	TOP_POSITION	₂ Subscript Two
`U+2083`	Number	SYLLABLE_MODIFIER	TOP_POSITION	₃ Subscript Three
`U+2084`	Number	SYLLABLE_MODIFIER	TOP_POSITION	₄ Subscript Four
`U+25CC`	Symbol	DOTTED_CIRCLE	null	◌ Dotted circle
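
The placeholder behavior described above can be sketched as a simple membership test. This is an illustration only: `PLACEHOLDER_BASES`, `is_placeholder_base`, and `is_isolated_mark_cluster` are hypothetical names, and the check for "mark" relies on the General Category from Python's `unicodedata` module rather than on the Shaping class columns in this document.

```python
import unicodedata

# Codepoints tagged PLACEHOLDER or DOTTED_CIRCLE in the table above.
PLACEHOLDER_BASES = frozenset(
    {0x00A0, 0x2010, 0x2011, 0x2012, 0x2013, 0x2014, 0x25CC}
)


def is_placeholder_base(ch: str) -> bool:
    """True if ch can stand in for a base when a mark is shown in isolation."""
    return ord(ch) in PLACEHOLDER_BASES


def is_isolated_mark_cluster(cluster: str) -> bool:
    """True if cluster is one placeholder base followed by one or more marks."""
    if len(cluster) < 2 or not is_placeholder_base(cluster[0]):
        return False
    return all(unicodedata.category(c).startswith("M") for c in cluster[1:])


print(is_isolated_mark_cluster("\u25CC\u0BBE"))  # dotted circle + Sign Aa
print(is_isolated_mark_cluster("\u0B95\u0BBE"))  # Ka + Sign Aa: a real base
```

Note that the ASCII hyphen-minus (U+002D) is not listed in the table, so this sketch does not treat it as a placeholder.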

The zero-width joiner (ZWJ) is primarily used to prevent the formation of a conjunct from a "Consonant,Halant,Consonant" sequence. The sequence "Consonant,Halant,ZWJ,Consonant" blocks the formation of a conjunct between the two consonants.

Note, however, that the "Consonant,Halant" subsequence in the above example may still trigger a half-forms feature. To prevent the application of the half-forms feature in addition to preventing the conjunct, the zero-width non-joiner (ZWNJ) must be used instead. The sequence "Consonant,Halant,ZWNJ,Consonant" should produce the first consonant in its standard form, followed by an explicit "Halant".

A secondary usage of the zero-width joiner is to prevent the formation of "Reph". An initial "Ra,Halant,ZWJ" sequence should not produce a "Reph", where an initial "Ra,Halant" sequence without the zero-width joiner otherwise would.
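
The joiner rules in the preceding paragraphs can be summarized in a small sketch. The names `HalantEffect`, `halant_effects`, and `reph_blocked` are hypothetical, and the consonant test is an assumption that simply uses the U+0B95 to U+0BB9 range from the Tamil table. The code only reports what the text says each sequence requests; whether a conjunct, half form, or "Reph" actually appears still depends on the font.

```python
from enum import Enum
from typing import List

VIRAMA = "\u0BCD"  # "Halant" in the prose above
RA = "\u0BB0"
ZWNJ = "\u200C"
ZWJ = "\u200D"


class HalantEffect(Enum):
    CONJUNCT_ALLOWED = "conjunct may form"
    HALF_FORM_ONLY = "conjunct blocked, half form may still apply"
    EXPLICIT_HALANT = "conjunct and half form blocked"


def is_consonant(ch: str) -> bool:
    # Assumption: any codepoint in the Tamil consonant range counts.
    return len(ch) == 1 and "\u0B95" <= ch <= "\u0BB9"


def halant_effects(text: str) -> List[HalantEffect]:
    """Report the requested effect for each Consonant,Halant[,joiner],Consonant."""
    effects = []
    for i in range(1, len(text)):
        if text[i] != VIRAMA or not is_consonant(text[i - 1]):
            continue
        rest = text[i + 1:]
        if rest[:1] == ZWJ and is_consonant(rest[1:2]):
            effects.append(HalantEffect.HALF_FORM_ONLY)
        elif rest[:1] == ZWNJ and is_consonant(rest[1:2]):
            effects.append(HalantEffect.EXPLICIT_HALANT)
        elif is_consonant(rest[:1]):
            effects.append(HalantEffect.CONJUNCT_ALLOWED)
    return effects


def reph_blocked(text: str) -> bool:
    """True if the run starts with Ra,Halant,ZWJ, which suppresses "Reph"."""
    return text.startswith(RA + VIRAMA + ZWJ)


print(halant_effects("\u0B95\u0BCD\u0BB7"))        # Ka,Halant,Ssa
print(halant_effects("\u0B95\u0BCD\u200C\u0BB7"))  # Ka,Halant,ZWNJ,Ssa
```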

The no-break space (NBSP) is primarily used to display those codepoints that are defined as non-spacing (marks, dependent vowels (matras), below-base consonant forms, and post-base consonant forms) in an isolated context, as an alternative to displaying them superimposed on the dotted-circle placeholder. These sequences will match "NBSP,ZWJ,Halant,Consonant", "NBSP,mark", or "NBSP,matra".
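
The three NBSP patterns named above can be expressed as a regular expression. This is a sketch under stated assumptions: the character classes are taken only from the Tamil block table in this document, "mark" is narrowed to the anusvara (U+0B82) for brevity, and the names `ISOLATED_FORM` and `is_isolated_form` are hypothetical.

```python
import re

NBSP = "\u00A0"
ZWJ = "\u200D"
VIRAMA = "\u0BCD"  # "Halant" in the prose above

# Character classes drawn from the Tamil block table. The consonant range
# spans some unassigned codepoints, which is harmless for matching.
CONSONANT = "[\u0B95-\u0BB9]"
MATRA = "[\u0BBE-\u0BC2\u0BC6-\u0BC8\u0BCA-\u0BCC\u0BD7]"
MARK = "[\u0B82]"  # assumption: only the Tamil anusvara is included here

ISOLATED_FORM = re.compile(
    f"{NBSP}(?:{ZWJ}{VIRAMA}{CONSONANT}|{MARK}|{MATRA})"
)


def is_isolated_form(sequence: str) -> bool:
    """True if the whole sequence is one of the three NBSP patterns."""
    return ISOLATED_FORM.fullmatch(sequence) is not None


print(is_isolated_form("\u00A0\u0BBE"))              # NBSP,matra
print(is_isolated_form("\u00A0\u200D\u0BCD\u0B95"))  # NBSP,ZWJ,Halant,Consonant
print(is_isolated_form("\u00A0\u0B95"))              # NBSP,Consonant
```

A fuller version would extend `MARK` with the Grantha, Vedic Extensions, and Devanagari marks listed in the other tables.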

Tamil text sometimes uses the Latin numerals 2, 3, and 4 in superscript or subscript positions to annotate Sanskrit. When used in this fashion, the superscripts and subscripts are treated as SYLLABLE_MODIFIER signs for shaping purposes.
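
A minimal sketch of this special case follows. It shows that these codepoints keep a numeric General Category while being given the SYLLABLE_MODIFIER Shaping class; the names `SYLLABLE_MODIFIER_DIGITS` and `is_syllable_modifier` are hypothetical.

```python
import unicodedata

# Superscript and subscript digits listed in the Miscellaneous table.
SYLLABLE_MODIFIER_DIGITS = frozenset(
    {0x00B2, 0x00B3, 0x2074, 0x2082, 0x2083, 0x2084}
)


def is_syllable_modifier(ch: str) -> bool:
    """True if ch is a superscript or subscript 2, 3, or 4."""
    return ord(ch) in SYLLABLE_MODIFIER_DIGITS


for ch in ("\u00B2", "\u2084", "2"):
    # The General Category stays numeric; only the Shaping class differs.
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_syllable_modifier(ch))
```

Ordinary ASCII digits are not affected, so a plain "2" in a Tamil run is not treated as a syllable modifier by this check.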
