- Unicode Standard
- Unicode Technical Standard #46: IDNA
- Unicode Technical Standard #51: Emoji
- Unicode Standard Annex #15: Normalization Forms
- Unicode Standard Annex #24: Script Property
- Unicode Standard Annex #29: Text Segmentation
- Unicode Standard Annex #31: Identifier and Pattern Syntax
- Unicode Technical Standard #39: Security Mechanisms
- RFC-3492: Punycode
- RFC-5891: IDNA: Protocol
- RFC-5892: The Unicode Code Points and IDNA
- WHATWG URL: IDNA
- Unicode data files
  - Download Latest: `node download.js`
  - To download older version: `node download.js 12.1.0`
  - Already included: Unicode 11-16
- CLDR data files
  - Download Latest: `node parse-cldr.js`
  - To download older version: `node parse-cldr.js 42`
  - Already included: CLDR 42-45
  - ⚠️ Versioned separately from Unicode!
- edit unicode-version.js — specify which versions to use
- edit Rules Files
- `node make.js`: creates `/output/` with data files
- chars-valid.js
- chars-ignored.js
- chars-mapped.js
- chars-disallow.js
- chars-fenced.js — characters that occur in the middle and can't touch
- chars-escape.js — characters that should be escaped
- emoji.js — various emoji configurations
- cm.js — combining mark sequence whitelist
- scripts.js — various script configurations
- confuse.js — confusables groups
- group-order.js — how groups should be sorted for matching efficiency (auto-generated)
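
The description of `chars-fenced.js` (characters that occur in the middle and can't touch) can be read as a three-part check. The sketch below is only one interpretation of that sentence: the function name `checkFenced` and the two-member `FENCED` set are hypothetical, and the real list of fenced characters lives in the rules file.

```javascript
// Hypothetical fenced set for illustration only; not the contents of chars-fenced.js.
const FENCED = new Set([0x2019, 0x2044]); // assumed members: ’ and ⁄

// Returns null when the label satisfies the assumed rule, otherwise a short reason.
function checkFenced(label, fenced = FENCED) {
  // Array.from iterates by code point, so astral characters stay intact.
  const cps = Array.from(label, (ch) => ch.codePointAt(0));
  if (cps.length === 0) return null;
  if (fenced.has(cps[0])) return 'leading fenced';
  if (fenced.has(cps[cps.length - 1])) return 'trailing fenced';
  for (let i = 1; i < cps.length; i++) {
    // "can't touch": two fenced characters may not be adjacent.
    if (fenced.has(cps[i]) && fenced.has(cps[i - 1])) return 'adjacent fenced';
  }
  return null;
}
```

Under these assumptions `checkFenced('a’b')` passes, while a label that starts or ends with a fenced character, or contains two in a row, is rejected.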
- `node names.js 61..7A 200D`: print Unicode names for hex codepoints
- `node names.js script Latn`: print Unicode names for Latin
- `node names.js prop White_Space`: print Unicode names with property White_Space
- `node names.js find abc`: find characters by name
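
The hex codepoint arguments accepted by `names.js` (`61..7A 200D`) look like single values and inclusive `..` ranges. As a rough sketch of how such arguments could be expanded, assuming that reading of the syntax, the hypothetical helper `parseCodepoints` below turns them into a flat array of code points. It is not taken from `names.js` itself.

```javascript
// Expand arguments such as ["61..7A", "200D"] into an array of code points.
// Assumption: "A..B" is an inclusive range of hex values.
function parseCodepoints(args) {
  const cps = [];
  for (const arg of args) {
    const match = /^([0-9a-f]+)(?:\.\.([0-9a-f]+))?$/i.exec(arg);
    if (!match) throw new Error(`not a hex code point or range: ${arg}`);
    const first = parseInt(match[1], 16);
    const last = match[2] === undefined ? first : parseInt(match[2], 16);
    if (first > last || last > 0x10FFFF) throw new Error(`invalid range: ${arg}`);
    for (let cp = first; cp <= last; cp++) cps.push(cp);
  }
  return cps;
}

const cps = parseCodepoints(['61..7A', '200D']);
console.log(cps.length); // 27 (a through z, plus U+200D)
console.log(String.fromCodePoint(cps[0], cps[25])); // az
```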
- Release
  - Diff: `node unicode.diff.js 15.1 16`
  - CLDR
    - `short-names.json`: Unchanged
    - `regions.json`: New "CQ"
  - UAX-31:
    - New 7 Scripts: Gara, Gukh, Krai, Onao, Sunu, Todr, Tutg
  - UTS-39:
    - Change `confusables.txt`
      - OUTLINED LATIN [AZ] with LATIN CAPITAL [AZ] (no effect, not confusable)
      - LATIN SMALL LETTER SHARP S (no effect)
  - UTS-46:
    - Change IDNA (kept prior behavior)
      - disallowed_STD3_valid → valid
      - disallowed_STD3_mapped → mapped
    - Change Various Invisibles/Filler: disallowed → ignored
    - Change `_` and `$` mappings
    - New Legacy Computing Supplement
  - UTS-51:
    - New 8 Emoji
  - Prior Validation: `node test/validate.js 1.10.1`
    - Fails on new character changes
    - Fails on new emoji
- Release
  - Diff: `node unicode.diff.js 15 15.1`
  - CLDR
    - `short-names.json`: Unchanged
  - UCD:
    - New Ideographic Description Characters
    - New CJK Ideograph Extension I Block
  - UAX-31:
    - Unchanged Scripts
    - Unchanged Recommended, Excluded, Limited Use Scripts
    - New Zyyy Script Extensions
  - UTS-39:
    - New Confusable Bidi Logic (no effect, since mixed-Bidi not allowed)
    - `confusables.txt`: Unchanged
  - UTS-46:
    - Change 1E9E (ẞ) LATIN CAPITAL LETTER SHARP S → DF (ß) LATIN SMALL LETTER SHARP S
  - UTS-51:
    - New 118 Emoji
  - Prior Validation: `node test/validate.js 1.9.4`
    - Fails on new emoji
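
The UTS-46 change listed for 15.1 (1E9E → DF) is a single-codepoint mapping. A minimal sketch of applying such a mapping table to a string might look like the following. The names `MAPPED` and `applyMapped` are hypothetical, and the table holds only the one entry mentioned above rather than any real generated data.

```javascript
// Hypothetical mapping table: source code point -> replacement code points.
const MAPPED = new Map([
  [0x1E9E, [0xDF]], // ẞ LATIN CAPITAL LETTER SHARP S -> ß LATIN SMALL LETTER SHARP S
]);

// Replace every mapped code point in the text; leave everything else untouched.
function applyMapped(text, mapped = MAPPED) {
  const out = [];
  for (const ch of text) { // for...of iterates by code point
    const cp = ch.codePointAt(0);
    const replacement = mapped.get(cp);
    if (replacement) out.push(...replacement);
    else out.push(cp);
  }
  return String.fromCodePoint(...out);
}

console.log(applyMapped('STRAẞE')); // STRAßE
```

Storing replacements as arrays keeps the sketch open to mappings that expand to more than one code point.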