Support astral symbols #4

Open
mathiasbynens opened this issue Oct 17, 2013 · 0 comments
> htmlentities.decode('&#x1D306;') // U+1D306 TETRAGRAM FOR CENTRE
'\uD306' // should be `\uD834\uDF06` i.e. `𝌆`

E.g. &copy; decodes just fine, but &#x1D306; (𝌆) doesn’t, because String.fromCharCode() truncates its argument to 16 bits: String.fromCharCode(0x1D306) returns '\uD306' instead of 𝌆. U+1D306 is an astral symbol (i.e. a code point greater than 0xFFFF), which JavaScript strings represent as a surrogate pair. Details here: http://mathiasbynens.be/notes/javascript-encoding
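
To see the failure in isolation, here is a minimal sketch (not taken from this project's source) of how String.fromCharCode() handles a value above 0xFFFF. The variable names are only for illustration.

```javascript
// String.fromCharCode() converts each argument to an unsigned 16-bit
// integer, so the bits above 0xFFFF are silently dropped.
var codePoint = 0x1D306; // U+1D306 TETRAGRAM FOR CENTRE
var truncated = String.fromCharCode(codePoint);

console.log(truncated === '\uD306');           // true: 0x1D306 & 0xFFFF is 0xD306
console.log(truncated.length);                 // 1: a single (wrong) code unit
console.log('\uD834\uDF06'.length);            // 2: the expected surrogate pair
```

The single code unit that comes back matches the broken output shown above, which suggests the decoder passes the code point straight to String.fromCharCode().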

This can easily be fixed by using the Punycode module:

var punycode = require('punycode');

// Instead of…
String.fromCharCode(codePoint);
// …which only works for values from 0x0000 to 0xFFFF, use this:
punycode.ucs2.encode([ codePoint ]);
// …which works for all Unicode code points (i.e. values from 0x000000 to 0x10FFFF)

(Note: Punycode.js is bundled with Node.js v0.6.2+, but you can still add it to package.json if you want to support older versions.)

See he.decode() in the he library for a working example that doesn’t rely on Punycode.js.
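
For reference, a dependency-free conversion could look roughly like the sketch below. It is not he's actual implementation. The helper name codePointToString is hypothetical, and the range check is an assumption about how invalid input should be handled.

```javascript
// Hypothetical helper (not from he or Punycode.js): turns a Unicode code
// point into a JavaScript string, using a surrogate pair for astral values.
function codePointToString(codePoint) {
  if (codePoint < 0 || codePoint > 0x10FFFF) {
    // Assumption: reject values outside the Unicode range.
    throw new RangeError('Invalid code point: ' + codePoint);
  }
  if (codePoint > 0xFFFF) {
    var offset = codePoint - 0x10000;
    var high = 0xD800 + (offset >> 10);   // top 10 bits
    var low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
    return String.fromCharCode(high, low);
  }
  return String.fromCharCode(codePoint);
}

console.log(codePointToString(0x1D306) === '\uD834\uDF06'); // true
console.log(codePointToString(0xA9) === '\u00A9');          // true
```

Values up to 0xFFFF take the same path as before, so only astral code points change behavior.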
