This project is a fork of https://github.com/foliojs/unicode-properties created for use in https://github.com/Hopding/pdf-lib.
Listed below are changes that have been made in this fork:
- Store binary data as compressed base64 JSON so the fs module isn't needed to read it back
- Update to Babel 7, replace Browserify with Rollup, and build UMDs
- Build non-rolled-up ES6 and CommonJS in es/ and lib/ directories
- Released to NPM as @pdf-lib/unicode-properties
Also see
- https://github.com/Hopding/fontkit
- https://github.com/Hopding/brotli.js
- https://github.com/Hopding/restructure
- https://github.com/Hopding/png-ts
Provides fast access to Unicode character properties. Uses unicode-trie to compress the properties for all code points into just 12KB.
import unicode from '@pdf-lib/unicode-properties';
unicode.getCategory('2'.charCodeAt()) //=> 'Nd'
unicode.getNumericValue('2'.charCodeAt()) //=> 2
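The functions take numeric code points. charCodeAt returns a UTF-16 code unit, which equals the code point only for characters in the Basic Multilingual Plane; for anything outside it, codePointAt is the safer choice. A minimal sketch in plain JavaScript, where toCodePoints is a hypothetical helper and not part of this library:

```javascript
// Hypothetical helper (not part of the library): split a string into code points.
// Spreading a string iterates by code point, not by UTF-16 code unit.
function toCodePoints(text) {
  return [...text].map((ch) => ch.codePointAt(0));
}

const digit = '2';
const emoji = '\u{1F600}'; // a character outside the Basic Multilingual Plane

console.log(digit.charCodeAt(0) === digit.codePointAt(0)); // true
console.log(emoji.charCodeAt(0).toString(16)); // d83d (a lone surrogate)
console.log(emoji.codePointAt(0).toString(16)); // 1f600
console.log(toCodePoints('a' + emoji)); // [ 97, 128512 ]
```

Each number returned by toCodePoints can then be passed to a function such as unicode.getCategory.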
To install the latest stable version:
# With npm
npm install --save @pdf-lib/unicode-properties
# With yarn
yarn add @pdf-lib/unicode-properties
This assumes you're using npm or yarn as your package manager.
You can also download @pdf-lib/unicode-properties as a UMD module from unpkg. The UMD builds have been compiled to ES5, so they should work in any modern browser. UMD builds are useful if you aren't using a package manager or module bundler. For example, you can use them directly in the <script> tag of an HTML page.
The following builds are available:
- https://unpkg.com/@pdf-lib/unicode-properties/dist/unicode-properties.js
- https://unpkg.com/@pdf-lib/unicode-properties/dist/unicode-properties.min.js
When using a UMD build, you will have access to a global window.UnicodeProperties variable. This variable contains the object exported by @pdf-lib/unicode-properties. For example:
// NPM module
import unicode from '@pdf-lib/unicode-properties';
// UMD module
var unicode = window.UnicodeProperties;
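If the same code has to run both with the UMD global and with the NPM module, one possible approach is a small resolver. This is only a sketch: resolveUnicodeProperties, globalObject, and loadModule are hypothetical names introduced here, not part of the library.

```javascript
// Hypothetical helper: prefer the UMD global when it exists, otherwise
// fall back to a caller-supplied loader for the NPM module.
function resolveUnicodeProperties(globalObject, loadModule) {
  if (globalObject && globalObject.UnicodeProperties) {
    return globalObject.UnicodeProperties; // UMD build loaded via <script>
  }
  return loadModule(); // caller decides how the NPM module is obtained
}

// Example call (assumes `unicodeFromNpm` was imported elsewhere):
// var unicode = resolveUnicodeProperties(
//   typeof window !== 'undefined' ? window : undefined,
//   function () { return unicodeFromNpm; }
// );
```

Passing the global object and the loader as arguments keeps the helper independent of any particular bundler or module system.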
- getCategory(codePoint): Returns the Unicode general category for the given code point.
- getScript(codePoint): Returns the script for the given code point.
- getCombiningClass(codePoint): Returns the canonical combining class for the given code point.
- getEastAsianWidth(codePoint): Returns the East Asian width for the given code point.
- getNumericValue(codePoint): Returns the numeric value for the given code point, or null if there is no numeric value for that code point.
- isAlphabetic(codePoint): Returns whether the code point is an alphabetic character.
- isDigit(codePoint): Returns whether the code point is a digit.
- isPunctuation(codePoint): Returns whether the code point is a punctuation character.
- isLowerCase(codePoint): Returns whether the code point is lower case.
- isUpperCase(codePoint): Returns whether the code point is upper case.
- isTitleCase(codePoint): Returns whether the code point is title case.
- isWhiteSpace(codePoint): Returns whether the code point is whitespace: specifically, whether the category is one of Zs, Zl, or Zp.
- isBaseForm(codePoint): Returns whether the code point is a base form. A code point of base form does not graphically combine with preceding characters.
- isMark(codePoint): Returns whether the code point is a mark character (e.g. accent).
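As a rough illustration of combining these lookups, the sketch below reports a few properties for each character in a string. It uses only getCategory and getNumericValue from the usage example, and re-derives the whitespace rule from the Zs/Zl/Zp description instead of calling the library's own check. describeCodePoints is a hypothetical helper, and the unicode argument is assumed to be the object exported by @pdf-lib/unicode-properties.

```javascript
// Hypothetical helper: summarize properties for every code point in `text`.
// `unicode` is assumed to be the object exported by @pdf-lib/unicode-properties.
function describeCodePoints(unicode, text) {
  const spaceCategories = ['Zs', 'Zl', 'Zp']; // whitespace rule as described above
  return [...text].map((ch) => {
    const codePoint = ch.codePointAt(0);
    const category = unicode.getCategory(codePoint);
    return {
      char: ch,
      codePoint,
      category,
      numericValue: unicode.getNumericValue(codePoint),
      whiteSpace: spaceCategories.includes(category),
    };
  });
}
```

Under the rule as stated, control characters such as tab (U+0009, general category Cc) would not count as whitespace, which is worth keeping in mind when tokenizing text.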