diff --git a/README.rst b/README.rst
index 31af3e5..27731b1 100644
--- a/README.rst
+++ b/README.rst
@@ -1,17 +1,19 @@
 Internationalized Domain Names in Applications (IDNA)
 =====================================================

-Support for the Internationalised Domain Names in Applications
-(IDNA) protocol as specified in `RFC 5891 `_.
-This is the latest version of the protocol and is sometimes referred to as
-“IDNA 2008”.
+Support for the Internationalized Domain Names in
+Applications (IDNA) protocol as specified in `RFC 5891
+`_. This is the latest version of
+the protocol and is sometimes referred to as “IDNA 2008”.

-This library also provides support for Unicode Technical Standard 46,
-`Unicode IDNA Compatibility Processing `_.
+This library also provides support for Unicode Technical
+Standard 46, `Unicode IDNA Compatibility Processing
+`_.

-This acts as a suitable replacement for the “encodings.idna” module that
-comes with the Python standard library, but which only supports the
-older superseded IDNA specification (`RFC 3490 `_).
+This acts as a suitable replacement for the “encodings.idna”
+module that comes with the Python standard library, but which
+only supports the older superseded IDNA specification (`RFC 3490
+`_).

 Basic functions are simply executed:

@@ -27,24 +29,19 @@ Basic functions are simply executed:
 Installation
 ------------

-To install this library, you can use pip:
+This package is available for installation from PyPI:

 .. code-block:: bash

-    $ pip install idna
-
-Alternatively, you can install the package using the bundled setup script:
-
-.. code-block:: bash
-
-    $ python setup.py install
+    $ python3 -m pip install idna


 Usage
 -----

-For typical usage, the ``encode`` and ``decode`` functions will take a domain
-name argument and perform a conversion to A-labels or U-labels respectively.
+For typical usage, the ``encode`` and ``decode`` functions will take a
+domain name argument and perform a conversion to A-labels or U-labels
+respectively.

 .. code-block:: pycon

@@ -65,8 +62,8 @@ You may use the codec encoding and decoding methods using the
     >>> print(b'xn--d1acufc.xn--80akhbyknj4f'.decode('idna'))
     домен.испытание

-Conversions can be applied at a per-label basis using the ``ulabel`` or ``alabel``
-functions if necessary:
+Conversions can be applied at a per-label basis using the ``ulabel`` or
+``alabel`` functions if necessary:

 .. code-block:: pycon

@@ -76,20 +73,22 @@ functions if necessary:
 Compatibility Mapping (UTS #46)
 +++++++++++++++++++++++++++++++

-As described in `RFC 5895 `_, the IDNA
-specification does not normalize input from different potential ways a user
-may input a domain name. This functionality, known as a “mapping”, is
-considered by the specification to be a local user-interface issue distinct
-from IDNA conversion functionality.
+As described in `RFC 5895 `_, the
+IDNA specification does not normalize input from different potential
+ways a user may input a domain name. This functionality, known as
+a “mapping”, is considered by the specification to be a local
+user-interface issue distinct from IDNA conversion functionality.

-This library provides one such mapping, that was developed by the Unicode
-Consortium. Known as `Unicode IDNA Compatibility Processing `_,
-it provides for both a regular mapping for typical applications, as well as
-a transitional mapping to help migrate from older IDNA 2003 applications.
+This library provides one such mapping, that was developed by the
+Unicode Consortium. Known as `Unicode IDNA Compatibility Processing
+`_, it provides for both a regular
+mapping for typical applications, as well as a transitional mapping to
+help migrate from older IDNA 2003 applications.

-For example, “Königsgäßchen” is not a permissible label as *LATIN CAPITAL
-LETTER K* is not allowed (nor are capital letters in general). UTS 46 will
-convert this into lower case prior to applying the IDNA conversion.
+For example, “Königsgäßchen” is not a permissible label as *LATIN
+CAPITAL LETTER K* is not allowed (nor are capital letters in general).
+UTS 46 will convert this into lower case prior to applying the IDNA
+conversion.

 .. code-block:: pycon

@@ -102,36 +101,38 @@ convert this into lower case prior to applying the IDNA conversion.
     >>> print(idna.decode('xn--knigsgchen-b4a3dun'))
     königsgäßchen

-Transitional processing provides conversions to help transition from the older
-2003 standard to the current standard. For example, in the original IDNA
-specification, the *LATIN SMALL LETTER SHARP S* (ß) was converted into two
-*LATIN SMALL LETTER S* (ss), whereas in the current IDNA specification this
-conversion is not performed.
+Transitional processing provides conversions to help transition from
+the older 2003 standard to the current standard. For example, in the
+original IDNA specification, the *LATIN SMALL LETTER SHARP S* (ß) was
+converted into two *LATIN SMALL LETTER S* (ss), whereas in the current
+IDNA specification this conversion is not performed.

 .. code-block:: pycon

     >>> idna.encode('Königsgäßchen', uts46=True, transitional=True)
     'xn--knigsgsschen-lcb0w'

-Implementors should use transitional processing with caution, only in rare
-cases where conversion from legacy labels to current labels must be performed
-(i.e. IDNA implementations that pre-date 2008). For typical applications
-that just need to convert labels, transitional processing is unlikely to be
-beneficial and could produce unexpected incompatible results.
+Implementors should use transitional processing with caution, only in
+rare cases where conversion from legacy labels to current labels must be
+performed (i.e. IDNA implementations that pre-date 2008). For typical
+applications that just need to convert labels, transitional processing
+is unlikely to be beneficial and could produce unexpected incompatible
+results.

 ``encodings.idna`` Compatibility
 ++++++++++++++++++++++++++++++++

 Function calls from the Python built-in ``encodings.idna`` module are
 mapped to their IDNA 2008 equivalents using the ``idna.compat`` module.
-Simply substitute the ``import`` clause in your code to refer to the
-new module name.
+Simply substitute the ``import`` clause in your code to refer to the new
+module name.

 Exceptions
 ----------

-All errors raised during the conversion following the specification should
-raise an exception derived from the ``idna.IDNAError`` base class.
+All errors raised during the conversion following the specification
+should raise an exception derived from the ``idna.IDNAError`` base
+class.

 More specific exceptions that may be generated as ``idna.IDNABidiError``
 when the error reflects an illegal combination of left-to-right and
@@ -149,29 +150,31 @@ tables for performance. These tables are derived from computing against
 eligibility criteria in the respective standards. These tables are
 computed using the command-line script ``tools/idna-data``.

-This tool will fetch relevant codepoint data from the Unicode repository
-and perform the required calculations to identify eligibility. There are
+This tool will fetch relevant codepoint data from the Unicode repository
+and perform the required calculations to identify eligibility. There are
 three main modes:

-* ``idna-data make-libdata``. Generates ``idnadata.py`` and ``uts46data.py``,
-  the pre-calculated lookup tables using for IDNA and UTS 46 conversions. Implementors
-  who wish to track this library against a different Unicode version may use this tool
-  to manually generate a different version of the ``idnadata.py`` and ``uts46data.py``
-  files.
+* ``idna-data make-libdata``. Generates ``idnadata.py`` and
+  ``uts46data.py``, the pre-calculated lookup tables using for IDNA and
+  UTS 46 conversions. Implementors who wish to track this library against
+  a different Unicode version may use this tool to manually generate a
+  different version of the ``idnadata.py`` and ``uts46data.py`` files.

 * ``idna-data make-table``. Generate a table of the IDNA disposition
-  (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix B.1 of RFC
-  5892 and the pre-computed tables published by `IANA `_.
+  (e.g. PVALID, CONTEXTJ, CONTEXTO) in the format found in Appendix
+  B.1 of RFC 5892 and the pre-computed tables published by `IANA
+  `_.

-* ``idna-data U+0061``. Prints debugging output on the various properties
-  associated with an individual Unicode codepoint (in this case, U+0061), that are
-  used to assess the IDNA and UTS 46 status of a codepoint. This is helpful in debugging
-  or analysis.
+* ``idna-data U+0061``. Prints debugging output on the various
+  properties associated with an individual Unicode codepoint (in this
+  case, U+0061), that are used to assess the IDNA and UTS 46 status of a
+  codepoint. This is helpful in debugging or analysis.

-The tool accepts a number of arguments, described using ``idna-data -h``. Most notably,
-the ``--version`` argument allows the specification of the version of Unicode to use
-in computing the table data. For example, ``idna-data --version 9.0.0 make-libdata``
-will generate library data against Unicode 9.0.0.
+The tool accepts a number of arguments, described using ``idna-data
+-h``. Most notably, the ``--version`` argument allows the specification
+of the version of Unicode to use in computing the table data. For
+example, ``idna-data --version 9.0.0 make-libdata`` will generate
+library data against Unicode 9.0.0.


 Additional Notes
@@ -180,25 +183,28 @@ Additional Notes
 * **Packages**. The latest tagged release version is published in the
   `Python Package Index `_.

-* **Version support**. This library supports Python 3.5 and higher. As this library
-  serves as a low-level toolkit for a variety of applications, many of which strive
-  for broad compatibility with older Python versions, there is no rush to remove
-  older intepreter support. Removing support for older versions should be well
-  justified in that the maintenance burden has become too high.
-
-* **Python 2**. Python 2 is supported by version 2.x of this library. While active
-  development of the version 2.x series has ended, notable issues being corrected
-  may be backported to 2.x. Use "idna<3" in your requirements file if you need this
-  library for a Python 2 application.
-
-* **Testing**. The library has a test suite based on each rule of the IDNA specification, as
-  well as tests that are provided as part of the Unicode Technical Standard 46,
-  `Unicode IDNA Compatibility Processing `_.
-
-* **Emoji**. It is an occasional request to support emoji domains in this library. Encoding
-  of symbols like emoji is expressly prohibited by the technical standard IDNA 2008 and
-  emoji domains are broadly phased out across the domain industry due to associated security
-  risks. For now, applications that wish need to support these non-compliant labels may
-  wish to consider trying the encode/decode operation in this library first, and then falling
-  back to using `encodings.idna`. See `the Github project `_
-  for more discussion.
\ No newline at end of file
+* **Version support**. This library supports Python 3.5 and higher.
+  As this library serves as a low-level toolkit for a variety of
+  applications, many of which strive for broad compatibility with older
+  Python versions, there is no rush to remove older intepreter support.
+  Removing support for older versions should be well justified in that the
+  maintenance burden has become too high.
+
+* **Python 2**. Python 2 is supported by version 2.x of this library.
+  While active development of the version 2.x series has ended, notable
+  issues being corrected may be backported to 2.x. Use "idna<3" in your
+  requirements file if you need this library for a Python 2 application.
+
+* **Testing**. The library has a test suite based on each rule of the
+  IDNA specification, as well as tests that are provided as part of the
+  Unicode Technical Standard 46, `Unicode IDNA Compatibility Processing
+  `_.
+
+* **Emoji**. It is an occasional request to support emoji domains in
+  this library. Encoding of symbols like emoji is expressly prohibited by
+  the technical standard IDNA 2008 and emoji domains are broadly phased
+  out across the domain industry due to associated security risks. For
+  now, applications that wish need to support these non-compliant labels
+  may wish to consider trying the encode/decode operation in this library
+  first, and then falling back to using `encodings.idna`. See `the Github
+  project `_ for more discussion.