Commit

Merge pull request #965 from hsivonen/adapter
Enable choice from multiple Unicode back ends
hsivonen authored Oct 29, 2024
2 parents 9163f30 + 662970f commit 59c7ea3
Showing 6 changed files with 152 additions and 196 deletions.
4 changes: 4 additions & 0 deletions README.md
@@ -12,3 +12,7 @@ URL library for Rust, based on the [URL Standard](https://url.spec.whatwg.org/).
[Documentation](https://docs.rs/url)

Please see [UPGRADING.md](https://github.com/servo/rust-url/blob/main/UPGRADING.md) if you are upgrading from a previous version.
+
+## Alternative Unicode back ends
+
+`url` depends on the `idna` crate. By default, `idna` uses [ICU4X](https://github.com/unicode-org/icu4x/) as its Unicode back end. If you wish to opt for different tradeoffs between correctness, run-time performance, binary size, compile time, and MSRV, please see the [README of the latest version of the `idna_adapter` crate](https://docs.rs/crate/idna_adapter/latest) for how to opt into a different Unicode back end.
9 changes: 4 additions & 5 deletions idna/Cargo.toml
@@ -1,14 +1,14 @@
[package]
name = "idna"
-version = "1.0.2"
+version = "1.0.3"
authors = ["The rust-url developers"]
description = "IDNA (Internationalizing Domain Names in Applications) and Punycode."
keywords = ["no_std", "web", "http"]
repository = "https://github.com/servo/rust-url/"
license = "MIT OR Apache-2.0"
autotests = false
edition = "2018"
-rust-version = "1.67"
+rust-version = "1.57" # For panic in const context

[lib]
doctest = false
@@ -17,7 +17,7 @@ doctest = false
default = ["std", "compiled_data"]
std = ["alloc"]
alloc = []
-compiled_data = ["icu_normalizer/compiled_data", "icu_properties/compiled_data"]
+compiled_data = ["idna_adapter/compiled_data"]

[[test]]
name = "tests"
@@ -36,10 +36,9 @@ tester = "0.9"
serde_json = "1.0"

[dependencies]
-icu_normalizer = "1.4.3"
-icu_properties = "1.4.2"
utf8_iter = "1.0.4"
smallvec = { version = "1.13.1", features = ["const_generics"]}
+idna_adapter = "1"

[[bench]]
name = "all"
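
As an illustration of the manifest change above (a hypothetical downstream manifest, not part of this commit): a crate that consumes `idna` can keep depending on it exactly as before. The package name `my-app` and its edition are made up for the sketch. With default features, `compiled_data` now forwards to `idna_adapter/compiled_data` instead of the two `icu_*` crates, and the `idna_adapter = "1"` requirement allows Cargo to resolve any semver-compatible 1.x release of the adapter.

```toml
# Hypothetical downstream Cargo.toml; the package name and edition are assumptions.
[package]
name = "my-app"
version = "0.1.0"
edition = "2021"

[dependencies]
# Default features are "std" and "compiled_data", as listed in idna/Cargo.toml above.
# "compiled_data" is forwarded to idna_adapter/compiled_data by this commit.
idna = "1.0.3"
```

Which Unicode back end ends up in the build is decided through `idna_adapter`; the README of that crate, linked from the README changes in this commit, is the authoritative source for the exact steps.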
4 changes: 4 additions & 0 deletions idna/README.md
@@ -28,6 +28,10 @@ Apps that need to display host names to the user should use `uts46::Uts46::to_us
* `std` - Adds `impl std::error::Error for Errors {}` (and implies `alloc`).
* By default, all of the above are enabled.

+## Alternative Unicode back ends
+
+By default, `idna` uses [ICU4X](https://github.com/unicode-org/icu4x/) as its Unicode back end. If you wish to opt for different tradeoffs between correctness, run-time performance, binary size, compile time, and MSRV, please see the [README of the latest version of the `idna_adapter` crate](https://docs.rs/crate/idna_adapter/latest) for how to opt into a different Unicode back end.
+
## Breaking changes since 0.5.0

* Stricter IDNA 2008 restrictions are no longer supported. Attempting to enable them panics immediately. UTS 46 allows all the names that IDNA 2008 allows, and when transitional processing is disabled, they resolve the same way. There are additional names that IDNA 2008 disallows but UTS 46 maps to names that IDNA 2008 allows (notably, input is mapped to fold-case output). UTS 46 also allows symbols that were allowed in IDNA 2003 as well as newer symbols that are allowed according to the same principle. (Earlier versions of this crate allowed rejecting such symbols. Rejecting characters that UTS 46 maps to IDNA 2008-permitted characters wasn't supported in earlier versions, either.)
6 changes: 6 additions & 0 deletions idna/benches/all.rs
@@ -54,6 +54,11 @@ fn to_ascii_cow_plain(bench: &mut Bencher) {
    bench.iter(|| idna::domain_to_ascii_cow(black_box(encoded), idna::AsciiDenyList::URL));
}

+fn to_ascii_cow_hyphen(bench: &mut Bencher) {
+    let encoded = "hyphenated-example.com".as_bytes();
+    bench.iter(|| idna::domain_to_ascii_cow(black_box(encoded), idna::AsciiDenyList::URL));
+}
+
fn to_ascii_cow_leading_digit(bench: &mut Bencher) {
    let encoded = "1test.example".as_bytes();
    bench.iter(|| idna::domain_to_ascii_cow(black_box(encoded), idna::AsciiDenyList::URL));
@@ -99,6 +104,7 @@ benchmark_group!(
    to_ascii_simple,
    to_ascii_merged,
    to_ascii_cow_plain,
+    to_ascii_cow_hyphen,
    to_ascii_cow_leading_digit,
    to_ascii_cow_unicode_mixed,
    to_ascii_cow_punycode_mixed,