Support multi PEM parsing and speed up PEM parsing in general #39

dnadoba · 2023-09-12T14:46:34Z

Motivation

Trust roots on Ubuntu and other Linux distributions are stored in a single PEM file that contains multiple certificates. We want to support loading these in the future and therefore need to be able to parse multiple certificates from a single PEM file.

Modifications

Add PEMDocument.parseMultiple(pemString:), which returns an array of PEMDocuments.
Use the new multi-PEM parser for PEMDocument(pemString:) as well, which speeds up parsing and significantly reduces allocations.
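
To illustrate what parsing multiple PEM blocks from one string involves, here is a minimal sketch. It is not the implementation in this PR: SimplePEM, SimplePEMError and parseMultiplePEM are hypothetical names used only for this example, and it relies on Foundation's base64 decoder instead of the parser added here.

```swift
import Foundation

/// Hypothetical stand-in for a parsed PEM document (not the SwiftASN1 type).
struct SimplePEM: Equatable {
    var discriminator: String   // label between the markers, e.g. "CERTIFICATE"
    var derBytes: [UInt8]       // base64-decoded payload
}

enum SimplePEMError: Error {
    case missingEndMarker(String)
    case invalidBase64
}

/// Splits a string containing zero or more PEM blocks into separate documents.
func parseMultiplePEM(_ pemString: String) throws -> [SimplePEM] {
    var documents: [SimplePEM] = []
    var currentLabel: String? = nil
    var base64 = ""

    for rawLine in pemString.split(whereSeparator: \.isNewline) {
        let line = rawLine.trimmingCharacters(in: .whitespaces)
        if let label = currentLabel {
            if line == "-----END \(label)-----" {
                // End of a block: decode the base64 collected so far.
                guard let data = Data(base64Encoded: base64) else {
                    throw SimplePEMError.invalidBase64
                }
                documents.append(SimplePEM(discriminator: label, derBytes: Array(data)))
                currentLabel = nil
                base64.removeAll(keepingCapacity: true)
            } else {
                base64 += line
            }
        } else if line.hasPrefix("-----BEGIN "), line.hasSuffix("-----") {
            // Strip "-----BEGIN " (11 characters) and the trailing "-----" (5 characters).
            currentLabel = String(line.dropFirst(11).dropLast(5))
        }
        // Text outside BEGIN/END markers is ignored in this sketch.
    }
    if let label = currentLabel {
        throw SimplePEMError.missingEndMarker(label)
    }
    return documents
}

// Two tiny blocks with placeholder payloads (not real certificates).
let bundle = """
-----BEGIN CERTIFICATE-----
AQID
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
BAUG
-----END CERTIFICATE-----
"""
let documents = try parseMultiplePEM(bundle)
print(documents.count)
```

With this PR applied, the equivalent call on the real type would be PEMDocument.parseMultiple(pemString:), which returns an array of PEMDocuments as described above.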

Result

TL;DR: Parsing & decoding a PEM document is now ~5x faster and performs ~12x fewer mallocs. This allows us to parse the WebPKI trust roots from their PEM string representation to the Swift type Certificate from swift-certificates in under 5ms.

I ran a couple of benchmarks (Swift 5.8.1 on arm64 (M1 Max) in docker) that parse the 130 certificates found at /etc/ssl/certs on Ubuntu, 100 times in a loop, so each reported time covers 100 passes over all certificates.
The first test just parses the PEM String to a PEMDocument:

----------------------------------------------------------------------------------------------------------------------------
Parse WebPKI Roots from PEM to PEMDocument  metrics
----------------------------------------------------------------------------------------------------------------------------

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│          Time (wall clock) (ms)          │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│            asn1-1.0.0-beta.1             │    1008 │    1008 │    1008 │    1008 │    1008 │    1008 │    1008 │       1 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│               Current_run                │     204 │     205 │     206 │     208 │     212 │     212 │     212 │       5 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │    -804 │    -803 │    -802 │    -800 │    -796 │    -796 │    -796 │       4 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │      80 │      80 │      80 │      79 │      79 │      79 │      79 │       4 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│            Malloc (total) (K)            │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│            asn1-1.0.0-beta.1             │    1212 │    1212 │    1212 │    1212 │    1212 │    1212 │    1212 │       1 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│               Current_run                │      96 │      96 │      96 │      96 │      96 │      96 │      96 │       5 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │   -1116 │   -1116 │   -1116 │   -1116 │   -1116 │   -1116 │   -1116 │       4 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │      92 │      92 │      92 │      92 │      92 │      92 │      92 │       4 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

The second test parses the PEM String to a PEMDocument and subsequently decodes it as a Certificate:

----------------------------------------------------------------------------------------------------------------------------
Parse WebPKI Roots from PEM to Certificate metrics
----------------------------------------------------------------------------------------------------------------------------

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│          Time (wall clock) (ms)          │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│            asn1-1.0.0-beta.1             │    1311 │    1311 │    1311 │    1311 │    1311 │    1311 │    1311 │       1 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│               Current_run                │     477 │     477 │     479 │     481 │     481 │     481 │     481 │       3 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │    -834 │    -834 │    -832 │    -830 │    -830 │    -830 │    -830 │       2 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │      64 │      64 │      63 │      63 │      63 │      63 │      63 │       2 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

╒══════════════════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│            Malloc (total) (K)            │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞══════════════════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│            asn1-1.0.0-beta.1             │    1763 │    1763 │    1763 │    1763 │    1763 │    1763 │    1763 │       1 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│               Current_run                │     646 │     646 │     646 │     646 │     646 │     646 │     646 │       3 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│                    Δ                     │   -1117 │   -1117 │   -1117 │   -1117 │   -1117 │   -1117 │   -1117 │       2 │
├──────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│              Improvement %               │      63 │      63 │      63 │      63 │      63 │      63 │      63 │       2 │
╘══════════════════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛
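
For reference, the headline figures in the TL;DR can be reproduced from the p50 rows of the tables above. The short calculation below is only a sketch; it assumes each reported time covers all 100 loop iterations, and the constant names are made up for this example.

```swift
// p50 values copied from the tables above.
let iterations = 100.0
let parseTimeBeforeMs = 1008.0     // asn1-1.0.0-beta.1, PEM to PEMDocument
let parseTimeAfterMs = 206.0       // Current_run, PEM to PEMDocument
let parseMallocsBeforeK = 1212.0   // asn1-1.0.0-beta.1, PEM to PEMDocument
let parseMallocsAfterK = 96.0      // Current_run, PEM to PEMDocument
let certificateTimeAfterMs = 479.0 // Current_run, PEM to Certificate

// Ratio of old to new wall clock time: a little under 5.
let speedup = parseTimeBeforeMs / parseTimeAfterMs
// Ratio of old to new malloc count: a little over 12.
let mallocReduction = parseMallocsBeforeK / parseMallocsAfterK
// Time for one pass over all 130 certificates, in milliseconds.
let perPassMs = certificateTimeAfterMs / iterations

print(speedup, mallocReduction, perPassMs)
```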

I also ran a benchmark that uses the new PEMDocument.parseMultiple(pemString:) method:

Parse WebPKI Roots from multi PEM to PEMDocument 
╒════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                 │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) (K)     │      96 │      96 │      96 │      96 │      96 │      96 │      96 │       5 │
├────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (wall clock) (ms) │     209 │     211 │     213 │     214 │     219 │     219 │     219 │       5 │
╘════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

which is roughly the same as parsing each PEM individually:

Parse WebPKI Roots from PEM to PEMDocument 
╒════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                 │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) (K)     │      96 │      96 │      96 │      96 │      96 │      96 │      96 │       5 │
├────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (wall clock) (ms) │     205 │     205 │     205 │     206 │     207 │     207 │     207 │       5 │
╘════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

Note that macOS is even faster, likely because of the different base64 decode implementation:

Parse WebPKI Roots from PEM to PEMDocument 
╒════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                 │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) (K)     │      82 │      82 │      82 │      82 │      82 │      82 │      82 │       8 │
├────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (wall clock) (ms) │     135 │     135 │     135 │     136 │     137 │     137 │     137 │       8 │
╘════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

Parse WebPKI Roots from multi PEM to PEMDocument 
╒════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                 │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total) (K)     │      83 │      83 │      83 │      83 │      83 │      83 │      83 │       8 │
├────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (wall clock) (ms) │     138 │     138 │     139 │     139 │     143 │     143 │     143 │       8 │
╘════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

Lukasa · 2023-09-12T16:10:42Z