-
-
Notifications
You must be signed in to change notification settings - Fork 608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SA: Add and use revokedCertificates table #7095
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(partial review)
`revokedDate` datetime NOT NULL, | ||
`revokedReason` int(11) NOT NULL, | ||
PRIMARY KEY (`id`), | ||
KEY `notAfter_shardIdx_idx` (`notAfter`, `shardIdx`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want this index in the opposite order, since shardIdx
has a bunch of discrete values, while notAfter
is more continuous.
Think of it like a phone book: if you sorted by last name, and then by millisecond of someone's birth, that might be a useful disambiguator. But if you sorted by millisecond of someone's birth, and then by last name, the last name would never be useful because everyone would just be in order by birth.
KEY `notAfter_shardIdx_idx` (`notAfter`, `shardIdx`) | |
KEY `shardIdx_notAfter_idx` (`shardIdx`, `notAfter`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, done! I also wonder if I want to have issuerID
in here, since that is part of the query I've built.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some conversation with @jcjones I learned that even if the DATETIME
comes second, it can still cause the index to be too big, which could cause the query planner to go to disk instead of using the index. For our purposes, truncating this value to an hour would be sufficient since we really only care about when we get to stop including a row in CRLs (or serving 404 for OCSP).
And yeah putting the issuerID
in there is a good idea IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cardinality of the index is highly important, but I don't know for certain which ordering here is best, and I suspect the only way to find out will be to run synthetic benchmarks with a few million rows in a MariaDB.
That said, indexes on full-precision datetime
types are problematical. We do it everywhere, and everywhere it hurts performance. We need to start doing better, and for a fresh table with need for such an index, we should draw a line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest patchset replaces the notAfter
column with notAfterHour
. It rounds the notAfter up at insert time to ensure we don't think certs expire before they actually do, and conversely rounds the notAfter down at select time, to ensure we're getting more certs than we strictly need to.
I've also added the issuerID to the index, so the index now covers all three columns used in the select's WHERE clause, in order of least-to-most-specific: issuerID cuts the selection space by 1/2, shardIdx cuts the selection space by 1/128, and notAfterHour cuts the selection space by 1/(24*90).
atTime := time.Unix(0, req.RevokedBeforeNS) | ||
|
||
clauses := ` | ||
WHERE issuerID = ? | ||
AND shardIdx = ? | ||
AND notAfter >= ?` | ||
AND notAfterHour >= ?` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think technically this can become a >
now. Imagine certificate XYZ, with a NotAfter
of 13:00. It would be stored in revokecCertificates
with a notAfterHour
of 14:00. So if ExpiresAfterNS
is 13:00, notAfterHour > 13:00
would include certificate XYZ.
Perhaps it's just worth being over-inclusive here in case we got the math wrong, but I wanted to write it down to make sure I understand correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I had the same thought and agree with your math, but chose to leave it >= as a "more obviously correct" measure. I like both halves (inserting and selecting) being obviously correct on their own, rather than only correct in combination with each other.
Add a new "revokedCertificates" table to the database schema. This table is similar to the existing "certificateStatus" table in many ways, but the idea is that it will only have rows added to it when certificates are revoked, not when they're issued. Thus, it will grow many orders of magnitude slower than the certificateStatus table does. Eventually, it will replace that table entirely.
The one column that revokedCertificates adds is the new "ShardIdx" column, which is the CRL shard in which the revoked certificate will appear. This way we can assign certificates to CRL shards at the time they are revoked, and guarantee that they will never move to a different shard even if we change the number of shards we produce. This will eventually allow us to put CRL URLs directly into our certificates, replacing OCSP URLs.
Add new logic to the SA's RevokeCertificate and UpdateRevokedCertificate methods to handle this new table. If these methods receive a request which specifies a CRL shard (our CRL shards are 1-indexed, so shard 0 does not exist), then they will ensure that the new revocation status is written into both the certificateStatus and revokedCertificates tables. This logic will not function until the RA is updated to take advantage of it, so it is not a risk for it to appear in Boulder before the new table has been created.
Also add new logic to the SA's GetRevokedCertificates method. Similar to the above, this reads from the new table if the ShardIdx field is supplied in the request message. This code will not operate until the crl-updater is updated to include this field. We will not perform this update for a minimum of 100 days after this code is deployed, to ensure that all unexpired revoked certificates are present in the revokedCertificates table.
Part of #7094