Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-validation & true async issuance #17

Merged
merged 7 commits into from
Apr 13, 2017
Merged

Conversation

cpu
Copy link
Contributor

@cpu cpu commented Apr 5, 2017

Updates the README
5491a24 updates the README to include Chisel usage instructions
& reference the current-most draft.

Adds multi-validation and true async issuance.
7d615b5 splits the operation of the CA and VA across multiple
go-routines instead of doing all work on the WFE's handler routine.

The VA now performs multiple validations, sleeping a random amount of
time between each. This forces the ACME client to poll the
authorizations to learn when they have switched from pending to invalid
or valid. All validation attempts must succeed for the authorization to
be valid.

The WFE no longer communicates directly with the CA, instead when the VA
updates the last authorization on an order to valid status it invokes
the CA's CompleteOrder function in a separate goroutine. This will
complete the order & update it with a certificate pointer. The WFE uses
this pointer to construct the certificate URL as required.

Daniel added 2 commits March 23, 2017 10:49
This commit splits the operation of the CA and VA across multiple
go-routines instead of doing all work on the WFE's handler routine.

The VA now performs multiple validations, sleeping a random amount of
time between each. This forces the ACME client to poll the
authorizations to learn when they have switched from pending to invalid
or valid. All validation attempts must succeed for the authorization to
be valid.

The WFE no longer communicates directly with the CA, instead when the VA
updates the last authorization on an order to valid status it invoke's
the CA's CompleteOrder function in a separate goroutine. This will
complete the order & update it with a certificate URL.
@cpu cpu requested a review from jsha April 5, 2017 18:18
@cpu
Copy link
Contributor Author

cpu commented Apr 5, 2017

Note: CI is going to fail until Chisel is updated to handle multiple validations & async issuance.

Copy link
Contributor

@jsha jsha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mention locking for orders in the comments below. I think we also need locking for Authzs and Challenges. I noticed you flagged a race condition with a TODO, but I think it's worth handling the race conditions now, since they're hard to debug when they actually happen. Better to add the concurrency safety while we're thinking about it.

ca/ca.go Outdated
}

if order.Status != acme.StatusPending {
ca.log.Printf("Error: Asked to complete orrder %s is not status pending, was status %s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*order %s, which is not

ca/ca.go Outdated

ca.log.Printf("Order %s is fully authorized. Ready to issue", order.ID)
// Update the order to reflect that we're now processing it
order.Status = acme.StatusProcessing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to hold a lock on the order.

ca/ca.go Outdated
order.Status = acme.StatusProcessing

csr := order.ParsedCSR
domains := make([]string, len(csr.DNSNames))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to make a copy?

core/types.go Outdated
@@ -19,6 +19,7 @@ type Order struct {
ParsedCSR *x509.CertificateRequest
ExpiresDate time.Time
AuthorizationObjects []*Authorization
CertPathPrefix string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than embedding a public-facing path (WFE's responsibility) in the internal Order object, let's embed a certificate ID, and make WFE responsible for rendering that into a certificate path.

va/va.go Outdated
va.log.Printf("Pulled a task from the Tasks queue: %#v", task)
va.log.Printf("Starting %d validations.", concurrentValidations)

var wg sync.WaitGroup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This effectively serializes the tasks; If one authorization is validating, another authorization can't be validating at the same time. I think that might potentially mask some client bugs. Also, this loop seems a little long, and I think the goto may not be necessary.

What if instead, this was a very short loop:

for task := range va.tasks {
  go process(task)
}

And the bulk of this function could be moved into process(..).

va/va.go Outdated
// concurrent validations into a slice for processing.
for i := 0; i < concurrentValidations; i++ {
va.log.Printf("Reading a result from task.Results")
result := <-task.Results
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we wait on the WaitGroup above, task.Results is guaranteed to be exactly filled, and there is guaranteed to be no concurrent access. So there's no reason to make it a channel. It can just be a slice, and then you can range over it. Alternately you could get rid of the WaitGroup and use just the channel, which would allow you to surface errors without waiting for all validations to come in. Basically, we need just one of these concurrency primitives here.

Also, I'd recommend factoring this out into a firstError function or similar, that returns either an error or nil. That way you can call that function and have an if/else that very clearly handles both the error and non-error cases in one place.

if err != nil {
return err
result.Error = err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think switching this function from one that returns an error, to one that mutates its input to set an error, is a step backwards in clarity. We had similar code in Boulder, but moved towards the "return an error" approach.

Stepping back a bit, this is related to the choice to have a vaTask contain both input and output. What if we defined vaTask to be only the input to a task, and the output from a task to be a ValidationRecord? Or in other words, treat vaTask as immutable. This might be an argument to use a chan *ValidationRecord in place of a WaitGroup. The wg argument to performValidation could be replaced with a chan, and the switch statement could do ch <- va.validateHTTP01(task).

Daniel added 3 commits April 10, 2017 13:29
Based on review feedback this commit updates the CA to stash
a certificate ID in the internal order object instead of a path prefix.
The WFE then constructs a certificate URL for the certificate based on
the ID in the Order() handler. This keeps public facing paths as a WFE
responsibility.
This commit updates the core internal types to embed RWMutex's to
protect against concurrent reads/updates. The relevant wfe, db, va, and
ca functions were updated to acquire read/write locks as appropriate.

This commit also reworks parts of the CA and VA to have shorter, easier
to read functions.

The VA is updated to not serialize the validation work as it reads
vaTasks off of a channel. The WaitGroup was removed and results from
validation attempts are processed as they occur.
@cpu
Copy link
Contributor Author

cpu commented Apr 12, 2017

@jsha I believe all of your feedback is addressed, can you give this PR another pass when you have a chance? Thanks!

ca/ca.go Outdated
authz.RUnlock()
}

order.RLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to lock earlier, before you fetch order.AuthorixationObjects, Also I think a full lock would be fine; we're not worried about performance here, and there's no deadlock possibility, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on locking too late here. Thanks.

ca/ca.go Outdated
// Update the order to valid status and store a cert ID for the wfe to use to
// render the certificate URL for the order
order.Status = acme.StatusValid
order.CertID = cert.ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to store a copy of cert.ID? Why not use it directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Why not use it directly" -> You mean a pointer to the certificate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah; I wrote this comment before the earlier one. Is the certificate stored in memory separately from the order? If so, then yes a pointer to the certificate would be good. Otherwise, DER should work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, should have read the latest diffs before replying. This looks great.

ca/ca.go Outdated
order.Status = acme.StatusValid
order.Certificate = order.CertPathPrefix + cert.ID
ca.log.Printf("Order %s has Certificate %s", order.ID, order.Certificate)
order.CertID = cert.ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this field is only used to check whether a certificate is done, maybe it would be better to create a field that contains the cert DER, since we'll actually need that later. Otherwise, it's a bit confusing why there are two "ID" fields.

Daniel added 2 commits April 13, 2017 11:27
Per review feedback its strange to have two IDs when one is just used to
indicate that an order is complete. Instead the internal Order type
should hold a pointer to the internal Certificate type. The ID and DER
encoding can be read from that as required for delivering the order.
Per review feedback the `order.RLock()` on L38 was happening too late,
after the loop over the `order.AuthorizationObjects`. This commit moves
the lock earlier & simplifies by acquiring one write lock for the
duration of the function vs a read lock and a subsequent write lock.
@cpu
Copy link
Contributor Author

cpu commented Apr 13, 2017

@jsha Another round of feedback applied - thanks!

@jsha jsha merged commit f471a5b into master Apr 13, 2017
@jsha jsha deleted the cpu-go-go-gadget-goroutine branch April 13, 2017 22:44
mcpherrinm pushed a commit to mcpherrinm/pebble that referenced this pull request May 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants