Skip to content
This repository has been archived by the owner on Jun 13, 2019. It is now read-only.

[WIP] Add Data Fixtures for Lambda Event Types #31

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

naftulikay
Copy link
Contributor

Tests and documentation forthcoming. I am using these data structures for my serverless event bus in Rust using Crowbar.

My lambda handler looks something like this:

lambda!(|event, _context| {
    // dispatch event based on its type
    match serde_json::from_value::<Event>(event) {
        Ok(Event::Auth(event)) => Ok(to_value(&service::auth::route(&context, &event)).unwrap()),
        Ok(Event::CloudWatch(event)) => Ok(service::cloudwatch::route(&context, &event)),
        Ok(Event::Http(event)) => Ok(to_value(&service::http::route(&context, &event)).unwrap()),
        Ok(Event::Records(records)) => Ok(service::multi::route_all(&context, records.entries)),
        Ok(Event::Unknown(event)) => Ok(service::unknown::route(&context, &event)),
        Err(_) => {
            error!("Unable to convert event.");
            Ok(json!({ "statusCode": 400 ,"message": "Unable to convert event." }))
        }
    }
});

Thus, I do routing based on event type. For example, SNS:

pub fn route(context: &Context, record: &sns::Record) {
    info!("Received SNS Event: {}", record.event.message_id);

    match from_str(&record.event.message) {
        Ok(SnsEventKind::Ses(message)) => service::ses::receive(&context, &message),
        Ok(SnsEventKind::Unknown(value)) => warn!("Unknown JSON Event: {}", to_string_pretty(&value).unwrap()),
        Err(_) => error!("Unknown non-JSON event: {}", record.event.message),
    };
}

My multi-event router:

/// Dispatch multiple events.
pub fn route_all(context: &Context, events: Vec<Record>) -> Value {
    for event in events {
        match event {
            Record::S3(event_record) => s3::route(&context, &event_record),
            Record::Ses(event_record) => {
                // if it's an SES request/response, a response may be required, so only process one
                // of them if the type is RequestResponse
                match event_record.event.receipt.action {
                    Action::Lambda(ref action) if action.invocation_type == RequestResponse => {
                        return to_value(ses::filter(&context, &event_record)).unwrap();
                    },
                    _ => ses::route(&context, &event_record.event),
                };
            },
            Record::Sns(event_record) => sns::route(&context, &event_record),
            Record::Unknown(value) => {
                error!("Unknown event record type: {}", to_string_pretty(&value).unwrap())
            }
        };
    }

    Value::Null
}

This dispatching occurs in around 400-500 microseconds and works extremely well to make things type safe.

Obviously there's a lot here and it's not done. If @ilianaw or others have any input, RFC is open to comments 😃

Tests and documentation forthcoming. I am using these data
structures for my serverless event bus in Rust using Crowbar.
@iliana
Copy link
Owner

iliana commented Mar 28, 2018

I like this. :)

Can you add what has been done and what remains as checkboxes to the pull request description so we can see the path to this leaving a WIP state?

@naftulikay
Copy link
Contributor Author

🎉 so happy that you're welcoming this PR. I will create a separate issue and track things there. There are so many different data types to support, so this PR will maybe only cover a subset with future on the way, hope that's okay.

@iliana
Copy link
Owner

iliana commented Mar 28, 2018

Yeah, I don't have an issue with incremental development over time.

I'm hoping I'll have a chance to do a good read through of all of this soon.

@naftulikay
Copy link
Contributor Author

naftulikay commented Mar 28, 2018

@ilianaw can you please go turn on Travis CI for this project? I have updated with a build badge and Travis configuration for building and testing.

EDIT: Actually, I will submit Travis support in a separate PR.

@naftulikay
Copy link
Contributor Author

Latest adds support for CloudWatch AutoScaling events.

@softprops
Copy link
Contributor

I'd love to see this branch ship fast with additional followups to add other events types. Commented with extra thoughts on the checkbox ticket

#[derive(Serialize,Deserialize)]
pub struct HttpEvent {
pub body: Option<String>,
pub headers: Option<HashMap<String, String>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small thought. I did something similar initially in a separate project and found that the extra option type to be a bit annoying from a usability perspective. Espectially those new to rust

9 |     println!("{:?}", ev.query_string_parameters.and_then(|m| m.get("foo")));
  |     ---------------------------------------------------------^--------------
  |     |                                                        |          |
  |     |                                                        |          `m` dropped here while still borrowed
  |     |                                                        borrowed value does not live long enough
  |     borrowed value needs to live until here

Was't a very great experience.

I found a way to deserialize the nullable maps into default values greatly improved usability for new comers

as well as reduced verbosity while retaining the same level of rust runtime saftey.

 println!("{:?}", ev.query_string_parameters.and_then(|m| m.get("foo")))
println!("{:?}", ev.query_string_parameters.get("foo"));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's deserialize things like maps and vectors into default (empty) values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really think that kind of thing helps the ergonomics for people using this.

}

#[derive(Serialize,Deserialize)]
pub struct HttpEventRequestContext {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can largely get a way with a struct level #[serde(rename_all = "camelCase")] here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: comments like this are not really blockers for me and could easily fit into followups


impl HttpResponse {

pub fn empty(status: HttpStatus) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to convince myself to drop an existing external effort to add types to crowbar. Something I'd be missing here is a Default impl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

}
}

serde_aux_enum_number_declare!(HttpStatus {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd maybe suggest not going too overboard with type mappings here. I see crowbar being something as a low level mapping to raw primitive types, in this case a u16 would do. The reasoning behind this line of thinking is that you could then more easily adapt to and not duplicate efforts to create a typed interface like http on top of crowbar. As is you couldn't build a mapping to the http crate on top of this because if the incomplete mapping of http statuses. I'd much prefer the crowbar retain the most primitive types and enable layering of extra layers on top in another crate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So should we rely on another crate for this? I do actually like having strongly typed things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd actually not recommend adding any more crate dependencies on crowbar. I don't see crowbar as a framework but as an faithful representation of lambda. I'm thinking now it its really api gateway, the service, that defines what status codes you can dispatch lambdas on so that may actually be a more restrictive set. I redact my previous comment!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that a u16 would do for the HTTP status.

I experimented a lot with the http crate for another project involving Lambda@Edge. It helped with initial development but I eventually ripped it out because it was doing a lot of validation that CloudFront already does for you.

impl HttpEventRequestContext {

pub fn time(&self) -> Option<DateTime<Utc>> {
// Utc::datetime_from_str(&self.request_time, "").ok()e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could probably ditch a hard coupling of crowbar to a crate by letting it be a user decision to select a time parsing crate and let crowbar be smaller.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can we genericize this to work like that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we didn't want to represent this as a string that lambda events do, you could maybe use a feature + cfg. I don't feel really strongly about this. Only that I'd like to see crowbar have a solve only the minimal set of problems with a minimal set of dependencies. Motivations are that it lets users pick a crate they prefer for representing time typically so that they only have one crate for dealing with time in their projects ( think about what hyper did with tls ) and it frees up crowbar up from a bit of maintenance by not worrying falling behind on its dependencies. I also don't feel too strongly here but that's my two cents. I'd feel otherwise if rust had a std lib way of representing timestamps.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think genericizing the time crate is necessary to land this PR.


impl HttpEvent {

pub fn get_header(&self, key: &str) -> Option<&str> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some some growing community recommendations on avoiding get_ where you can https://rust-lang-nursery.github.io/api-guidelines/naming.html#getter-names-follow-rust-convention-c-getter

Instead a simple .header("Foo") will do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roger that.


impl Response {

pub fn from(action: ResponseAction) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not impl the From trait instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it 👍

@naftulikay
Copy link
Contributor Author

I will start breaking this apart into a PR per service supported, I think that's probably the most prudent course of action.

One thing that is probably a really good idea is to find a way to efficiently nest CloudWatch style events and deserialize them more efficiently than I'm currently doing.

@softprops
Copy link
Contributor

One thing that is probably a really good idea is to find a way to efficiently nest CloudWatch style events and deserialize them more efficiently than I'm currently doing.

Can you link to an example in this pull or sketch out in a comment what you mean?

@iliana
Copy link
Owner

iliana commented Apr 6, 2018

Overall I'm happy with this, and thanks to @softprops for the additional review.

I'm fine with continuing with this PR, or breaking it apart as a PR per service (the latter would be great as its easier for others to help write the structs out).

I have two thoughts:

  • What happens when AWS inevitably adds another field to these objects? Does serde correctly ignore extra fields?
  • Would it be reasonable to make all of these an enabled-by-default feature to allow folks who aren't using this to disable it and speed up compile times?

@softprops
Copy link
Contributor

What happens when AWS inevitably adds another field to these objects? Does serde correctly ignore extra fields?

Yes. This used to not be true but is now a default of serde.

@naftulikay
Copy link
Contributor Author

I apologize for my lateness, just getting back from vacation. I will get caught up this week.

@iliana
Copy link
Owner

iliana commented Apr 7, 2018

No worries! ^_^;

@naftulikay
Copy link
Contributor Author

TL;DR the events received from Lambda can come via a few sources:

  • SNS: a fairly consistent data type; you'd basically just deserialize the message property into the struct(s) that you expect.
  • CloudWatch: there are a lot (Create Data Mappings for Lambda Input and Output Events #32) of different CloudWatch data types.
  • API Gateway: requests, auth requests, and responses.

Therefore, our enums for representing all kinds of incoming events should basically include these three, and then users can disambiguate between the implementations (ie cloudwatch::BatchEvent, cloudwatch::ApiCallEvent, apigateway::Request, apigateway::AuthRequest, etc.)

I will probably break this PR down into tiny pieces and make one PR for each service to add. I wholeheartedly agree with removing Option<T> wherever possible and using the default for things like Vec<T> and BTreeMap<K, V>.

@softprops
Copy link
Contributor

Is there any way to help move this forward? I'd love to help

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants