Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] make the asset bundler able to re-use assets #8882

Closed
2 tasks
misterjoshua opened this issue Jul 2, 2020 · 1 comment · Fixed by #8916 or #9576
Closed
2 tasks

[core] make the asset bundler able to re-use assets #8882

misterjoshua opened this issue Jul 2, 2020 · 1 comment · Fixed by #8916 or #9576
Assignees
Labels
@aws-cdk/core Related to core CDK functionality effort/small Small work item – less than a day of effort feature-request A feature should be added or improved. in-progress This issue is being actively worked on.

Comments

@misterjoshua
Copy link
Contributor

misterjoshua commented Jul 2, 2020

I'd like to request that the asset bundler re-use assets that it has already created if it's instructed to use a source asset hash type.

Use Case

As it stands, the bundler runs the bundling containers before it checks to see if the assets have already been emitted. This is a problem for me when assets are slow to bundle and when there are several stacks (say, one per deployment environment) in a single app. Having the bundler skip already-emitted assets could save a lot of build and test time. It would also tighten the dev loop, which is slowed by unit tests that drive any of the custom bundling assets that invoke Docker.

Proposed Solution

I'd suggest that the bundler containers not be re-run when the source asset hash matches an already-existing asset in cdk.out.

Other

Here's an example project that would benefit from reusing the bundled asset. The stacks run the bundler at least three times to generate the same lambda layer. https://github.com/misterjoshua/aws-cdk-bundle-reuse (And a github workflow)

I've defined this app:

const app = new cdk.App();

new BundlerStack(app, 'BundlerStack-Test');
new BundlerStack(app, 'BundlerStack-Stage');
new BundlerStack(app, 'BundlerStack-Production');

bundler-stack.ts:

export class BundlerStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const pythonDir = path.join(__dirname, 'python');

    const libLayer = new lambda.LayerVersion(this, 'LibLayer', {
      code: lambda.Code.fromAsset(pythonDir, {
        exclude: [ '*', '!requirements.txt' ],
        bundling: {
          image: lambda.Runtime.PYTHON_3_8.bundlingDockerImage,
          command: [ 'bash', '-c', `pip install -r requirements.txt -t /asset-output/python`],
        }
      })
    });

    new lambda.Function(this, 'FunOne', {
      code: lambda.Code.fromAsset(pythonDir, { exclude: [ '.venv' ] }),
      handler: 'lambda_one.handler',
      runtime: lambda.Runtime.PYTHON_3_8,
      layers: [ libLayer ],
    });

    new lambda.Function(this, 'FunTwo', {
      code: lambda.Code.fromAsset(pythonDir, { exclude: [ '.venv' ] }),
      handler: 'lambda_two.handler',
      runtime: lambda.Runtime.PYTHON_3_8,
      layers: [ libLayer ],
    });
  }
}

cdk list takes over a minute for me in this example. The bundler is running three separate pip install containers (one for each stack) even though the results should be identical, based on the source hash. I believe this long time to show a cdk list happens because the bundling occurs during the construction phase, rather than during synthesis.

  • 👋 I may be able to implement this feature request
  • ⚠️ This feature might incur a breaking change

This is a 🚀 Feature Request

@misterjoshua misterjoshua added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Jul 2, 2020
@github-actions github-actions bot added the @aws-cdk/core Related to core CDK functionality label Jul 2, 2020
@eladb
Copy link
Contributor

eladb commented Jul 5, 2020

Copy @jogold

misterjoshua added a commit to misterjoshua/aws-cdk that referenced this issue Jul 6, 2020
@SomayaB SomayaB added in-progress This issue is being actively worked on. and removed needs-triage This issue or PR still needs to be triaged. labels Jul 7, 2020
@eladb eladb added the effort/small Small work item – less than a day of effort label Jul 8, 2020
@mergify mergify bot closed this as completed in #8916 Jul 13, 2020
mergify bot pushed a commit that referenced this issue Jul 13, 2020
…8916)

This PR changes `AssetStaging` so that the bundler will re-use pre-existing output. Before, the bundler would re-run Docker without considering pre-existing assets, which was slow. Now, when handling a `SOURCE` hash type, the bundler detects and returns pre-existing asset output without re-running Docker. For all other hash types, the bundler outputs to a temp directory before calculating asset hashes.

Closes #8882 

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
misterjoshua added a commit to misterjoshua/aws-cdk that referenced this issue Jul 13, 2020
curtiseppel pushed a commit to curtiseppel/aws-cdk that referenced this issue Aug 11, 2020
…ws#8916)

This PR changes `AssetStaging` so that the bundler will re-use pre-existing output. Before, the bundler would re-run Docker without considering pre-existing assets, which was slow. Now, when handling a `SOURCE` hash type, the bundler detects and returns pre-existing asset output without re-running Docker. For all other hash types, the bundler outputs to a temp directory before calculating asset hashes.

Closes aws#8882 

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
mergify bot pushed a commit that referenced this issue Aug 26, 2020
…(revisited) (#9576)

This PR changes `AssetStaging` so that the bundler will re-use pre-existing output. Before, the bundler would re-run Docker without considering pre-existing assets, which was slow. Now, when handling a `SOURCE` hash type, the bundler detects and returns pre-existing asset output without re-running Docker. For all other hash types, the bundler outputs to an intermediate directory before calculating asset hashes, then renames the intermediate directory into its final location.

This PR revisits #8916 which originally closed #8882. Here are some details from the previous PR which have been addressed in this PR:

- The bundler now outputs directly into the assembly directory
- The bundler's assets can be reused between multiple syntheses 
- The bundler keeps output from failed bundling attempts for diagnosability purposes (renamed with an `-error` suffix)
- Bundler options are hashed together with custom and source hashes
- Removed the check for a docker run from `throws with assetHash and not CUSTOM hash type` as docker is no longer run before the AssetStaging props are validated.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/core Related to core CDK functionality effort/small Small work item – less than a day of effort feature-request A feature should be added or improved. in-progress This issue is being actively worked on.
Projects
None yet
3 participants