[core] make the asset bundler able to re-use assets #8882

misterjoshua · 2020-07-02T23:14:45Z

I'd like to request that the asset bundler re-use assets that it has already created if it's instructed to use a source asset hash type.

Use Case

As it stands, the bundler runs the bundling containers before it checks to see if the assets have already been emitted. This is a problem for me when assets are slow to bundle and when there are several stacks (say, one per deployment environment) in a single app. Having the bundler skip already-emitted assets could save a lot of build and test time. It would also tighten the dev loop, which is slowed by unit tests that drive any of the custom bundling assets that invoke Docker.

Proposed Solution

I'd suggest that the bundler containers not be re-run when the source asset hash matches an already-existing asset in cdk.out.

Other

Here's an example project that would benefit from reusing the bundled asset. The stacks run the bundler at least three times to generate the same lambda layer. https://github.com/misterjoshua/aws-cdk-bundle-reuse (And a github workflow)

I've defined this app:

const app = new cdk.App();

new BundlerStack(app, 'BundlerStack-Test');
new BundlerStack(app, 'BundlerStack-Stage');
new BundlerStack(app, 'BundlerStack-Production');

bundler-stack.ts:

export class BundlerStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const pythonDir = path.join(__dirname, 'python');

    const libLayer = new lambda.LayerVersion(this, 'LibLayer', {
      code: lambda.Code.fromAsset(pythonDir, {
        exclude: [ '*', '!requirements.txt' ],
        bundling: {
          image: lambda.Runtime.PYTHON_3_8.bundlingDockerImage,
          command: [ 'bash', '-c', `pip install -r requirements.txt -t /asset-output/python`],
        }
      })
    });

    new lambda.Function(this, 'FunOne', {
      code: lambda.Code.fromAsset(pythonDir, { exclude: [ '.venv' ] }),
      handler: 'lambda_one.handler',
      runtime: lambda.Runtime.PYTHON_3_8,
      layers: [ libLayer ],
    });

    new lambda.Function(this, 'FunTwo', {
      code: lambda.Code.fromAsset(pythonDir, { exclude: [ '.venv' ] }),
      handler: 'lambda_two.handler',
      runtime: lambda.Runtime.PYTHON_3_8,
      layers: [ libLayer ],
    });
  }
}

cdk list takes over a minute for me in this example. The bundler is running three separate pip install containers (one for each stack) even though the results should be identical, based on the source hash. I believe this long time to show a cdk list happens because the bundling occurs during the construction phase, rather than during synthesis.

👋 I may be able to implement this feature request
⚠️ This feature might incur a breaking change

This is a 🚀 Feature Request

The text was updated successfully, but these errors were encountered:

eladb · 2020-07-05T05:58:23Z

Copy @jogold

…8916) This PR changes `AssetStaging` so that the bundler will re-use pre-existing output. Before, the bundler would re-run Docker without considering pre-existing assets, which was slow. Now, when handling a `SOURCE` hash type, the bundler detects and returns pre-existing asset output without re-running Docker. For all other hash types, the bundler outputs to a temp directory before calculating asset hashes. Closes #8882 ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*

…ws#8916) This PR changes `AssetStaging` so that the bundler will re-use pre-existing output. Before, the bundler would re-run Docker without considering pre-existing assets, which was slow. Now, when handling a `SOURCE` hash type, the bundler detects and returns pre-existing asset output without re-running Docker. For all other hash types, the bundler outputs to a temp directory before calculating asset hashes. Closes aws#8882 ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*

…(revisited) (#9576) This PR changes `AssetStaging` so that the bundler will re-use pre-existing output. Before, the bundler would re-run Docker without considering pre-existing assets, which was slow. Now, when handling a `SOURCE` hash type, the bundler detects and returns pre-existing asset output without re-running Docker. For all other hash types, the bundler outputs to an intermediate directory before calculating asset hashes, then renames the intermediate directory into its final location. This PR revisits #8916 which originally closed #8882. Here are some details from the previous PR which have been addressed in this PR: - The bundler now outputs directly into the assembly directory - The bundler's assets can be reused between multiple syntheses - The bundler keeps output from failed bundling attempts for diagnosability purposes (renamed with an `-error` suffix) - Bundler options are hashed together with custom and source hashes - Removed the check for a docker run from `throws with assetHash and not CUSTOM hash type` as docker is no longer run before the AssetStaging props are validated. ---- *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*

misterjoshua added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Jul 2, 2020

github-actions bot added the @aws-cdk/core Related to core CDK functionality label Jul 2, 2020

github-actions bot assigned eladb Jul 2, 2020

misterjoshua added a commit to misterjoshua/aws-cdk that referenced this issue Jul 6, 2020

Allow the bundler to re-use assets aws#8882

cee16cf

misterjoshua mentioned this issue Jul 6, 2020

chore(core): allow the bundler to re-use pre-existing bundler output #8916

Merged

SomayaB added in-progress This issue is being actively worked on. and removed needs-triage This issue or PR still needs to be triaged. labels Jul 7, 2020

eladb added the effort/small Small work item – less than a day of effort label Jul 8, 2020

mergify bot closed this as completed in #8916 Jul 13, 2020

misterjoshua added a commit to misterjoshua/aws-cdk that referenced this issue Jul 13, 2020

Allow the bundler to re-use assets aws#8882

d57dd76

jogold mentioned this issue Jul 14, 2020

POC: upgrade to constructs 4.x #8962

Closed

jogold mentioned this issue Aug 3, 2020

[@aws-cdk/aws-lambda-python] Support caching of built assets #9406

Closed

2 tasks

misterjoshua mentioned this issue Aug 10, 2020

chore(core): allow the bundler to re-use pre-existing bundler output (revisited) #9576

Merged

maoosi mentioned this issue Aug 12, 2020

bundling - multiple docker image builds #9424

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[core] make the asset bundler able to re-use assets #8882

[core] make the asset bundler able to re-use assets #8882

misterjoshua commented Jul 2, 2020 •

edited

Loading

eladb commented Jul 5, 2020

[core] make the asset bundler able to re-use assets #8882

[core] make the asset bundler able to re-use assets #8882

Comments

misterjoshua commented Jul 2, 2020 • edited Loading

Use Case

Proposed Solution

Other

eladb commented Jul 5, 2020

misterjoshua commented Jul 2, 2020 •

edited

Loading