Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(core): natively support .dockerignore #10922

Merged
merged 33 commits into from
Nov 10, 2020
Merged

Conversation

blaenk
Copy link
Contributor

@blaenk blaenk commented Oct 17, 2020

Patterns in .dockerignore used to be interpreted as globs, which has different behavior from how .dockerignore is supposed to be interpreted. Specifically it was hard to properly ignore whole directories at once (such as a large node_modules).

This PR adds an ignoreMode flag which controls how the ignore patterns should be interpreted. The choices are:

  • IgnoreMode.GLOB - interpret as before
  • IgnoreMode.GIT - interpret as in .gitignore
  • IgnoreMode.DOCKER - interpret as in .dockerignore

Users might have dependencies on the current behavior, which is why the default ignoreMode depends on a feature flag: @aws-cdk/aws-ecr-assets:dockerIgnoreSupport (which is set to true for new projects) enables the new, correct Docker-ignore behavior by default.


Closes #10921

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@gitpod-io
Copy link

gitpod-io bot commented Oct 17, 2020

@blaenk blaenk changed the title feat(core): add CopyOptions.IgnoreMode feat(core): natively support .dockerignore Oct 17, 2020
Copy link
Contributor

@rix0rrr rix0rrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing all this hard work! Got a few minor comments on the implementation, but overall I like where this is going.

On the topic of trust: we've made changes here before and then accidentally broke people. How solidly do you feel about this implementation? And convince me how that's justified please? 😅

@@ -97,20 +96,30 @@ export class DockerImageAsset extends CoreConstruct implements assets.IAsset {
throw new Error(`Cannot find file at ${file}`);
}

let ignoreMode = props.ignoreMode || IgnoreMode.DOCKER;

if (ignoreMode != IgnoreMode.DOCKER) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this restriction? I am okay with the default being DOCKER, but can't we support the old ones as well in case people depend on it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I originally did this is because it feels plain wrong to process .dockerignore patterns with anything other than IgnoreMode.DOCKER.

This is the core thing that relates to the questions of backward compatibility, meaning, I'm not super confident how existing user-supplied exclude patterns will behave when flipped from the current IgnoreMode.GLOB (minimatch) to IgnoreMode.DOCKER, because both have different nuanced behavior.

So with this in mind, in the event that users may have become inadvertently dependent on the current broken behavior, maybe it is a good idea to still allow that old behavior as an option. Must we go even further and leave legacy as the default, with IgnoreMode.DOCKER as opt-in? It would be safest, though it would leave major out-of-the-box ergonomics on the table.

See 2f0b88f where I removed the restriction.

Another option is to default ignoreMode to the legacy IgnoreMode.GLOB if exclude was passed by the user but ignoreMode wasn't, since it likely means that they may be dependent on the legacy behavior.

Then again, maybe just letting users opt back into the legacy behavior if they run into issues is good enough. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went a little more into this in the comment below.

#10922 (comment)

Comment on lines 2 to 3
!Dockerfile
!.dockerignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove these two since they're implicit, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. f768aa1

import { nodeunitShim, Test } from 'nodeunit-shim';
import { IgnoreStrategy } from '../../lib/fs';

nodeunitShim({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I ask you to rewrite these to Jest tests and to use expect(...).toEqual(...) ? It'll generate better error messages.

In fact, I think I'd like to see tests of the form:

describe('globIgnorePattern', () => {

  test('excludes nothing by default', () => {
      const ignore = IgnoreStrategy.glob([]);
      expect(filterFiles(ignore, [
        'some/file/path',
      ]).toEqual([
        'some/file/path',
      ]);
  });

});

function filterFiles(strat: IgnoreStrategy, files: string[]) {
  return files.filter(file => !strat.ignores(file));
}

Feels like this would produce understandable outputs if the tests fail for some reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. See b47b404

* @param filePath file path to be assessed against the pattern
* @returns `true` if the file should be ignored
*/
public abstract ignores(filePath: string): boolean;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all of these ignore functions, I'd like very much to be very clear about the expectations of absoluteness/relativeness of these file paths.

It seems like right now you're implicitly expecting all paths to be relative to the directory that contains the ignore file.

Please be very clear about that, documentation-wise, and add assertions/checks wherever it makes sense: reject absolute paths, etc.

Or, add in a way to normalize paths which can handle both absolute and relative. Or, expect absolute and relativize before checking. As long as the contract is very very clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. The root path against which paths passed to ignores() were being relativized was an implicit dependency of an IgnoreStrategy.

See fe58be1

I made this an explicit dependency. All paths are expected to be absolute paths. This is documented and reflected in the variable naming. I also throw exceptions if relative paths are passed where absolute paths are expected (which is everywhere in this PR).

@mergify mergify bot dismissed rix0rrr’s stale review October 28, 2020 10:44

Pull request has been modified.

@blaenk
Copy link
Contributor Author

blaenk commented Oct 28, 2020

On the topic of trust: we've made changes here before and then accidentally broke people. How solidly do you feel about this implementation? And convince me how that's justified please? 😅

I went a little over this in the section on Backward Compatibility in the original post.

Note: To be clear, everything here that follows is specifically concerning DockerImageAsset when a user has explicitly supplied extra exclude patterns. Every other use of IgnoreMode defaults to the existing behavior (.GLOB), and a DockerImageAsset that has no exclude patterns will work out just fine since there's either nothing to exclude or the .dockerignore will be processed correctly with IgnoreMode.DOCKER (in the current state of the PR).

I am confident about the implementation because it's pretty much entirely delegating to packages that purely focus on implementing this behavior, compared to currently mashing .dockerignore patterns through minimatch.

I am not super confident about what would happen if we have user-supplied exclude patterns and they're suddenly implicitly flipped to be processed with IgnoreMode.DOCKER behavior when they were previously processed with IgnoreMode.GLOB (minimatch). This is not out of lack of confidence in the implementation, but because both GLOB (minimatch) and .dockerignore inherently have different pattern matching behavior.

The reason this matters is because ideally we would make IgnoreMode.DOCKER the default for DockerImageAsset for maximal out-of-the-box ergonomic gains and correctness, but that would mean the situation I just described for existing user code which may explicitly supply patterns through exclude to DockerImageAsset (specifically) where they would be evaluated differently under IgnoreMode.DOCKER.

The consequences of differing matching behavior is:

  1. files permitted that were previously ignored
  2. files ignored that were previously permitted

(On the flip side, it may help to keep in mind and consider that we're already witnessing something like this, where we're processing .dockerignore patterns with minimatch (what is now IgnoreMode.GLOB) which can and does produce different results from what would be expected.)

In the first case, this would mean that the resulting asset is a little bloated. (Are there other security concerns?)

In the second case, this would mean that some downstream process would be missing a file that it requires (e.g. to build the docker image). That is, downstream could break.

The options I see are:

  1. Make ignoreMode a required property (breaking change) to force people to read the docs and choose. Probably not an option if it's deemed too disruptive, but I do see CDK make breaking changes at times so maybe it's an option.

    However, adding a new required property to CopyOptions may be pretty invasive internally throughout CDK with code suddenly requiring this option to be supplied even when not relevant.

  2. Leave IgnoreMode.GLOB as the default behavior even in DockerImageAsset (it's currently left as the default everywhere else), leaving ergonomics gains on the table and requiring people to know that they can enable it to get those gains.

  3. Use a hybrid in DockerImageAsset where .dockerignore patterns are processed with IgnoreMode.DOCKER and user-supplied exclude patterns are processed with IgnoreMode.GLOB.

    At first glance this option doesn't seem trivial since DockerImageAsset only passes the CopyOptions through to assets.Staging, so this would require changes to assets.Staging probably to be able to directly pass it an IgnoreStrategy or something.

I think users, especially future users, would most benefit from having IgnoreMode.DOCKER as the default (the current state of the PR), since everything just works out naturally that way. 2f0b88f also removes the restriction where DockerImageAsset could only use IgnoreMode.DOCKER, so if users found that things broke, they could explicitly add ignoreMode: IgnoreMode.GLOB to get the legacy behavior (this can be added to the docs). Personally I think it doesn't make sense to process .dockerignore patterns with anything other than IgnoreMode.DOCKER, but it presents a compromise to allow users to opt into the legacy behavior if we go this route (.DOCKER default for DockerImageAsset) and things break for certain setups.

However, I recognize that you all may want to take a more conservative approach, in which case it seems like the second option is the only option.

I'll be thinking about the third option because it may get us the best of both worlds: default IgnoreMode.DOCKER for DockerImageAsset while continuing to process user-supplied exclude patterns with IgnoreMode.GLOB by default (unless ignoreMode is explicitly passed), requiring no intervention from the user.

Hopefully I've explained the situation well enough, but if not please let me know if you'd like some rephrasing or elaboration. I'm also happy to hear of other options that may not have occurred to me.

@rix0rrr
Copy link
Contributor

rix0rrr commented Nov 9, 2020

I think to be safe I'm going to add a feature flag to trigger the new behavior.


EDIT: Before we do that, I'm trying to reason through the consequences. The biggest issue I can see is how files that used to be present would now be missing, as that would lead users to deploying broken applications.

It's hard to predict whether or not this will break users, and it seems critical. The flag is probably the safe way to go.

@rix0rrr
Copy link
Contributor

rix0rrr commented Nov 9, 2020

@blaenk how do you feel about the current proposal?

@blaenk
Copy link
Contributor Author

blaenk commented Nov 10, 2020

The biggest issue I can see is how files that used to be present would now be missing, as that would lead users to deploying broken applications.

It's hard to predict whether or not this will break users, and it seems critical. The flag is probably the safe way to go.

Agreed. I don't know for sure that this can happen, but it seems like a possibility.

I agree with you that it's probably best to keep it as a feature flag, I totally forgot that CDK had that concept.

It seems like the best approach:

  • new projects will be able to reap the ergonomic benefits out of the box without having to enable it
  • existing users will remain with the behavior they've been using
  • existing users can to opt-into it if they know of it and want to

I think that's the best we can hope for, and actually seems reasonable to me.

@mergify
Copy link
Contributor

mergify bot commented Nov 10, 2020

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 860d048
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify
Copy link
Contributor

mergify bot commented Nov 10, 2020

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@mergify mergify bot merged commit cdb9942 into aws:master Nov 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[core] natively support .dockerignore
3 participants