google / addlicense

A program which ensures source code files have copyright license headers by scanning directory patterns recursively

Ignore subdirectories pattern

konsalex opened this issue

Hey folks,

Is there any way to ignore subdirectories based on the directory name? We currently have a lerna monorepo and I find it difficult to exclude multiple node_modules folders.

Ideally I would use it like this:

addlicense -c "Neo4j Inc." -f license.txt -check ./**/*.{js,ts,css} -ignore **/node_modules/*

but it doesn't seem to work.

The directories also are in the following tree structure

.
├── CONTRIBUTING.md
├── README.md
├── lerna.json
├── license.js
├── license.txt
├── node_modules
├── package.json
├── packages
│   ├── base
│   │   ├── node_modules
│   │   ├── package.json
│   │   ├── src
│   ├── html-storybook
│   │   ├── node_modules
│   │   ├── package.json
│   ├── react
│   │   ├── node_modules
│   │   ├── package.json
│   │   ├── src
│   │   └── tsconfig.json
│   └── react-storybook
│       ├── node_modules
│       ├── package.json
│       └── src
├── tsconfig.json
└── yarn.lock

I know this is an old issue, so chances are it was solved long ago. However, in case anyone else lands on this issue looking for a solution:

Not exactly node_modules, but when I want to ignore everything in my submodules folder I use the pattern submodules/**/* and that appears to do what you're describing.

I think doing **/node_modules/* means "starting from any directory, in a folder called node_modules, any file inside it", whereas what you want is **/node_modules/**, which translates to "starting from any directory, in a folder called node_modules, any directory inside it, any file inside any directory inside it".

I will try this, @braydonk, but as the doublestar team mentioned, they do not support negative matching at the moment.

I think I already tried the pattern **/node_modules/**, but I will give it another go soon.

@braydonk My hypothesis is that it does not work, based on the result of executing: addlicense -c "Neo4j Inc." -f license.txt -check ./**/*.{js,ts,css} -ignore **/node_modules/**

zsh: argument list too long: addlicense

I made a quick test project that only installed express and made a hello world index.js.

I got the command to run with a couple changes:

braydonk@bk:~/Git/test-node$ addlicense -check -ignore "**/node_modules/**" -c "Neo4j Inc." -f license.txt -check ./**/*.{js,ts,css} .
2022/08/27 11:53:57 skipping: node_modules/ipaddr.js/LICENSE
2022/08/27 11:53:57 skipping: node_modules/ipaddr.js/README.md
2022/08/27 11:53:57 skipping: node_modules/ipaddr.js/ipaddr.min.js
2022/08/27 11:53:57 skipping: node_modules/ipaddr.js/lib/ipaddr.js
2022/08/27 11:53:57 skipping: node_modules/ipaddr.js/lib/ipaddr.js.d.ts
2022/08/27 11:53:57 skipping: node_modules/ipaddr.js/package.json
...

2 things I needed to fix:

  • Putting quotes around the ignore path
  • Telling it which directory to use (see the . at the end)

The need to quote the ignore path could use some documentation, and probably some usage examples.

  • putting quotes around the ignore path

Yeah, this is one thing about ignore that I really don't love, but there's also little we can do about it. A quoted and an unquoted ignore value have completely different meanings: an unquoted value is expanded by your shell before addlicense ever sees it, while a quoted value is passed through as-is to doublestar. I recall there being another issue where someone ran into this, but I can't find it. I just noticed the readme file uses unquoted values in its examples and doesn't discuss this issue at all, which I'll work on fixing.
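To make the quoted/unquoted difference concrete, here is a minimal sketch that does not involve addlicense at all. The count_args helper and the temporary demo directory are made up for illustration; the point is only how many arguments a command receives in each case.

```shell
# Hypothetical helper: print how many arguments it was called with.
count_args() { echo "$#"; }

# Throwaway demo tree with three files inside node_modules.
demo=$(mktemp -d)
mkdir -p "$demo/node_modules"
touch "$demo/node_modules/a.js" "$demo/node_modules/b.js" "$demo/node_modules/c.js"

# Unquoted glob: the shell expands it into one argument per matching path.
unquoted=$(count_args "$demo"/node_modules/*)

# Quoted glob: the command receives the literal pattern as a single argument.
quoted=$(count_args "$demo/node_modules/*")

echo "unquoted=$unquoted quoted=$quoted"
```

With an unquoted -ignore value, only the first expanded path ends up as the value of the flag, and the remaining paths are treated as ordinary arguments.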

@braydonk did you include nested packages with node_modules?

I executed the command you pasted: addlicense -check -ignore "**/node_modules/**" -c "Neo4j Inc." -f license.txt -check ./**/*.{js,ts,css,tsx} . but still received zsh: argument list too long: addlicense.

Tried to put the match pattern ./**/*.{js,ts,css,tsx} in quotes but then yml, sh and other file types were matched.

Also tried matching with ls ./**/*.{js,ts,css,tsx} and received the same "argument list too long" error.

To reproduce, a similar yarn workspace is here: https://github.com/konsalex/enterprise-design-system-course/tree/l6-3
Cloning it and running yarn is enough to set up a test case.
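The fact that plain ls fails the same way suggests the error comes from the shell and not from addlicense: the unquoted ./**/*.{js,ts,css,tsx} is expanded before the program starts, so every matching file under every node_modules becomes a command-line argument, and -ignore never gets a chance to filter anything. A rough sketch of how to gauge this, using a made-up demo tree (on a real repo you would run the find from the repo root):

```shell
# System limit on the combined size of arguments and environment, if available.
limit=$(getconf ARG_MAX 2>/dev/null || echo unknown)

# Throwaway demo tree: one source file and one dependency file.
demo=$(mktemp -d)
mkdir -p "$demo/packages/react/node_modules/dep"
touch "$demo/packages/react/app.js" "$demo/packages/react/node_modules/dep/index.js"

# How many paths a recursive *.js glob would expand to, node_modules included.
matches=$(find "$demo" -type f -name '*.js' | wc -l | tr -d ' ')

# Approximate total size in bytes of those paths when passed as arguments.
bytes=$(find "$demo" -type f -name '*.js' | wc -c | tr -d ' ')

echo "matches=$matches bytes=$bytes limit=$limit"
```

If bytes approaches limit in a real monorepo, any command given that glob will fail regardless of its own flags.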

It matched everything because you also had . at the end of your command; that's the directory to search, and since the .yml and .sh paths aren't ignored, they were found there. Probably buried in the output of the command was an error like this:

2022/08/29 11:33:39 ./**/*.{js,ts,css,tsx} error: lstat ./**/*.{js,ts,css,tsx}: no such file or directory

That's because the implementation uses filepath.Walk when searching for files, and doublestar globs only for ignores. This means you can't call the command with only the extensions you want; instead, you have to ignore everything you don't want. You'll need to do something like this:

addlicense -check -ignore "**/node_modules/**" -ignore "**/*.{yml}" -c "Neo4j Inc." -f license.txt .

However it always matches the .yarnrc.yml with everything I tried, so I'm not sure if that might be a bug or not.
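Another possible workaround, sketched here under the assumption that addlicense accepts individual file paths as arguments, is to let find select the files and skip node_modules entirely. The demo tree below is made up to mirror the layout in this issue, and the final xargs line is left as a comment since it has not been tested against the real repo.

```shell
# Throwaway demo tree mirroring the monorepo layout from this issue.
demo=$(mktemp -d)
mkdir -p "$demo/packages/react/src" \
         "$demo/packages/react/node_modules/dep" \
         "$demo/node_modules/dep"
touch "$demo/packages/react/src/App.ts" \
      "$demo/packages/react/src/style.css" \
      "$demo/packages/react/node_modules/dep/index.js" \
      "$demo/node_modules/dep/index.js" \
      "$demo/.yarnrc.yml"

# Prune every node_modules directory, then print only the wanted extensions.
files=$(cd "$demo" && find . -type d -name node_modules -prune -o \
    -type f \( -name '*.js' -o -name '*.ts' -o -name '*.tsx' -o -name '*.css' \) \
    -print | sort)

echo "$files"

# Assumed usage (untested here): xargs batches the paths, which also avoids
# the argument list limit.
# find . -type d -name node_modules -prune -o -type f \( -name '*.js' -o -name '*.ts' \) -print0 \
#   | xargs -0 addlicense -check -c "Neo4j Inc." -f license.txt
```

This sidesteps both the shell expansion and the need to enumerate every unwanted extension in -ignore flags.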

I believe there is a bug or discussion somewhere about using doublestar for searching as well. I can't remember what the blocker at the time was 😕

Maybe a bug, not sure @braydonk, but at least I have a workaround that works.

In any case thanks for the help, @braydonk & @willnorris 😄