amqp-node / amqplib

AMQP 0-9-1 library and client for Node.JS

Home Page: https://amqp-node.github.io/amqplib/

Duplicate packages detected in the amqplib project on Tag: v0.8.0

mahirkabir opened this issue · comments

Issue: We say a project has duplicated dependencies if any package dependency occurs multiple times in the dependency tree. After analyzing the dependency tree, we have detected duplicate packages in your project.
minimatch
source-map
debug
glob
mkdirp
uglify-js
wordwrap

Questions: We are conducting a research study on the duplicated package dependencies in JS projects. We were curious:

  1. Will you remove the duplicates mentioned above? (Yes/No), and why?:
  2. Do you have any additional comments? (If so, please write it down):

For any publication or research report based on this study, we will share all responses from developers in an anonymous way. Both your projects and personal information will be kept confidential.

Rationale: When a JS application depends on too many packages or on multiple versions of the same package, its attack surface can grow dramatically; hackers can get a higher chance of successfully exploiting the vulnerabilities inside those packages (or versions), and escalating the potential damage. The unnecessary and duplicated dependencies can also make JS projects bloat and lead to extra memory/computation overhead. Therefore, JS application developers are recommended to remove unused and duplicated packages from their projects, in order to eliminate the security risks unnecessarily incurred by those dependencies.

Steps to reproduce:

  • Execute the “npm ls --all” command to print the dependency tree of the project containing all the libraries and their corresponding versions
  • Check if any library exists more than once in the tree
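The two steps above can be sketched in Node.js. This is a minimal illustration, not the tooling used by the study: it assumes the tree has the nested `dependencies` / `version` shape produced by `npm ls --all --json`, and the helper names (`collectVersions`, `findDuplicates`) and the sample tree are made up for the example. It counts every occurrence of a package name, so entries that npm has already hoisted would still be reported.

```javascript
// Walk a dependency tree and record every version seen for each package name.
// Assumes each node looks like { version, dependencies: { name: node, ... } }.
function collectVersions(node, seen = new Map()) {
  for (const [name, dep] of Object.entries(node.dependencies || {})) {
    if (!seen.has(name)) seen.set(name, []);
    seen.get(name).push(dep.version);
    collectVersions(dep, seen);
  }
  return seen;
}

// Keep only the packages that occur more than once in the tree.
function findDuplicates(tree) {
  const duplicates = {};
  for (const [name, versions] of collectVersions(tree)) {
    if (versions.length > 1) duplicates[name] = versions;
  }
  return duplicates;
}

// Hypothetical sample tree, standing in for parsed `npm ls --all --json` output.
const sampleTree = {
  name: 'example',
  dependencies: {
    mocha: { version: '9.2.2', dependencies: { glob: { version: '7.2.0' } } },
    nyc: {
      version: '15.1.0',
      dependencies: { glob: { version: '7.2.0' }, rimraf: { version: '3.0.2' } },
    },
  },
};

console.log(findDuplicates(sampleTree)); // → { glob: [ '7.2.0', '7.2.0' ] }
```

For a real project, the sample tree could be replaced with the parsed contents of a file created by `npm ls --all --json > tree.json`.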

Suggested Solution: Execute the “npm dedupe” command to reduce the number of duplicate packages, or manually modify the package.json files

Resource:
https://docs.npmjs.com/cli/v7/commands/npm-dedupe

Hi @mahirkabir,

Thank you for raising the issue and providing the additional information / links about duplicate packages.

You mention that this is an issue with v0.8.0, which is now quite an old version of amqplib. Can you advise whether this is also a problem with the current version of amqplib please?

Hi @cressie176, thank you for getting back to me. I ran a similar experiment for the current version of the project, and found the following duplicates:
debug
ms
minimatch
supports-color
ansi-styles
color-convert
color-name
camelcase
argparse
safe-buffer
decamelize
find-up
locate-path
p-locate
p-limit
which
glob
chalk
escape-string-regexp
has-flag
cliui
wrap-ansi
y18n
source-map

Hope this information helps.

Thanks again!

Thanks @mahirkabir,

Here's the output from npm dedupe

added 1 package, removed 10 packages, changed 9 packages, and audited 246 packages in 1s

I ran a diff of the node_modules folder before and after running npm dedupe and npm ci.

50d49
< bitsyntax/node_modules/safe-buffer 5.1.2
76d74
< cross-spawn/node_modules/which 2.0.2
132d129
< istanbul-lib-report/node_modules/has-flag 4.0.0
137d133
< istanbul-lib-source-maps/node_modules/source-map 0.6.1
187,188d182
< nyc/node_modules/glob 7.2.0
< nyc/node_modules/minimatch 3.1.2
216,217d209
< rimraf/node_modules/glob 7.2.0
< rimraf/node_modules/minimatch 3.1.2
226c218
< spawn-wrap/node_modules/which 2.0.2
---
> source-map 0.6.1
235d226
< test-exclude/node_modules/glob 7.2.0

It looks like npm has reorganised the dependencies, slightly reducing the install footprint, but all of the "removed" packages still exist in the dependency tree, just in higher-up locations. Therefore npm dedupe doesn't appear to reduce the attack surface, since any vulnerable packages would still be present. Do you agree?
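To make the distinction concrete, here is a rough sketch that separates the number of installed copies from the number of distinct name@version pairs. The `summarise` helper and both entry lists are hypothetical, loosely modelled on the glob lines in the diff above; they are not the actual before and after listings.

```javascript
// Summarise a list of installed packages, each given as an install path
// relative to node_modules plus a version.
function summarise(entries) {
  const distinct = new Set(
    entries.map(({ path, version }) => {
      // The package name is whatever follows the last "node_modules/" segment.
      const name = path.split('node_modules/').pop();
      return `${name}@${version}`;
    })
  );
  return { copies: entries.length, distinctVersions: distinct.size };
}

// Illustrative data: nested copies before dedupe, one hoisted copy after.
const before = [
  { path: 'glob', version: '7.2.0' },
  { path: 'nyc/node_modules/glob', version: '7.2.0' },
  { path: 'rimraf/node_modules/glob', version: '7.2.0' },
];
const after = [{ path: 'glob', version: '7.2.0' }];

console.log(summarise(before)); // → { copies: 3, distinctVersions: 1 }
console.log(summarise(after)); // → { copies: 1, distinctVersions: 1 }
```

In this sketch the copy count drops while the set of distinct versions stays the same, which is the pattern the diff seems to show.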

@cressie176 Isn't the idea to reduce the number of occurrences, as opposed to flat-out removing dependencies? (as they are still used)

Hi @cressie176, as @kibertoad pointed out, the idea is to reduce unnecessary occurrences. The libraries still need to be there.

@kibertoad I suspect this is what npm dedupe is aiming for, but one of the benefits in the OP rationale was that it would reduce attack surface.

Rationale: When a JS application depends on too many packages or on multiple versions of the same package, its attack surface can grow dramatically; hackers can get a higher chance of successfully exploiting the vulnerabilities inside those packages (or versions), and escalating the potential damage.

It doesn't do this, since even though the packages are deduped, the same vulnerabilities will still exist.

What would have reduced the attack surface is if npm dedupe had an option to squash minor and patch versions of the same module, since these should be semantically compatible. I wouldn't trust it though, since module authors occasionally publish incompatible changes without a major version bump.

Idea is that the less copies of same dependencies you have, the easier it is to keep them up-to-date, I believe. Either way it seems like a zero cost improvement, why not do it?

I wouldn't trust it though, since module authors occasionally publish incompatible changes without a major version bump.

Proper test suites would catch majority of regressions, though.

Idea is that the less copies of same dependencies you have, the easier it is to keep them up-to-date, I believe.

How does deduping the transitive dependencies make it easier to keep them up-to-date?

Either way it seems like a zero cost improvement, why not do it?

I'm not saying we shouldn't implement it. Reducing the install footprint is worthwhile by itself.

Proper test suites would catch majority of regressions, though

It would help, but I still wouldn't trust it.

How does deduping the transitive dependencies make it easier to keep them up-to-date?

Less entries in package-lock that can get outdated?

Less entries in package-lock that can get outdated?

I don't think it helps. Since npm dedupe doesn't squash compatible packages, the number of outdated dependencies would still be the same. For example, if module A and module B both depend on module C@1.1.1, and C@1.1.1 is superseded by C@1.1.2, both module A and module B still need to be updated.

This may be automated using something like npm audit fix but that would still work if the packages hadn't been deduped.

Once again, I'm not saying that dedupe isn't useful for reducing the install footprint, just that I don't see that it provides a material benefit to security.

Yup, makes sense!

I've created #702 to discuss how we resolve duplicates.

To answer the two questions in the OP

  1. Will you remove the duplicates mentioned above? (Yes/No), and why?:
  • Yes, because it will reduce the install footprint, memory footprint and CPU cycles. I have no idea what the combined impact would be if every popular package on npm did the same, but I suspect it would be significant.

  • Maybe. @kibertoad pointed out that package-lock.json gets ignored when amqplib is installed by other libraries, so I'm not sure there is any point in deduping, or even committing package-lock.json!

  2. Do you have any additional comments? (If so, please write it down):
  • The number of duplicates, and therefore I suspect the number of transitive dependencies, has grown considerably between amqplib@0.8.0 and amqplib@latest. If this trend is common across other modules, then dependency trees and the number of duplicates are increasing generally. This would not surprise me, as there is a rarely challenged belief that using 3rd party code is always more efficient, reliable and secure than writing your own. This view certainly has merit; however, as modules mature, they can be subject to feature bloat, which increases the attack surface without adding significant value. An example of this is log4shell. Anecdotally, libraries like nyc, eslint and mocha also seem to suffer from this, and frequently trigger audit warnings. In some cases, selecting simpler libraries with fewer features and fewer dependencies may be a better option.
  • Unless I have misunderstood something, your statement "When a JS application depends on too many packages or on multiple versions of the same package, its attack surface can grow dramatically; hackers can get a higher chance of successfully exploiting the vulnerabilities inside those packages (or versions), and escalating the potential damage" is true but misleading in this context since npm dedupe does not appear to materially reduce this attack surface, because it does not reduce the number of versions of a package that are installed, and because package-lock.json is ignored.
  • I assume you have opened similar issues in other repositories, so if I am correct (that npm dedupe does not solve the problem you are presenting), but other maintainers have not remarked on this, then it suggests a degree of naivety from the community at large, which is interesting in and of itself.

@cressie176 thank you so much for taking time to share your thoughts on this. I must say, it helped me a lot in understanding the thought process of a developer.

Thank you both for this thread!

Hi @cressie176, thank you again for taking the time to share your views on npm dedupe. However, I would like to clarify something related to what you said in the last comment:

since npm dedupe does not appear to materially reduce this attack surface, because it does not reduce the number of versions of a package that are installed

If you take a look at the npm-dedupe documentation, you will see that it claims npm dedupe does reduce the number of versions of a package that are installed.

In the documentation's example, two versions of c (c@1.0.3 and c@1.0.10) are reduced to one, c@1.0.10.

Please let me know if this information helps. Also, please let me know if you have questions.

Thanks @mahirkabir, I had missed that in the documentation. I'll run another test and see if this is indeed the case. However, it won't actually have any impact on amqplib's user base, since package-lock.json is excluded from the files published to npm.

Hi @cressie176. I understand. Thank you again for getting back to me. Let me know if you have more questions.

For completeness, npm can squash libraries with compatible versions. If I install glob@7.2.3 into amqplib directly, then run npm dedupe, it upgrades the glob packages introduced through transitive dependencies where the version specification allows.

Before npm i glob@^7.2.0

amqplib@0.10.3
├─┬ mocha@9.2.2
│ └── glob@7.2.0
└─┬ nyc@15.1.0
  ├── glob@7.2.0
  ├─┬ rimraf@3.0.2
  │ └── glob@7.2.0
  └─┬ test-exclude@6.0.0
    └── glob@7.2.0

After npm i glob@^7.2.0

amqplib@0.10.3 /Users/personal/Development/amqp.node/amqplib
├── glob@7.2.3
├─┬ mocha@9.2.2
│ └── glob@7.2.0
└─┬ nyc@15.1.0
  ├── glob@7.2.3 deduped
  ├─┬ rimraf@3.0.2
  │ └── glob@7.2.3 deduped
  └─┬ test-exclude@6.0.0
    └── glob@7.2.3 deduped

The only reason mocha's glob package wasn't upgraded is because the package.json depends on "7.2.0" exactly.
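The difference between an exact pin and a range can be illustrated with a small sketch. The `satisfies` function below is a deliberately simplified stand-in written for this example, not the semver library that npm uses: it only understands exact versions and caret ranges with a major version of 1 or higher, and the `^7.2.0` range is an assumed example rather than the range any of the packages above actually declares.

```javascript
// Split "7.2.3" into [7, 2, 3].
function parse(version) {
  return version.split('.').map(Number);
}

// Simplified range check: supports exact versions ("7.2.0") and caret
// ranges ("^7.2.0") for major versions >= 1 only. Real semver ranges have
// many more forms (tilde, x-ranges, prereleases, 0.x caret rules, ...).
function satisfies(version, range) {
  if (!range.startsWith('^')) return version === range;
  const [major, minor, patch] = parse(version);
  const [rMajor, rMinor, rPatch] = parse(range.slice(1));
  if (major !== rMajor) return false; // caret never crosses a major version
  if (minor !== rMinor) return minor > rMinor;
  return patch >= rPatch;
}

console.log(satisfies('7.2.3', '^7.2.0')); // → true: a caret range can share the hoisted 7.2.3
console.log(satisfies('7.2.3', '7.2.0')); // → false: an exact pin keeps its own 7.2.0 copy
```

Under these assumptions, a dependent with a caret range can be pointed at the newer hoisted copy, while a dependent that pins "7.2.0" exactly cannot.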

Thanks @mahirkabir, worth knowing