[Feature] 'yarn list --prod' support

Question

rickity-cricket opened this issue 6 years ago

Hi there,

I'm looking to add the above feature through a fork/pull-request and would like some guidance.

I intend to add the ability for hub-detect to generate some bdio information from the output of the 'yarn list' command. To make this whole process easier, I thought I'd open a dialogue early to get your advice/recommendations/guidance.

Expected behavior

Take an argument (e.g. --detect.yarn.prod.only=true) and return/send bdio files to Black Duck.

Actual behavior

Currently the tool only seems to support yarn.lock parsing, which is a bit overzealous for external distribution projects.

Having edited some of the code myself, I think I have a reasonable idea of the best way to achieve this, but would like some feedback from recent/regular contributors here. (CC: @ekerwin, @bamandel, @JakeMathews, etc)

Which data structure should be used to grab the CLI output?

I was trying to build a DependencyGraph, as is done in YarnPackager.groovy, where it is built from NameVersionNode objects. Is this the preferred way to build a parent-child data structure inside hub-detect? Some detail here would be appreciated. Is this best done through NVNBuilder? Is there a specific order in which linking should take place (parents added to the root and then children to each parent, or children to each parent and then parents to the root), or does it not matter? I assume grandchildren can be associated quite easily with existing parent-child relationships?

I also tried to use Dependency objects and add them directly to a MutableDependencyGraph, but it seems this class is impossible to unit-test as it stands.
This would be my preferred method (making a simultaneous pull-request on integration-bdio for testability), but would it leave the resulting DependencyGraph without any essential information (e.g. linking data)?

Thanks for taking my bombardment of questions 😄 I'm just trying to open a dialogue because I would like to take a route you would prefer/expect (and I've gotten a bit lost in the data structures used for bdio 😏)

-- Jake

Jordan Piscitelli · Answer 1 · Fri Apr 13 2018 00:05:47 GMT+0800 (China Standard Time)

Hi Jake,

The Executable class can be used to run the Yarn CLI, and you can create any intermediate data structures you need to parse its output. For example, the NugetBomTool uses Gson to deserialize model classes, while the GradleBomTool parses line by line. The implementation is up to the BomTool.

The preferred way to build a graph is with a MutableDependencyGraph. The NameVersionNode and its transformers are to be (hopefully eventually) migrated to a graph or the graph builders. The GradleBomTool is a good example of a graph implementation.

When using the graph, the order in which you add dependencies does not matter, and all relationships are maintained. I'm not sure what essential information you think you might lose?
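To illustrate the order-independence point, here is a minimal Java sketch. SketchGraph and all of its methods are hypothetical stand-ins written for this example; they are not the integration-bdio MutableDependencyGraph API, and the "name@version" ids are made up.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a dependency graph (NOT the integration-bdio API).
// Nodes are identified by plain "name@version" strings.
class SketchGraph {
    // Maps every known node to the set of its direct children.
    private final Map<String, Set<String>> edges = new LinkedHashMap<>();
    private final Set<String> roots = new LinkedHashSet<>();

    void addRoot(String id) {
        roots.add(id);
        edges.computeIfAbsent(id, k -> new LinkedHashSet<>());
    }

    void addChild(String parent, String child) {
        // Both ends are registered on demand, so a child can be linked
        // before its parent has been attached to anything else.
        edges.computeIfAbsent(parent, k -> new LinkedHashSet<>()).add(child);
        edges.computeIfAbsent(child, k -> new LinkedHashSet<>());
    }

    Set<String> childrenOf(String id) {
        return edges.getOrDefault(id, Collections.emptySet());
    }

    Set<String> roots() {
        return roots;
    }

    Map<String, Set<String>> edges() {
        return edges;
    }

    public static void main(String[] args) {
        // Top-down: root first, then children, then grandchildren.
        SketchGraph topDown = new SketchGraph();
        topDown.addRoot("parent@1.0.0");
        topDown.addChild("parent@1.0.0", "child@2.0.0");
        topDown.addChild("child@2.0.0", "grandchild@3.0.0");

        // Bottom-up: grandchildren first, root last.
        SketchGraph bottomUp = new SketchGraph();
        bottomUp.addChild("child@2.0.0", "grandchild@3.0.0");
        bottomUp.addChild("parent@1.0.0", "child@2.0.0");
        bottomUp.addRoot("parent@1.0.0");

        // Map and Set equality ignore insertion order.
        System.out.println(topDown.edges().equals(bottomUp.edges()));
    }
}
```

Because every relationship is stored as a parent-to-children entry keyed by id, a test can simply assert that a given parent contains a given child, whichever order the parser happened to add them in.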

Unit testing the graph is also our preferred approach. The best way to test the graph is simply to assert that it contains the dependency you are looking for, with the relationship you are looking for. There are utility methods for GAV-style dependency assertions. We have plans for more; the graph is fairly new. GradleDependenciesParserTest is a good example. We can add any test utilities or tools you need.

I think that answers the bulk of your questions, but feel free to keep them coming. I'm happy to have a conversation!

jake · Answer 2 · Fri Apr 13 2018 00:55:13 GMT+0800 (China Standard Time)

This really helps! Thanks, @taikuukaits.
I'll be working on this again soon and will come back if I have more questions 😀

James Richard · Answer 3 · Fri Apr 13 2018 23:35:15 GMT+0800 (China Standard Time)

@rickity-cricket when you do start working on this again, note that Yarn has become one of our "nested" bom tools, meaning it will check the source path and its subdirectories, to a specified depth, for yarn.
So you will need to update the YarnBomToolSearcher if you want to change how we determine whether yarn applies to a given directory.
Let us know if you need any other help!

jake · Answer 4 · Thu Apr 19 2018 00:01:31 GMT+0800 (China Standard Time)

Hi guys,

I have the logic for this feature nearly complete and tested, though there is still one hang-up. The 'yarn list --prod' output contains unresolved version numbers, e.g.

> ...
> │  ├─ array-equal@^1.0.0
> │  ├─ content-type-parser@^1.0.1
> │  ├─ cssom@>= 0.3.2 < 0.4.0
> │  ├─ cssstyle@>= 0.2.37 < 0.3.0
> │  ├─ escodegen@^1.6.1
> │  ├─ html-encoding-sniffer@^1.0.1
> │  ├─ nwmatcher@>= 1.3.9 < 2.0.0
> ...

I am not a yarn expert, so the best (only?) way I can think of to get a resolved version is from the yarn.lock file. So before I finish off the feature by parsing the yarn.lock file into a map and looking up the resolved version, I thought I'd ask whether you have any better ideas.
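For reference, a minimal Java sketch of splitting one tree line like those quoted above into depth, package name and version range. YarnListLine is a hypothetical class written for this example, not the code that was merged into hub-detect, and it assumes each nesting level is indented by three characters, as in the sample.

```java
import java.util.Optional;

// Hypothetical holder for one parsed line of 'yarn list' tree output.
class YarnListLine {
    final int depth;
    final String name;
    final String versionRange;

    YarnListLine(int depth, String name, String versionRange) {
        this.depth = depth;
        this.name = name;
        this.versionRange = versionRange;
    }

    // Returns empty for lines that are not tree entries (headers, "...", etc.).
    static Optional<YarnListLine> parse(String line) {
        // U+2500 is the box-drawing horizontal bar that follows the branch character.
        int dash = line.indexOf('\u2500');
        if (dash < 1) {
            return Optional.empty();
        }
        String entry = line.substring(dash + 1).trim();
        // Split on the LAST '@' so scoped names that start with '@' stay intact.
        int at = entry.lastIndexOf('@');
        if (at <= 0) {
            return Optional.empty();
        }
        // Assumption: the branch character sits at column 3 * depth.
        int depth = (dash - 1) / 3;
        return Optional.of(new YarnListLine(depth, entry.substring(0, at), entry.substring(at + 1)));
    }

    public static void main(String[] args) {
        YarnListLine parsed = parse("\u2502  \u251C\u2500 cssom@>= 0.3.2 < 0.4.0").get();
        System.out.println(parsed.depth + " " + parsed.name + " " + parsed.versionRange);
    }
}
```

The depth value is what would let a parser attach each entry to the most recent entry one level above it.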

Thanks in advance!

Jake

jake · Answer 5 · Thu Apr 19 2018 00:03:06 GMT+0800 (China Standard Time)

Also, to answer you, @jamesrichard91: there was no need to modify how yarn usage is determined, as I took what was already in use and simply checked for a new option I created in DetectConfiguration 😄

James Richard · Answer 6 · Thu Apr 19 2018 01:34:00 GMT+0800 (China Standard Time)

@rickity-cricket I can't think of a better way to get a resolved version, but then you may have to ask why not just parse the lock file in the first place?

James Richard · Answer 7 · Thu Apr 19 2018 01:35:39 GMT+0800 (China Standard Time)

What does 'yarn list' provide you over the yarn.lock?

jake · Answer 8 · Thu Apr 19 2018 17:27:02 GMT+0800 (China Standard Time)

@jamesrichard91 This feature request came from our JavaScript developers, who use the 'yarn list --prod' command to list the dependencies that are used in production. By reducing noise, this will eventually help us automate third-party checks for 'external/distributed' JavaScript/yarn projects produced internally (e.g. the yarn.lock file contains ~1400 components, while 'yarn list --prod' gives ~200).

James Richard · Answer 9 · Fri Apr 20 2018 22:53:08 GMT+0800 (China Standard Time)

Your approach seems far superior if that is the case, and using the yarn.lock to resolve fuzzy versions seems like the best solution for now.
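A rough Java sketch of that lookup, assuming the v1 yarn.lock layout (an unindented header line listing one or more "name@range" specifiers, followed by an indented version line). YarnLockResolver is a hypothetical name for illustration, not the merged implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: maps each "name@range" specifier found in a
// yarn.lock file to the concrete version recorded for it.
class YarnLockResolver {
    static Map<String, String> parse(List<String> lockLines) {
        Map<String, String> resolved = new HashMap<>();
        String[] currentKeys = new String[0];
        for (String line : lockLines) {
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            if (!line.startsWith(" ") && line.endsWith(":")) {
                // Entry header: one or more comma-separated specifiers.
                currentKeys = line.substring(0, line.length() - 1).split(",");
            } else if (line.startsWith("  version ")) {
                // Exactly two spaces of indent, so nested dependency
                // entries (four spaces) are never mistaken for this field.
                String version = stripQuotes(line.substring("  version ".length()).trim());
                for (String key : currentKeys) {
                    resolved.put(stripQuotes(key.trim()), version);
                }
            }
        }
        return resolved;
    }

    private static String stripQuotes(String s) {
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }
}
```

A fuzzy entry from the list output can then be resolved with a lookup such as resolved.get(name + "@" + versionRange); a null result means the lock file has no matching specifier and the caller has to decide how to handle it.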

James Richard · Answer 10 · Wed May 02 2018 00:20:47 GMT+0800 (China Standard Time)

Closing this issue since the code has been merged.