Wilfred / difftastic

a structural diff that understands syntax 🟥🟩

Home Page: https://difftastic.wilfred.me.uk/

Feature Request: Compare Subdirectories, ignore directory name difference

miketheman opened this issue

Hello! Thanks for making this tool - it's quite cool and fast.

I've been using difft as well as diffoscope and was curious if you had a solution to a problem diffoscope solves, that I don't think difft handles today.

Here's my current use case, let me know if it doesn't make sense:

I often expand multiple zip files containing a variety of files, and want to establish a few things:

a. How "same" are these zip file contents?
b. What are the distinct differences?

For a, I'd love a way to emit only a percentage as output, somewhat similar to --check-only - that way I could run a lot of diffs and only stop when something is either too similar or too divergent.
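To make the "percentage only" idea concrete, here is a minimal standalone sketch in Python. It is not something difft provides: it uses the standard library's line-based difflib.SequenceMatcher rather than a structural diff, and the function names (relative_files, file_ratio, tree_similarity) are made up for illustration. It assumes files are paired by identical relative path, and that a file present on only one side counts as 0% similar.

```python
import difflib
from pathlib import Path


def relative_files(root: Path) -> dict[str, Path]:
    """Map each file's path relative to `root` to its full path."""
    return {p.relative_to(root).as_posix(): p for p in root.rglob("*") if p.is_file()}


def file_ratio(a: Path, b: Path) -> float:
    """Line-based similarity in [0, 1] for two files (not a structural diff)."""
    a_lines = a.read_text(errors="replace").splitlines()
    b_lines = b.read_text(errors="replace").splitlines()
    return difflib.SequenceMatcher(None, a_lines, b_lines).ratio()


def tree_similarity(left: Path, right: Path) -> float:
    """Average similarity as a percentage; one-sided files score 0."""
    left_files, right_files = relative_files(left), relative_files(right)
    all_paths = set(left_files) | set(right_files)
    if not all_paths:
        return 100.0  # two empty trees are treated as identical
    total = sum(
        file_ratio(left_files[p], right_files[p])
        for p in all_paths
        if p in left_files and p in right_files
    )
    return 100.0 * total / len(all_paths)
```

A wrapper script could call tree_similarity on each pair of expanded archives and only stop when the number crosses a chosen threshold. Files that live under differently named subdirectories would count as unmatched with this pairing rule.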

I think difft solves most of b, but doesn't handle path name differences yet, unless I don't have the right flags.

Here's an example expanded layout of two almost-identical zip files:

analyticsclient-6502
├── LICENSE.txt
├── MANIFEST.in
├── PKG-INFO
├── README.md
├── analyticsclient.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   ├── requires.txt
│   └── top_level.txt
├── data
│   └── data_file
├── pyproject.toml
├── setup.cfg
├── setup.py
└── tests
    ├── __init__.py
    └── test_simple.py

4 directories, 15 files

brotli-bin-0.0.1
├── LICENSE.txt
├── MANIFEST.in
├── PKG-INFO
├── README.md
├── brotli_bin.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   ├── requires.txt
│   └── top_level.txt
├── data
│   └── data_file
├── pyproject.toml
├── setup.cfg
├── setup.py
└── tests
    ├── __init__.py
    └── test_simple.py

4 directories, 15 files

Both contain the same files, but under slightly different paths.

Using diffoscope, I run: diffoscope --exclude-directory-metadata=yes analyticsclient-6502 brotli-bin-0.0.1 and get the differences between the PKG-INFO files in the *-info subdirectories side by side - but difft can't compare those yet.

(--exclude-directory-metadata=yes skips comparing the times and dates on the files, but I think that's diffoscope-specific and difft doesn't care about that yet.)
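For illustration, the pairing I have in mind could be sketched roughly like this in Python. This is only a guess at one workable heuristic, not how difft or diffoscope behave: entries with the same name are paired directly, and leftover subdirectories are paired when they contain exactly the same set of entry names. The function name pair_files is hypothetical.

```python
from pathlib import Path


def pair_files(left: Path, right: Path) -> list[tuple[Path, Path]]:
    """Pair files in two trees, tolerating renamed subdirectories.

    Heuristic (an assumption, not difftastic behaviour): same-named entries
    are paired directly; leftover subdirectories are paired when they contain
    exactly the same set of entry names.
    """
    left_entries = {p.name: p for p in left.iterdir()}
    right_entries = {p.name: p for p in right.iterdir()}
    pairs: list[tuple[Path, Path]] = []

    # Same-named entries: pair files, recurse into directories.
    for name in sorted(left_entries.keys() & right_entries.keys()):
        l, r = left_entries[name], right_entries[name]
        if l.is_dir() and r.is_dir():
            pairs.extend(pair_files(l, r))
        elif l.is_file() and r.is_file():
            pairs.append((l, r))

    def child_names(d: Path) -> frozenset[str]:
        return frozenset(p.name for p in d.iterdir())

    # Leftover directories: match on identical child names.
    unmatched_right = {
        name: p
        for name, p in right_entries.items()
        if name not in left_entries and p.is_dir()
    }
    for name in sorted(left_entries.keys() - right_entries.keys()):
        l = left_entries[name]
        if not l.is_dir():
            continue
        for r_name in sorted(unmatched_right):
            if child_names(l) == child_names(unmatched_right[r_name]):
                pairs.extend(pair_files(l, unmatched_right.pop(r_name)))
                break
    return pairs
```

With the layout above, pair_files(Path("analyticsclient-6502"), Path("brotli-bin-0.0.1")) would pair analyticsclient.egg-info/PKG-INFO with brotli_bin.egg-info/PKG-INFO, and each pair could then be handed to difft as two file arguments. The heuristic is deliberately strict: a renamed directory with even one added or removed file would go unpaired.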