Header matching fails on unicode filenames (git diff)
thoward opened this issue · comments
Troy Howard commented
I came across a git header that contained filenames with unicode. Git chose to wrap those in quotes. The regex git_diffcmd_header
in patch.py
couldn't parse the filenames out of the header due to the quotes.
I monkey patched this in my code with the following line:
whatthepatch.patch.git_diffcmd_header = re.compile('^diff --git "?a/(.+)"? "?b/(.+)"?$')
That seems to work!
Here's an example of the offending diff header (could be used as a fixture):
diff --git "a/somedata\340\270\202\340\270\233.json" "b/somedata\340\270\202\340\270\233.json"
deleted file mode 100644
index 817de57..0000000
--- "a/somedata\340\270\202\340\270\233.json"
+++ /dev/null
@@ -1 +0,0 @@
-[...snip...]
Thomas Grainger commented
@thoward can you make a PR with these changes?