AlDanial / cloc

cloc counts blank lines, comment lines, and physical lines of source code in many programming languages.

Error in cloc --git --diff when file names contain Chinese characters

zhuheng-mark opened this issue

Hi, I use cloc --git --diff to compare two commits, but I get an error when a changed file has a Chinese name.

version: 1.84

os: ubuntu

git add 使用说明.txt

git commit -m "add chinese txt file"

cloc -v  --git --diff HEAD^ HEAD

git ls-tree --name-only -r HEAD
git ls-tree --name-only -r HEAD^
git archive -o /tmp/t4rZiA42Ro.tar HEAD '"\344\275\277\347\224\250\350\257\264\346\230\216.txt"' 'test.c'
fatal: pathspec '"\344\275\277\347\224\250\350\257\264\346\230\216.txt"' did not match any files
Failed to create tarfile of files from git. at /var/lib/jenkins/workspace/cloc-1.84/cloc line 4703.

使用说明.txt

I view this as a git problem, not a cloc problem.

The issue is that the output from git ls-files or git ls-tree cannot be ingested by the git archive command. The filename 使用说明.txt goes in, but when queried, the file name "\344\275\277\347\224\250\350\257\264\346\230\216.txt" comes out--and git claims the repo has no files named "\344\275\277\347\224\250\350\257\264\346\230\216.txt".

Unless you know of a way to revert "\344\275\277\347\224\250\350\257\264\346\230\216.txt" back to 使用说明.txt, there's little for me to do.
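
For reference, the quoted form is git's C-style path quoting: the name is wrapped in double quotes and every byte above 0x80 is written as a three-digit octal escape, so it can be decoded mechanically. Below is a minimal sketch in Python of that decoding. It is not part of cloc, the helper name unquote_git_path is hypothetical, and it assumes the path is UTF-8 encoded.

```python
import re

# Single-character escapes git uses inside a quoted path, mapped to byte values.
_SIMPLE = {
    b"a": 7, b"b": 8, b"f": 12, b"n": 10, b"r": 13,
    b"t": 9, b"v": 11, b'"': 34, b"\\": 92,
}

# A backslash followed by either three octal digits (one byte) or a simple escape.
_ESCAPE = re.compile(rb'\\([0-3][0-7]{2}|[abfnrtv"\\])')


def _decode_escape(match):
    token = match.group(1)
    if len(token) == 3:
        return bytes([int(token, 8)])   # octal escape -> raw byte
    return bytes([_SIMPLE[token]])      # e.g. \t -> TAB


def unquote_git_path(name: str) -> str:
    """Hypothetical helper: undo git's C-style quoting of a path name.

    Names that are not wrapped in double quotes are returned unchanged.
    Assumes the underlying bytes are UTF-8.
    """
    if len(name) < 2 or not (name.startswith('"') and name.endswith('"')):
        return name
    body = name[1:-1].encode("utf-8")
    raw = _ESCAPE.sub(_decode_escape, body)
    return raw.decode("utf-8", errors="surrogateescape")


quoted = r'"\344\275\277\347\224\250\350\257\264\346\230\216.txt"'
print(unquote_git_path(quoted))  # 使用说明.txt
```

Avoiding the quoting in the first place is usually simpler than undoing it afterwards.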

I see what you mean, and I found the answer:
https://stackoverflow.com/questions/22827239/how-to-make-git-properly-display-utf-8-encoded-pathnames-in-the-console-window

jenkins@Xen11-Build-01:~/workspace/tmp/test$ git config --global core.quotepath off
jenkins@Xen11-Build-01:~/workspace/tmp/test$ git ls-tree HEAD
100644 blob 60442406112bcec13fe37346f25fef867f6dad82	test.c
100644 blob 60442406112bcec13fe37346f25fef867f6dad82	test.java
100644 blob 16d5b0ad9056fc08445b2237b1e302d4daf60669	使用说明.txt

then cloc runs normally and gets the desired result 😄
Thank you again for your help
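
For scripts that should not change the global git configuration, git's -z option is a possible alternative: git ls-tree documents it as terminating entries with NUL and leaving file names unquoted. Below is a minimal sketch in Python under that assumption. The function names are hypothetical, this is not how cloc itself lists files, and paths are assumed to be UTF-8.

```python
import subprocess


def split_nul_paths(raw: bytes) -> list[str]:
    """Split NUL-terminated `git ... -z` output into decoded path strings."""
    return [
        part.decode("utf-8", errors="surrogateescape")
        for part in raw.split(b"\0")
        if part  # skip the empty piece after the final NUL
    ]


def list_tracked_files(rev: str = "HEAD", repo_dir: str = ".") -> list[str]:
    """Hypothetical helper: list the files in `rev` without path quoting."""
    result = subprocess.run(
        ["git", "ls-tree", "-r", "--name-only", "-z", rev],
        cwd=repo_dir,
        check=True,              # raise if git exits with an error
        stdout=subprocess.PIPE,  # capture raw bytes, no text decoding
    )
    return split_nul_paths(result.stdout)
```

Calling list_tracked_files("HEAD") inside the test repository above should return the three names as written on disk, regardless of the core.quotepath setting.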

I appreciate the research you did on this. I'll add a mention of git config --global core.quotepath off to the README.md.

Thanks for solving this. I also ran into the exact same problem with Danish characters.