learnbyexample / learn_gnugrep_ripgrep

Example based guide to mastering GNU grep and ripgrep

Home Page: https://learnbyexample.github.io/learn_gnugrep_ripgrep/

Wrong Count on Chapter 2: Question g

jusuchin85 opened this issue

The question:

Find total count of whole word 'the' (irrespective of case).

$ grep ##### add your solution here
8090

For me, this does not return 8090, but rather 7941:

[Screenshot: 20221121_162154]

Hmm, not sure what could be the issue. What do you get if you try LC_ALL=C grep -iow 'the' dracula.txt | wc -l ?
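As a hedged aside on why LC_ALL=C is worth trying: it makes grep use the plain byte-oriented C locale for that one command, which helps rule out locale-dependent case folding as the cause. A minimal sketch, using a throwaway sample file in place of dracula.txt (the file contents and variable names are illustrative assumptions, not part of the exercise):

```shell
# Throwaway sample file; an illustrative stand-in for dracula.txt
sample=$(mktemp)
printf '%s\n' 'The the THE' 'other then' > "$sample"

# Count whole-word matches under the current locale
default_count=$(grep -iow 'the' "$sample" | wc -l | tr -d ' ')

# Same count, forcing the C locale for this one command only
c_count=$(LC_ALL=C grep -iow 'the' "$sample" | wc -l | tr -d ' ')

echo "default locale: $default_count, C locale: $c_count"
rm -f "$sample"
```

If the two numbers differ on the real file, the locale is a likely suspect. If they agree, the grep implementation or version is worth checking next.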

Also, what is your GNU grep version? I just checked with 3.4 and 3.6, and I got 8090 (rg -ciow 'the' dracula.txt also matched this number).

$ wget https://www.gutenberg.org/files/345/old/345.txt -O dracula.txt
--2022-11-21 14:53:55--  https://www.gutenberg.org/files/345/old/345.txt
Resolving www.gutenberg.org (www.gutenberg.org)... 2610:28:3090:3000:0:bad:cafe:47, 152.19.134.47
Connecting to www.gutenberg.org (www.gutenberg.org)|2610:28:3090:3000:0:bad:cafe:47|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 883158 (862K) [text/plain]
Saving to: ‘dracula.txt’

dracula.txt          100%[======================>] 862.46K   287KB/s    in 3.0s    

2022-11-21 14:53:59 (287 KB/s) - ‘dracula.txt’ saved [883158/883158]

$ grep -iow 'the' dracula.txt | wc -l
8090
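The command above pipes -o output to wc -l rather than relying on -c alone, because -c counts matching lines while -o prints every match on its own line. A small sketch of the difference, assuming GNU grep and a made-up sample file (names and contents are hypothetical):

```shell
# Tiny sample file; a hypothetical stand-in for dracula.txt
sample=$(mktemp)
printf '%s\n' 'The cat saw the other cat.' 'Then the THE end.' 'nothing here' > "$sample"

# -c counts matching lines, so a line with several matches counts once
line_count=$(grep -ciw 'the' "$sample")

# -o prints each match on its own line, so wc -l counts every match
# (tr strips the padding some wc implementations add)
match_count=$(grep -iow 'the' "$sample" | wc -l | tr -d ' ')

echo "lines: $line_count, matches: $match_count"   # GNU grep: lines: 2, matches: 4
rm -f "$sample"
```

Note that -w keeps "other" and "Then" out of the count, since "the" is only part of a longer word there.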

Hey @learnbyexample! I've tried again with the LC_ALL=C grep -iow 'the' dracula.txt | wc -l command, and I get the same result. I do think it's something to do with my grep version (currently on 2.6?):

[Screenshot: 20221122_100001]


You're right! I've upgraded my grep to 3.8 via Homebrew, and now I'm getting what you've mentioned:

[Screenshot: 20221122_100316]
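For anyone hitting the same mismatch, a quick way to see which grep is actually being run is sketched below. It assumes a Homebrew setup where GNU grep may be installed under the name ggrep (Homebrew's default unless its gnubin directory is added to PATH); that name may not exist on other systems, and the thread itself does not say which platform was used.

```shell
# List each grep-like command found on PATH with its version banner.
# 'ggrep' is an assumption: the default Homebrew name for GNU grep.
found=0
for cmd in grep ggrep; do
    if command -v "$cmd" >/dev/null 2>&1; then
        found=$((found + 1))
        printf '%s -> %s: %s\n' "$cmd" "$(command -v "$cmd")" \
            "$("$cmd" --version 2>/dev/null | head -n 1)"
    fi
done
```

A banner mentioning BSD grep with a 2.x version number usually means the system grep is not GNU grep, in which case the exercise answers (which target GNU grep) may not match exactly.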

Sorry, this is a non-issue. I didn't realise that different versions of grep would output differently. 😃