[feature req] choose print text between or after matches
daniejstriata opened this issue
In grep I can print what it finds between values using easy to remember syntax for perl-regexp. Could choose
be extended to also find text between matches, in a way that is easier to use than perl-regexp?
choose -m string1 string2
to print the value found after matching string1 and stopping at string2.
choose -M string1 string2
to print the value including the matched string1 and string2. Adding 0 1 2 ... could then print specific characters inside the matched result.
choose -a string1
will look for string1 and print all the text on that line after matching string1.
choose -A string1
will look for string1 and print all the text on that line including string1 that matched. Adding 0 1 2 ... could then print specific characters inside the matched result.
Here are some grep examples:
Grep from match to end
Example 1
Text to match
$ gdu --version
Version: v5.20.0
Built time: Sat Oct 22 10:48:31 PM CEST 2022
Built user: dundee
gdu --version | grep -oP '(?<=Version:\t\s).*'
Output:
v5.20.0
Example 2
Text to match
$ openssl x509 -noout -enddate -in /etc/ssl/certs/COMODO_Certification_Authority.pem
notAfter=Dec 31 23:59:59 2029 GMT
Keep only value
openssl x509 -noout -enddate -in /etc/ssl/certs/COMODO_Certification_Authority.pem | grep -oP '(?<=notAfter=).*'
Dec 31 23:59:59 2029 GMT
Grep between matches
Text to match
docker inspect 9512b532dcaf1 | grep tls
"/etc/dockers/conf/web/tls:/etc/ssl/nginx:ro",
"Source": "/etc/dockers/conf/web/tls",
docker inspect 9512b532dcaf1 | grep -oP '(?<="Source": ").*(tls)'
/etc/dockers/conf/web/tls
vs
docker inspect 9512b532dcaf1 | grep -oP '(?<="Source": ").*(?=tls)'
/etc/dockers/conf/web
hey @daniejstriata, this is an interesting idea. I think that you can already do this. Take a look at these examples (based on your examples):
$ echo 'Version: v5.20.0' | choose -f 'Version:\s+' 0
v5.20.0
You could also replace the 0 with : here. The same solution applies to the openssl example.
$ echo ' "Source": "/etc/dockers/conf/web/tls",' | choose -f '("Source": "|tls)' 1
/etc/dockers/conf/web/
Using regex field separators with an or condition (|) lets you effectively set a beginning and end. Assuming the text only appears once in a line, the content between the start and end will always be index 1.
So returning to the string1/string2 examples at the top of your comment, the line lorem ipsum string1 dolor sit string2 amet can be split with choose -f '(string1|string2)' to select the text between the separators, or choose -f 'string1' to select the text after the separator. I believe this addresses the lower case -m, -a proposal.
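To make the indexing concrete without requiring choose, here is a minimal sketch of the same splitting idea using awk, whose -F option also accepts a regular expression as the field separator. The helper names between and after are hypothetical, both arguments are interpreted as regular expressions rather than literal strings, and awk numbers fields from 1 and keeps an empty leading field, so its $2 is always the text that follows the first separator.

```shell
#!/bin/sh
# Hypothetical helpers illustrating regex field separators with awk.
# Both arguments are treated as regular expressions, not literal strings.

# Split each line on either $1 or $2 and print the second field,
# i.e. the text between the first separator and the next one.
between() {
  awk -F"($1|$2)" '{ print $2 }'
}

# Split each line on $1 and print the second field,
# i.e. the text after the first match of $1 (up to the next match, if any).
after() {
  awk -F"$1" '{ print $2 }'
}

line='lorem ipsum string1 dolor sit string2 amet'
printf '%s\n' "$line" | between string1 string2
printf '%s\n' "$line" | after string1
```

The spaces around the selected text are kept because they are part of the field. Lines that contain no separator produce an empty line in this sketch.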
Admittedly, this does not solve your suggested case of including the start and end strings, but I see no reason not to use grep -o "string1.*string2" or grep -o "string1.*" for that case (-o prints only the matched text rather than the whole line). Then you can use choose -c to select characters within that. This should address the upper case -M, -A proposal.
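As a small sketch of the inclusive case, grep -o prints only the matched part of each line, so the start and end strings are kept in the output. The function name including is hypothetical, both arguments are treated as basic regular expressions, and because .* is greedy the match runs to the last occurrence of the end string on the line.

```shell
#!/bin/sh
# Hypothetical helper: print the text from the first match of $1 through
# the last match of $2 on each line, including both matches.
# -o (supported by GNU and BSD grep) prints only the matched text.
including() {
  grep -o "$1.*$2"
}

printf '%s\n' 'lorem ipsum string1 dolor sit string2 amet' | including string1 string2
# The result could then be piped to choose -c to pick characters out of it.
```

Passing an empty second argument reduces the pattern to "string1.*", which covers the case of printing from the match to the end of the line.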
Based on these alternatives, I don't think any further change is needed to support your use case. However, I'd be interested to hear if you feel differently and have more examples.