learnbyexample / Command-line-text-processing

:zap: From finding text to search and replace, from sorting to beautifying text and more :art:

XML/HTML parsing

larrykollar opened this issue

What did you have in mind for this? I could maybe throw something together for the libxml2 tools (xmllint, xsltproc) and/or xmlgawk (gawk with a SAX-like XML parser). The libxml2 tools also have a --html option to parse XHTML (maybe regular HTML as well, but I haven’t tried).

I’ve also written awk scripts to convert various patterned text to XML, if that sounds useful.

I was thinking of xmlstarlet for XML, jq for JSON, and so on.

The problem is that I don't have any serious experience working with XML/JSON in general, let alone those tools. I'd need plenty of time to learn them, and I'm currently busy with other things. That's why those topics are on hold.

If you wish to write a tutorial, I'd suggest starting on your own, as a repo like mine, as a blog post, and so on.