LibreCat / Catmandu-HTML

Catmandu modules to process HTML files

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

NAME

Catmandu::HTML - Modules for handling HTML data within the Catmandu framework

SYNOPSIS

Command line client catmandu:

catmandu convert HTML to JSON < ex/test.html > data/test.json

catmandu convert JSON to HTML < data/test.json > ex/test.html

# Keep the text
curl -s http://librecat.org |
  catmandu convert HTML to Text --fix "html_text()"

# Filter out the anchors
curl -s http://librecat.org |
  catmandu convert HTML to Text --fix "html_filter_tag(a); html_text(join:\"\n\")"

# Filter out the meta tags as YAML
curl -s http://librecat.org |
  catmandu convert HTML to YAML --fix "html_filter_tag(meta, group_by:name)"

See documentation of modules for more examples.

AVAILABLE MODULES

SEE ALSO

This module is based on Catmandu

AUTHORS

  • Patrick Hochstenbach, patrick.hochstenbach at ugent.be

COPYRIGHT AND LICENSE

The Perl software is copyright (c) 2018 by Patrick Hochstenbach. This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.

About

Catmandu modules to process HTML files

License:Other


Languages

Language:HTML 66.9%Language:Perl 33.1%