RichMorin / tidy_ex

Elixir binding to the granddaddy of HTML tools

Home Page: http://www.html-tidy.org/

TidyEx

TidyEx corrects and cleans up HTML content by fixing markup errors.

Elixir/Erlang bindings for htacg's tidy-html5

The granddaddy of HTML tools, with support for modern standards http://www.html-tidy.org

The binding is implemented as a C-Node following the excellent example in Overbryd's package nodex. If you want to learn how to set up bindings to C/C++, you should definitely check it out.

  • nodex
    • distributed Elixir
    • safe binding with C-Nodes

C-Nodes are external OS processes that communicate with the Erlang VM through Erlang message passing. That way you can implement native code and call into it from Elixir in a safe, predictable way: if the external process crashes, the Erlang VM is unaffected.
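The isolation property described above can be illustrated with a small sketch. Note that this is not how TidyEx is implemented: it swaps in an Elixir Port running `sh` instead of a real C-Node, because a Port needs no compiled native code. The module name `CrashIsolationSketch` and the `sh -c "exit N"` stand-in for a failing native helper are assumptions made only for illustration.

```elixir
defmodule CrashIsolationSketch do
  # Starts a short-lived external OS process and reports how it ended.
  # `sh` is a hypothetical stand-in for a native helper that fails;
  # it is not part of TidyEx, which uses a C-Node rather than a Port.
  def run(exit_code) when is_integer(exit_code) do
    sh = System.find_executable("sh") || raise "sh not found on PATH"

    port =
      Port.open({:spawn_executable, sh}, [
        :binary,
        :exit_status,
        args: ["-c", "exit #{exit_code}"]
      ])

    # The external process ends with a non-zero status, but the BEAM
    # only receives a message about it and keeps running.
    receive do
      {^port, {:exit_status, status}} -> {:exited, status}
    after
      5_000 -> :timeout
    end
  end
end

IO.inspect(CrashIsolationSketch.run(1))
IO.puts("The Erlang VM is still running")
```

The same idea applies to a C-Node: the native code lives in its own OS process, so a failure there surfaces on the Elixir side as a message or a lost connection rather than taking the VM down, which is the trade-off against an in-process NIF.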

Example

For more examples, please check out the tests.

test "can parse broken html" do
  result = TidyEx.parse("<div>Hello<span>World")
  assert result == "<div>Hello<span>World</span></div>"
end

test "can clean and repair broken html" do
  result = TidyEx.clean_and_repair("<div>Hello<span>World")
  assert result == "<div>Hello<span>World</span></div>"
end

test "can run diagnostics on invalid html" do
  result = TidyEx.run_diagnostics("<pp>Hello World</p>")
  assert result == "line 1 column 1 - Error: <pp> is not recognized!\nThis document has errors that must be fixed before\nusing HTML Tidy to generate a tidied up version."
end

Installation

Available on Hex.

def deps do
  [
    {:tidy_ex, "~> 0.1.0-dev"}
  ]
end

Target dependencies

cmake 3.x
erlang-dev
erlang-xmerl
erlang-parsetools

Compile and test

mix deps.get
mix compile
mix test

Cloning

git clone git@github.com:f34nk/tidy_ex.git
cd tidy_ex

All binding targets are added as submodules in the target/ folder.

git submodule update --init --recursive --remote
mix deps.get
mix compile
mix test
mix test.target

Cleanup

mix clean

Roadmap

See CHANGELOG.

  • Bindings
    • Call as C-Node
    • Call as dirty-nif
  • Tests
    • Call as C-Node
    • Call as dirty-nif
    • Target tests
    • Feature tests
    • Package test
  • Features
    • Set tidy-html5 options
    • Serialize any string with valid or broken html
    • Clean and repair
    • Run diagnostics
  • Documentation
  • Publish as hex package

Icon Credit

Broom by faisalovers from the Noun Project

License: GNU Lesser General Public License v2.1

