Tamschi / yolo-xml

[WIP] A (hardened, validating, asynchronous) XML pull parser that respects your time.

yolo-xml

Rust 1.54 CI Crates.io - License

An XML parser that respects your time.

yolo-xml aims to be an easy-to-use parsing library for XML 1.1 and, optionally, Namespaces in XML 1.1 that is strictly validating according to the respective specifications (each version 1.1, including errata as of 2021-05) and safe (also in the security sense) to run against potentially malicious inputs.

These go hand in hand: once yolo-xml has been sufficiently audited, its strictness should let you use it as a barrier against, for example, format confusion attacks that rely on invalid XML.

In an ideal world, nearly all parsers would of course be validating, but sometimes that's just not an option for one reason or another. (It should probably be more common, though, even if the specification says it's optional.)

Apart from this, the library should be usable in as many ways as possible, for example with streamed XML as used in XMPP (which is the main motivation for creating yolo-xml).

A few notes:

  • yolo-xml operates on &mut futures_core::Stream<Item = Result<char, Box<E>>>.
  • It is likely slower than other available XML parsers written in Rust.

    Safety (in the general sense), correctness and reasonably small code size are given higher priority. Optimization pull requests are still appreciated.

  • It is Unicode-ignorant: by itself, it neither fully normalizes text nor checks full normalization as per section 2.13 Normalization Checking.

    It's possible to provide such a normalizer or validator per document. Finer granularity must be implemented more explicitly.

    Note that this distinction only concerns Unicode character sequences; entity includes according to section 4.4.2 Included are always normalized by yolo-xml.

  • It is encoding-ignorant, i.e. neither able to detect nor validate the Unicode character encoding of a document. Encoding detection, including for each external entity, must be performed by an upstream decoder or similar mechanism, and a meaningful Byte Order Mark must have been consumed if present. See appendix E Autodetection of Character Encodings (Non-Normative) and informationally erratum E07 for more information.

    An application using yolo-xml can still easily validate the encoding, as metadata such as the encoding declaration is visible through its consumer API.
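
As a rough illustration of the upstream decoding step described in the notes above, the following sketch strips a leading UTF-8 Byte Order Mark and turns a byte buffer into `Result<char, Box<E>>` items. It is not part of yolo-xml: `DecodeError` and `decode_utf8` are hypothetical names, only UTF-8 is handled, and the whole buffer is decoded at once instead of chunk by chunk. It also yields a plain `Iterator` to stay within the standard library; an adaptor such as `futures::stream::iter` could presumably be used to turn it into a `Stream`.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type for this sketch; not part of yolo-xml.
#[derive(Debug, PartialEq)]
struct DecodeError {
    /// Byte offset (counted after any BOM) of the first invalid byte.
    offset: usize,
}

impl fmt::Display for DecodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid UTF-8 at byte offset {}", self.offset)
    }
}

impl Error for DecodeError {}

/// Decodes UTF-8 `bytes` into per-character results.
///
/// A leading UTF-8 BOM is consumed. All characters before the first
/// invalid byte are yielded as `Ok`, followed by a single `Err`.
fn decode_utf8(bytes: &[u8]) -> impl Iterator<Item = Result<char, Box<DecodeError>>> + '_ {
    const BOM: &[u8] = &[0xEF, 0xBB, 0xBF];
    let body = bytes.strip_prefix(BOM).unwrap_or(bytes);

    let (valid, error) = match std::str::from_utf8(body) {
        Ok(text) => (text, None),
        Err(e) => {
            // `valid_up_to` marks the end of the longest valid prefix.
            let valid_len = e.valid_up_to();
            let text = std::str::from_utf8(&body[..valid_len]).expect("prefix was validated");
            (text, Some(Box::new(DecodeError { offset: valid_len })))
        }
    };

    // Yield the valid characters first, then the error (if any).
    valid.chars().map(Ok).chain(error.map(Err))
}
```

Calling `decode_utf8(b"\xEF\xBB\xBF<a/>")` yields the four characters of `<a/>` without the BOM. A production decoder would also need to handle other encodings and multi-byte sequences split across chunk boundaries.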

Installation

Please use cargo-edit to always add the latest version of this library:

cargo add yolo-xml

Example

// TODO_EXAMPLE

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

See CONTRIBUTING for more information.

Versioning

yolo-xml strictly follows Semantic Versioning 2.0.0 with the following exceptions:

  • The minor version will not reset to 0 on major version changes (except for v1).
    Consider it the global feature level.
  • The patch version will not reset to 0 on major or minor version changes (except for v0.1 and v1).
    Consider it the global patch level.

This includes the Rust version requirement specified above.
Earlier Rust versions may be compatible, but this can change with minor or patch releases.

Which versions are affected by features and patches can be determined from the respective headings in CHANGELOG.md.

Note that dependencies of this crate may have a more lenient MSRV policy! Please use cargo +nightly update -Z minimal-versions in your automation if you don't generate Cargo.lock manually (or as necessary) and require support for a compiler older than current stable.
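
The two numbering exceptions listed above can be made concrete with a small sketch. This is only one literal reading of those rules, not code from yolo-xml: `Version`, `Release` and `next_version` are hypothetical names, and the sketch assumes each release bumps exactly one component.

```rust
/// A `major.minor.patch` triple (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// The kind of release being made (hypothetical helper type).
#[derive(Debug, Clone, Copy)]
enum Release {
    Major,
    Minor,
    Patch,
}

/// Computes the next version, assuming components are never reset
/// except at v0.1 (patch) and v1 (minor and patch).
fn next_version(v: Version, release: Release) -> Version {
    match release {
        // v1 is the only major release where minor and patch restart at 0.
        Release::Major if v.major == 0 => Version { major: 1, minor: 0, patch: 0 },
        // Later major releases keep the global feature and patch levels.
        Release::Major => Version { major: v.major + 1, ..v },
        // v0.1 is the only minor release where the patch level restarts at 0.
        Release::Minor if v.major == 0 && v.minor == 0 => Version { major: 0, minor: 1, patch: 0 },
        // Other minor releases keep the global patch level.
        Release::Minor => Version { minor: v.minor + 1, ..v },
        Release::Patch => Version { patch: v.patch + 1, ..v },
    }
}
```

Under this reading, a major release after 1.4.7 would be 2.4.7 where plain Semantic Versioning would give 2.0.0, and a minor release after 1.4.7 would be 1.5.7.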