esafirm / skrape

Kotlin DSL to scrape HTML and convert it to JSON

Skrape

Turn your HTML into JSON with a graph-based Kotlin DSL 💪

Support Me!

I would like to commit more of my time to this repo and to OSS work in general.

Would you help me achieve this goal?

Buy Me a Coffee at ko-fi.com

Getting Started

Define your query in a type-safe Kotlin DSL:

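Page("https://news.ycombinator.com/") {
    // One entry per element matching the CSS selector
    "items" to query("td a.storylink") {
        "text" to text()              // element text
        "detail" to container {       // nested JSON object
            "link" to attr("href")    // value of the href attribute
        }
    }
}.run {
    // Execute the query with the Jsoup-backed parser
    Skrape(JsoupDocumentParser()).request(this)
}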

And get a predictable JSON result:

{
    "items": [
        {
            "text": "SFO near miss could have triggered \u2018greatest aviation disaster in history'",
            "detail": {
                "link": "http://www.mercurynews.com/2017/07/10/exclusive-sfo-near-miss-might-have-triggered-greatest-aviation-disaster-in-history/"
            }
        },
        {
            "text": "Taking control of all .io domains with a targeted registration",
            "detail": {
                "link": "https://thehackerblog.com/the-io-error-taking-control-of-all-io-domains-with-a-targeted-registration/"
            }
        }
    ]
    ...
}
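
How `"key" to value` lines inside nested lambdas can end up as a nested, JSON-like graph is sketched below. This is a hypothetical mini-DSL written only for illustration, not Skrape's actual implementation: `NodeBuilder`, `node`, and this version of `container` are made-up names, and the values are plain strings rather than scraped content. It relies on the Kotlin rule that an extension declared as a member of the lambda's receiver takes priority over the standard library's `to`.

```kotlin
// Hypothetical mini-DSL for illustration only; not Skrape's real classes.
class NodeBuilder {
    // Insertion-ordered map so the output keeps the order the keys were declared in.
    val fields = linkedMapOf<String, Any?>()

    // Member extension: inside a NodeBuilder lambda this is chosen over the
    // standard library's `to`, so `"key" to value` records a field instead of
    // creating a Pair that would be thrown away.
    infix fun String.to(value: Any?) {
        fields[this] = value
    }

    // Nested object: run the block against a fresh builder and return its fields.
    fun container(block: NodeBuilder.() -> Unit): Map<String, Any?> =
        NodeBuilder().apply(block).fields
}

// Entry point of the sketch: builds one node from a DSL block.
fun node(block: NodeBuilder.() -> Unit): Map<String, Any?> =
    NodeBuilder().apply(block).fields

fun main() {
    val item = node {
        "text" to "Example story"
        "detail" to container {
            "link" to "https://example.com/story"
        }
    }
    println(item)
}
```

The resulting map mirrors the shape of one entry in `items` above: a flat `text` field plus a nested `detail` object holding `link`.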

Binaries

Add the JitPack repository to your root build.gradle:

allprojects {
    repositories {
        ...
        maven { url 'https://jitpack.io' }
    }
}

Then add the dependency

dependencies {
    // `compile` was removed in Gradle 7; use it only on older Gradle versions
    implementation 'com.github.esafirm:skrape:x.y.z'
}

Where x.y.z is the latest release (see the GitHub releases page or the badge).
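
For projects that use the Gradle Kotlin DSL, a rough equivalent of the two snippets above might look like the following. It is a sketch under a few assumptions: the build file is named `build.gradle.kts`, a plugin that provides the `implementation` configuration (such as the Kotlin JVM or Java plugin) is applied, and `x.y.z` is still a placeholder for the release you pick.

```kotlin
// Root build.gradle.kts (assumed file name)
allprojects {
    repositories {
        // JitPack serves artifacts built from GitHub releases
        maven { url = uri("https://jitpack.io") }
    }
}

// Module build.gradle.kts; needs a plugin that defines `implementation`
dependencies {
    implementation("com.github.esafirm:skrape:x.y.z")
}
```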

License

MIT
