dinubs / jam-api

Parse web pages using CSS query selectors

Home Page: http://www.jamapi.xyz

More complex example please

NickStees opened this issue

I am struggling to do a more complex implementation of this. For example, given the following HTML (with multiple cards like it on the page):

<div class="card">
  <div class="card-title">Title</div>
  <div class="card-desc">Description text here</div>
  <a href="#link" class="card-link">More info</a>
</div>

How would you format your JSON request to get those cards? This is what I assume it would be, but the lack of docs has me guessing:

{
    "title": "title",
    "news": [{
        "elem": ".card",
        "cardTitle": ".card .card-title",
        "cardDesc": ".card .card-desc"
    }]
}

Also, the example on jamapi.xyz should be something like the one above, and not rely on a website that can change the way the current one does.

Unfortunately, jam-api doesn't support nesting like you have there. What you'd have to do instead is something like this:

{
    "title": "title",
    "news_titles": [".card-title"],
    "news_descs": [".card .card-desc"]
}

Sorry about that. There's an open issue for a nesting API, but I haven't started on it at all.
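
Since the workaround above returns titles and descriptions as two separate flat lists, the per-card grouping has to be rebuilt on the client. Here is a minimal Python sketch of that step. It assumes the response is a JSON object whose `news_titles` and `news_descs` values are lists of strings in page order; that shape is an assumption, not something confirmed in this thread. The `pair_cards` helper and the sample data are hypothetical.

```python
from itertools import zip_longest


def pair_cards(response):
    """Combine the parallel title/description lists into one dict per card.

    Assumes (not confirmed by the jam-api docs) that both lists are in
    page order, so items at the same index belong to the same card.
    """
    titles = response.get("news_titles", [])
    descs = response.get("news_descs", [])
    # zip_longest pads the shorter list with None instead of dropping items.
    return [
        {"cardTitle": title, "cardDesc": desc}
        for title, desc in zip_longest(titles, descs)
    ]


# Hypothetical response data, made up for illustration only.
sample = {
    "title": "Example page",
    "news_titles": ["Title A", "Title B"],
    "news_descs": ["Description A", "Description B"],
}

cards = pair_cards(sample)
```

Pairing by index only holds if every card contains both a title and a description. If one card is missing its description, every later pair shifts by one, which is the main reason a nested query would be preferable.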

Hello,
I read this thread but still can't understand how to work with this library.
For example, take a standard WordPress blog:
http://gargo.of.by/

How do I get an array of ["title", "text", "link"] for each article on this page? I understand that I could get this info via RSS, but it is just an example.
I ask because there are a lot of examples on your website, but I only understand how to get an array of tags by their name.

@NickStees

I made an npm package that does this:
https://github.com/rike422/kirinuki-core

Here is the same example using kirinuki-core:

{
    "title": "title",
    "news": {
        "_unfold": true,
        "cardTitle": ".card .card-title",
        "cardDesc": ".card .card-desc"
    }
}

Would you like to try this library?

Hey @gerchicov-bp, somehow your message slipped through my notification feed. Incredibly sorry about that.

It does look like @rike422's package might be better for this specific case, since it's currently very difficult to grab multiple elements per item the way you want. If you are still interested in using jamapi, you can use the following JSON data to get it working:

{
  "article_links_and_titles": [{"elem": ".entry-title a", "link": "href", "title": "text"}],
  "article_texts": [".entry-content"]
}
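
To end up with one ["title", "text", "link"] record per article, the two lists returned by the query above still need to be merged by position. Here is a rough Python sketch of that merge. It assumes `article_links_and_titles` comes back as a list of objects with `link` and `title` keys and `article_texts` as a list of strings in the same order; that response shape is an assumption. `merge_articles` and the sample data are hypothetical.

```python
def merge_articles(response):
    """Build one {"title", "text", "link"} record per article.

    Assumes (unverified) that both lists are in page order, so the i-th
    text belongs to the i-th link/title pair.
    """
    heads = response.get("article_links_and_titles", [])
    texts = response.get("article_texts", [])
    articles = []
    for index, head in enumerate(heads):
        articles.append({
            "title": head.get("title"),
            # Fall back to None when an article has no matching text entry.
            "text": texts[index] if index < len(texts) else None,
            "link": head.get("link"),
        })
    return articles


# Hypothetical response data, made up for illustration only.
sample = {
    "article_links_and_titles": [
        {"link": "http://example.com/first", "title": "First post"},
        {"link": "http://example.com/second", "title": "Second post"},
    ],
    "article_texts": ["Body of the first post."],
}

articles = merge_articles(sample)
```

As with any index-based merge, a post without an `.entry-content` element would shift the texts of all later posts, so it is worth comparing the two list lengths before trusting the result.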