matthewmueller / x-ray

The next web scraper. See through the <html> noise.

Selecting items and handling them sequentially

moelfassi opened this issue

Subject of the issue

This is my page:
var html =
  "<div class='time_head'>time_head content1</div>"
  + "<div class='blockfix'>blockfix1</div>"
  + "<div class='blockfix'>blockfix2</div>"
  + "<div class='time_head'>time_head content2</div>"
  + "<div class='blockfix'>blockfix3</div>"
  + "<div class='blockfix'>blockfix4</div>"
  + "<div class='blockfix'>blockfix5</div>";

I need to get the results in document order, with each blockfix listed under the time_head that precedes it, like this:
TIME_HEAD CONTENT1
----blockfix1
----blockfix2
TIME_HEAD CONTENT2
----blockfix3
----blockfix4

This is what I tried so far:
x(html, {
  head: ['.time_head'],
  games: ['.blockfix']
})(function (err, obj) {
  console.log(obj['head']);
  console.log(obj['games']);
});

Actual behaviour

But the result is two separate lists, so I cannot tell which blockfix belongs to which time_head:

[ 'time_head content1', 'time_head content2' ]
[ 'blockfix1', 'blockfix2', 'blockfix3', 'blockfix4', 'blockfix5' ]

Is the number of time_head divs consistent?

No, they are dates of events, so the number varies.

I think the solution is to capture them non-sequentially, then sort them in a post-processing step.
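A minimal sketch of what such a post-processing step might look like. It assumes the scrape can be made to return one flat list in document order, with each entry tagged by its class (for example via a combined selector such as `.time_head, .blockfix`; whether x-ray returns the data in that shape is not confirmed in this thread). The `items` array and the `groupByHead` helper are hypothetical names used only for illustration.

```javascript
// Hypothetical input: a flat list in document order, each entry tagged with its class.
// How this list is obtained from the scraper is an assumption, not shown here.
const items = [
  { cls: 'time_head', text: 'time_head content1' },
  { cls: 'blockfix', text: 'blockfix1' },
  { cls: 'blockfix', text: 'blockfix2' },
  { cls: 'time_head', text: 'time_head content2' },
  { cls: 'blockfix', text: 'blockfix3' },
  { cls: 'blockfix', text: 'blockfix4' },
  { cls: 'blockfix', text: 'blockfix5' }
];

// Walk the list once: each time_head opens a new group,
// and each blockfix is attached to the most recently opened group.
function groupByHead(list) {
  const groups = [];
  for (const item of list) {
    if (item.cls === 'time_head') {
      groups.push({ head: item.text, games: [] });
    } else if (groups.length > 0) {
      groups[groups.length - 1].games.push(item.text);
    }
    // A blockfix that appears before any time_head is skipped.
  }
  return groups;
}

const groups = groupByHead(items);

// Print each head in upper case, followed by its games.
for (const group of groups) {
  console.log(group.head.toUpperCase());
  for (const game of group.games) {
    console.log('----' + game);
  }
}
```

Because the grouping relies only on document order, it does not matter how many time_head divs there are or how many blockfix divs follow each one.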

+1