The index in the xpath doesn't work
leon0707 opened this issue · comments
On page 37:
scrapy shell https://www.gumtree.com/p/commercial-property-to-rent/south-kensington-to-let-serviced-office-space-in-sloane-avenue-sw3-south-kensington/1258815123
>>> response.xpath('//*[@itemprop="price"][1]').extract()
[u'<meta content="1190.00pm">', u'<meta content="1750.00pw">', u'<meta content="346.00pw">', u'<meta content="50.00pw">', u'<meta content="625.00pm">', u'<meta content="250.00pm">', u'<meta content="300.00pm">', u'<meta content="400.00pm">', u'<meta content="500.00pm">', u'<meta content="190.00pm">', u'<meta content="502.00pm">']
The [1] in the XPath doesn't work: it returns every matching <meta ...> element. The first one is the price of the property; the rest are the prices of similar properties.
The XPath copied from Chrome, /html/body/div[2]/div/div[3]/main/div[2]/header/span/meta[2], returns an empty list when I try it.
@lookfwd I appreciate the effort you put into this book.
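I think the XPath used to find the price on a Gumtree page is incorrect.
The correct one should be response.xpath('(//*[@itemprop="price"])[1]').extract()
//*[@itemprop="price"][1]
would return every element whose itemprop is "price" and that is the first such element among its parent's children, so one match per parent rather than one match for the whole page.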
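To make the difference concrete, here is a minimal sketch using lxml, the XPath engine underneath Scrapy's selectors. The markup is a made-up, simplified stand-in for the Gumtree page (three prices, each under a different parent), not the real page source, and the variable names are only for illustration.

```python
from lxml import etree

# Hypothetical, simplified markup: every price element has a different parent.
DOC = """
<body>
  <header><span><meta itemprop="price" content="1190.00pm"/></span></header>
  <ul>
    <li><span><meta itemprop="price" content="1750.00pw"/></span></li>
    <li><span><meta itemprop="price" content="346.00pw"/></span></li>
  </ul>
</body>
""".strip()

tree = etree.fromstring(DOC)

# [1] binds to the step: "first matching element within each parent".
per_parent = tree.xpath('//*[@itemprop="price"][1]/@content')

# Parentheses collect all matches first, then [1] picks the first overall.
first_in_document = tree.xpath('(//*[@itemprop="price"])[1]/@content')

print(per_parent)         # ['1190.00pm', '1750.00pw', '346.00pw']
print(first_in_document)  # ['1190.00pm']
```

The same two expressions should behave the same way when passed to response.xpath(...) in the Scrapy shell, since the predicate binding is part of XPath itself and not specific to this library.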
Thanks @leon0707. Both are correct. I will update them in the next version of the book. Thanks a million!