python - How to select spans under one div but not another via XPath?
Say I have this page:

<div class="top">
    <span class="strings">asdf</span>
    <span class="strings">qwer</span>
    <span class="strings">zxcv</span>
</div>
<div id="content">
    other text
    <span class="strings">1234</span>
    <span class="strings">5678</span>
    <span class="strings">1234</span>
</div>

How can a script scrape the spans with class "strings" that are inside the div with id="content", but not those inside the div with class="top"? The results should be '1234', '5678', '1234'.
Here is my code so far:

from lxml import html
import requests

url = 'http://www.amazon.com/dp/b00sggqrno'
response = requests.get(url)
tree = html.fromstring(response.content)

# Matches every span with class "strings", regardless of which div it is in
bullets = tree.xpath('//span[@class="strings"]/text()')
print('bullets: ', bullets)
To select the text of the span elements (with @class="strings") that are children of the div element with @id="content", use this XPath expression:
//div[@id="content"]/span[@class="strings"]/text()
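As a quick check, the expression can be tried against the sample markup from the question. This is only a minimal sketch: it parses the HTML from a string (the name PAGE is made up for illustration) instead of fetching the real URL, and it assumes the spans are direct children of the content div, as they are in the sample.

```python
from lxml import html

# Sample markup copied from the question, parsed from a string
# instead of being downloaded (PAGE is an illustrative name).
PAGE = """
<div class="top">
  <span class="strings">asdf</span>
  <span class="strings">qwer</span>
  <span class="strings">zxcv</span>
</div>
<div id="content">
  other text
  <span class="strings">1234</span>
  <span class="strings">5678</span>
  <span class="strings">1234</span>
</div>
"""

tree = html.fromstring(PAGE)

# Original expression: matches the spans under both divs.
all_spans = tree.xpath('//span[@class="strings"]/text()')

# Restricted expression: only spans that are children of div#content.
bullets = tree.xpath('//div[@id="content"]/span[@class="strings"]/text()')

print('all spans:', all_spans)
print('bullets:', bullets)
```

If the spans on the real page are nested more deeply inside the content div, replacing the single slash before span with a double slash (//div[@id="content"]//span[@class="strings"]/text()) would match descendants at any depth rather than only direct children.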