python - Scrapy print to JSON file


I run a spider against Craigslist and try to save the results to a JSON file using Scrapy. The spider displays the results in the console, but the .json file is empty. The command I am using is:

scrapy runspider detroit.py -o detroit.json

Can anyone shed a little light on this? Thanks!

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from craigslist_sample.items import CraigslistSampleItem

class MySpider(BaseSpider):
    name = "craig"
    allowed_domains = ["craigslist.org"]
    start_urls = ["http://detroit.craigslist.org/search/sof"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        titles = hxs.select("//span[@class='pl']")
        for titles in titles:
            title = titles.select("a/text()").extract()[0]
            link = titles.select("a/@href").extract()[0]
            print title, link

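That's because you are only printing the results. Scrapy writes to the output file only the items that the callback returns or yields, so you need to instantiate items and return them: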

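def parse(self, response):
    # Yield one item per listing link so the feed exporter has something to write.
    for elm in response.xpath("//span[@class='pl']//a"):
        item = CraigslistSampleItem()
        item["title"] = elm.xpath("text()").extract_first()
        # Attributes are selected with "@href"; a bare "href" would match a child element.
        item["link"] = elm.xpath("@href").extract_first()
        yield item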
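To see why the printing version leaves the file empty, here is a minimal sketch that uses only the standard library. It does not use Scrapy: `export_json` is a hypothetical stand-in for the feed exporter, and `rows` is made-up sample data in place of the scraped listings. The assumption it illustrates is simply that an exporter can serialize only what the callback hands back.

```python
import json


def parse_printing(rows):
    # Mirrors the question's spider: prints each result but yields nothing,
    # so the function implicitly returns None.
    for title, link in rows:
        print(title, link)


def parse_yielding(rows):
    # Mirrors the answer's spider: yields one item per result.
    for title, link in rows:
        yield {"title": title, "link": link}


def export_json(callback_result):
    # Hypothetical stand-in for a feed exporter (not Scrapy's real code):
    # it can only serialize what the callback returned or yielded.
    items = list(callback_result or [])
    return json.dumps(items)


# Made-up sample data standing in for the scraped (title, link) pairs.
rows = [("Python developer", "/sof/1.html"), ("Java developer", "/sof/2.html")]

printed = export_json(parse_printing(rows))   # nothing to export
yielded = export_json(parse_yielding(rows))   # two items exported
```

In this sketch `printed` holds an empty JSON list while `yielded` holds both items, which matches what the question describes: output on the console but nothing in the file.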
