python - Getting crawled information in dictionary format
I am getting the information as plain text, but I want the output in key/value format, e.g.:
{'base pay':'$140,000.00 - $160,000.00 /year'}, {'employment type':'full-time'}, {'job type':'information technology, engineering, professional services'}
This is my code:
from bs4 import BeautifulSoup
import urllib2

website = 'http://www.careerbuilder.com/jobseeker/jobs/jobdetails.aspx?apath=2.21.0.0.0&job_did=j3h7fw656rr51clg5hc&shownewjdp=yes&ipath=rskv'
html = urllib2.urlopen(website).read()
soup = BeautifulSoup(html)

# Print the text of the job snapshot section
for elm in soup.find_all('section', {"id": "job-snapshot-section"}):
    dn = elm.get_text()
    print dn
This is the output of the code:
job snapshot

base pay
$140,000.00 - $160,000.00 /year

employment type
full-time

job type
information technology, engineering, professional services

education
4 year degree

experience
at least 5 year(s)

manages others
not specified

relocation
no

industry
computer software, banking - financial services, biotechnology

required travel
not specified

job id
ee-1213256
I have edited the code as requested to include the required library imports.
I'd suggest the following, where text is the string returned by get_text() (dn in the question):
dict(i.strip().split('\n') for i in text.split('\n\n') if len(i.strip().split('\n')) == 2)
Output:
{'job id': 'ee-1213256', 'manages others': 'not specified', 'job type': 'information technology, engineering, professional services', 'relocation': 'no', 'education': '4 year degree', 'base pay': '$140,000.00 - $160,000.00 /year', 'experience': 'at least 5 year(s)', 'industry': 'computer software, banking - financial services, biotechnology', 'employment type': 'full-time', 'required travel': 'not specified'}
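The one-liner above can be unpacked into a small function, which may be easier to read and debug. This is only a sketch: the names snapshot_to_dict and sample are hypothetical, and the sample string assumes that get_text() returns each key and its value on consecutive lines, with a blank line between pairs. Unlike the one-liner, it also ignores empty lines inside a block.

```python
def snapshot_to_dict(text):
    """Turn 'key\\nvalue' blocks separated by blank lines into a dict."""
    result = {}
    for block in text.split('\n\n'):
        # Drop surrounding whitespace and any empty lines inside the block.
        lines = [line.strip() for line in block.strip().split('\n') if line.strip()]
        # Keep only blocks with exactly one key line and one value line;
        # this skips the lone 'job snapshot' heading.
        if len(lines) == 2:
            key, value = lines
            result[key] = value
    return result


# Hypothetical sample that mimics the assumed layout of the get_text() output.
sample = (
    "job snapshot\n\n"
    "base pay\n$140,000.00 - $160,000.00 /year\n\n"
    "employment type\nfull-time\n\n"
    "relocation\nno\n"
)

parsed = snapshot_to_dict(sample)
print(parsed)
```

If the real page puts the pairs in a different layout (for example, several values under one key), the length check will silently skip those blocks, so it is worth printing repr(text) first to confirm the separators.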