Getting table from HTML file with Python


from urllib.request import urlopen
from bs4 import BeautifulSoup

game_link = "http://espn.go.com/nba/playbyplay?gameid=400579510&period=0"
game_source = urlopen(game_link)
game_html = game_source.read()
game_source.close()
row = BeautifulSoup(game_html, "html.parser")
pieces = list(row.children)

I need the game log rows from the link above. The code above gives me the whole HTML text. How can I extract the tables and turn them into single rows (pieces)?

You can try BeautifulSoup's find_all and supply the tag name plus any other attributes you know about the tags you are looking for. After looking at the page, it looks like you are looking for <tr> tags with the class "even". Use soup.find_all("tr", attrs={"class": "even"}). For example:

import urllib.request
from bs4 import BeautifulSoup

game_link = "http://espn.go.com/nba/playbyplay?gameid=400579510&period=0"
game_source = urllib.request.urlopen(game_link)
game_html = game_source.read()
game_source.close()
soup = BeautifulSoup(game_html, "html.parser")

# find all instances of a row with class "even"
rows = soup.find_all("tr", attrs={"class": "even"})
for row in rows:
    # do your work here
    print(row)

You will still need to parse the HTML in each row. The following is a "crude" example.

def parse_row(row):
    cols = row.find_all("td")  # each column in the row
    # ignore rows with fewer than four columns (timeouts, for example)
    if len(cols) < 4:
        return None
    else:
        return {
            "time": cols[0].get_text(),
            "team1": cols[1].get_text(),
            "score": cols[2].get_text(),
            "team2": cols[3].get_text(),
        }

parsed_rows = []
for row in rows:
    parsed = parse_row(row)
    if parsed:
        parsed_rows.append(parsed)
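If you want to check the row-parsing logic without fetching the page (the URL may no longer serve this content) or without BeautifulSoup installed, here is a minimal sketch that uses only the standard library's html.parser instead of BeautifulSoup. The sample HTML, the RowCollector class and its attribute names are all hypothetical, and the four-column layout (time, team1, score, team2) is assumed from the example above; the real page's markup may differ.

```python
from html.parser import HTMLParser

# Hypothetical sample markup; the real page may be structured differently.
SAMPLE_HTML = """
<table>
  <tr class="even"><td>12:00</td><td>Jump ball</td><td>0-0</td><td></td></tr>
  <tr class="odd"><td>11:40</td><td></td><td>0-2</td><td>Makes layup</td></tr>
  <tr class="even"><td>11:20</td><td colspan="3">Timeout</td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collect the text of each <td> inside <tr> tags that have a given class."""

    def __init__(self, row_class):
        super().__init__()
        self.row_class = row_class
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            classes = (dict(attrs).get("class") or "").split()
            # Only start collecting when the row carries the wanted class.
            self._row = [] if self.row_class in classes else None
            self._cell = None
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_row(cells):
    # Rows with fewer than four cells (timeouts, for example) are skipped.
    if len(cells) < 4:
        return None
    return {"time": cells[0], "team1": cells[1],
            "score": cells[2], "team2": cells[3]}


collector = RowCollector("even")
collector.feed(SAMPLE_HTML)
parsed_rows = [p for p in map(parse_row, collector.rows) if p]
print(parsed_rows)
```

Here parse_row takes a plain list of cell strings rather than a BeautifulSoup tag, so the same dictionary-building step can be tried on any list of four or more strings.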
