python - Limiting the amount read using readline


I'm trying to read the first 100 lines of large text files. Simple code for doing this is shown below. The challenge, though, is that I have to guard against the case of corrupt or otherwise screwy files that don't have any line breaks (yes, people somehow figure out ways to generate these). In those cases I'd still like to read in the data (because I need to see what's going on in there) but limit it to, say, n bytes.

The only way I can think of is to read the file char by char. Other than being slow (probably not an issue for only 100 lines), I'm worried that I'll run into trouble when I encounter a file using a non-ASCII encoding.

Is it possible to limit the bytes read using readline()? Or is there a more elegant way to handle this?

line_count = 0
with open(filepath, 'r') as f:
    for line in f:
        line_count += 1
        print('{0}: {1}'.format(line_count, line))
        if line_count == 100:
            break

Edit:

As @fredrik correctly pointed out, readline() accepts an argument that limits the number of chars read (I'd thought it was a buffer size param). So, for my purposes, the following works quite well:

max_bytes = 1024 * 1024
bytes_read = 0
line_count = 0

with open(filepath, "r") as fo:
    # readline(size) returns at most `size` chars, even if no newline is found
    line = fo.readline(max_bytes)
    while line != '':
        bytes_read += len(line)
        line_count += 1
        print('{0}: {1}'.format(line_count, line))
        # >= rather than == so the loop cannot run past the limit
        if line_count == 100 or bytes_read >= max_bytes:
            break
        line = fo.readline(max_bytes - bytes_read)
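One caveat: in Python 3, a file opened in text mode counts the readline() size argument in characters, not bytes. Below is a minimal sketch of a byte-accurate variant, assuming the file is opened in binary mode and decoded afterwards. The helper name `read_head` and its parameters are hypothetical, and decoding as UTF-8 with `errors="replace"` is an assumption, chosen so that corrupt bytes stay visible instead of raising an exception.

```python
def read_head(filepath, max_lines=100, max_bytes=1024 * 1024):
    """Return up to max_lines lines, reading at most max_bytes bytes in total."""
    lines = []
    bytes_read = 0
    # Binary mode: readline(size) limits bytes here, and len() counts bytes.
    with open(filepath, "rb") as f:
        while len(lines) < max_lines and bytes_read < max_bytes:
            chunk = f.readline(max_bytes - bytes_read)
            if not chunk:  # end of file
                break
            bytes_read += len(chunk)
            # Assumed encoding; errors="replace" substitutes U+FFFD for
            # undecodable bytes instead of raising UnicodeDecodeError.
            lines.append(chunk.decode("utf-8", errors="replace"))
    return lines
```

If the byte limit happens to cut a multi-byte character in half, the trailing fragment shows up as a replacement character, which seems acceptable when the goal is only to inspect a suspicious file.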

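If you have a file:

f = open("a.txt", "r")
f.readline(size)

The size parameter specifies the maximum number of bytes to read. (In Python 3, a file opened in text mode counts this limit in characters rather than bytes.)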
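To see the limit in action, here is a small sketch that writes a throwaway file with no line breaks and reads it back with readline(size). The temporary file and the 1024-byte limit are made up for illustration, and the file is opened in binary mode so that the limit really is measured in bytes.

```python
import os
import tempfile

# Hypothetical test file: 5000 bytes with no line breaks at all.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"x" * 5000)
    path = tmp.name

with open(path, "rb") as f:
    first = f.readline(1024)   # stops after 1024 bytes, newline or not
    second = f.readline(1024)  # continues where the previous call stopped

print(len(first), len(second))  # prints "1024 1024"
os.remove(path)
```

Each call returns as soon as it hits a newline or the size limit, whichever comes first, so repeated calls walk through a newline-free file in bounded pieces.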

