by rupe

How do I read a huge file line by line in Python, without loading the entire thing into memory first?

In Python, the most common way to read lines from a file is to do the following:

for line in open('myfile','r').readlines():
    do_something(line)


However, readlines() reads the entire file into memory and returns it as a list before the loop even starts. A better approach for large files is to use the fileinput module, as follows:

import fileinput
for line in fileinput.input(['myfile']):
    do_something(line)


The fileinput.input() call reads lines sequentially and doesn't keep them in memory after they've been read.
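
As a rough illustration of the fileinput approach, here is a minimal sketch that counts the lines in one or more files while holding only one line at a time. It assumes Python 3 (where fileinput.input() can be used in a with statement); count_lines and paths are names made up for this example.

```python
import fileinput

def count_lines(paths):
    """Count lines across the given files, one line in memory at a time.

    `paths` is assumed to be a non-empty list of file names.
    """
    total = 0
    # The with statement closes the input stream when the loop is done.
    with fileinput.input(files=paths) as stream:
        for _line in stream:
            total += 1
    return total
```

Called as count_lines(['myfile']), this walks the file once without ever building a list of its lines.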

 


Annotation by enki :
As a file object is iterable, why not simply iterate over it? (Use iter() when you need more control over the iterator's state.)

Example:

for line in open('myfile','r'):
    do_something(line)
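
To show what "more control over the iterator's state" might look like, here is a small sketch that pulls the first line off by hand with next() and then streams the rest. It assumes Python 3; process_without_header and handle_line are hypothetical names for this example, and the with statement is added so the file is closed when the loop ends.

```python
def process_without_header(path, handle_line):
    """Skip the first line of a file, then stream the rest line by line."""
    with open(path, 'r') as f:
        lines = iter(f)              # a file object is its own iterator
        header = next(lines, None)   # consume the first line; None if the file is empty
        for line in lines:           # continues from the second line
            handle_line(line)
    return header
```

Because the for loop reuses the same iterator, it picks up where next() left off, and at no point is more than one line read ahead on purpose.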

 


 
From The Yak's Frequently Questioned Answers (mod.2008-06-12)