by rupe

How do I read a huge file line by line in Python, without loading the entire thing into memory first?

In Python, the most common way to read lines from a file is to do the following:

for line in open('myfile','r').readlines():
    pass  # process each line here

The catch is that readlines() builds a list of every line, so the entire file is loaded into memory before the loop even starts. A better approach for large files is to use the fileinput module, as follows:

import fileinput
for line in fileinput.input(['myfile']):
    pass  # process each line here

The fileinput.input() call reads lines sequentially, but doesn't keep them in memory after they've been read.
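
As a rough sketch of the fileinput approach in a complete script: the helper name count_lines and the throwaway sample file are made up for illustration (they stand in for your own processing and for 'myfile'), and the with-statement form assumes Python 3.2 or later.

```python
import fileinput
import os
import tempfile


def count_lines(paths):
    """Count lines across the given files, holding one line at a time."""
    total = 0
    # fileinput.input() can be used as a context manager in Python 3.2+,
    # which closes the underlying files when the block ends.
    with fileinput.input(files=paths) as stream:
        for _line in stream:
            total += 1
    return total


# Demo on a small throwaway file (hypothetical stand-in for a huge 'myfile').
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('one\ntwo\nthree\n')
    sample_path = tmp.name

line_count = count_lines([sample_path])
os.remove(sample_path)
print(line_count)
```

One thing to watch: if the list of paths is empty, fileinput falls back to the command-line arguments or standard input, so pass the file names explicitly.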


Annotation by enki:
As 'file' is iterable, why not simply iterate over it? (Use iter() when you need more control over the iterator's state.)


for line in open('myfile','r'):
    pass  # process each line here
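
Here is one way the direct-iteration idea might look in a full script, including the iter() remark. The function name skip_header_and_count, the sample data, and the "first line is a header" layout are all assumptions for illustration. The with statement is an addition that closes the file when the loop finishes.

```python
import os
import tempfile


def skip_header_and_count(path, needle):
    """Return the first line and the number of later lines containing needle."""
    hits = 0
    with open(path, 'r') as handle:   # file is closed when the block ends
        lines = iter(handle)          # explicit iterator over the file's lines
        header = next(lines, None)    # pull one line off by hand
        for line in lines:            # the loop resumes at the second line
            if needle in line:
                hits += 1
    return header, hits


# Demo on a small throwaway file (hypothetical stand-in for a huge 'myfile').
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('name\nalpha\nbeta\nalphabet\n')
    sample_path = tmp.name

header, hits = skip_header_and_count(sample_path, 'alpha')
os.remove(sample_path)
print(repr(header), hits)
```

Only the current line is held at any point, so memory use stays flat no matter how large the file is.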

