By Vasudev Ram
This post is about "regular" functions versus generator functions in Python. I'm using the term "regular" functions for lack of a better word; what I mean by that is non-generator functions.
Consider this text file, test1.txt:
    this is a line with a foo and another foo and one more foo. the foo brown foo jumped over the lazy foo foo are you. you are foo.

Here is a program, with a "regular" function, to process all the lines in that text file:
    # regular_text_proc.py

    import string

    # Replace instances of the string old_pat with new_pat in line.
    def process_line(line, old_pat, new_pat):
        return line.replace(old_pat, new_pat)

    # Process a text file, calling process_line on each line.
    def regular_text_proc(filename, old_pat, new_pat):
        new_lines = []
        with open(filename) as fp:
            for line in fp:
                new_line = process_line(line, old_pat, new_pat)
                new_lines.append(new_line)
        return new_lines

    def main():
        newlines = regular_text_proc("test1.txt", "foo", "bar")
        print "new file:"
        for line in newlines:
            print line,

    main()

This command:
    python regular_text_proc.py

gives this output:
    new file:
    this is a line with a bar and another bar and one more bar. the bar brown bar jumped over the lazy bar bar are you. you are bar.

Here is a program, with a generator function, to do the same kind of processing of the same file:
    # lazy_text_proc.py
    # Lazy text processing with Python generators.

    import string

    # Replace instances of the string old_pat with new_pat in line.
    def process_line(line, old_pat, new_pat):
        return line.replace(old_pat, new_pat)

    # Process a text file lazily, calling process_line on each line.
    def lazy_text_proc(filename, old_pat, new_pat):
        with open(filename) as fp:
            for line in fp:
                new_line = process_line(line, old_pat, new_pat)
                yield new_line

    def main():
        newlines = lazy_text_proc("test1.txt", "foo", "bar")
        print "type(newlines) =", type(newlines)
        # Line below will give error if uncommented, because
        # newlines is not a list, it is a generator.
        #print "len(newlines) =", len(newlines)
        print "new file:"
        for lin in newlines:
            print lin,

    main()

This command:
    python lazy_text_proc.py

gives the same output as the regular_text_proc.py program, except for the type(newlines) output, which I added, to show that the variable called 'newlines', in this program, is not a list but a generator. (It is a list in the regular_text_proc.py program.)
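The point about 'newlines' being a generator rather than a list can be checked in isolation. The following is a minimal sketch, written for Python 3 (the programs above use Python 2 print statements). It assumes a hypothetical helper, lazy_lines, that takes a list of strings instead of a filename, and the sample lines are stand-ins for test1.txt. It shows that calling len() on the generator raises a TypeError, and that the generator can be iterated only once.

```python
# Sketch for Python 3. lazy_lines and the sample data are hypothetical
# stand-ins for lazy_text_proc and test1.txt.
import types

def process_line(line, old_pat, new_pat):
    return line.replace(old_pat, new_pat)

def lazy_lines(lines, old_pat, new_pat):
    # Generator function: yields one processed line at a time.
    for line in lines:
        yield process_line(line, old_pat, new_pat)

sample = ["a foo\n", "the foo brown foo\n"]
gen = lazy_lines(sample, "foo", "bar")

# Calling a generator function returns a generator object, not a list.
is_generator = isinstance(gen, types.GeneratorType)

# A generator has no length, so len() raises TypeError.
try:
    len(gen)
    len_error = None
except TypeError as exc:
    len_error = str(exc)

# The first full pass consumes the generator; a second pass finds nothing.
first_pass = list(gen)
second_pass = list(gen)

print(is_generator)
print(len_error)
print(first_pass)
print(second_pass)
```

If the length or a second pass is needed, one option is to build a list from the generator with list(...), at the cost of holding every line in memory.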
I found the difference between these two programs, one with a regular function and the other with a generator function, to be interesting in a few ways. I'll discuss that in my next blog post.
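One difference that can be observed directly is when the per-line work happens. Here is a rough sketch for Python 3, under stated assumptions: regular_proc and lazy_proc are hypothetical in-memory versions of the two functions above, and the trace list is added only to record each call to process_line. It is an illustration, not a full comparison of the two approaches.

```python
# Sketch for Python 3. regular_proc, lazy_proc and trace are hypothetical
# names introduced only for this illustration.
trace = []

def process_line(line, old_pat, new_pat):
    trace.append(line)  # record that this line was processed
    return line.replace(old_pat, new_pat)

def regular_proc(lines, old_pat, new_pat):
    # Regular function: processes every line before returning.
    new_lines = []
    for line in lines:
        new_lines.append(process_line(line, old_pat, new_pat))
    return new_lines

def lazy_proc(lines, old_pat, new_pat):
    # Generator function: processes a line only when one is requested.
    for line in lines:
        yield process_line(line, old_pat, new_pat)

sample = ["foo one", "foo two"]

eager = regular_proc(sample, "foo", "bar")
calls_after_eager = len(trace)      # all lines already processed

trace.clear()
lazy = lazy_proc(sample, "foo", "bar")
calls_after_creating = len(trace)   # nothing processed yet
first = next(lazy)
calls_after_first = len(trace)      # exactly one line processed
rest = list(lazy)
calls_after_all = len(trace)        # remaining lines processed on demand

print(calls_after_eager, calls_after_creating, calls_after_first, calls_after_all)
```

In this sketch the regular function does all of its work up front and returns a complete list, while the generator does no work until the caller asks for the next line.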
The Wikipedia article on generators is of interest.
- Vasudev Ram - Dancing Bison Enterprises