Thursday, September 4, 2014

Python Idiom: First Occurrence

Finding the first occurrence in a collection of data is a common problem. 
 

# Non-idiomatic
found_line = None
for line in logfile:
    if regex.match(line):
        found_line = line
        break
return found_line

Compared to

# Idiomatic
return next((line for line in logfile if regex.match(line)), None)


or

# Idiomatic (thanks to Suresh V)
from itertools import dropwhile
return next(dropwhile(lambda x: not regex.match(x), logfile), None)


The idiomatic solution is not only more compact, but it also reads better. Because the generator expression is evaluated lazily, it stops at the first match and never builds an intermediate list, so the interpreter does not have to allocate memory for lines that are never needed.
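
The variants above can be compared in a small, self-contained sketch. The sample log lines, the `ERROR` pattern, and the helper name `first_match` are assumptions made for illustration; they are not part of the original snippets.

```python
import re
from itertools import dropwhile

# Hypothetical sample data standing in for an open log file.
LOG_LINES = [
    "INFO starting up",
    "ERROR disk full",
    "INFO retrying",
    "ERROR disk still full",
]
regex = re.compile(r"ERROR")  # assumed pattern, for illustration only


def first_match(lines, pattern):
    """Return the first line that matches pattern, or None if none does."""
    # The generator expression needs its own parentheses because next()
    # is also given a second argument (the default).
    return next((line for line in lines if pattern.match(line)), None)


logfile = iter(LOG_LINES)      # an iterator, like a file object
found = first_match(logfile, regex)
remaining = list(logfile)      # lines after the match are still unread

# The dropwhile variant, run on a fresh iterator over the same data.
found_dropwhile = next(
    dropwhile(lambda x: not regex.match(x), iter(LOG_LINES)), None
)

# With no matching line, the default is returned instead of StopIteration.
missing = first_match(iter(["INFO only"]), regex)
```

Since `next()` pulls only as many items as it needs, the iterator is left positioned just after the match, which is what makes it possible to keep reading from `logfile` afterward.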

5 comments:

  1. Idiomatic doesn't mean "pack as much into one expression as you can". I'd say instead that fixing your first example to avoid the unnecessary pieces:

    for line in logfile:
    ....if regex.match(line):
    ........return line
    return None

    makes it nearly as idiomatic as your second version.

    Replies
    1. I am not "packing as much into one expression as" I can. It is an often-used and readable expression.

      Changing the example to use returns presumes that we don't want to use the iterator afterward. It also seems unnecessary to extract this logic into its own function unless there is a significant number of additional operations occurring in the function.

  2. Even better, use itertools.dropwhile.

  3. Your 'idiomatic' example does not work unless you wrap the generator expression in an extra set of parentheses.
