Session Eight: Generators, Iterators, Decorators, and Context Managers

The tools of Pythonicity

Last Session!

  • All this material is optional
  • 5 optional homeworks in canvas
  • Turn in your homework by next Wednesday (October 14)
  • You must get 85% of the points in class to pass.

Today’s Agenda

  • Review Circle Testing, HTML Renderer
  • Complexity, Data Structures
  • Decorators
  • Iterators
  • Generators
  • Context Managers
  • The Future

Review/Questions

Review of Previous Class

  • Advanced OO Concepts
    • Properties
    • Special Methods
  • Testing with pytest

Homework review

  • Circle Class
  • Writing Tests using the pytest module

Decorators

A Short Digression

Functions are things that generate values based on input (arguments).

In Python, functions are first-class objects.

This means that you can bind symbols to them, pass them around, just like other objects.

Because of this fact, you can write functions that take functions as arguments and/or return functions as values (we played with this a bit with the lambda magic assignment):

def substitute(a_function):
    def new_function(*args, **kwargs):
        return u"I'm not that other function"
    return new_function
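
A quick sketch of what calling substitute does. The greet function here is a hypothetical stand-in made up for illustration, not something from the class examples:

```python
def substitute(a_function):
    # Ignore a_function entirely and hand back a different function.
    def new_function(*args, **kwargs):
        return "I'm not that other function"
    return new_function


def greet(name):  # hypothetical example function
    return "Hello, " + name


replaced = substitute(greet)  # a function passed in as an argument...
print(greet("Ada"))           # the original is untouched
print(replaced("Ada"))        # ...and a different function came back
```

Note that substitute never calls the function it was given; a useful decorator normally does.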

A Definition

There are many things you can do with a simple pattern like this one.

So many, that we give it a special name:

Decorator

“A decorator is a function that takes a function as an argument and returns a function as a return value.”

That’s nice and all, but why is that useful?

An Example

Imagine you are trying to debug a module with a number of functions like this one:

def add(a, b):
    return a + b

You want to see when each function is called, with what arguments and with what result. So you rewrite each function as follows:

def add(a, b):
    print(u"Function 'add' called with args: %r" % locals())
    result = a + b
    print(u"\tResult --> %r" % result)
    return result

That’s not particularly nice, especially if you have lots of functions in your module.

Now imagine we defined the following, more generic decorator:

def logged_func(func):
    def logged(*args, **kwargs):
        print(u"Function %r called" % func.__name__)
        if args:
            print(u"\twith args: %r" % args)
        if kwargs:
            print(u"\twith kwargs: %r" % kwargs)
        result = func(*args, **kwargs)
        print(u"\t Result --> %r" % result)
        return result
    return logged

We could then make logging versions of our module functions:

logging_add = logged_func(add)

Then, where we want to see the results, we can use the logged version:

In [37]: logging_add(3, 4)
Function 'add' called
    with args: (3, 4)
     Result --> 7
Out[37]: 7

This is nice, but we have to call the new function wherever we originally had the old one.

It’d be nicer if we could just call the old function and have it log.

Remembering that you can easily rebind symbols in Python using assignment statements leads you to this form:

# logged_func is implemented above

def add(a, b):
    return a + b
add = logged_func(add)

And now you can simply use the code you’ve already written and calls to add will be logged:

In [41]: add(3, 4)
Function 'add' called
    with args: (3, 4)
     Result --> 7
Out[41]: 7

Syntax

Rebinding the name of a function to the result of calling a decorator on that function is called decoration.

Because this is so common, Python provides special syntax to perform it more declaratively: the @ symbol:

# this is the imperative version:
def add(a, b):
    return a + b
add = logged_func(add)

# and this declarative form is exactly equivalent:
@logged_func
def add(a, b):
    return a + b

The declarative form (called a decorator expression) is far more common, but both have the identical result, and can be used interchangeably.

Callables

Our original definition of a decorator was nice and simple, but a tiny bit incomplete.

In reality, decorators can be used with anything that is callable.

In Python, a callable is a function, a method, a class, or an instance of a class that implements the __call__ special method.

So in fact the definition should be updated as follows:

“A decorator is a callable that takes a callable as an argument and returns a callable as a return value.”
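
To make "callable" concrete, here is a minimal sketch using the callable() builtin. The double function and the Greeter class are hypothetical names invented for this illustration:

```python
def double(x):
    return 2 * x


class Greeter:
    def hello(self):
        return "hello"

    def __call__(self, name):
        # Defining __call__ makes instances usable like functions.
        return "hello, " + name


g = Greeter()

checks = {
    "function": callable(double),
    "bound method": callable(g.hello),
    "class": callable(Greeter),
    "instance with __call__": callable(g),
    "plain integer": callable(42),
}
for kind, result in checks.items():
    print(kind, "->", result)

print(g("Ada"))  # calling the instance runs Greeter.__call__
```

Everything except the integer reports True, which is why a class (like the Memoize example that follows) can serve as a decorator.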

Intuitively

Functions are like recipes, little programs that take in input (ingredients) and produce an output (food).

Decorators are like recipes for changing recipes (meta-recipes). For example, let’s say you like garlic, and you believe that there is no way to use too much garlic. You can have a meta-recipe which takes in a normal recipe, checks if there is garlic, doubles the amount, and then outputs a new recipe.

An Example

Consider a decorator that would save the results of calling an expensive function with given arguments:

class Memoize:
    """Provide a decorator class that caches expensive function results

    from avinash.vora http://avinashv.net/2008/04/python-decorators-syntactic-sugar/
    """
    def __init__(self, function):  # runs when memoize class is called
        self.function = function
        self.memoized = {}

    def __call__(self, *args):  # runs when memoize instance is called
        try:
            return self.memoized[args]
        except KeyError:
            self.memoized[args] = self.function(*args)
            return self.memoized[args]

Let’s try that out with a potentially expensive function:

In [56]: @Memoize
   ....: def sum2x(n):
   ....:     return sum(2 * i for i in range(n))
   ....:

In [57]: sum2x(10000000)
Out[57]: 99999990000000

In [58]: sum2x(10000000)
Out[58]: 99999990000000

It’s nice to see that in action, but what if we want to know exactly how much difference it made?

Nested Decorators

You can stack decorator expressions. The result is like calling each decorator in order, from bottom to top:

@decorator_two
@decorator_one
def func(x):
    pass

# is exactly equal to:
def func(x):
    pass
func = decorator_two(decorator_one(func))
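
The order is easier to see with two decorators whose effects are visible in the return value. This is only a sketch; square_brackets and parentheses are hypothetical decorators made up for the demonstration:

```python
def square_brackets(func):
    def wrapper(*args, **kwargs):
        return "[" + func(*args, **kwargs) + "]"
    return wrapper


def parentheses(func):
    def wrapper(*args, **kwargs):
        return "(" + func(*args, **kwargs) + ")"
    return wrapper


@square_brackets      # applied second, so it ends up outermost
@parentheses          # applied first, closest to the function
def greet(name):
    return "hi " + name


def plain_greet(name):
    return "hi " + name

# The imperative spelling of the same stacking:
manual = square_brackets(parentheses(plain_greet))

print(greet("Ada"))   # [(hi Ada)]
print(manual("Ada"))  # [(hi Ada)]
```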

Let’s define another decorator that will time how long a given call takes:

import time
def timed_func(func):
    def timed(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        elapsed = time.time() - start
        print(u"time expired: %s" % elapsed)
        return result
    return timed

And now we can use this new decorator stacked along with our memoizing decorator:

In [71]: @timed_func
   ....: @Memoize
   ....: def sum2x(n):
   ....:     return sum(2 * i for i in range(n))
   ....:

In [72]: sum2x(10000000)
time expired: 0.997071027756
Out[72]: 99999990000000

In [73]: sum2x(10000000)
time expired: 4.05311584473e-06
Out[73]: 99999990000000

Examples from the Standard Library

It’s going to be a lot more common for you to use pre-defined decorators than for you to be writing your own.

Let’s see a few that might help you with work you’ve been doing recently.

For example, we saw that staticmethod() can be applied with a decorator expression. This:

class C(object):
    def add(a, b):
        return a + b
    add = staticmethod(add)

can be implemented as:

class C(object):
    @staticmethod
    def add(a, b):
        return a + b

And the classmethod() builtin can do the same thing:

In imperative style...

class C(object):
    def from_iterable(cls, seq):
        pass  # method body
    from_iterable = classmethod(from_iterable)

and in declarative style:

class C(object):
    @classmethod
    def from_iterable(cls, seq):
        pass  # method body

Perhaps most commonly, you’ll see the property() builtin used this way.

Remember this from last week?

class C(object):
    def __init__(self):
        self._x = None
    def getx(self):
        return self._x
    def setx(self, value):
        self._x = value
    def delx(self):
        del self._x
    x = property(getx, setx, delx,
                 u"I'm the 'x' property.")

With decorator expressions, the same class becomes:

class C(object):
    def __init__(self):
        self._x = None
    @property
    def x(self):
        return self._x
    @x.setter
    def x(self, value):
        self._x = value
    @x.deleter
    def x(self):
        del self._x

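Note that in this case, the property object returned by the property decorator itself provides further decorators, setter and deleter, as attributes. Each one returns a new property object, which is bound to the name x again; that is why all three methods are named x.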
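
A short usage sketch of the decorated class, to show which method runs for each kind of access (the values assigned are arbitrary):

```python
class C:
    def __init__(self):
        self._x = None

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @x.deleter
    def x(self):
        del self._x


c = C()
print(c.x)               # getter runs: None
c.x = 5                  # setter runs
print(c.x)               # getter runs: 5
del c.x                  # deleter runs and removes _x
print(hasattr(c, "_x"))  # False
print(type(C.x))         # the class attribute is a property object
```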

Does this make more sense now?

Iterators and Generators

Iterators

Iterators are one of the main reasons Python code is so readable:

for x in just_about_anything:
    do_stuff(x)

It does not have to be a “sequence”: list, tuple, etc.

Rather: you can loop through anything that satisfies the “iterator protocol”

http://docs.python.org/library/stdtypes.html#iterator-types

The Iterator Protocol

An iterator must have the following methods:

an_iterator.__iter__()

Returns the iterator object itself. This is required to allow both containers and iterators to be used with the for and in statements.

an_iterator.__next__()

Returns the next item from the container. If there are no further items, raises the StopIteration exception.

List as an Iterator:

In [10]: a_list = [1,2,3]

In [11]: list_iter = a_list.__iter__()

In [12]: list_iter.__next__()
Out[12]: 1

In [13]: list_iter.__next__()
Out[13]: 2

In [14]: list_iter.__next__()
Out[14]: 3

In [15]: list_iter.__next__()
--------------------------------------------------
StopIteration     Traceback (most recent call last)
<ipython-input-15-1a7db9b70878> in <module>()
----> 1 list_iter.__next__()
StopIteration:

Making an Iterator

A simple version of range() (whoo hoo!)

class IterateMe_1(object):
    def __init__(self, stop=5):
        self.current = -1
        self.stop = stop
    def __iter__(self):
        return self
    def __next__(self):
        # advance first, so the values run 0, 1, ..., stop - 1 like range(stop)
        self.current += 1
        if self.current < self.stop:
            return self.current
        else:
            raise StopIteration

(demo: examples/iterator_1.py)

iter()

How do you get the iterator object (the thing with the __next__() method) from an “iterable”?

The iter() function:

In [20]: iter([2,3,4])
Out[20]: <list_iterator at 0x101e01350>

In [21]: iter(u"a string")
Out[21]: <str_iterator at 0x101e01090>

In [22]: iter( (u'a', u'tuple') )
Out[22]: <tuple_iterator at 0x101e01710>

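For an arbitrary object, iter() calls the __iter__ method. If the object has no __iter__ method, iter() falls back to the older sequence protocol: it indexes the object with __getitem__, starting from 0, until an IndexError is raised (in Python 2, str worked this way).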
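
A minimal sketch of both routes iter() can take. Countdown and Squares are hypothetical classes written only for this demonstration:

```python
class Countdown:
    """Hypothetical iterable: provides __iter__."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Hand back a fresh iterator each time iter() is called.
        return iter(range(self.start, 0, -1))


class Squares:
    """Hypothetical class with no __iter__, only __getitem__."""
    def __getitem__(self, index):
        if index >= 4:
            raise IndexError(index)  # tells iter() where to stop
        return index * index


print(list(Countdown(3)))  # [3, 2, 1]
print(list(Squares()))     # [0, 1, 4, 9]

# An iterator's __iter__ returns the iterator itself:
it = iter([2, 3, 4])
print(iter(it) is it)      # True
```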

What does for do?

Now that we know the iterator protocol, we can write something like a for loop:

(examples/my_for.py)

def my_for(an_iterable, func):
    """
    Emulation of a for loop.
    func() will be called with each item in an_iterable
    """
    # equiv of "for i in l:"
    iterator = iter(an_iterable)
    while True:
        try:
            i = next(iterator)
        except StopIteration:
            break
        func(i)

Itertools

itertools is a collection of utilities that make it easy to build an iterator that iterates over sequences in various common ways.

http://docs.python.org/library/itertools.html

NOTE:

Iterators are not only for for loops.

They can be used with anything that expects an iterable, for example:

sum, tuple, sorted, and list
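
A small sketch of iterators being consumed by something other than a for loop, built with a few itertools functions. The particular values are made up for illustration:

```python
import itertools

# count() is infinite; islice() takes just the first five values.
evens = itertools.islice(itertools.count(0, 2), 5)
total = sum(evens)                      # 0 + 2 + 4 + 6 + 8
print(total)                            # 20

# chain() walks several iterables one after another.
as_tuple = tuple(itertools.chain([3, 1], (2,), "ab"))
print(as_tuple)                         # (3, 1, 2, 'a', 'b')

ordered = sorted(itertools.chain([3, 1], [2]))
print(ordered)                          # [1, 2, 3]

# zip() stops at the shortest input, so the infinite count() is safe here.
pairs = list(zip("abc", itertools.count(1)))
print(pairs)                            # [('a', 1), ('b', 2), ('c', 3)]
```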

LAB / Homework

In the examples dir, you will find: iterator_1.py

  • Extend iterator_1.py to be more like range() – add three input parameters: IterateMe_2(start, stop, step=1)
  • See what happens if you break out in the middle of the loop.

To run these, use the -i flag of python3 to load the module and then enter interactive mode.

it = IterateMe_2(2, 20, 2)
for i in it:
    if i > 10:
        break
    print(i)

And then pick up again:

for i in it:
    print(i)

  • Does range() behave the same?
    • make yours match range()

Generators

Generators give you the iterator immediately:

  • no access to the underlying data ... if it even exists

Conceptually:

Iterators are about various ways to loop over data; generators generate the data on the fly.

Practically:

You can use either one either way (and a generator is one type of iterator).

Generators do some of the book-keeping for you.

yield

yield is a way to make a quickie generator with a function:

def a_generator_function(params):
    some_stuff
    yield something

Generator functions “yield” a value, rather than returning a value.

State is preserved in between yields.

A function with yield in it is a “factory” for a generator

Each time you call it, you get a new generator:

gen_a = a_generator_function(params)
gen_b = a_generator_function(params)

Each instance keeps its own state.

A generator function is really just shorthand for an iterator class, with the bookkeeping done for you.
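
A minimal sketch of two generators made by the same generator function, each keeping its own state. The counter function is a hypothetical example, not one of the class files:

```python
def counter(limit):
    """Hypothetical generator function: yields 0, 1, ..., limit - 1."""
    n = 0
    while n < limit:
        yield n      # execution pauses here until the next value is requested
        n += 1


gen_a = counter(3)
gen_b = counter(3)

print(next(gen_a))   # 0
print(next(gen_a))   # 1
print(next(gen_b))   # 0   (gen_b has not been affected by gen_a)
print(list(gen_a))   # [2]
print(list(gen_b))   # [1, 2]
```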

An example: like range() (xrange() in Python 2)

def y_xrange(start, stop, step=1):
    i = start
    while i < stop:
        yield i
        i += step

Real World Example from FloatCanvas:

https://github.com/svn2github/wxPython/blob/master/3rdParty/FloatCanvas/floatcanvas/FloatCanvas.py#L100

Note:

In [164]: gen = y_xrange(2,6)
In [165]: type(gen)
Out[165]: generator
In [166]: dir(gen)
Out[166]:
...
 '__iter__',
...
 '__next__',

So the generator is an iterator

A generator function can also be a method in a class

More about iterators and generators:

http://www.learningpython.com/2009/02/23/iterators-iterables-and-generators-oh-my/

examples/yield_example.py

generator comprehension

Yet another way to make a generator (also called a generator expression):

>>> [x * 2 for x in [1, 2, 3]]
[2, 4, 6]
>>> (x * 2 for x in [1, 2, 3])
<generator object <genexpr> at 0x10911bf50>
>>> for n in (x * 2 for x in [1, 2, 3]):
...   print(n)
...
2
4
6

More interesting if [1, 2, 3] is also a generator
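
A sketch of that case: when the source is itself a generator, nothing is computed until a value is actually requested. The numbers function and the produced list are hypothetical, added only to make the laziness visible:

```python
produced = []  # records which source values have been generated so far


def numbers():
    for n in range(1, 4):
        produced.append(n)
        yield n


doubled = (x * 2 for x in numbers())
print(produced)          # []  (nothing has run yet)

first = next(doubled)
print(first, produced)   # 2 [1]

rest = list(doubled)
print(rest, produced)    # [4, 6] [1, 2, 3]
```

With a list as the source, all the source values would exist before the first doubled value was requested.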

Generator LAB / Homework

Write a few generators:

  • Sum of integers
  • Doubler
  • Fibonacci sequence
  • Prime numbers

(test code in examples/test_generator.py)

Descriptions:

Sum of the integers:

keep adding the next integer

0 + 1 + 2 + 3 + 4 + 5 + ...

so the sequence is:

0, 1, 3, 6, 10, 15 .....

Doubler:

Each value is double the previous value:

1, 2, 4, 8, 16, 32,

Fibonacci sequence:

The fibonacci sequence as a generator:

f(n) = f(n-1) + f(n-2)

1, 1, 2, 3, 5, 8, 13, 21, 34...

Prime numbers:

Generate the prime numbers (numbers only divisible by themselves and 1):

2, 3, 5, 7, 11, 13, 17, 19, 23...

Others to try:
Try x^2, x^3, counting by threes, x^e, counting by minus seven, ...

Context Managers

A Short Digression

Repetition in code stinks.

A large source of repetition in code deals with the handling of external resources.

As an example, how many times do you think you might type the following code:

file_handle = open(u'filename.txt', u'r')
file_content = file_handle.read()
file_handle.close()
# do some stuff with the contents

What happens if you forget to call .close()?

What happens if reading the file raises an exception?

Resource Handling

Leaving an open file handle lying around is bad enough. What if the resource is a network connection, or a database cursor?

You can write more robust code for handling your resources:

file_handle = open(u'filename.txt', u'r')
try:
    file_content = file_handle.read()
finally:
    file_handle.close()
# do something with file_content here

But what exceptions do you want to catch? And do you really want to have to remember all that every time you open a file (or other resource)?

Starting in version 2.5, Python provides a structure for reducing the repetition needed to handle resources like this.

Context Managers

You can encapsulate the setup, error handling and teardown of resources in a few simple steps.

The key is to use the with statement.

Since the introduction of the with statement in PEP 343, the above six lines of defensive code have been replaced with this simple form:

with open(u'filename', u'r') as file_handle:
    file_content = file_handle.read()
# do something with file_content

The open builtin returns a file object that works as a context manager.

The resource it returns (file_handle) is automatically and reliably closed when the code block ends.

At this point in Python history, many functions you might expect to behave this way do:

  • open and codecs.open both work as context managers
  • network connections via socket do as well.
  • most implementations of database wrappers can open connections or cursors as context managers.
  • ...

But what if you are working with a library that doesn’t support this (urllib in Python 2, for example)?

There are a couple of ways you can go.

If the resource in question has a .close() method, then you can simply use the closing context manager from contextlib to handle the issue:

from contextlib import closing
from urllib.request import urlopen  # in Python 2 this was urllib.urlopen

with closing(urlopen('http://google.com')) as web_connection:
    # do something with the open resource
    page = web_connection.read()
# and here, it will be closed automatically
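
The same idea can be tried without a network connection. In this sketch FakeConnection is a hypothetical stand-in for any resource that has a close() method but no with support of its own:

```python
from contextlib import closing


class FakeConnection:
    """Hypothetical resource: has close(), but no __enter__/__exit__."""
    def __init__(self):
        self.is_open = True

    def read(self):
        return "some data"

    def close(self):
        self.is_open = False


conn = FakeConnection()
with closing(conn) as c:
    data = c.read()
print(data, conn.is_open)   # some data False

# close() is still called when the block raises:
conn2 = FakeConnection()
try:
    with closing(conn2):
        raise RuntimeError("boom")
except RuntimeError:
    pass
print(conn2.is_open)        # False
```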

But what if the thing doesn’t have a close() method, or you’re creating the thing and it shouldn’t?

You can also define a context manager of your own.

The interface is simple. It must be a class that implements these two special methods:

__enter__(self):

Called when the with statement is run, it should return something to work with in the created context.

__exit__(self, e_type, e_val, e_traceback):

Clean-up that needs to happen is implemented here.

The arguments will be the type, value and traceback of any exception raised in the context, or None for all three if no exception was raised.

If the exception will be handled here, return True. If not, return False.

Let’s see this in action to get a sense of what happens.

An Example

Consider this code:

class Context(object):
    """from Doug Hellmann, PyMOTW
    http://pymotw.com/2/contextlib/#module-contextlib
    """
    def __init__(self, handle_error):
        print(u'__init__(%s)' % handle_error)
        self.handle_error = handle_error
    def __enter__(self):
        print(u'__enter__()')
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        print(u'__exit__(%s, %s, %s)' % (exc_type, exc_val, exc_tb))
        return self.handle_error

This class doesn’t do much of anything, but playing with it can help clarify the order in which things happen:

In [46]: with Context(True) as foo:
   ....:     print(u'This is in the context')
   ....:     raise RuntimeError(u'this is the error message')
__init__(True)
__enter__()
This is in the context
__exit__(<class 'RuntimeError'>, this is the error message, <traceback object at 0x1049cca28>)

Because the __exit__ method returns True, the raised error is ‘handled’.

What if we try with False?

In [47]: with Context(False) as foo:
   ....:     print(u'This is in the context')
   ....:     raise RuntimeError(u'this is the error message')
__init__(False)
__enter__()
This is in the context
__exit__(<class 'RuntimeError'>, this is the error message, <traceback object at 0x1049ccb90>)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-47-de2c0c873dfc> in <module>()
      1 with Context(False) as foo:
      2     print(u'This is in the context')
----> 3     raise RuntimeError(u'this is the error message')
      4
RuntimeError: this is the error message

contextlib.contextmanager turns generator functions into context managers

Consider this code:

from contextlib import contextmanager

@contextmanager
def context(boolean):
    print(u"__init__ code here")
    try:
        print(u"__enter__ code goes here")
        yield object()
    except Exception as e:
        print(u"errors handled here")
        if not boolean:
            raise
    finally:
        print(u"__exit__ cleanup goes here")

The code is similar to the class defined previously.

And using it has similar results. We can handle errors:

In [50]: with context(True):
   ....:     print(u"in the context")
   ....:     raise RuntimeError(u"error raised")
__init__ code here
__enter__ code goes here
in the context
errors handled here
__exit__ cleanup goes here

Or, we can allow them to propagate:

In [51]: with context(False):
   ....:     print(u"in the context")
   ....:     raise RuntimeError(u"error raised")
__init__ code here
__enter__ code goes here
in the context
errors handled here
__exit__ cleanup goes here
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-51-641528ffa695> in <module>()
      1 with context(False):
      2     print(u"in the context")
----> 3     raise RuntimeError(u"error raised")
      4
RuntimeError: error raised
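
A slightly more practical sketch of the same pattern: a context manager that times whatever runs inside the with block. The timed_block name and the results dictionary are hypothetical, invented for this illustration:

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_block(label, results):
    """Store the elapsed time of the with block under results[label]."""
    start = time.perf_counter()      # setup, like __enter__
    try:
        yield                        # the body of the with block runs here
    finally:
        # teardown, like __exit__; runs even if the block raised
        results[label] = time.perf_counter() - start


results = {}
with timed_block("sleep", results):
    time.sleep(0.01)

print("sleep took %.4f seconds" % results["sleep"])
```

Because there is no except clause, any exception raised in the block still propagates after the time has been recorded.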

Accounting

Personal Growth Plan

Attendance

All homework due for final grading by Wednesday, October 14

The Future

Book for More In-Depth Introduction:

Learning Python by John Zelle

Intermediate

Newcoder <http://newcoder.io>

Next Year: 401

February 29th

3 code challenge questions, pinned on slack channel.

Main study resources:

Go over class notes again.

Talk about code!

Read up on Django basics.

Data Science

UW CSE 140: Data Programming https://courses.cs.washington.edu/courses/cse140/14wi/

Machine learning example with Theano and MNIST

http://deeplearning.net/software/theano/

http://yann.lecun.com/exdb/mnist/

How to Keep in Touch

Drinks at Bravehorse after class

TA for CodeFellows

LinkedIn (paulpham@yahoo.com)

Hacker Hour meetups on Tuesdays

Readings

Automate the Boring Stuff with Python <http://www.amazon.com/gp/product/1593275994/ref=pd_lpo_sbs_dp_ss_2/192-3873851-7761951?pf_rd_m=ATVPDKIKX0DER&pf_rd_s=lpo-top-stripe-1&pf_rd_r=1PRPRT5BW852MN34KREK&pf_rd_t=201&pf_rd_p=1944687662&pf_rd_i=1598631586>

Iterators, generators, and containers:

A nice post that clearly lays out how all these things fit together:

http://nvie.com/posts/iterators-vs-generators/



Transforming Code into Beautiful, Idiomatic Python:

Raymond Hettinger (again) talks about Pythonic code.

A lot of it is about using iterators – now you know what those really are.

https://www.youtube.com/watch?v=OSGv2VnC0go

The End

Thank you!