This is the second post documenting my progress through Data Science from Scratch (by Joel Grus). Chapter 2 of the book provides a quick "crash course" in Python.
As a relative newcomer to Python (from R), my goals are twofold. First, to go through this book and, as a byproduct, learn Python. Second, to look out for and highlight the areas where the Pythonic way of doing things is necessary to accomplish something in the data science process.
I'll be on the lookout for specific features of the Python language needed to carry out some task in cleaning or pre-processing data, preparing the data for modeling, exploratory data analysis, or the mechanics of training, validating and testing models.
In his coverage of functions, Grus emphasizes that functions in Python are first-class: they can be assigned to variables and passed as arguments to other functions. I'll be drawing from examples in the book and may supplement with external sources to examine the same concept from another angle.
The example below shows a function being passed as an argument. A function double is created. A function apply_to_one is created. The name my_double is pointed at the double function. We pass my_double into the apply_to_one function and assign the result to x.
Whatever function is passed to apply_to_one, it is called with 1 as its argument. So passing my_double means we are doubling 1, so x is 2.
But the important thing is that a function got passed to another function. A function that takes another function as an argument, like apply_to_one, is called a higher-order function.
def double(x):
    """
    this function doubles and returns the argument
    """
    return x * 2

def apply_to_one(f):
    """Calls the function f with 1 as its argument"""
    return f(1)

my_double = double

# x is 2 here
x = apply_to_one(my_double)
Here's an extension of the above example. We create an apply_five_to function that calls whatever function it is given with the integer 5 as its argument and returns the result.
# extending the above example
def apply_five_to(e):
    """Calls the function e with 5 as its argument"""
    return e(5)

# doubling 5 is 10
w = apply_five_to(my_double)
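To look at the same idea from another angle, here is a minimal sketch that generalizes both helpers. The name apply_to is hypothetical (it is not from the book); it takes the function and the value as separate arguments, so one helper covers both cases.

```python
def double(x):
    """Return the argument multiplied by 2."""
    return x * 2


def apply_to(f, value):
    """Hypothetical helper (not from the book): call f with value as its argument."""
    return f(value)


# The same higher-order function works with any one-argument function,
# including built-ins such as abs and str.
results = [apply_to(double, 5), apply_to(abs, -3), apply_to(str, 1)]
print(results)  # [10, 3, '1']
```

Nothing about apply_to is specific to double; any callable that accepts one argument can be passed in.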
Since functions are going to be used extensively, here's another, more complicated example. I found this one on Trey Hunner's site. Two functions are defined: square and cube. Both functions are saved to a list called operations. Another list, numbers, is created.
Finally, a for-loop iterates through numbers, and the enumerate function gives access to both the index and the item on each pass. The index modulo 2 decides whether the action is square or cube (operations[0] is square, operations[1] is cube), and the action is then called with the item from the numbers list as its argument.
# create two functions
def square(n): return n**2
def cube(n): return n**3

# store those functions inside a list, operations, to reference later
operations = [square, cube]

# create a list of numbers
numbers = [2,1,3,4,7,11,18,29]

# loop through the numbers list
# using enumerate to get both the index and the item
# i % 2 is either 0 or 1, so operations[i % 2] alternates between square and cube
# the function that gets picked is pointed at action
# the dunder attribute __name__ retrieves the name of the function, either square or cube
# print __name__ along with the item from the numbers list and the result of calling action on it
for i, n in enumerate(numbers):
    action = operations[i % 2]
    print(f"{action.__name__}({n}):", action(n))

# the same for-loop with more descriptive (but more verbose) variable names
for index, num in enumerate(numbers):
    action = operations[index % 2]
    print(f"{action.__name__}({num}):", action(num))
The for-loop prints out the following:
square(2): 4
cube(1): 1
square(3): 9
cube(4): 64
square(7): 49
cube(11): 1331
square(18): 324
cube(29): 24389
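Because functions are ordinary objects, they can be stored in other containers too. The sketch below is a variation on the example above (it is not taken from Trey Hunner's article): it keeps the two functions in a dictionary, here given the made-up name operations_by_name, so a function can be looked up by a string label rather than by list position.

```python
def square(n):
    return n ** 2


def cube(n):
    return n ** 3


# Map a label to each function; the functions are the dictionary's values.
# operations_by_name is a name invented for this sketch.
operations_by_name = {"square": square, "cube": cube}

# Look up a function by its label, then call it like any other function.
for name in ["square", "cube"]:
    action = operations_by_name[name]
    print(f"{name}(3):", action(3))
```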
A special case of functions being passed as arguments to other functions is the Python anonymous function, lambda. Instead of being defined ahead of time with def, a lambda is defined inline, right where it is passed to another function. Here’s an illustration:
# we'll reuse apply_five_to, which takes in a function and provides '5' as the argument
def apply_five_to(e):
    """Calls the function e with 5 as its argument"""
    return e(5)

# this lambda function adds '4' to any argument
# when passing this lambda function to apply_five_to
# you get y = 5 + 4 = 9
y = apply_five_to(lambda x: x + 4)

# we can also change what the lambda function does without defining a separate function
# here the lambda function multiplies the argument by 4
# y = 20
y = apply_five_to(lambda x: x * 4)
While lambda functions are convenient and succinct, the common advice is to define a function with def instead, at least whenever you would otherwise assign the lambda to a name (PEP 8 makes the same recommendation).
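As a small illustration of that trade-off, the sketch below sorts a list of (name, score) pairs by score, once with an inline lambda and once with a named function. The data and the get_score name are made up for this example.

```python
# Made-up data for illustration: (name, score) pairs.
pairs = [("a", 3), ("b", 1), ("c", 2)]

# Inline lambda passed directly as the key function.
by_score_lambda = sorted(pairs, key=lambda pair: pair[1])


def get_score(pair):
    """Hypothetical named equivalent of the lambda above."""
    return pair[1]


# Passing the named function gives the same ordering.
by_score_def = sorted(pairs, key=get_score)

print(by_score_lambda)  # [('b', 1), ('c', 2), ('a', 3)]
print(by_score_lambda == by_score_def)  # True
```

The lambda is shorter for a one-off key, while the def version has a name and a docstring, which helps once the logic is reused or grows.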
Here's an external example of lambda functions from Trey Hunner. In this example, a lambda function is passed to the built-in filter function, which takes two arguments: a function and an iterable.
# calling help(filter) displays an explanation
class filter(object)
| filter(function or None, iterable) --> filter object
# create a list of numbers
numbers = [2,1,3,4,7,11,18,29]

# the lambda function returns True if n is an even number
# filter keeps only the items of numbers for which the lambda returns True
# wrapped in a list, this returns [2, 4, 18]
list(filter(lambda n: n % 2 == 0, numbers))
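For comparison, the same filtering can be written as a list comprehension, which avoids both filter and the lambda. This is a sketch of an equivalent form, not part of Trey Hunner's example; the variable names evens_filter and evens_comprehension are my own.

```python
numbers = [2, 1, 3, 4, 7, 11, 18, 29]

# filter + lambda, as above
evens_filter = list(filter(lambda n: n % 2 == 0, numbers))

# list comprehension: keep n only when the condition is true
evens_comprehension = [n for n in numbers if n % 2 == 0]

print(evens_comprehension)  # [2, 4, 18]
print(evens_filter == evens_comprehension)  # True
```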
There are whole books, or at least whole chapters, that can be written about Python functions, but we’ll limit our discussion for now to the idea that functions can be passed as arguments to other functions. I’ll report back on this section as we progress through the book.