15 Functions & Modules
So far, you’ve been writing scripts that execute from top to bottom. That works for short, one-off tasks, but it doesn’t scale. When you find yourself copying the same block of code with minor changes, or when your script grows long enough that you lose track of what each section does, you need functions.
A function is a named, reusable block of code that takes inputs, does some work, and (optionally) returns a result. Functions are how you break a complex task into manageable pieces, name those pieces clearly, and compose them into larger workflows. By the end of this chapter, you’ll also learn to organize functions into modules (Python files that you can import and reuse across projects) and to document your functions with docstrings that make them usable by others (including your future self).
15.1 Defining Functions
You define a function with the def keyword, followed by a name, parameters in parentheses, and a colon. The function body is indented:
first_function.py
def calculate_revenue(unit_price, quantity):
    return unit_price * quantity


revenue = calculate_revenue(18.00, 10)
print(revenue)  # 180.0

The return statement sends a value back to the caller. Without an explicit return, a function returns None:
no_return.py
def greet(name):
    print(f"Hello, {name}!")


result = greet("Alice")
print(result)  # None

This distinction matters. Functions that return a value can be used in expressions, assigned to variables, and composed with other functions. Functions that only print produce output for humans but don’t create values that other code can use. In general, prefer returning values over printing them. Let the caller decide what to do with the result.
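A minimal sketch of why returned values compose. The helpers line_total and apply_tax are hypothetical names used only for illustration, and the 6% rate is an assumed figure. Because each function returns its result, the output of one call can feed straight into the next:

```python
def line_total(unit_price, quantity):
    """Return the price of a line item (hypothetical helper)."""
    return unit_price * quantity


def apply_tax(amount, rate=0.06):
    """Return the amount with tax added; the 6% rate is an assumption."""
    return amount * (1 + rate)


# Both functions return values, so one call can feed the next.
total = apply_tax(line_total(18.00, 10))
print(f"{total:.2f}")  # 190.80
```

Had line_total only printed its result, apply_tax would have received None and failed.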
15.1.1 Functions as Units of Work
A good function does one thing, does it well, and has a name that makes its purpose obvious. Compare these two approaches:
bad_function.py
def process(data, mode):
    """Does too many things depending on the mode."""
    if mode == "clean":
        # 20 lines of cleaning logic
        ...
    elif mode == "analyze":
        # 30 lines of analysis logic
        ...
    elif mode == "report":
        # 25 lines of reporting logic
        ...

good_functions.py
def clean_order_data(raw_orders):
    """Remove invalid entries and standardize field names."""
    ...


def compute_revenue_by_category(orders):
    """Sum revenue grouped by product category."""
    ...


def format_revenue_report(category_totals):
    """Format category totals as a printable table."""
    ...

The second approach has three focused functions, each with a clear name and a single responsibility. They’re easier to test, easier to debug, and easier to reuse.
15.2 Docstrings
A docstring is the first string literal in a function (or module, or class). Python treats it specially: the help() function displays it, editors show it on hover, and documentation tools extract it automatically.
docstring_basic.py
def calculate_revenue(unit_price, quantity):
    """Calculate the total revenue for a line item."""
    return unit_price * quantity

In the REPL, help(calculate_revenue) will display:
output
Help on function calculate_revenue in module __main__:

calculate_revenue(unit_price, quantity)
    Calculate the total revenue for a line item.
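The docstring is not only available through help(). As a small sketch, the same text can be read directly from the function object through its __doc__ attribute, or through inspect.getdoc() from the standard library, which also cleans up the indentation of multi-line docstrings:

```python
import inspect


def calculate_revenue(unit_price, quantity):
    """Calculate the total revenue for a line item."""
    return unit_price * quantity


# The docstring is stored on the function object itself.
print(calculate_revenue.__doc__)

# inspect.getdoc() returns the same text with indentation normalized,
# which matters once a docstring spans several lines.
print(inspect.getdoc(calculate_revenue))
```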
15.2.1 The Google Docstring Convention
For functions with multiple parameters, return values, or potential errors, a one-line docstring isn’t enough. This book uses the Google convention, one of the most widely used formats:
google_docstring.py
def calculate_revenue(
    orders: list[dict],
    category: str | None = None,
) -> float:
    """Calculate total revenue, optionally filtered by category.

    Computes the sum of unit_price * quantity for each order.
    If a category is specified, only orders matching that category
    are included.

    Args:
        orders: List of order dictionaries. Each must have
            'category', 'unit_price', and 'quantity' keys.
        category: If provided, filter to this category only.
            If None, include all orders.

    Returns:
        Total revenue as a float.

    Raises:
        ValueError: If the orders list is empty.
    """
    if not orders:
        raise ValueError("Orders list cannot be empty.")
    if category is not None:
        orders = [o for o in orders if o["category"] == category]
    return sum(o["unit_price"] * o["quantity"] for o in orders)

The Google convention has four sections, each optional:
Args describes each parameter: its name, type (if not already annotated), and what it represents. If a parameter has a default value, document what the default means.
Returns describes what the function gives back. For functions that return None, you can omit this section.
Raises lists the exceptions the function might raise and under what conditions.
The summary line (first line of the docstring) should be a concise, imperative statement of what the function does: “Calculate total revenue,” not “This function calculates total revenue.”
Try writing the docstring before you write the function body. Describing what a function should do, what it takes as input, and what it returns forces you to think about the interface before you worry about the implementation. If the docstring is hard to write, the function might be doing too much.
15.2.2 Module Docstrings
A module docstring is the first string in a .py file. It describes what the file contains and why:
northwind_utils.py
"""Utility functions for Northwind order processing.

This module provides functions for reading, filtering, and
summarizing Northwind product and order data from CSV files.
"""


def calculate_revenue(unit_price, quantity):
    ...

When someone runs help(northwind_utils) after importing your module, they’ll see this docstring along with a list of the module’s functions.
15.3 Arguments in Depth
Python functions support several argument-passing patterns that give you flexibility in how you design interfaces.
15.3.1 Positional and Keyword Arguments
When you call a function, you can pass arguments by position or by name:
argument_styles.py
def create_product(name, price, category):
    return {"name": name, "price": price, "category": category}


# Positional: arguments matched by order
create_product("Chai", 18.00, "Beverages")

# Keyword: arguments matched by name (order doesn't matter)
create_product(category="Beverages", name="Chai", price=18.00)

# Mixed: positional first, then keyword
create_product("Chai", price=18.00, category="Beverages")

Keyword arguments make function calls more readable, especially when a function has many parameters or when the meaning of a positional argument isn’t obvious from context.
15.3.2 Default Values
Parameters can have default values, making them optional:
defaults.py
def format_price(amount, currency="USD", decimals=2):
    symbols = {"USD": "$", "EUR": "€", "GBP": "£"}
    symbol = symbols.get(currency, currency)
    return f"{symbol}{amount:,.{decimals}f}"


format_price(1234.5)                  # "$1,234.50"
format_price(1234.5, currency="EUR")  # "€1,234.50"
format_price(1234.5, decimals=0)      # "$1,234"

Never use a mutable value (like a list or dictionary) as a default argument:
mutable_default_bad.py
def add_item(item, items=[]):  # BUG: the list is shared across calls!
    items.append(item)
    return items


add_item("Chai")   # ["Chai"]
add_item("Chang")  # ["Chai", "Chang"] ← Where did "Chai" come from?

Default values are evaluated once when the function is defined, not each time it’s called. So every call shares the same list object. The fix is to use None as the default and create a new list inside the function:
mutable_default_fixed.py
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

15.3.3 *args and **kwargs
When you don’t know how many arguments a function will receive, use *args for positional arguments and **kwargs for keyword arguments:
args_kwargs.py
def summarize(*values):
    """Summarize any number of numeric values."""
    return {
        "count": len(values),
        "sum": sum(values),
        "mean": sum(values) / len(values) if values else 0,
    }


summarize(1, 2, 3)         # {"count": 3, "sum": 6, "mean": 2.0}
summarize(10, 20, 30, 40)  # {"count": 4, "sum": 100, "mean": 25.0}

*args collects extra positional arguments into a tuple. **kwargs collects extra keyword arguments into a dictionary. You’ll encounter these most often when reading library code rather than writing your own, but knowing how they work helps you understand function signatures in documentation.
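The example above exercises only *args. As a complementary sketch, the hypothetical function below (describe_product is an illustrative name, not part of the Northwind utilities) accepts both forms, so you can see the tuple and the dictionary that Python builds:

```python
def describe_product(name, *tags, **attributes):
    """Build a product record from a name, optional tags, and extra fields.

    A hypothetical helper used only to illustrate *args and **kwargs.
    """
    # tags arrives as a tuple, attributes as a dictionary.
    return {"name": name, "tags": tags, "attributes": attributes}


record = describe_product("Chai", "hot", "tea", price=18.00, category="Beverages")
print(record["tags"])        # ('hot', 'tea')
print(record["attributes"])  # {'price': 18.0, 'category': 'Beverages'}
```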
15.3.4 Keyword-Only Arguments
Placing a bare * in the parameter list forces everything after it to be keyword-only:
keyword_only.py
def query_products(category, *, min_price=0, in_stock_only=False):
    ...


# These work:
query_products("Beverages", min_price=10, in_stock_only=True)

# This fails:
query_products("Beverages", 10, True)  # TypeError

Keyword-only arguments prevent positional confusion. When a function has several options, requiring them to be named makes the call site much more readable.
15.3.5 Exercises
- Write a function compute_line_total(unit_price, quantity, discount=0.0) that returns the line total after discount. Include a Google-style docstring. Test it with compute_line_total(18.00, 10) and compute_line_total(18.00, 10, discount=0.15).
- Predict the output of this code without running it:
solution.py
def add_item(item, items=[]):
    items.append(item)
    return items


print(add_item("Chai"))
print(add_item("Chang"))

Why does the second call return ["Chai", "Chang"] instead of ["Chang"]? Write the corrected version.
- Write a function format_product(name, price, *, currency="$", decimals=2) where currency and decimals are keyword-only arguments. Call it three different ways: with defaults, with a custom currency, and with zero decimal places.
- What happens if you call create_product("Chai", 18.00, "Beverages") vs. create_product(price=18.00, name="Chai", category="Beverages") for a function with positional parameters (name, price, category)? Are the results the same?
1. Show the code:
solution.py
def compute_line_total(unit_price: float, quantity: int, discount: float = 0.0) -> float:
    """Compute the total for an order line item after discount.

    Args:
        unit_price: Price per unit in dollars.
        quantity: Number of units ordered.
        discount: Discount as a decimal (0.15 = 15%). Defaults to 0.0.

    Returns:
        Total price after discount.
    """
    return unit_price * quantity * (1 - discount)


compute_line_total(18.00, 10)                 # 180.0
compute_line_total(18.00, 10, discount=0.15)  # 153.0

2. The second call returns ["Chai", "Chang"] because the default list [] is created once at function definition time and shared across all calls. Fix:
solution.py
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

3. Show the code:
solution.py
format_product("Chai", 18.00)                            # "$18.00"
format_product("Chai", 18.00, currency="€")              # "€18.00"
format_product("Chai", 18.00, currency="$", decimals=0)  # "$18"

(The function body would be return f"{currency}{price:,.{decimals}f}")
4. Yes, both produce the same result. Positional arguments are matched by order, keyword arguments by name. The second form is more readable when the meaning of arguments isn’t obvious from context.
15.4 Scope and Namespaces
When you create a variable inside a function, it exists only within that function. This is called local scope:
scope.py
def compute_tax(amount):
    rate = 0.06  # Local variable
    return amount * rate


compute_tax(100)  # 6.0
print(rate)       # NameError: name 'rate' is not defined

The variable rate is created inside compute_tax and destroyed when the function returns. Code outside the function can’t see it. This is a feature, not a limitation: it means functions don’t accidentally interfere with each other’s variables.
15.4.1 The LEGB Rule
When Python encounters a name, it looks for it in four scopes, in this order:
- Local: variables defined in the current function.
- Enclosing: variables in any enclosing function (relevant for nested functions).
- Global: variables defined at the module level (the top of your .py file).
- Built-in: Python’s built-in names like print, len, range.
Python uses the first match it finds. If the name doesn’t exist in any scope, you get a NameError.
legb.py
tax_rate = 0.06  # Global scope


def compute_total(subtotal):
    # 'subtotal' is in Local scope
    # 'tax_rate' is found in Global scope
    return subtotal * (1 + tax_rate)


print(compute_total(100))  # 106.0

global
Python has a global keyword that lets a function modify a global variable. Don’t use it. It makes your code harder to reason about because any function could change any variable at any time. Instead, pass values into functions as arguments and get values out via return.
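The legb.py example only exercises the Local and Global scopes. As a rough sketch of all four at once, the hypothetical order_total function below (the name and the shipping parameter are assumptions made for illustration) nests one function inside another, and it takes its inputs as arguments rather than modifying anything global:

```python
tax_rate = 0.06  # Global scope (module level)


def order_total(subtotal, shipping):
    def add_charges(amount):
        # amount   -> Local scope of add_charges
        # shipping -> Enclosing scope (a parameter of order_total)
        # tax_rate -> Global scope
        # round    -> Built-in scope
        return round(amount * (1 + tax_rate) + shipping, 2)

    return add_charges(subtotal)


print(order_total(100, 5.00))  # 111.0
```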
15.4.2 Variable Shadowing
A local variable can shadow a global one with the same name:
shadowing.py
name = "Global Name"


def greet():
    name = "Local Name"  # This creates a NEW local variable
    print(name)          # "Local Name"


greet()
print(name)  # "Global Name" ← unchanged

The function’s name is a completely separate variable from the global name. Shadowing is a common source of subtle bugs, especially when you accidentally reuse a name like list, sum, or input that shadows a built-in:
shadowing_builtin.py
# Don't do this!
list = [1, 2, 3]           # Shadows the built-in list() function
new_list = list(range(5))  # TypeError: 'list' object is not callable

15.4.3 Exercises
- Predict the output of this code:
solution.py
x = 10


def outer():
    x = 20

    def inner():
        print(x)

    inner()


outer()
print(x)

- Why does this code raise an error?
solution.py
count = 0


def increment():
    count = count + 1
    return count


increment()

How would you fix it without using the global keyword? (Hint: pass count as an argument and return the new value.)
- A colleague writes list = [1, 2, 3] at the top of their script. Later, list(range(5)) fails with a TypeError. Explain what went wrong and how to fix it.
1. The inner() function prints 20 (from the enclosing scope). The final print(x) prints 10 (from the global scope). The x = 20 in outer() creates a local variable that shadows the global one.
2. Python sees count = count + 1 and treats count as a local variable (because it’s being assigned to). But count + 1 tries to read the local count before it’s been assigned, causing UnboundLocalError. Fix:
solution.py
def increment(count):
    return count + 1


count = 0
count = increment(count)  # count is now 1

3. Assigning list = [1, 2, 3] shadows the built-in list() function. Now list refers to the [1, 2, 3] object, not the constructor. Fix by choosing a different variable name like numbers = [1, 2, 3] and deleting (or not using) the shadowed name.
15.5 Functions as Objects
In Python, functions aren’t special. They’re objects, just like integers, strings, and lists. You can assign them to variables, pass them as arguments, and store them in collections:
function_objects.py
def double(x):
    return x * 2


def triple(x):
    return x * 3


# Assign a function to a variable
transform = double
print(transform(5))  # 10

# Put functions in a list
operations = [double, triple]
for op in operations:
    print(op(10))  # 20, then 30

15.5.1 Passing Functions as Arguments
The most practical use of functions-as-objects is passing one function to another. You’ve already seen this with sorted():
sorted_key.py
products = [
    {"name": "Tofu", "price": 23.25},
    {"name": "Chai", "price": 18.00},
    {"name": "Chang", "price": 19.00},
]

# Sort by price using a key function
by_price = sorted(products, key=lambda p: p["price"])

The key argument tells sorted() how to extract a comparison value from each element. The lambda creates a small anonymous function inline.
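The key does not have to be a lambda. As a small sketch, the hypothetical helper get_price below (an illustrative name) is an ordinary named function passed the same way, and the built-in max() accepts a key function just as sorted() does:

```python
products = [
    {"name": "Tofu", "price": 23.25},
    {"name": "Chai", "price": 18.00},
    {"name": "Chang", "price": 19.00},
]


def get_price(product):
    """Return the value that should be compared (hypothetical helper)."""
    return product["price"]


# A named function works as a key exactly like the lambda did.
by_price = sorted(products, key=get_price)
print([p["name"] for p in by_price])  # ['Chai', 'Chang', 'Tofu']

# max() takes a key function too.
priciest = max(products, key=get_price)
print(priciest["name"])  # Tofu
```

Note that get_price is passed without parentheses: you hand over the function object itself, and sorted() calls it once per element.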
15.5.2 Lambda Expressions
A lambda is a one-expression function without a name:
lambda.py
square = lambda x: x ** 2
square(5)  # 25


# Equivalent to:
def square(x):
    return x ** 2

Lambdas are useful as arguments to functions like sorted(), map(), and filter(). They’re not useful for anything complex, because they’re limited to a single expression.
map() and filter()
Python provides map() and filter() for transforming and filtering sequences:
map_filter.py
prices = [18.00, 19.00, 10.00, 23.25]

# map() applies a function to each element
with_tax = list(map(lambda p: p * 1.06, prices))

# filter() keeps elements where the function returns True
expensive = list(filter(lambda p: p > 15, prices))

Comprehensions are almost always clearer:
comprehension_equivalent.py
with_tax = [p * 1.06 for p in prices]
expensive = [p for p in prices if p > 15]

Prefer comprehensions. Use map() and filter() only when you have an existing named function to pass and the comprehension would add no clarity.
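One case where map() holds its own is converting strings to numbers, because float already exists as a named callable. This is only an illustrative sketch with made-up input values:

```python
raw_prices = ["18.00", "19.00", "10.00", "23.25"]

# float is an existing named callable, so no lambda is needed.
prices = list(map(float, raw_prices))
print(prices)  # [18.0, 19.0, 10.0, 23.25]

# The comprehension produces the same list.
prices_again = [float(p) for p in raw_prices]
print(prices == prices_again)  # True
```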
15.6 Generators
What happens when you need to process a million records, but you don’t have enough memory to hold them all in a list at once? Generators solve this problem by producing values one at a time, on demand, instead of computing everything upfront.
15.6.1 The Memory Problem
Consider reading a large file and processing each line:
memory_problem.py
# This loads the ENTIRE file into memory as a list
lines = open("huge_file.csv").readlines()

# If the file is 10 GB, you just used 10 GB of RAM
for line in lines:
    process(line)

If the file is large enough, this approach can exhaust your memory. A generator processes one line at a time, never holding the whole file in memory.
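Python's file objects already behave this way when you loop over them directly. The sketch below writes a tiny stand-in file to a temporary directory (the file contents are invented so the example runs anywhere) and then counts its lines without ever calling readlines():

```python
import tempfile
from pathlib import Path

# Create a small stand-in for huge_file.csv; the data is made up.
path = Path(tempfile.mkdtemp()) / "huge_file.csv"
path.write_text("id,price\n1,18.00\n2,19.00\n3,23.25\n")

line_count = 0
with open(path) as f:
    # Iterating over the file object reads one line at a time.
    for line in f:
        line_count += 1

print(line_count)  # 4
```

The with statement also closes the file when the block ends, which the readlines() version never does explicitly.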
15.6.2 Generator Functions with yield
A generator function uses yield instead of return. Each time it yields a value, it pauses and gives the value to the caller. When the caller asks for the next value, the function resumes exactly where it left off:
generator.py
def count_up_to(n):
    """Generate integers from 1 to n."""
    i = 1
    while i <= n:
        yield i
        i += 1


# The function doesn't execute yet. It returns a generator object.
counter = count_up_to(5)

# Each call to next() resumes the function and gets the next value
print(next(counter))  # 1
print(next(counter))  # 2
print(next(counter))  # 3

# Or use it in a for loop (the normal way)
for num in count_up_to(5):
    print(num)

The key insight is that yield pauses the function, and next() (or a for loop) resumes it. The function maintains its state between yields: all its local variables are preserved.
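To make the preserved state visible, here is a sketch of a generator that carries a running total from one yield to the next. The name running_total is a hypothetical helper chosen for illustration. The last line shows what happens once a generator has nothing left to yield:

```python
def running_total(amounts):
    """Yield the cumulative sum after each amount (illustrative helper)."""
    total = 0  # Survives between yields
    for amount in amounts:
        total += amount
        yield total


totals = running_total([10, 20, 30])
print(next(totals))  # 10
print(next(totals))  # 30
print(next(totals))  # 60

# An exhausted generator raises StopIteration on next();
# passing a default value returns that instead.
print(next(totals, None))  # None
```

A for loop handles StopIteration for you, which is one reason it is the normal way to consume a generator.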
15.6.3 Generator Expressions
Just as list comprehensions create lists, generator expressions create generators. The only difference is parentheses instead of brackets:
genexpr.py
# List comprehension: creates the entire list in memory
squares_list = [n ** 2 for n in range(1_000_000)]

# Generator expression: creates values one at a time
squares_gen = (n ** 2 for n in range(1_000_000))

The list uses memory proportional to a million elements. The generator uses almost no memory, because it computes each square only when asked.
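You can check the difference yourself with sys.getsizeof(). This is a rough sketch: the exact byte counts depend on your Python version and platform, so the code prints them rather than assuming specific numbers.

```python
import sys

squares_list = [n ** 2 for n in range(1_000_000)]
squares_gen = (n ** 2 for n in range(1_000_000))

# getsizeof() reports the container's own size, not the integers it
# refers to, so the list's real footprint is larger than this figure.
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)

print(f"list:      {list_bytes:,} bytes")
print(f"generator: {gen_bytes:,} bytes")
print(list_bytes > gen_bytes)  # True
```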
Generator expressions work anywhere an iterable is expected:
genexpr_sum.py
# Sum of squares without building a list
total = sum(n ** 2 for n in range(1_000_000))

15.6.4 A Practical Generator
Here’s a generator that reads a CSV file line by line, yielding each row as a dictionary. It processes the file lazily, never loading more than one row into memory:
csv_generator.py
def read_csv_rows(filepath):
    """Yield each row of a CSV file as a dictionary.

    Args:
        filepath: Path to the CSV file.

    Yields:
        A dictionary mapping column names to values for each row.
    """
    with open(filepath) as f:
        headers = f.readline().strip().split(",")
        for line in f:
            values = line.strip().split(",")
            yield dict(zip(headers, values))


# Process order details without loading the entire file
for row in read_csv_rows("order_details.csv"):
    if float(row["unitPrice"]) > 20:
        print(row["productID"])

You’ll revisit file processing in much more detail in Chapter 16. For now, the important concept is that generators let you work with data that’s larger than your available memory, by producing values one at a time.
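Splitting on commas breaks as soon as a field itself contains a comma inside quotes. As a hedged variant rather than a replacement for the version above, the sketch below keeps the generator shape but delegates parsing to csv.DictReader from the standard library. The sample file and its productName column are assumptions invented so the example runs on its own:

```python
import csv
import tempfile
from pathlib import Path


def read_csv_rows(filepath):
    """Yield each row of a CSV file as a dictionary, using csv.DictReader."""
    # The csv documentation recommends opening files with newline="".
    with open(filepath, newline="") as f:
        # DictReader is lazy too: it parses one row per iteration.
        for row in csv.DictReader(f):
            yield row


# Stand-in data; the quoted field contains a comma on purpose.
path = Path(tempfile.mkdtemp()) / "order_details.csv"
path.write_text(
    'productID,productName,unitPrice\n'
    '1,"Tofu, firm",23.25\n'
    '2,Chai,18.00\n'
)

expensive = [
    row["productID"]
    for row in read_csv_rows(path)
    if float(row["unitPrice"]) > 20
]
print(expensive)  # ['1']
```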
15.6.5 Exercises
- Write a generator expression that yields the cube of each number from 1 to 100. Use sum() to compute the total without creating a list. Then write the equivalent list comprehension and compare what type() returns for each.
- Predict what this code prints:
example.py
def countdown(n):
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)
print(next(gen))
print(next(gen))
print(list(gen))

- Explain why sum(n ** 2 for n in range(1_000_000)) uses almost no memory, while sum([n ** 2 for n in range(1_000_000)]) uses significant memory. Which should you prefer?
1. Show the code:
solution.py
total = sum(n ** 3 for n in range(1, 101))         # Generator expression
total_list = sum([n ** 3 for n in range(1, 101)])  # List comprehension
type(n ** 3 for n in range(1, 101))                # <class 'generator'>
type([n ** 3 for n in range(1, 101)])              # <class 'list'>

Both produce the same sum (25502500), but the generator never creates the full list in memory.
2. next(gen) → 3, next(gen) → 2, list(gen) → [1]. The generator remembers its state between next() calls. By the time list() is called, it has already yielded 3 and 2, so only 1 remains.
3. The generator expression (n ** 2 for ...) computes values one at a time and feeds them to sum(), never storing more than one value in memory. The list comprehension [n ** 2 for ...] creates a list of 1 million integers in memory first, then sums it. Prefer the generator expression when you only need to iterate once.
15.7 Modules and Imports
A module is a .py file that contains functions, variables, and other code that you can import and reuse. Every Python file is a module. When you write import math, you’re importing Python’s built-in math module.
15.7.1 Import Syntax
Python provides several ways to import:
imports.py
# Import the entire module
import math
print(math.sqrt(16))  # 4.0

# Import specific items
from math import sqrt, pi
print(sqrt(16))  # 4.0
print(pi)        # 3.141592653589793

# Import with an alias
import statistics as stats
print(stats.mean([1, 2, 3]))  # 2

The import module form is usually preferred because it makes clear where each function comes from. When you see math.sqrt(), you know it’s from the math module. When you see just sqrt(), you have to remember which import brought it in.
15.7.2 What Happens When You Import
When Python encounters import mymodule, it:
- Finds the file mymodule.py (or looks in the standard library and installed packages).
- Executes the entire file from top to bottom.
- Creates a module object containing everything the file defined.
- Binds that object to the name mymodule in your current scope.
The second step is important: all code at the module level runs when the module is first imported. This means if your module has a print("Hello!") at the top level, it prints whenever a program imports the module for the first time. Later imports in the same program reuse the already-loaded module and do not run the file again.
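One detail about the execution step is worth seeing in code: Python keeps every loaded module in the sys.modules dictionary, so the file is executed only on the first import within a running program. A minimal sketch using the standard library's math module:

```python
import sys
import math

# After the first import, the module object is cached in sys.modules.
print("math" in sys.modules)  # True

# A second import does not re-run the module; it binds the same
# cached object to the new name.
import math as math_again

print(math_again is math)           # True
print(sys.modules["math"] is math)  # True
```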
15.7.3 The if __name__ == "__main__" Pattern
To prevent top-level code from running on import, wrap it in this guard:
my_module.py
"""Utility functions for temperature conversion."""


def fahrenheit_to_celsius(f):
    """Convert Fahrenheit to Celsius."""
    return (f - 32) * 5 / 9


def celsius_to_fahrenheit(c):
    """Convert Celsius to Fahrenheit."""
    return c * 9 / 5 + 32


# This only runs when the file is executed directly,
# not when it's imported by another file.
if __name__ == "__main__":
    temp_f = 72
    temp_c = fahrenheit_to_celsius(temp_f)
    print(f"{temp_f}°F = {temp_c:.1f}°C")

When you run uv run my_module.py, Python sets __name__ to "__main__", so the code under the guard executes. When another file runs import my_module, __name__ is set to "my_module", and the guard block is skipped. This pattern lets a file serve double duty: as both an importable module and a standalone script.
15.7.4 Building Your Own Modules
Organizing code into modules is one of the most important habits in professional Python development. Consider this structure:
output
northwind-analysis/
├── pyproject.toml
├── northwind_utils.py # Your reusable functions
└── analyze_revenue.py # A script that uses them
northwind_utils.py
"""Utility functions for Northwind order processing.

This module provides functions for filtering, aggregating, and
formatting Northwind product and order data.
"""


def filter_by_category(orders, category):
    """Filter a list of orders to a specific category.

    Args:
        orders: List of order dictionaries with a 'category' key.
        category: The category name to filter by.

    Returns:
        A new list containing only orders in the specified category.
    """
    return [o for o in orders if o["category"] == category]


def compute_revenue(orders):
    """Compute total revenue from a list of orders.

    Each order must have 'unit_price' and 'quantity' keys.

    Args:
        orders: List of order dictionaries.

    Returns:
        Total revenue as a float.
    """
    return sum(o["unit_price"] * o["quantity"] for o in orders)


def format_currency(amount, symbol="$"):
    """Format a number as currency.

    Args:
        amount: The numeric value to format.
        symbol: The currency symbol to prepend. Defaults to '$'.

    Returns:
        A formatted string like '$1,234.56'.
    """
    return f"{symbol}{amount:,.2f}"

analyze_revenue.py
"""Analyze Northwind revenue by category."""

from northwind_utils import compute_revenue, filter_by_category, format_currency

# Sample data (in the next chapter, we'll read this from files)
orders = [
    {"product": "Chai", "category": "Beverages", "unit_price": 18.00, "quantity": 10},
    {"product": "Chang", "category": "Beverages", "unit_price": 19.00, "quantity": 5},
    {"product": "Tofu", "category": "Produce", "unit_price": 23.25, "quantity": 15},
]

beverages = filter_by_category(orders, "Beverages")
revenue = compute_revenue(beverages)
print(f"Beverages revenue: {format_currency(revenue)}")

The functions in northwind_utils.py are reusable. You can import them in any script, a different analysis, a report generator, or a test suite. By giving each function a clear name, a docstring, and a focused purpose, you’ve created a building block that others can use without reading the implementation.
15.7.5 Standard Library Highlights
Python ships with a large standard library. Here are a few modules you’ll use frequently:
| Module | Purpose | Example |
|---|---|---|
| math | Mathematical functions | math.sqrt(16), math.log(100) |
| random | Random number generation | random.choice([1, 2, 3]) |
| statistics | Basic statistics | statistics.mean([1, 2, 3]) |
| datetime | Date and time handling | datetime.date.today() |
| pathlib | File system paths | Path("data/orders.csv") |
| csv | CSV file reading/writing | csv.DictReader(file) |
| json | JSON reading/writing | json.load(file) |
You’ll use pathlib, csv, and json extensively in the next chapter.
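As a quick taste before then, the sketch below calls a few of these modules with made-up values. Nothing here touches a real file: Path only builds a path object, and json.dumps() produces a string in memory.

```python
import datetime
import json
import math
import statistics
from pathlib import Path

print(math.sqrt(16))                         # 4.0
print(statistics.mean([18.0, 19.0, 23.0]))   # 20.0

# Path objects know about their own parts without reading the disk.
orders_path = Path("data/orders.csv")
print(orders_path.suffix)  # .csv
print(orders_path.name)    # orders.csv

# json.dumps() serializes a dictionary to a JSON string.
print(json.dumps({"name": "Chai", "price": 18.0}))

# A fixed date keeps the output reproducible.
print(datetime.date(2024, 1, 15).isoformat())  # 2024-01-15
```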
15.8 Putting It Together: A Northwind Utility Module
Let’s build a more complete version of the utility module, bringing together functions, docstrings, comprehensions, and generators:
northwind_utils.py
"""Utility functions for Northwind order processing.

Provides functions for filtering, aggregating, and formatting
Northwind order data. Designed to be imported by analysis scripts.
"""


def filter_active_products(products):
    """Return only products that are not discontinued.

    Args:
        products: List of product dictionaries with a
            'discontinued' key.

    Returns:
        A new list containing only active (non-discontinued) products.
    """
    return [p for p in products if not p["discontinued"]]


def group_by_category(orders):
    """Group orders by their product category.

    Args:
        orders: List of order dictionaries with a 'category' key.

    Returns:
        A dictionary mapping category names to lists of orders.
    """
    groups = {}
    for order in orders:
        category = order["category"]
        if category not in groups:
            groups[category] = []
        groups[category].append(order)
    return groups


def compute_revenue(orders):
    """Compute total revenue from a list of orders.

    Args:
        orders: List of order dictionaries, each with
            'unit_price' and 'quantity' keys.

    Returns:
        Total revenue as a float.
    """
    return sum(o["unit_price"] * o["quantity"] for o in orders)


def revenue_by_category(orders):
    """Compute revenue for each product category.

    Args:
        orders: List of order dictionaries with 'category',
            'unit_price', and 'quantity' keys.

    Returns:
        A dictionary mapping category names to total revenue.
    """
    groups = group_by_category(orders)
    return {
        category: compute_revenue(category_orders)
        for category, category_orders in groups.items()
    }


def format_report(category_totals):
    """Format category revenue totals as a printable report.

    Args:
        category_totals: Dictionary mapping category names
            to revenue amounts.

    Returns:
        A formatted string with aligned columns.
    """
    sorted_items = sorted(
        category_totals.items(),
        key=lambda pair: pair[1],
        reverse=True,
    )
    lines = [f"{'Category':<20} {'Revenue':>12}"]
    lines.append(f"{'-' * 20} {'-' * 12}")
    for category, revenue in sorted_items:
        lines.append(f"{category:<20} ${revenue:>11,.2f}")
    total = sum(category_totals.values())
    lines.append(f"{'-' * 20} {'-' * 12}")
    lines.append(f"{'TOTAL':<20} ${total:>11,.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample_orders = [
        {"product": "Chai", "category": "Beverages", "unit_price": 18.00, "quantity": 10},
        {"product": "Chang", "category": "Beverages", "unit_price": 19.00, "quantity": 5},
        {"product": "Aniseed Syrup", "category": "Condiments", "unit_price": 10.00, "quantity": 20},
        {"product": "Tofu", "category": "Produce", "unit_price": 23.25, "quantity": 15},
    ]
    totals = revenue_by_category(sample_orders)
    print(format_report(totals))

Running this directly (uv run northwind_utils.py) prints the report. Importing it (from northwind_utils import revenue_by_category) gives you access to the individual functions without running the demo.
Exercises
Function Design
- Write a function classify_stock_level that takes units_in_stock and reorder_level as arguments and returns a stock classification string. Use these thresholds: "Critical" if stock is zero, "Low" if stock is at or below the reorder level, "Adequate" if stock is above the reorder level but below three times the reorder level, and "Overstocked" if stock is at or above three times the reorder level. Include a Google-style docstring.
- Write a function format_product_line that takes a product name, price, and quantity and returns a formatted string like "Chai ............. $18.00 x 10". Use f-string alignment to make the output neat.
- Refactor the order processing script from Chapter 13 into functions. Each logical step (filtering, grouping, computing totals, formatting output) should be its own function.
Generator Practice
- Write a generator function fibonacci() that yields Fibonacci numbers indefinitely. Use it with a for loop and break to print the first 20 terms.
- Write a generator expression that yields the square root of each number from 1 to 1,000,000. Use sum() to add them up without creating a list.
- Compare memory usage: create a list comprehension [n ** 2 for n in range(10_000_000)] and a generator expression (n ** 2 for n in range(10_000_000)). Which one can you create without running out of memory? (Hint: use import sys; sys.getsizeof() on each.)
Module Building
- Create a file called conversions.py that contains functions for common engineering unit conversions (PSI to kPa, Fahrenheit to Celsius, miles to kilometers, etc.). Include a module docstring and Google-style docstrings for each function. Write a separate script that imports and uses these functions.
- Add to your northwind_utils.py: a function that takes a list of products and returns the top N products by stock value (price × quantity). Include a default of n=5.
Summary
Functions are the primary organizational tool in Python. A well-designed function has a clear name, a focused purpose, documented parameters and return values, and no side effects beyond what its docstring describes. Docstrings, particularly the Google convention, make functions self-documenting: anyone can call help() on your function and understand how to use it.
Generators extend the idea of functions with yield, producing values lazily instead of computing them all at once. They’re essential for processing data that doesn’t fit in memory, a common situation in real-world engineering data work.
Modules turn .py files into importable, reusable libraries. The if __name__ == "__main__" pattern lets a file work both as a standalone script and as an importable module. These organizational techniques (functions, generators, and modules) are the foundation you’ll build on for every remaining chapter.
Glossary
- argument
- A value passed to a function when it’s called. Distinguished from a parameter, which is the variable name in the function definition.
- closure
- A function that captures variables from its enclosing scope. Created when a nested function references a variable from the outer function.
- default value
- A value assigned to a parameter in the function definition, used when the caller doesn’t provide that argument.
- docstring
- The first string literal in a function, class, or module. Used by help(), editors, and documentation tools to describe the item’s purpose and usage.
- generator
- A function that uses yield instead of return, producing values one at a time on demand. Generator expressions use parentheses: (expr for x in iterable).
- keyword argument
- An argument passed by name: f(x=10). Can be in any order.
- lambda
- A small anonymous function defined with the lambda keyword. Limited to a single expression.
- LEGB rule
- The order Python searches for names: Local, Enclosing, Global, Built-in.
- module
- A .py file that can be imported by other Python code. Contains functions, variables, classes, and other definitions.
- namespace
- A mapping from names to objects. Each scope (local, global, built-in) has its own namespace.
- parameter
- A variable name in a function definition that receives a value when the function is called.
- positional argument
- An argument matched to a parameter by position: the first argument goes to the first parameter, and so on.
- scope
- The region of code where a variable is accessible. Python has local, enclosing, global, and built-in scopes.
- shadowing
- When a local variable has the same name as a variable in an outer scope, hiding it within the local scope.
- yield
- A keyword that pauses a generator function and produces a value. The function resumes when the next value is requested.