24 Computational Documents
You’ve built your analytical pipeline: SQL queries, Polars transformations, Altair visualizations, Excel exports. Now comes the final step, and it’s the one that matters most to the people who use your work: communication. You learned Quarto as a writing tool in Chapter 3. You learned about Marimo notebooks as interactive exploration environments. Now you’ll make your documents come alive with embedded computation. Instead of copying results from a Python script into your report, your report is the script. Instead of running a notebook for yourself and then writing a separate document, your document executes live code during rendering or during interactive use.
In this chapter, you’ll master two complementary approaches to computational documents. Quarto documents execute code during rendering, producing polished, reproducible reports that can be shared as standalone HTML, PDF, or Word files. Marimo notebooks execute code during exploration, providing a reactive environment where you and your stakeholders can adjust parameters and see results update instantly. Both tools eliminate the gap between analysis and communication, turning your technical work into something others can understand and act on.
24.1 Part 1: Quarto Code Cells
You already know that Quarto documents are Markdown files with a YAML header and optional code blocks. What you’ll now focus on is the computational aspect: how code blocks work, how they integrate with your SQL and Python workflow, and how to control what appears in your rendered output.
24.1.1 The Anatomy of a Code Cell
A code cell in Quarto is a Markdown code block with an executable language specified:
document.qmd
```{python}
import duckdb
conn = duckdb.connect("data/northwind.duckdb", read_only=True)
```
When you render this document, Quarto executes the Python code in your project’s virtual environment, captures any output, and includes it in the rendered document. This is different from a static code block (which Quarto treats as syntax-highlighted text): an executable code block actually runs.
To make a code block executable, wrap the language name in curly braces (`{python}`). Quarto supports Python, SQL, R, and others. For non-executable blocks (when you want to show code without running it), omit the curly braces or put a dot before the language name (`{.python}`):
document.qmd
```python
# This is a non-executable code block (shown as plain text)
print("Hello, world")
```
```{python}
# This is executable (Python runs it)
print("Hello, world")
```
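The distinction comes down to the fence's info string. As a rough illustration only (this is a simplified model written for this chapter, not Quarto's actual parser, and `is_executable_fence` is a hypothetical helper), the rule can be sketched in plain Python:

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the listing readable

# Simplified rule: a fence is executable when its info string is a bare
# language name wrapped in curly braces. A dot prefix or missing braces
# marks display-only code.
_EXECUTABLE = re.compile(r"^`{3,}\s*\{([A-Za-z][A-Za-z0-9_]*)\}\s*$")


def is_executable_fence(line: str) -> bool:
    """Return True if `line` opens an executable cell under this simplified rule."""
    return _EXECUTABLE.match(line.strip()) is not None


for info in ["{python}", "python", "{.python}", "{sql}"]:
    print(f"{info!r}: {is_executable_fence(FENCE + info)}")
```

Only the brace-wrapped forms without a dot are classified as executable here; real documents can carry more elaborate info strings that this sketch ignores.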
24.1.2 Cell-Level Options
Every code cell can have options that control its behavior. These are specified using YAML comments at the top of the block, with the #| prefix:
document.qmd
```{python}
#| echo: true
#| output: true
import polars as pl
data = pl.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
data
```
The most commonly used options are:
| Option | Default | Effect |
|---|---|---|
| `echo` | `true` | Show the source code in the output |
| `eval` | `true` | Execute the code |
| `output` | `true` | Show the code’s output |
| `warning` | `true` | Show warning messages |
| `label` | (none) | An identifier for cross-referencing |
| `fig-cap` | (none) | Caption for a figure |
| `tbl-cap` | (none) | Caption for a table |
24.1.2.1 Using echo to Control Code Visibility
echo controls whether the source code appears in the rendered document. This is your main tool for adapting documents to different audiences.
For a technical report read by engineers and data scientists, you might show the code:
technical_report.qmd
```{python}
#| echo: true
import duckdb
conn = duckdb.connect("data/northwind.duckdb", read_only=True)
products = conn.sql("SELECT * FROM products WHERE unit_price > 50").pl()
products.head()
```
For a stakeholder presentation, you might hide the code and show only the output:
stakeholder_report.qmd
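A minimal sketch of the stakeholder version, assuming it is the same cell as in the technical report with only `echo: false` added:

```{python}
#| echo: false
import duckdb
conn = duckdb.connect("data/northwind.duckdb", read_only=True)
products = conn.sql("SELECT * FROM products WHERE unit_price > 50").pl()
products.head()
```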
Both blocks run the same code and produce the same output table. The only difference is what the reader sees. Non-technical audiences are distracted by code; technical audiences want to verify the methodology. One code cell, two different purposes.
24.1.2.2 Using eval and output for Setup Code
Some code should run during rendering but not appear in the output. A typical use case is initial setup: connecting to a database, importing libraries, or loading configuration. You want this code to execute (so downstream cells can use the connection), but you don’t want the reader to see it.
document.qmd
# Analysis Section
The first cell (eval: true, output: false, echo: false) runs silently: the connection is established, but no code or output appears in the document. The second cell uses that connection to retrieve data.
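To make the interaction between these options concrete, here is a small Python model of what a reader ends up seeing for a given combination. It is a deliberate simplification for illustration, not how Quarto is implemented; `visible_parts` and its parameter names are hypothetical (`evaluate` stands in for the `eval` option to avoid shadowing Python's built-in).

```python
def visible_parts(echo: bool = True, evaluate: bool = True, output: bool = True) -> list[str]:
    """Return which parts of a cell appear in the rendered document.

    The defaults mirror the option defaults: everything is on.
    """
    parts = []
    if echo:
        parts.append("source")
    # Output can only appear if the code actually ran.
    if evaluate and output:
        parts.append("output")
    return parts


print(visible_parts())                          # technical report: code and output
print(visible_parts(echo=False))                # stakeholder report: output only
print(visible_parts(echo=False, output=False))  # silent setup cell: nothing shown
```

The last call is the silent-setup combination: the code still runs (`evaluate` is true), so later cells can use whatever it defined.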
24.1.2.3 Using warning: false to Clean Up Output
Some operations produce harmless warnings. When rendering a report for stakeholders, these warnings distract from the findings. The warning: false option suppresses them:
document.qmd
```{python}
#| warning: false
import polars as pl
# Polars may emit warnings about deprecated syntax; suppress them
revenue = pl.read_csv("data.csv")
revenue
```
24.1.3 Inline Code
In addition to code blocks, Quarto supports inline code that executes within narrative text. This is useful for embedding computed values in prose:
document.qmd
```markdown
The total revenue across all categories is $`{python} f"{revenue['revenue'].sum():,.0f}"`.

According to our analysis of Northwind data, the Beverages category leads with
$`{python} f"{revenue.filter(revenue['category_name'] == 'Beverages')['revenue'][0]:,.0f}"` in annual revenue.
```

When rendered, the expressions evaluate and appear inline:
The total revenue across all categories is $2,545,100.
According to our analysis of Northwind data, Beverages category leads with $267,868 in annual revenue.
Inline code is most useful for summary statistics: totals, counts, percentages, or key findings. It keeps your narrative precise and eliminates the possibility of manually typing a number that might become stale if the data changes.
If you hard-code a number (“Beverages generated $267,868 in revenue”) and later the data updates, you have to manually find and correct the number. With inline code, the number updates automatically. This is a small but important advantage: your report stays accurate without manual maintenance.
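The formatting inside an inline expression is ordinary Python, so it can be tried out in any interpreter first. A short sketch, using the Beverages figure quoted above as a stand-in value (`format_currency` is a hypothetical helper, not part of Quarto or Polars):

```python
def format_currency(value: float, decimals: int = 0) -> str:
    """Format a number as US-style currency with thousands separators."""
    return f"${value:,.{decimals}f}"


beverages_revenue = 267868  # stand-in for a value computed from the data

print(format_currency(beverages_revenue))     # $267,868
print(format_currency(beverages_revenue, 2))  # $267,868.00
```

The `,` in the format specification inserts the thousands separators, and the number after the `.` sets how many decimal places are shown.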
24.1.4 A Complete Quarto Example: Revenue Report
Let’s build a realistic Quarto document that queries the Northwind database, computes summaries, and displays them alongside narrative and charts.
northwind_revenue_report.qmd
```markdown
---
title: "Northwind Revenue Analysis"
author: "Your Name"
date: today
format: html
toc: true
number-sections: true
code-fold: true
---

## Overview

This report analyzes revenue patterns across product categories
in the Northwind database. We'll examine total revenue by category,
identify top-performing categories, and investigate seasonal trends.

## Setup

## Revenue by Category

As shown in @fig-category-revenue, Beverages leads with
`{python} f"${category_revenue[0, 'revenue']:,.2f}"` in total revenue,
followed by Dairy Products and Confections. Together, these three categories
represent `{python} f"{(category_revenue['revenue'].head(3).sum() / category_revenue['revenue'].sum() * 100):.1f}%"` of all revenue.

## Category Summary Table

## Monthly Trends

The monthly trends in @fig-monthly reveal clear seasonal patterns.
All three leading categories show peaks in Q4, consistent with holiday purchasing behavior.

## Conclusion

Beverages and Dairy Products are the revenue engines of Northwind's business.
Strategic focus on inventory management and marketing for these categories
during Q4 is critical for maximizing annual revenue.
```

This document demonstrates the complete workflow: setup code that runs silently, queries that produce tables, inline code that embeds summaries in narrative, and charts with captions and cross-references. When you render it with `uv run quarto render northwind_revenue_report.qmd`, Quarto executes all the Python code, embeds the outputs, and produces a polished HTML report that’s ready to share.
24.1.5 Exercises
1. Example setup for a stakeholder report:
```yaml
---
title: "Q4 Revenue Analysis"
author: "Analytics Team"
format: html
code-fold: true # Readers can expand code if curious, but it's hidden by default
---
```

Then use `echo: false` for all data cells and `echo: true` only for the final chart and summary. This gives stakeholders results without code distraction.
2. A setup cell with eval: true, output: false, echo: false:
Then downstream cells use the connection. Rendering the document will execute this setup cell (the connection is established), but no code or output appears in the report.
3. Inline code example:
```markdown
The Beverages category generated
`{python} f"${category_revenue.filter(pl.col('category_name') == 'Beverages')['revenue'][0]:,.2f}"`
in total revenue.
```

This embeds the exact value from the data, so if the data updates, the number updates automatically.
4. A report with multiple outputs:
```yaml
---
title: "Analysis"
format: [html, docx, pdf]
---
```

Rendering with `uv run quarto render report.qmd` produces HTML, Word, and PDF versions from the same source. Alternatively, override the format on the command line: `uv run quarto render report.qmd --to docx`.
1. You’re writing a Quarto report for non-technical stakeholders. You have four code cells: a database connection, a SQL query, a Polars transformation, and an Altair chart. For each cell, decide which options (`echo`, `eval`, `output`) you’d set and explain why.
2. Create a setup cell in Quarto that loads your DuckDB connection using `eval: true`, `output: false`, `echo: false`. Explain why this combination is useful.
3. Write an inline code expression that embeds the total revenue from the Northwind Beverages category in a sentence. Make sure it formats the number as currency with commas.
4. If you want to render a single `.qmd` file to both HTML and Word formats, how would you structure the YAML header?
24.2 Part 2: SQL Cells in Quarto
Quarto supports dedicated SQL code cells that execute against a database connection. This is particularly useful when your analysis is SQL-heavy and you want clean, readable queries inline with your narrative.
A SQL cell looks similar to a Python cell, but the code block is marked {sql}:
document.qmd
```{sql}
#| output:
#|   max-items: 10
SELECT
    c.category_name,
    COUNT(*) AS product_count,
    ROUND(AVG(p.unit_price), 2) AS avg_price
FROM products AS p
JOIN categories AS c ON p.category_id = c.category_id
GROUP BY c.category_name
ORDER BY product_count DESC
```

By default, Quarto renders SQL output as a formatted table. The `output: max-items: 10` option limits the display to the first 10 rows (useful for large result sets).
24.2.1 Connecting SQL to Your Database
For SQL cells to execute, Quarto needs access to a database connection. In Python, you establish this connection in a Python cell:
document.qmd
## Product Categories
```{sql}
#| connection: conn
SELECT
    c.category_name,
    COUNT(*) AS product_count,
    ROUND(AVG(p.unit_price), 2) AS avg_price
FROM products AS p
JOIN categories AS c ON p.category_id = c.category_id
GROUP BY c.category_name
ORDER BY product_count DESC
```

The `connection: conn` option tells Quarto’s SQL cell to use the `conn` variable from the Python environment. This is the same connection you established in Python.
Whether a `{sql}` cell runs at all depends on the engine behind your document: the `connection` option is a feature of Quarto’s knitr engine, so check that your setup executes SQL cells before relying on this pattern. Calling `conn.sql(...)` from a Python cell, as in the earlier examples, works in either case.
24.2.2 Exercises
1. Example document with SQL cells:
---
title: "Northwind Product Analysis"
format: html
---
## Top Products by Revenue
```{sql}
#| connection: conn
#| output:
#|   max-items: 15
SELECT
    p.product_name,
    ROUND(SUM(od.unit_price * od.quantity * (1 - od.discount)), 2) AS revenue,
    COUNT(DISTINCT od.order_id) AS orders
FROM order_details AS od
JOIN products AS p ON od.product_id = p.product_id
GROUP BY p.product_name
ORDER BY revenue DESC
```

2. The SQL cell executes in Quarto’s computational context. The `connection: conn` option passes the DuckDB connection from the Python environment. Behind the scenes, Quarto executes the SQL query and renders the result as a table.
3. SQL cells can read from Python variables using parameter binding. If you define min_price = 50 in a Python cell, you can reference it in a SQL cell using the params option or string formatting.
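Of the two approaches, parameter binding is the safer habit because the value never becomes part of the SQL text. The sketch below uses Python's built-in `sqlite3` module and a tiny made-up `products` table purely so that it runs anywhere; it is a stand-in, not the Northwind database. DuckDB's Python API accepts `?` placeholders in `conn.execute(query, params)` in the same way.

```python
import sqlite3

# In-memory stand-in database with hypothetical rows (not Northwind data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_name TEXT, unit_price REAL)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Widget", 12.5), ("Gadget", 55.0), ("Gizmo", 97.0)],
)

min_price = 50  # a Python variable, e.g. defined in an earlier cell

# The ? placeholder is bound to min_price by the driver, not pasted into the SQL.
rows = conn.execute(
    "SELECT product_name, unit_price FROM products "
    "WHERE unit_price > ? ORDER BY unit_price DESC",
    (min_price,),
).fetchall()

for name, price in rows:
    print(f"{name}: {price:.2f}")
```

String formatting would produce the same rows here, but it breaks as soon as the value contains quotes or comes from user input.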
1. Create a Quarto document with a Python cell that establishes a DuckDB connection to the Northwind database. Then add a SQL cell that queries the `products` table and displays the first 10 products sorted by `unit_price` descending. Use `connection: conn` in the SQL cell.
2. In a SQL cell, where does the data come from? How does Quarto know how to execute the SQL?
3. You want to create a SQL cell that only displays the top 20 rows of results. What option would you use?
24.3 Part 3: Marimo Notebooks
Quarto documents execute code during rendering, producing static output. Marimo notebooks execute code during interaction, providing a reactive environment for exploration and visualization.
You already learned the fundamentals of Marimo earlier in this book: reactive execution, the one-definition rule, cell dependencies, and SQL cells. This section builds on those foundations to show how Marimo fits into your computational workflow.
24.3.1 When to Use Marimo vs. Quarto
Both Quarto and Marimo are computational documents, but they serve different purposes:
| Aspect | Quarto | Marimo |
|---|---|---|
| When | Rendering reports for external audiences | Exploring data, building dashboards |
| Output | Static HTML, PDF, Word, slides | Interactive web app (or export to HTML) |
| Code execution | During `quarto render` | During `marimo edit` or `marimo run` |
| Updates | Reader can’t change; requires re-render | User can adjust parameters and see instant results |
| Distribution | Email as HTML file | Share link, deploy to web server, or export |
| Git | Clean diffs (Markdown) | Clean diffs (.py file) |
| Best for | Polished deliverables, reports, presentations | Exploration, dashboards, internal tools |
Use Quarto when your goal is a finished deliverable. Use Marimo when your goal is exploration or interactive analysis.
24.3.2 Reactivity as a Superpower
The defining feature of Marimo is reactive execution. When you change a cell, all downstream cells automatically re-execute. This tight feedback loop makes Marimo ideal for iterative analysis.
Imagine you’re exploring Northwind data and you want to find all products above a certain price threshold. In a Quarto document, you’d edit the code, run quarto render, wait for the full document to re-render, and then view the output. In Marimo, you adjust the threshold, press Enter, and the result updates instantly.
This difference becomes dramatic when you add interactive elements. A Marimo notebook can have a slider that controls the threshold. Move the slider, and dependent cells re-execute in milliseconds. No button clicks, no terminal commands, no waiting. This is exploration at the speed of thought.
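The idea behind reactive execution can be sketched with Python's standard `ast` module: work out which variables each cell defines and which it reads, then follow those links from the cell that changed. This is a simplified illustration, not Marimo's implementation; the cell names and the helpers `cell_names` and `cells_to_rerun` are hypothetical.

```python
import ast


def cell_names(source: str) -> tuple[set[str], set[str]]:
    """Return (defined, referenced) variable names for one cell's source code."""
    defined, referenced = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                referenced.add(node.id)
    # Names a cell defines itself are not dependencies on other cells.
    return defined, referenced - defined


def cells_to_rerun(cells: dict[str, str], changed: str) -> list[str]:
    """List the cells that become stale when `changed` is edited."""
    info = {name: cell_names(src) for name, src in cells.items()}
    stale, queue = [], [changed]
    while queue:
        defined, _ = info[queue.pop(0)]
        for name, (_, referenced) in info.items():
            if name != changed and name not in stale and defined & referenced:
                stale.append(name)
                queue.append(name)
    return stale


# Hypothetical four-cell notebook, keyed by a made-up cell name.
cells = {
    "data": "prices = [10, 25, 40]",
    "slider": "threshold = 20",
    "filter": "filtered = [p for p in prices if p >= threshold]",
    "summary": "count = len(filtered)",
}

print(cells_to_rerun(cells, "slider"))   # ['filter', 'summary']
print(cells_to_rerun(cells, "summary"))  # []
```

Changing the `slider` cell marks `filter` and then `summary` as stale, while the `data` cell is left alone because nothing it reads has changed.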
24.3.3 Building an Interactive Northwind Explorer
Here’s a realistic Marimo notebook that explores Northwind data with interactive controls:
northwind_explorer.py
```python
import marimo as mo
import duckdb
import polars as pl
import altair as alt

# ===== Setup (hidden, silent) =====
_ = mo.md("""
# Northwind Product Explorer
Explore Northwind products and their pricing patterns.
""")

# Database connection
conn = duckdb.connect("data/northwind.duckdb", read_only=True)

# Load all product data
all_products = conn.sql("""
    SELECT
        p.product_id,
        p.product_name,
        c.category_name,
        p.unit_price,
        p.units_in_stock,
        p.reorder_level
    FROM products AS p
    JOIN categories AS c ON p.category_id = c.category_id
    ORDER BY p.product_name
""").pl()

# ===== Interactive Controls =====
min_price_slider = mo.ui.slider(
    start=0,
    stop=100,
    step=5,
    label="Minimum price",
    value=20
)
mo.md(f"### Price Filter\n\n{min_price_slider}")

category_dropdown = mo.ui.dropdown(
    options=all_products["category_name"].unique().to_list(),
    label="Filter by category",
    value=None
)
mo.md(f"### Category Filter\n\n{category_dropdown}")

# ===== Filtered Data =====
min_price = min_price_slider.value
selected_category = category_dropdown.value
filtered_products = all_products.filter(
    (pl.col("unit_price") >= min_price)
    & (
        (pl.col("category_name") == selected_category)
        if selected_category
        else pl.lit(True)  # no category selected: keep every row
    )
)

# ===== Summary Statistics =====
summary = mo.md(f"""
### Results
**Products matching filters:** {len(filtered_products)}
**Price range:** ${filtered_products['unit_price'].min():.2f} - ${filtered_products['unit_price'].max():.2f}
**Average price:** ${filtered_products['unit_price'].mean():.2f}
**Total stock value:** ${(filtered_products['unit_price'] * filtered_products['units_in_stock']).sum():,.2f}
""")
summary

# ===== Data Table =====
display_table = filtered_products.select(
    pl.col("product_name").alias("Product"),
    pl.col("category_name").alias("Category"),
    pl.col("unit_price").alias("Price"),
    pl.col("units_in_stock").alias("Stock"),
)
display_table

# ===== Chart =====
if len(filtered_products) > 0:
    chart_view = alt.Chart(filtered_products).mark_circle().encode(
        x=alt.X("unit_price:Q", title="Unit Price ($)"),
        y=alt.Y("units_in_stock:Q", title="Units in Stock"),
        color="category_name:N",
        tooltip=["product_name", "unit_price", "units_in_stock"]
    ).properties(
        width=600,
        height=400,
        title="Price vs. Stock for Filtered Products"
    ).interactive()
else:
    chart_view = mo.md("_No products match the selected filters._")
chart_view  # the cell's last expression is what Marimo displays
```

This notebook has three key features:
- Setup cells that load data, plus `mo.md()` cells for narrative.
- Interactive controls (slider and dropdown) that users can adjust.
- Reactive cells that depend on the controls and update automatically.
When a user adjusts the slider or selects a category, the notebook re-executes only the cells that depend on those changes. The filtered data, summary statistics, table, and chart all update instantly.
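One edge case deserves attention in the summary cell: if no rows match the filters, the minimum, maximum, and mean are undefined (Polars returns `None` for them), and formatting `None` with `:.2f` raises a `TypeError`. A minimal sketch of a guard, written against a plain Python list so it runs without any dependencies (`price_summary` is a hypothetical helper):

```python
from statistics import mean


def price_summary(prices: list[float]) -> str:
    """Build the summary text, falling back gracefully when nothing matches."""
    if not prices:
        return "No products match the selected filters."
    return (
        f"Products matching filters: {len(prices)}\n"
        f"Price range: ${min(prices):.2f} - ${max(prices):.2f}\n"
        f"Average price: ${mean(prices):.2f}"
    )


print(price_summary([18.0, 19.0, 10.0]))
print(price_summary([]))
```

In the notebook, the same check could wrap the f-string before it is passed to `mo.md()`.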
24.3.4 SQL Cells in Marimo
Marimo has native SQL cells that execute directly against a database. This is useful when your analysis is SQL-heavy:
northwind_sql.py
```python
import marimo as mo
import duckdb

# Setup: establish connection
conn = duckdb.connect("data/northwind.duckdb", read_only=True)

# SQL cell equivalent (as a Python cell with SQL string)
revenue_by_category = conn.sql("""
    SELECT
        c.category_name,
        ROUND(SUM(od.unit_price * od.quantity * (1 - od.discount)), 2) AS revenue,
        COUNT(DISTINCT o.order_id) AS order_count
    FROM order_details AS od
    JOIN orders AS o ON od.order_id = o.order_id
    JOIN products AS p ON od.product_id = p.product_id
    JOIN categories AS c ON p.category_id = c.category_id
    GROUP BY c.category_name
    ORDER BY revenue DESC
""").pl()

# Display results
mo.md("## Revenue by Category")
revenue_by_category
```

When using Marimo’s graphical interface (`marimo edit`), you can create a dedicated SQL cell from the cell type dropdown. In the notebook’s `.py` file, the query is stored as a string inside a Python cell.
24.3.5 Exercises
1. Example Marimo notebook structure:
```python
import marimo as mo
import duckdb

# Setup
conn = duckdb.connect("data/northwind.duckdb", read_only=True)

# Interactive control
year_slider = mo.ui.slider(start=2010, stop=2014, label="Year")
mo.md(f"### Filter by Year\n\n{year_slider}")

# Query using the slider value
year = year_slider.value
orders = conn.sql(f"SELECT * FROM orders WHERE YEAR(order_date) = {year}").pl()

# Display results
mo.md(f"Orders in {year}: {len(orders)}")
orders.head()
```

The dependency graph has: slider → year → orders → display. When the slider changes, all downstream cells re-execute.
2. Use mo.md() to create narrative cells and mo.ui.* for interactive controls. Keep data loading and transformations in regular Python cells.
3. Marimo automatically detects dependencies by analyzing which variables each cell reads and writes. You can view the dependency graph in the notebook interface (usually a graph icon or sidebar). This graph determines execution order and reactive updates.
Create a Marimo notebook with an interactive slider that controls a minimum revenue threshold. Use this slider to filter products from the Northwind database and display the filtered results in a table.
In Marimo, how do you create narrative text (markdown) alongside code cells?
How does Marimo know which cells depend on a slider’s value? How would you view the dependency graph?
24.4 Part 4: When to Use What
The three computational tools you’ve mastered each have a role:
Quarto is for polished deliverables that you share with others:

- A report emailed to stakeholders
- A presentation to leadership
- A technical document for a team
- An assignment or deliverable
- A blog post or article
Quarto produces static output that’s the same every time someone views it. This is appropriate for reports: the reader wants the finished analysis, not to fiddle with sliders.
Marimo is for interactive exploration and dashboards:

- Investigating a dataset with multiple hypotheses
- Building a tool for colleagues to explore data themselves
- Prototyping an analysis before formalizing it in a report
- Creating a dashboard that updates as you adjust parameters
- An internal tool that your team uses repeatedly
Marimo produces interactive output where the user can change parameters and see results update. This is appropriate for exploration: the analyst wants fast feedback.
Scripts (plain `.py` files) are for automation and reusable components:

- Running the same analysis on new data weekly or monthly
- Building functions that other code imports and uses
- Processing data pipelines that run on a schedule
- Any code that needs to run without human interaction
Scripts execute top-to-bottom, produce output (files, databases, logs), and are reproducible. This is appropriate for production: the system runs automatically.
In a real data engineering workflow, these three tools work together. You script your data pipeline (SQL queries, data cleaning). You explore the results in a Marimo notebook, trying different visualizations and aggregations. You publish your findings in a Quarto report that your stakeholders read. Each tool serves its purpose.
24.4.1 A Complete Workflow Example
Here’s how the three tools might work together on a Northwind analysis:
1. Script (`etl.py`): Extract monthly revenue data from the database, clean it, and save to a CSV file. Run this weekly on a schedule.
2. Marimo notebook (`explore.py`): Load the CSV file. Create interactive controls to filter by category, date range, and metric (revenue vs. order count). Explore different charts and summaries. This is where you figure out what questions to ask.
3. Quarto document (`report.qmd`): Based on findings from the notebook, write a polished report with fixed parameters. Show the top three findings with charts, tables, and narrative. This is where you communicate what you learned.
The script ensures data freshness. The notebook enables fast exploration. The report communicates the findings. All three are necessary.
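As a rough sketch of the script stage, the following aggregates order lines into monthly revenue and writes a CSV. The input rows, the column layout, and the function names are all hypothetical; a real `etl.py` would read from the database instead of a hard-coded list. The revenue formula is the one used in this chapter's SQL: `unit_price * quantity * (1 - discount)`.

```python
import csv
import io
from collections import defaultdict


def monthly_revenue(order_lines):
    """Sum revenue per month from (order_date, unit_price, quantity, discount) rows."""
    totals = defaultdict(float)
    for order_date, unit_price, quantity, discount in order_lines:
        month = order_date[:7]  # "YYYY-MM" from an ISO date string
        totals[month] += unit_price * quantity * (1 - discount)
    return {month: round(total, 2) for month, total in sorted(totals.items())}


def write_csv(summary, handle):
    """Write the month -> revenue mapping as a two-column CSV."""
    writer = csv.writer(handle)
    writer.writerow(["month", "revenue"])
    writer.writerows(summary.items())


# Made-up order lines for illustration only.
lines = [
    ("1997-01-15", 10.0, 2, 0.0),
    ("1997-01-20", 5.0, 4, 0.5),
    ("1997-02-01", 8.0, 1, 0.0),
]
summary = monthly_revenue(lines)

buffer = io.StringIO()  # a real script would open a file path instead
write_csv(summary, buffer)
print(buffer.getvalue())
```

Keeping the aggregation in a plain function makes it easy to import from the notebook or to run on a schedule without any interactive session.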
24.5 Capstone: Building a Complete Computational Analysis
Throughout this book, we’ve used the Northwind dataset. For your own projects, you’ll apply these same patterns to whatever dataset your work requires.
Here’s a capstone project that brings everything together:
24.5.1 Project Brief
You’re given a CSV file with historical sales data for your organization (or a public dataset of your choice). Your task is to:
1. Explore the data using a Marimo notebook. Create interactive controls to filter by date range, product category, or other relevant dimensions. Build visualizations that help you understand the data’s structure and patterns.
2. Develop hypotheses about what drives revenue or customer behavior. Use SQL to answer specific questions (Which categories are growing? Which customers contribute the most? Are there seasonal patterns?).
3. Create a polished report using Quarto. Document your three key findings with charts, tables, and narrative explanation. Show the code (so a technical reader can verify your methodology) but make the findings clear to any audience.
4. Deploy the analysis as a Marimo dashboard. Create a tool that your stakeholders can use to explore the data themselves. Include sliders or dropdowns for the most important dimensions.
24.5.2 Success Criteria
- Your Marimo notebook demonstrates reactive execution: when you change a control, dependent cells update.
- Your Quarto report renders without errors and can be exported to multiple formats.
- Your code follows the patterns from this book: SQL for data retrieval, Polars for transformation, Altair for visualization.
- Your analysis answers a genuine business question rather than just summarizing data.
- Your narrative explains why the findings matter, not just what the data shows.
24.6 Summary
Quarto documents and Marimo notebooks are complementary tools for computational communication. Quarto renders code into static documents, making it ideal for polished reports that you share with others. Marimo creates interactive notebooks, making it ideal for exploration and dashboards where users need to adjust parameters and see results update instantly.
Code cells in Quarto execute during rendering. Cell options like echo, eval, and output let you control what appears in the finished document. Inline code embeds computed values in narrative. This separation between code and output enables you to write documents for different audiences: show code to engineers, hide code from stakeholders.
Marimo notebooks execute during interaction. The reactive model ensures that changing a cell automatically updates all dependent cells. Interactive UI elements turn analyses into explorable tools. The one-definition rule prevents hidden state bugs that plague traditional notebooks.
Together, Quarto and Marimo transform your analytical work into products that inform decision-making, not just papers that document what you did.
24.7 Glossary
- cell
- A unit of code in Quarto or Marimo. In Quarto, a cell is a code block that executes during rendering. In Marimo, a cell executes during interaction and can depend on other cells.
- code cell
- An executable code block (Python, SQL, R, etc.) in a Quarto document or Marimo notebook.
- code folding
- A Quarto feature that hides code by default but lets readers expand it to see the source. Controlled by the `code-fold` YAML option.
- computational document
- A document that interleaves code, output, and narrative. Includes Quarto documents and Marimo notebooks.
- inline code
- Code that executes within a paragraph of narrative text, not in a separate block. In Quarto, written as `{python} expression`.
- reactive execution
- Marimo’s model where changing a cell automatically re-executes all cells that depend on it.
- rendering
- The process of executing code blocks in a Quarto document and converting the `.qmd` file to output (HTML, PDF, Word, slides).