Getting Started with Jupyter Notebooks
Learn to use Jupyter notebooks for interactive coding, data analysis, and sharing your work.
A step-by-step guide covering Python, SQL, analytics, and finance applications.
Jupyter notebooks are the standard tool for interactive data science work. They let you write code, see results immediately, add explanations, and share your analysis - all in one document. If you're learning Python or R for data work, you'll spend a lot of time in Jupyter.
A Jupyter notebook is an interactive document that combines live code, the output it produces, formatted text, and visualizations.
The name "Jupyter" comes from Julia, Python, and R - the three languages it originally supported. Today it works with dozens of languages, but Python is most common.
Exploratory analysis. When you're investigating data, you want to try things quickly and see results. Notebooks let you run code in chunks, inspect outputs, and iterate fast.
Documentation built-in. You can explain your thinking alongside your code. This makes notebooks great for sharing analysis with colleagues who want to understand your approach.
Visualization inline. Charts and plots appear right below the code that generated them. No switching between windows.
Reproducibility. A notebook captures your entire analysis workflow. Others can run it and get the same results.
Anaconda includes Jupyter, Python, and common data science packages:
If you already have Python installed:
pip install jupyter
Then run:
jupyter notebook
JupyterLab is the newer interface with more features:
pip install jupyterlab
jupyter lab

When you launch Jupyter, it opens in your web browser. You'll see:
File browser - Navigate your folders and open notebooks (.ipynb files)
Notebook view - The main editing area with cells
Toolbar - Buttons for common actions (run, stop, save)
Kernel indicator - Shows whether code is running
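Under the hood, an .ipynb file is plain JSON, which is why the file browser and other tools can handle notebooks easily. A minimal stdlib sketch of that structure (the cell contents here are made up):

```python
import json

# A minimal notebook: a dict with "cells", "metadata", and format versions.
nb = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["## Analysis Summary"]},
        {"cell_type": "code", "metadata": {}, "source": ["print('hi')"],
         "outputs": [], "execution_count": None},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

# Round-trip through JSON, just like saving and reopening a notebook.
loaded = json.loads(json.dumps(nb))
print([c["cell_type"] for c in loaded["cells"]])  # → ['markdown', 'code']
```

Because it's just JSON, notebooks diff, version, and parse like any other text file.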
Notebooks are made of cells. Each cell is either a code cell or a Markdown cell.
Type Python code and press Shift+Enter to run:
import pandas as pd
df = pd.read_csv('data.csv')
df.head()

The output appears directly below the cell.
Write formatted text using Markdown syntax:
## Analysis Summary

This section explores the **key findings** from our data:

- Finding 1
- Finding 2

Press Shift+Enter to render the formatted text.
These will speed up your work significantly:
| Action | Shortcut |
|---|---|
| Run cell | Shift + Enter |
| Run cell, stay in place | Ctrl + Enter |
| Insert cell below | B (in command mode) |
| Insert cell above | A (in command mode) |
| Delete cell | DD (in command mode) |
| Switch to Markdown | M (in command mode) |
| Switch to Code | Y (in command mode) |
| Enter command mode | Esc |
| Enter edit mode | Enter |
Command mode (blue cell border): Navigate and manipulate cells
Edit mode (green cell border): Type in a cell
Each cell should do one thing. This makes debugging easier and helps readers follow your logic.
# Good: One operation per cell
df = pd.read_csv('sales.csv')
df['total'] = df['quantity'] * df['price']
df.groupby('region')['total'].sum()
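If you'd like to try the same grouping logic without pandas installed, here's a stdlib-only sketch; the sales figures are inlined stand-ins for the hypothetical sales.csv:

```python
import csv
import io
from collections import defaultdict

# Inline stand-in for sales.csv (made-up numbers).
raw = """region,quantity,price
East,2,10.0
West,1,5.5
East,3,2.0
"""

# Sum quantity * price per region, like the groupby above.
totals = defaultdict(float)
for row in csv.DictReader(io.StringIO(raw)):
    totals[row["region"]] += int(row["quantity"]) * float(row["price"])

print(dict(totals))  # → {'East': 26.0, 'West': 5.5}
```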
Don't just show code - explain what you're doing and why:
## Data Cleaning

The raw data has several issues we need to address:
- Missing values in the 'region' column
- Duplicate transaction IDs
- Negative quantities (likely returns)

Before sharing, restart the kernel and run all cells from top to bottom. This catches issues where cells depend on deleted code or out-of-order execution.
Kernel > Restart & Run All
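One informal way to spot out-of-order execution without opening the notebook: code cells record an execution_count, and after a clean Restart & Run All those counts increase top to bottom. A stdlib sketch (the notebook content here is made up):

```python
import json

def ran_in_order(nb):
    """Check that code cells executed top to bottom (counts strictly increase)."""
    counts = [c["execution_count"] for c in nb["cells"]
              if c["cell_type"] == "code" and c["execution_count"] is not None]
    return counts == sorted(counts) and len(counts) == len(set(counts))

# Made-up notebook where the second cell was run before the first:
nb = json.loads('{"cells": ['
                '{"cell_type": "code", "execution_count": 2},'
                '{"cell_type": "code", "execution_count": 1}]}')
print(ran_in_order(nb))  # → False
```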
Name your notebooks descriptively:
Good: 2024-01-sales-analysis.ipynb
Avoid: Untitled.ipynb

A typical first-look workflow uses a few short cells:

# Load data
import pandas as pd
df = pd.read_csv('data.csv')

# Quick look
df.head()

# Shape and types
print(f"Rows: {len(df)}, Columns: {len(df.columns)}")

# Summary statistics
df.describe()

# Check for missing values
df.isnull().sum()

import matplotlib.pyplot as plt

# Enable inline plots
%matplotlib inline

# Create a chart
df.plot()

Add a semicolon to prevent output:

fig, ax = plt.subplots(figsize=(10, 6));  # No extra output
Use notebooks for:
- Exploratory analysis and prototyping
- Visualization-heavy work
- Analysis you'll share and explain

Use scripts (.py files) for:
- Production code and automation
- Reusable modules and libraries
- Anything that runs unattended or on a schedule
Many data scientists prototype in notebooks, then move finalized code to scripts.
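Since .ipynb files are JSON, that move can start as simply as pulling the code cells out. A stdlib sketch (the notebook dict is made up; in practice you'd load a real file, or use `jupyter nbconvert --to script`):

```python
import json

def cells_to_script(nb):
    """Concatenate code-cell sources into one .py-style string."""
    chunks = []
    for cell in nb["cells"]:
        if cell["cell_type"] == "code":
            # A cell's source is a list of lines.
            chunks.append("".join(cell["source"]))
    return "\n\n".join(chunks)

# Made-up notebook content; Markdown cells are skipped.
nb = {"cells": [
    {"cell_type": "markdown", "source": ["## Load data\n"]},
    {"cell_type": "code", "source": ["import csv\n", "rows = []"]},
    {"cell_type": "code", "source": ["print(len(rows))"]},
]}
script = cells_to_script(nb)
print(script)
```

Running this prints the two code cells joined by a blank line, with the Markdown cell dropped.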
Classic Notebook - Simpler, one notebook at a time, lighter weight
JupyterLab - Multiple tabs, file browser, terminal, more IDE-like
Both work with the same .ipynb files. JupyterLab is the direction Jupyter is heading, but classic notebooks are still widely used.
Notebooks are a tool - the more you use them, the more natural they become. Start simple, and you'll develop your own workflow over time.