How to Use Jupyter Notebooks
The name Jupyter comes from three programming languages (Julia, Python, R), reflecting that the notebook system supports multiple languages through interchangeable kernels. In practice, Python is by far the most common kernel. The notebook file format (.ipynb) is JSON that stores cell contents, cell types, outputs, and metadata. This format allows notebooks to be version-controlled with git, rendered on GitHub (which displays notebooks as formatted documents), and converted to other formats like HTML, PDF, LaTeX, and slide presentations.
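Because an .ipynb file is plain JSON, it can be inspected with nothing more than Python's standard json module. The sketch below builds a minimal notebook by hand (the cell contents and ids are made up for illustration), serializes it, and reads the cell types back; it is meant to show the shape of the format, not to replace the nbformat library.

```python
import json

# A minimal nbformat-4 notebook written by hand (illustrative content only).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {
        "kernelspec": {"name": "python3", "display_name": "Python 3", "language": "python"}
    },
    "cells": [
        {"cell_type": "markdown", "id": "intro", "metadata": {}, "source": ["# Demo notebook"]},
        {
            "cell_type": "code",
            "id": "sum",
            "metadata": {},
            "execution_count": 1,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 1,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

# This string is what would be stored on disk in the .ipynb file.
text = json.dumps(notebook, indent=1)

# Reading it back gives ordinary dicts and lists.
loaded = json.loads(text)
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

Note that the saved output ("2") lives inside the file next to the code that produced it, which is why notebooks with large outputs produce large diffs in git.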
Step 1: Install and Launch Jupyter
Install JupyterLab (the modern interface) with pip install jupyterlab or conda install jupyterlab. The classic Jupyter Notebook interface is also available via pip install notebook, but JupyterLab is recommended because it provides a more complete environment with file browsing, multiple tabs, a terminal, text editor, and extension system. Both interfaces run the same notebook files and are fully interchangeable.
Launch JupyterLab by typing jupyter lab in a terminal. This starts a local web server and opens a browser tab, by default at http://localhost:8888/lab (if port 8888 is already in use, Jupyter picks another port and prints the URL in the terminal). The left sidebar shows the file browser. Click the "+" button or the Python 3 kernel tile to create a new notebook. The notebook opens with a single empty code cell ready for input. Type import numpy as np in the cell and press Shift+Enter to execute it and move to the next cell. The kernel (the Python process running behind the notebook) maintains state between cells, so np is now available in every cell you run afterwards.
For cloud-based Jupyter without any local installation, Google Colab (colab.research.google.com) provides free notebooks with usage-limited GPU access, pre-installed scientific packages, and Google Drive integration. IBM Watson Studio, Amazon SageMaker, and Microsoft Azure Machine Learning offer similar cloud environments with more compute resources (the standalone Azure Notebooks service has been retired). JupyterHub deployments at universities and research institutions provide shared Jupyter environments for teams with centralized data access and compute resources.
Step 2: Master Cell Types and Execution
Notebooks contain two primary cell types: code and Markdown. Code cells execute Python and display the output directly below. The value of the last expression in a code cell is displayed automatically (you do not need print() for the final expression), unless the line ends with a semicolon or the value is None. Markdown cells contain formatted text using Markdown syntax: # for headings, ** for bold, * for italic, - for bullet lists, and LaTeX math between dollar signs ($E = mc^2$ for inline, $$\int_0^\infty e^{-x} dx = 1$$ for displayed equations). Use Markdown cells to explain your reasoning, document assumptions, and describe results between code cells.
Keyboard shortcuts make navigation fast. Jupyter has two modes: edit mode (typing in a cell) and command mode (navigating between cells). In the classic Notebook interface, edit mode is marked by a green cell border and command mode by a blue one; JupyterLab instead shows a text cursor inside the active cell when it is in edit mode. Press Escape to enter command mode, Enter to enter edit mode. In command mode: A inserts a cell above, B inserts below, D pressed twice deletes the cell, M converts to Markdown, Y converts to code, C copies, X cuts, V pastes, Z undoes cell deletion. In both modes: Shift+Enter executes and moves down, Ctrl+Enter executes and stays, Alt+Enter executes and inserts a new cell below. These shortcuts eliminate mouse clicking and keep your hands on the keyboard.
Cell execution order matters and is a common source of confusion. The number in brackets next to a code cell, such as [3], is an execution counter: it records the order in which cells were run in the current kernel session, not their position in the document. If you execute cells out of order (run cell 5 before cell 3), variables may have unexpected values because they reflect the execution order, not the document order. The safest practice is to periodically restart the kernel and run all cells from top to bottom (Kernel menu, Restart Kernel and Run All Cells in JupyterLab, or Restart & Run All in the classic interface) to verify that the notebook produces correct results when executed in document order. A notebook that only works with a specific execution order is effectively broken.
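The execution counters are saved in the notebook file, so out-of-order execution can be detected mechanically. The following is a minimal sketch, assuming the notebook has already been loaded as a dict (for example with json.load); the function name out_of_order_cells and the demo data are hypothetical. It flags code cells whose counter does not increase in document order, and also cells that were never run.

```python
def out_of_order_cells(notebook):
    """Return indices of code cells whose execution count does not
    increase in document order.

    Code cells that were never run (execution_count is None or missing)
    are reported as well.
    """
    problems = []
    last_count = 0
    for index, cell in enumerate(notebook["cells"]):
        if cell["cell_type"] != "code":
            continue  # Markdown cells have no execution counter
        count = cell.get("execution_count")
        if count is None or count <= last_count:
            problems.append(index)
        else:
            last_count = count
    return problems


# Demo data: the last cell was run (count 3) before the cell above it (count 5).
demo = {
    "cells": [
        {"cell_type": "code", "execution_count": 1, "source": "x = 1"},
        {"cell_type": "markdown", "source": "notes"},
        {"cell_type": "code", "execution_count": 5, "source": "y = x + 1"},
        {"cell_type": "code", "execution_count": 3, "source": "x = 10"},
    ]
}
print(out_of_order_cells(demo))
```

A check like this only reports symptoms; restarting the kernel and running everything in order remains the actual fix.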
Step 3: Build an Analysis Workflow
Structure notebooks in a consistent top-to-bottom flow. Start with a Markdown cell containing the title, date, author, and a brief description of the analysis goal. Follow with an imports cell that loads all required libraries. Then a data loading cell. Then exploration cells (df.info(), df.describe(), initial plots). Then cleaning and transformation cells. Then analysis cells. Then visualization cells. Then a summary Markdown cell with conclusions. This linear structure ensures the notebook can be re-executed from top to bottom and read as a coherent document.
Inline plotting makes figures appear directly in the notebook. The %matplotlib inline magic command (run once, typically in the imports cell) causes matplotlib figures to display below the code cell that creates them. For higher-resolution inline figures, set %config InlineBackend.figure_format = 'retina' (on high-DPI screens) or %config InlineBackend.figure_format = 'svg' (for crisp vector rendering). Interactive plotting with %matplotlib widget (requires ipympl: pip install ipympl) creates zoomable, pannable figures that update in real time.
Use DataFrames as display objects throughout the notebook. When a code cell's last expression is a DataFrame, Jupyter renders it as a formatted HTML table. df.head() displays the first 5 rows as a clean table. df.describe() displays statistics as a formatted table. df.style.background_gradient(cmap='RdYlGn') adds color-coded formatting. This automatic display makes data exploration interactive and visual without explicit print statements, reducing the friction between running code and understanding results.
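The formatted table is not special to pandas: Jupyter asks an object for a rich representation through methods such as _repr_html_, and DataFrames implement that method. The sketch below uses a small stand-in class (Table is hypothetical, written only for this illustration) to show the mechanism without requiring pandas.

```python
from html import escape


class Table:
    """Tiny stand-in for a DataFrame (hypothetical class, for illustration only)."""

    def __init__(self, columns, rows):
        self.columns = columns
        self.rows = rows

    def _repr_html_(self):
        # Jupyter calls this method, when present, to obtain an HTML
        # representation of the object for display below the cell.
        header = "".join(f"<th>{escape(str(c))}</th>" for c in self.columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
            for row in self.rows
        )
        return f"<table><tr>{header}</tr>{body}</table>"


table = Table(["city", "temp_c"], [["Oslo", 4], ["Cairo", 31]])
html = table._repr_html_()
print(html)
```

In a notebook, ending a cell with the bare name table would render the HTML as a table instead of printing the markup.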
Keep cells focused: one task per cell. A cell that loads data should only load data. A cell that computes statistics should not also create plots. Small, focused cells are easier to debug (you can re-run just the failing cell), easier to rearrange (move a cell without breaking dependencies), and easier to read (each cell has a clear purpose). If a cell is longer than about 20 lines, consider splitting it into multiple cells with brief Markdown annotations between them.
Step 4: Use Magic Commands and Extensions
Magic commands are special Jupyter commands prefixed with % (line magic, applies to one line) or %% (cell magic, applies to the whole cell). %timeit expression runs the expression many times and reports the mean and standard deviation of the execution time, which makes it the right tool for benchmarking short snippets. %%timeit at the top of a cell times the entire cell. %time expression runs once and reports CPU and wall time, useful for long-running operations you only want to run once. %who lists the names of all variables in the current namespace. %whos lists variables with their types and values. %reset clears all variables after asking for confirmation (useful when the namespace becomes cluttered).
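Outside a notebook, a comparable measurement can be made with Python's standard-library timeit module. The sketch below is an assumption-laden stand-in for %timeit, not its implementation: the two functions being compared and the repeat counts are arbitrary choices for illustration.

```python
import timeit


def squares_loop(n):
    """Build a list of squares with an explicit loop."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_comprehension(n):
    """Build the same list with a list comprehension."""
    return [i * i for i in range(n)]


CALLS = 200   # calls per timing run
RUNS = 5      # number of timing runs

# Each entry returned by timeit.repeat is the total time for CALLS calls.
# Taking the minimum reduces the influence of other activity on the machine.
loop_time = min(timeit.repeat(lambda: squares_loop(1000), number=CALLS, repeat=RUNS))
comp_time = min(timeit.repeat(lambda: squares_comprehension(1000), number=CALLS, repeat=RUNS))

print(f"loop:          {loop_time / CALLS * 1e6:.1f} microseconds per call")
print(f"comprehension: {comp_time / CALLS * 1e6:.1f} microseconds per call")
```

Inside a notebook the one-line equivalent is %timeit squares_loop(1000), which chooses the number of calls automatically.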
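System interaction magics bridge the notebook and the operating system. %pip install package installs a package into the environment of the running kernel; the older !pip install package form runs pip in a subshell and can end up installing into a different Python environment than the one the kernel uses. A leading ! runs any shell command: !ls lists files, and !wget url downloads a file (if wget is installed on the system). %%bash at the top of a cell executes the entire cell as a bash script. %env VAR=value sets environment variables. %store variable persists a variable across notebook sessions (saved to a local database), and %store -r variable restores it. These magics are particularly useful in cloud notebooks (Google Colab) where you need to install packages or download data before analysis.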
JupyterLab extensions add functionality beyond the core system. A Table of Contents panel, which generates a navigable sidebar from Markdown headings, is built into JupyterLab 3 and later; older versions need it as a separate extension. The Variable Inspector extension shows all variables and their current values. jupyterlab-lsp adds code completion, hover documentation, and go-to-definition, provided a language server for Python is also installed. nbdime provides better notebook diffing and merging for git. jupyterlab-git adds a git interface directly in JupyterLab. In JupyterLab 3 and later, most extensions install with pip install extension_name and appear after JupyterLab is restarted; check each extension's documentation for the exact package name.
Interactive widgets with ipywidgets create controls that update outputs dynamically. interact(f, x=(0, 10, 0.1)) creates a slider that calls function f with the slider value and displays the result. This enables parameter exploration without editing code: create a plotting function that takes parameters, wrap it with interact, and adjust parameters visually to see how the plot changes in real time. Widgets include sliders, dropdowns, checkboxes, text inputs, and buttons. For more complex interactive applications, combine widgets with Output widgets and observe patterns for reactive dashboards within notebooks.
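A minimal sketch of the interact pattern follows. The function damped_wave and its parameters are hypothetical, chosen so the example needs no plotting library; the slider is only created when the code runs inside IPython or Jupyter with ipywidgets installed, so the same code also runs as a plain script.

```python
import math


def damped_wave(frequency=1.0, decay=0.5, n_points=5):
    """Return n_points samples of exp(-decay * t) * sin(2 * pi * frequency * t)
    for t evenly spaced on [0, 1], rounded to 3 decimals."""
    ts = [i / (n_points - 1) for i in range(n_points)]
    return [
        round(math.exp(-decay * t) * math.sin(2 * math.pi * frequency * t), 3)
        for t in ts
    ]


# Only build widgets when running under IPython/Jupyter with ipywidgets available.
try:
    from IPython import get_ipython
    from ipywidgets import interact

    in_ipython = get_ipython() is not None
except ImportError:
    in_ipython = False

if in_ipython:
    # Each (min, max, step) tuple becomes a float slider; moving a slider
    # calls damped_wave again and displays the new return value.
    interact(damped_wave, frequency=(0.5, 5.0, 0.5), decay=(0.0, 2.0, 0.1))
else:
    print(damped_wave(frequency=1.0, decay=0.0))
```

In a real analysis the function would typically draw a matplotlib figure instead of returning a list; the interact call stays the same.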
Step 5: Share and Reproduce Notebooks
Export notebooks to other formats with the File menu or the command line. jupyter nbconvert notebook.ipynb --to html creates a standalone HTML file with all outputs embedded. --to pdf creates a PDF (this requires a LaTeX installation; alternatively, --to webpdf converts through a headless browser). --to slides creates a Reveal.js slide presentation in which cells are grouped into slides according to the slide type set in each cell's metadata. --to script writes the code cells to a .py file, with Markdown cells carried over as comments. HTML export is the most portable format for sharing results with non-technical colleagues.
GitHub renders .ipynb files automatically, so pushing a notebook to a public repository makes it viewable by anyone with a browser. nbviewer.org renders any public notebook URL as a static, read-only page. For executable sharing, mybinder.org creates a live copy from a GitHub repository: the user clicks a Binder link and gets a running Jupyter environment with your code, data, and declared dependencies, all without installing anything. Include an environment.yml or requirements.txt in the repository so Binder can recreate your environment.
Reproducibility best practices prevent the "works on my machine" problem. Always restart the kernel and run all cells before sharing to verify sequential execution. Include a requirements.txt or environment.yml at the top level of your project. Pin package versions (numpy==1.26.4, not just numpy) for exact reproducibility. Avoid absolute paths: use relative paths or environment variables for data locations. Set random seeds where applicable so stochastic results are reproducible. Clear all cell outputs before committing to git to reduce file size and merge conflicts (or configure a pre-commit hook that strips outputs automatically).
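Clearing outputs before a commit can be scripted, since outputs are just fields in the notebook's JSON. The function below is a minimal sketch under that assumption (strip_outputs and the demo notebook are hypothetical names and data); dedicated tools such as nbstripout handle more cases and integrate with git directly.

```python
import copy


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs and
    execution counts cleared. The input dict is left unchanged."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Demo notebook with one executed code cell.
demo = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 7,
            "source": ["2 * 21"],
            "outputs": [{"output_type": "execute_result", "data": {"text/plain": ["42"]}}],
        },
    ]
}

cleaned = strip_outputs(demo)
print(cleaned["cells"][1])
```

To apply this to a file, load it with json.load, pass the result through strip_outputs, and write it back with json.dump.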
For production-quality analysis that outgrows a single notebook, refactor reusable code into Python modules (.py files) and import them from the notebook. The notebook becomes a thin orchestration layer: import functions from your module, call them with parameters, display results. This separation keeps the notebook readable, makes the analysis code testable with standard unit test frameworks, and allows the same analysis functions to be used in batch processing scripts, web services, or other notebooks without copy-pasting code.
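As a small illustration of that split, the sketch below shows a function that would live in a module and a unit test that exercises it. The file names analysis.py and test_analysis.py, the function summarize, and the variable measurements are all hypothetical; both parts are shown in one block only for brevity.

```python
# analysis.py (hypothetical module name): reusable logic lives here,
# not in the notebook.
import statistics
import unittest


def summarize(values):
    """Return basic summary statistics for a non-empty sequence of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# test_analysis.py (hypothetical): the same function under a standard unit test.
class SummarizeTest(unittest.TestCase):
    def test_small_sample(self):
        self.assertEqual(
            summarize([1, 2, 3, 6]),
            {"count": 4, "mean": 3, "median": 2.5},
        )


# In the notebook, the corresponding cell shrinks to two lines:
#     from analysis import summarize
#     summarize(measurements)
```

While the module is still changing, the %autoreload magic from IPython's autoreload extension can save restarting the kernel after each edit.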
Jupyter notebooks are most powerful when structured as linear, top-to-bottom documents that mix code, output, and explanation. Always verify reproducibility by restarting the kernel and running all cells before sharing.