0PricingLogin
Pandas & NumPy Academy · Lesson

Connecting to a Database with SQLAlchemy

Create a SQLAlchemy engine for SQLite and PostgreSQL, and pass it to pd.read_sql to load a table into a DataFrame.

Why Connect Pandas to Databases?

Most production data lives in relational databases — PostgreSQL, MySQL, SQLite, or SQL Server — not CSV files. Connecting Pandas directly to a database lets you query data into a DataFrame without exporting to CSV first, push cleaned DataFrames back into tables, and combine Python's analytical power with the database's indexing and joining capabilities. The bridge between Pandas and databases is SQLAlchemy, Python's standard database abstraction library.

Installing SQLAlchemy

SQLAlchemy is a Python SQL toolkit and ORM. For Pandas integration you only need the Core layer — not the ORM. Install with pip install sqlalchemy. You also need the specific database driver: psycopg2 for PostgreSQL, pymysql for MySQL, or sqlite3 (built into Python) for SQLite. SQLAlchemy acts as an abstraction layer: the same Pandas code works with any supported database by changing only the connection string.

# Install dependencies
# pip install sqlalchemy psycopg2-binary  # for PostgreSQL
# pip install sqlalchemy pymysql           # for MySQL
# sqlite3 is built into Python

import sqlalchemy as sa
import pandas as pd

print('SQLAlchemy version:', sa.__version__)

All lessons in this course

  1. Connecting to a Database with SQLAlchemy
  2. Running SQL Queries from Pandas
  3. Writing DataFrames to Database Tables
  4. Pandas vs. SQL: Choosing the Right Tool
← Back to Pandas & NumPy Academy