Pandas & NumPy Academy · Lesson

Reading Excel and JSON Files

Import Excel workbooks with pd.read_excel and JSON records with pd.read_json, handling common format variations.

Why Excel and JSON Matter

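CSV is the most widely supported exchange format, but Excel workbooks (.xlsx and the older .xls) are common for business data shared by email and produced by reporting tools, and JSON is the standard payload format for REST APIs and many NoSQL databases. pd.read_excel() and pd.read_json() follow the same pattern as pd.read_csv(): pass a path, URL, or file-like object and get a DataFrame back. read_excel() also shares many parameters with the CSV reader (such as usecols, skiprows, and nrows), while read_json() has its own options, such as orient and lines.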

import pandas as pd

# The three most common load functions share the same philosophy
df_csv   = pd.read_csv('data.csv')
df_excel = pd.read_excel('data.xlsx')
df_json  = pd.read_json('data.json')
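
JSON arrives in a few different layouts, so a small illustration may help. The sketch below uses made-up sample data (the names and values are hypothetical) and reads it from an in-memory StringIO buffer instead of a file, covering two common shapes: an array of records and JSON Lines.

```python
from io import StringIO

import pandas as pd

# Hypothetical payload: a JSON array of records, as an API might return.
raw_records = (
    '[{"id": 1, "name": "Ada", "score": 9.5},'
    ' {"id": 2, "name": "Linus", "score": 8.0}]'
)

# Wrap the text in StringIO so read_json treats it as a file-like object.
df_records = pd.read_json(StringIO(raw_records), orient="records")

# Hypothetical JSON Lines input: one JSON object per line.
raw_lines = '{"id": 1, "name": "Ada"}\n{"id": 2, "name": "Linus"}\n'
df_lines = pd.read_json(StringIO(raw_lines), lines=True)

print(df_records)
print(df_lines)
```

With a real file, the same calls take a path in place of the StringIO object, for example pd.read_json('data.json', lines=True).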

Installing the Excel Engine

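Reading Excel files requires an optional engine package: openpyxl for .xlsx files (the modern format) and xlrd for legacy .xls files. Install the one you need with pip, for example pip install openpyxl. When engine is not specified, pandas picks one based on the file format. Writing is handled separately by DataFrame.to_excel(), which can use openpyxl or, for advanced formatting, xlsxwriter. If the required engine is not installed, read_excel() raises an ImportError that names the missing package.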

# Install the required engine:
# pip install openpyxl          # for .xlsx (modern)
# pip install xlrd              # for .xls (legacy); recent pandas releases expect xlrd 2.x, which reads only .xls

import pandas as pd
df = pd.read_excel('report.xlsx', engine='openpyxl')
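
To fail early with a clearer message, one option is to check for the engine before calling read_excel(). This is a minimal sketch, not how pandas itself chooses an engine: the ENGINE_BY_SUFFIX mapping and the helper names engine_for, engine_installed, and read_excel_checked are hypothetical, introduced only for illustration.

```python
import importlib.util
from pathlib import Path

import pandas as pd

# Assumed mapping for this sketch: file extension -> engine package name.
ENGINE_BY_SUFFIX = {".xlsx": "openpyxl", ".xlsm": "openpyxl", ".xls": "xlrd"}


def engine_for(path):
    """Return the engine this sketch associates with the extension, or None."""
    return ENGINE_BY_SUFFIX.get(Path(path).suffix.lower())


def engine_installed(engine):
    """Return True if the engine package can be found in this environment."""
    return importlib.util.find_spec(engine) is not None


def read_excel_checked(path, **kwargs):
    """Read an Excel file, raising a descriptive error if the engine is missing."""
    engine = engine_for(path)
    if engine is None:
        raise ValueError(f"Unrecognized Excel extension: {path}")
    if not engine_installed(engine):
        raise ImportError(f"Reading {path} needs '{engine}': pip install {engine}")
    # Extra keyword arguments (sheet_name, usecols, ...) pass through unchanged.
    return pd.read_excel(path, engine=engine, **kwargs)
```

Usage would look like df = read_excel_checked('report.xlsx', sheet_name=0), with the file name taken from the example above.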
