2.Loading Excel and CSV Data

Download this dataset :

400KB

32KB

First thing to understand before you start working in pandas is how to load your data. In this guide we show how to load Excel and CSV files into pandas.

Install dependencies

If you want to work with Excel files in pandas, you need an Excel reader. pandas commonly uses openpyxl for .xlsx files. Install pandas (and openpyxl if needed) from your terminal:

pip install pandas openpyxl

Load Excel file (using a full path)

Example — read an Excel file using an absolute path:

import pandas as pd

data = pd.read_excel(r"C:\Users\abhis\Desktop\Adidas US Sales Datasets.xlsx")
data

Example output (truncated):

Retailer

Invoice Date

Region

State

City

Product

Price per Unit

Units Sold

Sales Method

Foot Locker

2020-01-01

Northeast

New York

Men's Street Footwear

50.0

1200

In-store

Foot Locker

2020-01-02

Northeast

New York

Men's Athletic Footwear

50.0

1000

In-store

...

9647

Foot Locker

2021-02-22

Northeast

New Hampshire

Manchester

Women's Street Footwear

29.0

Outlet

9648 rows × 9 columns

Notes:

pandas assumes the first row is a header and the remaining rows are values.
pandas assigns an index to each row.
pandas prints the number of rows and columns.

Load Excel file (without giving a path)

If you save the Excel file in your project directory (the current working directory), you can read it by filename only:

data = pd.read_excel('Adidas US Sales Datasets.xlsx')
data

(Example output is the same as above — 9648 rows × 9 columns.)

skiprows

If the file has a few leading rows that are empty or contain metadata so pandas treats the wrong row as the header, use the skiprows parameter to skip n rows:

Example of problematic reading (first row treated as header):

data = pd.read_excel('Adidas US Sales Datasets.xlsx')
data

Solution — skip the first row so pandas reads the correct header:

data = pd.read_excel('Adidas US Sales Datasets.xlsx', skiprows=1)
data

If there are n rows to skip, use skiprows=n.

No headers in Excel

If the Excel file has no header row, pandas will still take the first row as the header by default. Use header=None to tell pandas there are no headers and let it assign integer column names:

data = pd.read_excel('Adidas US Sales Datasets.xlsx', header=None)
data

This produces numeric column names (0, 1, 2, ...). To provide meaningful column names, use the names parameter:

data = pd.read_excel(
    'Adidas US Sales Datasets.xlsx',
    header=None,
    names=['Retailer','Date','Region','State','City','Product','Price','Units Sold','Sales Method']
)
data

Multiple sheets

To read a specific sheet use sheet_name:

retail_data = pd.read_excel('Adidas US Sales Datasets.xlsx', sheet_name='Retailer Details')
retail_data

Example output (truncated):

Retailer

Retailer ID

Foot Locker

1185732

Walmart

1185733

Sports Direct

1197831

West Gear

1197832

Kohl's

120771

How to read CSV files

What is a CSV file?

CSV stands for Comma-Separated Values.
In a CSV file, every field is separated by a delimiter (usually a comma).
CSV is a universal file format and can be opened in many tools.

Read a CSV with pandas:

data = pd.read_csv('retail_sales_dataset.csv')
data

Example output (truncated):

Customer ID

Gender

Age

Product Category

Quantity

Price per Unit

CUST001

Male

Beauty

CUST002

Female

Clothing

500

...

999

CUST1000

Male

Electronics

1000 rows × 6 columns

If your CSV uses a different delimiter (e.g., tab, semicolon), use the delimiter (or sep) parameter:

data = pd.read_csv('some_file.csv', delimiter=';')  # or sep='\t' for tabs

(End of guide.)

Assignments:

Install pandas
Load Adidas Dataset from desktop.
Load Adidas Dataset from your project.
Insert 2 empty rows in your excel file and load the data with real columns. (Use Skiprows)
Delete Header and load the data with headers.
Load Second Sheet from Adidas Dataset.
Load Retail Sales CSV Files

Previous3.Introduction to pandas Next3.DataFrame and Series

Last updated 1 month ago

Good afternoon

hashtagInstall dependencies

hashtagLoad Excel file (using a full path)

hashtagLoad Excel file (without giving a path)

hashtagskiprows

hashtagNo headers in Excel

hashtagMultiple sheets

hashtagHow to read CSV files

hashtagAssignments:

Install dependencies

Load Excel file (using a full path)

Load Excel file (without giving a path)

skiprows

No headers in Excel

Multiple sheets

How to read CSV files

Assignments: