2.Loading Excel and CSV Data

Download this dataset :

First thing to understand before you start working in pandas is how to load your data. In this guide we show how to load Excel and CSV files into pandas.

Install dependencies

If you want to work with Excel files in pandas, you need an Excel reader. pandas commonly uses openpyxl for .xlsx files. Install pandas (and openpyxl if needed) from your terminal:

pip install pandas openpyxl

Load Excel file (using a full path)

Example — read an Excel file using an absolute path:

import pandas as pd

data = pd.read_excel(r"C:\Users\abhis\Desktop\Adidas US Sales Datasets.xlsx")
data

Example output (truncated):

Retailer
Invoice Date
Region
State
City
Product
Price per Unit
Units Sold
Sales Method

0

Foot Locker

2020-01-01

Northeast

New York

New York

Men's Street Footwear

50.0

1200

In-store

1

Foot Locker

2020-01-02

Northeast

New York

New York

Men's Athletic Footwear

50.0

1000

In-store

...

...

...

...

...

...

...

...

...

...

9647

Foot Locker

2021-02-22

Northeast

New Hampshire

Manchester

Women's Street Footwear

29.0

83

Outlet

9648 rows × 9 columns

Notes:

  • pandas assumes the first row is a header and the remaining rows are values.

  • pandas assigns an index to each row.

  • pandas prints the number of rows and columns.

Load Excel file (without giving a path)

If you save the Excel file in your project directory (the current working directory), you can read it by filename only:

data = pd.read_excel('Adidas US Sales Datasets.xlsx')
data

(Example output is the same as above — 9648 rows × 9 columns.)

skiprows

If the file has a few leading rows that are empty or contain metadata so pandas treats the wrong row as the header, use the skiprows parameter to skip n rows:

Example of problematic reading (first row treated as header):

data = pd.read_excel('Adidas US Sales Datasets.xlsx')
data

Solution — skip the first row so pandas reads the correct header:

data = pd.read_excel('Adidas US Sales Datasets.xlsx', skiprows=1)
data

If there are n rows to skip, use skiprows=n.

No headers in Excel

If the Excel file has no header row, pandas will still take the first row as the header by default. Use header=None to tell pandas there are no headers and let it assign integer column names:

data = pd.read_excel('Adidas US Sales Datasets.xlsx', header=None)
data

This produces numeric column names (0, 1, 2, ...). To provide meaningful column names, use the names parameter:

data = pd.read_excel(
    'Adidas US Sales Datasets.xlsx',
    header=None,
    names=['Retailer','Date','Region','State','City','Product','Price','Units Sold','Sales Method']
)
data

Multiple sheets

To read a specific sheet use sheet_name:

retail_data = pd.read_excel('Adidas US Sales Datasets.xlsx', sheet_name='Retailer Details')
retail_data

Example output (truncated):

Retailer
Retailer ID

0

Foot Locker

1185732

1

Walmart

1185733

2

Sports Direct

1197831

3

West Gear

1197832

4

Kohl's

120771

How to read CSV files

What is a CSV file?

  • CSV stands for Comma-Separated Values.

  • In a CSV file, every field is separated by a delimiter (usually a comma).

  • CSV is a universal file format and can be opened in many tools.

Read a CSV with pandas:

data = pd.read_csv('retail_sales_dataset.csv')
data

Example output (truncated):

Customer ID
Gender
Age
Product Category
Quantity
Price per Unit

0

CUST001

Male

34

Beauty

3

50

1

CUST002

Female

26

Clothing

2

500

...

...

...

...

...

...

...

999

CUST1000

Male

47

Electronics

4

30

1000 rows × 6 columns

If your CSV uses a different delimiter (e.g., tab, semicolon), use the delimiter (or sep) parameter:

data = pd.read_csv('some_file.csv', delimiter=';')  # or sep='\t' for tabs

(End of guide.)

Assignments:

  1. Install pandas

  2. Load Adidas Dataset from desktop.

  3. Load Adidas Dataset from your project.

  4. Insert 2 empty rows in your excel file and load the data with real columns. (Use Skiprows)

  5. Delete Header and load the data with headers.

  6. Load Second Sheet from Adidas Dataset.

  7. Load Retail Sales CSV Files

Last updated