Seaborn ¶

Seaborn is a Python data visualization library built on top of matplotlib. It provides high-level functions for statistical plots. We are going to cover these plots:

relplot (relational plot): scatter, line
heatmap
regression: lmplot
catplot (categorical plot): strip, swarm, box, violin, boxen, point, bar, count
displot: histplot, kdeplot

Install Seaborn ¶

pip install seaborn

Import Libraries ¶

In [1]:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

1. relplot ¶

relplot is a high-level plotting function in Seaborn used to visualize relationships between numerical variables.

It shows how one numeric variable changes with another numeric variable.

relplot offers two kinds of plot, selected with the kind parameter:

scatter plot (kind='scatter', the default)
line plot (kind='line')

In [4]:

df = pd.read_csv('relplot data.csv')
df.head()

Out[4]:

Order_ID  Year  Month  City   Customer_Type  Order_Value_K  Delivery_Time_Min  Customer_Rating  Distance_KM  Day_Type  Weather  Festival_Period  Time_Slot
1001      2023  Jan    Delhi  Premium        1.2                               4.8                           Weekday   Clear                     Lunch
1002      2023  Jan    Delhi  Regular        0.5                               4.0                           Weekday   Clear                     Dinner
1003      2023  Jan    Delhi  Premium        1.5                               4.5                           Weekend   Rainy    Yes              Dinner
1004      2023  Jan    Pune   Regular        0.4                               3.8                           Weekday   Rainy                     Lunch
1005      2023  Jan    Pune   Premium        1.0                               4.6                           Weekend   Clear    Yes              Dinner

Finding Relationship Between Delivery Time and Customer Rating

In [3]:

sns.relplot(data=df, x='Delivery_Time_Min', y='Customer_Rating')
plt.show()

Delivery time and customer rating have a negative relationship: the longer a delivery takes, the lower the rating tends to be.

Segmentation - hue

It uses different colors to separate categories in the data.

In [6]:

sns.relplot(data=df,
            x='Delivery_Time_Min',
            y='Customer_Rating',
            hue='Customer_Type')
plt.show()

Regular customers tend to get slower deliveries than premium customers, which may explain why they give lower ratings.

hue - palette

palette sets the colors used for the hue categories. It accepts a dict that maps each category to a color, or the name of a built-in palette.

In [7]:

sns.relplot(data=df,
            x='Delivery_Time_Min',
            y='Customer_Rating',
            hue='Customer_Type',
            palette={'Premium': 'green', 'Regular': 'red'})
plt.show()

In [10]:

sns.relplot(data=df,
            x='Delivery_Time_Min',
            y='Customer_Rating',
            hue='Customer_Type',
            palette='Reds')
plt.show()

size ¶

It uses different marker sizes to distinguish the values of a variable.

In [12]:

sns.relplot(data=df,
            x='Delivery_Time_Min',
            y='Customer_Rating',
            hue='Customer_Type',
            palette={'Premium': 'green', 'Regular': 'red'},
            size='Customer_Rating')
plt.show()

size - sizes

sizes sets the smallest and largest marker size used for the size variable.

In [13]:

sns.relplot(data=df,
            x='Delivery_Time_Min',
            y='Customer_Rating',
            hue='Customer_Type',
            palette={'Premium': 'green', 'Regular': 'red'},
            size='Customer_Rating',
            sizes=(50, 100))
plt.show()

Segmentation - style ¶

It uses different marker styles to separate categories in the data.

In [20]:

sns.relplot(data=df,
            x='Delivery_Time_Min',
            y='Customer_Rating',
            hue='Customer_Type',
            palette={'Premium': 'green', 'Regular': 'red'},
            size='Customer_Rating',
            sizes=(50, 100),
            style='Customer_Type')

# plt.legend(
#     bbox_to_anchor=(1, 1),  # move legend outside
#     loc='upper left'
# )

plt.show()

Segmentation - row ¶

It draws each category in its own row of subplots.
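As a minimal sketch of the row parameter, the example below builds a tiny made-up DataFrame (illustrative values, not the contents of relplot data.csv) that reuses three of the column names from above and draws one subplot row per Weather value. The name demo is an assumption for this example.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up illustrative data; only the column names mirror the dataset above.
demo = pd.DataFrame({
    'Delivery_Time_Min': [20, 25, 30, 35, 40, 45],
    'Customer_Rating':   [4.8, 4.5, 4.2, 4.0, 3.6, 3.2],
    'Weather':           ['Clear', 'Rainy', 'Clear', 'Rainy', 'Clear', 'Rainy'],
})

# row='Weather' creates one row of subplots per distinct Weather value.
g = sns.relplot(data=demo,
                x='Delivery_Time_Min',
                y='Customer_Rating',
                row='Weather')
plt.show()
```

relplot returns a FacetGrid, so g.axes here holds two rows and one column of subplots.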

In [23]:

Segmentation - col ¶

It draws each category in its own column of subplots.
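A similar sketch for col, again on made-up data rather than the real CSV: each Customer_Type gets its own column of the figure. The name demo and the values are assumptions for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up illustrative data; only the column names mirror the dataset above.
demo = pd.DataFrame({
    'Delivery_Time_Min': [20, 25, 30, 35, 40, 45],
    'Customer_Rating':   [4.8, 4.0, 4.5, 3.8, 4.6, 3.5],
    'Customer_Type':     ['Premium', 'Regular'] * 3,
})

# col='Customer_Type' places each customer type in its own subplot column.
g = sns.relplot(data=demo,
                x='Delivery_Time_Min',
                y='Customer_Rating',
                col='Customer_Type')
plt.show()
```

When a variable has many categories, col_wrap=N wraps the columns onto a new row after N subplots.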

In [24]:

In [25]:

2. relplot - line ¶

A line plot is used to find patterns over time.

By default, when several rows share the same x value, the line plot draws their mean, with a shaded band showing the 95% confidence interval.

In [27]:

sns.relplot(data=df,
            x='Delivery_Time_Min',
            y='Customer_Rating',
            kind='line')
plt.show()

In [29]:

sns.relplot(data=df,
            x='Delivery_Time_Min',
            y='Customer_Rating',
            kind='line',
            estimator='sum')
plt.show()
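To see what the two estimators compute, the sketch below does the same aggregation by hand with pandas on a small made-up table. For each distinct x value, the line plot draws one aggregated y value: the mean by default, or the sum with estimator='sum'. The data and the names demo, mean_line and sum_line are assumptions for illustration.

```python
import pandas as pd

# Made-up illustrative data: two orders took 20 minutes, three took 30.
demo = pd.DataFrame({
    'Delivery_Time_Min': [20, 20, 30, 30, 30],
    'Customer_Rating':   [5.0, 4.0, 4.0, 3.5, 3.0],
})

grouped = demo.groupby('Delivery_Time_Min')['Customer_Rating']

mean_line = grouped.mean()  # 20 -> 4.5, 30 -> 3.5 (what the default line shows)
sum_line = grouped.sum()    # 20 -> 9.0, 30 -> 10.5 (what estimator='sum' shows)

print(mean_line)
print(sum_line)
```

Note that the sum rises at 30 minutes only because more orders share that x value, while the mean falls. Pick the estimator that matches the question being asked.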

An Example ¶

In [31]:

df = pd.read_excel('Adidas US Sales Datasets.xlsx')
df['total'] = df['Price per Unit'] * df['Units Sold']
df

Out[31]:

      Retailer     Invoice Date  Region     State          City        Product                    Price per Unit  Units Sold  Sales Method  total
0     Foot Locker  2020-01-01    Northeast  New York       New York    Men's Street Footwear      50.0            1200        In-store      60000.0
1     Foot Locker  2020-01-02    Northeast  New York       New York    Men's Athletic Footwear    50.0            1000        In-store      50000.0
2     Foot Locker  2020-01-03    Northeast  New York       New York    Women's Street Footwear    40.0            1000        In-store      40000.0
3     Foot Locker  2020-01-04    Northeast  New York       New York    Women's Athletic Footwear  45.0            850         In-store      38250.0
4     Foot Locker  2020-01-05    Northeast  New York       New York    Men's Apparel              60.0            900         In-store      54000.0
...
9643  Foot Locker  2021-01-24    Northeast  New Hampshire  Manchester  Men's Apparel              50.0                        Outlet        3200.0
9644  Foot Locker  2021-01-24    Northeast  New Hampshire  Manchester  Women's Apparel            41.0            105         Outlet        4305.0
9645  Foot Locker  2021-02-22    Northeast  New Hampshire  Manchester  Men's Street Footwear      41.0            184         Outlet        7544.0
9646  Foot Locker  2021-02-22    Northeast  New Hampshire  Manchester  Men's Athletic Footwear    42.0                        Outlet        2940.0
9647  Foot Locker  2021-02-22    Northeast  New Hampshire  Manchester  Women's Street Footwear    29.0                        Outlet        2407.0

9648 rows × 10 columns

In [32]:

sns.relplot(data=df, x='Units Sold', y='total')
plt.show()

In [34]:

sns.relplot(data=df,
            x='Units Sold',
            y='total',
            kind='line',
            marker='o')
plt.show()

In [35]:

sns.relplot(data=df,
            x='Units Sold',
            y='total',
            kind='line',
            marker='o',
            estimator='sum')
plt.show()

In [36]:

df = pd.read_csv('relplot data.csv')
df.head()

Out[36]:

Order_ID  Year  Month  City   Customer_Type  Order_Value_K  Delivery_Time_Min  Customer_Rating  Distance_KM  Day_Type  Weather  Festival_Period  Time_Slot
1001      2023  Jan    Delhi  Premium        1.2                               4.8                           Weekday   Clear                     Lunch
1002      2023  Jan    Delhi  Regular        0.5                               4.0                           Weekday   Clear                     Dinner
1003      2023  Jan    Delhi  Premium        1.5                               4.5                           Weekend   Rainy    Yes              Dinner
1004      2023  Jan    Pune   Regular        0.4                               3.8                           Weekday   Rainy                     Lunch
1005      2023  Jan    Pune   Premium        1.0                               4.6                           Weekend   Clear    Yes              Dinner

In [38]:

sns.relplot(data=df,
            x='Delivery_Time_Min',
            y='Customer_Rating',
            kind='line',
            hue='Customer_Type')
plt.show()

Let's Explore With Examples ¶

In [42]:

df = pd.read_excel('Adidas US Sales Datasets.xlsx')
df['total'] = df['Price per Unit'] * df['Units Sold']
df['Invoice Date'] = pd.to_datetime(df['Invoice Date'], format='%Y-%m-%d')
df['Year'] = df['Invoice Date'].dt.year
df.head()

Out[42]:

Retailer     Invoice Date  Region     State     City      Product                    Price per Unit  Units Sold  Sales Method  total    Year
Foot Locker  2020-01-01    Northeast  New York  New York  Men's Street Footwear      50.0            1200        In-store      60000.0  2020
Foot Locker  2020-01-02    Northeast  New York  New York  Men's Athletic Footwear    50.0            1000        In-store      50000.0  2020
Foot Locker  2020-01-03    Northeast  New York  New York  Women's Street Footwear    40.0            1000        In-store      40000.0  2020
Foot Locker  2020-01-04    Northeast  New York  New York  Women's Athletic Footwear  45.0            850         In-store      38250.0  2020
Foot Locker  2020-01-05    Northeast  New York  New York  Men's Apparel              60.0            900         In-store      54000.0  2020

In [45]:

sns.relplot(data=df, x='Units Sold', y='total')
plt.show()

Region-wise

In [46]:

sns.relplot(data=df,
            x='Units Sold',
            y='total',
            hue='Region')
plt.show()

In [52]:

sns.relplot(data=df,
            x='Units Sold',
            y='total',
            col='Region',
            col_wrap=2,
            hue='Region')
plt.show()

Yearly Average Sales ¶

In [50]:

sns.relplot(data=df, x='Year', y='total', kind='line')
plt.show()

In [51]:

sns.relplot(data=df,
            x='Year',
            y='total',
            kind='line',
            hue='Region')
plt.show()

Total Sales ¶

In [53]:

sns.relplot(data=df,
            x='Year',
            y='total',
            kind='line',
            hue='Region',
            estimator='sum')
plt.show()

📊 Correlation Heatmap — Notebook Explanation ¶

📌 What is Correlation? ¶

Correlation measures how strongly two numerical variables are related.

Value range: -1 to +1
+1 → perfect positive linear relationship (values close to +1 are strong)
-1 → perfect negative linear relationship (values close to -1 are strong)
0 → no linear relationship
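The three reference values can be checked with a small hand-written Pearson correlation function. This is a plain-Python sketch on made-up numbers; the function name pearson_r and the sample lists are assumptions for illustration, not part of the datasets used in this notebook.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
rising = [10, 20, 30, 40, 50]   # grows in step with x
falling = [50, 40, 30, 20, 10]  # shrinks in step with x
u_shaped = [3, 1, 2, 1, 3]      # related to x, but not along a line

print(pearson_r(x, rising))    # 1.0
print(pearson_r(x, falling))   # -1.0
print(pearson_r(x, u_shaped))  # 0.0
```

The last case shows why 0 means no linear relationship: the values follow a clear pattern, but not one that a straight line can describe.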

📌 What is a Correlation Heatmap? ¶

A correlation heatmap is a visual way to represent correlation values using colors.

Instead of reading numbers, we interpret color intensity.

🎨 How to Read Colors in a Correlation Heatmap ¶

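Color Meaning          Interpretation
Dark positive color    Strong positive correlation
Dark negative color    Strong negative correlation
Light / neutral color  Weak or no correlation

👉 This reading assumes a diverging colormap centered at 0 (for example cmap='coolwarm' with vmin=-1 and vmax=1), where a more intense color means a stronger relationship. With the default colormap of sns.heatmap, color encodes the value itself: the darkest cells are the lowest (most negative) values and the lightest cells are the highest.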
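For color intensity to track the strength of a relationship in both directions, the heatmap needs a diverging colormap with limits that are symmetric around 0. The sketch below shows one way to set that up with the cmap, vmin and vmax parameters of sns.heatmap. The 3x3 matrix and its labels A, B, C are made-up illustrative values.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Small made-up correlation matrix (illustrative values only).
corr = pd.DataFrame(
    [[1.0, 0.9, -0.8],
     [0.9, 1.0, -0.1],
     [-0.8, -0.1, 1.0]],
    index=['A', 'B', 'C'],
    columns=['A', 'B', 'C'],
)

# 'coolwarm' runs from blue (low) through a neutral midpoint to red (high).
# vmin=-1 and vmax=1 put 0 at that midpoint, so intense red means strongly
# positive and intense blue means strongly negative.
ax = sns.heatmap(corr, annot=True, cmap='coolwarm', vmin=-1, vmax=1)
plt.show()
```

With annot=True each cell also shows its number, so the color is a guide rather than the only source of information.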

📌 Why Use a Correlation Heatmap? ¶

To quickly identify relationships
To detect multicollinearity
To find important features in data analysis

📌 Example Interpretation ¶

If a heatmap shows:

Hours_Studied vs Marks → dark positive color → more study hours go with higher marks
Speed vs Travel_Time → dark negative color → higher speed goes with shorter travel time

🧠 Key Takeaway ¶

A correlation heatmap visually shows how strongly and in which direction numerical variables are related using color intensity.

In [54]:

df = pd.read_csv('Sleep_health_and_lifestyle_dataset.csv')
df

Out[54]:

     Person ID  Gender  Age  Occupation            Sleep Duration  Quality of Sleep  Physical Activity Level  Stress Level  BMI Category  Blood Pressure  Heart Rate  Daily Steps  Sleep Disorder
0               Male         Software Engineer     6.1                                                                      Overweight    126/83                      4200         NaN
1               Male         Doctor                6.2                                                                      Normal        125/80                      10000        NaN
2               Male         Doctor                6.2                                                                      Normal        125/80                      10000        NaN
3               Male         Sales Representative  5.9                                                                      Obese         140/90                      3000         Sleep Apnea
4               Male         Sales Representative  5.9                                                                      Obese         140/90                      3000         Sleep Apnea
...
369  370        Female       Nurse                 8.1                                                                      Overweight    140/95                      7000         Sleep Apnea
370  371        Female       Nurse                 8.0                                                                      Overweight    140/95                      7000         Sleep Apnea
371  372        Female       Nurse                 8.1                                                                      Overweight    140/95                      7000         Sleep Apnea
372  373        Female       Nurse                 8.1                                                                      Overweight    140/95                      7000         Sleep Apnea
373  374        Female       Nurse                 8.1                                                                      Overweight    140/95                      7000         Sleep Apnea

374 rows × 13 columns

In [60]:

cdf = df[['Sleep Duration', 'Quality of Sleep', 'Stress Level', 'Age']].corr()
cdf

Out[60]:

                  Sleep Duration  Quality of Sleep  Stress Level  Age
Sleep Duration    1.000000        0.883213          -0.811023     0.344709
Quality of Sleep  0.883213        1.000000          -0.898752     0.473734
Stress Level      -0.811023       -0.898752         1.000000      -0.422344
Age               0.344709        0.473734          -0.422344     1.000000

In [61]:

sns.heatmap(data=cdf, annot=True)
plt.show()
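With only four columns the matrix is easy to read by eye, but with more columns it helps to list the strongly related pairs programmatically, for instance when checking for multicollinearity. The sketch below retypes the correlation values from Out[60] above and keeps the pairs with an absolute correlation of at least 0.8. The threshold and the name strong_pairs are assumptions for illustration.

```python
import pandas as pd

cols = ['Sleep Duration', 'Quality of Sleep', 'Stress Level', 'Age']

# Correlation values copied from the cdf output shown above.
cdf = pd.DataFrame(
    [[1.000000, 0.883213, -0.811023, 0.344709],
     [0.883213, 1.000000, -0.898752, 0.473734],
     [-0.811023, -0.898752, 1.000000, -0.422344],
     [0.344709, 0.473734, -0.422344, 1.000000]],
    index=cols,
    columns=cols,
)

threshold = 0.8  # assumed cut-off for "strong"; adjust to the analysis

# Visit each pair once (upper triangle only), skipping the diagonal.
strong_pairs = [
    (a, b, cdf.loc[a, b])
    for i, a in enumerate(cols)
    for b in cols[i + 1:]
    if abs(cdf.loc[a, b]) >= threshold
]

for a, b, r in strong_pairs:
    print(f'{a} vs {b}: {r:+.2f}')
# Sleep Duration vs Quality of Sleep: +0.88
# Sleep Duration vs Stress Level: -0.81
# Quality of Sleep vs Stress Level: -0.90
```

Age does not appear in the list: its correlations with the other three columns all stay below 0.5 in absolute value.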

regression - lmplot ¶

In [62]:

df = pd.read_excel('Adidas US Sales Datasets.xlsx')
df['total'] = df['Price per Unit'] * df['Units Sold']
df.head()

Out[62]:

Retailer     Invoice Date  Region     State     City      Product                    Price per Unit  Units Sold  Sales Method  total
Foot Locker  2020-01-01    Northeast  New York  New York  Men's Street Footwear      50.0            1200        In-store      60000.0
Foot Locker  2020-01-02    Northeast  New York  New York  Men's Athletic Footwear    50.0            1000        In-store      50000.0
Foot Locker  2020-01-03    Northeast  New York  New York  Women's Street Footwear    40.0            1000        In-store      40000.0
Foot Locker  2020-01-04    Northeast  New York  New York  Women's Athletic Footwear  45.0            850         In-store      38250.0
Foot Locker  2020-01-05    Northeast  New York  New York  Men's Apparel              60.0            900         In-store      54000.0

In [69]:

sns.lmplot(data=df,
           x='Units Sold',
           y='total',
           scatter_kws={'color': 'blue'},
           line_kws={'color': 'red'},
           col='Region',
           col_wrap=2)
plt.show()
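lmplot overlays a fitted regression line on the scatter plot, a straight line by default. As a rough sketch of what such a line is, the example below fits ordinary least squares by hand on made-up numbers. It illustrates the idea only and is not seaborn's internal code; the function name fit_line and the sample values are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative data that lies exactly on total = 50 * units - 500.
units_sold = [100, 200, 300, 400]
total = [4500, 9500, 14500, 19500]

slope, intercept = fit_line(units_sold, total)
print(slope, intercept)  # 50.0 -500.0
```

The slope is the change in y for each extra unit of x, which is what the steepness of the red line in each panel conveys.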
