17. Seaborn

Seaborn

Seaborn is a Python data visualization library built on top of Matplotlib. It provides a high-level interface for drawing statistical graphics. We are going to cover these plots:

  1. relplot (relational plot) - scatter, line

  2. heatmap

  3. regression - lmplot

  4. displot (distribution plot) - histplot, kdeplot

  5. catplot (categorical plot) - strip, swarm, box, violin, boxen, point, count, bar

Install Seaborn -

pip install seaborn

Import Libraries
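
The import cell of the notebook is not preserved on this page. A typical set of imports for the plots in this chapter might look like the sketch below; the aliases pd, plt and sns are the usual community conventions and are an assumption, not taken from the original notebook.

```python
import pandas as pd              # DataFrames that hold the datasets
import matplotlib.pyplot as plt  # Matplotlib, the library Seaborn draws with
import seaborn as sns            # Seaborn itself

# Optional: apply Seaborn's default theme to all later plots.
sns.set_theme()

print(sns.__version__)
```

In a notebook the figures render automatically; in a plain script, call plt.show() after creating a plot.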

1. relplot

relplot is a high-level plotting function in Seaborn used to visualize relationships between numerical variables.

It shows how one numeric variable changes with another numeric variable.

relplot offers two plots.

  1. scatter plot (default)

  2. line plot

   Order_ID  Year  Month  City   Customer_Type  Order_Value_K  Delivery_Time_Min  Customer_Rating  Distance_KM  Day_Type  Weather  Festival_Period  Time_Slot
0  1001      2023  Jan    Delhi  Premium        1.2            18                 4.8              3            Weekday   Clear    No               Lunch
1  1002      2023  Jan    Delhi  Regular        0.5            25                 4.0              5            Weekday   Clear    No               Dinner
2  1003      2023  Jan    Delhi  Premium        1.5            22                 4.5              4            Weekend   Rainy    Yes              Dinner
3  1004      2023  Jan    Pune   Regular        0.4            28                 3.8              6            Weekday   Rainy    No               Lunch
4  1005      2023  Jan    Pune   Premium        1.0            20                 4.6              4            Weekend   Clear    Yes              Dinner

Finding Relationship Between Delivery Time and Customer Rating

Delivery time and customer rating have a negative relationship: as delivery time increases, the rating tends to go down.
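
The code cell for this plot is not preserved here. A minimal sketch of such a scatter plot, assuming a DataFrame named df (a hypothetical name) that holds only the first five rows from the preview above:

```python
import pandas as pd
import seaborn as sns

# First five rows of the delivery dataset, copied from the preview above.
# In the notebook, df would be loaded from a file instead.
df = pd.DataFrame({
    "Customer_Type": ["Premium", "Regular", "Premium", "Regular", "Premium"],
    "Delivery_Time_Min": [18, 25, 22, 28, 20],
    "Customer_Rating": [4.8, 4.0, 4.5, 3.8, 4.6],
})

# kind="scatter" is the default for relplot, so it could be omitted.
g = sns.relplot(data=df, x="Delivery_Time_Min", y="Customer_Rating", kind="scatter")

# A negative correlation coefficient agrees with the downward trend in the plot.
r = df["Delivery_Time_Min"].corr(df["Customer_Rating"])
print(f"correlation: {r:.2f}")
```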

Segmentation - hue

It uses different colors to separate categories in the data.

Regular customers often get later deliveries than premium customers, which may be why regular customers give lower ratings than premium customers.

hue - palette
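
A possible sketch of hue together with palette, again on the five preview rows. The DataFrame name df and the two colors are illustrative assumptions, not the notebook's own choices.

```python
import pandas as pd
import seaborn as sns

# First five rows of the delivery dataset, copied from the preview above.
df = pd.DataFrame({
    "Customer_Type": ["Premium", "Regular", "Premium", "Regular", "Premium"],
    "Delivery_Time_Min": [18, 25, 22, 28, 20],
    "Customer_Rating": [4.8, 4.0, 4.5, 3.8, 4.6],
})

# hue colors each point by its Customer_Type value.
# palette chooses the color per category (these two colors are arbitrary).
g = sns.relplot(
    data=df,
    x="Delivery_Time_Min",
    y="Customer_Rating",
    hue="Customer_Type",
    palette={"Premium": "tab:green", "Regular": "tab:red"},
)

# Average delivery time per customer type, to back up the reading of the plot.
mean_time = df.groupby("Customer_Type")["Delivery_Time_Min"].mean()
print(mean_time)
```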

size

It uses different marker sizes to separate values in the data: each point is drawn larger or smaller according to the variable mapped to size.

size - sizes
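
A sketch of size with sizes, assuming the order value is the variable mapped to marker size (as the assignments at the end of this chapter suggest). The DataFrame name df is hypothetical.

```python
import pandas as pd
import seaborn as sns

# First five rows of the delivery dataset, copied from the preview above.
df = pd.DataFrame({
    "Order_Value_K": [1.2, 0.5, 1.5, 0.4, 1.0],
    "Delivery_Time_Min": [18, 25, 22, 28, 20],
    "Customer_Rating": [4.8, 4.0, 4.5, 3.8, 4.6],
})

# size maps Order_Value_K to marker area.
# sizes=(40, 180) sets the smallest and largest marker size used.
g = sns.relplot(
    data=df,
    x="Delivery_Time_Min",
    y="Customer_Rating",
    size="Order_Value_K",
    sizes=(40, 180),
)
```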

Segmentation - style

It uses different marker styles (shapes) to separate categories in the data.
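
A sketch of the style parameter on the same five preview rows (the DataFrame name df is hypothetical). Passing the same column to hue and style is optional; it simply makes each category differ in both color and marker shape.

```python
import pandas as pd
import seaborn as sns

# First five rows of the delivery dataset, copied from the preview above.
df = pd.DataFrame({
    "Customer_Type": ["Premium", "Regular", "Premium", "Regular", "Premium"],
    "Delivery_Time_Min": [18, 25, 22, 28, 20],
    "Customer_Rating": [4.8, 4.0, 4.5, 3.8, 4.6],
})

# style gives every Customer_Type its own marker shape.
g = sns.relplot(
    data=df,
    x="Delivery_Time_Min",
    y="Customer_Rating",
    hue="Customer_Type",
    style="Customer_Type",
)
```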

Segmentation - row

It uses different rows to separate categories in the data.

Segmentation - col

It uses different columns to separate categories in the data, drawing one subplot per category side by side.
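
row and col can be combined to build a grid of subplots. The sketch below assumes Weather for the rows and Day_Type for the columns, which is one possible choice and not necessarily the one used in the notebook.

```python
import pandas as pd
import seaborn as sns

# First five rows of the delivery dataset, copied from the preview above.
df = pd.DataFrame({
    "Delivery_Time_Min": [18, 25, 22, 28, 20],
    "Customer_Rating": [4.8, 4.0, 4.5, 3.8, 4.6],
    "Day_Type": ["Weekday", "Weekday", "Weekend", "Weekday", "Weekend"],
    "Weather": ["Clear", "Clear", "Rainy", "Rainy", "Clear"],
})

# One subplot per (Weather, Day_Type) combination:
# Weather values become rows, Day_Type values become columns.
g = sns.relplot(
    data=df,
    x="Delivery_Time_Min",
    y="Customer_Rating",
    row="Weather",
    col="Day_Type",
    height=3,
)

print(g.axes.shape)  # grid shape as (rows, columns)
```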

2. relplot - line

A line plot is used to find patterns over time.

By default, when several observations share the same x value, the line plot draws their average (mean) and shades a confidence interval around it.

An example:

      Retailer     Invoice Date  Region     State          City        Product                    Price per Unit  Units Sold  Sales Method  total
0     Foot Locker  2020-01-01    Northeast  New York       New York    Men's Street Footwear      50.0            1200        In-store      60000.0
1     Foot Locker  2020-01-02    Northeast  New York       New York    Men's Athletic Footwear    50.0            1000        In-store      50000.0
2     Foot Locker  2020-01-03    Northeast  New York       New York    Women's Street Footwear    40.0            1000        In-store      40000.0
3     Foot Locker  2020-01-04    Northeast  New York       New York    Women's Athletic Footwear  45.0            850         In-store      38250.0
4     Foot Locker  2020-01-05    Northeast  New York       New York    Men's Apparel              60.0            900         In-store      54000.0
...
9643  Foot Locker  2021-01-24    Northeast  New Hampshire  Manchester  Men's Apparel              50.0            64          Outlet        3200.0
9644  Foot Locker  2021-01-24    Northeast  New Hampshire  Manchester  Women's Apparel            41.0            105         Outlet        4305.0
9645  Foot Locker  2021-02-22    Northeast  New Hampshire  Manchester  Men's Street Footwear      41.0            184         Outlet        7544.0
9646  Foot Locker  2021-02-22    Northeast  New Hampshire  Manchester  Men's Athletic Footwear    42.0            70          Outlet        2940.0
9647  Foot Locker  2021-02-22    Northeast  New Hampshire  Manchester  Women's Street Footwear    29.0            83          Outlet        2407.0

9648 rows × 10 columns

Delivery dataset preview: the same first five rows (Order_ID 1001 to 1005) and 13 columns shown at the start of the relplot section.

Let's Explore With Examples

   Retailer     Invoice Date  Region     State     City      Product                    Price per Unit  Units Sold  Sales Method  total    Year
0  Foot Locker  2020-01-01    Northeast  New York  New York  Men's Street Footwear      50.0            1200        In-store      60000.0  2020
1  Foot Locker  2020-01-02    Northeast  New York  New York  Men's Athletic Footwear    50.0            1000        In-store      50000.0  2020
2  Foot Locker  2020-01-03    Northeast  New York  New York  Women's Street Footwear    40.0            1000        In-store      40000.0  2020
3  Foot Locker  2020-01-04    Northeast  New York  New York  Women's Athletic Footwear  45.0            850         In-store      38250.0  2020
4  Foot Locker  2020-01-05    Northeast  New York  New York  Men's Apparel              60.0            900         In-store      54000.0  2020

Region Wise

Yearly Average Sales

Total Sales

📊 Correlation Heatmap — Notebook Explanation

📌 What is Correlation?

Correlation measures how strongly two numerical variables are related.

  • Value range: -1 to +1

  • +1 → perfect positive relationship

  • -1 → perfect negative relationship

  • 0 → no linear relationship


📌 What is a Correlation Heatmap?

A correlation heatmap is a visual way to represent correlation values using colors.

Instead of reading numbers, we interpret color intensity.


🎨 How to Read Colors in a Correlation Heatmap

Color                  Interpretation
Dark positive color    Strong positive correlation
Dark negative color    Strong negative correlation
Light / neutral color  Weak or no correlation

👉 Darker the color = stronger the relationship


📌 Why Use a Correlation Heatmap?

  • To quickly identify relationships

  • To detect multicollinearity

  • To find important features in data analysis


📌 Example Interpretation

If a heatmap shows:

  • Hours_Studied vs Marks → dark positive color → more study hours go with higher marks

  • Speed vs Travel_Time → dark negative color → higher speed goes with shorter travel time
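
The two interpretations above can be checked numerically. The numbers in this sketch are invented purely for illustration; only the column names come from the example.

```python
import pandas as pd

# Hypothetical study data: marks rise as hours rise.
study = pd.DataFrame({
    "Hours_Studied": [1, 2, 3, 4, 5, 6],
    "Marks": [35, 45, 50, 62, 70, 78],
})

# Hypothetical travel data: travel time falls as speed rises.
travel = pd.DataFrame({
    "Speed": [30, 40, 50, 60, 70, 80],
    "Travel_Time": [120, 90, 72, 60, 51, 45],
})

# Series.corr computes the Pearson correlation coefficient by default.
r_study = study["Hours_Studied"].corr(study["Marks"])
r_travel = travel["Speed"].corr(travel["Travel_Time"])

print(f"Hours_Studied vs Marks: {r_study:.2f}")   # close to +1
print(f"Speed vs Travel_Time:   {r_travel:.2f}")  # close to -1
```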


🧠 Key Takeaway

A correlation heatmap visually shows how strongly and in which direction numerical variables are related using color intensity.


     Person ID  Gender  Age  Occupation            Sleep Duration  Quality of Sleep  Physical Activity Level  Stress Level  BMI Category  Blood Pressure  Heart Rate  Daily Steps  Sleep Disorder
0    1          Male    27   Software Engineer     6.1             6                 42                       6             Overweight    126/83          77          4200         NaN
1    2          Male    28   Doctor                6.2             6                 60                       8             Normal        125/80          75          10000        NaN
2    3          Male    28   Doctor                6.2             6                 60                       8             Normal        125/80          75          10000        NaN
3    4          Male    28   Sales Representative  5.9             4                 30                       8             Obese         140/90          85          3000         Sleep Apnea
4    5          Male    28   Sales Representative  5.9             4                 30                       8             Obese         140/90          85          3000         Sleep Apnea
...
369  370        Female  59   Nurse                 8.1             9                 75                       3             Overweight    140/95          68          7000         Sleep Apnea
370  371        Female  59   Nurse                 8.0             9                 75                       3             Overweight    140/95          68          7000         Sleep Apnea
371  372        Female  59   Nurse                 8.1             9                 75                       3             Overweight    140/95          68          7000         Sleep Apnea
372  373        Female  59   Nurse                 8.1             9                 75                       3             Overweight    140/95          68          7000         Sleep Apnea
373  374        Female  59   Nurse                 8.1             9                 75                       3             Overweight    140/95          68          7000         Sleep Apnea

374 rows × 13 columns

                  Sleep Duration  Quality of Sleep  Stress Level  Age
Sleep Duration    1.000000        0.883213          -0.811023     0.344709
Quality of Sleep  0.883213        1.000000          -0.898752     0.473734
Stress Level      -0.811023       -0.898752         1.000000      -0.422344
Age               0.344709        0.473734          -0.422344     1.000000
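
The heatmap itself is not preserved on this page. A minimal sketch that draws one from the correlation values printed above is shown below. In the notebook the matrix would come from calling .corr() on the four selected columns; here the values are typed in so that the example is self-contained. The coolwarm color map is an arbitrary choice.

```python
import pandas as pd
import seaborn as sns

cols = ["Sleep Duration", "Quality of Sleep", "Stress Level", "Age"]

# Correlation matrix copied from the output above.
corr = pd.DataFrame(
    [
        [1.000000, 0.883213, -0.811023, 0.344709],
        [0.883213, 1.000000, -0.898752, 0.473734],
        [-0.811023, -0.898752, 1.000000, -0.422344],
        [0.344709, 0.473734, -0.422344, 1.000000],
    ],
    index=cols,
    columns=cols,
)

# annot=True writes each value inside its cell.
# center=0 with a diverging color map puts weak correlations in a neutral color.
ax = sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
                 vmin=-1, vmax=1, center=0)
```

With this color map, strongly positive pairs such as Sleep Duration and Quality of Sleep appear dark red, and strongly negative pairs such as Stress Level and Quality of Sleep appear dark blue.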

regression - lmplot

Adidas sales dataset preview: the same first five rows and 10 columns (Retailer, Invoice Date, Region, State, City, Product, Price per Unit, Units Sold, Sales Method, total) shown in the line plot example.
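
lmplot draws a scatter plot and fits a regression line through it. The sketch below assumes Units Sold against total as the pair of variables (as the assignments suggest) and uses only the ten rows visible in the sales preview; df is a hypothetical name.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# The ten rows visible in the sales preview (first five and last five).
df = pd.DataFrame({
    "Units Sold": [1200, 1000, 1000, 850, 900, 64, 105, 184, 70, 83],
    "total": [60000.0, 50000.0, 40000.0, 38250.0, 54000.0,
              3200.0, 4305.0, 7544.0, 2940.0, 2407.0],
})

# Scatter plot plus a fitted straight line; the shaded band is a confidence
# interval for the fit. Pass ci=None to hide the band.
g = sns.lmplot(data=df, x="Units Sold", y="total")

# The same straight-line fit computed directly, to read off the slope.
slope, intercept = np.polyfit(df["Units Sold"], df["total"], deg=1)
print(f"slope: {slope:.1f}")
```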

displot

displot is a distribution plot: it shows how the values of a variable are spread out.

  1. histplot (default)

  2. kdeplot (kernel density estimate)

Sleep dataset preview: the same 374 rows × 13 columns shown in the correlation heatmap section.

displot - kdeplot

Categorical Plot

  1. strip plot (default)

A strip plot is a scatter plot in which one of the two variables is categorical.

Adidas sales dataset preview: the same first five rows and 10 columns shown in the line plot example.
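
catplot draws a strip plot when no kind is given. The sketch below assumes Sales Method as the categorical variable and total as the numeric one, which is one possible pairing, and uses the ten rows visible in the sales preview.

```python
import pandas as pd
import seaborn as sns

# The ten rows visible in the sales preview (first five and last five).
df = pd.DataFrame({
    "Sales Method": ["In-store"] * 5 + ["Outlet"] * 5,
    "total": [60000.0, 50000.0, 40000.0, 38250.0, 54000.0,
              3200.0, 4305.0, 7544.0, 2940.0, 2407.0],
})

# kind="strip" is the default; each row becomes one dot above its category.
g = sns.catplot(data=df, x="Sales Method", y="total", kind="strip")
```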

Sleep dataset preview: the same 374 rows × 13 columns shown in the correlation heatmap section.

Adidas sales dataset preview: the same first five rows and 10 columns shown in the line plot example.

catplot - swarmplot

In a swarm plot, the dots are shifted sideways along the categorical axis so that they do not overlap.

    Order_ID  Year  Month  City       Customer_Type  Order_Value_K  Delivery_Time_Min  Customer_Rating  Distance_KM  Day_Type  Weather  Festival_Period  Time_Slot
0   1001      2023  Jan    Delhi      Premium        1.2            18                 4.8              3            Weekday   Clear    No               Lunch
1   1002      2023  Jan    Delhi      Regular        0.5            25                 4.0              5            Weekday   Clear    No               Dinner
2   1003      2023  Jan    Delhi      Premium        1.5            22                 4.5              4            Weekend   Rainy    Yes              Dinner
3   1004      2023  Jan    Pune       Regular        0.4            28                 3.8              6            Weekday   Rainy    No               Lunch
4   1005      2023  Jan    Pune       Premium        1.0            20                 4.6              4            Weekend   Clear    Yes              Dinner
5   1006      2023  Jan    Bangalore  Regular        0.6            26                 4.1              5            Weekday   Rainy    No               Lunch
6   1007      2023  Jan    Bangalore  Premium        1.3            18                 4.9              3            Weekend   Clear    Yes              Dinner
7   1008      2023  Jan    Chennai    Regular        0.5            24                 3.9              4            Weekday   Rainy    No               Lunch
8   1009      2023  Jan    Chennai    Premium        1.2            22                 4.7              5            Weekend   Clear    Yes              Dinner
9   1010      2023  Feb    Delhi      Regular        0.6            27                 4.0              5            Weekday   Rainy    No               Lunch
10  1011      2023  Feb    Delhi      Premium        1.4            19                 4.8              3            Weekend   Clear    Yes              Dinner
11  1012      2023  Feb    Pune       Regular        0.5            28                 3.9              6            Weekday   Rainy    No               Lunch
12  1013      2023  Feb    Pune       Premium        1.2            21                 4.6              4            Weekend   Clear    Yes              Dinner
13  1014      2023  Feb    Bangalore  Regular        0.7            25                 4.2              5            Weekday   Rainy    No               Lunch
14  1015      2023  Feb    Bangalore  Premium        1.5            18                 5.0              3            Weekend   Clear    Yes              Dinner
15  1016      2023  Feb    Chennai    Regular        0.6            24                 4.0              4            Weekday   Rainy    No               Lunch
16  1017      2023  Feb    Chennai    Premium        1.3            20                 4.8              5            Weekend   Clear    Yes              Dinner

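If you have a small dataset, a swarm plot works well. For a large dataset, a strip plot is the better choice, because a swarm plot runs out of room to place every point without overlap and becomes slow to draw.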
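
A sketch of a swarm plot on the 17 delivery rows shown above, assuming Customer_Type as the category and Delivery_Time_Min as the value (one possible pairing; df is a hypothetical name):

```python
import pandas as pd
import seaborn as sns

# Customer type and delivery time for the 17 rows shown above.
df = pd.DataFrame({
    "Customer_Type": [
        "Premium", "Regular", "Premium", "Regular", "Premium", "Regular",
        "Premium", "Regular", "Premium", "Regular", "Premium", "Regular",
        "Premium", "Regular", "Premium", "Regular", "Premium",
    ],
    "Delivery_Time_Min": [18, 25, 22, 28, 20, 26, 18, 24, 22,
                          27, 19, 28, 21, 25, 18, 24, 20],
})

# kind="swarm" spreads equal values sideways so that every dot stays visible.
g = sns.catplot(data=df, x="Customer_Type", y="Delivery_Time_Min", kind="swarm")

# For a much larger dataset, kind="strip" would be the usual replacement.
```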

catplot - boxplot

Adidas sales dataset preview: the same first five rows and 10 columns shown in the line plot example.
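
A box plot summarizes each category by its median, quartiles and outliers. The sketch below assumes Sales Method against Units Sold, which is one possible pairing, on the ten rows visible in the sales preview.

```python
import pandas as pd
import seaborn as sns

# The ten rows visible in the sales preview (first five and last five).
df = pd.DataFrame({
    "Sales Method": ["In-store"] * 5 + ["Outlet"] * 5,
    "Units Sold": [1200, 1000, 1000, 850, 900, 64, 105, 184, 70, 83],
})

# kind="box" draws one box per Sales Method.
g = sns.catplot(data=df, x="Sales Method", y="Units Sold", kind="box")

# The line inside each box is the median of that category.
medians = df.groupby("Sales Method")["Units Sold"].median()
print(medians)
```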

catplot - violin
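
A violin plot combines a box plot with a density curve, so it also shows where the values are concentrated. A sketch on the ten rows visible in the sales preview, with an assumed pairing of Sales Method and Units Sold:

```python
import pandas as pd
import seaborn as sns

# The ten rows visible in the sales preview (first five and last five).
df = pd.DataFrame({
    "Sales Method": ["In-store"] * 5 + ["Outlet"] * 5,
    "Units Sold": [1200, 1000, 1000, 850, 900, 64, 105, 184, 70, 83],
})

# kind="violin" draws one mirrored density curve per category.
g = sns.catplot(data=df, x="Sales Method", y="Units Sold", kind="violin")
```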

catplot - boxenplot

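A boxen plot is a more detailed version of the box plot: it draws extra boxes for additional quantiles, which is most useful on larger datasets.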

catplot - pointplot
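
A point plot shows one point per category (the mean by default) with an error bar around it, and joins the points with a line. A sketch on the ten rows visible in the sales preview, with an assumed pairing of Sales Method and total:

```python
import pandas as pd
import seaborn as sns

# The ten rows visible in the sales preview (first five and last five).
df = pd.DataFrame({
    "Sales Method": ["In-store"] * 5 + ["Outlet"] * 5,
    "total": [60000.0, 50000.0, 40000.0, 38250.0, 54000.0,
              3200.0, 4305.0, 7544.0, 2940.0, 2407.0],
})

# kind="point" plots the mean of total for each Sales Method.
g = sns.catplot(data=df, x="Sales Method", y="total", kind="point")

# The plotted points correspond to these group means.
means = df.groupby("Sales Method")["total"].mean()
print(means)
```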

catplot - countplot
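
A count plot shows how many rows fall into each category, so it needs only one variable. The sketch below counts products in the ten rows visible in the sales preview; choosing Product is an assumption.

```python
import pandas as pd
import seaborn as sns

# Product column of the ten rows visible in the sales preview.
df = pd.DataFrame({
    "Product": [
        "Men's Street Footwear", "Men's Athletic Footwear",
        "Women's Street Footwear", "Women's Athletic Footwear",
        "Men's Apparel", "Men's Apparel", "Women's Apparel",
        "Men's Street Footwear", "Men's Athletic Footwear",
        "Women's Street Footwear",
    ],
})

# Passing the column as y gives horizontal bars, which suits long labels.
g = sns.catplot(data=df, y="Product", kind="count")

# The bar lengths match these counts.
counts = df["Product"].value_counts()
print(counts)
```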

catplot - barplot

Adidas sales dataset preview: the same 9648 rows × 10 columns shown in the line plot example.

Assignments


🔹 Section 1: Relplot – Scatter - relplot data

  1. Create a scatter plot between Delivery_Time_Min and Customer_Rating using relplot.

  2. Add Customer_Type as hue in the scatter plot.

  3. Change the color palette to differentiate Premium and Regular customers.

  4. Map Order_Value_K to the size parameter.

  5. Set the point size range to (40, 180).

  6. Add style='Customer_Type' to use different markers.

  7. Create separate plots for Weekday and Weekend using col.

  8. Create separate plots for different Weather conditions using row.

  9. Filter the dataset to show only Delhi orders and plot the scatter.

  10. Plot only Premium customers and visualize delivery time vs rating.

  11. Remove the legend from the scatter plot.

  12. Increase plot height and aspect ratio.

  13. Create separate scatter plots for each City using col.

  14. Change marker transparency using alpha.

  15. Sort the data by Delivery_Time_Min before plotting.


🔹 Section 2: Relplot – Line - Adidas

  1. Create a line plot showing average Total Sales over Date.

  2. Change the estimator to show sum of Total Sales.

  3. Plot Year-wise Total Sales trend.

  4. Add Region as hue to compare sales trends.

  5. Add markers to the line plot.

  6. Rotate x-axis labels to avoid overlap.

  7. Create separate line plots for In-store and Outlet sales using col.

  8. Plot Product-wise sales trend over time.

  9. Change line styles for different regions.

  10. Disable error bars in the line plot.

  11. Filter and plot sales data for 2020 only.

  12. Plot Units Sold trend over time.

  13. Create a line plot for New York city sales only.

  14. Compare Men’s vs Women’s products using hue.

  15. Increase line thickness for better visibility.


🔹 Section 3: Correlation Heatmap (Sleep / Health Dataset)

  1. Select Sleep Duration, Quality of Sleep, Stress Level, and Age columns.

  2. Compute the correlation matrix for the selected columns.

  3. Create a correlation heatmap using Seaborn.

  4. Display correlation values inside the heatmap.

  5. Change the heatmap color map for strong contrast.

  6. Increase the figure size to (8,6).

  7. Mask the upper triangle of the heatmap.

  8. Center the color scale at zero.

  9. Remove the color bar from the heatmap.

  10. Rotate x-axis labels for better readability.


🔹 Section 4: Regression – lmplot - Adidas

  1. Create a regression plot between Units Sold and Total Sales.

  2. Add Product as hue in the regression plot.

  3. Disable the confidence interval.

  4. Create separate regression plots for each Region using col.

  5. Plot regression only for In-store sales.

  6. Plot regression only for Outlet sales.

  7. Change height and aspect ratio of the regression plot.

  8. Create a regression plot for Men’s products only.

  9. Create a regression plot for Women’s products only.

  10. Compare regression lines for different Regions in a single plot.

