4. Series Operations In Pandas

What is a Series

A Series is a one-dimensional labeled array in pandas. Every single column of a DataFrame is a Series, and selecting a single row also returns one.

import pandas as pd
import numpy as np

d = ['iphone', 'motorola', 'samsung']
s = pd.Series(d, name='phones')
s

Output:

0      iphone
1    motorola
2     samsung
Name: phones, dtype: object
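
As a small illustrative sketch (the `stock` Series and its values are made up for this example, not taken from the dataset), a Series can also carry custom labels instead of the default 0, 1, 2, and elements can then be read either by position or by label:

```python
import pandas as pd

# Hypothetical stock counts, labelled by product name instead of 0, 1, 2
stock = pd.Series([12, 7, 30], index=['iphone', 'samsung', 'pixel'], name='stock')

first_by_position = stock.iloc[0]         # positional access
samsung_by_label = stock.loc['samsung']   # label-based access

print(stock.name, list(stock.index))
print(first_by_position, samsung_by_label)
print(stock.dtype)   # an integer dtype here, unlike the object dtype of the string Series
```

`iloc` always counts positions from zero, while `loc` looks up the index labels, so the two only coincide when the index is the default integer range.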

How to access a Series / Column

df = pd.read_csv('retail_sales_dataset.csv')
df['Customer ID']

Output (truncated):

0       CUST001
1       CUST002
2       CUST003
3       CUST004
4       CUST005
         ...
995     CUST996
996     CUST997
997     CUST998
998     CUST999
999    CUST1000
Name: Customer ID, Length: 1000, dtype: object
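
To confirm that a selected column really is a Series, the following minimal sketch uses a three-row stand-in DataFrame (named `sample` here, an assumption so the example runs without the CSV file) and inspects the result:

```python
import pandas as pd

# Tiny stand-in for the retail dataset so the example is self-contained
sample = pd.DataFrame({
    'Customer ID': ['CUST001', 'CUST002', 'CUST003'],
    'Age': [34, 26, 50],
})

customer_ids = sample['Customer ID']

print(type(customer_ids).__name__)   # class name of the selected column
print(customer_ids.name)             # the column label becomes the Series name
print(customer_ids.shape)            # one-dimensional shape
```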

How to access multiple Columns

To access multiple columns, pass a list of column names. The result is a DataFrame, not a Series:

cols = ['Customer ID', 'Age']
df[cols]

Output (truncated):

     Customer ID  Age
0        CUST001   34
1        CUST002   26
2        CUST003   50
3        CUST004   37
4        CUST005   30
...          ...  ...
995      CUST996   62
996      CUST997   52
997      CUST998   23
998      CUST999   36
999     CUST1000   47

1000 rows × 2 columns
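
The difference between single and double brackets is easy to miss, so here is a short sketch on a stand-in DataFrame (`sample` is an assumed name, used so the example runs without the CSV). Single brackets with one label give a Series; a list of labels gives a DataFrame, even when the list holds only one label:

```python
import pandas as pd

sample = pd.DataFrame({
    'Customer ID': ['CUST001', 'CUST002', 'CUST003'],
    'Age': [34, 26, 50],
    'Gender': ['Male', 'Female', 'Male'],
})

age_series = sample['Age']                 # single label -> Series
age_frame = sample[['Age']]                # list with one label -> one-column DataFrame
subset = sample[['Customer ID', 'Age']]    # list with two labels -> two-column DataFrame

print(type(age_series).__name__, type(age_frame).__name__)
print(subset.shape)
print(list(subset.columns))
```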

How to create new columns

df['new col'] = 1
df

Output (truncated):

     Customer ID  Gender  Age  Product Category  Quantity  Price per Unit  new col
0        CUST001    Male   34            Beauty         3              50        1
1        CUST002  Female   26          Clothing         2             500        1
2        CUST003    Male   50       Electronics         1              30        1
3        CUST004    Male   37          Clothing         1             500        1
4        CUST005    Male   30            Beauty         2              50        1
...          ...     ...  ...               ...       ...             ...      ...
995      CUST996    Male   62          Clothing         1              50        1
996      CUST997    Male   52            Beauty         3              30        1
997      CUST998  Female   23            Beauty         4              25        1
998      CUST999  Female   36       Electronics         3              50        1
999     CUST1000    Male   47       Electronics         4              30        1

1000 rows × 7 columns
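
Assigning a single value works because pandas repeats it for every row. A rough sketch of the rule, on an assumed three-row `sample` DataFrame: a scalar is broadcast, while a list must supply exactly one value per row or pandas raises a `ValueError`.

```python
import pandas as pd

sample = pd.DataFrame({'Customer ID': ['CUST001', 'CUST002', 'CUST003']})

# A scalar is repeated for every row
sample['new col'] = 1

# A list must have exactly one value per row
sample['Gender'] = ['Male', 'Female', 'Male']

length_error = None
try:
    sample['too short'] = [1, 2]   # only 2 values for 3 rows
except ValueError as err:
    length_error = str(err)
    print('ValueError:', length_error)

print(sample)
```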

How to create a column with Serial Numbers

df['Serial Number'] = np.arange(1, 1001)
df

Output (truncated):

     Customer ID  Gender  Age  Product Category  Quantity  Price per Unit  new col  Serial Number
0        CUST001    Male   34            Beauty         3              50        1              1
1        CUST002  Female   26          Clothing         2             500        1              2
2        CUST003    Male   50       Electronics         1              30        1              3
3        CUST004    Male   37          Clothing         1             500        1              4
4        CUST005    Male   30            Beauty         2              50        1              5
...          ...     ...  ...               ...       ...             ...      ...            ...
995      CUST996    Male   62          Clothing         1              50        1            996
996      CUST997    Male   52            Beauty         3              30        1            997
997      CUST998  Female   23            Beauty         4              25        1            998
998      CUST999  Female   36       Electronics         3              50        1            999
999     CUST1000    Male   47       Electronics         4              30        1           1000

1000 rows × 8 columns
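
The hard-coded `1001` only fits a file with exactly 1000 rows. One possible variation, shown on an assumed four-row `sample` DataFrame, derives the upper bound from the row count so the same line works for any file size:

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({'Customer ID': ['CUST001', 'CUST002', 'CUST003', 'CUST004']})

# np.arange excludes the stop value, so add 1 to include the last row
sample['Serial Number'] = np.arange(1, len(sample) + 1)

print(sample)
```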

How to create a column with random numbers

# 1000 random integers from 5 to 9 (the upper bound 10 is exclusive)
df['random column'] = np.random.randint(5, 10, 1000)
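
Random columns change on every run, which makes results hard to compare. As a sketch of one alternative (not the approach used above), NumPy's `default_rng` with a fixed seed gives repeatable values; the seed `42` and the `sample` DataFrame are arbitrary choices for illustration:

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({'Customer ID': ['CUST001', 'CUST002', 'CUST003', 'CUST004']})

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)

# Same bounds as above: low is inclusive, high is exclusive
sample['random column'] = rng.integers(5, 10, size=len(sample))

print(sample['random column'].between(5, 9).all())
```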

How to Update a Column

df['new col'] = 5
df

Output (truncated):

     Customer ID  Gender  Age  Product Category  Quantity  Price per Unit  new col  Serial Number
0        CUST001    Male   34            Beauty         3              50        5              1
1        CUST002  Female   26          Clothing         2             500        5              2
2        CUST003    Male   50       Electronics         1              30        5              3
3        CUST004    Male   37          Clothing         1             500        5              4
4        CUST005    Male   30            Beauty         2              50        5              5
...          ...     ...  ...               ...       ...             ...      ...            ...
995      CUST996    Male   62          Clothing         1              50        5            996
996      CUST997    Male   52            Beauty         3              30        5            997
997      CUST998  Female   23            Beauty         4              25        5            998
998      CUST999  Female   36       Electronics         3              50        5            999
999     CUST1000    Male   47       Electronics         4              30        5           1000

1000 rows × 8 columns
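
Assigning to an existing column name overwrites every row. To update only some rows, one common option is `.loc` with a boolean condition. The sketch below uses an assumed three-row `sample` DataFrame, and the replacement value `10` is arbitrary:

```python
import pandas as pd

sample = pd.DataFrame({
    'Customer ID': ['CUST001', 'CUST002', 'CUST003'],
    'Product Category': ['Beauty', 'Clothing', 'Electronics'],
    'new col': [1, 1, 1],
})

# Overwrite every row of the column
sample['new col'] = 5

# Overwrite only the rows that match a condition
is_clothing = sample['Product Category'] == 'Clothing'
sample.loc[is_clothing, 'new col'] = 10

print(sample)
```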

How to create a Calculated Column

df['new price'] = df['Price per Unit'] / 2
df

Output (truncated):

     Customer ID  Gender  Age  Product Category  Quantity  Price per Unit  new col  Serial Number  new price
0        CUST001    Male   34            Beauty         3              50        5              1         25
1        CUST002  Female   26          Clothing         2             500        5              2        250
2        CUST003    Male   50       Electronics         1              30        5              3         15
3        CUST004    Male   37          Clothing         1             500        5              4        250
4        CUST005    Male   30            Beauty         2              50        5              5         25
...          ...     ...  ...               ...       ...             ...      ...            ...        ...
995      CUST996    Male   62          Clothing         1              50        5            996         25
996      CUST997    Male   52            Beauty         3              30        5            997         15
997      CUST998  Female   23            Beauty         4              25        5            998       12.5
998      CUST999  Female   36       Electronics         3              50        5            999         25
999     CUST1000    Male   47       Electronics         4              30        5           1000         15

1000 rows × 9 columns

A column can also be calculated from two existing columns:

df['Total Price'] = df['Price per Unit'] * df['Quantity']
df

Output (truncated):

     Customer ID  Gender  Age  Product Category  Quantity  Price per Unit  new col  Serial Number  new price  Total Price
0        CUST001    Male   34            Beauty         3              50        5              1         25          150
1        CUST002  Female   26          Clothing         2             500        5              2        250         1000
2        CUST003    Male   50       Electronics         1              30        5              3         15           30
3        CUST004    Male   37          Clothing         1             500        5              4        250          500
4        CUST005    Male   30            Beauty         2              50        5              5         25          100
...          ...     ...  ...               ...       ...             ...      ...            ...        ...          ...
995      CUST996    Male   62          Clothing         1              50        5            996         25           50
996      CUST997    Male   52            Beauty         3              30        5            997         15           90
997      CUST998  Female   23            Beauty         4              25        5            998       12.5          100
998      CUST999  Female   36       Electronics         3              50        5            999         25          150
999     CUST1000    Male   47       Electronics         4              30        5           1000         15          120

1000 rows × 10 columns
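
Both calculations above are element-wise: pandas applies the arithmetic row by row without an explicit loop. A minimal sketch on an assumed three-row `sample` DataFrame (values copied from the first rows of the dataset) also shows that division produces a float column, while multiplying two integer columns stays integer:

```python
import pandas as pd

sample = pd.DataFrame({
    'Quantity': [3, 2, 1],
    'Price per Unit': [50, 500, 30],
})

# Column combined with a scalar: every price is halved
sample['new price'] = sample['Price per Unit'] / 2

# Column combined with another column: values are matched row by row
sample['Total Price'] = sample['Price per Unit'] * sample['Quantity']

print(sample.dtypes)
print(sample)
```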

How to save your changes to your original file

# Overwrites the existing file; index=False keeps the row index out of the CSV
df.to_csv('retail_sales_dataset.csv', index=False)
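
Writing back to the same file name replaces the original data. If you would rather keep the source file intact, one option is to write to a new file and read it back to check the result. In this sketch the file name `retail_sales_with_totals.csv`, the temporary folder, and the two-row `sample` DataFrame are all assumptions for illustration:

```python
import os
import tempfile

import pandas as pd

sample = pd.DataFrame({
    'Customer ID': ['CUST001', 'CUST002'],
    'Total Price': [150, 1000],
})

# Write to a new file (hypothetical name) so the source CSV stays untouched
out_path = os.path.join(tempfile.mkdtemp(), 'retail_sales_with_totals.csv')
sample.to_csv(out_path, index=False)

# Read the file back to confirm what was saved
reloaded = pd.read_csv(out_path)
print(list(reloaded.columns))
print(reloaded)
```

Because `index=False` was used, the reloaded DataFrame has only the two data columns and no extra column holding the old row index.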

Assignments:

  1. Load the Adidas file from the required files.

  2. Get only the Retailer column from the DataFrame.

  3. Find the type of the Retailer column.

  4. Store the top 5 records of the Retailer, Product, and Price columns in a small DataFrame.

  5. Create a Total Price column.

  6. Create a column with serial numbers.

  7. Create a discount column with random numbers ranging from 1 to 5 percent.
