12. String Function In Pandas

When working with a dataset, you will often need to perform string operations or manipulation.

To demonstrate, consider this sample dataframe:
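The code that built this frame is not shown here, so the following is only a sketch reconstructed from the table in this section. It assumes PhoneNumber is stored as integers and that the variable is called df.

```python
import pandas as pd

# Values copied from the sample table in this section.
# Assumption: PhoneNumber holds integers, all other text columns hold strings.
df = pd.DataFrame({
    "CustomerID": [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    "FirstName": ["RaHul", "Priya", "Anil", "sneha", "Rohan",
                  "Meera", "amit", "SwaTi", "Ravi", "Nidhi"],
    "LastName": ["Sharma", "Singh", "Kapoor", "Agarwal", "Mehta",
                 "Verma", "Kumar", "Joshi", "Shukla", "Gupta"],
    "Address": [
        "221B baker street, delhi", "mg road, bengaluru", "sector 22, noida",
        "park street, kolkata", "andheri east, mumbai", "baner road, pune",
        "gomti nagar, lucknow", "anna salai, chennai", "ring road, ahmedabad",
        "civil lines, jaipur",
    ],
    "Feedback": [
        "Product was good but DELIVERY was late", "Packing was excellent!!",
        "Quality not same as shown", "Delay of 5 days. not satisfied",
        "Product damaged. refund in 7 days?", "Delivered early!! very happy",
        "Color was different from the website", "Double payment deducted",
        "Wrong item delivered", "Refund amount incorrect by 150 rupees",
    ],
    "PhoneNumber": [98765432, 9988776655, 123456789012, 90807060, 9090909090,
                    8008008008, 9876543210987, 88112, 9999999999, 989898989],
})

print(df)
```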

Output (DataFrame):

|   | CustomerID | FirstName | LastName | Address | Feedback | PhoneNumber |
|---|---|---|---|---|---|---|
| 0 | 101 | RaHul | Sharma | 221B baker street, delhi | Product was good but DELIVERY was late | 98765432 |
| 1 | 102 | Priya | Singh | mg road, bengaluru | Packing was excellent!! | 9988776655 |
| 2 | 103 | Anil | Kapoor | sector 22, noida | Quality not same as shown | 123456789012 |
| 3 | 104 | sneha | Agarwal | park street, kolkata | Delay of 5 days. not satisfied | 90807060 |
| 4 | 105 | Rohan | Mehta | andheri east, mumbai | Product damaged. refund in 7 days? | 9090909090 |
| 5 | 106 | Meera | Verma | baner road, pune | Delivered early!! very happy | 8008008008 |
| 6 | 107 | amit | Kumar | gomti nagar, lucknow | Color was different from the website | 9876543210987 |
| 7 | 108 | SwaTi | Joshi | anna salai, chennai | Double payment deducted | 88112 |
| 8 | 109 | Ravi | Shukla | ring road, ahmedabad | Wrong item delivered | 9999999999 |
| 9 | 110 | Nidhi | Gupta | civil lines, jaipur | Refund amount incorrect by 150 rupees | 989898989 |

To use string functions on columns, use the str accessor. Just like dt brings datetime functions, str brings string functions.

strip function

strip removes whitespace from both the left and right ends of a string.

Example:
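A minimal sketch of the call. The padded values are hypothetical and were added only so that the effect is visible.

```python
import pandas as pd

# Hypothetical values with extra spaces on either side.
df = pd.DataFrame({"FirstName": ["  RaHul ", "Priya  ", " Anil"]})

# str.strip() removes leading and trailing whitespace from every value.
df["FirstName"] = df["FirstName"].str.strip()

print(df["FirstName"].tolist())  # ['RaHul', 'Priya', 'Anil']
```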

Output (DataFrame): the same ten rows as the sample DataFrame above, with leading and trailing whitespace removed.

You can strip multiple columns:
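One way to do this, sketched with hypothetical padded values, is to apply str.strip to each selected column:

```python
import pandas as pd

# Hypothetical padded values for illustration.
df = pd.DataFrame({
    "FirstName": [" RaHul ", "Priya "],
    "LastName": [" Sharma", "Singh  "],
    "Address": [" 221B baker street, delhi ", "mg road, bengaluru "],
})

cols = ["FirstName", "LastName", "Address"]

# apply() passes each column (a Series) to the lambda, which strips it.
df[cols] = df[cols].apply(lambda s: s.str.strip())

print(df)
```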

Output (DataFrame): same structure with trimmed whitespace.

lower / upper / swapcase / title / capitalize

These methods change case or capitalization.

Example sequence:
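The sequence below is a sketch on two rows of the sample data. The intermediate variable names are illustrative, and only the final step writes back to the frame.

```python
import pandas as pd

df = pd.DataFrame({
    "FirstName": ["RaHul", "sneha"],
    "Address": ["221B baker street, delhi", "park street, kolkata"],
})

upper = df["Address"].str.upper()           # every letter upper case
title = df["Address"].str.title()           # first letter of each word upper case
capitalized = df["FirstName"].str.capitalize()  # first letter upper, rest lower
swapped = df["FirstName"].str.swapcase()    # upper becomes lower and vice versa

# Store the lower-cased address back in the DataFrame.
df["Address"] = df["Address"].str.lower()

print(capitalized.tolist())  # ['Rahul', 'Sneha']
print(swapped.tolist())      # ['rAhUL', 'SNEHA']
print(df["Address"].tolist())
```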

Output (DataFrame): addresses shown in lower case.

len function

Counts the number of characters in a value.
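A small sketch using three names from the sample data:

```python
import pandas as pd

df = pd.DataFrame({"FirstName": ["RaHul", "Priya", "Anil"]})

# str.len() returns the number of characters in each value.
df["count"] = df["FirstName"].str.len()

print(df["count"].tolist())  # [5, 5, 4]
```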

Output includes new column count with character lengths of FirstName.

For phone numbers, first cast to string, then measure length:
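A sketch of that two-step approach, assuming PhoneNumber is stored as integers (the str accessor works only on string values):

```python
import pandas as pd

df = pd.DataFrame({"PhoneNumber": [98765432, 9988776655, 123456789012]})

# Cast the integers to strings, then count the characters (digits).
df["Phone Length"] = df["PhoneNumber"].astype(str).str.len()

print(df["Phone Length"].tolist())  # [8, 10, 12]
```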

Output includes Phone Length column.

replace

Used to replace substrings or characters.
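For instance, shortening "street" to "st." in Address could look like the sketch below. Passing regex=False treats the pattern as literal text, which is an explicit choice made here rather than something the original states.

```python
import pandas as pd

df = pd.DataFrame({"Address": ["221b baker street, delhi", "mg road, bengaluru"]})

# Literal replacement: every occurrence of "street" becomes "st."
df["Address"] = df["Address"].str.replace("street", "st.", regex=False)

print(df["Address"].tolist())  # ['221b baker st., delhi', 'mg road, bengaluru']
```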

And removing punctuation from feedback:
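A sketch that uses a regular-expression character class. The set of characters removed (!, ?, ., and ,) is an assumption taken from the assignment list at the end of this page.

```python
import pandas as pd

df = pd.DataFrame({
    "Feedback": ["Packing was excellent!!", "Product damaged. refund in 7 days?"],
})

# [!?.,] matches any single one of the listed punctuation characters.
df["Feedback"] = df["Feedback"].str.replace(r"[!?.,]", "", regex=True)

print(df["Feedback"].tolist())
# ['Packing was excellent', 'Product damaged refund in 7 days']
```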

Output: punctuation removed in Feedback, street replaced by st. in Address.

split(expand=True)

Split values by a delimiter.
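A sketch that splits Address on ", ". The delimiter is inferred from the sample data, where a comma and a space separate area and city.

```python
import pandas as pd

df = pd.DataFrame({"Address": ["221b baker st., delhi", "mg road, bengaluru"]})

# Each value becomes a list of the pieces found between delimiters.
parts = df["Address"].str.split(", ")

print(parts.tolist())
```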

This returns a Series of lists. To expand into separate columns:
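A minimal sketch of the expanded form, again assuming ", " as the delimiter:

```python
import pandas as pd

df = pd.DataFrame({"Address": ["221b baker st., delhi", "mg road, bengaluru"]})

# expand=True returns a DataFrame with one column per piece (named 0, 1, ...).
expanded = df["Address"].str.split(", ", expand=True)

print(expanded)
```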

Output (expanded):

|   | 0 | 1 |
|---|---|---|
| 0 | 221b baker st. | delhi |
| 1 | mg road | bengaluru |
| 2 | sector 22 | noida |
| 3 | park st. | kolkata |
| 4 | andheri east | mumbai |
| 5 | baner road | pune |
| 6 | gomti nagar | lucknow |
| 7 | anna salai | chennai |
| 8 | ring road | ahmedabad |
| 9 | civil lines | jaipur |

Assign to new columns:
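A sketch of the assignment. It relies on every address splitting into exactly two pieces, which holds for the sample data but is an assumption in general.

```python
import pandas as pd

df = pd.DataFrame({"Address": ["221b baker st., delhi", "mg road, bengaluru"]})

# The two expanded columns are assigned, in order, to Area and City.
df[["Area", "City"]] = df["Address"].str.split(", ", expand=True)

print(df)
```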

Output includes Area and City columns.

contains()

Returns True if string contains the keyword (supports regex). Useful with loc to filter rows.

Examples:
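A sketch on three feedback rows from the sample data. The names mask and refunds are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Feedback": [
        "Product damaged. refund in 7 days?",
        "Packing was excellent!!",
        "Refund amount incorrect by 150 rupees",
    ],
})

# case=False makes the match ignore upper/lower case.
mask = df["Feedback"].str.contains("Refund", case=False)

# The boolean Series selects the matching rows through loc.
refunds = df.loc[mask]

print(refunds)
```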

Matches rows where Feedback contains "Refund" (case-insensitive).

Combine conditions:
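A sketch that combines two contains calls with | (logical OR). The keywords "refund" and "damaged" are taken from the assignment list and are used here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Feedback": [
        "Product damaged. refund in 7 days?",
        "Packing was excellent!!",
        "Refund amount incorrect by 150 rupees",
        "Wrong item delivered",
    ],
})

has_refund = df["Feedback"].str.contains("refund", case=False)
has_damaged = df["Feedback"].str.contains("damaged", case=False)

# | keeps rows where at least one condition is True.
result = df.loc[has_refund | has_damaged]

print(result)
```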

Or use a regex alternation:
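In a sketch, the same OR filter fits in a single pattern, because | inside a regular expression means "either side". The keywords are again illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Feedback": [
        "Product damaged. refund in 7 days?",
        "Packing was excellent!!",
        "Refund amount incorrect by 150 rupees",
        "Wrong item delivered",
    ],
})

# "refund|damaged" matches values containing either word.
result = df.loc[df["Feedback"].str.contains("refund|damaged", case=False)]

print(result)
```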

To require multiple keywords (AND):
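A sketch that requires both "not" and "satisfied", with the keywords chosen from the assignment list for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Feedback": [
        "Quality not same as shown",
        "Delay of 5 days. not satisfied",
        "Delivered early!! very happy",
    ],
})

has_not = df["Feedback"].str.contains("not", case=False)
has_satisfied = df["Feedback"].str.contains("satisfied", case=False)

# & keeps only rows where both conditions are True.
result = df.loc[has_not & has_satisfied]

print(result)
```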

Note: there's no shortcut operator inside a single contains call for logical AND — combine expressions with &.

startswith()

Checks if a string starts with the given keyword. There's no case parameter for startswith, so standardize case first if needed.
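A sketch that lower-cases first and then tests the prefix. The capitalized "Park" row is hypothetical, added to show why standardizing case matters.

```python
import pandas as pd

df = pd.DataFrame({
    "Address": ["Park street, kolkata", "mg road, bengaluru", "park lane, pune"],
})

# Lower-case a temporary copy so that "Park" and "park" both match.
mask = df["Address"].str.lower().str.startswith("park")

print(df.loc[mask])
```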

endswith()

Checks if a string ends with the given keyword. Also has no case parameter; standardize case first.
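The same pattern applies to suffixes. In this sketch the "MUMBAI" row is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "Address": ["andheri east, mumbai", "baner road, pune", "bandra west, MUMBAI"],
})

# Standardize case, then check the ending.
mask = df["Address"].str.lower().str.endswith("mumbai")

print(df.loc[mask])
```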

string concatenation

You can concatenate strings from columns:
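A minimal sketch using the + operator. The column name "full name" follows the output description in this section.

```python
import pandas as pd

df = pd.DataFrame({
    "FirstName": ["RaHul", "Priya"],
    "LastName": ["Sharma", "Singh"],
})

# + joins the columns element by element; " " adds the space in between.
df["full name"] = df["FirstName"] + " " + df["LastName"]

print(df["full name"].tolist())  # ['RaHul Sharma', 'Priya Singh']
```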

Output includes full name column.

indexing

Access characters by position:
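A sketch of positional access through the str accessor:

```python
import pandas as pd

df = pd.DataFrame({"Address": ["221B baker street, delhi", "mg road, bengaluru"]})

# str[0] picks the first character of every value.
df["area code"] = df["Address"].str[0]

print(df["area code"].tolist())  # ['2', 'm']
```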

Adds area code column with first character of Address.

slicing

Slice substrings using Python slice notation:
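A sketch that takes the first three characters of FirstName. The length of three is an assumption based on the ShortName task in the assignment list.

```python
import pandas as pd

df = pd.DataFrame({"FirstName": ["RaHul", "Priya", "Anil"]})

# str[:3] keeps characters at positions 0, 1 and 2.
df["short name"] = df["FirstName"].str[:3]

print(df["short name"].tolist())  # ['RaH', 'Pri', 'Ani']
```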

Adds short name column.

You can combine operations to create more complex codes:
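The original expression is not shown here. The sketch below is one formula that reproduces the sample value RaH-101-ihled: the first three characters of FirstName, the CustomerID, and the last five characters of Address reversed. Treat that recipe as an inference from the example output, not as the author's exact code.

```python
import pandas as pd

df = pd.DataFrame({
    "CustomerID": [101, 102],
    "FirstName": ["RaHul", "Priya"],
    "Address": ["221B baker street, delhi", "mg road, bengaluru"],
})

df["unique code"] = (
    df["FirstName"].str[:3]                # first three characters of the name
    + "-"
    + df["CustomerID"].astype(str)         # numeric ID cast to string
    + "-"
    + df["Address"].str[-5:].str[::-1]     # last five characters, reversed
)

print(df["unique code"].tolist())  # ['RaH-101-ihled', 'Pri-102-urula']
```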

Output includes unique code column, e.g. RaH-101-ihled.


Assignments

STRING FUNCTION ASSIGNMENTS


⭐ Basic Cleaning

1. Whitespace Cleanup

  • Remove leading/trailing spaces from FirstName, LastName, Address, and Feedback.

2. Case Standardization

  • Convert FirstName to proper case (first letter capital).

  • Convert LastName to UPPERCASE.

  • Convert Address to lowercase.

  • Convert Feedback to swapcase().


⭐ Length & Validation Tasks

3. Character Count

  • Create a column NameLength = length of FirstName + LastName combined.

  • Create a column PhoneLength = number of digits in PhoneNumber.

4. Identify Invalid Phone Numbers

  • Return all rows where PhoneNumber length is not equal to 10.

  • Return all rows where PhoneNumber contains any non-numeric characters.


⭐ Replace Tasks

5. Address Cleaning

  • Replace “street” → “st.”

6. Remove Punctuation

  • Remove all !, ?, ., and , from Feedback.

7. Mask Phone Number

  • Show only the last 4 digits and mask the rest with *. Hint: you can use a function with pandas here.


⭐ Split & Extract Tasks

8. Address Splitting

Split Address into:

  • Area

  • City

Create two new columns.

9. Extract City Initial

  • Create a column CityCode = first 3 letters of the city.


⭐ contains(), startswith(), endswith()

10. Filter on Feedback

Return rows where Feedback:

  • contains “refund” OR “damaged”

  • contains BOTH words “not” and “satisfied”

  • contains the word “delivered” but not “late”

11. Filter on Address

  • Show addresses starting with ‘park’.

  • Show addresses ending with ‘mumbai’.

  • Show addresses that contain a number.


⭐ Concatenation Tasks

12. Full Name Creation

Create a column:

  • FullName = FirstName + " " + LastName

13. Create a Customer Code

Format:


⭐ Indexing & Slicing

14. Extract Codes

  • Create AreaCode = first character of Address.

  • Create ShortName = first 3 characters of FirstName.

15. Reverse Manipulations

  • Reverse LastName.

  • Reverse full Address string.
