Chapter 08 Strings

8.1 INTRODUCTION

In Chapter 5, we studied that a sequence is an ordered collection of items in which each item is indexed by an integer. The following sequence data types in Python were also briefly introduced in Chapter 5:

  • Strings
  • Lists
  • Tuples

“The great thing about a computer notebook is that no matter how much you stuff into it, it doesn’t get bigger or heavier.”

  - Bill Gates

Another data type, Dictionary, which falls under the category of mapping, was also introduced in Chapter 5. In this chapter, we will go through strings in detail. Lists will be covered in Chapter 9, whereas tuples and dictionaries will be discussed in Chapter 10.

8.2 STRINGS

A string is a sequence made up of one or more Unicode characters. Here a character can be a letter, digit, whitespace or any other symbol. A string can be created by enclosing one or more characters in single, double or triple quotes.

Example 8.1

>>> str1 = 'Hello World!'
>>> str2 = "Hello World!"
>>> str3 = """Hello World!"""
>>> str4 = '''Hello World!'''

str1, str2, str3 and str4 are all string variables having the same value 'Hello World!'. Values stored in str3 and str4 can be extended to multiple lines using triple quotes, as can be seen in the following example:

>>> str3 = """Hello World!
welcome to the world of Python"""
>>> str4 = '''Hello World!
welcome to the world of Python'''

Python does not have a character data type. A string of length one is considered a character.
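As a small illustration of this point, the sketch below checks that a single character taken out of a string is itself a string of length one. The variable names `word` and `first` are chosen only for this example.

```python
# A one-character string is still of type str; there is no separate char type.
word = "Python"
first = word[0]

print(type(word))    # <class 'str'>
print(type(first))   # <class 'str'>
print(len(first))    # 1
```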

8.2.1 Accessing Characters in a String

Each individual character in a string can be accessed using a technique called indexing. The index specifies the character to be accessed in the string and is written in square brackets ([ ]). The index of the first character (from the left) in the string is 0 and that of the last character is $n-1$, where $n$ is the length of the string. If we give an index value outside the valid range, we get an IndexError. The index must be an integer (positive, zero or negative).

#initializes a string str1
>>> str1 = 'Hello World!'
#gives the first character of str1
>>> str1[0]
'H'
#gives seventh character of str1
>>> str1[6]
'W'
#gives last character of str1
>>> str1[11]
'!'
#gives error as index is out of range
>>> str1[15]
IndexError: string index out of range

The index can also be an expression including variables and operators but the expression must evaluate to an integer.

#an expression resulting in an integer index
#so gives the 7th character of str1 (index 6)
>>> str1[2+4]
'W'
#gives error as index must be an integer
>>> str1[1.5]
TypeError: string indices must be integers

Python allows an index value to be negative also. Negative indices are used when we want to access the characters of the string from right to left. Starting from right hand side, the first character has the index as -1 and the last character has the index $-\mathrm{n}$ where $\mathrm{n}$ is the length of the string. Table 8.1 shows the indexing of characters in the string ‘Hello World!’ in both the cases, i.e., positive and negative indices.

>>> str1[-1]   #gives first character from right
'!'
>>> str1[-12]  #gives last character from right
'H'

Table 8.1 Indexing of characters in string 'Hello World!'

Positive Indices    0    1    2    3    4    5    6    7    8    9   10   11
String              H    e    l    l    o         W    o    r    l    d    !
Negative Indices  -12  -11  -10   -9   -8   -7   -6   -5   -4   -3   -2   -1

The built-in function len() returns the length of the string that is passed to it as an argument. For example, the length of the string str1 = 'Hello World!' is 12.

#gives the length of the string str1
>>> len(str1)
12
#length of the string is assigned to n
>>> n = len(str1)
>>> print(n)
12
#gives the last character of the string
>>> str1[n-1]
'!'
#gives the first character of the string
>>> str1[-n]
'H'
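The relationship between the two kinds of index shown in Table 8.1 can be checked with a short sketch. It assumes only the example string from this section; the names `text` and `pairs` are illustrative.

```python
text = "Hello World!"
n = len(text)

# For every valid negative index -k, text[-k] is the same character
# as text[n - k], i.e., the positive index is the negative index plus n.
for k in range(1, n + 1):
    assert text[-k] == text[n - k]

# Each tuple holds (positive index, negative index, character).
pairs = [(i, i - n, text[i]) for i in range(n)]
print(pairs[0])    # (0, -12, 'H')
print(pairs[-1])   # (11, -1, '!')
```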

8.2.2 String is Immutable

A string is an immutable data type. It means that the contents of the string cannot be changed after it has been created. An attempt to do this would lead to an error.

>>> str1 = "Hello World!"
#if we try to replace character 'e' with 'a'
>>> str1[1] = 'a'
TypeError: 'str' object does not support item assignment
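Because a string cannot be modified in place, a changed version has to be built as a new string. The following sketch shows one possible way, using slices and concatenation; the names `greeting` and `changed` are made up for this example.

```python
greeting = "Hello World!"

# Item assignment is not allowed on a str, so the TypeError is caught here.
try:
    greeting[1] = "a"
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Build a new string from slices instead; the original stays untouched.
changed = greeting[:1] + "a" + greeting[2:]

print(assignment_failed)  # True
print(changed)            # Hallo World!
print(greeting)           # Hello World!
```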

8.3 STRING OPERATIONS

As we know, a string is a sequence of characters. Python allows certain operations on the string data type, such as concatenation, repetition, membership and slicing. These operations are explained in the following subsections with suitable examples.

8.3.1 Concatenation

To concatenate means to join. Python allows us to join two strings using the concatenation operator, which is denoted by the symbol +.

>>> str1 = 'Hello'     #First string
>>> str2 = 'World!'    #Second string
>>> str1 + str2        #Concatenated strings
'HelloWorld!'
#str1 and str2 remain same after this operation.
>>> str1
'Hello'
>>> str2
'World!'

8.3.2 Repetition

Python allows us to repeat the given string using the repetition operator, which is denoted by the symbol *.

#assign string 'Hello' to str1
>>> str1 = 'Hello'
#repeat the value of str1 2 times
>>> str1 * 2
'HelloHello'
#repeat the value of str1 5 times
>>> str1 * 5
'HelloHelloHelloHelloHello'

Note: str1 still remains the same after the use of the repetition operator.
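The two operators are often used together. The sketch below is an illustration only: it shows that + joins strings but not a string and a number, and uses * to draw an underline of matching length. The names `label`, `marks`, `line` and `underline` are assumptions made for this example.

```python
label = "Marks: "
marks = 95

# + joins two strings only; mixing str and int raises a TypeError.
try:
    line = label + marks
    mixed_types_failed = False
except TypeError:
    mixed_types_failed = True
    line = label + str(marks)   # convert the number to a string first

# * repeats a string; here it draws an underline as long as the text.
underline = "-" * len(line)

print(line)        # Marks: 95
print(underline)   # ---------
```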

8.3.3 Membership

Python has two membership operators, 'in' and 'not in'. The 'in' operator takes two strings and returns True if the first string appears as a substring in the second string, otherwise it returns False.

>>> str1 = 'Hello World!'
>>> 'W' in str1
True
>>> 'Wor' in str1
True
>>> 'My' in str1
False

The 'not in' operator also takes two strings and returns True if the first string does not appear as a substring in the second string, otherwise it returns False.

>>> str1 = 'Hello World!'
>>> 'My' not in str1
True
>>> 'Hello' not in str1
False
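Two details of the membership test are easy to overlook: the comparison is case-sensitive, and the empty string counts as a substring of every string. A minimal sketch, with `sentence` as an illustrative variable name:

```python
sentence = "Hello World!"

# The test is case-sensitive: 'world' is not the same as 'World'.
exact_case = "world" in sentence
ignoring_case = "world" in sentence.lower()

# The empty string is treated as a substring of any string.
empty_found = "" in sentence

print(exact_case)     # False
print(ignoring_case)  # True
print(empty_found)    # True
```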

8.3.4 Slicing

In Python, to access some part of a string, i.e., a substring, we use a technique called slicing. This is done by specifying an index range. Given a string str1, the slice operation str1[n:m] returns the part of the string str1 starting from index $n$ (inclusive) and ending at index $m$ (exclusive). In other words, str1[n:m] returns all the characters from str1[n] to str1[m-1]. When both indices lie within the string and $n \le m$, the number of characters in the substring is equal to the difference of the two indices, i.e., $(m-n)$.

>>> str1 = 'Hello World!'
#gives substring starting from index 1 to 4
>>> str1[1:5]
'ello'
#gives substring starting from 7 to 9
>>> str1[7:10]
'orl'
#index that is too big is truncated down to
#the end of the string
>>> str1[3:20]
'lo World!'
#first index > second index results in an
#empty '' string
>>> str1[7:2]
''

If the first index is not mentioned, the slice starts from index 0.

#gives substring from index 0 to 4
>>> str1[:5]
'Hello'

If the second index is not mentioned, the slicing is done till the end of the string.

#gives substring from index 6 to end
>>> str1[6:]
'World!'

The slice operation can also take a third index that specifies the 'step size'. For example, str1[n:m:k] means that every $k^{\text{th}}$ character is extracted from the string str1, starting from index $n$ and ending at index $m-1$. By default, the step size is one.

>>> str1[0:10:2]
'HloWr'
>>> str1[0:10:3]
'HlWl'

Negative indices can also be used for slicing.

#characters at index -6,-5,-4,-3 and -2 are
#sliced
>>> str1[-6:-1]
'World'

If we omit both the indices and give the step size as -1, the string is obtained in the reverse order.

#str1 string is obtained in the reverse order
>>> str1[::-1]
'!dlroW olleH'
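The slicing rules of this section can be gathered into one short sketch. It uses only the example string from the text; the variable names are illustrative. Note in particular that the length of a slice equals the difference of the indices only when both indices lie inside the string.

```python
text = "Hello World!"

part = text[1:5]           # indices 1, 2, 3 and 4
clipped = text[3:20]       # the end index is clipped to len(text)
empty = text[7:2]          # start lies after end with the default step
every_second = text[::2]   # step size 2 over the whole string
reversed_text = text[::-1] # step size -1 walks from right to left

print(part, len(part))        # ello 4
print(clipped, len(clipped))  # lo World! 9
print(repr(empty))            # ''
print(every_second)           # HloWrd
print(reversed_text)          # !dlroW olleH
```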

8.4 TRAVERSING A STRING

We can access each character of a string, i.e., traverse the string, using a for loop or a while loop.

(A) String Traversal Using for Loop:

>>> str1 = 'Hello World!'
>>> for ch in str1:
        print(ch,end = '')
Hello World!    #output of for loop

In the above code, the loop starts from the first character of the string str1 and automatically ends when the last character is accessed.

(B) String Traversal Using while Loop:

>>> str1 = 'Hello World!'
>>> index = 0
#len(): a function to get length of string
>>> while index < len(str1):
        print(str1[index],end = '')
        index += 1

Hello World!    #output of while loop

Here the while loop runs as long as the condition index < len(str1) is True, where index varies from 0 to len(str1)-1.
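When both the position and the character are needed during traversal, the built-in enumerate() function supplies them together. The sketch below is one possible use, collecting the positions of the letter 'o'; the names `text` and `positions` are chosen for this example.

```python
text = "Hello World!"
positions = []

# enumerate() yields (index, character) pairs while traversing the string.
for index, ch in enumerate(text):
    if ch == "o":
        positions.append(index)

print(positions)   # [4, 7]
```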

8.5 STRING METHODS AND BUILT-IN FUNCTIONS

Python has several built-in functions and string methods that allow us to work with strings. Table 8.2 describes some of the commonly used ones for string manipulation.

Table 8.2 Built-in functions for string manipulations

Method: len()
Description: Returns the length of the given string.
Example:
>>> str1 = 'Hello World!'
>>> len(str1)
12

Method: title()
Description: Returns the string with the first letter of every word in uppercase and the rest in lowercase.
Example:
>>> str1 = 'hello WORLD!'
>>> str1.title()
'Hello World!'

Method: lower()
Description: Returns the string with all uppercase letters converted to lowercase.
Example:
>>> str1 = 'hello WORLD!'
>>> str1.lower()
'hello world!'

Method: upper()
Description: Returns the string with all lowercase letters converted to uppercase.
Example:
>>> str1 = 'hello WORLD!'
>>> str1.upper()
'HELLO WORLD!'

Method: count(str, start, end)
Description: Returns the number of times the substring str occurs in the given string. If we do not give the start index and end index, then searching starts from index 0 and ends at the length of the string.
Example:
>>> str1 = 'Hello World! Hello Hello'
>>> str1.count('Hello',12,25)
2
>>> str1.count('Hello')
3

Method: find(str, start, end)
Description: Returns the index of the first occurrence of the substring str in the given string. If we do not give start and end, then searching starts from index 0 and ends at the length of the string. If the substring is not present in the given string, then the function returns -1.
Example:
>>> str1 = 'Hello World! Hello Hello'
>>> str1.find('Hello',10,20)
13
>>> str1.find('Hello',15,25)
19
>>> str1.find('Hello')
0
>>> str1.find('Hee')
-1

Method: index(str, start, end)
Description: Same as find() but raises an exception if the substring is not present in the given string.
Example:
>>> str1 = 'Hello World! Hello Hello'
>>> str1.index('Hello')
0
>>> str1.index('Hee')
ValueError: substring not found

Method: endswith()
Description: Returns True if the given string ends with the supplied substring, otherwise returns False.
Example:
>>> str1 = 'Hello World!'
>>> str1.endswith('World!')
True
>>> str1.endswith('!')
True
>>> str1.endswith('lde')
False

Method: startswith()
Description: Returns True if the given string starts with the supplied substring, otherwise returns False.
Example:
>>> str1 = 'Hello World!'
>>> str1.startswith('He')
True
>>> str1.startswith('Hee')
False

Method: isalnum()
Description: Returns True if all characters of the given string are either alphabets or numeric. If whitespace or special symbols are part of the given string, or the string is empty, it returns False.
Example:
>>> str1 = 'HelloWorld'
>>> str1.isalnum()
True
>>> str1 = 'HelloWorld2'
>>> str1.isalnum()
True
>>> str1 = 'HelloWorld!!'
>>> str1.isalnum()
False

Method: islower()
Description: Returns True if the string is non-empty and has all lowercase alphabets, or has at least one character as a lowercase alphabet and the rest are non-alphabet characters.
Example:
>>> str1 = 'hello world!'
>>> str1.islower()
True
>>> str1 = 'hello 1234'
>>> str1.islower()
True
>>> str1 = 'hello ??'
>>> str1.islower()
True
>>> str1 = '1234'
>>> str1.islower()
False
>>> str1 = 'Hello World!'
>>> str1.islower()
False

Method: isupper()
Description: Returns True if the string is non-empty and has all uppercase alphabets, or has at least one character as an uppercase alphabet and the rest are non-alphabet characters.
Example:
>>> str1 = 'HELLO WORLD!'
>>> str1.isupper()
True
>>> str1 = 'HELLO 1234'
>>> str1.isupper()
True
>>> str1 = 'HELLO ??'
>>> str1.isupper()
True
>>> str1 = '1234'
>>> str1.isupper()
False
>>> str1 = 'Hello World!'
>>> str1.isupper()
False

Method: isspace()
Description: Returns True if the string is non-empty and all characters are white spaces (blank, tab, newline, carriage return).
Example:
>>> str1 = ' \n \t \r'
>>> str1.isspace()
True
>>> str1 = 'Hello'
>>> str1.isspace()
False

Method: istitle()
Description: Returns True if the string is non-empty and in title case, i.e., the first letter of every word in the string is in uppercase and the rest are in lowercase.
Example:
>>> str1 = 'Hello World!'
>>> str1.istitle()
True
>>> str1 = 'hello World!'
>>> str1.istitle()
False

Method: lstrip()
Description: Returns the string after removing the spaces only on the left of the string.
Example:
>>> str1 = '   Hello World!   '
>>> str1.lstrip()
'Hello World!   '

Method: rstrip()
Description: Returns the string after removing the spaces only on the right of the string.
Example:
>>> str1 = '   Hello World!   '
>>> str1.rstrip()
'   Hello World!'

Method: strip()
Description: Returns the string after removing the spaces both on the left and the right of the string.
Example:
>>> str1 = '   Hello World!   '
>>> str1.strip()
'Hello World!'

Method: replace(oldstr, newstr)
Description: Returns the string with all occurrences of the old string replaced by the new string.
Example:
>>> str1 = 'Hello World!'
>>> str1.replace('o','*')
'Hell* W*rld!'
>>> str1 = 'Hello World!'
>>> str1.replace('World','Country')
'Hello Country!'
>>> str1 = 'Hello World! Hello'
>>> str1.replace('Hello','Bye')
'Bye World! Bye'

Method: join()
Description: Returns a string in which the characters of the given string have been joined by a separator.
Example:
>>> str1 = ('HelloWorld!')
>>> str2 = '-'    #separator
>>> str2.join(str1)
'H-e-l-l-o-W-o-r-l-d-!'

Method: partition()
Description: Partitions the given string at the first occurrence of the substring (separator) and returns the string partitioned into three parts:
1. Substring before the separator
2. Separator
3. Substring after the separator
If the separator is not found in the string, it returns the whole string itself and two empty strings.
Example:
>>> str1 = 'India is a Great Country'
>>> str1.partition('is')
('India ', 'is', ' a Great Country')
>>> str1.partition('are')
('India is a Great Country', '', '')

Method: split()
Description: Returns a list of words delimited by the specified substring. If no delimiter is given, then the words are separated by whitespace.
Example:
>>> str1 = 'India is a Great Country'
>>> str1.split()
['India', 'is', 'a', 'Great', 'Country']
>>> str1 = 'India is a Great Country'
>>> str1.split('a')
['Indi', ' is ', ' Gre', 't Country']
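The difference between find() and index() matters when a substring may be absent. The sketch below is an illustration built on those two methods: `find_all` is a hypothetical helper (not a built-in) that uses the start argument of find() to collect every occurrence, while the try/except block shows how the ValueError from index() can be handled.

```python
text = "Hello World! Hello Hello"

def find_all(haystack, needle):
    """Return the start index of every non-overlapping occurrence of needle."""
    positions = []
    if not needle:            # an empty needle would match everywhere
        return positions
    start = haystack.find(needle)
    while start != -1:        # find() signals "not found" with -1
        positions.append(start)
        start = haystack.find(needle, start + len(needle))
    return positions

# index() raises ValueError instead of returning -1, so it needs try/except.
try:
    missing = text.index("Hee")
except ValueError:
    missing = None

print(find_all(text, "Hello"))   # [0, 13, 19]
print(find_all(text, "Hee"))     # []
print(missing)                   # None
```

The number of positions found this way agrees with text.count('Hello'), since count() also counts non-overlapping occurrences.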

8.6 HANDLING STRINGS

In this section, we will learn about user defined functions in Python to perform different operations on strings.

Program 8-1 Write a program with a user defined function to count the number of times a character (passed as argument) occurs in the given string.

#Program 8-1
#Function to count the number of times a character occurs in a
#string
def charCount(ch,st):
    count = 0
    for character in st:
        if character == ch:
            count += 1
    return count
#end of function

st = input("Enter a string: ")
ch = input("Enter the character to be searched: ")
count = charCount(ch,st)
print("Number of times character",ch,"occurs in the string is:",count)

Output:

Enter a string: Today is a Holiday
Enter the character to be searched: a
Number of times character a occurs in the string is: 3
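For comparison, the same result can be obtained with the count() method from Table 8.2. The sketch below restates the counting loop under the illustrative name `char_count` and uses a fixed sample string instead of user input, so that both results can be checked side by side.

```python
def char_count(ch, st):
    """Count how many times the character ch occurs in the string st."""
    count = 0
    for character in st:
        if character == ch:
            count += 1
    return count

sample = "Today is a Holiday"

print(char_count("a", sample))   # 3
print(sample.count("a"))         # 3, the built-in method agrees
```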

Program 8-2 Write a program with a user defined function with string as a parameter which replaces all vowels in the string with ‘*’.

#Program 8-2
#Function to replace all vowels in the string with '*'
def replaceVowel(st):
    #create an empty string
    newstr = ''
    for character in st:
        #check if next character is a vowel
        if character in 'aeiouAEIOU':
            #Replace vowel with *
            newstr += '*'
        else:
            newstr += character
    return newstr
#end of function
st = input("Enter a String: ")
st1 = replaceVowel(st)
print("The original String is:",st)
print("The modified String is:",st1)

Output:

Enter a String: Hello World
The original String is: Hello World
The modified String is: H*ll* W*rld
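An alternative formulation, offered only as a sketch, builds the result with the join() method from Table 8.2 rather than repeated concatenation. The function name `replace_vowel_join` is an assumption made for this example.

```python
def replace_vowel_join(st):
    """Return a copy of st in which every vowel is replaced with '*'."""
    # Each character is either kept or swapped for '*', then all the
    # pieces are joined together with an empty separator.
    return "".join("*" if ch in "aeiouAEIOU" else ch for ch in st)

print(replace_vowel_join("Hello World"))   # H*ll* W*rld
```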

Program 8-3 Write a program to input a string from the user and print it in the reverse order without creating a new string.

#Program 8-3
#Program to display string in reverse order
st = input("Enter a string: ")
for i in range(-1,-len(st)-1,-1):
    print(st[i],end='')

Output:

Enter a string: Hello World
dlroW olleH

Program 8-4 Write a program which reverses a string passed as parameter and stores the reversed string in a new string. Use a user defined function for reversing the string.

#Program 8-4
#Function to reverse a string
def reverseString(st):
    newstr = ''        #create a new string
    length = len(st)
    for i in range(-1,-length-1,-1):
        newstr += st[i]
    return newstr
#end of function
st = input("Enter a String: ")
st1 = reverseString(st)
print("The original String is:",st)
print("The reversed String is:",st1)

Output:

Enter a String: Hello World
The original String is: Hello World
The reversed String is: dlroW olleH
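The loop over negative indices produces the same result as slicing with a step size of -1 (Section 8.3.4). The sketch below compares the two, and also shows the built-in reversed() function combined with join(); the variable names are illustrative.

```python
text = "Hello World"

# Walk the negative indices -1, -2, ..., -len(text), as in the loop above.
by_loop = ""
for i in range(-1, -len(text) - 1, -1):
    by_loop += text[i]

by_slice = text[::-1]               # slicing with step size -1
by_join = "".join(reversed(text))   # reversed() yields characters right to left

print(by_loop)    # dlroW olleH
print(by_slice)   # dlroW olleH
print(by_join)    # dlroW olleH
```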

Program 8-5 Write a program using a user defined function to check if a string is a palindrome or not. (A string is called a palindrome if it reads the same backwards as forwards. For example, kanak is a palindrome.)

#Program 8-5
#Function to check if a string is palindrome or not
def checkPalin(st):
    i = 0
    j = len(st) - 1
    while(i <= j):
        if(st[i] != st[j]):
            return False
        i += 1
        j -= 1
    return True
#end of function
st = input("Enter a String: ")
result = checkPalin(st)
if result:
    print("The given string",st,"is a palindrome")
else:
    print("The given string",st,"is not a palindrome")

Output 1:

Enter a String: kanak
The given string kanak is a palindrome

Output 2:

Enter a String: computer
The given string computer is not a palindrome
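The comparison in Program 8-5 is case-sensitive, so 'Kanak' with a capital K would not be reported as a palindrome. If case should be ignored, one possible variant (a sketch, with the assumed name `is_palindrome`) converts the string to lowercase first and compares it with its reverse.

```python
def is_palindrome(st):
    """Return True if st reads the same in both directions, ignoring case."""
    normalised = st.lower()
    return normalised == normalised[::-1]

print(is_palindrome("kanak"))      # True
print(is_palindrome("Kanak"))      # True, because case is ignored
print(is_palindrome("computer"))   # False
```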

SUMMARY

  • A string is a sequence of characters enclosed in single, double or triple quotes.
  • Indexing is used for accessing individual characters within a string.
  • The first character has the index 0 and the last character has the index $n-1$, where $n$ is the length of the string. The negative indexing ranges from $-n$ to $-1$.
  • Strings in Python are immutable, i.e., a string cannot be changed after it is created.
  • The membership operator 'in' takes two strings and returns True if the first string appears as a substring in the second, else returns False. The membership operator 'not in' does the reverse.
  • Retrieving a portion of a string is called slicing. This can be done by specifying an index range. The slice operation str1[n:m] returns the part of the string str1 starting from index $n$ (inclusive) and ending at index $m$ (exclusive).
  • Each character of a string can be accessed either using a for loop or while loop.
  • There are many built-in functions for working with strings in Python.

EXERCISE

1. Consider the following string mySubject:

mySubject = "Computer Science"

What will be the output of the following string operations:

i. print(mySubject[0:len(mySubject)])
ii. print(mySubject[-7:-1])
iii. print(mySubject[::2])
iv. print(mySubject[len(mySubject)-1])
v. print(2*mySubject)
vi. print(mySubject[::-2])
vii. print(mySubject[:3] + mySubject[3:])
viii. print(mySubject.swapcase())
ix. print(mySubject.startswith('Comp'))
x. print(mySubject.isalpha())

2. Consider the following string myAddress:

myAddress = "WZ-1,New Ganga Nagar,New Delhi"

What will be the output of the following string operations:

i. print(myAddress.lower())
ii. print(myAddress.upper())
iii. print(myAddress.count('New'))
iv. print(myAddress.find('New'))
v. print(myAddress.rfind('New'))
vi. print(myAddress.split(','))
vii. print(myAddress.split(' '))
viii. print(myAddress.replace('New','Old'))
ix. print(myAddress.partition(','))
x. print(myAddress.index('Agra'))

PROGRAMMING PROBLEMS

1. Write a program to input line(s) of text from the user until enter is pressed. Count the total number of characters in the text (including white spaces), total number of alphabets, total number of digits, total number of special symbols and total number of words in the given text. (Assume that each word is separated by one space).

2. Write a user defined function to convert a string with more than one word into title case string where string is passed as parameter. (Title case means that the first letter of each word is capitalised)

3. Write a function deleteChar() which takes two parameters: one is a string and the other is a character. The function should create a new string after deleting all occurrences of the character from the string and return the new string.

4. Input a string having some digits. Write a function to return the sum of digits present in this string.

5. Write a function that takes a sentence as an input parameter where each word in the sentence is separated by a space. The function should replace each blank with a hyphen and then return the modified sentence.


