Strings


What are strings

A string is a sequence of characters that a variable can hold. For example, the variable name below holds the characters that make up a name.

name = "Ajay"

Both single quotes and double quotes are OK. So, both of the following versions are acceptable strings in Python.

name_1 = "Ajay"

#or

name_2 = 'Ajay'

Python automatically gives the variable the type str (string).

type(name)

str

What if you wanted to have either single quotes or double quotes as literals within the string ? For example, will this work ?

title = 'Ajay Tech's Tutorials'

No, it will not.

title = 'Ajay Tech's Tutorials'
                       ^
SyntaxError: invalid syntax

There are two ways to resolve this.

  • Use backslash to escape the single quote
title = 'Ajay Tech\'s Tutorials'
  • Use double quotes to enclose the string
title = "Ajay Tech's Tutorials"
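
As a small sketch of the two options above, the snippet below checks that both produce exactly the same string. The variable names (escaped, double_quoted, quoted_word) are only illustrative.

```python
# Two ways to write the same string containing an apostrophe.
escaped = 'Ajay Tech\'s Tutorials'        # backslash escapes the single quote
double_quoted = "Ajay Tech's Tutorials"   # double quotes need no escape

print(escaped == double_quoted)   # True

# The reverse also works: single quotes let double quotes appear as-is.
quoted_word = 'He said "hello"'
print(quoted_word)
```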

String Indexing

With indexing, you can really slice and dice a string any way you want. For example, to extract the first character in the string, just use

name[0]
'A'

Note that indexing starts at 0 and not at 1. This is why, in the string “Ajay Tech”, the substring “Tech” starts at index 5 and not at 6.
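
To make the "slice and dice" idea concrete, here is a short sketch of indexing and slicing on the string "Ajay Tech". The variable names are only illustrative.

```python
name = "Ajay Tech"

first = name[0]          # 'A'    (indexing starts at 0)
last = name[-1]          # 'h'    (negative indexes count from the end)
first_word = name[0:4]   # 'Ajay' (the end index 4 is not included)
second_word = name[5:]   # 'Tech' (from index 5 to the end of the string)

print(first, last, first_word, second_word)
```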

Next, we are going to look at a few of the most commonly used string methods.


Substrings

String variables have a number of built-in methods; think of them as helper methods. For example, to find out whether the string “Ajay Tech” contains the word “Tech”, use the built-in method find ( ).

name = "Ajay Tech"

name.find("Tech")

find ( ) returns the index at which the substring you are looking for starts, or -1 if the substring is not found. In our case, it is 5. But is it really 5? Only if we start counting from 0. And just like C or Java, all indexing starts with 0 in Python.
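
A quick sketch of how find ( ) behaves when the substring is present and when it is missing. It also shows the in operator, which is a common alternative when you only need a yes/no answer.

```python
name = "Ajay Tech"

print(name.find("Tech"))   # 5
print(name.find("tech"))   # -1, the search is case sensitive and nothing matches
print("Tech" in name)      # True
```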

Question – What is the index of the substring “hot” in the string “Python is a hot language”?
10
11
12

String split

Splitting strings is pretty useful. Think of splitting a name into its first, middle and last names.

name = "William Bradley Pitt" # BTW, that is Brad Pitt's real name.
name.split()

['William', 'Bradley', 'Pitt']

Using multiple variable assignments, you can split up the string and assign them to multiple variables like below.

first, middle, last = name.split()
middle

'Bradley'

By default, the split ( ) function splits the string on whitespace (for example, the space character). You can split by any other character. For example, if the name was like this,

name = "William,Bradley,Pitt"

you could ask the split ( ) function to split by comma (,) and not by space.

first,middle,last = name.split(",")
middle
'Bradley'
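
The sketch below compares the default behaviour of split ( ) with an explicit separator, and uses join ( ) to put the pieces back together. The variable names messy, csv_name and parts are made up for this example.

```python
messy = "  William   Bradley  Pitt "

# With no argument, runs of whitespace count as one separator
# and leading/trailing whitespace is ignored.
print(messy.split())        # ['William', 'Bradley', 'Pitt']

# With an explicit " " separator, every single space splits,
# so the result also contains empty strings.
print(messy.split(" "))

csv_name = "William,Bradley,Pitt"
parts = csv_name.split(",")
print(" ".join(parts))      # William Bradley Pitt
```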

We will talk more about functions and arguments in the next section.


String strip

String strip is another useful function. Data is often messy. For example, when users enter their name or address on a web form, they could include additional spaces at the beginning or at the end. This causes inconsistency when comparing strings. For example,

name_1 = "Ajay"
name_2 = " Ajay " # observe the leading and trailing space

name_1 == name_2

False
name_2 = name_2.strip()

name_1 == name_2
True
name_1 = "Ajay"
name_2 = " Ajay " # observe the leading and trailing space

name_2.strip()

name_1 == name_2

False
Question – What is the output of the code above?
True
False
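
The second comparison is False because strip ( ) does not change the string in place. Strings in Python are immutable, so strip ( ) returns a new string that you have to keep. A short sketch, which also shows lstrip ( ) and rstrip ( ) for trimming only one side (the variable cleaned is only illustrative):

```python
name_2 = " Ajay "

name_2.strip()                      # the result is discarded; name_2 is unchanged
print(name_2 == "Ajay")             # False

cleaned = name_2.strip()            # keep the returned string
print(cleaned == "Ajay")            # True

print(name_2.lstrip() == "Ajay ")   # True, only the leading space is removed
print(name_2.rstrip() == " Ajay")   # True, only the trailing space is removed
```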

String is alphanumeric

Another useful string function in Python checks whether a string is alphanumeric. Most websites restrict the user name to alphabetic or numeric characters and nothing else (no special characters). How do you check for it? Use the isalnum ( ) function.

name = " Ajay Tech !!" # contains non-alphanumeric characters: spaces and "!"
name.isalnum()

False

Another useful function checks whether the string is purely alphabetic (no numerics or special characters). In this case, use the isalpha ( ) function.

name = "ajaytech1"  # contains a numeric "1"
name.isalpha()
False
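
To compare the two checks side by side, here is a small sketch that runs isalpha ( ) and isalnum ( ) on a few sample strings. The sample values are made up for illustration.

```python
samples = ["ajaytech", "ajaytech1", "ajay tech", "ajay_tech!"]

# Map each sample to a pair: (isalpha result, isalnum result)
results = {s: (s.isalpha(), s.isalnum()) for s in samples}

for s, (alpha, alnum) in results.items():
    print(s, alpha, alnum)
```

Only "ajaytech" passes both checks. "ajaytech1" is alphanumeric but not alphabetic, and the strings with a space or special characters fail both.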

String Concatenation

Concatenating strings is pretty easy. You don’t even need a built-in function for this. Just use the + operator. When we visit Object Oriented Python later, we will understand exactly why this works (operator overloading). For now, just think of it as a shortcut to join strings together.

name_1 = "Ajay"
name_2 = "Tech"

name = name_1 + name_2
print ( name )
AjayTech
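
Note that + does not add a space between the strings. The sketch below shows three common ways to get "Ajay Tech" with a space in between; which one to use is mostly a matter of taste.

```python
name_1 = "Ajay"
name_2 = "Tech"

with_space = name_1 + " " + name_2      # add the space yourself
joined = " ".join([name_1, name_2])     # join a list of strings with a separator
formatted = f"{name_1} {name_2}"        # f-string (Python 3.6 and later)

print(with_space)   # Ajay Tech
print(joined)       # Ajay Tech
print(formatted)    # Ajay Tech
```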

Upper and Lower case

As discussed earlier, user data is often messy. For example, when users enter their name, it could vary in a number of ways:

  • initial capitals
  • leading and trailing spaces
  • capitals in between, etc.

So, it is common practice to process user data and store it in a standardized way. For example, the name “Ajay tech” would probably need to be stored internally as “AJAY TECH”, so that all future data comparisons can be done easily.

name = "Ajay tech"
name.upper()

'AJAY TECH'
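
A minimal sketch of this kind of standardization, assuming leading and trailing spaces should be ignored as well. The helper name normalize is hypothetical and not part of Python's string API.

```python
def normalize(raw):
    """Hypothetical helper: trim surrounding spaces and convert to upper case."""
    return raw.strip().upper()

stored = normalize("Ajay tech")
entered = normalize("  ajay TECH ")

print(stored)              # AJAY TECH
print(stored == entered)   # True
```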

It is OK to store it this way, but that is not how you would want to display it back, right? You would want to display it with initial capitals, as “Ajay Tech”. To do that, use the title ( ) function.

name_c = name.upper()
name_c

'AJAY TECH'
name_c.title()

'Ajay Tech'

The point we are trying to make here is that there are a variety of built-in string functions that you can use without having to re-invent the wheel every time you want some piece of functionality like this. Here are some other useful functions:

  • swapcase ( ) – swaps upper case to lower and vice versa
  • isupper ( ) – checks if all the characters in a string are upper case
  • islower ( ) – checks if all the characters in the string are lower case
  • capitalize ( ) – capitalizes the first letter and makes all the other characters lower case
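
A quick sketch of the functions listed above, applied to the string "Ajay tech":

```python
name = "Ajay tech"

print(name.swapcase())             # aJAY TECH
print(name.isupper())              # False
print(name.upper().isupper())      # True
print(name.islower())              # False
print(name.lower().islower())      # True
print("aJAY TECH".capitalize())    # Ajay tech
```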

Challenges

Reverse a string

For example, if you are given a string “Ajay”, you have to reverse the characters in the string and return a new string, “yajA”.
solution
text = input("enter string -")

i = len(text)
r_text = ""

while i > 0  :
    r_text = r_text + text[i-1]
    i = i - 1

print ( r_text)
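
The loop above is good practice for indexing. For comparison, here is a sketch of two shorter alternatives: a slice with a step of -1, and the built-in reversed ( ) combined with join ( ). The input is hard-coded here instead of read from the user.

```python
text = "Ajay"

r_text = text[::-1]                     # slice with step -1 walks the string backwards
also_reversed = "".join(reversed(text)) # reversed() yields the characters last to first

print(r_text)          # yajA
print(also_reversed)   # yajA
```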

Split a user entered string into its words and print each word with the initial letter capitalized

Take a string from the user as input and split the string into its words. Loop through each of the words and print them out on the console with the initial letter capitalized.
solution
text = input ( "enter string - ")

for word in text.split() :
    print ( word.title())

Convert a string to upper case

Take a string from the user as input and convert it to upper case. This has quite a lot of use cases. Forms on the web are a good example of this. When the user enters their name, it is converted to upper case and stored in the database for consistency.
solution
user_input = input ( "Enter your name - ")
print ( user_input.upper() )   # upper() converts every character; capitalize() would only change the first

Check if all the characters in a string are alpha-numeric

Take a string from the user as input and verify that all the characters in the string are alphanumeric. User IDs for websites are a good example of this. We don’t want strange characters in the user ID, right? We want them to be either alphabetic or numeric.
solution
user_input = input ( "Choose a user id - ")
if not user_input.isalnum() :
    print ( "Make sure all the characters are either alphabetic or numeric")

Find the greatest of 3 numbers

Without using a data structure (for example, a list), find the greatest of 3 numbers. This will let you practice your if-else skills.
solution
print ( "Enter 3 numbers - ")

first   = int( input ( "Enter first number - ") )
second  = int( input ( "Enter second number - ") )
third   = int( input ( "Enter third number - ") )

if first > second :
    greater = first
else :
    greater = second

if third > greater :
    print ("The largest number is - ", third)
else :
    print ( "The largest number is - ", greater)

Find the number of occurrences of a character in a user entered string

You will have to traverse a string in a loop and increment a counter to find the number of occurrences. Do not use the count() function. Check for both capital or small letters.
solution
s = input ( "Enter the string to be searched - ").upper()
a = input ( "Enter the alphabet to be searched - ").upper()

counter = 0

for ch in s :                  # loop over the characters directly
    if ch == a :
        counter = counter + 1

print ("The character", a, "occurs", counter, "times")

# Using the count() function - This becomes just a one liner.
# print ("The character", a, "occurs", s.count(a), "times")
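
The same logic can be wrapped in a function so that it is easy to try without typing input. This is only a sketch; the function name count_char and the sample string are made up for illustration.

```python
def count_char(text, char):
    """Count occurrences of char in text, ignoring case, without using count()."""
    target = char.upper()
    counter = 0
    for ch in text.upper():
        if ch == target:
            counter += 1
    return counter

occurrences = count_char("Python is a hot language", "a")
print(occurrences)   # 3
```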

Find the number of occurrences of lower case and upper case characters in a string

In a user entered string, find the number of lower case and upper case characters.
solution
s = input ( "Enter the string to be searched - ")

counter_l = 0   # counter for lower case
counter_u = 0   # counter for upper case

for ch in s :
    if ch.islower() :
        counter_l = counter_l + 1
    elif ch.isupper() :
        counter_u = counter_u + 1

print ( "The number of upper case characters is ", counter_u)
print ( "The number of lower case characters is ", counter_l)
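
Once the loop version is clear, the same counts can be written more compactly with sum ( ) and a generator expression. A sketch with a hard-coded sample string instead of user input:

```python
s = "Ajay Tech 123"

# Add 1 for every character that passes the check; digits and spaces pass neither.
counter_l = sum(1 for ch in s if ch.islower())
counter_u = sum(1 for ch in s if ch.isupper())

print("The number of upper case characters is", counter_u)   # 2
print("The number of lower case characters is", counter_l)   # 6
```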

Python Strings Cheat sheet

[Image: cheat sheet of some of the most used Python string methods]