Python Regular Expressions, RegEx Functions/methods, Metacharacters, Special Sequences, Search & Replace, and Matching Versus Searching.
Python Regular Expressions
What is a Regular Expression?
A Regular Expression (RegEx or RE) in a programming language is a special text string used for describing a search pattern. It is extremely useful for extracting information from text such as code, files, logs, spreadsheets, or even documents.
Python has a built-in package called re, which can be used to work with Regular Expressions.
import re
We use RegEx functions/methods, Metacharacters, and Special Sequences for creating Regular Expressions.
I. RegEx Functions/Methods
The re module provides users a variety of functions to search for a pattern in a particular string.
findall – It returns a list containing all matches
search – It returns a Match object if there is a match anywhere in the string
split – It returns a list where the string has been split at each match
sub – It replaces one or many matches with a string
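The four functions listed above can be compared side by side on one string. This is only a small sketch; the variable names are chosen here for illustration and are not part of the re module.

```python
import re

text = "at what time?"

all_matches = re.findall("at", text)   # list of every non-overlapping match
first_match = re.search("at", text)    # Match object for the first match, or None
pieces = re.split(r"\s", text)         # list of parts, split at each white-space character
replaced = re.sub(r"\s", "_", text)    # new string with each white-space character replaced

print(all_matches)
print(first_match.start())
print(pieces)
print(replaced)
```

Each function takes the pattern first and the string to examine afterwards; re.sub() additionally takes the replacement text as its second argument.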
1. re.findall()
The re.findall() function returns a list of strings containing all matches of the specified pattern.
Example:
import re
mystr = "at what time?"
match = re.findall("at", mystr)
print(match)  # ['at', 'at']
Note: The above example will return a list of all the instances of the substring at in the given string.
2. re.search()
The re.search() function scans the string and returns a Match object for the first match it finds. If there is no match, it returns None.
Example:
import re
string = "at what time?"
match = re.search("at", string)
if match:
    print("String is found at:", match.start())
else:
    print("String is not found!")
Note: The start() method of the Match object returns the index at which the match begins.
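Besides start(), a Match object also offers group() and end(). The sketch below wraps re.search() in a small helper to show them together; the helper name describe_match is hypothetical and used only for this illustration.

```python
import re

def describe_match(pattern, text):
    """Return (matched_text, start, end) for the first match, or None."""
    match = re.search(pattern, text)
    if match is None:
        # re.search() found nothing, so there is no Match object to inspect
        return None
    return match.group(), match.start(), match.end()

found = describe_match("what", "at what time?")
missing = describe_match("when", "at what time?")

print(found)
print(missing)
```

Checking for None before calling any method avoids an AttributeError when the pattern does not occur in the string.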
3. re.split()
The re.split() function returns a list where the string has been split at each match.
Example:
Split at each white-space character:
import re
txt = "India Country"
x = re.split(r"\s", txt)
print(x)  # ['India', 'Country']
4. re.sub()
The re.sub() function replaces every match of a pattern with another string and returns the resulting string.
Example:
import re
string = "at what time?"
match = re.sub(r"\s", ",", string)
print(match)  # at,what,time?
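re.sub() can also be limited to the first few matches through its optional count argument. A short sketch, with variable names picked for this example only:

```python
import re

text = "at what time?"

# Default: every white-space character is replaced
every_space = re.sub(r"\s", ",", text)

# count=1: only the first white-space character is replaced
first_space_only = re.sub(r"\s", ",", text, count=1)

print(every_space)
print(first_space_only)
```

With count left at its default of 0, all matches are replaced.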
II. Metacharacters
Metacharacters are characters with a special meaning.
Examples:
1. [] – A set of characters
import re
txt = "I am a Python Programmer"
# Find all lowercase characters alphabetically between "a" and "e":
x = re.findall("[a-e]", txt)
print(x)
2. \ – Signals a special sequence
import re
txt = "Is it 20 Rs. or 30 Rs."
# Find all digit characters:
x = re.findall(r"\d", txt)
print(x)  # ['2', '0', '3', '0']
3. ^ – Starts with
import re
txt = "My Country is India"
# Check if the string starts with 'My':
x = re.findall("^My", txt)
if x:
    print("Yes, the string starts with 'My'")
else:
    print("No match")
4. $ – Ends with
import re
txt = "My Country is India"
# Check if the string ends with 'India':
x = re.findall("India$", txt)
if x:
    print("Yes, the string ends with 'India'")
else:
    print("No match")
5. | – Either or
import re
txt = "India is my country and I love my country"
# Check if the string contains either "my" or "hate":
x = re.findall("my|hate", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")
Output:
['my', 'my']
Yes, there is at least one match!
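Metacharacters can be combined in a single pattern. The sketch below joins ^ and $ with | to look at both ends of a string at once; the variable names are illustrative only.

```python
import re

txt = "My Country is India"

# "^My" matches at the start, "India$" matches at the end, "|" accepts either
edges = re.findall(r"^My|India$", txt)

# Swapping the words means neither alternative can match
swapped = re.findall(r"^India|My$", txt)

print(edges)
print(swapped)
```

Because | separates whole alternatives, the pattern reads as "^My" or "India$", not as "^" followed by a choice between two words.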
III. Special Sequences
A special sequence is a \ followed by one character and has a special meaning.
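Since special sequences start with a backslash, it is usually safer to write patterns as raw strings (with an r prefix), so that Python does not interpret the backslash itself. The sketch below illustrates the difference; "\b" is used only as a well-known escape that Python changes, and it is not covered further in this tutorial.

```python
import re

# "\\d" and r"\d" are the same two-character pattern: a backslash and a "d"
same_pattern = ("\\d" == r"\d")

# Without the raw prefix, "\b" becomes one backspace character;
# with it, the backslash and the "b" are both kept
backspace_len = len("\b")
raw_len = len(r"\b")

txt = "Is it 20 Rs. or 30 Rs."
digits = re.findall(r"\d", txt)

print(same_pattern, backspace_len, raw_len)
print(digits)
```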
1. \A
Returns a match if the specified characters are at the beginning of the string
import re
txt = "Python is a Lightweight Programming Language"
# Check if the string starts with "Python":
x = re.findall(r"\APython", txt)
print(x)
if x:
    print("Yes, there is a match!")
else:
    print("No match")
2. \D
It returns a match at every character that is NOT a digit.
import re
txt = "India@123"
# Return a match at every non-digit character:
x = re.findall(r"\D", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")
3. \d
It returns a match at every digit character (numbers from 0-9).
import re
txt = "India@123"
# Check if the string contains any digits (numbers from 0-9):
x = re.findall(r"\d", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")
4. \S
It returns a match at every character that is NOT a white-space character.
import re
txt = "A B C"
# Return a match at every non-white-space character:
x = re.findall(r"\S", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")
5. \s
It returns a match at every white-space character.
import re
txt = "I am a Python Programmer"
# Return a match at every white-space character:
x = re.findall(r"\s", txt)
print(x)
if x:
    print("Yes, there is at least one match!")
else:
    print("No match")
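The upper-case and lower-case sequences are opposites of each other, which makes them handy with re.sub(). A small sketch using the same sample string as above; the variable names are illustrative.

```python
import re

txt = "India@123"

# Remove every non-digit character, keeping only the digits
only_digits = re.sub(r"\D", "", txt)

# Remove every digit character, keeping everything else
no_digits = re.sub(r"\d", "", txt)

print(only_digits)
print(no_digits)
```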