Python Find Position of a Regex Match

In this article, we will see how to locate the position of a regex match in a string using the start(), end(), and span() methods of the Python re.Match object.

We will cover the following three scenarios:

Get the start and end position of a regex match in a string
Find the indexes of all regex matches
Get the positions and values of each match

Note: Python's re module offers the search(), match(), and finditer() methods to match a regex pattern. search() and match() return a Match object if a match is found (and None otherwise), while finditer() returns an iterator that yields Match objects. Use the Match object's start(), end(), and span() methods to get the position of the matching string.

These Match object methods are used to access the index positions of the matching string.

start() returns the starting position of the match
end() returns the ending position of the match (the index just past the last matched character)
span() returns a tuple containing the (start, end) positions of the match
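As a minimal sketch of how these three methods relate, the snippet below checks that span() carries the same values as start() and end(), and that the end index is exclusive. The sample string is a hypothetical one chosen for illustration.

```python
import re

text = "Order 66 shipped"  # hypothetical sample string
m = re.search(r"\d+", text)

if m is not None:
    start, end = m.span()  # same values as m.start() and m.end()
    assert (start, end) == (m.start(), m.end())
    # end is exclusive: the last matched character sits at end - 1
    last_char = text[end - 1]
    length = end - start
    print(start, end, last_char, length)
    # Output 6 8 6 2
```

Because the end index is exclusive, end - start is always the length of the matched text.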

Example to get the position of a regex match
- Access matching string using start(), and end()
Find the indexes of all regex matches
- Find all the indexes of all the occurrences of a word in a string
Points to be remembered while using the start() method

Example to get the position of a regex match

In this example, we will search for any 4-digit number inside the string. To do this, we must first write the regular expression pattern.

Pattern to match any 4-digit number: \d{4}

Steps:

Search the pattern using the search() method.
Next, we can extract the match value using group()
Now, we can use the start() and end() methods to get the starting and ending index of the match.
Also, we can use the span() method to get both the start and end indexes in a single tuple.

import re

target_string = "Abraham Lincoln was born on February 12, 1809,"
# \d to match digits
res = re.search(r'\d{4}', target_string)
# match value
print(res.group()) 
# Output 1809

# start and end position
print(res.span())
# Output (41, 45)

# start position
print(res.start())
# Output 41

# end position
print(res.end())
# Output 45
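The example above assumes the pattern is found. When search() finds nothing it returns None, and calling start() on None raises an AttributeError. A small guard can be sketched as follows; find_span is a hypothetical helper name used only for illustration.

```python
import re

def find_span(pattern, text):
    """Return (start, end) of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.span() if m is not None else None

print(find_span(r"\d{4}", "Abraham Lincoln was born on February 12, 1809,"))
# Output (41, 45)

print(find_span(r"\d{4}", "no four-digit number here"))
# Output None
```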

Access matching string using start(), and end()

Now, you can save these positions and use them whenever you want to retrieve the matching string from the target string. We can use string slicing to access the matching string directly, using the index positions obtained from the start() and end() methods.

Example

import re

target_string = "Abraham Lincoln was born on February 12, 1809,"
res = re.search(r'\d{4}', target_string)
print(res.group())
# Output 1809

# save start and end positions
start = res.start()
end = res.end()
print(target_string[start:end])
# Output 1809
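The saved positions are also useful for working with the text around the match. As a rough sketch, the snippet below splits the target string into the parts before, inside, and after the match and wraps the match in brackets. The bracket markers are just an illustrative choice.

```python
import re

target_string = "Abraham Lincoln was born on February 12, 1809,"
res = re.search(r"\d{4}", target_string)

if res is not None:
    start, end = res.span()
    before = target_string[:start]      # text up to the match
    matched = target_string[start:end]  # the match itself
    after = target_string[end:]         # text following the match
    highlighted = f"{before}[{matched}]{after}"
    print(highlighted)
    # Output Abraham Lincoln was born on February 12, [1809],
```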

Find the indexes of all regex matches

Suppose you want to find all matches of a regular expression in Python and, apart from the match values, you also want the indexes of every match. In such cases, we need to use the finditer() method of the Python re module instead of findall().

The findall() method returns the matched strings as a Python list, so the position information is lost. The finditer() method, on the other hand, returns an iterator yielding a Match object for each match of the regex pattern. We can then iterate over the Match objects to extract all matches along with their positions.
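To make the difference concrete, here is a short side-by-side sketch using a digit pattern chosen only for illustration: findall() gives back plain strings, while finditer() gives Match objects from which we can read span().

```python
import re

text = "Jessa scored 56 and Kelly scored 65 marks"

# findall() returns only the matched strings
values = re.findall(r"\d+", text)
print(values)
# Output ['56', '65']

# finditer() yields Match objects, so the positions are available
spans = [m.span() for m in re.finditer(r"\d+", text)]
print(spans)
# Output [(13, 15), (33, 35)]
```

That is the whole contrast; the full example with five-letter words follows.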

In this example, we will find all 5-letter words inside the following string and also print their start and end positions.

import re

target_string = "Jessa scored 56 and Kelly scored 65 marks"
count = 0
# \w matches any word character (letter, digit, or underscore)
# \b indicates a word boundary
# {5} means exactly five word characters
for match in re.finditer(r'\b\w{5}\b', target_string):
    count += 1
    print("match", count, match.group(), "start index", match.start(), "End index", match.end())

Output

match 1 Jessa start index 0 End index 5
match 2 Kelly start index 20 End index 25
match 3 marks start index 36 End index 41
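If you want to keep the positions and values of each match instead of printing them, one possible approach is to collect them into a list of tuples. The sketch below reuses the same pattern and string; the (value, start, end) tuple layout is just one convenient choice, and enumerate() replaces the manual counter.

```python
import re

target_string = "Jessa scored 56 and Kelly scored 65 marks"

# one (value, start, end) tuple per match
matches = [
    (m.group(), m.start(), m.end())
    for m in re.finditer(r"\b\w{5}\b", target_string)
]

for number, (value, start, end) in enumerate(matches, start=1):
    print("match", number, value, "start index", start, "End index", end)
```

This prints the same three lines as the example above, and the matches list remains available for later use.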

Find all the indexes of all the occurrences of a word in a string

Example

import re

target_string = "Emma knows Python. Emma knows ML and AI"
# find all occurrences of the word "emma" (case-insensitive)
# and print the index of each occurrence
cnt = 0
for match in re.finditer(r'emma', target_string, re.IGNORECASE):
    cnt += 1
    print("match", cnt, "start index", match.start(), "End index", match.end())

Output

match 1 start index 0 End index 4
match 2 start index 19 End index 23
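Note that a plain pattern such as emma also matches inside longer words. If only whole-word occurrences are wanted, one option is to wrap the word in \b word boundaries and pass it through re.escape(). The sketch below assumes the word starts and ends with a word character; find_word_spans is a hypothetical helper name, and the sample string is extended with "Emmanuel" for illustration.

```python
import re

def find_word_spans(word, text, ignore_case=True):
    """Return a list of (start, end) spans for whole-word occurrences."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape() keeps any special characters in the word literal
    pattern = r"\b" + re.escape(word) + r"\b"
    return [m.span() for m in re.finditer(pattern, text, flags)]

text = "Emma knows Python. Emma knows ML and AI. Emmanuel does not."

print(find_word_spans("emma", text))
# Output [(0, 4), (19, 23)]

# without word boundaries, the start of "Emmanuel" is matched as well
print(len(re.findall(r"emma", text, re.IGNORECASE)))
# Output 3
```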

Points to be remembered while using the start() method

Since the re.match() method only checks whether the regular expression matches at the start of the string, the start() of a match returned by re.match() will always be zero.

However, the re.search() method scans through the entire target string and looks for the first location where the pattern matches, so the match may not start at zero in that case.

Now let’s match any ten consecutive word characters (\w) in the target string using both the match() and search() methods.

Example

import re

target_string = "Emma is a basketball player who was born on June 17, 1993"
# using match(): the pattern must match at the start of the string
result = re.match(r"\w{10}", target_string)
# "Emma" is only four characters long, so there is no match
print("Match: ", result) # None

# using search(): the pattern may match anywhere in the string
result = re.search(r"\w{10}", target_string)
# printing match
print("Match value: ", result.group()) # basketball
print("Match starts at", result.start()) # index 10
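One nuance, offered as an aside: the always-zero rule applies to the module-level re.match() function. A compiled pattern's match() method accepts an optional pos argument that tells it where to start matching, and start() is still reported relative to the whole string. The sketch below uses the same target string; the choice of pos=10 is an assumption made for illustration.

```python
import re

target_string = "Emma is a basketball player who was born on June 17, 1993"
pattern = re.compile(r"\w{10}")

# module-level style match from the beginning: no match here
print(pattern.match(target_string))
# Output None

# start matching at index 10 instead of index 0
result = pattern.match(target_string, 10)
print(result.group(), result.start())
# Output basketball 10
```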

