Python Regex Match: A Comprehensive guide for pattern matching

Updated on: April 2, 2021

Python’s re.match() method looks for the regex pattern only at the beginning of the target string. It returns a match object if a match is found; otherwise, it returns None.

In this article, you will learn how to match a regex pattern inside the target string using the match(), search(), and findall() methods of the re module.

The re.match() method starts matching a regex pattern from the very first character of the text, and if a match is found, it returns a re.Match object. We can then use the re.Match object to extract the matching string.

After reading this article, you will be able to perform the following regex pattern matching operations in Python.

Operation | Meaning
re.match(pattern, str) | Matches pattern only at the beginning of the string
re.search(pattern, str) | Matches pattern anywhere in the string. Returns only the first match
re.search(pattern$, str) | Dollar ($) matches pattern at the end of the string
re.findall(pattern, str) | Returns all matches to the pattern
re.findall(^pattern, str, re.M) | Caret (^) and re.M flag to match the pattern at the beginning of each new line of a string
re.fullmatch(pattern, str) | Returns a match object if and only if the entire target string matches the pattern
Python regex matching operations

Table of contents

  • How to use re.match()
    • Syntax of re.match()
    • Return value
  • Match regex pattern at the beginning of the string
  • Match regex pattern anywhere in the string
  • Match regex at the end of the string
  • Match the exact word or string
  • Understand the Match object
  • Match regex pattern that starts and ends with the given text
  • More matching operations
  • Regex Search vs. match
    • The behavior of search vs. match with a multiline string
  • re.fullmatch()
  • Why and when to use re.match() and re.fullmatch()

How to use re.match()

Before moving further, let’s see the syntax of re.match().

Syntax of re.match()

re.match(pattern, string, flags=0)

The regular expression pattern and target string are the mandatory arguments, and flags are optional.

  1. pattern: The regular expression pattern we want to match at the beginning of the target string. Since we are not compiling this pattern beforehand (as we would with the compile method), the usual practice is to write the pattern as a raw string.
  2. string: The second argument is the target string in which we want to look for the pattern.
  3. flags: The third argument is optional and refers to regex flags. By default, no flags are applied.
    There are many flag values we can use. For example, re.I performs case-insensitive matching. We can also combine multiple flags using bitwise OR (the | operator).
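
As a minimal sketch of the flags argument described above (the sample text is made up for illustration), the following compares a call without flags, a call with re.I, and a call that combines two flags with the | operator:

```python
import re

text = "EMMA loves Python"  # illustrative sample string

# Without flags the match is case-sensitive, so "emma" does not match "EMMA".
no_flag = re.match(r"emma", text)

# re.I (re.IGNORECASE) makes the match case-insensitive.
with_flag = re.match(r"emma", text, flags=re.I)

# Flags can be combined with the bitwise OR operator.
combined = re.match(r"emma", text, flags=re.I | re.M)

print(no_flag)            # None
print(with_flag.group())  # EMMA
print(combined.group())   # EMMA
```

Here re.M has no visible effect because the pattern contains no ^ or $; it is included only to show how two flags are combined.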

Return value

If zero or more characters at the beginning of the string match the regular expression pattern, re.match() returns a corresponding match object, i.e., a re.Match instance. The match object contains the positions at which the match starts and ends, and the actual match value.

If the pattern does not match at the beginning of the target string, re.match() returns None, even when the pattern occurs later in the string.
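
Because re.match() can return None, calling group() on the result without a check raises an AttributeError when there is no match. A small sketch of the usual guard follows; the helper name first_word is hypothetical and exists only for this illustration:

```python
import re

def first_word(text):
    """Return the leading run of word characters, or None if there is none."""
    result = re.match(r"\w+", text)
    if result is None:
        # No match at the start of the string, so there is nothing to extract.
        return None
    return result.group()

print(first_word("Emma is a player"))  # Emma
print(first_word("  leading spaces"))  # None
```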

Match regex pattern in Python

Now, let’s see how to use re.match().

Match regex pattern at the beginning of the string

Let’s look at an example that matches any four-letter word at the beginning of the string (that is, it checks whether the string starts with a given pattern).

Pattern to match: \w{4}

What does this pattern mean?

  • The \w is a regex special sequence that represents any word character: letters (uppercase or lowercase), digits, and the underscore character.
  • The 4 inside curly braces says that the character has to occur exactly four times in a row (four consecutive characters).

In simple words, it matches the first four word characters at the beginning of the following string.

target_string = "Emma is a basketball player who was born on June 17, 1993"

As we can see, Emma is the four-letter word at the beginning of the target string, so we should get Emma as the output.

import re

target_string = "Emma is a basketball player who was born on June 17"
result = re.match(r"\w{4}", target_string)

# printing the Match object
print("Match object: ", result)
# Output <re.Match object; span=(0, 4), match='Emma'>

# Extract match value
print("Match value: ", result.group())
# Output 'Emma'

As you can see, the match starts at index 0 and ends before index 4 because the re.match() method always performs pattern matching at the beginning of the target string.

Let’s understand the above example

  • We used a raw string to specify the regular expression pattern. In a normal string literal, the backslash can start an escape sequence; a raw string avoids this, so the backslash reaches the regex engine unchanged.
  • Next, we wrote a regex pattern to match any four-letter word.
  • Next, we passed this pattern to the match() method to look for the pattern at the string’s start.
  • It found a match and returned a re.Match object.
  • In the end, we used the group() method of the Match object to retrieve the exact match value, i.e., Emma.
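
One caveat worth sketching: \w{4} only asks for four word characters and does not check what follows, so it also matches the first four characters of a longer word. Assuming you want a word of exactly four characters, appending the \b word-boundary sequence is one way to enforce that. The sample strings below are made up for illustration:

```python
import re

# \w{4} matches the first four word characters of a longer word.
longer = re.match(r"\w{4}", "Emmanuel plays")
print(longer.group())  # Emma

# \b requires a word boundary right after the fourth character.
strict_longer = re.match(r"\w{4}\b", "Emmanuel plays")
strict_exact = re.match(r"\w{4}\b", "Emma plays")
print(strict_longer)         # None
print(strict_exact.group())  # Emma
```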

Match regex pattern anywhere in the string

Let’s assume you want to match any six-letter word inside the following target string:

target_string = "Jessa loves Python and pandas"

If you use the match() method to match a six-letter word inside this string, you will get None because match() returns a match only if the pattern is located at the beginning of the string. As we can see, there is no six-letter word at the start.

So to match the regex pattern anywhere in the string, you need to use either the search() or the findall() method of the re module.

Let’s see the demo.

Example to match six-letter word anywhere in the string

import re

target_string = "Jessa loves Python and pandas"
# Match six-letter word
pattern = r"\w{6}"

# match() method
result = re.match(pattern, target_string)
print(result)
# Output None

# search() method
result = re.search(pattern, target_string)
print(result.group()) 
# Output 'Python'

# findall() method
result = re.findall(pattern, target_string)
print(result) 
# Output ['Python', 'pandas'] 

Match regex at the end of the string

Sometimes we want to match the pattern at the end of the string. For example, you may want to check whether a string ends with a specific word, number, or character.

Using the dollar ($) metacharacter, we can match the regular expression pattern at the end of the string.

Example to match the four-digit number at the end of the string

import re

target_string = "Emma is a basketball player who was born on June 17, 1993"

# match at the end
result = re.search(r"\d{4}$", target_string)
print("Matching number: ", result.group())  
# Output 1993
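
A detail that can matter when reading lines from a file: in Python, $ also matches just before a single trailing newline at the end of the string. If that is not wanted, the \Z sequence matches only at the very end. A short sketch, using a made-up string with a trailing newline:

```python
import re

line = "born on June 17, 1993\n"  # note the trailing newline

# $ matches before the final newline, so the year is still found.
dollar = re.search(r"\d{4}$", line)
print(dollar.group())  # 1993

# \Z matches only at the absolute end of the string, after the newline.
strict = re.search(r"\d{4}\Z", line)
print(strict)  # None

# After stripping the newline, \Z matches as well.
stripped = re.search(r"\d{4}\Z", line.rstrip("\n"))
print(stripped.group())  # 1993
```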

Match the exact word or string

In this section, we will see how to write a regex pattern to match an exact word or a substring inside the target string. Let’s see the example to match the word “player” in the target string.

Example:

import re

target_string = "Emma is a basketball player who was born on June 17"
result = re.findall(r"player", target_string)
print("Matching string literal: ", result) 
# Output ['player']
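
Note that a plain literal pattern also matches inside longer words. If only the standalone word should match, one option is to wrap the literal in \b word boundaries. A minimal sketch with a made-up sentence:

```python
import re

text = "Two players and one player"  # illustrative sample string

# The bare literal also matches the "player" inside "players".
loose = re.findall(r"player", text)
print(loose)  # ['player', 'player']

# With \b on both sides, only the standalone word matches.
exact = re.findall(r"\bplayer\b", text)
print(exact)  # ['player']
```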

Understand the Match object

As you know, the match() and search() methods return a re.Match object if a match is found. Let’s see the structure of a re.Match object.

<re.Match object; span=(0, 4), match='Emma'>

The printed form of this re.Match object shows the following items.

  1. span: the positions at which the match starts and ends, i.e., a tuple containing the start and end index of a successful match. You can obtain this tuple with the span() method.
    Save this tuple and use it whenever you want to slice the matching string out of the target string.
  2. match: the actual match value, which we can retrieve using the group() method.

The Match object has several methods and attributes to get information about the matching string. Let’s see those.

Method | Description
group() | Return the string matched by the regex
start() | Return the starting position of the match
end() | Return the ending position of the match
span() | Return a tuple containing the (start, end) positions of the match
Python regex match object

Example to get the information about the matching string

import re

target_string = "Jessa and Kelly"

# Match five-letter word
res = re.match(r"\b\w{5}\b", target_string)

# printing entire match object
print(res)
# Output <re.Match object; span=(0, 5), match='Jessa'>

# Extract Matching value
print(res.group())
# Output Jessa

# Start index of a match
print(res.start())
# Output  0

# End index of a match
print("End index: ", res.end())  # 5

# Start and end index of a match
pos = res.span()
print(pos)
# Output (0, 5)

# Use span to retrieve the matching string
print(target_string[pos[0]:pos[1]])
# Output 'Jessa'

Match regex pattern that starts and ends with the given text

Let’s assume you want to check if a given string starts and ends with a particular text. We can do this using the following two regex metacharacters with the re.match() method.

  • Use the caret (^) metacharacter to match at the start
  • Use the dollar ($) metacharacter to match at the end

Now, let’s check if the given string starts with the letter ‘P’ and ends with the letter ‘s’.

Example

import re

# string starts with letter 'P' and ends with letter 's'
def starts_ends_with(str1):
    res = re.match(r'^(P).*(s)$', str1)
    if res:
        print(res.group())
    else:
        print('None')

str1 = "PYnative is for Python developers"
starts_ends_with(str1)
# Output 'PYnative is for Python developers'

str2 = "PYnative is for Python"
starts_ends_with(str2)
# Output None

More matching operations

In this section, let’s see some common regex matching operations such as

  • Match any character
  • Match numbers
  • Match digits
  • Match special characters

import re

str1 = "Emma 12 25"
# Match any character (only the first one, because match() starts at index 0)
print(re.match(r'.', str1).group())
# Output 'E'

# Match all digits
print(re.findall(r'\d', str1))
# Output ['1', '2', '2', '5']

# Match all numbers
# + indicates 1 or more occurrences of \d
print(re.findall(r'\d+', str1))
# Output ['12', '25']

# Match all special characters and symbols
str2 = "Hello #Jessa!@#$%"
print(re.findall(r'\W', str2))
# Output [' ', '#', '!', '@', '#', '$', '%']


Regex Search vs. match

In this section, we will understand the difference between the search() and match() methods. You will also learn when to use match() and when to use search() while performing regex operations.

The Python re module offers two different methods to perform regex pattern matching.

  • The match() method checks for a match only at the beginning of the string.
  • The search() method checks for a match anywhere in the string.

How re.match() works

The match method returns a corresponding match object instance if zero or more characters at the beginning of the string match the regular expression pattern.

In simple words, re.match() returns a match object only if the pattern is located at the beginning of the string; otherwise, it returns None.

How re.search() works

On the other hand, the search method scans the string from left to right looking for the pattern and returns only the first match, i.e., as soon as it finds the first match, it stops.

Let’s see an example to understand the difference between search and match.

Now, let’s try to match any two-digit number inside the following target string using the search and match methods.

Emma is a baseball player who was born on June 17, 1993

As you can see, a two-digit number is not present at the start of the string, so the match() method should return None, and the search() method should return the match.

This is because the match() method tries to find a match only at the start, while search() tries to find a match anywhere in the string.

import re

target_string = "Emma is a baseball player who was born on June 17, 1993"

# Match 2-digit number
# Using match()
result = re.match(r'\d{2}', target_string)
print(result)
# Output None

# Using search()
result = re.search(r'\d{2}', target_string)
print(result.group())
# Output 17

The behavior of search vs. match with a multiline string

Let’s see example code to understand how the search and match methods behave when a string contains newlines.

We use the re.M flag with the caret (^) metacharacter to match a regex pattern at the start of each line. But note that even in MULTILINE mode, match() will only match at the beginning of the string and not at the beginning of each line.

On the other hand, the search method scans the entire multi-line string to look for a pattern and returns only the first match.

Let’s see an example to understand the difference between search and match when searching inside a multi-line string.

import re

multi_line_string = """emma 
love Python"""

# Matches at the start
print(re.match('emma', multi_line_string).group())
# Output 'emma'

# re.match doesn't match at the start of each line
# It only matches at the start of the string
# Won't match
print(re.match('love', multi_line_string, re.MULTILINE))
# Output None

# found "love" at the start of the second line
print(re.search('love', multi_line_string).group())
# Output 'love'

pattern = re.compile('Python$', re.MULTILINE)
# No match
print(pattern.match(multi_line_string))
# Output None

# found 'Python' at the end
print(pattern.search(multi_line_string).group())
# Output 'Python'
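
The example above does not use the caret itself, so here is a small sketch of how ^ behaves with and without re.M when used with findall() and search(). The three-line sample string is made up for illustration:

```python
import re

text = """emma
loves Python
and pandas"""

# Without re.M, ^ matches only at the very start of the string.
first_only = re.findall(r"^\w+", text)
print(first_only)  # ['emma']

# With re.M, ^ also matches right after each newline.
each_line = re.findall(r"^\w+", text, re.M)
print(each_line)  # ['emma', 'loves', 'and']

# search() with re.M can find a line-start match on a later line.
later_line = re.search(r"^loves", text, re.M)
print(later_line.group())  # loves
```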

re.fullmatch()

Unlike the match() method, which performs the pattern matching only at the beginning of the string, the re.fullmatch() method returns a match object if and only if the entire target string, from the first to the last character, matches the regular expression pattern.

If the match succeeds, the match value is the entire string because fullmatch() always matches the whole string.

For example, suppose you want the target string to be exactly 42 characters long. Let’s create a regular expression pattern that checks whether the target string is 42 characters long.

Pattern to match: .{42}

What does this pattern mean?

This pattern matches a string of exactly 42 characters.

Now let’s have a closer look at the pattern itself. First, you will see the dot in regular expression syntax.

  • The dot (.) is a special character that matches any character, whether it is a letter, digit, whitespace, or symbol, except the newline character, which in Python is written as \n.
  • Next, the 42 inside the curly braces says that the string must be 42 characters long.

Now, let’s see the example.

import re

# string length of 42
str1 = "My name is maximums and my salary is 1000$"
print("str1 length: ", len(str1))

result = re.fullmatch(r".{42}", str1)

# print entire match object
print(result)

# print actual match value
print("Match: ", result.group())

Output:

str1 length:  42
<re.Match object; span=(0, 42), match='My name is maximums and my salary is 1000$'>
Match:  My name is maximums and my salary is 1000$

As you can see from the output, we got a match object, meaning the match succeeded.

Note: If the string contains one or more newline characters, the match will fail because the dot excludes the newline. Therefore, if our target string had multiple lines or paragraphs, the match would have failed. We can solve such problems using the flags argument, for example the re.DOTALL (re.S) flag, which makes the dot match newlines as well.
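
As a minimal sketch of the newline behavior in the note, the following uses a short made-up string of five characters (one of them a newline) and compares fullmatch() with and without the re.DOTALL flag:

```python
import re

two_lines = "ab\ncd"  # 5 characters, including the newline

# By default the dot does not match "\n", so the full match fails.
default_result = re.fullmatch(r".{5}", two_lines)
print(default_result)  # None

# With re.DOTALL (alias re.S) the dot matches the newline too.
dotall_result = re.fullmatch(r".{5}", two_lines, flags=re.DOTALL)
print(dotall_result.span())  # (0, 5)
```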

Why and when to use re.match() and re.fullmatch()

  • Use the re.match() method when you want to find the pattern at the beginning of the string (starting with the string’s first character).
  • Use re.fullmatch() when you want to match the full string against a pattern. It returns a match object if and only if the entire target string, from the first to the last character, matches the regular expression pattern.
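
To make the difference concrete, here is a sketch of a typical validation case, assuming the input must be exactly four digits. The helper name is_four_digit_pin is hypothetical and not part of the re module:

```python
import re

def is_four_digit_pin(text):
    """Hypothetical validator: True only if text is exactly four digits."""
    return re.fullmatch(r"\d{4}", text) is not None

# match() accepts the input because its first four characters are digits.
prefix_ok = re.match(r"\d{4}", "12345") is not None
print(prefix_ok)  # True

# fullmatch() rejects it because of the extra fifth character.
print(is_four_digit_pin("12345"))  # False
print(is_four_digit_pin("1234"))   # True
```

For validation, fullmatch() is usually the safer choice because trailing characters cannot slip through.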


About Vishal

Founder of PYnative.com I am a Python developer and I love to write articles to help developers. Follow me on Twitter. All the best for your future Python endeavors!


Copyright © 2018–2023 pynative.com