How to Add Boundaries to a Regular Expression in Python

In this article, we show how to add boundaries to a regular expression in Python.

With boundaries, we can clearly define areas where we want expressions to match or not.

The best way to understand this is through an example.

Let's say that we want to match the word, no, by itself within a text we are searching.

This poses a problem if the text also contains words such as 'snow' or 'nothing'.

Words such as 'snow' and 'nothing' have 'no' in them. However, your intention may just be to see how many times the word, 'no', appears.

Therefore, you want to add boundaries to the regular expression, so that you clearly define what you want.

If we just have code without boundaries, let's see what occurs.

>>> import re
>>> phrase = "To answer that question, no, I can't go. It's snowing outside, so I can get nothing today"
>>> regex = re.compile(r"no")
>>> match = re.findall(regex, phrase)
>>> print(match)
['no', 'no', 'no']

So you see that the regular expression returns three matches of 'no' from the text: one from the word 'no' itself, one from 'snowing', and one from 'nothing'.
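To see where those three matches come from, here is a minimal sketch that uses re.finditer to report the word surrounding each match. The variable name tokens and the space-based lookup of the surrounding word are assumptions made for this illustration, not part of the original example.

```python
import re

phrase = "To answer that question, no, I can't go. It's snowing outside, so I can get nothing today"

tokens = []
for m in re.finditer(r"no", phrase):
    # Expand the match outward to the nearest spaces to recover the
    # whole space-delimited word that contains it.
    start = phrase.rfind(" ", 0, m.start()) + 1
    end = phrase.find(" ", m.end())
    if end == -1:
        end = len(phrase)
    tokens.append(phrase[start:end])

print(tokens)
```

Only the first of these words is the standalone 'no' we are after. The other two merely contain it.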

However, this is not what we want. We want to count the occurrences of the word, 'no'.

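Therefore, we add the word boundary, \b, on each side of 'no' in the regular expression. A word boundary matches the position between a word character (a letter, digit, or underscore) and a non-word character, or the start or end of the string, so 'no' matches only when it is not part of a longer word.

This is shown in the following code.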

>>> import re
>>> phrase = "To answer that question, no, I can't go. It's snowing outside, so I can get nothing today"
>>> regex = re.compile(r"\bno\b")
>>> match = re.findall(regex, phrase)
>>> print(match)
['no']

So after adding boundaries around 'no' in the regular expression, we get back a single 'no', because the word, 'no', appears only once in the text.
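One detail worth noting in the pattern is the r prefix. In an ordinary Python string, \b is read as a backspace character before the regular expression engine ever sees it, so the boundary is lost. The short sketch below compares the two forms. The names raw_pattern and plain_pattern are chosen only for this illustration.

```python
import re

phrase = "To answer that question, no, I can't go. It's snowing outside, so I can get nothing today"

raw_pattern = r"\bno\b"    # raw string: the regex engine receives \b (word boundary)
plain_pattern = "\bno\b"   # ordinary string: Python converts each \b to a backspace

# Confirm that the ordinary string really contains backspace characters.
print(plain_pattern == "\x08no\x08")

raw_matches = re.findall(raw_pattern, phrase)
plain_matches = re.findall(plain_pattern, phrase)

print(raw_matches)    # matches the standalone word
print(plain_matches)  # finds nothing, since the text has no backspaces
```

For this reason it is a good habit to write regular expression patterns as raw strings.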

So this is how adding boundaries to a regular expression allows us to match an expression only when it stands as a whole word, and not when it is part of a longer word.
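To illustrate what counts as a boundary, the following sketch runs the same pattern against a few short strings. The sample strings are made up for this illustration. Punctuation and the start or end of the string count as boundaries, while letters, digits, and the underscore do not.

```python
import re

regex = re.compile(r"\bno\b")

# Hypothetical sample strings chosen to probe different neighbors of 'no'.
samples = ["no", "no, thanks", "(no)", "say no.", "snow", "nothing", "no_1"]

results = {}
for text in samples:
    # findall returns every standalone 'no' found in the sample.
    results[text] = regex.findall(text)
    print(repr(text), results[text])
```

The first four samples each yield one match. The last three yield none, because 'no' is directly followed or preceded by a word character. Note that 'no_1' does not match, since the underscore is treated as a word character.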

