Python Regex Quantifiers – Question Mark (?) vs Plus (+) vs Asterisk (*)

In this tutorial, I’ll show you the difference between the three basic regular expression quantifiers in Python.

What’s the difference between the question mark quantifier (?), the plus quantifier (+), and the asterisk quantifier (*)?

Say you have a regular expression pattern A.

  • Regex A? matches zero or one occurrence of A.
  • Regex A* matches zero or more occurrences of A.
  • Regex A+ matches one or more occurrences of A.
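
The three rules above can be checked side by side. Here is a minimal sketch that uses ‘b’ as the quantified regex A and tests a few hand-picked candidate strings with re.fullmatch, which only succeeds when the whole string fits the pattern. The variable names and candidate strings are illustrative assumptions, not part of the original examples.

```python
import re

patterns = ['ab?', 'ab*', 'ab+']
candidates = ['a', 'ab', 'abb']  # zero, one, and two occurrences of 'b'

# For each pattern, collect the candidates that match in full
table = {
    p: [s for s in candidates if re.fullmatch(p, s)]
    for p in patterns
}

for p, accepted in table.items():
    print(p, accepted)
# ab? ['a', 'ab']
# ab* ['a', 'ab', 'abb']
# ab+ ['ab', 'abb']
```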

Related article: Python Regex Superpower – The Ultimate Guide

Asterisk vs Question Mark

You can read A? as the zero-or-one regex: the preceding regex A is matched either zero times or exactly once, but never more often.

Analogously, you can read A* as the zero-or-more regex (I know it sounds a bit clunky): the preceding regex A is matched any number of times, including zero.

Here’s an example that shows the difference:

>>> import re
>>> re.findall('ab?', 'abbbbbbb')
['ab']
>>> re.findall('ab*', 'abbbbbbb')
['abbbbbbb']

The regex ‘ab?’ matches the character ‘a’ in the string, followed by at most one character ‘b’. The string contains several, but only the first ‘b’ becomes part of the match.

The regex ‘ab*’ matches the character ‘a’ in the string, followed by as many characters ‘b’ as possible, so the match covers the whole string.
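
To see the same difference when the number of ‘b’ characters varies, here is a small sketch on a made-up string (the string and variable names are assumptions chosen for illustration):

```python
import re

text = 'abab abbb a'

# '?' takes at most one 'b' after each 'a'
optional_b = re.findall('ab?', text)
# '*' takes every consecutive 'b' after each 'a'
any_b = re.findall('ab*', text)

print(optional_b)  # ['ab', 'ab', 'ab', 'a']
print(any_b)       # ['ab', 'ab', 'abbb', 'a']
```

The two results differ only where an ‘a’ is followed by more than one ‘b’.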

Asterisk vs Plus

You can read A* as the zero-or-more regex: the preceding regex A is matched any number of times, including zero.

Analogously, you can read A+ as the at-least-once regex: the preceding regex A is also matched any number of times, but at least once.

Here’s an example that shows the difference:

>>> import re
>>> re.findall('ab*', 'aaaaaaaa')
['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a']
>>> re.findall('ab+', 'aaaaaaaa')
[]

The regex ‘ab*’ matches the character ‘a’ in the string, followed by an arbitrary number of occurrences of the character ‘b’, including none. The substring ‘a’ fits this description, so the regex matches eight times in the string.

The regex ‘ab+’ matches the character ‘a’, followed by as many characters ‘b’ as possible, but at least one. The string contains no ‘b’, so there is no match.

Summary: When applied to a regular expression A, Python’s A* quantifier matches zero or more occurrences of A. The * quantifier is called the asterisk operator, and it always applies only to the regular expression immediately before it. For example, the regular expression ‘yes*’ matches the strings ‘ye’, ‘yes’, and ‘yesssssss’. It does not match the empty string, because the asterisk does not apply to the whole regex ‘yes’ but only to the preceding regex ‘s’.
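
The scope of the asterisk can be checked with a short sketch. The helper fully_matches is a hypothetical name introduced here for illustration. The second pattern goes beyond the examples above: it wraps ‘yes’ in a non-capturing group, (?:yes), so that the asterisk applies to the whole word instead of only the ‘s’.

```python
import re

def fully_matches(pattern, text):
    """Return True if the whole text fits the pattern (illustrative helper)."""
    return re.fullmatch(pattern, text) is not None

# The asterisk applies only to the trailing 's'
tail_only = [fully_matches('yes*', s) for s in ('ye', 'yes', 'yesssssss', '')]
print(tail_only)  # [True, True, True, False]

# With a group, the asterisk applies to all of 'yes'
grouped = [fully_matches('(?:yes)*', s) for s in ('', 'yes', 'yesyes', 'ye')]
print(grouped)  # [True, True, True, False]
```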

Question Mark vs Plus

You can read A? as the zero-or-one regex: the preceding regex A is matched either zero times or exactly once, but never more often.

Analogously, you can read A+ as the at-least-once regex: the preceding regex A is matched any number of times, but at least once.

Here’s an example that shows the difference:

>>> import re
>>> re.findall('ab?', 'aaaaaaaa')
['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a']
>>> re.findall('ab+', 'aaaaaaaa')
[]

The regex ‘ab?’ matches the character ‘a’ in the string, followed by the character ‘b’ if it exists. The string contains no ‘b’, so each of the eight ‘a’ characters is matched on its own.

The regex ‘ab+’ matches the character ‘a’ in the string, followed by as many characters ‘b’ as possible, but at least one. The string contains no ‘b’, so there is no match.
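
The example above only covers a string without any ‘b’. As a complementary sketch, here is the same pair of patterns on a made-up string that does contain some (the string and variable names are assumptions for illustration):

```python
import re

text = 'abb a ab'

# '?' accepts a lone 'a' and never takes more than one 'b'
zero_or_one = re.findall('ab?', text)
# '+' skips the lone 'a' and takes every consecutive 'b'
one_or_more = re.findall('ab+', text)

print(zero_or_one)  # ['ab', 'a', 'ab']
print(one_or_more)  # ['abb', 'ab']
```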

Where to Go From Here?

You’ve learned the difference between the regex quantifiers in Python.

Summary: Regex A? matches zero or one occurrence of A. Regex A* matches zero or more occurrences of A. Regex A+ matches one or more occurrences of A.
