You Cannot Use Python Regex in startswith(). Do This Instead.

I’m sitting in front of my computer refactoring Python code and have just thought of the following question:

Can You Use a Regular Expression with the Python string.startswith() Method?

The short answer is no. The string.startswith() method doesn’t allow regular expression inputs. And you don’t need it because regular expressions can already check if a string starts with a pattern using the re.match(pattern, string) function from the re module.

In fact, shortly after asking the question, I realized that using a regex with the startswith() method doesn’t make sense. Why? If you want to use regular expressions, use the re module. Regular expressions are infinitely more powerful than the startswith() method!

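For example, to check whether a string starts with 'hello', you’d use the regex 'hello' with re.match(), which only matches at the beginning of the string (a trailing '.*', as in 'hello.*', is allowed but not needed). Now you don’t need the startswith() method anymore because the regex already takes care of that.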
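Here is a minimal sketch of that idea. The sample string greeting is a made-up name for illustration only; the point is that re.match() anchors at the start of the string on its own.

```python
import re

greeting = "hello world"  # hypothetical sample string

# str.startswith() checks a literal prefix.
literal_check = greeting.startswith("hello")

# re.match() only matches at the beginning of the string,
# so the plain pattern 'hello' already acts as a prefix test.
regex_check = re.match("hello", greeting) is not None

# A string that merely contains 'hello' later on does not match.
not_at_start = re.match("hello", "say hello") is not None

print(literal_check, regex_check, not_at_start)  # True True False
```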

If you already learned something from this tutorial, why not join my free Python training program? I call it the Finxter Email Computer Science Academy, and it’s just that: a free, easy-to-use email academy that teaches you Python in small daily doses for beginners and pros alike!

How Does the Python startswith() Method Work?

Here’s an overview of the string.startswith() method:

str.startswith(prefix[, start[, end]])

Argument	Needed?	Description
`prefix`	required	String value to be searched at the beginning of string `str`.
`start`	optional	Index of the first position where `prefix` is to be checked. Default: `start=0`.
`end`	optional	Index at which the check stops (exclusive). Default: `end=len(str)`.

Let’s look at some examples using the Python startswith() method. In each one, I will modify the code to show different use cases. Let’s start with the most basic scenario.

Related article: Python Regex Superpower – The Ultimate Guide

Do you want to master the regex superpower? Check out my new book The Smartest Way to Learn Regular Expressions in Python with the innovative 3-step approach for active learning: (1) study a book chapter, (2) solve a code puzzle, and (3) watch an educational chapter video.

Python startswith() — Most Basic Example

Suppose you have a list of strings where each string is a tweet.

tweets = ["to thine own self be true",
          "coffee break python",
          "i like coffee"]

Let’s say you work in the coffee industry and you want to get all tweets that start with the string "coffee". We’ll use the startswith() method with a single argument:

>>> for tweet in tweets:
...   if tweet.startswith("coffee"):
...       print(tweet)
coffee break python

There is only one tweet in our dataset that starts with the string "coffee". So that is the only one printed out.

Python startswith() — Optional Arguments

The startswith() method has two optional arguments: start and end. You can use these to define a range of indices to check. By default startswith checks the entire string.

The start argument tells startswith() where to begin searching. The default value is 0, so it begins at the start of the string.

Thus, the following code outputs the same result as above:

>>> for tweet in tweets:
...   if tweet.startswith("coffee", 0):
...       print(tweet)
coffee break python

What happens if we set start=7?

>>> for tweet in tweets:
...   if tweet.startswith("coffee", 7):
...       print(tweet)
i like coffee

Why does it print 'i like coffee'? By calling the find() method, we see that the substring 'coffee' begins at index 7.

>>> 'i like coffee'.find('coffee')
7

Hence, when checking tweet.startswith("coffee", 7) for the tweet 'i like coffee', the result is True.

Let’s add another argument – the end index – to the last snippet:

>>> for tweet in tweets:
...   if tweet.startswith("coffee", 7, 9):
...       print(tweet)

Nothing is printed on the console. This is because we are only searching over 2 characters, beginning at index 7 (inclusive) and ending at index 9 (exclusive). But we are searching for "coffee", which is 6 characters long. Because 6 > 2, the prefix cannot fit into that window, so startswith() returns False for every tweet and nothing is printed.
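One way to picture the start and end arguments is as a slice of the string. The sketch below assumes ordinary, non-empty prefixes (edge cases such as an empty prefix can behave differently) and reuses the tweet from above; the variable names are just for illustration.

```python
tweet = "i like coffee"

# Checking from index 7 behaves like testing the slice tweet[7:].
from_seven = tweet.startswith("coffee", 7)
sliced_from_seven = tweet[7:].startswith("coffee")

# With end=9 only tweet[7:9] is examined, which is too short for 'coffee'.
window = tweet[7:9]
short_window = tweet.startswith("coffee", 7, 9)

# Widening the window to 6 characters (end=13) makes the check succeed.
full_window = tweet.startswith("coffee", 7, 13)

print(from_seven, sliced_from_seven)  # True True
print(window, short_window)           # co False
print(full_window)                    # True
```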

Now that you know everything about Python’s startswith method, let’s go back to our original question:

Can You Use a Regular Expression with the Python startswith() Method?

No. The startswith() method does not accept regular expressions. You can only search for a literal string (or a tuple of strings).

A regular expression can describe an infinite set of matching strings. For example, 'A.*' matches every string starting with 'A'. Matching such patterns requires a regex engine, which does more work than a plain character-by-character comparison. So it makes sense that startswith() sticks to literal prefixes and leaves regular expressions to the re module.

Instead, you can use the re.match() method:

re.match()

The re.match(pattern, string) method returns a match object if the pattern matches at the beginning of the string.

The match object contains useful information such as the matching groups and the matching positions.

An optional argument flags allows you to customize the regex engine, for example, to ignore capitalization.

Specification: re.match(pattern, string, flags=0)

The re.match() method has up to three arguments.

pattern: the regular expression pattern that you want to match.
string: the string which you want to search for the pattern.
flags (optional argument): a more advanced modifier that allows you to customize the behavior of the function. Want to know how to use those flags? Check out this detailed article on the Finxter blog.
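As a small illustration of the flags argument, here is a sketch that uses re.IGNORECASE to ignore capitalization. The capitalized tweet is a made-up example and not part of the dataset above.

```python
import re

tweet = "Coffee break python"  # hypothetical capitalized tweet

# Without flags, the lowercase pattern does not match the capital 'C'.
case_sensitive = re.match("coffee", tweet)

# re.IGNORECASE makes the match case-insensitive.
case_insensitive = re.match("coffee", tweet, flags=re.IGNORECASE)

print(case_sensitive)            # None
print(case_insensitive.group())  # Coffee
```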

Return Value:

The re.match() method returns a match object if the pattern matches at the beginning of the string and None otherwise. You can learn everything about match objects and the re.match() method in my detailed blog guide:

[Full Tutorial] Python Regex Match
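To give a quick taste of what a match object holds, here is a short sketch. The pattern with two groups is my own illustrative choice, not something prescribed by the re module.

```python
import re

# Two capturing groups: a word starting with 'coff' and the word after it.
match = re.match(r"(coff\w*) (\w+)", "coffees are awesome")

if match:
    print(match.group(0))  # coffees are   (the whole match)
    print(match.group(1))  # coffees       (first group)
    print(match.group(2))  # are           (second group)
    print(match.span())    # (0, 11)       (start and end index of the match)
```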


But is it also true that startswith only accepts a single string as argument? Not at all. It is possible to do the following:

Python startswith() Tuple – Check For Multiple Strings

>>> for tweet in tweets:
...   if tweet.startswith(("coffee", "i")):
...       print(tweet)
coffee break python
i like coffee

This snippet prints all strings that start with either "coffee" or "i". It is pretty efficient too. Unfortunately, you can only check a finite set of arguments. If you need to check an infinite set, you cannot use this method.
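For comparison, the same finite set of prefixes can be written as a regex alternation. This is only a sketch: the names prefixes, by_tuple and by_regex are my own, and re.escape() is used so that prefixes containing special regex characters are still treated literally.

```python
import re

tweets = ["to thine own self be true",
          "coffee break python",
          "i like coffee"]

prefixes = ("coffee", "i")

# Literal prefix check with a tuple.
by_tuple = [t for t in tweets if t.startswith(prefixes)]

# Equivalent regex: 'coffee|i', with each prefix escaped to stay literal.
pattern = "|".join(re.escape(p) for p in prefixes)
by_regex = [t for t in tweets if re.match(pattern, t)]

print(by_tuple)  # ['coffee break python', 'i like coffee']
print(by_regex)  # ['coffee break python', 'i like coffee']
```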

What Happens If I Pass A Regular Expression To startswith()?

Let’s check whether a tweet starts with any version of the "coffee" string. In other words, we want to apply the regex "coff*" so that we match strings like "coffee", "coffees" and "coffe".

>>> tweets = ["to thine own self be true",
                "coffee break python",
                "coffees are awesome",
                "coffe is cool"]

>>> for tweet in tweets:
        if tweet.startswith("coff*"):
            print(tweet)
# No output :(

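This doesn’t work. In regular expressions, * is a quantifier that means "zero or more repetitions of the preceding character", so "coff*" matches "cof" followed by any number of "f" characters. But in the startswith() method, it just means the star character '*'.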

Since none of the tweets start with the literal string 'coff*', Python prints nothing to the screen.
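To see that startswith() really treats the star as an ordinary character, here is a small sketch. The string containing a literal '*' is a made-up example and not part of the tweet dataset.

```python
literal_star = "coff*ee shop"  # hypothetical string with an actual '*' in it

# The prefix 'coff*' is compared character by character, star included.
starts_with_star = literal_star.startswith("coff*")

# A normal tweet has no '*' at index 4, so the check fails.
normal_tweet = "coffee break python".startswith("coff*")

print(starts_with_star, normal_tweet)  # True False
```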

So you might ask:

What Are The Alternatives to Using Regular Expressions in startswith()?

There is one alternative that is simple and clean: use the re module. This is Python’s built-in module built to work with regular expressions.

>>> import re
>>> tweets = ["to thine own self be true",
                "coffee break python",
                "coffees are awesome",
                "coffe is cool"]

# Success!
>>> for tweet in tweets:
        if re.match("coff*", tweet):
            print(tweet)
coffee break python
coffees are awesome
coffe is cool

Success! We’ve now printed all the tweets we expected. That is, all tweets that start with "cof" followed by zero or more "f" characters; whatever comes after that prefix doesn’t matter to re.match().

💡 Note: Evaluating a regular expression is generally slower than a plain startswith() call, although for a handful of short strings the difference is negligible. In return, we got the result we wanted. Slow and successful is better than fast and unsuccessful.
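If you want to measure the difference on your own machine, a rough sketch with timeit could look like this. The numbers depend on your hardware and Python version, so none are shown here. Note that for this particular pattern, "coff*" as a prefix test is equivalent to startswith("cof"), which is what the literal version checks.

```python
import re
import timeit

tweets = ["to thine own self be true",
          "coffee break python",
          "coffees are awesome",
          "coffe is cool"]

# Compiling once avoids looking the pattern up on every call.
pattern = re.compile("coff*")

regex_hits = [t for t in tweets if pattern.match(t)]
literal_hits = [t for t in tweets if t.startswith("cof")]

# Time both approaches; 10_000 repetitions is an arbitrary choice.
regex_time = timeit.timeit(
    lambda: [t for t in tweets if pattern.match(t)], number=10_000)
literal_time = timeit.timeit(
    lambda: [t for t in tweets if t.startswith("cof")], number=10_000)

print(f"regex: {regex_time:.4f}s, startswith: {literal_time:.4f}s")
```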

The function re.match() takes two arguments.

First, the regular expression to be matched.
Second, the string you want to search.

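If the pattern matches at the beginning of the string, re.match() returns a match object, which counts as True in an if statement. If not, it returns None, which counts as False. In this case, it returns None for "to thine own self be true" and a match object for the rest.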
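A quick sketch of how the return value of re.match() behaves in a condition (the variable names hit and miss are just for illustration):

```python
import re

hit = re.match("coff*", "coffee break python")
miss = re.match("coff*", "to thine own self be true")

# A match object is truthy, None is falsy.
print(bool(hit), bool(miss))  # True False

# The match object also tells you which part of the string matched.
print(hit.group())  # coff
print(miss)         # None
```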

So let’s summarize the article.

Summary: Can You Use a Regular Expression with the Python startswith Method?

No, you cannot use a regular expression with the Python startswith function. But you can use the Python regular expression module re instead. It’s as simple as calling the function re.match(s1, s2). This looks for the regular expression s1 at the beginning of the string s2.

Python Startswith() List

Given that we can pass a tuple to startswith(), what happens if we pass a list?

>>> s = 'a string!'
>>> if s.startswith(['a', 'b', 'c']):
        print('yay!')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: startswith first arg must be str or a tuple of str, not list

Python raises a TypeError. We can only pass a string or a tuple of strings to startswith(). So if we have a list of prefixes we want to check, we can call tuple() before passing it to startswith.

>>> if s.startswith(tuple(['a', 'b', 'c'])):
        print('yay!')
yay!

This works well and is fine performance-wise.

Yet, one of Python’s key features is its flexibility. So is it possible to get the same outcome without changing our list of letters to a tuple?

Of course! 🙂

We have two options:

any + list comprehension
any + map

The any() function is a way to combine several logical OR conditions. It takes one argument: an iterable of Boolean values. It returns True if at least one of them is True. So instead of writing

if s.startswith('a') or s.startswith('b') or s.startswith('c'):
    pass  # some code

We write

# any takes 1 argument - an iterable
if any([s.startswith('a'),
        s.startswith('b'),
        s.startswith('c')]):
    pass  # some code

This is easier to read, especially when you are combining many conditions. We can improve this by first creating a list of conditions and passing this to any().

letters = ['a', 'b', 'c']
conditions = [s.startswith(letter) for letter in letters]

if any(conditions):
    pass  # do something

Alternatively, we can use map instead of a list comprehension statement.

letters = ['a', 'b', 'c']
if any(map(s.startswith, letters)):
    pass  # do something

Both have the same outcome. I personally prefer list comprehensions and think they are more readable. But choose whichever you prefer.
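One more variation worth knowing: a generator expression lets any() stop at the first prefix that matches, whereas a list comprehension evaluates every condition up front. The helper starts_with() and the list checked below are hypothetical names I added only to make the short-circuiting visible.

```python
s = 'a string!'
letters = ['a', 'b', 'c']

checked = []  # records which prefixes were actually tested

def starts_with(prefix):
    """Hypothetical helper that logs each prefix before testing it."""
    checked.append(prefix)
    return s.startswith(prefix)

# Generator expression: any() stops as soon as one check is True.
result = any(starts_with(letter) for letter in letters)

print(result)   # True
print(checked)  # ['a']  (the prefixes 'b' and 'c' were never tested)
```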

Regex Humor

*Wait, forgot to escape a space. Wheeeeee[taptaptap]eeeeee.* (source)

Python Regex Course

Google engineers are regular expression masters. The Google search engine is a massive text-processing engine that extracts value from trillions of webpages.

Facebook engineers are regular expression masters. Social networks like Facebook, WhatsApp, and Instagram connect humans via text messages.

Amazon engineers are regular expression masters. Ecommerce giants ship products based on textual product descriptions. Regular expressions rule the game when text processing meets computer science.

If you want to become a regular expression master too, check out the most comprehensive Python regex course on the planet: