Extract File Name From the Path, No Matter What the os/path Format

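Summary: os.path.basename(path) returns the file name from a path written in the format of the operating system the script runs on. To handle both Windows-style and POSIX-style paths on any platform, use the ntpath module, which is the implementation behind os.path on Windows and accepts both backslashes and forward slashes as separators.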


✨Problem: How to extract the filename from a path, no matter what the operating system or path format is?

For example, let’s suppose that you want all the following paths to return demo.py:

➀ C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py
➀ /home/username/Desktop/codes/demo.py
➀ /home/username/Desktop/../demo.py

Expected Output in each case:

demo.py

Recommended: How To Get The Filename Without The Extension From A Path In Python?

Let us dive into the solutions without further delay.

❖ Method 1: Using os.path.basename

os.path.basename is a function of Python's built-in os.path module that returns the base name, that is, the final component, of a path. It accepts the path as input and returns everything after the last path separator recognized by the current operating system. For paths written in the native format of the system the script runs on, this is exactly the function you want.

Example 1: In Windows

import os
file_path = r'C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py'
print(os.path.basename(file_path)) 

# OUTPUT: demo.py

Example 2: In Linux

Caution: If you use the os.path.basename() function on a POSIX system to get the base name from a Windows-style path, for example "C:\\my\\file.txt", the entire path will be returned, because POSIX treats the backslash as an ordinary character rather than a separator.
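
A minimal sketch of this caution. It calls posixpath directly, which is the module that os.path refers to on Linux and macOS, so the result is the same on whatever machine you try it:

```python
import posixpath

# posixpath is the implementation behind os.path on POSIX systems.
posix_style = '/home/username/Desktop/codes/demo.py'
windows_style = r'C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py'

posix_result = posixpath.basename(posix_style)
windows_result = posixpath.basename(windows_style)

print(posix_result)    # demo.py
print(windows_result)  # C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py
```

The Windows-style path contains no forward slash, so the POSIX rules see it as one long file name.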

Tidbit: os.path.basename(path) returns the same value as the second element (the tail) of the head and tail pair that os.path.split(path) produces.
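
The relationship between the two functions can be checked with a short sketch. A POSIX-style path is used here because both the Windows and the POSIX implementations of os.path treat the forward slash as a separator:

```python
import os

file_path = '/home/username/Desktop/codes/demo.py'

# split() returns a (head, tail) pair; tail is the final component.
head, tail = os.path.split(file_path)
print(head)  # /home/username/Desktop/codes
print(tail)  # demo.py

# basename() gives the same value as the tail.
same = tail == os.path.basename(file_path)
print(same)  # True
```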

❖ Method 2: Using the ntpath Module

The ntpath module can be used to handle Windows paths on other platforms. os.path.basename does not work in all cases: when you run the script on a Linux host and attempt to process a Windows-style path, it does not raise an error, but it returns the entire path instead of the file name.

This is where the ntpath module proves useful. Windows accepts both the backslash and the forward slash as path separators. Therefore, the ntpath module, which is what os.path refers to when running on Windows, splits on both separators and works for the paths above on every platform. One caveat: a POSIX file name that itself contains a backslash will be split at that backslash.

In case the path ends with a slash, the basename will be empty, so you can write your own function to deal with it:

import ntpath


def path_foo(path):
    head, tail = ntpath.split(path)
    return tail or ntpath.basename(head)


paths = [r'C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py',
         r'/home/username/Desktop/codes/demo.py',
         r'/home/username/Desktop/../demo.py']
print([path_foo(path) for path in paths])

# ['demo.py', 'demo.py', 'demo.py']
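
To see why the fallback is needed, here is a small sketch with paths that end in a separator. The helper name last_component is hypothetical and uses the same logic as path_foo above:

```python
import ntpath


def last_component(path):
    # Hypothetical helper: fall back to the basename of the head
    # when the tail is empty because of a trailing separator.
    head, tail = ntpath.split(path)
    return tail or ntpath.basename(head)


windows_dir = 'C:\\Users\\SHUBHAM SAYON\\Desktop\\codes\\'
posix_dir = '/home/username/Desktop/codes/'

print(repr(ntpath.basename(windows_dir)))  # ''
print(last_component(windows_dir))         # codes
print(last_component(posix_dir))           # codes
```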

❖ Method 3: Using pathlib.Path()

If you are using Python 3.4 or above, then the Path class of the pathlib module is another option for extracting the file name from a path. Its .name attribute returns the final component of the path. Note that Path follows the conventions of the operating system the script runs on, so on Linux or macOS a Windows-style path with backslashes is treated as a single component; pathlib.PureWindowsPath accepts both separators on any platform.

from pathlib import Path
file_path = r'C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py'
file_name = Path(file_path).name
print(file_name)

# demo.py (when run on Windows)

Note: The .name attribute of a path object returns the full name of the final component of the path, regardless of whether it is a file or a folder.
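
If the script may run on Linux or macOS while the incoming paths may be Windows-style, one option is pathlib.PureWindowsPath, which only manipulates the path string and never touches the file system. A minimal sketch, with PurePosixPath shown for contrast:

```python
from pathlib import PurePosixPath, PureWindowsPath

paths = [r'C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py',
         '/home/username/Desktop/codes/demo.py',
         '/home/username/Desktop/../demo.py']

# PureWindowsPath splits on both backslashes and forward slashes.
windows_names = [PureWindowsPath(path).name for path in paths]
print(windows_names)  # ['demo.py', 'demo.py', 'demo.py']

# PurePosixPath treats the backslash as an ordinary character.
posix_name = PurePosixPath(paths[0]).name
print(posix_name)  # C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py
```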

💡Bonus Tip: You can also use Path("File Path").stem to get the file name without the file extension.

Example:

from pathlib import Path
file_path = r'C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py'
file_name = Path(file_path).stem
print(file_name)

# demo (when run on Windows)
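
Keep in mind that .stem removes only the last extension. The sketch below illustrates this with a hypothetical file name, backup.tar.gz, that is not part of the examples above:

```python
from pathlib import PurePosixPath

single = PurePosixPath('/home/username/Desktop/codes/demo.py')
# Hypothetical file with two extensions, used only for illustration.
double = PurePosixPath('/home/username/Desktop/codes/backup.tar.gz')

print(single.stem, single.suffix)  # demo .py
print(double.stem, double.suffix)  # backup.tar .gz
print(double.suffixes)             # ['.tar', '.gz']
```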

❖ Method 4: Using split()

If you would rather split the path yourself than call basename(), you can use the os.path.split() function. Like os.path.basename(), it follows the path conventions of the operating system the script runs on.

Example:

import os
file_path = r'C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py'
head, tail = os.path.split(file_path)
print(tail)

# demo.py (when run on Windows)

Explanation: In the above example, the os.path.split() function splits the entire path string into a head and tail pair. Here, tail stores the final path component, which is the base filename, and head stores everything that leads up to it. Therefore, the tail variable holds the name of the file that we need.

➀ A Quick Recap to split():
Not to be confused with os.path.split(), str.split() is a built-in string method that splits a string into a list based on the separator provided as an argument. If no argument is provided, the separator is any run of whitespace.

Learn more about the split() method here.
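
A short sketch of how the string method behaves, first without an argument and then with a forward slash as the separator:

```python
# Without an argument, split() breaks the string at runs of whitespace.
sentence = 'extract the   file name'
words = sentence.split()
print(words)  # ['extract', 'the', 'file', 'name']

# With a separator, split() breaks the string at each occurrence of it.
file_path = '/home/username/Desktop/codes/demo.py'
parts = file_path.split('/')
print(parts)      # ['', 'home', 'username', 'Desktop', 'codes', 'demo.py']
print(parts[-1])  # demo.py
```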

Alternatively, if you do not want to rely on any module, you can combine the strip() and split() string methods as shown below.

file_path = r'C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py'
f_name = file_path.strip('/').strip('\\').split('/')[-1].split('\\')[-1]
print(f_name)
# demo.py

Explanation: The strip() calls remove any leading or trailing forward slashes and backslashes, so a path that ends with a separator does not produce an empty result. The two split() calls then break the string at forward slashes and at backslashes, and taking the last element of each resulting list leaves the file name. This works for both Windows-style and POSIX-style paths, with the caveat that a backslash inside a POSIX file name is also treated as a separator.
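
The same idea can be wrapped in a reusable function. The following is a sketch under the assumption that both slashes and backslashes should always count as separators; the name name_from_path is hypothetical, and it uses rstrip() and replace() as a variation on the one-liner above:

```python
def name_from_path(path):
    # Hypothetical helper: drop trailing separators, normalize
    # backslashes to forward slashes, then keep the last piece.
    trimmed = path.rstrip('/\\')
    return trimmed.replace('\\', '/').split('/')[-1]


paths = [r'C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py',
         '/home/username/Desktop/codes/demo.py',
         '/home/username/Desktop/../demo.py']
names = [name_from_path(path) for path in paths]
print(names)  # ['demo.py', 'demo.py', 'demo.py']

# A trailing separator yields the last folder instead of an empty string.
print(name_from_path('/home/username/Desktop/codes/'))  # codes
```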

❖ Method 5: Using Regex

If you have a good grip on regular expressions, then here’s a regex-specific solution that will most probably work on any OS.

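import re


def base_name(path):
    # Match the last run of characters that are neither slashes nor backslashes,
    # allowing at most one trailing separator before the end of the string.
    basename = re.search(r'[^\\/]+(?=[\\/]?$)', path)
    # Return None when the path contains no name component.
    return basename.group(0) if basename else None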

paths = [r'C:\Users\SHUBHAM SAYON\Desktop\codes\demo.py',
         r'/home/username/Desktop/codes/demo.py',
         r'/home/username/Desktop/../demo.py']
print([base_name(path) for path in paths])

# ['demo.py', 'demo.py', 'demo.py']
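
The pattern has two parts: [^\\/]+ matches a run of characters that are neither backslashes nor forward slashes, and the lookahead (?=[\\/]?$) requires that run to be followed by at most one separator and then the end of the string. A short sketch of the edge cases, using a precompiled copy of the same pattern:

```python
import re

PATTERN = re.compile(r'[^\\/]+(?=[\\/]?$)')


def base_name(path):
    # Return the last path component, or None if there is no match.
    match = PATTERN.search(path)
    return match.group(0) if match else None


# A single trailing separator is tolerated.
print(base_name('/home/username/Desktop/codes/'))               # codes
print(base_name('C:\\Users\\SHUBHAM SAYON\\Desktop\\codes\\'))  # codes

# No name component, or more than one trailing separator: no match.
print(base_name('/'))                               # None
print(base_name('/home/username/Desktop/codes//'))  # None
```

If doubled trailing separators can occur in your input, strip them before matching.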


Conclusion

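To sum things up, you can use one of the following methods to extract the filename from a given path. Keep in mind that the os.path functions and pathlib.Path follow the conventions of the operating system the script runs on, whereas ntpath and the regex approach handle both separators on any platform: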

  • os.path.basename('path')
  • ntpath.basename()
  • pathlib.Path('path').name
  • os.path.split('path')
  • using regex
