In this Python tutorial, we will see how to list all files of a directory having a specific extension.
Sometimes we need to list files having a specific extension before performing any operation on them. For example, you might want to copy only text files from one location to another. In that case, we need to make sure we are only looking at files that have a .txt extension.
We will use three approaches: the glob module, os.listdir(), and os.walk().
How to list files in directory with extension txt
A file extension, or filename extension, is a suffix at the end of a file name. It comes after the period. The extension specifies a file type such as text, CSV, PDF, or image. For example, for a text file it is txt. For an image file it is jpg, jpeg, or bmp.
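To make the idea of an extension concrete, here is a minimal sketch that uses os.path.splitext() from the standard library to pull the extension out of a few names. The file names are hypothetical and exist only for illustration; nothing is read from disk.

```python
import os

# Hypothetical file names, used only for illustration
names = ["sales.txt", "photo.JPG", "archive.tar.gz", "README"]

# os.path.splitext() splits a name into (root, extension);
# the extension keeps its leading period, and only the last suffix counts
extensions = [os.path.splitext(name)[1] for name in names]

print(extensions)  # ['.txt', '.JPG', '.gz', '']
```

Note that the extension is returned with its original letter case, and a name without a period has an empty extension.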
Here are the steps to get the list of files having the txt extension using the glob module.
- Import the glob module
The glob module, part of the Python Standard Library, finds files and folders whose names follow a specific pattern. The matching rules are similar to the Unix shell path expansion rules.
- Construct a pattern to search for files having the specific extension
For example, directory_path/*.txt lists all text files present in the given directory path. Here the * means the file name can be anything, but it must have a txt extension.
- Use the glob() function
The glob.glob(pathname) function returns a list of paths that match the pattern specified in the pathname argument. In this case, it returns all text files.
Example: list files in directory with extension txt
The following text files are present in my current working directory.
sales.txt profit.txt sample.txt
Example 1: List all txt files present in the ‘account’ directory.
import glob
# absolute path pattern to search all text files inside a specific folder
path = r'E:/account/*.txt'
files = glob.glob(path)
print(files)
Output:
['E:/account\\profit.txt', 'E:/account\\sales.txt', 'E:/account\\sample.txt']
If you want to list files from the current directory, then use glob.glob('./*.txt').
Note: glob.glob() matches the names against the pattern for you. It does not invoke a shell; internally it relies on os.scandir() and fnmatch, so you do not have to write the loop and the extension check yourself.
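The following is a self-contained sketch of the same glob technique that runs on any machine. It assumes a throwaway directory created with tempfile and a few hypothetical file names; the pattern is built with os.path.join() so the separator is correct on every platform, and the result is sorted because glob.glob() returns matches in arbitrary order.

```python
import glob
import os
import tempfile

# Build a throwaway directory so the example runs anywhere
with tempfile.TemporaryDirectory() as dir_path:
    # Hypothetical files: two text files and one CSV file
    for name in ("sales.txt", "profit.txt", "notes.csv"):
        open(os.path.join(dir_path, name), "w").close()

    # Join the directory and the wildcard pattern in a portable way
    pattern = os.path.join(dir_path, "*.txt")

    # glob.glob() returns matches in arbitrary order, so sort them
    txt_files = sorted(glob.glob(pattern))

    # Keep only the file names for display
    txt_names = [os.path.basename(path) for path in txt_files]

print(txt_names)  # ['profit.txt', 'sales.txt']
```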
os module to list files in directory with extension
The os module provides functions for working with operating system-dependent functionality in Python.
Use the steps below:
- Use the os.listdir('path') function to get the names of all entries in a directory. This function returns the names of the files and directories present in the directory.
- Next, use a for loop to iterate over the names in the list.
- Next, use an if condition in each iteration to check if the file name ends with the txt extension. If yes, add it to the final list.
Example:
import os
# folder path
dir_path = r'E:\account'
# list to store files
res = []
# Iterate directory
for file in os.listdir(dir_path):
    # check only text files
    if file.endswith('.txt'):
        res.append(file)
print(res)
Output:
['profit.txt', 'sales.txt', 'sample.txt']
Note: This solution checks every entry in the directory one by one in Python code, which adds overhead if the directory contains many files. For a simple extension filter, the glob module is the more concise choice.
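One thing the explicit loop makes easy is adding extra conditions. The sketch below is one possible variation, not part of the original example: a hypothetical helper named list_files_with_extension that skips directories (os.listdir() returns them too) and compares the extension case-insensitively. The temporary directory and file names are assumptions made so the code runs anywhere.

```python
import os
import tempfile


def list_files_with_extension(dir_path, extension):
    """Return sorted names of regular files in dir_path ending with extension."""
    extension = extension.lower()
    return sorted(
        name
        for name in os.listdir(dir_path)
        # skip directories whose names happen to end with the extension
        if os.path.isfile(os.path.join(dir_path, name))
        # compare in lower case so .TXT and .txt both match
        and name.lower().endswith(extension)
    )


with tempfile.TemporaryDirectory() as dir_path:
    for name in ("sales.txt", "REPORT.TXT", "data.csv"):
        open(os.path.join(dir_path, name), "w").close()
    # a directory, not a file, so it should not appear in the result
    os.mkdir(os.path.join(dir_path, "backup.txt"))

    result = list_files_with_extension(dir_path, ".txt")

print(result)  # ['REPORT.TXT', 'sales.txt']
```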
List files in directory and subdirectories with extension txt
We can use the following two approaches:
- glob module
- os.walk() function
Glob module to list files from subdirectories with txt extension
Set the recursive argument of the glob() function to True to list text files from subdirectories.
Recursive search with the glob module requires Python 3.5+. If you are using an older version of Python, use the os.walk() function instead.
The glob module supports the ** pattern. When you call glob.glob('**/*.txt', recursive=True), the ** matches any files and zero or more directories and subdirectories, so glob() looks for text files in the given directory and in everything beneath it.
Example:
import glob
# absolute path pattern to search all text files inside a folder and its subfolders
path = r'E:/account/**/*.txt'
files = glob.glob(path, recursive=True)
print(files)
Output:
['E:/account\\profit.txt', 'E:/account\\sales.txt', 'E:/account\\sample.txt', 'E:/account\\reports_2021\\december_2021.txt']
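To see what the recursive flag changes, here is a small sketch that runs the same ** pattern with and without recursive=True. The temporary directory tree and its file names are assumptions made so the code runs anywhere; paths are shown relative to the top folder.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Hypothetical tree: one text file at the top, one in a subfolder
    os.makedirs(os.path.join(top, "reports_2021"))
    for rel in ("sales.txt", os.path.join("reports_2021", "december_2021.txt")):
        open(os.path.join(top, rel), "w").close()

    pattern = os.path.join(top, "**", "*.txt")

    # With recursive=True, ** matches zero or more directory levels
    recursive = sorted(
        os.path.relpath(path, top) for path in glob.glob(pattern, recursive=True)
    )

    # Without it, ** behaves like a single *, i.e. exactly one directory level
    non_recursive = sorted(os.path.relpath(path, top) for path in glob.glob(pattern))

print(recursive)
print(non_recursive)
```

The recursive call finds both files, while the non-recursive call only finds the file that sits exactly one folder below the top directory.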
os.walk() to list files in directory and subdirectories with extension txt
os.walk() is a generator that walks a directory tree. For each directory it visits, it yields a tuple of values (current_path, directories in current_path, files in current_path), and it then descends into each of those directories until no further subdirectories remain under the initial directory.
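As an illustration of those tuples, the sketch below prints what os.walk() yields for a tiny directory tree. The temporary directory and the file names are assumptions made so the code runs anywhere, and the directory paths are shown relative to the top folder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Hypothetical tree: one text file at the top, one in a subfolder
    os.makedirs(os.path.join(top, "reports_2021"))
    open(os.path.join(top, "sales.txt"), "w").close()
    open(os.path.join(top, "reports_2021", "december_2021.txt"), "w").close()

    visited = []
    # By default os.walk() visits the top directory first, then its subdirectories
    for dirpath, dirnames, filenames in os.walk(top):
        entry = (os.path.relpath(dirpath, top), sorted(dirnames), sorted(filenames))
        visited.append(entry)
        print(entry)

# ('.', ['reports_2021'], ['sales.txt'])
# ('reports_2021', [], ['december_2021.txt'])
```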
- Call the os.walk('path') function. For each directory it visits, it yields the directory path and two lists. The first list contains the subdirectory names, and the second list contains the file names.
- Next, iterate over the list of files using a for loop.
- Next, use an if condition in each iteration to check if the file name ends with the txt extension. If yes, add it to the final list.
Example:
import os
# list to store txt files
res = []
# os.walk() yields the current directory, its subdirectories, and its files,
# then follows each subdirectory recursively until the last directory
for root, dirs, files in os.walk(r"E:/account"):
    for file in files:
        if file.endswith(".txt"):
            res.append(os.path.join(root, file))
print(res)
Output:
['E:/account\\profit.txt', 'E:/account\\sales.txt', 'E:/account\\sample.txt', 'E:/account\\reports_2021\\december_2021.txt']
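The same loop can be wrapped in a reusable function. The sketch below is one possible generalization, not part of the original example: a hypothetical generator named find_files that accepts several extensions at once, relying on the fact that str.endswith() also accepts a tuple of suffixes. The temporary directory tree is an assumption made so the code runs anywhere.

```python
import os
import tempfile


def find_files(top, extensions):
    """Yield paths under top whose names end with any of the given extensions."""
    # str.endswith() accepts a tuple, so several extensions need only one check
    suffixes = tuple(ext.lower() for ext in extensions)
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.lower().endswith(suffixes):
                yield os.path.join(root, name)


with tempfile.TemporaryDirectory() as top:
    # Hypothetical tree with text and image files
    os.makedirs(os.path.join(top, "reports_2021"))
    for rel in ("sales.txt", "logo.jpg", os.path.join("reports_2021", "december_2021.txt")):
        open(os.path.join(top, rel), "w").close()

    # Consume the generator while the temporary directory still exists
    txt_only = sorted(os.path.relpath(p, top) for p in find_files(top, [".txt"]))
    txt_and_jpg = sorted(os.path.relpath(p, top) for p in find_files(top, [".txt", ".jpg"]))

print(txt_only)
print(txt_and_jpg)
```

Because find_files is a generator, paths are produced one at a time, which can help when the tree is large and you only need to process files as they are found.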