Python glob Module

The glob module in Python provides a convenient way to search for files matching a specified pattern. It uses Unix shell-style wildcards for pattern matching, making it easy to locate files and directories.

Table of Contents

  1. Introduction
  2. Key Functions
    • glob
    • iglob
  3. Wildcards
    • *
    • ?
    • []
  4. Examples
    • Basic Usage
    • Recursive Search
    • Using iglob for Memory Efficiency
  5. Real-World Use Case
  6. Conclusion

Introduction

The glob module performs pattern matching and file path expansion using Unix shell-style wildcards. It simplifies locating files and directories that match a specific pattern, which makes it useful for file manipulation tasks.

Key Functions

glob

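Returns a list of paths matching a pathname pattern. The list is empty when nothing matches, and its order is not guaranteed, so wrap the call in sorted() if order matters.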

import glob

# Get all .txt files in the current directory
files = glob.glob('*.txt')
print(files)  # ['file1.txt', 'file2.txt']

iglob

Returns an iterator which yields the same values as glob() without storing them all simultaneously.

import glob

# Get all .txt files in the current directory using an iterator
for file in glob.iglob('*.txt'):
    print(file)
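To make the difference from glob() concrete, here is a minimal sketch that takes only the first two matches from iglob(). The temporary directory and the note0.txt to note4.txt file names are made up for the demonstration; only glob.iglob and itertools.islice are real APIs.

```python
import glob
import itertools
import os
import tempfile

# Build a throwaway directory with five empty files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    for i in range(5):
        open(os.path.join(tmp, f"note{i}.txt"), "w").close()

    matches = glob.iglob(os.path.join(tmp, "*.txt"))

    # iglob hands back an iterator, not a list.
    is_list = isinstance(matches, list)

    # Pull just two results; the full result list is never built.
    first_two = list(itertools.islice(matches, 2))

print(is_list)         # False
print(len(first_two))  # 2
```

Which two files come back depends on the file system, since the order of matches is not guaranteed.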

Wildcards

*

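Matches zero or more characters within a single path component. It does not match the path separator, and by default it skips names that begin with a dot.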

import glob

# Match all .txt files
files = glob.glob('*.txt')
print(files)  # ['file1.txt', 'file2.txt']
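The handling of dot-files is easy to overlook, so the following sketch checks it in a temporary directory. The file names notes.txt and .hidden.txt are invented for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # One regular file and one hidden (dot-prefixed) file.
    for name in ("notes.txt", ".hidden.txt"):
        open(os.path.join(tmp, name), "w").close()

    def names(pattern):
        """Return the sorted base names of everything matching pattern in tmp."""
        hits = glob.glob(os.path.join(tmp, pattern))
        return sorted(os.path.basename(p) for p in hits)

    star = names("*.txt")    # '*' does not match a leading dot
    dot = names(".*.txt")    # an explicit leading dot in the pattern does

print(star)  # ['notes.txt']
print(dot)   # ['.hidden.txt']
```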

?

Matches exactly one character.

import glob

# Match .txt files whose name before the extension is exactly one character
files = glob.glob('?.txt')
print(files)  # ['a.txt', 'b.txt']

[]

Matches any one of the enclosed characters. Ranges such as [a-z] and negation with [!...] are also supported.

import glob

# Match all .txt files starting with either a or b
files = glob.glob('[ab]*.txt')
print(files)  # ['a.txt', 'b.txt', 'ab.txt']
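Bracket expressions also accept ranges and negation, and glob.escape() helps when a file name itself contains brackets. The sketch below tries each form against a few made-up file names in a temporary directory.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names chosen to exercise each pattern.
    for name in ("a.txt", "b.txt", "c.txt", "1.txt", "data[1].txt"):
        open(os.path.join(tmp, name), "w").close()

    def names(pattern):
        """Return the sorted base names of everything matching pattern in tmp."""
        hits = glob.glob(os.path.join(tmp, pattern))
        return sorted(os.path.basename(p) for p in hits)

    in_range = names("[a-c].txt")     # any one character from a to c
    digits = names("[0-9].txt")       # any one digit
    negated = names("[!a].txt")       # any one character except a
    unescaped = names("data[1].txt")  # '[1]' is read as a character class
    literal = names(glob.escape("data[1].txt"))  # brackets taken literally

print(in_range)   # ['a.txt', 'b.txt', 'c.txt']
print(digits)     # ['1.txt']
print(negated)    # ['1.txt', 'b.txt', 'c.txt']
print(unescaped)  # []
print(literal)    # ['data[1].txt']
```

The unescaped pattern looks for a file called data1.txt, which does not exist here, so it returns nothing.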

Examples

Basic Usage

import glob

# Get all Python files in the current directory
python_files = glob.glob('*.py')
print(python_files)  # ['script1.py', 'script2.py']

Recursive Search

import glob

# Get all .txt files in the current directory and subdirectories
files = glob.glob('**/*.txt', recursive=True)
print(files)  # ['dir1/file1.txt', 'dir2/file2.txt', 'dir1/dir3/file3.txt']
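Two details of the recursive pattern are worth checking: with recursive=True, ** matches zero or more directories, so files at the top level are included too, and without the flag ** behaves like a plain *. The sketch below shows both in a temporary directory tree whose names are invented for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: top.txt, dir1/file1.txt, dir1/dir3/file3.txt
    os.makedirs(os.path.join(tmp, "dir1", "dir3"))
    for rel in ("top.txt",
                os.path.join("dir1", "file1.txt"),
                os.path.join("dir1", "dir3", "file3.txt")):
        open(os.path.join(tmp, rel), "w").close()

    def relative(pattern, recursive):
        """Return sorted matches as '/'-separated paths relative to tmp."""
        hits = glob.glob(os.path.join(tmp, pattern), recursive=recursive)
        return sorted(os.path.relpath(p, tmp).replace(os.sep, "/") for p in hits)

    with_flag = relative("**/*.txt", recursive=True)
    without_flag = relative("**/*.txt", recursive=False)

print(with_flag)     # ['dir1/dir3/file3.txt', 'dir1/file1.txt', 'top.txt']
print(without_flag)  # ['dir1/file1.txt']
```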

Using iglob for Memory Efficiency

import glob

# Use iglob to iterate over matching files without loading them all at once
for file in glob.iglob('**/*.py', recursive=True):
    print(file)

Real-World Use Case

Finding and Processing Log Files

import glob

# Get all log files in the logs directory and its subdirectories
log_files = glob.glob('logs/**/*.log', recursive=True)

# Process each log file
for log_file in log_files:
    with open(log_file, 'r') as f:
        content = f.read()
        # Perform some processing on the content
        print(f"Processing {log_file}: {content[:100]}...")  # Print the first 100 characters of each log file
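When log files are large, reading each one fully into memory can be avoided. The following variant is a sketch under assumptions rather than a drop-in replacement: the helper count_matching_lines, the "ERROR" marker, and the sample server.log are all hypothetical. It combines iglob() with line-by-line reading.

```python
import glob
import os
import tempfile

def count_matching_lines(pattern, needle):
    """Return {path: number of lines containing needle} for files matching pattern."""
    counts = {}
    for path in glob.iglob(pattern, recursive=True):
        if not os.path.isfile(path):
            continue  # a pattern can also match directories
        # Iterate over the file object so only one line is held at a time.
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            counts[path] = sum(1 for line in f if needle in line)
    return counts

# Demonstration on a throwaway logs/ tree with one made-up log file.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "logs", "app"))
    sample = os.path.join(tmp, "logs", "app", "server.log")
    with open(sample, "w", encoding="utf-8") as f:
        f.write("INFO start\nERROR disk full\nERROR timeout\n")

    result = count_matching_lines(os.path.join(tmp, "logs", "**", "*.log"), "ERROR")
    error_counts = sorted(result.values())

print(error_counts)  # [2]
```

Passing errors="replace" is a design choice for this sketch: it keeps the loop running when a log contains bytes that are not valid UTF-8.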

Conclusion

The glob module in Python provides a simple way to find files by shell-style pattern: glob() returns a list of matches, iglob() yields them one at a time, and recursive=True lets ** search subdirectories. It simplifies file manipulation tasks, and iglob() keeps memory use low when a pattern matches a large number of files.
