Python os Module: File and Directory Operations Guide

Name: Soren Atelier

Updated on 2/10/2026

Every Python script that reads a config file, saves output, organizes data, or automates deployment needs to interact with the file system. But file paths behave differently on Windows, macOS, and Linux. A hardcoded /home/user/data path breaks on Windows. Manually concatenating strings with + and / leads to double-slash bugs and missing separators. And without proper checks, your script might delete the wrong file or crash when a directory already exists.

Python's os module solves these problems. It provides a portable, cross-platform interface for working with the operating system -- creating directories, listing files, reading environment variables, manipulating paths, and walking directory trees. It ships with every Python installation, requires no pip install, and handles platform differences automatically.

This guide covers every essential function in the os module, organized by use case, with practical examples you can copy directly into your projects.

Getting the Current Working Directory

Before doing anything with files, you need to know where your script is running. os.getcwd() returns the current working directory as an absolute path.

import os
 
# Get current working directory
cwd = os.getcwd()
print(cwd)  # e.g., /home/user/projects/myapp
 
# Change working directory
os.chdir('/tmp')
print(os.getcwd())  # /tmp
 
# Change back
os.chdir(cwd)

Use os.getcwd() when building relative paths or when you need to restore the working directory after temporarily changing it.
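
The save-and-restore pattern above can be wrapped in a context manager so the original directory comes back even if the code in between raises. This is a minimal sketch: `working_directory` is a hypothetical helper name, not part of the os module. On Python 3.11 and later, `contextlib.chdir` offers similar behavior out of the box.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory (hypothetical helper)."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body raises, so the original directory is restored
        os.chdir(previous)

start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        inside = os.getcwd()   # now inside the temporary directory
after = os.getcwd()            # back where we started
```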

Directory Operations

Creating Directories

os.mkdir() creates a single directory. It raises FileExistsError if the directory already exists, and FileNotFoundError if the parent directory does not exist.

import os
 
# Create a single directory
os.mkdir('output')
 
# Create only if it doesn't exist
if not os.path.exists('output'):
    os.mkdir('output')
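
Checking with os.path.exists() and then calling os.mkdir() leaves a small window in which another process could create the directory first. One alternative is to attempt the call and handle FileExistsError. The sketch below assumes that approach; `ensure_dir` is a hypothetical helper name, and the demo runs in a temporary directory so it leaves nothing behind.

```python
import os
import tempfile

def ensure_dir(path):
    """Create path if needed; return True if it was created (hypothetical helper)."""
    try:
        os.mkdir(path)
        return True
    except FileExistsError:
        # Something already exists at this path; re-raise unless it is a directory
        if not os.path.isdir(path):
            raise
        return False

with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, 'output')
    first = ensure_dir(target)    # directory is created
    second = ensure_dir(target)   # directory already exists
```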

os.makedirs() creates directories recursively -- including all missing parent directories. The exist_ok=True parameter prevents errors when the directory already exists.

import os
 
# Create nested directories in one call
os.makedirs('data/raw/2026/january', exist_ok=True)
 
# Without exist_ok=True, this raises FileExistsError if the target directory
# ('data/raw/2026/january') already exists; existing parents are not a problem
# os.makedirs('data/raw/2026/january')  # may raise error

Listing Directory Contents

os.listdir() returns a list of all entries (files and directories) in a given path.

import os
 
# List everything in the current directory
entries = os.listdir('.')
print(entries)  # ['main.py', 'data', 'output', 'README.md']
 
# List contents of a specific directory
data_files = os.listdir('/var/log')
print(data_files)

os.scandir() is a more efficient alternative that returns DirEntry objects with cached file attributes. Use it when you need file metadata alongside names.

import os
 
with os.scandir('.') as entries:
    for entry in entries:
        info = entry.stat()
        print(f"{entry.name:30s}  {'DIR' if entry.is_dir() else 'FILE':4s}  {info.st_size} bytes")
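
As a further illustration of using DirEntry metadata, the sketch below ranks the regular files in one directory by size. `largest_files` and its `limit` parameter are hypothetical names chosen for this example, and the sample files are created in a temporary directory.

```python
import os
import tempfile

def largest_files(directory, limit=3):
    """Return (name, size) pairs for the biggest regular files in directory."""
    sizes = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # follow_symlinks=False so symbolic links are not counted as files
            if entry.is_file(follow_symlinks=False):
                sizes.append((entry.name, entry.stat(follow_symlinks=False).st_size))
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)[:limit]

with tempfile.TemporaryDirectory() as tmp:
    for name, length in [('a.txt', 10), ('b.txt', 300), ('c.txt', 50)]:
        with open(os.path.join(tmp, name), 'wb') as f:
            f.write(b'x' * length)
    os.mkdir(os.path.join(tmp, 'subdir'))   # directories are skipped
    result = largest_files(tmp, limit=2)

print(result)  # [('b.txt', 300), ('c.txt', 50)]
```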

Removing Directories

os.rmdir() removes an empty directory. For non-empty directories, use shutil.rmtree() instead.

import os
import shutil
 
# Remove an empty directory
os.rmdir('output')
 
# Remove a directory and all its contents (use with caution)
shutil.rmtree('data/raw/2026')

File Operations

Removing Files

os.remove() (or its alias os.unlink()) deletes a single file. It raises FileNotFoundError if the file does not exist.

import os
 
# Delete a file
os.remove('temp_output.csv')
 
# Safe deletion with existence check
filepath = 'temp_output.csv'
if os.path.exists(filepath):
    os.remove(filepath)
    print(f"Deleted {filepath}")
else:
    print(f"{filepath} not found")
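
The existence check above can also be replaced by catching FileNotFoundError, which avoids the case where the file disappears between the check and the delete. A minimal sketch, assuming a hypothetical helper named `remove_if_exists`:

```python
import os
import tempfile

def remove_if_exists(path):
    """Delete path; return True if a file was removed, False if it was absent."""
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return False

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'temp_output.csv')
    with open(target, 'w') as f:
        f.write('col1,col2\n')
    first = remove_if_exists(target)    # file existed and was deleted
    second = remove_if_exists(target)   # nothing left to delete
```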

Renaming and Moving Files

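os.rename() renames or moves a file or directory. If the destination already exists, behavior is platform-dependent: on Unix an existing file is silently replaced (when permissions allow), while on Windows FileExistsError is raised. Use os.replace() to overwrite the destination on every platform; on POSIX systems a successful replacement is atomic. Both functions can fail when the source and destination are on different file systems, in which case shutil.move() is the usual fallback.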

import os
 
# Rename a file
os.rename('old_report.csv', 'new_report.csv')
 
# Move a file to a different directory
os.rename('report.csv', 'archive/report_2026.csv')
 
# Atomic replacement (overwrites destination on all platforms)
os.replace('new_data.csv', 'data.csv')
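
A common use of os.replace() is writing a file "all or nothing": write the new contents to a temporary file in the same directory, then swap it into place. The sketch below is one way to do that under stated assumptions; `atomic_write_text` is a hypothetical helper, and the atomicity of the swap is a POSIX guarantee.

```python
import os
import tempfile

def atomic_write_text(path, text):
    """Write text to path so readers never see a half-written file (sketch)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the destination directory so that
    # os.replace() stays on one file system
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as tmp:
            tmp.write(text)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, 'data.csv')
    atomic_write_text(target, 'col1,col2\n1,2\n')
    atomic_write_text(target, 'col1,col2\n3,4\n')   # overwrites the first version
    with open(target) as f:
        content = f.read()
    leftovers = os.listdir(workdir)                 # no stray .tmp files remain
```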

Getting File Information

os.stat() returns detailed file metadata including size, permissions, and timestamps.

import os
from datetime import datetime
 
info = os.stat('data.csv')
print(f"Size: {info.st_size} bytes")
print(f"Modified: {datetime.fromtimestamp(info.st_mtime)}")
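# st_ctime is the creation time on Windows but the last metadata change on Unix
print(f"Created or metadata changed: {datetime.fromtimestamp(info.st_ctime)}")
# st_mode combines the file type with the permission bits
print(f"Mode: {oct(info.st_mode)}")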
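
The st_mode field packs the file type and the permission bits into one integer, so it is usually decoded with the standard library's stat module. The sketch below shows a few of its helpers on a throwaway file. The exact permission value depends on your platform and umask, so no expected output is shown.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.csv')
    with open(path, 'w') as f:
        f.write('col1,col2\n')

    info = os.stat(path)
    is_regular = stat.S_ISREG(info.st_mode)        # True for an ordinary file
    is_directory = stat.S_ISDIR(info.st_mode)      # False here
    permission_bits = stat.S_IMODE(info.st_mode)   # permission bits only
    listing_style = stat.filemode(info.st_mode)    # ls-style string such as -rw-r--r--

print(oct(permission_bits))
print(listing_style)
```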

Path Operations with os.path

The os.path submodule is where most day-to-day file system work happens. It handles path construction, validation, and decomposition across platforms.

Building Paths Safely

Never concatenate paths with +. Use os.path.join() to build paths correctly regardless of operating system.

import os
 
# Correct: os.path.join handles separators
path = os.path.join('data', 'raw', 'sales.csv')
print(path)  # data/raw/sales.csv on Unix, data\raw\sales.csv on Windows
 
# Fragile: hand-written separators are easy to double up or leave out
bad_path = 'data' + '/' + 'raw' + '/' + 'sales.csv'  # yields mixed separators when combined with Windows-style paths

Checking Existence and Type

import os
 
# Check if a path exists (file or directory)
print(os.path.exists('/etc/hosts'))      # True (on Linux/macOS)
 
# Check specifically for a file
print(os.path.isfile('main.py'))         # True
print(os.path.isfile('data'))            # False (it's a directory)
 
# Check specifically for a directory
print(os.path.isdir('data'))             # True
print(os.path.isdir('main.py'))          # False

Decomposing Paths

Extract components from a file path without manual string splitting.

import os
 
filepath = '/home/user/projects/report_final.csv'
 
# Get the filename
print(os.path.basename(filepath))    # 'report_final.csv'
 
# Get the directory
print(os.path.dirname(filepath))     # '/home/user/projects'
 
# Split into directory and filename
directory, filename = os.path.split(filepath)
print(directory)   # '/home/user/projects'
print(filename)    # 'report_final.csv'
 
# Split filename and extension
name, ext = os.path.splitext(filepath)
print(name)  # '/home/user/projects/report_final'
print(ext)   # '.csv'
 
# Get absolute path from relative path
print(os.path.abspath('data.csv'))   # '/home/user/projects/data.csv'
 
# Resolve user home directory
print(os.path.expanduser('~/Documents'))  # '/home/user/Documents'
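
Two details of os.path.splitext() are worth seeing in action: it splits off only the last extension, and it treats a leading dot as part of the name. The sketch below also shows a small, hypothetical helper (`with_extension`) for deriving an output filename from an input path.

```python
import os

def with_extension(filepath, new_ext):
    """Return filepath with its final extension swapped for new_ext."""
    root, _old_ext = os.path.splitext(filepath)
    return root + new_ext

converted = with_extension('/home/user/projects/report_final.csv', '.json')
print(converted)   # /home/user/projects/report_final.json

# Only the last extension is split off
double_ext = os.path.splitext('backup.tar.gz')
print(double_ext)  # ('backup.tar', '.gz')

# A leading dot is part of the name, not an extension
dotfile = os.path.splitext('.bashrc')
print(dotfile)     # ('.bashrc', '')
```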

Path Operations Quick Reference

Function	Purpose	Example Output
`os.path.join('a', 'b.txt')`	Build path	`'a/b.txt'`
`os.path.exists(path)`	Path exists?	`True` / `False`
`os.path.isfile(path)`	Is a file?	`True` / `False`
`os.path.isdir(path)`	Is a directory?	`True` / `False`
`os.path.basename(path)`	Filename only	`'report.csv'`
`os.path.dirname(path)`	Directory only	`'/home/user'`
`os.path.splitext(path)`	Split name + extension	`('report', '.csv')`
`os.path.abspath(path)`	Absolute path	`'/full/path/to/file'`
`os.path.getsize(path)`	File size in bytes	`4096`
`os.path.expanduser('~')`	Home directory	`'/home/user'`

Environment Variables

The os module provides direct access to system environment variables -- essential for reading configuration, API keys, and deployment settings.

import os
 
# Read an environment variable (returns None if not set)
db_host = os.getenv('DATABASE_HOST')
print(db_host)
 
# Read with a default value
db_port = os.getenv('DATABASE_PORT', '5432')
print(db_port)  # '5432' if DATABASE_PORT is not set
 
# Access via os.environ dictionary (raises KeyError if missing)
try:
    secret = os.environ['API_SECRET']
except KeyError:
    print("API_SECRET not configured")
 
# Set an environment variable for this process and any child processes it starts
# (values must be strings)
os.environ['APP_MODE'] = 'production'
 
# List all environment variables
for key, value in os.environ.items():
    print(f"{key}={value}")

The difference between os.getenv() and os.environ[] matters: getenv returns None (or a default) when the variable is missing, while os.environ[] raises a KeyError. Use getenv for optional configuration, os.environ for required settings that should fail loudly when absent.
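
Environment variable values are always strings, so numeric settings need an explicit conversion. The sketch below shows one way to combine a default with validation. `env_int` and the `DEMO_DATABASE_PORT` variable are hypothetical names used only for this example.

```python
import os

def env_int(name, default):
    """Read an integer setting from the environment (hypothetical helper)."""
    raw = os.getenv(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        # Fail with a message that names the offending variable
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None

os.environ['DEMO_DATABASE_PORT'] = '6543'
port = env_int('DEMO_DATABASE_PORT', 5432)       # 6543, converted from the string

del os.environ['DEMO_DATABASE_PORT']
fallback = env_int('DEMO_DATABASE_PORT', 5432)   # 5432, the default
```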

Walking Directory Trees with os.walk

os.walk() recursively traverses a directory tree, yielding a 3-tuple (dirpath, dirnames, filenames) for each directory it visits.

import os
 
# Walk through a project directory
for dirpath, dirnames, filenames in os.walk('/home/user/project'):
    # Skip hidden directories
    dirnames[:] = [d for d in dirnames if not d.startswith('.')]
 
    print(f"\nDirectory: {dirpath}")
    print(f"  Subdirectories: {dirnames}")
    print(f"  Files: {filenames}")

The dirnames[:] in-place modification controls which subdirectories os.walk enters. This is a powerful pattern for skipping .git, __pycache__, or node_modules directories.
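
To make the pruning pattern concrete, the sketch below skips a fixed set of directory names while collecting file paths. `SKIP_DIRS` and `walk_source_files` are hypothetical names. Pruning relies on the default top-down traversal; it has no effect with `topdown=False`.

```python
import os
import tempfile

SKIP_DIRS = {'.git', '__pycache__', 'node_modules'}

def walk_source_files(root):
    """Yield file paths under root, never descending into SKIP_DIRS."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Slice assignment edits the list os.walk holds, so pruning takes effect
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for filename in filenames:
            yield os.path.join(dirpath, filename)

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'src'))
    os.makedirs(os.path.join(root, 'node_modules', 'pkg'))
    for relative in [('src', 'main.py'), ('node_modules', 'pkg', 'index.js')]:
        with open(os.path.join(root, *relative), 'w') as f:
            f.write('')
    found = sorted(os.path.relpath(p, root) for p in walk_source_files(root))

print(found)  # only the file under src is listed
```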

Calculate Total Directory Size

import os
 
def get_directory_size(path):
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for filename in filenames:
            filepath = os.path.join(dirpath, filename)
            # Skip symbolic links
            if not os.path.islink(filepath):
                total += os.path.getsize(filepath)
    return total
 
size_bytes = get_directory_size('/home/user/project')
size_mb = size_bytes / (1024 * 1024)
print(f"Total size: {size_mb:.2f} MB")

Common Patterns

Find All Files by Extension

import os
 
def find_files(directory, extension):
    """Recursively find all files with a given extension."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(directory):
        for filename in filenames:
            if filename.endswith(extension):
                matches.append(os.path.join(dirpath, filename))
    return matches
 
# Find all Python files
python_files = find_files('/home/user/project', '.py')
for f in python_files:
    print(f)
 
# Find all CSV data files
csv_files = find_files('data', '.csv')
print(f"Found {len(csv_files)} CSV files")

Create a Nested Output Structure

import os
 
def setup_project_dirs(base_path):
    """Create a standard project directory structure."""
    dirs = [
        'data/raw',
        'data/processed',
        'data/output',
        'logs',
        'config',
        'reports/figures',
    ]
    for d in dirs:
        full_path = os.path.join(base_path, d)
        os.makedirs(full_path, exist_ok=True)
        print(f"Created: {full_path}")
 
setup_project_dirs('my_project')

Safe File Operations with Temporary Files

import os
import tempfile
 
# Create a temporary file; delete=False keeps it on disk after the with block,
# so it must be removed manually
with tempfile.NamedTemporaryFile(mode='w', suffix='.csv', delete=False) as tmp:
    tmp.write('col1,col2\n1,2\n3,4\n')
    tmp_path = tmp.name
 
print(f"Temp file: {tmp_path}")
print(f"Exists: {os.path.exists(tmp_path)}")
 
# Clean up
os.remove(tmp_path)

os.path vs pathlib: Which Should You Use?

Python 3.4 introduced pathlib as an object-oriented alternative to os.path. Both work, but they have different strengths.

Feature	os.path	pathlib
Style	Functional (string-based)	Object-oriented
Available since	Python 2	Python 3.4+
Path concatenation	`os.path.join('a', 'b')`	`Path('a') / 'b'`
Check existence	`os.path.exists(p)`	`p.exists()`
Read file	`open(os.path.join(d, f))`	`Path(d, f).read_text()`
Glob patterns	`import glob; glob.glob(...)`	`Path('.').glob('*.py')`
Recursive glob	`os.walk()` + filter	`Path('.').rglob('*.py')`
File extension	`os.path.splitext(f)[1]`	`p.suffix`
File stem	`os.path.splitext(os.path.basename(f))[0]`	`p.stem`
Cross-platform	Yes	Yes
Third-party compatibility	Universal	Most libraries accept Path objects

When to use os.path: Legacy codebases, scripts that must support Python 2, or when working with libraries that only accept string paths.

When to use pathlib: New projects, when you want cleaner syntax, or when chaining multiple path operations.

# os.path approach
import os
config_path = os.path.join(os.path.expanduser('~'), '.config', 'myapp', 'settings.json')
if os.path.isfile(config_path):
    with open(config_path) as f:
        data = f.read()
 
# pathlib approach
from pathlib import Path
config_path = Path.home() / '.config' / 'myapp' / 'settings.json'
if config_path.is_file():
    data = config_path.read_text()

Both approaches are valid. The os module is not deprecated and remains the standard for process-level operations, environment variables, and low-level system calls. pathlib simply offers a more ergonomic API for path manipulation.
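
Mixing the two is straightforward because os functions accept Path objects directly, and os.fspath() converts a Path back to a plain string for libraries that require one. A minimal sketch, using a temporary directory and illustrative file names:

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / 'reports' / 'summary.txt'

    # os functions accept Path objects (any os.PathLike) directly
    os.makedirs(report.parent, exist_ok=True)
    report.write_text('ok')
    size = os.path.getsize(report)

    # os.fspath() returns the plain string form of the path
    as_string = os.fspath(report)
    same_name = os.path.basename(as_string) == report.name
```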

Cross-Platform Considerations

The os module automatically adapts to the current operating system, but there are details worth knowing.

import os
 
# os.sep is the platform path separator
print(os.sep)       # / on Unix, \ on Windows
 
# os.linesep is the platform line ending
print(repr(os.linesep))  # '\n' on Unix, '\r\n' on Windows
 
# os.name identifies the OS family
print(os.name)      # 'posix' on Linux/macOS, 'nt' on Windows
 
# os.path.join handles separators automatically
path = os.path.join('data', 'output', 'results.csv')
# 'data/output/results.csv' on Unix
# 'data\\output\\results.csv' on Windows

Key rules for cross-platform scripts:

Always use os.path.join() instead of string concatenation with / or \\.
Use os.path.expanduser('~') instead of hardcoding /home/username.
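When writing files in text mode, write '\n' and let Python translate it to the platform line ending. Do not write os.linesep to a text-mode file: on Windows the '\n' inside it is translated again, producing '\r\r\n'.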
Test your path logic on both platforms if your script will be shared.
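
Testing path logic for the other platform does not always require a second machine. os.path is an alias for either posixpath or ntpath depending on the platform, and both modules can be imported anywhere. The sketch below uses them to compare how the same pure path operations behave under each convention; it covers string manipulation only, not real file access.

```python
import ntpath
import posixpath

parts = ('data', 'output', 'results.csv')

unix_style = posixpath.join(*parts)
print(unix_style)      # data/output/results.csv

windows_style = ntpath.join(*parts)
print(windows_style)   # data\output\results.csv

# A hand-written '/' survives join() on Windows...
mixed = ntpath.join('C:\\projects', 'data/raw')
print(mixed)           # C:\projects\data/raw

# ...and normpath() converts it to the native separator
normalized = ntpath.normpath(mixed)
print(normalized)      # C:\projects\data\raw
```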

Automating File System Tasks with RunCell

When working with file system operations in Jupyter notebooks -- organizing datasets, setting up project structures, or auditing file trees -- RunCell adds an AI agent layer on top of your notebook environment. You can describe what you want ("find all CSV files over 100MB in this directory tree and list them by size") and RunCell generates and runs the os module code for you, making repetitive file management tasks faster.

FAQ

What is the os module in Python?

The os module is part of Python's standard library. It provides functions for interacting with the operating system, including file and directory manipulation, path handling, environment variable access, and process management. You import it with import os -- no installation needed.

What is the difference between os.path.join and string concatenation for paths?

os.path.join() automatically uses the correct path separator for the current operating system (/ on Unix, \ on Windows). String concatenation with + requires you to manually insert separators, which leads to bugs when code runs on a different platform. Always use os.path.join().

How do I recursively list all files in a directory?

Use os.walk() to traverse a directory tree. It yields (dirpath, dirnames, filenames) tuples for each directory. Combine it with os.path.join(dirpath, filename) to build full paths for each file.

Should I use os.path or pathlib?

For new Python 3 projects, pathlib offers cleaner, more readable syntax with its object-oriented API. However, os and os.path are not deprecated and are still the right choice for environment variables (os.environ), process operations, and legacy codebases. Many projects use both.

How do I safely delete a file or directory in Python?

Use os.remove(path) to delete a file and os.rmdir(path) to remove an empty directory. Always check os.path.exists(path) first, or wrap the call in a try/except block to handle FileNotFoundError. For non-empty directories, use shutil.rmtree(path).

Conclusion

Python's os module covers most of the file system operations your scripts will perform. os.path.join() builds paths safely across platforms. os.makedirs() creates nested directories in a single call. os.walk() traverses entire directory trees. os.environ and os.getenv() handle configuration without hardcoding secrets. And os.stat() gives you detailed file metadata.

For path-heavy code, consider combining os with pathlib for cleaner syntax. But the os module remains essential -- it is the standard way to interact with the operating system in Python, and every Python developer should know its core functions.
