Python Argparse: Build Command-Line Interfaces the Right Way
You wrote a Python script that processes data exactly the way you need it. Then your colleague asks to use it. "Just change the filename on line 14 and the threshold on line 37," you tell them. They change the wrong line. The script breaks. You spend 20 minutes debugging someone else's edit to your working code. This happens every time someone needs to tweak a parameter, and it gets worse as the script grows. Hardcoded values inside scripts do not scale -- not for teams, not for automation, not for production.
Python's argparse module fixes this by turning any script into a proper command-line tool. Users pass arguments when they run the script. The module handles parsing, type conversion, validation, and help message generation. It ships with the standard library, so there is nothing to install. This guide walks through every argparse feature you need -- from basic positional arguments to subcommands and real-world CLI design patterns.
What argparse Does and Why It Beats sys.argv
Every Python script has access to sys.argv, a raw list of strings from the command line. You can parse it manually:
import sys
filename = sys.argv[1]
threshold = float(sys.argv[2])
verbose = "--verbose" in sys.argv
This works for throwaway scripts but falls apart quickly. There are no help messages. No type validation. No error messages when users forget an argument. No way to handle optional flags properly. The indexing breaks the moment you add or remove a parameter.
argparse solves all of these problems with a declarative API. You define what arguments your script accepts, and argparse handles the rest:
import argparse
parser = argparse.ArgumentParser(description="Process data files")
parser.add_argument("filename", help="Input file to process")
parser.add_argument("--threshold", type=float, default=0.5, help="Detection threshold")
parser.add_argument("--verbose", action="store_true", help="Enable detailed output")
args = parser.parse_args()
print(f"Processing {args.filename} with threshold {args.threshold}")
What you get for free:
- Automatic --help: Run python script.py --help and users see every argument along with its help text.
- Type conversion: --threshold 0.8 automatically becomes a float. If someone passes --threshold abc, argparse prints a clear error.
- Validation: Missing required arguments produce helpful error messages instead of cryptic IndexError tracebacks.
- Flexible ordering: Optional arguments can appear in any order. --verbose --threshold 0.8 works the same as --threshold 0.8 --verbose.
Here is how the help output looks:
$ python script.py --help
usage: script.py [-h] [--threshold THRESHOLD] [--verbose] filename
Process data files
positional arguments:
filename Input file to process
options:
-h, --help show this help message and exit
--threshold THRESHOLD Detection threshold
--verbose Enable detailed output
Your First CLI Script
Let us build a complete working script from scratch. This tool greets a user by name, with an optional count of how many times to repeat the greeting.
#!/usr/bin/env python3
"""greet.py -- A simple greeting CLI tool."""
import argparse


def main():
    parser = argparse.ArgumentParser(
        prog="greet",
        description="Greet someone by name",
    )
    parser.add_argument("name", help="The person to greet")
    parser.add_argument(
        "-c", "--count",
        type=int,
        default=1,
        help="Number of times to greet (default: 1)",
    )
    parser.add_argument(
        "-u", "--uppercase",
        action="store_true",
        help="Print greeting in uppercase",
    )
    args = parser.parse_args()
    greeting = f"Hello, {args.name}!"
    if args.uppercase:
        greeting = greeting.upper()
    for _ in range(args.count):
        print(greeting)


if __name__ == "__main__":
    main()
Test it from the terminal:
$ python greet.py Alice
Hello, Alice!
$ python greet.py Bob --count 3
Hello, Bob!
Hello, Bob!
Hello, Bob!
$ python greet.py Charlie -c 2 -u
HELLO, CHARLIE!
HELLO, CHARLIE!
$ python greet.py
usage: greet [-h] [-c COUNT] [-u] name
greet: error: the following arguments are required: name
Notice three things about this script:
- The parsing logic lives inside main(), not at module level. This makes the script importable without triggering argument parsing.
- The if __name__ == "__main__" guard ensures main() only runs when the script is executed directly, not when imported.
- Short and long flags (-c and --count) give users the choice between brevity and clarity.
These three patterns show up in every well-written argparse script. Follow them from the start.
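One way to build on these patterns, sketched here as an assumption rather than something the script above does, is to let main() accept an optional argument list. parse_args() falls back to sys.argv[1:] when it receives None, so the terminal behavior stays the same while tests can pass a list directly. The build_parser() helper and the returned string are illustrative choices.

```python
import argparse


def build_parser():
    """Return the parser for the greet tool (same arguments as greet.py)."""
    parser = argparse.ArgumentParser(prog="greet", description="Greet someone by name")
    parser.add_argument("name", help="The person to greet")
    parser.add_argument("-c", "--count", type=int, default=1,
                        help="Number of times to greet (default: 1)")
    parser.add_argument("-u", "--uppercase", action="store_true",
                        help="Print greeting in uppercase")
    return parser


def main(argv=None):
    """Parse argv (sys.argv[1:] when argv is None) and return the output text."""
    args = build_parser().parse_args(argv)
    greeting = f"Hello, {args.name}!"
    if args.uppercase:
        greeting = greeting.upper()
    return "\n".join([greeting] * args.count)


# In a real script you would finish with:
#     if __name__ == "__main__":
#         print(main())
# Here the argument list is passed explicitly, the way a test would do it.
print(main(["Bob", "-c", "2", "-u"]))
```

Returning the text instead of printing inside main() is only there to keep the sketch easy to assert on; printing directly works just as well.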
Positional Arguments
Positional arguments are defined without dashes. They are required by default and matched by their position on the command line.
import argparse
parser = argparse.ArgumentParser(description="Copy files")
parser.add_argument("source", help="Source file path")
parser.add_argument("destination", help="Destination file path")
args = parser.parse_args()
print(f"Copying {args.source} -> {args.destination}")
$ python copy.py report.csv /backup/report.csv
Copying report.csv -> /backup/report.csv
Multiple Positional Values with nargs
The nargs parameter controls how many values an argument consumes:
import argparse
parser = argparse.ArgumentParser(description="Merge files")
# One or more files (at least one required)
parser.add_argument("files", nargs="+", help="Files to merge")
# Exactly two values
parser.add_argument("--range", nargs=2, type=int, metavar=("START", "END"),
                    help="Row range to extract")
args = parser.parse_args()
print(f"Merging: {args.files}")
if args.range:
    print(f"Range: {args.range[0]} to {args.range[1]}")
$ python merge.py data1.csv data2.csv data3.csv --range 10 500
Merging: ['data1.csv', 'data2.csv', 'data3.csv']
Range: 10 to 500
Here is the complete nargs reference:
| nargs value | Meaning | Result type | Example |
|---|---|---|---|
| (omitted) | Exactly one value | Single value | "input.csv" |
| N (integer) | Exactly N values | List of N items | [10, 100] |
| "?" | Zero or one value | Single value or default | "config.yml" or None |
| "*" | Zero or more values | List (possibly empty) | [] or ["a", "b"] |
| "+" | One or more values | List (error if empty) | ["a"] or ["a", "b"] |
| argparse.REMAINDER | All remaining args | List | Everything after this arg |
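The "?" and "*" rows are the ones that tend to surprise people, so here is a small sketch of how they behave. The --config option and the config.yml and prod.yml names are made up for illustration; const is the value argparse stores when the flag appears without a value.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
# "?" on an option: absent -> default, bare flag -> const, flag with value -> that value
parser.add_argument("--config", nargs="?", const="config.yml", default=None)
# "*" on a positional: zero values yields an empty list
parser.add_argument("files", nargs="*")

absent = parser.parse_args([])
bare = parser.parse_args(["--config"])
explicit = parser.parse_args(["--config", "prod.yml", "a.csv", "b.csv"])

print(absent.config, absent.files)      # None []
print(bare.config, bare.files)          # config.yml []
print(explicit.config, explicit.files)  # prod.yml ['a.csv', 'b.csv']
```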
Type Conversion on Positional Arguments
Positional arguments are strings by default. Add type to convert them:
import argparse
parser = argparse.ArgumentParser(description="Calculate rectangle area")
parser.add_argument("width", type=float, help="Rectangle width")
parser.add_argument("height", type=float, help="Rectangle height")
args = parser.parse_args()
area = args.width * args.height
print(f"Area: {area:.2f}")
$ python area.py 3.5 7.2
Area: 25.20
$ python area.py three 7.2
usage: area.py [-h] width height
area.py: error: argument width: invalid float value: 'three'
The error message is automatic and tells the user exactly what went wrong.
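Behind that message, argparse writes the usage and error text to stderr and then raises SystemExit with status 2. The sketch below shows one way to observe this, for example in a test; try_parse() is a hypothetical helper, not part of argparse.

```python
import argparse

parser = argparse.ArgumentParser(prog="area.py")
parser.add_argument("width", type=float, help="Rectangle width")
parser.add_argument("height", type=float, help="Rectangle height")


def try_parse(argv):
    """Return the parsed namespace, or the exit status if argparse rejects argv."""
    try:
        return parser.parse_args(argv)
    except SystemExit as exc:  # raised after usage + error were printed to stderr
        return exc.code


ok = try_parse(["3.5", "7.2"])
bad = try_parse(["three", "7.2"])
print(ok.width, ok.height)  # 3.5 7.2
print(bad)                  # 2
```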
Optional Arguments
Optional arguments start with - (short form) or -- (long form). They are optional by default and can appear in any order.
import argparse
parser = argparse.ArgumentParser(description="Data exporter")
parser.add_argument("-o", "--output", default="output.csv", help="Output file path")
parser.add_argument("-d", "--delimiter", default=",", help="Field delimiter")
parser.add_argument("-v", "--verbose", action="store_true", help="Show detailed progress")
parser.add_argument("-q", "--quiet", action="store_true", help="Suppress all output")
args = parser.parse_args()
print(f"Output: {args.output}")
print(f"Delimiter: {repr(args.delimiter)}")
print(f"Verbose: {args.verbose}")
$ python export.py --output results.tsv --delimiter "\t" --verbose
Output: results.tsv
Delimiter: '\\t'
Verbose: True
$ python export.py -o results.json -v
Output: results.json
Delimiter: ','
Verbose: True
Default Values
Every optional argument has a default value. If you do not set one explicitly, a value-taking option defaults to None (flag actions such as store_true supply their own default, False):
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--name", default="World") # Default: "World"
parser.add_argument("--port", type=int, default=8080) # Default: 8080
parser.add_argument("--log-file") # Default: None
args = parser.parse_args()
print(f"Name: {args.name}, Port: {args.port}, Log: {args.log_file}")
Note: when an argument name uses dashes (--log-file), argparse converts it to underscores for the attribute name: args.log_file.
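If the derived attribute name is not the one you want, the dest parameter overrides it. A minimal sketch, where the attribute name preview and the file name run.log are arbitrary choices for illustration:

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")
# Attribute name derived from the long flag: --log-file -> args.log_file
parser.add_argument("--log-file")
# dest overrides the derived name: -n/--dry-run -> args.preview
parser.add_argument("-n", "--dry-run", dest="preview", action="store_true")

args = parser.parse_args(["--log-file", "run.log", "-n"])
print(vars(args))
```

vars(args) turns the namespace into a plain dictionary, which is handy when you want to see every parsed value at once.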
The store_true and store_false Actions
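Boolean flags do not take a value. They are either present or absent. store_true defaults to False and stores True when the flag appears; store_false is the mirror image (default True, stores False):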
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--verbose", action="store_true", help="Enable verbose mode")
parser.add_argument("--no-cache", action="store_true", help="Disable caching")
parser.add_argument("--dry-run", action="store_true", help="Preview without executing")
args = parser.parse_args()
# --verbose present -> args.verbose is True
# --verbose absent -> args.verbose is False
The count Action for Verbosity Levels
Some tools use repeated flags for verbosity levels (-v, -vv, -vvv):
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="count", default=0,
                    help="Increase verbosity (-v, -vv, -vvv)")
args = parser.parse_args()
if args.verbose >= 3:
    print("TRACE level: showing everything")
elif args.verbose >= 2:
    print("DEBUG level: detailed output")
elif args.verbose >= 1:
    print("INFO level: progress updates")
else:
    print("Default: errors only")
$ python tool.py -vvv
TRACE level: showing everything
The append Action for Repeated Arguments
Use action="append" when users should be able to specify the same flag multiple times:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("-t", "--tag", action="append", default=[],
help="Add a tag (can be repeated)")
args = parser.parse_args()
print(f"Tags: {args.tag}")
$ python tool.py -t bug -t urgent -t backend
Tags: ['bug', 'urgent', 'backend']
Type and Choices
Built-in Type Conversion
The type parameter accepts any callable that takes a string and returns a value. You can use built-in types, pathlib.Path for file paths, or custom functions:
import argparse
from pathlib import Path
parser = argparse.ArgumentParser()
parser.add_argument("--count", type=int, help="Number of items")
parser.add_argument("--rate", type=float, help="Processing rate")
parser.add_argument("--config", type=Path, help="Config file path")
args = parser.parse_args()
Restricting Values with choices
Use choices to limit an argument to specific allowed values:
import argparse
parser = argparse.ArgumentParser(description="Deploy application")
parser.add_argument("environment", choices=["dev", "staging", "production"],
help="Target environment")
parser.add_argument("--log-level", choices=["DEBUG", "INFO", "WARNING", "ERROR"],
default="INFO", help="Logging level")
parser.add_argument("--replicas", type=int, choices=range(1, 11),
default=1, help="Number of replicas (1-10)")
args = parser.parse_args()
$ python deploy.py testing
usage: deploy.py [-h] {dev,staging,production}
deploy.py: error: argument environment: invalid choice: 'testing' (choose from 'dev', 'staging', 'production')
Custom Type Functions
Custom type functions are one of the most powerful features in argparse. They let you validate and transform input at parse time:
import argparse
from datetime import datetime
def positive_int(value):
    """Accept only positive integers."""
    ivalue = int(value)
    if ivalue <= 0:
        raise argparse.ArgumentTypeError(f"{value} is not a positive integer")
    return ivalue


def date_type(value):
    """Parse YYYY-MM-DD date strings."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"Invalid date: '{value}'. Expected format: YYYY-MM-DD"
        )


def percentage(value):
    """Accept floats between 0 and 100."""
    fvalue = float(value)
    if not 0 <= fvalue <= 100:
        raise argparse.ArgumentTypeError(
            f"{value} is not a valid percentage (0-100)"
        )
    return fvalue


parser = argparse.ArgumentParser()
parser.add_argument("--workers", type=positive_int, default=4)
parser.add_argument("--since", type=date_type, help="Start date (YYYY-MM-DD)")
parser.add_argument("--sample", type=percentage, default=100.0,
                    help="Sample percentage (0-100)")
args = parser.parse_args()
$ python tool.py --workers -3
error: argument --workers: -3 is not a positive integer
$ python tool.py --since 2026-13-45
error: argument --since: Invalid date: '2026-13-45'. Expected format: YYYY-MM-DD
$ python tool.py --sample 150
error: argument --sample: 150 is not a valid percentage (0-100)
Every validation error produces a clear, actionable message without you writing any if/else logic in your main code.
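When several arguments need the same kind of check with different bounds, a factory that returns a type function avoids repeating yourself. The sketch below is one possible shape: int_in_range() and str_to_bool() are hypothetical helpers, and the --port and --cache options are invented for the example. The boolean converter is included because type=bool does not do what it looks like: bool("false") is True, since any non-empty string is truthy.

```python
import argparse


def int_in_range(low, high):
    """Build a type function that accepts integers in [low, high]."""
    def convert(value):
        try:
            number = int(value)
        except ValueError:
            raise argparse.ArgumentTypeError(f"{value!r} is not an integer") from None
        if not low <= number <= high:
            raise argparse.ArgumentTypeError(f"{number} is not in range {low}-{high}")
        return number
    return convert


def str_to_bool(value):
    """Parse common true/false spellings into a real bool."""
    lowered = value.lower()
    if lowered in {"true", "yes", "1"}:
        return True
    if lowered in {"false", "no", "0"}:
        return False
    raise argparse.ArgumentTypeError(f"{value!r} is not a boolean (use true/false)")


parser = argparse.ArgumentParser(prog="tool.py")
parser.add_argument("--port", type=int_in_range(1, 65535), default=8080)
parser.add_argument("--cache", type=str_to_bool, default=True)

args = parser.parse_args(["--port", "9000", "--cache", "false"])
print(args.port, args.cache)  # 9000 False
```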
Required Optional Arguments
You can force an optional argument to be required:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--config", required=True, help="Path to config file")
parser.add_argument("--output", required=True, help="Output directory")
parser.add_argument("--format", default="json", help="Output format")
args = parser.parse_args()
$ python tool.py --format csv
usage: tool.py [-h] --config CONFIG --output OUTPUT [--format FORMAT]
tool.py: error: the following arguments are required: --config, --output
Why this is sometimes a code smell: If every "optional" argument is actually required, you probably want positional arguments instead. Required optionals make sense when the flag name adds clarity (--config is clearer than a bare path) or when you have many required parameters and named flags prevent ordering mistakes.
A better pattern for truly mandatory inputs is often a positional argument:
# Instead of --config (required optional)
parser.add_argument("config", help="Path to config file")
# Use required optionals when the name matters
parser.add_argument("--api-key", required=True, help="API authentication key")
Mutually Exclusive Groups
Sometimes arguments conflict with each other. A script might produce either JSON or CSV output, but not both at the same time. add_mutually_exclusive_group() enforces this constraint:
import argparse
parser = argparse.ArgumentParser(description="Export data")
# Output format: pick exactly one
format_group = parser.add_mutually_exclusive_group(required=True)
format_group.add_argument("--json", action="store_true", help="Export as JSON")
format_group.add_argument("--csv", action="store_true", help="Export as CSV")
format_group.add_argument("--parquet", action="store_true", help="Export as Parquet")
# Verbosity: pick at most one
verbosity = parser.add_mutually_exclusive_group()
verbosity.add_argument("-v", "--verbose", action="store_true", help="Detailed output")
verbosity.add_argument("-q", "--quiet", action="store_true", help="Minimal output")
parser.add_argument("input_file", help="Input data file")
args = parser.parse_args()
if args.json:
    print(f"Exporting {args.input_file} as JSON")
elif args.csv:
    print(f"Exporting {args.input_file} as CSV")
elif args.parquet:
    print(f"Exporting {args.input_file} as Parquet")
$ python export.py data.csv --json --csv
error: argument --csv: not allowed with argument --json
$ python export.py data.csv
error: one of the arguments --json --csv --parquet is required
$ python export.py data.csv --parquet -v
Exporting data.csv as Parquet
A practical alternative for format selection uses choices instead of separate flags:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--format", choices=["json", "csv", "parquet"],
required=True, help="Output format")
args = parser.parse_args()
Both approaches work. Mutually exclusive groups are the better fit when the conflicting options are not simple variants of one setting, for example when one of them takes its own value. The choices approach is better when the options are simple strings.
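The two styles can also be combined. In this sketch, which is an illustration rather than part of the export script above, each flag in a mutually exclusive group uses action="store_const" with a shared dest, so the code reads a single args.format value while users still type --json or --csv. Falling back to "json" when no flag is given is an arbitrary choice made for the example.

```python
import argparse

parser = argparse.ArgumentParser(prog="export.py")
group = parser.add_mutually_exclusive_group()
# Each flag writes a different constant into the same attribute, args.format
group.add_argument("--json", dest="format", action="store_const", const="json")
group.add_argument("--csv", dest="format", action="store_const", const="csv")
group.add_argument("--parquet", dest="format", action="store_const", const="parquet")
parser.set_defaults(format="json")  # used when no format flag is given

print(parser.parse_args([]).format)         # json
print(parser.parse_args(["--csv"]).format)  # csv
```

Passing two of the flags together still fails with the usual "not allowed with argument" error, because the group enforces exclusivity regardless of dest.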
Subcommands with add_subparsers
Real-world CLI tools use subcommands. Think git commit, docker build, pip install. Each subcommand has its own set of arguments, its own help text, and its own handler function. argparse supports this pattern through add_subparsers():
#!/usr/bin/env python3
"""project.py -- A project management CLI tool."""
import argparse


def handle_init(args):
    """Initialize a new project."""
    print(f"Creating project '{args.name}'")
    print(f" Template: {args.template}")
    print(f" Directory: {args.name}/")


def handle_build(args):
    """Build the project."""
    mode = "release" if args.optimize else "debug"
    print(f"Building in {mode} mode")
    if args.target:
        print(f" Target: {args.target}")


def handle_test(args):
    """Run tests."""
    print(f"Running tests (coverage={'on' if args.coverage else 'off'})")
    if args.pattern:
        print(f" Pattern: {args.pattern}")


def handle_deploy(args):
    """Deploy the project."""
    print(f"Deploying to {args.environment}")
    if args.dry_run:
        print(" (DRY RUN -- no actual deployment)")


def main():
    # Top-level parser
    parser = argparse.ArgumentParser(
        prog="project",
        description="Manage your project lifecycle",
    )
    parser.add_argument("--version", action="version", version="project 1.0.0")
    subparsers = parser.add_subparsers(dest="command", required=True,
                                       help="Available commands")
    # init subcommand
    init_cmd = subparsers.add_parser("init", help="Create a new project")
    init_cmd.add_argument("name", help="Project name")
    init_cmd.add_argument("--template", default="basic",
                          choices=["basic", "web", "api", "ml"],
                          help="Project template (default: basic)")
    init_cmd.set_defaults(func=handle_init)
    # build subcommand
    build_cmd = subparsers.add_parser("build", help="Build the project")
    build_cmd.add_argument("--optimize", action="store_true",
                           help="Enable release optimizations")
    build_cmd.add_argument("--target", help="Build target platform")
    build_cmd.set_defaults(func=handle_build)
    # test subcommand
    test_cmd = subparsers.add_parser("test", help="Run tests")
    test_cmd.add_argument("--coverage", action="store_true",
                          help="Generate coverage report")
    test_cmd.add_argument("--pattern", "-p", help="Test name pattern to match")
    test_cmd.set_defaults(func=handle_test)
    # deploy subcommand
    deploy_cmd = subparsers.add_parser("deploy", help="Deploy the project")
    deploy_cmd.add_argument("environment",
                            choices=["staging", "production"],
                            help="Target environment")
    deploy_cmd.add_argument("--dry-run", action="store_true",
                            help="Preview without deploying")
    deploy_cmd.set_defaults(func=handle_deploy)
    args = parser.parse_args()
    args.func(args)


if __name__ == "__main__":
    main()
$ python project.py init myapp --template web
Creating project 'myapp'
Template: web
Directory: myapp/
$ python project.py build --optimize
Building in release mode
$ python project.py test --coverage -p "test_api*"
Running tests (coverage=on)
Pattern: test_api*
$ python project.py deploy production --dry-run
Deploying to production
(DRY RUN -- no actual deployment)
$ python project.py --help
usage: project [-h] [--version] {init,build,test,deploy} ...
Manage your project lifecycle
positional arguments:
{init,build,test,deploy}
Available commands
init Create a new project
build Build the project
test Run tests
deploy Deploy the project
$ python project.py deploy --help
usage: project deploy [-h] [--dry-run] {staging,production}
positional arguments:
{staging,production} Target environment
options:
-h, --help show this help message and exit
--dry-run Preview without deploying
The key pattern here is set_defaults(func=handler). Each subcommand stores its handler function in the parsed namespace, and the main function dispatches to it with args.func(args). This is a widely used dispatch pattern, and the one the argparse documentation itself recommends for subcommands.
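Subcommands often share a few options. Rather than repeating add_argument() calls, one option is a parent parser passed through the parents parameter. The sketch below assumes a shared -v/--verbose flag, which the project.py example above does not have; add_help=False on the parent keeps it from defining a second -h.

```python
import argparse

# Options every subcommand should accept; add_help=False avoids a duplicate -h
common = argparse.ArgumentParser(add_help=False)
common.add_argument("-v", "--verbose", action="store_true", help="Detailed output")

parser = argparse.ArgumentParser(prog="project")
subparsers = parser.add_subparsers(dest="command", required=True)

build_cmd = subparsers.add_parser("build", parents=[common], help="Build the project")
build_cmd.add_argument("--optimize", action="store_true")

test_cmd = subparsers.add_parser("test", parents=[common], help="Run tests")
test_cmd.add_argument("--coverage", action="store_true")

args = parser.parse_args(["build", "--optimize", "-v"])
print(args.command, args.optimize, args.verbose)  # build True True
```

With this layout the shared flag belongs to each subcommand, so it is written after the subcommand name (project build -v), not before it.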
Argument Groups for Better Help
When your tool has many arguments, you can group them under logical headings:
import argparse
parser = argparse.ArgumentParser(description="Data pipeline tool")
input_group = parser.add_argument_group("Input options")
input_group.add_argument("--source", required=True, help="Data source path")
input_group.add_argument("--format", choices=["csv", "json", "parquet"], default="csv")
input_group.add_argument("--encoding", default="utf-8", help="File encoding")
transform_group = parser.add_argument_group("Transform options")
transform_group.add_argument("--filter", help="Filter expression")
transform_group.add_argument("--sort-by", help="Column to sort by")
transform_group.add_argument("--limit", type=int, help="Max rows to process")
output_group = parser.add_argument_group("Output options")
output_group.add_argument("-o", "--output", required=True, help="Output path")
output_group.add_argument("--compress", action="store_true", help="Compress output")
args = parser.parse_args()
The --help output groups arguments under their headings, making long argument lists scannable. This is purely a presentation feature -- it does not change how arguments are parsed.
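A quick way to check both claims is to render the help text with format_help() and then parse a command line as usual. This is a trimmed-down sketch of the pipeline parser with only two groups; the file names are placeholders.

```python
import argparse

parser = argparse.ArgumentParser(prog="pipeline", description="Data pipeline tool")
input_group = parser.add_argument_group("Input options")
input_group.add_argument("--source", required=True, help="Data source path")
output_group = parser.add_argument_group("Output options")
output_group.add_argument("-o", "--output", required=True, help="Output path")

help_text = parser.format_help()  # the same text that --help prints
print(help_text)

# Grouping only changes the help layout; arguments parse exactly as before
args = parser.parse_args(["-o", "out.csv", "--source", "in.csv"])
print(args.source, args.output)  # in.csv out.csv
```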
Custom Help and Formatting
The formatter_class Parameter
import argparse
parser = argparse.ArgumentParser(
prog="analyzer",
description="Analyze datasets and generate reports.\n\n"
"Supports CSV, JSON, and Parquet input formats.\n"
"Output can be filtered, sorted, and aggregated.",
epilog="Examples:\n"
" analyzer data.csv --format json --top 10\n"
" analyzer data.csv --columns name age --sort-by age\n"
" analyzer data.csv --filter 'age > 30' --output results.csv",
formatter_class=argparse.RawDescriptionHelpFormatter,
)
| Formatter | What it does |
|---|---|
| HelpFormatter | Default. Wraps text to fit terminal width. |
| RawDescriptionHelpFormatter | Preserves newlines in description and epilog. |
| RawTextHelpFormatter | Preserves newlines everywhere, including argument help. |
| ArgumentDefaultsHelpFormatter | Appends (default: X) to each argument's help text; arguments without help text are left unchanged. |
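formatter_class takes a single class, so getting two of these behaviors at once is commonly done by subclassing both formatters. The sketch below relies on the two classes overriding different methods; the class name DefaultsRawFormatter is made up, and the parser is a cut-down version of the analyzer example.

```python
import argparse


class DefaultsRawFormatter(argparse.ArgumentDefaultsHelpFormatter,
                           argparse.RawDescriptionHelpFormatter):
    """Show defaults in argument help and keep newlines in description/epilog."""


parser = argparse.ArgumentParser(
    prog="analyzer",
    description="Analyze datasets.\n\nSupports CSV and JSON input.",
    epilog="Examples:\n  analyzer data.csv --top 5",
    formatter_class=DefaultsRawFormatter,
)
parser.add_argument("--top", type=int, default=10, help="Show top N results")

help_text = parser.format_help()
print(help_text)
```

The rendered help keeps the epilog's line breaks and shows "Show top N results (default: 10)" without the default being written into the help string by hand.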
metavar and %(default)s
Control what appears in usage messages:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--top", type=int, default=10, metavar="N",
help="Show top N results (default: %(default)s)")
parser.add_argument("--format", default="table", metavar="FMT",
                    help="Output format (default: %(default)s)")
This produces --top N in usage text instead of --top TOP, and the help text shows the actual default value.
argparse vs sys.argv vs click vs typer
Python has several approaches to command-line argument parsing. Here is how they compare:
| Feature | sys.argv | argparse | click | typer |
|---|---|---|---|---|
| Part of stdlib | Yes | Yes | No (pip install) | No (pip install) |
| Type conversion | Manual | Built-in | Built-in | Automatic (type hints) |
| Help generation | None | Automatic | Automatic | Automatic |
| Subcommands | Manual | add_subparsers() | @group.command() | app.command() |
| Validation | Manual | choices, custom type | click.Choice, callbacks | Validators |
| Boilerplate | High | Medium | Low | Very low |
| API style | Imperative | Imperative | Decorators | Decorators + type hints |
| Tab completion | None | None (third-party add-ons exist) | Built-in | Built-in |
| Prompting / colors | Manual | None | Built-in | Built-in |
| Testing | Manual | parse_args([...]) | CliRunner | CliRunner |
| Best for | Throwaway scripts | Stdlib-only projects | Complex CLIs | Modern Python 3.7+ |
Here is the same tool implemented with each approach:
# --- sys.argv: raw, fragile, no help ---
import sys
name = sys.argv[1] if len(sys.argv) > 1 else "World"
count = int(sys.argv[2]) if len(sys.argv) > 2 else 1
for _ in range(count):
    print(f"Hello, {name}!")

# --- argparse: stdlib, explicit argument definitions ---
import argparse
parser = argparse.ArgumentParser(description="Greet someone")
parser.add_argument("name", nargs="?", default="World", help="Name to greet")
parser.add_argument("-c", "--count", type=int, default=1, help="Repetitions")
args = parser.parse_args()
for _ in range(args.count):
    print(f"Hello, {args.name}!")

# --- click: decorators, pip install click ---
import click
@click.command()
@click.argument("name", default="World")
@click.option("-c", "--count", default=1, type=int, help="Repetitions")
def greet(name, count):
    """Greet someone."""
    for _ in range(count):
        click.echo(f"Hello, {name}!")
greet()

# --- typer: type hints, pip install typer ---
import typer
# typer.Argument keeps name positional; a plain default would turn it into --name
def greet(
    name: str = typer.Argument("World", help="Name to greet"),
    count: int = typer.Option(1, "--count", "-c", help="Repetitions"),
):
    """Greet someone."""
    for _ in range(count):
        print(f"Hello, {name}!")
typer.run(greet)
When to use each:
- sys.argv -- Only for one-off scripts where you grab a single value and do not care about validation.
- argparse -- When you need a real CLI but cannot add external dependencies. This is the right default for most projects.
- click -- When you are building a complex CLI with nested command groups, interactive prompts, colored output, and plugin systems.
- typer -- When you want minimal boilerplate and use Python 3.7+. It infers arguments from type hints and builds on click.
Real-World Project: CSV Data Processor CLI
Here is a complete, production-style CLI tool that reads CSV files, filters data, computes aggregations, and outputs results in multiple formats. This demonstrates how argparse patterns combine in a real project.
#!/usr/bin/env python3
"""csv_processor.py -- A CLI tool for analyzing CSV data."""
import argparse
import csv
import json
import sys
from collections import defaultdict
from pathlib import Path


def read_csv(filepath, encoding="utf-8"):
    """Read a CSV file and return a list of dictionaries."""
    with open(filepath, "r", encoding=encoding) as f:
        reader = csv.DictReader(f)
        return list(reader)


def cmd_info(args):
    """Display information about a CSV file."""
    rows = read_csv(args.file, args.encoding)
    columns = list(rows[0].keys()) if rows else []
    print(f"File: {args.file}")
    print(f"Rows: {len(rows)}")
    print(f"Columns: {len(columns)}")
    if args.verbose:
        print("\nColumn details:")
        for col in columns:
            non_empty = sum(1 for r in rows if r[col].strip())
            unique = len(set(r[col] for r in rows))
            print(f" {col:30s} non-empty: {non_empty:>6d} unique: {unique:>6d}")


def cmd_filter(args):
    """Filter rows where a column matches a value."""
    rows = read_csv(args.file, args.encoding)
    # An empty file has no rows, so look up columns defensively
    columns = list(rows[0].keys()) if rows else []
    if args.column not in columns:
        print(f"Error: column '{args.column}' not found.", file=sys.stderr)
        print(f"Available columns: {', '.join(columns)}", file=sys.stderr)
        sys.exit(1)
    if args.match == "exact":
        filtered = [r for r in rows if r[args.column] == args.value]
    elif args.match == "contains":
        filtered = [r for r in rows if args.value.lower() in r[args.column].lower()]
    elif args.match == "startswith":
        filtered = [r for r in rows if r[args.column].startswith(args.value)]
    elif args.match == "gt":
        threshold = float(args.value)
        filtered = [r for r in rows if _safe_float(r[args.column], float("-inf")) > threshold]
    elif args.match == "lt":
        threshold = float(args.value)
        filtered = [r for r in rows if _safe_float(r[args.column], float("inf")) < threshold]
    print(f"Matched {len(filtered)} of {len(rows)} rows", file=sys.stderr)
    _output_rows(filtered, args)


def cmd_aggregate(args):
    """Group by a column and compute aggregations."""
    rows = read_csv(args.file, args.encoding)
    # "not rows" guards against an empty file before indexing rows[0]
    if not rows or args.group_by not in rows[0]:
        print(f"Error: column '{args.group_by}' not found.", file=sys.stderr)
        sys.exit(1)
    groups = defaultdict(list)
    for row in rows:
        key = row[args.group_by]
        groups[key].append(row)
    results = []
    for key, group_rows in sorted(groups.items()):
        result = {args.group_by: key, "count": len(group_rows)}
        if args.sum_column:
            total = sum(_safe_float(r[args.sum_column], 0) for r in group_rows)
            result[f"sum_{args.sum_column}"] = round(total, 2)
        if args.avg_column:
            values = [_safe_float(r[args.avg_column], None) for r in group_rows]
            values = [v for v in values if v is not None]
            if values:
                result[f"avg_{args.avg_column}"] = round(sum(values) / len(values), 2)
        results.append(result)
    if args.sort_by == "count":
        results.sort(key=lambda r: r["count"], reverse=True)
    if args.top:
        results = results[:args.top]
    _output_rows(results, args)


def cmd_convert(args):
    """Convert CSV to another format."""
    rows = read_csv(args.file, args.encoding)
    if args.columns:
        rows = [{k: r.get(k, "") for k in args.columns} for r in rows]
    if args.limit:
        rows = rows[:args.limit]
    _output_rows(rows, args)
    count = min(args.limit, len(rows)) if args.limit else len(rows)
    print(f"Converted {count} rows", file=sys.stderr)


def _safe_float(value, default):
    """Try to convert a string to float, return default on failure."""
    try:
        return float(value)
    except (ValueError, TypeError):
        return default


def _output_rows(rows, args):
    """Write rows to the specified output in the specified format."""
    if not rows:
        return
    output_file = open(args.output, "w", newline="") if args.output else sys.stdout
    if args.format == "json":
        json.dump(rows, output_file, indent=2)
        output_file.write("\n")
    elif args.format == "csv":
        writer = csv.DictWriter(output_file, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
    elif args.format == "tsv":
        writer = csv.DictWriter(output_file, fieldnames=rows[0].keys(),
                                delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)
    elif args.format == "table":
        _print_table(rows, output_file)
    if args.output:
        output_file.close()


def _print_table(rows, out):
    """Print rows as an aligned text table."""
    if not rows:
        return
    headers = list(rows[0].keys())
    widths = {h: len(h) for h in headers}
    for row in rows:
        for h in headers:
            widths[h] = max(widths[h], len(str(row.get(h, ""))))
    header_line = " ".join(h.ljust(widths[h]) for h in headers)
    separator = " ".join("-" * widths[h] for h in headers)
    out.write(header_line + "\n")
    out.write(separator + "\n")
    for row in rows:
        line = " ".join(str(row.get(h, "")).ljust(widths[h]) for h in headers)
        out.write(line + "\n")
def main():
parser = argparse.ArgumentParser(
prog="csv_processor",
description="Analyze, filter, aggregate, and convert CSV files.",
epilog="Examples:\n"
" csv_processor info sales.csv --verbose\n"
" csv_processor filter sales.csv --column region --value East\n"
" csv_processor aggregate sales.csv --group-by region --sum revenue\n"
" csv_processor convert sales.csv --format json -o sales.json\n",
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument("--encoding", default="utf-8", help="File encoding")
subparsers = parser.add_subparsers(dest="command", required=True,
help="Available commands")
# --- info ---
info_p = subparsers.add_parser("info", help="Show CSV file information")
info_p.add_argument("file", help="CSV file to inspect")
info_p.add_argument("-v", "--verbose", action="store_true",
help="Show per-column statistics")
info_p.set_defaults(func=cmd_info, format="table", output=None)
# --- filter ---
filter_p = subparsers.add_parser("filter", help="Filter rows by column value")
filter_p.add_argument("file", help="Input CSV file")
filter_p.add_argument("--column", "-c", required=True, help="Column to filter on")
filter_p.add_argument("--value", "-V", required=True, help="Value to match against")
filter_p.add_argument("--match", "-m",
choices=["exact", "contains", "startswith", "gt", "lt"],
default="exact", help="Match mode (default: exact)")
filter_p.add_argument("--format", "-f",
choices=["csv", "json", "tsv", "table"],
default="table", help="Output format")
filter_p.add_argument("-o", "--output", help="Output file (default: stdout)")
filter_p.set_defaults(func=cmd_filter)
# --- aggregate ---
agg_p = subparsers.add_parser("aggregate", help="Group and aggregate data")
agg_p.add_argument("file", help="Input CSV file")
agg_p.add_argument("--group-by", "-g", required=True, help="Column to group by")
agg_p.add_argument("--sum", dest="sum_column", help="Column to sum")
agg_p.add_argument("--avg", dest="avg_column", help="Column to average")
agg_p.add_argument("--sort-by", choices=["name", "count"], default="name",
help="Sort results by name or count")
agg_p.add_argument("--top", type=int, help="Show only top N groups")
agg_p.add_argument("--format", "-f",
choices=["csv", "json", "tsv", "table"],
default="table", help="Output format")
agg_p.add_argument("-o", "--output", help="Output file (default: stdout)")
agg_p.set_defaults(func=cmd_aggregate)
# --- convert ---
conv_p = subparsers.add_parser("convert", help="Convert CSV to another format")
conv_p.add_argument("file", help="Input CSV file")
conv_p.add_argument("--format", "-f", required=True,
choices=["json", "tsv", "csv", "table"],
help="Target format")
conv_p.add_argument("--columns", nargs="+", metavar="COL",
help="Include only these columns")
conv_p.add_argument("--limit", type=int, help="Max rows to convert")
conv_p.add_argument("-o", "--output", help="Output file (default: stdout)")
conv_p.set_defaults(func=cmd_convert)
args = parser.parse_args()
args.func(args)
if __name__ == "__main__":
    main()
This tool combines subcommands, custom type handling, multiple output formats, argument groups, and proper error reporting. Each subcommand has its own focused set of arguments. Users get full --help for every subcommand.
Visualizing the results: After filtering or aggregating CSV data with a CLI tool like this, you often want to explore the output visually. PyGWalker turns any pandas DataFrame into an interactive, Tableau-like visualization interface -- directly in a Jupyter notebook or script. Save your CLI output to a file, load it into a DataFrame, and use PyGWalker to build charts without writing plotting code:
import pandas as pd
import pygwalker as pyg
# Load the CSV processor output
df = pd.read_csv("aggregated_results.csv")
# Launch interactive visual explorer
walker = pyg.walk(df)
For interactive development of CLI tools like this, RunCell lets you build and test CLI tools inside Jupyter with AI assistance. You can prototype argument parsing logic in notebook cells, test with different argument combinations, and then export to a standalone script.
Common Errors and Fixes
1. error: unrecognized arguments
This happens when you pass an argument that was not defined:
$ python tool.py --output results.csv --compress
error: unrecognized arguments: --compress
Fix: Either add the argument to your parser, or use parse_known_args() if you intentionally want to ignore unknown arguments:
args, unknown = parser.parse_known_args()
# args contains recognized arguments
# unknown is a list of unrecognized strings
2. error: argument --count: invalid int value
$ python tool.py --count three
error: argument --count: invalid int value: 'three'
Fix: This is argparse working correctly. The user passed a non-integer string to a type=int argument, and argparse rejected it before your code ran. If you write custom type functions, have them raise argparse.ArgumentTypeError with a clear message so users get equally specific feedback.
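A minimal sketch of such a custom type function follows. The name positive_int and the tool.py program name are hypothetical and chosen only for illustration; the function rejects zero and negative values with a specific message instead of the generic "invalid int value" text.

```python
import argparse


def positive_int(text):
    """Convert text to an int and require it to be greater than zero.

    Hypothetical helper for illustration; argparse calls it with the raw string.
    """
    try:
        value = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{text!r} is not a whole number")
    if value <= 0:
        raise argparse.ArgumentTypeError(f"{text!r} must be greater than zero")
    return value


def build_parser():
    parser = argparse.ArgumentParser(prog="tool.py")
    parser.add_argument("--count", type=positive_int, default=1)
    return parser


# Passing an explicit list keeps the example independent of sys.argv.
args = build_parser().parse_args(["--count", "3"])
print(args.count)
```

With this in place, an input such as --count 0 is reported through the normal argparse error path, using the message raised by the type function.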
3. Attribute names with dashes
parser.add_argument("--log-level", default="INFO")
args = parser.parse_args()
# This does not work: Python reads it as (args.log - level) and fails at runtime
# print(args.log-level)
# Correct:
print(args.log_level) # Dashes become underscores
4. Subcommand not triggering any action
# Problem: no error when no subcommand is given
subparsers = parser.add_subparsers(dest="command")
# Solution: add required=True
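subparsers = parser.add_subparsers(dest="command", required=True)
Without required=True, running the script with no subcommand is accepted: args.command is None and no handler runs (or args.func is missing, if you dispatch through set_defaults). The required keyword is available in Python 3.7 and later; set it whenever a subcommand is mandatory.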
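The difference is easy to check with explicit argument lists. The sketch below assumes a hypothetical tool.py with a single run subcommand; build_parser is an illustrative helper, not part of argparse.

```python
import argparse


def build_parser(required):
    parser = argparse.ArgumentParser(prog="tool.py")
    subparsers = parser.add_subparsers(dest="command", required=required)
    subparsers.add_parser("run")
    return parser


# Without required=True, an empty command line parses successfully.
args = build_parser(required=False).parse_args([])
print(args.command)  # None

# With required=True, argparse prints a usage error and exits.
try:
    build_parser(required=True).parse_args([])
except SystemExit as exc:
    exit_code = exc.code
    print("exit code:", exit_code)
```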
5. Duplicate help messages from propagation
If both the main (parent) parser and a subparser define the same argument, the two definitions write to the same attribute. The subparser's value, including its default, replaces whatever the main parser stored, so tool.py --verbose run can end up with verbose set to False:
# BAD: --verbose defined on both parent and child
parser.add_argument("--verbose", action="store_true")
sub = subparsers.add_parser("run")
sub.add_argument("--verbose", action="store_true") # Shadows parent's --verbose
# GOOD: put shared arguments on the parent only
parser.add_argument("--verbose", action="store_true")
6. parse_args() running on import
# BAD: parse_args() fires when another module imports this file
parser = argparse.ArgumentParser()
parser.add_argument("--name")
args = parser.parse_args() # Reads the importing program's sys.argv and may exit with an error
# GOOD: wrap everything in main()
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--name")
    args = parser.parse_args()
    return args
if __name__ == "__main__":
    main()
7. FileType opens files too early
# Risky: file is opened at parse time, before your code validates other args
parser.add_argument("output", type=argparse.FileType("w"))
# Safer: accept a path string, open the file yourself with a context manager
parser.add_argument("output", help="Output file path")
args = parser.parse_args()
with open(args.output, "w") as f:
    f.write("data")
Advanced Patterns
Parent Parsers for Shared Arguments
When multiple subcommands share the same set of arguments, use parent parsers to avoid repetition:
import argparse
# Shared arguments
parent = argparse.ArgumentParser(add_help=False)
parent.add_argument("--verbose", "-v", action="store_true")
parent.add_argument("--output", "-o", default="output.csv")
parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(dest="command", required=True)
# Both subcommands inherit --verbose and --output
cmd_a = subparsers.add_parser("analyze", parents=[parent])
cmd_a.add_argument("file", help="File to analyze")
cmd_b = subparsers.add_parser("compare", parents=[parent])
cmd_b.add_argument("file_a", help="First file")
cmd_b.add_argument("file_b", help="Second file")
Environment Variable Defaults
Pull defaults from environment variables for deployment flexibility:
import argparse
import os
parser = argparse.ArgumentParser()
parser.add_argument("--api-key",
                    default=os.environ.get("API_KEY"),
                    help="API key (or set API_KEY env var)")
# A string default is passed through type=int, so an invalid PORT yields a normal argparse error
parser.add_argument("--port", type=int,
                    default=os.environ.get("PORT", "8080"),
                    help="Server port (default: $PORT or 8080)")
args = parser.parse_args()
if not args.api_key:
    parser.error("--api-key is required (or set API_KEY environment variable)")
Testing argparse Scripts
You can integrate argparse testing into your unittest or pytest test suites. Test your CLI tools by passing argument lists to parse_args():
import argparse
def create_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("name")
    parser.add_argument("--count", type=int, default=1)
    return parser
# In your test file
def test_parser_defaults():
    parser = create_parser()
    args = parser.parse_args(["Alice"])
    assert args.name == "Alice"
    assert args.count == 1
def test_parser_with_options():
    parser = create_parser()
    args = parser.parse_args(["Bob", "--count", "5"])
    assert args.name == "Bob"
    assert args.count == 5
Separating create_parser() from main() makes your argument definitions testable without running the full script.
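Error paths can be tested the same way, because parse_args() reports invalid input by raising SystemExit. The sketch below uses only the standard library and repeats create_parser() so that it runs on its own; parse_expecting_failure is a hypothetical helper, and with pytest you would more typically reach for pytest.raises(SystemExit).

```python
import argparse
import contextlib
import io


def create_parser():
    parser = argparse.ArgumentParser(prog="greet")
    parser.add_argument("name")
    parser.add_argument("--count", type=int, default=1)
    return parser


def parse_expecting_failure(argv):
    """Return (exit_code, stderr_text) for an argument list that should be rejected."""
    stderr = io.StringIO()
    with contextlib.redirect_stderr(stderr):
        try:
            create_parser().parse_args(argv)
        except SystemExit as exc:
            return exc.code, stderr.getvalue()
    raise AssertionError(f"{argv} was accepted unexpectedly")


def test_invalid_count_is_rejected():
    code, message = parse_expecting_failure(["Alice", "--count", "three"])
    assert code == 2
    assert "invalid int value" in message


def test_missing_name_is_rejected():
    code, message = parse_expecting_failure(["--count", "2"])
    assert code == 2
    assert "name" in message
```

Capturing stderr keeps the usage message out of the test output and lets the test check the wording users would see.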
FAQ
What is Python argparse used for?
Python argparse is the standard library module for parsing command-line arguments. It reads strings from sys.argv, converts them to typed Python objects, validates inputs, generates --help output, and reports errors when users provide incorrect arguments. It is used to turn scripts into proper CLI tools that accept configurable inputs without editing source code.
What is the difference between positional and optional arguments in argparse?
Positional arguments have no dashes in their name (e.g., parser.add_argument("filename")). They are required by default and matched by position. Optional arguments start with dashes (e.g., parser.add_argument("--verbose")). They are optional by default and can appear in any order. You can make optional arguments required with required=True.
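As a small illustration, the sketch below defines one argument of each kind on a hypothetical tool.py parser and parses two orderings of the same command line. The argument names are examples only.

```python
import argparse

parser = argparse.ArgumentParser(prog="tool.py")
parser.add_argument("filename")                        # positional: required, matched by position
parser.add_argument("--verbose", action="store_true")  # optional flag: defaults to False
parser.add_argument("--output", required=True)         # dashed, but made mandatory

# Optional arguments may appear before or after the positional one.
first = parser.parse_args(["--output", "out.csv", "data.csv", "--verbose"])
second = parser.parse_args(["data.csv", "--verbose", "--output", "out.csv"])
print(first == second)  # True
```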
How do I create subcommands like git commit or docker build?
Use parser.add_subparsers() to create a subcommand group. Call subparsers.add_parser("name") for each subcommand. Each subparser gets its own set of arguments. Use set_defaults(func=handler_function) to link each subcommand to a handler, then call args.func(args) to dispatch.
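A compact sketch of that dispatch pattern is shown here. The greet and add subcommands and their handler functions are hypothetical, and main() accepts an optional argument list so it can be called from tests as well as from the command line.

```python
import argparse


def cmd_greet(args):
    return f"Hello, {args.name}!"


def cmd_add(args):
    return str(args.a + args.b)


def build_parser():
    parser = argparse.ArgumentParser(prog="demo")
    subparsers = parser.add_subparsers(dest="command", required=True)

    greet = subparsers.add_parser("greet", help="Print a greeting")
    greet.add_argument("name")
    greet.set_defaults(func=cmd_greet)  # handler for "demo greet"

    add = subparsers.add_parser("add", help="Add two integers")
    add.add_argument("a", type=int)
    add.add_argument("b", type=int)
    add.set_defaults(func=cmd_add)      # handler for "demo add"
    return parser


def main(argv=None):
    # argv=None makes parse_args() fall back to sys.argv[1:]
    args = build_parser().parse_args(argv)
    return args.func(args)


print(main(["greet", "Ada"]))
print(main(["add", "2", "3"]))
```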
Should I use argparse or click for my Python CLI?
Use argparse when you need zero external dependencies and your CLI is straightforward. Use click when you need advanced features like interactive prompts, colored terminal output, progress bars, or deeply nested command groups. Typer is another option that uses Python type hints and requires even less boilerplate than click.
How do I validate custom input types with argparse?
Write a function that takes a string and returns the converted value. If the input is invalid, raise argparse.ArgumentTypeError with a descriptive message. Pass this function as the type parameter: parser.add_argument("--date", type=my_date_parser). argparse calls your function automatically and displays the error message if validation fails.
Can I use argparse with environment variables?
Yes. Set the default parameter to read from os.environ: parser.add_argument("--api-key", default=os.environ.get("API_KEY")). This lets users configure values through environment variables while still allowing command-line flags to override them.
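The override order can be seen in a short sketch. To keep it reproducible, build_parser (a hypothetical helper) takes the environment as a parameter, and a plain dictionary stands in for os.environ.

```python
import argparse
import os


def build_parser(environ=os.environ):
    """Build a parser whose --api-key default comes from the given mapping."""
    parser = argparse.ArgumentParser(prog="tool.py")
    parser.add_argument("--api-key", default=environ.get("API_KEY"))
    return parser


fake_env = {"API_KEY": "from-env"}  # stand-in for os.environ in this demo

# No flag given: the environment value is used.
env_value = build_parser(fake_env).parse_args([]).api_key
print(env_value)

# Flag given: the command line wins.
cli_value = build_parser(fake_env).parse_args(["--api-key", "from-cli"]).api_key
print(cli_value)
```

Note that the environment is read when the parser is built, not when parse_args() runs, so later changes to the variable are not picked up by an existing parser.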
Conclusion
Python argparse transforms scripts from fragile, edit-the-source-code tools into proper command-line programs. The module handles type conversion, validation, help generation, and error messages so you focus on what the script does, not how it receives input.
The patterns to remember:
- Use positional arguments for required inputs, dashed arguments for optional ones.
- Set type for automatic conversion, choices for restricted values, nargs for multiple values.
- Use action="store_true" for boolean flags and action="count" for verbosity levels.
- Build complex tools with add_subparsers() and the set_defaults(func=handler) dispatch pattern.
- Use mutually exclusive groups when arguments conflict.
- Wrap parsing in main() and guard it with if __name__ == "__main__".
- Write custom type functions for domain-specific validation.
argparse is not the newest argument parsing library in Python. But it ships with every Python installation, handles the vast majority of CLI needs, and produces tools that behave the way users expect. For most Python developers, that is the right choice. When your CLI tool needs to invoke external commands, combine argparse with the subprocess module for a complete command-line automation solution.
Related Guides
- Python subprocess -- Run external commands from your CLI tools
- Python pathlib -- Modern file path handling for CLI arguments
- Python type hints -- Add type annotations to your CLI functions
- Python unittest -- Write tests for your argparse-based tools