Skip to content

Python Argparse: Build Command-Line Interfaces the Right Way

Updated on

You wrote a Python script that processes data exactly the way you need it. Then your colleague asks to use it. "Just change the filename on line 14 and the threshold on line 37," you tell them. They change the wrong line. The script breaks. You spend 20 minutes debugging someone else's edit to your working code. This happens every time someone needs to tweak a parameter, and it gets worse as the script grows. Hardcoded values inside scripts do not scale -- not for teams, not for automation, not for production.

📚

Python's argparse module fixes this by turning any script into a proper command-line tool. Users pass arguments when they run the script. The module handles parsing, type conversion, validation, and help message generation. It ships with the standard library, so there is nothing to install. This guide walks through every argparse feature you need -- from basic positional arguments to subcommands and real-world CLI design patterns.

What argparse Does and Why It Beats sys.argv

Every Python script has access to sys.argv, a raw list of strings from the command line. You can parse it manually:

import sys
 
filename = sys.argv[1]
threshold = float(sys.argv[2])
verbose = "--verbose" in sys.argv

This works for throwaway scripts but falls apart quickly. There are no help messages. No type validation. No error messages when users forget an argument. No way to handle optional flags properly. The indexing breaks the moment you add or remove a parameter.

argparse solves all of these problems with a declarative API. You define what arguments your script accepts, and argparse handles the rest:

import argparse
 
parser = argparse.ArgumentParser(description="Process data files")
parser.add_argument("filename", help="Input file to process")
parser.add_argument("--threshold", type=float, default=0.5, help="Detection threshold")
parser.add_argument("--verbose", action="store_true", help="Enable detailed output")
args = parser.parse_args()
 
print(f"Processing {args.filename} with threshold {args.threshold}")

What you get for free:

  • Automatic --help: Run python script.py --help and users see every argument, its type, and its default value.
  • Type conversion: --threshold 0.8 automatically becomes a float. If someone passes --threshold abc, argparse prints a clear error.
  • Validation: Missing required arguments produce helpful error messages instead of cryptic IndexError tracebacks.
  • Flexible ordering: Optional arguments can appear in any order. --verbose --threshold 0.8 works the same as --threshold 0.8 --verbose.

Here is how the help output looks:

$ python script.py --help
usage: script.py [-h] [--threshold THRESHOLD] [--verbose] filename
 
Process data files
 
positional arguments:
  filename              Input file to process
 
options:
  -h, --help            show this help message and exit
  --threshold THRESHOLD Detection threshold
  --verbose             Enable detailed output

Your First CLI Script

Let us build a complete working script from scratch. This tool greets a user by name, with an optional count of how many times to repeat the greeting.

#!/usr/bin/env python3
"""greet.py -- A simple greeting CLI tool."""
 
import argparse
 
def main():
    parser = argparse.ArgumentParser(
        prog="greet",
        description="Greet someone by name",
    )
    parser.add_argument("name", help="The person to greet")
    parser.add_argument(
        "-c", "--count",
        type=int,
        default=1,
        help="Number of times to greet (default: 1)",
    )
    parser.add_argument(
        "-u", "--uppercase",
        action="store_true",
        help="Print greeting in uppercase",
    )
 
    args = parser.parse_args()
 
    greeting = f"Hello, {args.name}!"
    if args.uppercase:
        greeting = greeting.upper()
 
    for _ in range(args.count):
        print(greeting)
 
if __name__ == "__main__":
    main()

Test it from the terminal:

$ python greet.py Alice
Hello, Alice!
 
$ python greet.py Bob --count 3
Hello, Bob!
Hello, Bob!
Hello, Bob!
 
$ python greet.py Charlie -c 2 -u
HELLO, CHARLIE!
HELLO, CHARLIE!
 
$ python greet.py
usage: greet [-h] [-c COUNT] [-u] name
greet: error: the following arguments are required: name

Notice three things about this script:

  1. The parsing logic lives inside main(), not at module level. This makes the script importable without triggering argument parsing.
  2. The if __name__ == "__main__" guard ensures main() only runs when the script is executed directly, not when imported.
  3. Short and long flags (-c and --count) give users the choice between brevity and clarity.

These three patterns show up in every well-written argparse script. Follow them from the start.

Positional Arguments

Positional arguments are defined without dashes. They are required by default and matched by their position on the command line.

import argparse
 
parser = argparse.ArgumentParser(description="Copy files")
parser.add_argument("source", help="Source file path")
parser.add_argument("destination", help="Destination file path")
args = parser.parse_args()
 
print(f"Copying {args.source} -> {args.destination}")
$ python copy.py report.csv /backup/report.csv
Copying report.csv -> /backup/report.csv

Multiple Positional Values with nargs

The nargs parameter controls how many values an argument consumes:

import argparse
 
parser = argparse.ArgumentParser(description="Merge files")
 
# One or more files (at least one required)
parser.add_argument("files", nargs="+", help="Files to merge")
 
# Exactly two values
parser.add_argument("--range", nargs=2, type=int, metavar=("START", "END"),
                    help="Row range to extract")
 
args = parser.parse_args()
print(f"Merging: {args.files}")
if args.range:
    print(f"Range: {args.range[0]} to {args.range[1]}")
$ python merge.py data1.csv data2.csv data3.csv --range 10 500
Merging: ['data1.csv', 'data2.csv', 'data3.csv']
Range: 10 to 500

Here is the complete nargs reference:

nargs valueMeaningResult typeExample
(omitted)Exactly one valueSingle value"input.csv"
N (integer)Exactly N valuesList of N items[10, 100]
"?"Zero or one valueSingle value or default"config.yml" or None
"*"Zero or more valuesList (possibly empty)[] or ["a", "b"]
"+"One or more valuesList (error if empty)["a"] or ["a", "b"]
argparse.REMAINDERAll remaining argsListEverything after this arg

Type Conversion on Positional Arguments

Positional arguments are strings by default. Add type to convert them:

import argparse
 
parser = argparse.ArgumentParser(description="Calculate rectangle area")
parser.add_argument("width", type=float, help="Rectangle width")
parser.add_argument("height", type=float, help="Rectangle height")
args = parser.parse_args()
 
area = args.width * args.height
print(f"Area: {area:.2f}")
$ python area.py 3.5 7.2
Area: 25.20
 
$ python area.py three 7.2
usage: area.py [-h] width height
area.py: error: argument width: invalid float value: 'three'

The error message is automatic and tells the user exactly what went wrong.

Optional Arguments

Optional arguments start with - (short form) or -- (long form). They are optional by default and can appear in any order.

import argparse
 
parser = argparse.ArgumentParser(description="Data exporter")
parser.add_argument("-o", "--output", default="output.csv", help="Output file path")
parser.add_argument("-d", "--delimiter", default=",", help="Field delimiter")
parser.add_argument("-v", "--verbose", action="store_true", help="Show detailed progress")
parser.add_argument("-q", "--quiet", action="store_true", help="Suppress all output")
 
args = parser.parse_args()
print(f"Output: {args.output}")
print(f"Delimiter: {repr(args.delimiter)}")
print(f"Verbose: {args.verbose}")
$ python export.py --output results.tsv --delimiter "\t" --verbose
Output: results.tsv
Delimiter: '\\t'
Verbose: True
 
$ python export.py -o results.json -v
Output: results.json
Delimiter: ','
Verbose: True

Default Values

Every optional argument has a default value. If you do not set one explicitly, it defaults to None:

import argparse
 
parser = argparse.ArgumentParser()
parser.add_argument("--name", default="World")      # Default: "World"
parser.add_argument("--port", type=int, default=8080) # Default: 8080
parser.add_argument("--log-file")                     # Default: None
 
args = parser.parse_args()
print(f"Name: {args.name}, Port: {args.port}, Log: {args.log_file}")

Note: when an argument name uses dashes (--log-file), argparse converts it to underscores for the attribute name: args.log_file.

The store_true and store_false Actions

Boolean flags do not take a value. They are either present or absent:

import argparse
 
parser = argparse.ArgumentParser()
parser.add_argument("--verbose", action="store_true", help="Enable verbose mode")
parser.add_argument("--no-cache", action="store_true", help="Disable caching")
parser.add_argument("--dry-run", action="store_true", help="Preview without executing")
 
args = parser.parse_args()
# --verbose present -> args.verbose is True
# --verbose absent  -> args.verbose is False

The count Action for Verbosity Levels

Some tools use repeated flags for verbosity levels (-v, -vv, -vvv):

import argparse
 
parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="count", default=0,
                    help="Increase verbosity (-v, -vv, -vvv)")
args = parser.parse_args()
 
if args.verbose >= 3:
    print("TRACE level: showing everything")
elif args.verbose >= 2:
    print("DEBUG level: detailed output")
elif args.verbose >= 1:
    print("INFO level: progress updates")
else:
    print("Default: errors only")
$ python tool.py -vvv
TRACE level: showing everything

The append Action for Repeated Arguments

Use action="append" when users should be able to specify the same flag multiple times:

import argparse
 
parser = argparse.ArgumentParser()
parser.add_argument("-t", "--tag", action="append", default=[],
                    help="Add a tag (can be repeated)")
args = parser.parse_args()
print(f"Tags: {args.tag}")
$ python tool.py -t bug -t urgent -t backend
Tags: ['bug', 'urgent', 'backend']

Type and Choices

Built-in Type Conversion

The type parameter accepts any callable that takes a string and returns a value. You can use built-in types, pathlib.Path for file paths, or custom functions:

import argparse
from pathlib import Path
 
parser = argparse.ArgumentParser()
parser.add_argument("--count", type=int, help="Number of items")
parser.add_argument("--rate", type=float, help="Processing rate")
parser.add_argument("--config", type=Path, help="Config file path")
args = parser.parse_args()

Restricting Values with choices

Use choices to limit an argument to specific allowed values:

import argparse
 
parser = argparse.ArgumentParser(description="Deploy application")
parser.add_argument("environment", choices=["dev", "staging", "production"],
                    help="Target environment")
parser.add_argument("--log-level", choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                    default="INFO", help="Logging level")
parser.add_argument("--replicas", type=int, choices=range(1, 11),
                    default=1, help="Number of replicas (1-10)")
args = parser.parse_args()
$ python deploy.py testing
usage: deploy.py [-h] {dev,staging,production}
deploy.py: error: argument environment: invalid choice: 'testing'
  (choose from 'dev', 'staging', 'production')

Custom Type Functions

Custom type functions are one of the most powerful features in argparse. They let you validate and transform input at parse time:

import argparse
from datetime import datetime
 
def positive_int(value):
    """Accept only positive integers."""
    ivalue = int(value)
    if ivalue <= 0:
        raise argparse.ArgumentTypeError(f"{value} is not a positive integer")
    return ivalue
 
def date_type(value):
    """Parse YYYY-MM-DD date strings."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"Invalid date: '{value}'. Expected format: YYYY-MM-DD"
        )
 
def percentage(value):
    """Accept floats between 0 and 100."""
    fvalue = float(value)
    if not 0 <= fvalue <= 100:
        raise argparse.ArgumentTypeError(
            f"{value} is not a valid percentage (0-100)"
        )
    return fvalue
 
parser = argparse.ArgumentParser()
parser.add_argument("--workers", type=positive_int, default=4)
parser.add_argument("--since", type=date_type, help="Start date (YYYY-MM-DD)")
parser.add_argument("--sample", type=percentage, default=100.0,
                    help="Sample percentage (0-100)")
args = parser.parse_args()
$ python tool.py --workers -3
error: argument --workers: -3 is not a positive integer
 
$ python tool.py --since 2026-13-45
error: argument --since: Invalid date: '2026-13-45'. Expected format: YYYY-MM-DD
 
$ python tool.py --sample 150
error: argument --sample: 150 is not a valid percentage (0-100)

Every validation error produces a clear, actionable message without you writing any if/else logic in your main code.

Required Optional Arguments

You can force an optional argument to be required:

import argparse
 
parser = argparse.ArgumentParser()
parser.add_argument("--config", required=True, help="Path to config file")
parser.add_argument("--output", required=True, help="Output directory")
parser.add_argument("--format", default="json", help="Output format")
args = parser.parse_args()
$ python tool.py --format csv
usage: tool.py [-h] --config CONFIG --output OUTPUT [--format FORMAT]
tool.py: error: the following arguments are required: --config, --output

Why this is sometimes a code smell: If every "optional" argument is actually required, you probably want positional arguments instead. Required optionals make sense when the flag name adds clarity (--config is clearer than a bare path) or when you have many required parameters and named flags prevent ordering mistakes.

A better pattern for truly mandatory inputs is often a positional argument:

# Instead of --config (required optional)
parser.add_argument("config", help="Path to config file")
 
# Use required optionals when the name matters
parser.add_argument("--api-key", required=True, help="API authentication key")

Mutually Exclusive Groups

Sometimes arguments conflict with each other. A script might produce either JSON or CSV output, but not both at the same time. add_mutually_exclusive_group() enforces this constraint:

import argparse
 
parser = argparse.ArgumentParser(description="Export data")
 
# Output format: pick exactly one
format_group = parser.add_mutually_exclusive_group(required=True)
format_group.add_argument("--json", action="store_true", help="Export as JSON")
format_group.add_argument("--csv", action="store_true", help="Export as CSV")
format_group.add_argument("--parquet", action="store_true", help="Export as Parquet")
 
# Verbosity: pick at most one
verbosity = parser.add_mutually_exclusive_group()
verbosity.add_argument("-v", "--verbose", action="store_true", help="Detailed output")
verbosity.add_argument("-q", "--quiet", action="store_true", help="Minimal output")
 
parser.add_argument("input_file", help="Input data file")
args = parser.parse_args()
 
if args.json:
    print(f"Exporting {args.input_file} as JSON")
elif args.csv:
    print(f"Exporting {args.input_file} as CSV")
elif args.parquet:
    print(f"Exporting {args.input_file} as Parquet")
$ python export.py data.csv --json --csv
error: argument --csv: not allowed with argument --json
 
$ python export.py data.csv
error: one of the arguments --json --csv --parquet is required
 
$ python export.py data.csv --parquet -v
Exporting data.csv as Parquet

A practical alternative for format selection uses choices instead of separate flags:

import argparse
 
parser = argparse.ArgumentParser()
parser.add_argument("--format", choices=["json", "csv", "parquet"],
                    required=True, help="Output format")
args = parser.parse_args()

Both approaches work. Mutually exclusive groups are better when each option needs its own additional arguments. The choices approach is better when the options are simple strings.

Subcommands with add_subparsers

Real-world CLI tools use subcommands. Think git commit, docker build, pip install. Each subcommand has its own set of arguments, its own help text, and its own handler function. argparse supports this pattern through add_subparsers():

#!/usr/bin/env python3
"""project.py -- A project management CLI tool."""
 
import argparse
 
def handle_init(args):
    """Initialize a new project."""
    print(f"Creating project '{args.name}'")
    print(f"  Template: {args.template}")
    print(f"  Directory: {args.name}/")
 
def handle_build(args):
    """Build the project."""
    mode = "release" if args.optimize else "debug"
    print(f"Building in {mode} mode")
    if args.target:
        print(f"  Target: {args.target}")
 
def handle_test(args):
    """Run tests."""
    print(f"Running tests (coverage={'on' if args.coverage else 'off'})")
    if args.pattern:
        print(f"  Pattern: {args.pattern}")
 
def handle_deploy(args):
    """Deploy the project."""
    print(f"Deploying to {args.environment}")
    if args.dry_run:
        print("  (DRY RUN -- no actual deployment)")
 
def main():
    # Top-level parser
    parser = argparse.ArgumentParser(
        prog="project",
        description="Manage your project lifecycle",
    )
    parser.add_argument("--version", action="version", version="project 1.0.0")
 
    subparsers = parser.add_subparsers(dest="command", required=True,
                                        help="Available commands")
 
    # init subcommand
    init_cmd = subparsers.add_parser("init", help="Create a new project")
    init_cmd.add_argument("name", help="Project name")
    init_cmd.add_argument("--template", default="basic",
                          choices=["basic", "web", "api", "ml"],
                          help="Project template (default: basic)")
    init_cmd.set_defaults(func=handle_init)
 
    # build subcommand
    build_cmd = subparsers.add_parser("build", help="Build the project")
    build_cmd.add_argument("--optimize", action="store_true",
                           help="Enable release optimizations")
    build_cmd.add_argument("--target", help="Build target platform")
    build_cmd.set_defaults(func=handle_build)
 
    # test subcommand
    test_cmd = subparsers.add_parser("test", help="Run tests")
    test_cmd.add_argument("--coverage", action="store_true",
                          help="Generate coverage report")
    test_cmd.add_argument("--pattern", "-p", help="Test name pattern to match")
    test_cmd.set_defaults(func=handle_test)
 
    # deploy subcommand
    deploy_cmd = subparsers.add_parser("deploy", help="Deploy the project")
    deploy_cmd.add_argument("environment",
                            choices=["staging", "production"],
                            help="Target environment")
    deploy_cmd.add_argument("--dry-run", action="store_true",
                            help="Preview without deploying")
    deploy_cmd.set_defaults(func=handle_deploy)
 
    args = parser.parse_args()
    args.func(args)
 
if __name__ == "__main__":
    main()
$ python project.py init myapp --template web
Creating project 'myapp'
  Template: web
  Directory: myapp/
 
$ python project.py build --optimize
Building in release mode
 
$ python project.py test --coverage -p "test_api*"
Running tests (coverage=on)
  Pattern: test_api*
 
$ python project.py deploy production --dry-run
Deploying to production
  (DRY RUN -- no actual deployment)
 
$ python project.py --help
usage: project [-h] [--version] {init,build,test,deploy} ...
 
Manage your project lifecycle
 
positional arguments:
  {init,build,test,deploy}
                        Available commands
    init                Create a new project
    build               Build the project
    test                Run tests
    deploy              Deploy the project
 
$ python project.py deploy --help
usage: project deploy [-h] [--dry-run] {staging,production}
 
positional arguments:
  {staging,production}  Target environment
 
options:
  -h, --help            show this help message and exit
  --dry-run             Preview without deploying

The key pattern here is set_defaults(func=handler). Each subcommand stores its handler function in the parsed namespace, and the main function dispatches to it with args.func(args). This is the standard approach used in production CLI tools.

Argument Groups for Better Help

When your tool has many arguments, you can group them under logical headings:

import argparse
 
parser = argparse.ArgumentParser(description="Data pipeline tool")
 
input_group = parser.add_argument_group("Input options")
input_group.add_argument("--source", required=True, help="Data source path")
input_group.add_argument("--format", choices=["csv", "json", "parquet"], default="csv")
input_group.add_argument("--encoding", default="utf-8", help="File encoding")
 
transform_group = parser.add_argument_group("Transform options")
transform_group.add_argument("--filter", help="Filter expression")
transform_group.add_argument("--sort-by", help="Column to sort by")
transform_group.add_argument("--limit", type=int, help="Max rows to process")
 
output_group = parser.add_argument_group("Output options")
output_group.add_argument("-o", "--output", required=True, help="Output path")
output_group.add_argument("--compress", action="store_true", help="Compress output")
 
args = parser.parse_args()

The --help output groups arguments under their headings, making long argument lists scannable. This is purely a presentation feature -- it does not change how arguments are parsed.

Custom Help and Formatting

The formatter_class Parameter

import argparse
 
parser = argparse.ArgumentParser(
    prog="analyzer",
    description="Analyze datasets and generate reports.\n\n"
                "Supports CSV, JSON, and Parquet input formats.\n"
                "Output can be filtered, sorted, and aggregated.",
    epilog="Examples:\n"
           "  analyzer data.csv --format json --top 10\n"
           "  analyzer data.csv --columns name age --sort-by age\n"
           "  analyzer data.csv --filter 'age > 30' --output results.csv",
    formatter_class=argparse.RawDescriptionHelpFormatter,
)
FormatterWhat it does
HelpFormatterDefault. Wraps text to fit terminal width.
RawDescriptionHelpFormatterPreserves newlines in description and epilog.
RawTextHelpFormatterPreserves newlines everywhere, including argument help.
ArgumentDefaultsHelpFormatterAppends (default: X) to every argument's help text.

metavar and %(default)s

Control what appears in usage messages:

import argparse
 
parser = argparse.ArgumentParser()
parser.add_argument("--top", type=int, default=10, metavar="N",
                    help="Show top N results (default: %(default)s)")
parser.add_argument("--format", default="table", metavar="FMT",
                    help="Output format (default: %(default)s)")

This produces --top N in usage text instead of --top TOP, and the help text shows the actual default value.

argparse vs sys.argv vs click vs typer

Python has several approaches to command-line argument parsing. Here is how they compare:

Featuresys.argvargparseclicktyper
Part of stdlibYesYesNo (pip install)No (pip install)
Type conversionManualBuilt-inBuilt-inAutomatic (type hints)
Help generationNoneAutomaticAutomaticAutomatic
SubcommandsManualadd_subparsers()@group.command()app.command()
ValidationManualchoices, custom typeclick.Choice, callbacksValidators
BoilerplateHighMediumLowVery low
API styleImperativeImperativeDecoratorsDecorators + type hints
Tab completionNoneNoneVia pluginBuilt-in
Prompting / colorsManualNoneBuilt-inBuilt-in
TestingManualparse_args([...])CliRunnerCliRunner
Best forThrowaway scriptsStdlib-only projectsComplex CLIsModern Python 3.7+

Here is the same tool implemented with each approach:

# --- sys.argv: raw, fragile, no help ---
import sys
 
name = sys.argv[1] if len(sys.argv) > 1 else "World"
count = int(sys.argv[2]) if len(sys.argv) > 2 else 1
for _ in range(count):
    print(f"Hello, {name}!")
# --- argparse: stdlib, explicit argument definitions ---
import argparse
 
parser = argparse.ArgumentParser(description="Greet someone")
parser.add_argument("name", nargs="?", default="World", help="Name to greet")
parser.add_argument("-c", "--count", type=int, default=1, help="Repetitions")
args = parser.parse_args()
for _ in range(args.count):
    print(f"Hello, {args.name}!")
# --- click: decorators, pip install click ---
import click
 
@click.command()
@click.argument("name", default="World")
@click.option("-c", "--count", default=1, type=int, help="Repetitions")
def greet(name, count):
    """Greet someone."""
    for _ in range(count):
        click.echo(f"Hello, {name}!")
 
greet()
# --- typer: type hints, pip install typer ---
import typer
 
def greet(name: str = "World", count: int = 1):
    """Greet someone."""
    for _ in range(count):
        print(f"Hello, {name}!")
 
typer.run(greet)

When to use each:

  • sys.argv -- Only for one-off scripts where you grab a single value and do not care about validation.
  • argparse -- When you need a real CLI but cannot add external dependencies. This is the right default for most projects.
  • click -- When you are building a complex CLI with nested command groups, interactive prompts, colored output, and plugin systems.
  • typer -- When you want minimal boilerplate and use Python 3.7+. It infers arguments from type hints and builds on click.

Real-World Project: CSV Data Processor CLI

Here is a complete, production-style CLI tool that reads CSV files, filters data, computes aggregations, and outputs results in multiple formats. This demonstrates how argparse patterns combine in a real project.

#!/usr/bin/env python3
"""csv_processor.py -- A CLI tool for analyzing CSV data."""
 
import argparse
import csv
import json
import sys
from collections import defaultdict
from pathlib import Path
 
 
def read_csv(filepath, encoding="utf-8"):
    """Read a CSV file and return a list of dictionaries."""
    with open(filepath, "r", encoding=encoding) as f:
        reader = csv.DictReader(f)
        return list(reader)
 
 
def cmd_info(args):
    """Display information about a CSV file."""
    rows = read_csv(args.file, args.encoding)
    columns = list(rows[0].keys()) if rows else []
 
    print(f"File:    {args.file}")
    print(f"Rows:    {len(rows)}")
    print(f"Columns: {len(columns)}")
 
    if args.verbose:
        print("\nColumn details:")
        for col in columns:
            non_empty = sum(1 for r in rows if r[col].strip())
            unique = len(set(r[col] for r in rows))
            print(f"  {col:30s}  non-empty: {non_empty:>6d}  unique: {unique:>6d}")
 
 
def cmd_filter(args):
    """Filter rows where a column matches a value."""
    rows = read_csv(args.file, args.encoding)
 
    if args.column not in rows[0]:
        print(f"Error: column '{args.column}' not found.", file=sys.stderr)
        print(f"Available columns: {', '.join(rows[0].keys())}", file=sys.stderr)
        sys.exit(1)
 
    if args.match == "exact":
        filtered = [r for r in rows if r[args.column] == args.value]
    elif args.match == "contains":
        filtered = [r for r in rows if args.value.lower() in r[args.column].lower()]
    elif args.match == "startswith":
        filtered = [r for r in rows if r[args.column].startswith(args.value)]
    elif args.match == "gt":
        threshold = float(args.value)
        filtered = [r for r in rows if _safe_float(r[args.column], float("-inf")) > threshold]
    elif args.match == "lt":
        threshold = float(args.value)
        filtered = [r for r in rows if _safe_float(r[args.column], float("inf")) < threshold]
 
    print(f"Matched {len(filtered)} of {len(rows)} rows", file=sys.stderr)
    _output_rows(filtered, args)
 
 
def cmd_aggregate(args):
    """Group by a column and compute aggregations."""
    rows = read_csv(args.file, args.encoding)
 
    if args.group_by not in rows[0]:
        print(f"Error: column '{args.group_by}' not found.", file=sys.stderr)
        sys.exit(1)
 
    groups = defaultdict(list)
    for row in rows:
        key = row[args.group_by]
        groups[key].append(row)
 
    results = []
    for key, group_rows in sorted(groups.items()):
        result = {args.group_by: key, "count": len(group_rows)}
 
        if args.sum_column:
            total = sum(_safe_float(r[args.sum_column], 0) for r in group_rows)
            result[f"sum_{args.sum_column}"] = round(total, 2)
 
        if args.avg_column:
            values = [_safe_float(r[args.avg_column], None) for r in group_rows]
            values = [v for v in values if v is not None]
            if values:
                result[f"avg_{args.avg_column}"] = round(sum(values) / len(values), 2)
 
        results.append(result)
 
    if args.sort_by == "count":
        results.sort(key=lambda r: r["count"], reverse=True)
 
    if args.top:
        results = results[:args.top]
 
    _output_rows(results, args)
 
 
def cmd_convert(args):
    """Convert CSV to another format."""
    rows = read_csv(args.file, args.encoding)
 
    if args.columns:
        rows = [{k: r.get(k, "") for k in args.columns} for r in rows]
 
    if args.limit:
        rows = rows[:args.limit]
 
    _output_rows(rows, args)
    count = min(args.limit, len(rows)) if args.limit else len(rows)
    print(f"Converted {count} rows", file=sys.stderr)
 
 
def _safe_float(value, default):
    """Try to convert a string to float, return default on failure."""
    try:
        return float(value)
    except (ValueError, TypeError):
        return default
 
 
def _output_rows(rows, args):
    """Write rows to the specified output in the specified format."""
    if not rows:
        return
 
    output_file = open(args.output, "w", newline="") if args.output else sys.stdout
 
    if args.format == "json":
        json.dump(rows, output_file, indent=2)
        output_file.write("\n")
    elif args.format == "csv":
        writer = csv.DictWriter(output_file, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
    elif args.format == "tsv":
        writer = csv.DictWriter(output_file, fieldnames=rows[0].keys(),
                                delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)
    elif args.format == "table":
        _print_table(rows, output_file)
 
    if args.output:
        output_file.close()
 
 
def _print_table(rows, out):
    """Print rows as an aligned text table."""
    if not rows:
        return
 
    headers = list(rows[0].keys())
    widths = {h: len(h) for h in headers}
    for row in rows:
        for h in headers:
            widths[h] = max(widths[h], len(str(row.get(h, ""))))
 
    header_line = "  ".join(h.ljust(widths[h]) for h in headers)
    separator = "  ".join("-" * widths[h] for h in headers)
    out.write(header_line + "\n")
    out.write(separator + "\n")
    for row in rows:
        line = "  ".join(str(row.get(h, "")).ljust(widths[h]) for h in headers)
        out.write(line + "\n")
 
 
def main():
    parser = argparse.ArgumentParser(
        prog="csv_processor",
        description="Analyze, filter, aggregate, and convert CSV files.",
        epilog="Examples:\n"
               "  csv_processor info sales.csv --verbose\n"
               "  csv_processor filter sales.csv --column region --value East\n"
               "  csv_processor aggregate sales.csv --group-by region --sum revenue\n"
               "  csv_processor convert sales.csv --format json -o sales.json\n",
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument("--encoding", default="utf-8", help="File encoding")
 
    subparsers = parser.add_subparsers(dest="command", required=True,
                                        help="Available commands")
 
    # --- info ---
    info_p = subparsers.add_parser("info", help="Show CSV file information")
    info_p.add_argument("file", help="CSV file to inspect")
    info_p.add_argument("-v", "--verbose", action="store_true",
                        help="Show per-column statistics")
    info_p.set_defaults(func=cmd_info, format="table", output=None)
 
    # --- filter ---
    filter_p = subparsers.add_parser("filter", help="Filter rows by column value")
    filter_p.add_argument("file", help="Input CSV file")
    filter_p.add_argument("--column", "-c", required=True, help="Column to filter on")
    filter_p.add_argument("--value", "-V", required=True, help="Value to match against")
    filter_p.add_argument("--match", "-m",
                          choices=["exact", "contains", "startswith", "gt", "lt"],
                          default="exact", help="Match mode (default: exact)")
    filter_p.add_argument("--format", "-f",
                          choices=["csv", "json", "tsv", "table"],
                          default="table", help="Output format")
    filter_p.add_argument("-o", "--output", help="Output file (default: stdout)")
    filter_p.set_defaults(func=cmd_filter)
 
    # --- aggregate ---
    agg_p = subparsers.add_parser("aggregate", help="Group and aggregate data")
    agg_p.add_argument("file", help="Input CSV file")
    agg_p.add_argument("--group-by", "-g", required=True, help="Column to group by")
    agg_p.add_argument("--sum", dest="sum_column", help="Column to sum")
    agg_p.add_argument("--avg", dest="avg_column", help="Column to average")
    agg_p.add_argument("--sort-by", choices=["name", "count"], default="name",
                        help="Sort results by name or count")
    agg_p.add_argument("--top", type=int, help="Show only top N groups")
    agg_p.add_argument("--format", "-f",
                        choices=["csv", "json", "tsv", "table"],
                        default="table", help="Output format")
    agg_p.add_argument("-o", "--output", help="Output file (default: stdout)")
    agg_p.set_defaults(func=cmd_aggregate)
 
    # --- convert ---
    conv_p = subparsers.add_parser("convert", help="Convert CSV to another format")
    conv_p.add_argument("file", help="Input CSV file")
    conv_p.add_argument("--format", "-f", required=True,
                        choices=["json", "tsv", "csv", "table"],
                        help="Target format")
    conv_p.add_argument("--columns", nargs="+", metavar="COL",
                        help="Include only these columns")
    conv_p.add_argument("--limit", type=int, help="Max rows to convert")
    conv_p.add_argument("-o", "--output", help="Output file (default: stdout)")
    conv_p.set_defaults(func=cmd_convert)
 
    args = parser.parse_args()
    args.func(args)
 
 
if __name__ == "__main__":
    main()

This tool combines subcommands, custom type handling, multiple output formats, argument groups, and proper error reporting. Each subcommand has its own focused set of arguments. Users get full --help for every subcommand.

Visualizing the results: After filtering or aggregating CSV data with a CLI tool like this, you often want to explore the output visually. PyGWalker (opens in a new tab) turns any pandas DataFrame into an interactive, Tableau-like visualization interface -- directly in a Jupyter notebook or script. Pipe your CLI output into a DataFrame and use PyGWalker to build charts without writing plotting code:

import pandas as pd
import pygwalker as pyg
 
# Load the CSV processor output
df = pd.read_csv("aggregated_results.csv")
 
# Launch interactive visual explorer
walker = pyg.walk(df)

For interactive development of CLI tools like this, RunCell (opens in a new tab) lets you build and test CLI tools inside Jupyter with AI assistance. You can prototype argument parsing logic in notebook cells, test with different argument combinations, and then export to a standalone script.

Common Errors and Fixes

1. error: unrecognized arguments

This happens when you pass an argument that was not defined:

$ python tool.py --output results.csv --compress
error: unrecognized arguments: --compress

Fix: Either add the argument to your parser, or use parse_known_args() if you intentionally want to ignore unknown arguments:

args, unknown = parser.parse_known_args()
# args contains recognized arguments
# unknown is a list of unrecognized strings

2. error: argument --count: invalid int value

$ python tool.py --count three
error: argument --count: invalid int value: 'three'

Fix: This is argparse working correctly. The user passed a non-integer string to a type=int argument. Your custom type functions should raise ArgumentTypeError with a clear message.

3. Attribute names with dashes

parser.add_argument("--log-level", default="INFO")
args = parser.parse_args()
 
# This is a syntax error:
# print(args.log-level)  # Python interprets this as args.log minus level
 
# Correct:
print(args.log_level)  # Dashes become underscores

4. Subcommand not triggering any action

# Problem: no error when no subcommand is given
subparsers = parser.add_subparsers(dest="command")
# Solution: add required=True
subparsers = parser.add_subparsers(dest="command", required=True)

Without required=True, running the script with no subcommand silently does nothing. In Python 3.7+, always set required=True on subparsers.

5. Duplicate help messages from propagation

If both a parent parser and a subparser define the same argument, users get confused:

# BAD: --verbose defined on both parent and child
parser.add_argument("--verbose", action="store_true")
sub = subparsers.add_parser("run")
sub.add_argument("--verbose", action="store_true")  # Shadows parent's --verbose
 
# GOOD: put shared arguments on the parent only
parser.add_argument("--verbose", action="store_true")

6. parse_args() running on import

# BAD: parse_args() fires when another module imports this file
parser = argparse.ArgumentParser()
parser.add_argument("--name")
args = parser.parse_args()  # Crashes if imported without CLI args
 
# GOOD: wrap everything in main()
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--name")
    args = parser.parse_args()
    return args
 
if __name__ == "__main__":
    main()

7. FileType opens files too early

# Risky: file is opened at parse time, before your code validates other args
parser.add_argument("output", type=argparse.FileType("w"))
 
# Safer: accept a path string, open the file yourself with a context manager
parser.add_argument("output", help="Output file path")
args = parser.parse_args()
with open(args.output, "w") as f:
    f.write("data")

Advanced Patterns

Parent Parsers for Shared Arguments

When multiple subcommands share the same set of arguments, use parent parsers to avoid repetition:

import argparse
 
# Shared arguments
parent = argparse.ArgumentParser(add_help=False)
parent.add_argument("--verbose", "-v", action="store_true")
parent.add_argument("--output", "-o", default="output.csv")
 
parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(dest="command", required=True)
 
# Both subcommands inherit --verbose and --output
cmd_a = subparsers.add_parser("analyze", parents=[parent])
cmd_a.add_argument("file", help="File to analyze")
 
cmd_b = subparsers.add_parser("compare", parents=[parent])
cmd_b.add_argument("file_a", help="First file")
cmd_b.add_argument("file_b", help="Second file")

Environment Variable Defaults

Pull defaults from environment variables for deployment flexibility:

import argparse
import os
 
parser = argparse.ArgumentParser()
parser.add_argument("--api-key",
                    default=os.environ.get("API_KEY"),
                    help="API key (or set API_KEY env var)")
parser.add_argument("--port", type=int,
                    default=int(os.environ.get("PORT", "8080")),
                    help="Server port (default: $PORT or 8080)")
args = parser.parse_args()
 
if not args.api_key:
    parser.error("--api-key is required (or set API_KEY environment variable)")

Testing argparse Scripts

You can integrate argparse testing into your unittest or pytest test suites. Test your CLI tools by passing argument lists to parse_args():

import argparse
 
def create_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("name")
    parser.add_argument("--count", type=int, default=1)
    return parser
 
# In your test file
def test_parser_defaults():
    parser = create_parser()
    args = parser.parse_args(["Alice"])
    assert args.name == "Alice"
    assert args.count == 1
 
def test_parser_with_options():
    parser = create_parser()
    args = parser.parse_args(["Bob", "--count", "5"])
    assert args.name == "Bob"
    assert args.count == 5

Separating create_parser() from main() makes your argument definitions testable without running the full script.

FAQ

What is Python argparse used for?

Python argparse is the standard library module for parsing command-line arguments. It reads strings from sys.argv, converts them to typed Python objects, validates inputs, generates --help output, and reports errors when users provide incorrect arguments. It is used to turn scripts into proper CLI tools that accept configurable inputs without editing source code.

What is the difference between positional and optional arguments in argparse?

Positional arguments have no dashes in their name (e.g., parser.add_argument("filename")). They are required by default and matched by position. Optional arguments start with dashes (e.g., parser.add_argument("--verbose")). They are optional by default and can appear in any order. You can make optional arguments required with required=True.

How do I create subcommands like git commit or docker build?

Use parser.add_subparsers() to create a subcommand group. Call subparsers.add_parser("name") for each subcommand. Each subparser gets its own set of arguments. Use set_defaults(func=handler_function) to link each subcommand to a handler, then call args.func(args) to dispatch.

Should I use argparse or click for my Python CLI?

Use argparse when you need zero external dependencies and your CLI is straightforward. Use click when you need advanced features like interactive prompts, colored terminal output, progress bars, or deeply nested command groups. Typer is another option that uses Python type hints and requires even less boilerplate than click.

How do I validate custom input types with argparse?

Write a function that takes a string and returns the converted value. If the input is invalid, raise argparse.ArgumentTypeError with a descriptive message. Pass this function as the type parameter: parser.add_argument("--date", type=my_date_parser). argparse calls your function automatically and displays the error message if validation fails.

Can I use argparse with environment variables?

Yes. Set the default parameter to read from os.environ: parser.add_argument("--api-key", default=os.environ.get("API_KEY")). This lets users configure values through environment variables while still allowing command-line flags to override them.

Conclusion

Python argparse transforms scripts from fragile, edit-the-source-code tools into proper command-line programs. The module handles type conversion, validation, help generation, and error messages so you focus on what the script does, not how it receives input.

The patterns to remember:

  • Use positional arguments for required inputs, dashed arguments for optional ones.
  • Set type for automatic conversion, choices for restricted values, nargs for multiple values.
  • Use action="store_true" for boolean flags and action="count" for verbosity levels.
  • Build complex tools with add_subparsers() and the set_defaults(func=handler) dispatch pattern.
  • Use mutually exclusive groups when arguments conflict.
  • Wrap parsing in main() and guard it with if __name__ == "__main__".
  • Write custom type functions for domain-specific validation.

argparse is not the newest argument parsing library in Python. But it ships with every Python installation, handles the vast majority of CLI needs, and produces tools that behave the way users expect. For most Python developers, that is the right choice. When your CLI tool needs to invoke external commands, combine argparse with the subprocess module for a complete command-line automation solution.

Related Guides

📚