Python subprocess: Running External Commands from Python (Complete Guide)

Name: Soren Atelier

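Updated on 2026/2/12

Python scripts often need to call external programs. You might need to run a shell command to compress files, call git for version control, use a system tool such as ffmpeg to process video, or execute a compiled binary in a data pipeline. But if you reach straight for os.system() or backtick-style hacks, the code tends to be fragile, insecure, and almost impossible to debug when something goes wrong.

The pain compounds quickly. Output vanishes because you have no way to capture it. Errors are silently ignored because nobody checks the return code. A single user-supplied filename containing a space or a semicolon can turn a harmless-looking script into a shell injection vulnerability. And when a child process hangs, your whole Python program hangs with it: no timeout, no recovery, no explanation.

Python's subprocess module is the standard answer. It replaces os.system(), os.popen(), and the commands module (removed in Python 3) with one consistent API for spawning processes, capturing output, handling errors, setting timeouts, and building pipelines. This guide covers everything you need to use it effectively and safely.

Quick Start with subprocess.run()

The subprocess.run() function, introduced in Python 3.5, is the recommended way to run external commands. It executes the command, waits for it to finish, and returns a CompletedProcess object. The capture_output and text parameters used throughout this guide require Python 3.7 or later.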

import subprocess
 
# Run a simple command
result = subprocess.run(["ls", "-la"], capture_output=True, text=True)
 
print(result.stdout)       # standard output as a string
print(result.stderr)       # standard error as a string
print(result.returncode)   # 0 means success

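Key parameters:

capture_output=True captures stdout and stderr (equivalent to stdout=subprocess.PIPE, stderr=subprocess.PIPE)
text=True decodes the output to str instead of bytes
The command is passed as a list of strings, with each argument as a separate list element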

import subprocess
 
# Run a command with arguments
result = subprocess.run(
    ["python", "--version"],
    capture_output=True,
    text=True
)
print(result.stdout.strip())  # e.g., "Python 3.12.1"

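Understanding Command Arguments: List vs String

The first argument to subprocess.run() can be a list or a string. The difference matters for both correctness and security.

List form (recommended)

Each list element is one argument. Python passes them directly to the operating system without any shell interpretation.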

import subprocess
 
# Each argument is a separate list element
result = subprocess.run(
    ["grep", "-r", "TODO", "/home/user/project"],
    capture_output=True,
    text=True
)
print(result.stdout)

This works even when a filename contains spaces, quotes, or special characters, because each argument is passed through verbatim:

import subprocess
 
# Filename with spaces -- works correctly as a list element
result = subprocess.run(
    ["cat", "my file with spaces.txt"],
    capture_output=True,
    text=True
)
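
To see what the child process actually receives, the following sketch runs a tiny Python child (started through sys.executable, chosen here only for portability) that prints its own argument list. The argument values are made up for illustration.

```python
import subprocess
import sys

# Child program: print the arguments it received.
child_code = "import sys; print(sys.argv[1:])"

# Hypothetical arguments containing a space and shell metacharacters.
args = ["my file.txt", "a;b", "$HOME"]

result = subprocess.run(
    [sys.executable, "-c", child_code, *args],
    capture_output=True,
    text=True,
    check=True,
)

received = result.stdout.strip()
print(received)  # ['my file.txt', 'a;b', '$HOME']
```

Nothing is split, expanded, or executed: the semicolon and the dollar sign reach the child as ordinary characters.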

String form (requires shell=True)

Passing a single string usually requires shell=True, which invokes the system shell (/bin/sh on Unix, cmd.exe on Windows) to interpret the command.

import subprocess
 
# String form requires shell=True
result = subprocess.run(
    "ls -la | grep '.py'",
    shell=True,
    capture_output=True,
    text=True
)
print(result.stdout)

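This enables shell features such as pipes (|), redirection (>), globbing (*.py), and environment variable expansion ($HOME). It also introduces serious security risks, which the security section later in this guide covers in detail.

Capturing Output

Capturing stdout and stderr separately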

import subprocess
 
result = subprocess.run(
    ["python", "-c", "import sys; print('out'); print('err', file=sys.stderr)"],
    capture_output=True,
    text=True
)
 
print(f"stdout: {result.stdout}")   # "out\n"
print(f"stderr: {result.stderr}")   # "err\n"

Merging stderr into stdout

Sometimes you want all output in a single stream. Use stderr=subprocess.STDOUT:

import subprocess
 
result = subprocess.run(
    ["python", "-c", "import sys; print('out'); print('err', file=sys.stderr)"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True
)
 
print(result.stdout)  # Contains both "out\n" and "err\n"

Discarding output

Send output to subprocess.DEVNULL to suppress it:

import subprocess
 
# Run silently -- discard all output
result = subprocess.run(
    ["apt-get", "update"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL
)

Binary output

Omit text=True to get raw bytes. This is useful for binary data such as images or compressed files:

import subprocess
 
# Capture binary output (e.g., from curl)
result = subprocess.run(
    ["curl", "-s", "https://example.com/image.png"],
    capture_output=True
)
 
image_bytes = result.stdout  # bytes object
print(f"Downloaded {len(image_bytes)} bytes")

Error Handling

Checking the return code manually

By default, subprocess.run() does not raise an exception when a command fails. You have to check returncode yourself:

import subprocess
 
result = subprocess.run(
    ["ls", "/nonexistent/path"],
    capture_output=True,
    text=True
)
 
if result.returncode != 0:
    print(f"Command failed with code {result.returncode}")
    print(f"Error: {result.stderr}")

Using check=True to raise on failure

check=True raises subprocess.CalledProcessError when the return code is non-zero:

import subprocess
 
try:
    result = subprocess.run(
        ["ls", "/nonexistent/path"],
        capture_output=True,
        text=True,
        check=True
    )
except subprocess.CalledProcessError as e:
    print(f"Command failed with return code {e.returncode}")
    print(f"stderr: {e.stderr}")
    print(f"stdout: {e.stdout}")

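This is the recommended pattern for commands that should always succeed. It forces you to handle failures explicitly instead of silently ignoring them.

Handling a missing command

If the executable does not exist, Python raises FileNotFoundError: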

import subprocess
 
try:
    result = subprocess.run(
        ["nonexistent_command"],
        capture_output=True,
        text=True
    )
except FileNotFoundError:
    print("Command not found -- is it installed and in PATH?")

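Timeouts

A long-running or hung process can block your script forever. The timeout parameter (in seconds) kills the process once the limit is exceeded and raises subprocess.TimeoutExpired: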

import subprocess
 
try:
    result = subprocess.run(
        ["sleep", "30"],
        timeout=5,
        capture_output=True,
        text=True
    )
except subprocess.TimeoutExpired:
    print("Process timed out after 5 seconds")

This is essential for network commands, external API calls, or any process that might hang:

import subprocess
 
def run_with_timeout(cmd, timeout_seconds=30):
    """Run a command with timeout and error handling."""
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            check=True
        )
        return result.stdout
    except subprocess.TimeoutExpired:
        print(f"Command timed out after {timeout_seconds}s: {' '.join(cmd)}")
        return None
    except subprocess.CalledProcessError as e:
        print(f"Command failed (code {e.returncode}): {e.stderr}")
        return None
    except FileNotFoundError:
        print(f"Command not found: {cmd[0]}")
        return None
 
# Usage
output = run_with_timeout(["ping", "-c", "4", "example.com"], timeout_seconds=10)
if output:
    print(output)

Passing Input to a Process

Use the input parameter to send data to the process's stdin:

import subprocess
 
# Send text to stdin
result = subprocess.run(
    ["grep", "error"],
    input="line 1\nerror on line 2\nline 3\nerror on line 4\n",
    capture_output=True,
    text=True
)
 
print(result.stdout)
# "error on line 2\nerror on line 4\n"

This replaces the common pattern of shuttling data around through shell pipes:

import subprocess
import json
 
# Send JSON to a processing command
data = {"name": "Alice", "score": 95}
json_string = json.dumps(data)
 
result = subprocess.run(
    ["python", "-c", "import sys, json; d = json.load(sys.stdin); print(d['name'])"],
    input=json_string,
    capture_output=True,
    text=True
)
 
print(result.stdout.strip())  # "Alice"

Environment Variables

By default, a child process inherits the current environment. You can modify it:

import subprocess
import os
 
# Add or override environment variables
custom_env = os.environ.copy()
custom_env["API_KEY"] = "secret123"
custom_env["DEBUG"] = "true"
 
result = subprocess.run(
    ["python", "-c", "import os; print(os.environ.get('API_KEY'))"],
    env=custom_env,
    capture_output=True,
    text=True
)
 
print(result.stdout.strip())  # "secret123"

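Always start from os.environ.copy(). The env argument replaces the child's environment rather than extending it, so passing a dict without the existing variables drops the inherited environment and breaks commands that depend on PATH, HOME, and similar variables.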
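
An equivalent way to build the child's environment is dict unpacking, which merges the inherited variables with your overrides in one expression. A minimal sketch; SUBPROCESS_DEMO_MODE is a made-up variable name used only for illustration.

```python
import os
import subprocess
import sys

# Merge the inherited environment with one override (hypothetical variable name).
child_env = {**os.environ, "SUBPROCESS_DEMO_MODE": "staging"}

result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['SUBPROCESS_DEMO_MODE'])"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)

seen_by_child = result.stdout.strip()
print(seen_by_child)  # staging

# The parent's own environment is untouched.
print("SUBPROCESS_DEMO_MODE" in os.environ)
```

Because the override lives in a separate dict, it affects only this one child and never leaks into later subprocess calls.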

Working Directory

The cwd parameter sets the working directory for the child process:

import subprocess
 
# Run git status in a specific repository
result = subprocess.run(
    ["git", "status", "--short"],
    cwd="/home/user/my-project",
    capture_output=True,
    text=True
)
 
print(result.stdout)

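subprocess.run() vs Popen: When to Use Which

subprocess.run() is a convenience wrapper around subprocess.Popen. For most tasks run() is enough. Reach for Popen only when you need one of the following:

Real-time output streaming (reading line by line and processing output as it is produced)
Interaction with a running process (sending input and reading output in a loop)
Multi-step pipelines that chain several processes together
Non-blocking execution with manual control over the process lifecycle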
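
The last item in that list is the one run() cannot do at all, so here is a minimal sketch of it. The child is a short Python sleep (an arbitrary stand-in for a real long-running command), and the loop body is a placeholder for whatever work the parent would do in the meantime.

```python
import subprocess
import sys
import time

# Popen returns immediately; the child keeps running in the background.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.5)"])

polls = 0
while proc.poll() is None:   # poll() returns None while the child is still running
    polls += 1               # placeholder for other useful work in the parent
    time.sleep(0.1)

print(f"Child exited with code {proc.returncode} after {polls} polls")
```

poll() never blocks, which is what makes it suitable for event loops or progress indicators; wait() is the blocking counterpart.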

Comparison table

Feature	`subprocess.run()`	`subprocess.Popen`	`os.system()`
Recommended	Yes (Python 3.5+)	Yes (advanced)	No (legacy)
Capture output	Yes (`capture_output=True`)	Yes (via PIPE)	No
Return value	`CompletedProcess` object	`Popen` process object	Exit status (int)
Timeout support	Yes (`timeout` param)	Manual (`timeout` on `wait()`/`communicate()`, then kill the process yourself)	No
Error checking	`check=True` raises exception	Manual	Must inspect exit status
stdin input	`input` parameter	`communicate()` or `stdin.write()`	No
Real-time output	No (waits for completion)	Yes (stream line by line)	Output goes to terminal
Pipelines	Limited (single command)	Yes (chain multiple Popen)	Yes (via shell string)
Security	Safe with list args	Safe with list args	Shell injection risk
Shell features	Only with `shell=True`	Only with `shell=True`	Always uses shell

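Advanced: subprocess.Popen

Popen gives you full control over the process lifecycle. The constructor starts the process immediately and returns a Popen object you can interact with.

Basic Popen usage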

import subprocess
 
proc = subprocess.Popen(
    ["ls", "-la"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True
)
 
stdout, stderr = proc.communicate()  # Wait for completion and get output
print(f"Return code: {proc.returncode}")
print(stdout)

Streaming output in real time

Unlike run(), Popen lets you read output line by line and process it as soon as it is produced:

import subprocess
 
proc = subprocess.Popen(
    ["ping", "-c", "5", "example.com"],
    stdout=subprocess.PIPE,
    text=True
)
 
# Read output line by line as it arrives
for line in proc.stdout:
    print(f"[LIVE] {line.strip()}")
 
proc.wait()  # Wait for process to finish
print(f"Exit code: {proc.returncode}")

This is essential for long-running commands where you need to show progress or write logs in real time:

import subprocess
import sys
 
def run_with_live_output(cmd):
    """Run a command and stream its output in real time."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1  # Line-buffered
    )
 
    output_lines = []
    for line in proc.stdout:
        line = line.rstrip()
        print(line)
        output_lines.append(line)
 
    proc.wait()
    return proc.returncode, "\n".join(output_lines)
 
# Usage
code, output = run_with_live_output(["pip", "install", "requests"])
print(f"\nFinished with exit code: {code}")

Building Pipelines

Connect multiple commands by wiring one process's stdout to the next process's stdin:

import subprocess
 
# Equivalent to: cat /var/log/syslog | grep "error" | wc -l
p1 = subprocess.Popen(
    ["cat", "/var/log/syslog"],
    stdout=subprocess.PIPE
)
 
p2 = subprocess.Popen(
    ["grep", "error"],
    stdin=p1.stdout,
    stdout=subprocess.PIPE
)
 
# Allow p1 to receive SIGPIPE if p2 exits early
p1.stdout.close()
 
p3 = subprocess.Popen(
    ["wc", "-l"],
    stdin=p2.stdout,
    stdout=subprocess.PIPE,
    text=True
)
 
p2.stdout.close()
 
output, _ = p3.communicate()
print(f"Error count: {output.strip()}")

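Calling p1.stdout.close() after connecting p1 to p2 matters: the parent drops its own copy of the pipe's read end, so if p2 exits early, p1 receives SIGPIPE instead of blocking forever on a pipe nobody reads.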
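
When the intermediate data fits comfortably in memory, a simpler alternative is to run the stages one after another with run() and hand each stage's captured stdout to the next through input. This is a sketch of that idea rather than a replacement for true streaming pipelines; it uses small Python children in place of grep and wc so that it runs anywhere, and the sample log text is invented.

```python
import subprocess
import sys

def run_stage(code, data):
    """Run one stage: a Python one-liner that reads stdin and writes stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        input=data,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

log_text = "ok\nerror: disk full\nok\nerror: timeout\n"  # made-up sample data

# Stage 1 plays the role of grep "error"; stage 2 plays the role of wc -l.
matches = run_stage(
    "import sys; sys.stdout.writelines(l for l in sys.stdin if 'error' in l)",
    log_text,
)
count = run_stage("import sys; print(len(sys.stdin.readlines()))", matches)

print(count.strip())  # 2
```

The trade-off is that each stage finishes before the next one starts, so nothing runs concurrently and all intermediate output is held in memory.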

Interactive process communication

import subprocess
 
# Start a Python REPL as a subprocess
proc = subprocess.Popen(
    ["python", "-i"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True
)
 
# Send commands and get results
stdout, stderr = proc.communicate(input="print(2 + 2)\nprint('hello')\n")
print(f"stdout: {stdout}")
print(f"stderr: {stderr}")
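
communicate() sends all input at once and then waits for the process to exit. For a genuine back-and-forth exchange you can write to stdin and read from stdout one line at a time. The sketch below assumes a cooperative child that answers every input line with exactly one output line and flushes it immediately; without that guarantee, readline() can block forever. The child here is a made-up upper-casing echo program.

```python
import subprocess
import sys

# Child: echo each input line in upper case, flushing after every line.
child_code = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(line.strip().upper(), flush=True)\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
    bufsize=1,  # line-buffered on the parent side
)

replies = []
for message in ["hello", "world"]:
    proc.stdin.write(message + "\n")
    proc.stdin.flush()                              # make sure the child sees the line now
    replies.append(proc.stdout.readline().strip())  # one reply per request

proc.stdin.close()   # EOF ends the child's loop
proc.wait()
proc.stdout.close()
print(replies)  # ['HELLO', 'WORLD']
```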

shell=True: Powerful but Dangerous

Setting shell=True runs the command through the system shell. That enables shell features but also introduces security risks.

When shell=True is useful

import subprocess
 
# Shell features: pipes, redirects, globbing, env vars
result = subprocess.run(
    "ls *.py | wc -l",
    shell=True,
    capture_output=True,
    text=True
)
print(f"Python files: {result.stdout.strip()}")
 
# Environment variable expansion
result = subprocess.run(
    "echo $HOME",
    shell=True,
    capture_output=True,
    text=True
)
print(result.stdout.strip())

The shell injection problem

Never pass unsanitized user input to a command run with shell=True:

import subprocess
 
# DANGEROUS -- shell injection vulnerability
user_input = "file.txt; rm -rf /"  # malicious input
subprocess.run(f"cat {user_input}", shell=True)  # Executes "rm -rf /"!
 
# SAFE -- use list form without shell=True
subprocess.run(["cat", user_input])  # Treats entire string as filename

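If you really must interpolate dynamic values into a shell=True command, use shlex.quote(). Note that shlex.quote() is designed for POSIX shells, not for cmd.exe on Windows: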

import subprocess
import shlex
 
user_input = "file with spaces.txt; rm -rf /"
safe_input = shlex.quote(user_input)
 
# shlex.quote wraps in single quotes, neutralizing shell metacharacters
result = subprocess.run(
    f"cat {safe_input}",
    shell=True,
    capture_output=True,
    text=True
)
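
To make the effect of shlex.quote() concrete, this short sketch prints what it returns and then parses the resulting command line back with shlex.split(), which follows POSIX shell quoting rules. No command is executed.

```python
import shlex

user_input = "file with spaces.txt; rm -rf /"

quoted = shlex.quote(user_input)
print(quoted)   # 'file with spaces.txt; rm -rf /'

# A string with only safe characters is returned unchanged.
plain = shlex.quote("report.txt")
print(plain)    # report.txt

# Parsing the command line back shows a single argument after "cat".
parts = shlex.split(f"cat {quoted}")
print(parts)    # ['cat', 'file with spaces.txt; rm -rf /']
```

The whole malicious string survives as one argument, so the shell would look for a file with that odd name instead of running rm.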

The safest approach, though, is to avoid shell=True entirely and reproduce the shell features in Python:

import subprocess
import glob
 
# Instead of: subprocess.run("ls *.py | wc -l", shell=True)
py_files = glob.glob("*.py")
print(f"Python files: {len(py_files)}")
 
# Instead of: subprocess.run("cat file1.txt file2.txt > combined.txt", shell=True)
with open("combined.txt", "w") as outfile:
    result = subprocess.run(
        ["cat", "file1.txt", "file2.txt"],
        stdout=outfile
    )
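
Environment variable expansion, the other shell feature shown earlier, can be reproduced the same way with os.path.expandvars() (and os.path.expanduser() for a leading ~). A minimal sketch: DEMO_DIR is a made-up variable, set inside the example so that it is self-contained, and the child simply prints the argument it received.

```python
import os
import subprocess
import sys

# Hypothetical variable, defined here only for the demonstration.
os.environ["DEMO_DIR"] = "/tmp/demo"

# Instead of: subprocess.run("echo $DEMO_DIR/output.log", shell=True)
expanded = os.path.expandvars("$DEMO_DIR/output.log")

result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", expanded],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # /tmp/demo/output.log
```

The expansion happens in Python before the process starts, so the child receives a plain string and no shell is involved.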

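Security Best Practices

Practice	Do	Don't
Command format	`["cmd", "arg1", "arg2"]`	`f"cmd {user_input}"` with `shell=True`
User input	Use `shlex.quote()` when a shell is unavoidable	Concatenate strings directly into the command
Shell mode	`shell=False` (the default)	`shell=True` with untrusted input
Executable path	Use a full path such as `/usr/bin/git`	Rely on PATH in security-sensitive code
Input validation	Validate and sanitize before passing	Hand raw user input straight to a command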

import subprocess
from pathlib import Path
 
def safe_file_operation(filename):
    """Safely run a command with user-supplied filename."""
    # Validate input
    path = Path(filename)
    if not path.exists():
        raise FileNotFoundError(f"File not found: {filename}")
 
    # Check for path traversal by comparing path components, not string prefixes
    # (a prefix test would also accept /home/user/uploads_evil)
    resolved = path.resolve()
    allowed_dir = Path("/home/user/uploads").resolve()
    if allowed_dir not in resolved.parents:
        raise PermissionError("Access denied: file outside allowed directory")
 
    # Use list form -- no shell injection possible
    result = subprocess.run(
        ["wc", "-l", str(resolved)],
        capture_output=True,
        text=True,
        check=True
    )
    return result.stdout.strip()

Real-World Examples

Running git commands

import subprocess
 
def git_status(repo_path):
    """Get git status for a repository."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True
    )
    return result.stdout.strip()
 
def git_log(repo_path, n=5):
    """Get last n commit messages."""
    result = subprocess.run(
        ["git", "log", "--oneline", f"-{n}"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True
    )
    return result.stdout.strip()
 
status = git_status("/home/user/my-project")
if status:
    print("Uncommitted changes:")
    print(status)
else:
    print("Working directory clean")

Compressing and extracting files

import subprocess
 
def compress_directory(source_dir, output_file):
    """Create a tar.gz archive of a directory."""
    subprocess.run(
        ["tar", "-czf", output_file, "-C", source_dir, "."],
        check=True
    )
    print(f"Created archive: {output_file}")
 
def extract_archive(archive_file, dest_dir):
    """Extract a tar.gz archive."""
    subprocess.run(
        ["tar", "-xzf", archive_file, "-C", dest_dir],
        check=True
    )
    print(f"Extracted to: {dest_dir}")
 
compress_directory("/home/user/data", "/tmp/data_backup.tar.gz")

Inspecting system information

import subprocess
 
def get_disk_usage():
    """Get disk usage summary."""
    result = subprocess.run(
        ["df", "-h", "/"],
        capture_output=True,
        text=True,
        check=True
    )
    return result.stdout
 
def get_memory_info():
    """Get memory usage on Linux."""
    result = subprocess.run(
        ["free", "-h"],
        capture_output=True,
        text=True,
        check=True
    )
    return result.stdout
 
def get_process_list(filter_name=None):
    """List running processes, optionally filtered."""
    cmd = ["ps", "aux"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
 
    if filter_name:
        lines = result.stdout.strip().split("\n")
        header = lines[0]
        matching = [line for line in lines[1:] if filter_name in line]
        return header + "\n" + "\n".join(matching)
 
    return result.stdout
 
print(get_disk_usage())

Processing data files with external tools

import subprocess
 
def sort_csv_by_column(input_file, column_index=1):
    """Sort a CSV file using the system sort command (fast for large files)."""
    result = subprocess.run(
        # -kN,N limits the sort key to column N (plain -kN runs to end of line)
        ["sort", "-t,", f"-k{column_index},{column_index}", input_file],
        capture_output=True,
        text=True,
        check=True
    )
    return result.stdout
 
def count_lines(filepath):
    """Count lines in a file using wc (faster than Python for huge files)."""
    result = subprocess.run(
        ["wc", "-l", filepath],
        capture_output=True,
        text=True,
        check=True
    )
    return int(result.stdout.strip().split()[0])
 
def search_in_files(directory, pattern, file_type="*.py"):
    """Search for a pattern in files using grep."""
    result = subprocess.run(
        ["grep", "-rn", "--include", file_type, pattern, directory],
        capture_output=True,
        text=True
    )
    # grep returns exit code 1 if no matches found (not an error)
    if result.returncode == 0:
        return result.stdout
    elif result.returncode == 1:
        return ""  # No matches
    else:
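        # Include the captured streams so callers can see why grep failed
        raise subprocess.CalledProcessError(
            result.returncode, result.args, output=result.stdout, stderr=result.stderr
        )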
 
matches = search_in_files("/home/user/project", "TODO")
if matches:
    print(matches)
else:
    print("No TODOs found")

Automated deployment script

import subprocess
import sys
 
def deploy(repo_path, branch="main"):
    """Simple deployment script using subprocess."""
    steps = [
        (["git", "fetch", "origin"], "Fetching latest changes"),
        (["git", "checkout", branch], f"Switching to {branch}"),
        (["git", "pull", "origin", branch], "Pulling latest code"),
        (["pip", "install", "-r", "requirements.txt"], "Installing dependencies"),
        (["python", "manage.py", "migrate"], "Running migrations"),
        (["python", "manage.py", "collectstatic", "--noinput"], "Collecting static files"),
    ]
 
    for cmd, description in steps:
        print(f"\n--- {description} ---")
        try:
            result = subprocess.run(
                cmd,
                cwd=repo_path,
                capture_output=True,
                text=True,
                check=True,
                timeout=120
            )
            if result.stdout:
                print(result.stdout)
        except subprocess.CalledProcessError as e:
            print(f"FAILED: {e.stderr}")
            sys.exit(1)
        except subprocess.TimeoutExpired:
            print(f"TIMEOUT: {description} took too long")
            sys.exit(1)
 
    print("\nDeployment complete")

Cross-Platform Considerations

Commands behave differently on Windows and Unix. To write portable code:

import subprocess
import platform
 
def run_command(cmd_unix, cmd_windows=None):
    """Run a command with platform awareness."""
    if platform.system() == "Windows":
        cmd = cmd_windows or cmd_unix
        # Windows often needs shell=True for built-in commands
        return subprocess.run(cmd, shell=True, capture_output=True, text=True)
    else:
        return subprocess.run(cmd, capture_output=True, text=True)
 
# List directory contents
result = run_command(
    cmd_unix=["ls", "-la"],
    cmd_windows="dir"
)
print(result.stdout)

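Key platform differences:

Feature	Unix/macOS	Windows
Shell	`/bin/sh`	`cmd.exe`
Path separator	`/`	`\`
Built-in commands (dir, copy)	Not available	Require `shell=True`
Executable extension	Not needed	Sometimes `.exe` is required
Signal handling	Full POSIX signals	Limited
`shlex.quote()`	Available	Not designed for `cmd.exe`; `subprocess.list2cmdline()` quotes arguments for Windows programs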
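
One portable way to cope with differing executable names and locations is to look the program up with shutil.which() before running it. The helper below is a hypothetical sketch, not part of subprocess; the command name in the first call is deliberately nonsensical so that the lookup fails.

```python
import shutil
import subprocess

def run_if_available(program, *args):
    """Run program with args if it is found on PATH; return None when it is missing."""
    path = shutil.which(program)   # full path to the executable, or None
    if path is None:
        return None
    return subprocess.run([path, *args], capture_output=True, text=True)

missing = run_if_available("definitely-not-a-real-command-12345")
print(missing)  # None

# git may or may not be installed on the machine running this sketch.
result = run_if_available("git", "--version")
if result is not None:
    print(result.stdout.strip())
```

On Windows, shutil.which() also consults PATHEXT, so a lookup for a bare name can resolve to a file with an extension such as .exe.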

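Running subprocess in Jupyter Notebooks

Running shell commands from a Jupyter notebook is a common workflow for data scientists. Jupyter supports the !command syntax for quick shell calls, but subprocess gives you proper error handling and output capture inside your Python code.

When you are debugging subprocess calls in a notebook, especially when a command fails silently or produces unexpected output, RunCell can help. RunCell is an AI agent for Jupyter that understands your notebook's context. It can diagnose why a subprocess command failed, suggest correct arguments, and handle platform-specific pitfalls. Instead of switching between terminal and notebook to debug shell commands, you can trace the problem directly in a cell with RunCell.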

import subprocess
 
# In a Jupyter notebook: capture and display command output
result = subprocess.run(
    ["pip", "list", "--format=columns"],
    capture_output=True,
    text=True
)
 
# Display as formatted output in the notebook
print(result.stdout)

Common Mistakes and How to Fix Them

Mistake 1: Forgetting to capture output

import subprocess
 
# Output goes to terminal, not captured
result = subprocess.run(["ls", "-la"])
print(result.stdout)  # None!
 
# Fix: add capture_output=True
result = subprocess.run(["ls", "-la"], capture_output=True, text=True)
print(result.stdout)  # Actual output

Mistake 2: Passing a string without shell=True

import subprocess
 
# Fails: string passed without shell=True
# subprocess.run("ls -la")  # FileNotFoundError: "ls -la" is not a program
 
# Fix option 1: use a list
subprocess.run(["ls", "-la"])
 
# Fix option 2: use shell=True (less safe)
subprocess.run("ls -la", shell=True)

Mistake 3: Ignoring errors

import subprocess
 
# Bad: silently continues on failure
result = subprocess.run(["rm", "/important/file"], capture_output=True, text=True)
# ... continues even if rm failed
 
# Good: check=True raises exception on failure
try:
    result = subprocess.run(
        ["rm", "/important/file"],
        capture_output=True,
        text=True,
        check=True
    )
except subprocess.CalledProcessError as e:
    print(f"Failed to delete: {e.stderr}")

Mistake 4: Popen deadlock

import subprocess
 
# DEADLOCK: stdout buffer fills up, process blocks, .wait() waits forever
proc = subprocess.Popen(["command_with_lots_of_output"], stdout=subprocess.PIPE)
proc.wait()  # Deadlock!
 
# Fix: use communicate() which handles buffering
proc = subprocess.Popen(["command_with_lots_of_output"], stdout=subprocess.PIPE, text=True)
stdout, stderr = proc.communicate()  # Safe
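
communicate() also accepts a timeout. Unlike run(), Popen does not kill the child when the timeout expires, so the usual pattern is to kill it in the exception handler and then call communicate() once more to reap it. A minimal sketch, using a Python child that sleeps for 30 seconds as a stand-in for a hung command:

```python
import subprocess
import sys

# A child that would run for 30 seconds; the parent only waits for 1.
proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(30)"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

timed_out = False
try:
    stdout, stderr = proc.communicate(timeout=1)
except subprocess.TimeoutExpired:
    timed_out = True
    proc.kill()                          # Popen does not kill the child for you
    stdout, stderr = proc.communicate()  # reap the child and collect any output

print(f"timed out: {timed_out}, return code: {proc.returncode}")
```

The return code after kill() is platform dependent (on POSIX it is the negative signal number), so test for a non-zero value rather than a specific one.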

Mistake 5: Not handling encoding

import subprocess
 
# Bytes output can cause issues
result = subprocess.run(["cat", "data.txt"], capture_output=True)
# result.stdout is bytes, not str
 
# Fix: use text=True or encoding parameter
result = subprocess.run(["cat", "data.txt"], capture_output=True, text=True)
 
# For specific encodings:
result = subprocess.run(
    ["cat", "data.txt"],
    capture_output=True,
    encoding="utf-8",
    errors="replace"  # Handle invalid bytes
)

subprocess.run() Parameter Reference

import subprocess
 
args = ["ls", "-la"]  # placeholder command; substitute your own
 
result = subprocess.run(
    args,                    # Command as list or string
    stdin=None,              # Input source (PIPE, DEVNULL, file object, or None)
    stdout=None,             # Output destination
    stderr=None,             # Error destination
    capture_output=False,    # Shorthand for stdout=PIPE, stderr=PIPE
    text=False,              # Decode output as strings (alias: universal_newlines)
    shell=False,             # Run through system shell
    cwd=None,                # Working directory
    timeout=None,            # Seconds before TimeoutExpired
    check=False,             # Raise CalledProcessError on non-zero exit
    env=None,                # Environment variables dict
    encoding=None,           # Output encoding (alternative to text=True)
    errors=None,             # Encoding error handling ('strict', 'replace', 'ignore')
    input=None,              # String/bytes to send to stdin
)

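FAQ

What is the subprocess module in Python?

The subprocess module is the part of Python's standard library for running external commands and programs from within a Python script. It replaces older approaches such as os.system(), os.popen(), and the commands module. It lets you spawn new processes, connect to their stdin/stdout/stderr pipes, obtain return codes, and handle timeouts. The main interfaces are subprocess.run() for straightforward command execution and subprocess.Popen for advanced cases that need real-time I/O or process pipelines.

What is the difference between subprocess.run() and subprocess.Popen?

subprocess.run() is a higher-level convenience function: it runs a command, waits for it to finish, and returns a CompletedProcess object containing the output, which suits the vast majority of tasks. subprocess.Popen is the lower-level class that gives you direct control over the process: you can stream output line by line, send input interactively, build multi-process pipelines, and manage the process lifecycle yourself. Use Popen when you need real-time output streaming or need to connect several processes.

Is shell=True dangerous in subprocess?

Yes. Combining untrusted input with shell=True creates a shell injection vulnerability. With shell=True the command string is interpreted by the system shell, so metacharacters such as ;, |, &&, and $() take effect and an attacker can use them to inject arbitrary commands. The safe default is shell=False with arguments passed as a list. If you must use shell=True, quote every dynamic value with shlex.quote() and never pass raw user input.

How do I capture the output of a subprocess command?

Pass capture_output=True and text=True to subprocess.run(). The output is available as a string in result.stdout, and errors in result.stderr. For example: result = subprocess.run(["ls", "-la"], capture_output=True, text=True), then read result.stdout. Without text=True, the output is returned as bytes.

How do I handle subprocess timeouts in Python?

Pass the timeout parameter (in seconds) to subprocess.run(). If the process runs longer than the timeout, Python kills it and raises subprocess.TimeoutExpired. For example: subprocess.run(["slow_command"], timeout=30). With Popen, use proc.communicate(timeout=30) or proc.wait(timeout=30); in that case the child is not killed automatically, so call proc.kill() in the exception handler. Always wrap timeout-sensitive code in try/except.

If subprocess is recommended, why does os.system() still work?

os.system() has not been formally deprecated, but it is considered a legacy interface. It runs commands through the shell (much like shell=True), cannot capture output, has no timeout mechanism, and returns only an exit status. subprocess.run() does everything os.system() does and adds output capture, error handling, timeout control, and safer argument passing. New code should use subprocess.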

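Summary

The subprocess module is Python's definitive tool for running external commands. For straightforward command execution, use subprocess.run(): a single call handles output capture, error checking, timeouts, and input. Reach for subprocess.Popen only when you need real-time output streaming, interactive communication with a process, or multi-step pipelines.

The most important habit: avoid combining shell=True with user input. Passing arguments as a list removes the shell from the picture, and with it the risk of shell injection. Use check=True to catch failures early, timeout to guard against hung processes, and text=True to work with strings rather than bytes.

Whether you are automating git or orchestrating a data pipeline, subprocess gives you a level of control and safety that os.system() cannot match. Once you know these patterns, you can integrate any external tool into your Python workflow with confidence.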
