Python Requests Library: Complete Guide to HTTP Requests in Python
Making HTTP requests in Python using the built-in urllib module is notoriously complex and verbose. You have to manually encode parameters, handle response objects with multiple method calls, and write dozens of lines of boilerplate code just to send a simple API request. This complexity slows down development and makes your code harder to maintain.
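The contrast is easy to see in miniature. This sketch builds the same search URL both ways without sending anything over the network (the request is only prepared, not sent; `api.example.com` is a hypothetical endpoint):

```python
from urllib.parse import urlencode

import requests

# With urllib-style plumbing you assemble the query string yourself...
base = 'https://api.example.com/search'  # hypothetical endpoint
manual_url = f"{base}?{urlencode({'q': 'python', 'page': 2})}"

# ...while requests builds the identical URL from a params dict.
prepared = requests.Request('GET', base, params={'q': 'python', 'page': 2}).prepare()

print(prepared.url)  # https://api.example.com/search?q=python&page=2
assert prepared.url == manual_url
```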
The Python requests library eliminates this frustration by providing an elegant, human-friendly API for HTTP communication. Whether you're consuming REST APIs, scraping websites, uploading files, or building automation scripts, the requests library makes HTTP operations intuitive and straightforward with just a few lines of code.
In this comprehensive guide, you'll learn everything from basic GET and POST requests to advanced features like authentication, sessions, error handling, and real-world API integration patterns.
Installing the Python Requests Library
The requests library is not part of Python's standard library, so you need to install it separately using pip:
```shell
pip install requests
```
For conda users:
```shell
conda install requests
```
Once installed, you can import it in your Python scripts:
```python
import requests
```
To verify the installation and check the version:
```python
import requests
print(requests.__version__)
```
Making GET Requests with Python Requests
GET requests are the most common HTTP method, used to retrieve data from servers. The requests library makes GET requests incredibly simple.
Basic GET Request
Here's how to make a basic GET request:
```python
import requests
response = requests.get('https://api.github.com')
print(response.status_code)  # 200
print(response.text)  # Response body as string
```
GET Request with Query Parameters
Instead of manually constructing URLs with query strings, use the params parameter:
```python
import requests
# Method 1: Using params dictionary
params = {
    'q': 'python requests',
    'sort': 'stars',
    'order': 'desc'
}
response = requests.get('https://api.github.com/search/repositories', params=params)
# The URL is automatically constructed:
# https://api.github.com/search/repositories?q=python+requests&sort=stars&order=desc
print(response.url)  # View the constructed URL
data = response.json()  # Parse JSON response
```
GET Request with Custom Headers
Many APIs require custom headers for authentication or content negotiation:
```python
import requests
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Accept': 'application/json',
    'Accept-Language': 'en-US,en;q=0.9'
}
response = requests.get('https://api.example.com/data', headers=headers)
print(response.json())
```
Making POST Requests in Python
POST requests send data to a server, commonly used for form submissions and API operations.
POST Request with Form Data
To send form-encoded data (like HTML form submissions):
```python
import requests
# Send form data
data = {
    'username': 'john_doe',
    'password': 'secret123',
    'remember_me': True
}
response = requests.post('https://example.com/login', data=data)
print(response.status_code)
```
POST Request with JSON Payload
Modern REST APIs typically expect JSON payloads. The requests library handles JSON serialization automatically:
```python
import requests
# Method 1: Using json parameter (recommended)
payload = {
    'name': 'New Project',
    'description': 'A test project',
    'tags': ['python', 'api']
}
response = requests.post('https://api.example.com/projects', json=payload)
# Method 2: Manual JSON encoding
import json
headers = {'Content-Type': 'application/json'}
response = requests.post(
    'https://api.example.com/projects',
    data=json.dumps(payload),
    headers=headers
)
```
POST Request with Files
Upload files using the files parameter:
```python
import requests
# Upload a single file
files = {'file': open('report.pdf', 'rb')}
response = requests.post('https://example.com/upload', files=files)
# Note: these open() handles are never closed explicitly; in production
# code, open files in a with block so they are closed after the upload.
# Upload multiple files
files = {
    'file1': open('document.pdf', 'rb'),
    'file2': open('image.jpg', 'rb')
}
response = requests.post('https://example.com/upload', files=files)
# Upload file with additional form data
files = {'file': open('data.csv', 'rb')}
data = {'description': 'Monthly report', 'category': 'finance'}
response = requests.post('https://example.com/upload', files=files, data=data)
```
Other HTTP Methods: PUT, PATCH, DELETE
The requests library supports all standard HTTP methods:
```python
import requests
# PUT - Replace entire resource
data = {'name': 'Updated Name', 'status': 'active'}
response = requests.put('https://api.example.com/users/123', json=data)
# PATCH - Partially update resource
data = {'status': 'inactive'}
response = requests.patch('https://api.example.com/users/123', json=data)
# DELETE - Remove resource
response = requests.delete('https://api.example.com/users/123')
print(response.status_code)  # 204 No Content
# HEAD - Get headers only (no response body)
response = requests.head('https://example.com')
print(response.headers)
# OPTIONS - Get supported methods
response = requests.options('https://api.example.com/users')
print(response.headers.get('Allow'))
```
Understanding the Response Object
The Response object contains all information about the server's reply:
```python
import requests
response = requests.get('https://api.github.com/users/github')
# Status code
print(response.status_code)  # 200, 404, 500, etc.
# Response body as string
print(response.text)
# Response body as JSON (for JSON APIs)
data = response.json()
print(data['login'])
# Raw binary content (for images, files)
image_data = response.content
with open('profile.jpg', 'wb') as f:
    f.write(image_data)
# Response headers
print(response.headers)
print(response.headers['Content-Type'])
# Encoding
print(response.encoding)  # 'utf-8'
# Request information
print(response.request.headers)
print(response.request.url)
# Check if request was successful
if response.ok:  # True if status_code < 400
    print("Success!")
```
Working with Request Headers
Headers carry important metadata about the request:
Setting Custom Headers
```python
import requests
headers = {
    'User-Agent': 'MyApp/1.0',
    'Accept': 'application/json',
    'Accept-Encoding': 'gzip, deflate',
    'Connection': 'keep-alive',
    'Custom-Header': 'custom-value'
}
response = requests.get('https://api.example.com', headers=headers)
```
Accessing Response Headers
```python
import requests
response = requests.get('https://api.github.com')
# Dictionary-like access
print(response.headers['Content-Type'])
print(response.headers.get('X-RateLimit-Remaining'))
# Case-insensitive access
print(response.headers['content-type'])  # Works!
# Iterate all headers
for key, value in response.headers.items():
    print(f"{key}: {value}")
```
Authentication with Python Requests
The requests library supports multiple authentication mechanisms:
Basic Authentication
```python
import requests
from requests.auth import HTTPBasicAuth
# Method 1: Using auth parameter (recommended)
response = requests.get(
    'https://api.example.com/protected',
    auth=('username', 'password')
)
# Method 2: Explicit HTTPBasicAuth
response = requests.get(
    'https://api.example.com/protected',
    auth=HTTPBasicAuth('username', 'password')
)
# Method 3: Manual header (not recommended)
import base64
credentials = base64.b64encode(b'username:password').decode('utf-8')
headers = {'Authorization': f'Basic {credentials}'}
response = requests.get('https://api.example.com/protected', headers=headers)
```
Bearer Token Authentication
Common for JWT tokens and OAuth 2.0:
```python
import requests
token = 'your_access_token_here'
headers = {'Authorization': f'Bearer {token}'}
response = requests.get('https://api.example.com/user', headers=headers)
```
API Key Authentication
```python
import requests
# Method 1: Query parameter
params = {'api_key': 'your_api_key_here'}
response = requests.get('https://api.example.com/data', params=params)
# Method 2: Custom header
headers = {'X-API-Key': 'your_api_key_here'}
response = requests.get('https://api.example.com/data', headers=headers)
```
OAuth 2.0 Authentication
For OAuth 2.0, use the requests-oauthlib library:
```python
from requests_oauthlib import OAuth2Session
client_id = 'your_client_id'
client_secret = 'your_client_secret'
token_url = 'https://oauth.example.com/token'
oauth = OAuth2Session(client_id)
token = oauth.fetch_token(token_url, client_secret=client_secret)
# Make authenticated requests
response = oauth.get('https://api.example.com/protected')
```
Sessions for Persistent Connections
Sessions maintain cookies, connection pooling, and configuration across multiple requests:
```python
import requests
# Create a session
session = requests.Session()
# Set headers for all requests in this session
session.headers.update({
    'User-Agent': 'MyApp/1.0',
    'Accept': 'application/json'
})
# Login and session maintains cookies
login_data = {'username': 'john', 'password': 'secret'}
session.post('https://example.com/login', data=login_data)
# Subsequent requests use the session cookies
response1 = session.get('https://example.com/dashboard')
response2 = session.get('https://example.com/profile')
# Close the session
session.close()
```
Session with Authentication
```python
import requests
session = requests.Session()
session.auth = ('username', 'password')
# All requests in this session use the authentication
response1 = session.get('https://api.example.com/users')
response2 = session.get('https://api.example.com/posts')
```
Session Context Manager
```python
import requests
with requests.Session() as session:
    session.headers.update({'Authorization': 'Bearer token123'})
    response1 = session.get('https://api.example.com/data')
    response2 = session.post('https://api.example.com/data', json={'key': 'value'})
# Session is automatically closed after the with block
```
Timeout and Retry Strategies
Always set timeouts to prevent requests from hanging indefinitely:
Setting Timeouts
```python
import requests
# Single timeout value (applies to both connect and read)
response = requests.get('https://api.example.com', timeout=5)
# Separate connect and read timeouts
response = requests.get('https://api.example.com', timeout=(3, 10))
# 3 seconds to establish connection, 10 seconds to read response
# No timeout (dangerous - may hang forever)
response = requests.get('https://api.example.com', timeout=None)
```
Implementing Retry Logic
```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry  # requests.packages.urllib3 is deprecated
# Configure retry strategy
retry_strategy = Retry(
    total=3,  # Total number of retries
    backoff_factor=1,  # Wait 1, 2, 4 seconds between retries
    status_forcelist=[429, 500, 502, 503, 504],  # Retry on these status codes
    allowed_methods=["HEAD", "GET", "OPTIONS", "POST"]
)
adapter = HTTPAdapter(max_retries=retry_strategy)
session = requests.Session()
session.mount("http://", adapter)
session.mount("https://", adapter)
# Requests will automatically retry on failure
response = session.get('https://api.example.com/data')
```
Error Handling in Python Requests
Robust error handling is crucial for production applications:
Handling Common Exceptions
```python
import requests
from requests.exceptions import (
    ConnectionError,
    Timeout,
    HTTPError,
    RequestException
)
try:
    response = requests.get('https://api.example.com/data', timeout=5)
    response.raise_for_status()  # Raises HTTPError for 4xx/5xx status codes
    data = response.json()
except ConnectionError:
    print("Failed to connect to the server")
except Timeout:
    print("Request timed out")
except HTTPError as e:
    print(f"HTTP error occurred: {e}")
    print(f"Status code: {e.response.status_code}")
except ValueError:
    # JSON decoding error; checked before RequestException because
    # requests' JSONDecodeError inherits from both
    print("Invalid JSON response")
except RequestException as e:
    # Catches all other requests exceptions
    print(f"An error occurred: {e}")
```
Checking Status Codes
```python
import requests
response = requests.get('https://api.example.com/data')
# Method 1: Manual check
if response.status_code == 200:
    data = response.json()
elif response.status_code == 404:
    print("Resource not found")
elif response.status_code >= 500:
    print("Server error")
# Method 2: Using raise_for_status()
try:
    response.raise_for_status()
    data = response.json()
except requests.exceptions.HTTPError as e:
    if response.status_code == 404:
        print("Resource not found")
    elif response.status_code == 401:
        print("Authentication required")
    else:
        print(f"HTTP error: {e}")
# Method 3: Using response.ok
if response.ok:  # True if status_code < 400
    data = response.json()
else:
    print(f"Request failed with status {response.status_code}")
```
SSL Verification and Certificates
By default, requests verifies SSL certificates:
```python
import requests
# Default behavior - verify SSL certificate
response = requests.get('https://api.example.com')
# Disable SSL verification (not recommended for production)
response = requests.get('https://example.com', verify=False)
# Suppress the InsecureRequestWarning
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
response = requests.get('https://example.com', verify=False)
# Use custom CA bundle
response = requests.get('https://example.com', verify='/path/to/ca_bundle.crt')
# Client-side certificates
response = requests.get(
    'https://example.com',
    cert=('/path/to/client.crt', '/path/to/client.key')
)
```
Using Proxies with Python Requests
Configure proxy servers for requests:
```python
import requests
# HTTP and HTTPS proxies
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
response = requests.get('https://api.example.com', proxies=proxies)
# SOCKS proxy (requires requests[socks])
proxies = {
    'http': 'socks5://user:pass@host:port',
    'https': 'socks5://user:pass@host:port'
}
# Use environment variables
# Set HTTP_PROXY and HTTPS_PROXY environment variables
response = requests.get('https://api.example.com')  # Automatically uses env proxies
# Disable proxies
response = requests.get('https://api.example.com', proxies={'http': None, 'https': None})
```
Python HTTP Libraries Comparison
Here's how the requests library compares to alternatives:
| Feature | requests | urllib | httpx | aiohttp |
|---|---|---|---|---|
| Ease of Use | Excellent (Pythonic API) | Poor (verbose) | Excellent | Good |
| Async Support | No | No | Yes | Yes |
| HTTP/2 Support | No | No | Yes | No |
| Session Management | Built-in | Manual | Built-in | Built-in |
| JSON Handling | Automatic | Manual | Automatic | Automatic |
| Connection Pooling | Yes | No | Yes | Yes |
| Standard Library | No (pip install) | Yes | No (pip install) | No (pip install) |
| Documentation | Excellent | Good | Excellent | Good |
| Performance | Good | Fair | Excellent | Excellent (async) |
| SSL/TLS | Full support | Full support | Full support | Full support |
| Best For | Synchronous HTTP, general use | Simple scripts, no dependencies | Modern sync/async HTTP | High-performance async |
When to use each:
- requests: Default choice for most synchronous HTTP operations. Best for web scraping, API consumption, and general HTTP tasks.
- urllib: Only when you cannot install external packages (standard library requirement).
- httpx: When you need HTTP/2 support or want a modern requests-compatible API with async capabilities.
- aiohttp: For high-performance asynchronous applications handling many concurrent requests.
Rate Limiting and Respectful Scraping
When scraping websites or calling APIs, implement rate limiting:
```python
import requests
import time
from datetime import datetime

class RateLimitedSession:
    def __init__(self, requests_per_second=1):
        self.session = requests.Session()
        self.min_interval = 1.0 / requests_per_second
        self.last_request_time = 0

    def get(self, url, **kwargs):
        # Wait if necessary
        elapsed = time.time() - self.last_request_time
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        # Make request
        response = self.session.get(url, **kwargs)
        self.last_request_time = time.time()
        return response

# Use rate-limited session
session = RateLimitedSession(requests_per_second=2)  # 2 requests per second
urls = ['https://api.example.com/item/1', 'https://api.example.com/item/2']
for url in urls:
    response = session.get(url)
    print(f"{datetime.now()}: {response.status_code}")
```
Respecting robots.txt
```python
import requests
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

def can_fetch(url, user_agent='MyBot'):
    """Check if URL can be scraped according to robots.txt"""
    parts = urlsplit(url)
    robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
    rp = RobotFileParser()
    rp.set_url(robots_url)
    rp.read()
    return rp.can_fetch(user_agent, url)

url = 'https://example.com/page'
if can_fetch(url):
    response = requests.get(url)
else:
    print("Scraping not allowed by robots.txt")
```
Real-World Examples and Use Cases
Example 1: Consuming a REST API
```python
import requests

class GitHubAPI:
    def __init__(self, token=None):
        self.base_url = 'https://api.github.com'
        self.session = requests.Session()
        if token:
            self.session.headers.update({'Authorization': f'token {token}'})

    def get_user(self, username):
        """Get user information"""
        response = self.session.get(f'{self.base_url}/users/{username}')
        response.raise_for_status()
        return response.json()

    def search_repositories(self, query, sort='stars', limit=10):
        """Search repositories"""
        params = {'q': query, 'sort': sort, 'per_page': limit}
        response = self.session.get(f'{self.base_url}/search/repositories', params=params)
        response.raise_for_status()
        return response.json()['items']

    def create_issue(self, owner, repo, title, body):
        """Create an issue in a repository"""
        url = f'{self.base_url}/repos/{owner}/{repo}/issues'
        data = {'title': title, 'body': body}
        response = self.session.post(url, json=data)
        response.raise_for_status()
        return response.json()

# Usage
api = GitHubAPI(token='your_github_token')
user = api.get_user('torvalds')
print(f"Name: {user['name']}, Followers: {user['followers']}")
repos = api.search_repositories('python requests', limit=5)
for repo in repos:
    print(f"{repo['full_name']}: {repo['stargazers_count']} stars")
```
Example 2: Downloading Files with Progress
```python
import requests
from tqdm import tqdm

def download_file(url, filename):
    """Download file with progress bar"""
    response = requests.get(url, stream=True)
    response.raise_for_status()
    total_size = int(response.headers.get('content-length', 0))
    with open(filename, 'wb') as f, tqdm(
        desc=filename,
        total=total_size,
        unit='B',
        unit_scale=True,
        unit_divisor=1024,
    ) as progress_bar:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
            progress_bar.update(len(chunk))

# Download a file
download_file('https://example.com/large-file.zip', 'downloaded.zip')
```
Example 3: Web Scraping with Error Handling
```python
import requests
from bs4 import BeautifulSoup
import time

def scrape_articles(base_url, max_pages=5):
    """Scrape article titles from a news website"""
    session = requests.Session()
    session.headers.update({
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
    })
    articles = []
    for page in range(1, max_pages + 1):
        try:
            url = f"{base_url}?page={page}"
            response = session.get(url, timeout=10)
            response.raise_for_status()
            soup = BeautifulSoup(response.content, 'html.parser')
            titles = soup.find_all('h2', class_='article-title')
            for title in titles:
                articles.append({
                    'title': title.text.strip(),
                    'url': title.find('a')['href'] if title.find('a') else None
                })
            print(f"Scraped page {page}: {len(titles)} articles")
            time.sleep(1)  # Rate limiting
        except requests.exceptions.RequestException as e:
            print(f"Error scraping page {page}: {e}")
            continue
    return articles

# Usage
articles = scrape_articles('https://news.example.com/articles', max_pages=3)
print(f"Total articles collected: {len(articles)}")
```
Example 4: API Integration with Pagination
```python
import requests

def fetch_all_items(api_url, headers=None):
    """Fetch all items from a paginated API"""
    items = []
    page = 1
    while True:
        try:
            params = {'page': page, 'per_page': 100}
            response = requests.get(api_url, params=params, headers=headers, timeout=10)
            response.raise_for_status()
            data = response.json()
            if not data:  # No more items
                break
            items.extend(data)
            print(f"Fetched page {page}: {len(data)} items")
            page += 1
            # Check for pagination in headers
            if 'Link' in response.headers:
                links = response.headers['Link']
                if 'rel="next"' not in links:
                    break
        except requests.exceptions.RequestException as e:
            print(f"Error fetching page {page}: {e}")
            break
    return items

# Usage
all_items = fetch_all_items('https://api.example.com/items')
print(f"Total items: {len(all_items)}")
```
Testing APIs in Jupyter with RunCell
When developing and testing API integrations, RunCell provides an AI-powered agent environment directly in Jupyter notebooks. Instead of manually debugging HTTP requests and responses, RunCell's intelligent agent can help you:
- Automatically construct and test API requests with proper authentication
- Debug response parsing and error handling in real-time
- Generate code snippets for common HTTP patterns
- Validate API responses against expected schemas
- Iterate quickly on data transformations from API responses
This is particularly valuable when working with complex APIs that require multiple authentication steps, pagination handling, or intricate data parsing logic. RunCell accelerates the development workflow by reducing the back-and-forth of testing HTTP requests manually.
FAQ
What is the Python requests library used for?
The Python requests library is used for making HTTP requests to web servers and APIs. It simplifies tasks like fetching web pages, consuming REST APIs, sending form data, uploading files, and handling authentication. It's the most popular HTTP library in Python due to its intuitive API and comprehensive feature set.
How do I install the Python requests library?
Install the requests library using pip: pip install requests. For conda environments, use conda install requests. Once installed, import it with import requests in your Python code. The library is not part of Python's standard library, so installation is required.
What's the difference between requests.get() and requests.post()?
requests.get() retrieves data from a server without modifying anything, typically used for fetching web pages or API data. requests.post() sends data to the server to create or update resources, commonly used for form submissions, file uploads, or API operations that modify server state. GET requests pass parameters in the URL, while POST requests send data in the request body.
How do I handle errors with the Python requests library?
Use try-except blocks to catch request exceptions: ConnectionError for network issues, Timeout for slow responses, HTTPError for 4xx/5xx status codes, and RequestException as a catch-all. Call response.raise_for_status() after each request to automatically raise HTTPError for failed requests. Always set timeouts to prevent requests from hanging indefinitely.
How do I send JSON data with Python requests?
Use the json parameter in requests: requests.post(url, json=data). The library automatically serializes Python dictionaries to JSON and sets the Content-Type: application/json header. To parse JSON responses, use response.json(), which deserializes the JSON response body into a Python dictionary.
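You can see what the json parameter adds without hitting a server by preparing (not sending) a request; the URL below is hypothetical:

```python
import requests

# Prepare (don't send) a request to inspect what the json parameter produces.
req = requests.Request('POST', 'https://api.example.com/projects',  # hypothetical URL
                       json={'name': 'demo'}).prepare()

print(req.headers['Content-Type'])  # application/json
print(req.body)                     # b'{"name": "demo"}'
```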
Should I use requests or urllib in Python?
Use the requests library for most HTTP operations. It offers a cleaner API, automatic JSON handling, built-in session management, and better error handling compared to urllib. Only use urllib when you cannot install external packages and must rely solely on Python's standard library. For modern applications requiring HTTP/2 or async support, consider httpx as an alternative.
How do I add authentication to Python requests?
For Basic authentication, use requests.get(url, auth=('username', 'password')). For Bearer tokens (JWT, OAuth), add an Authorization header: headers = {'Authorization': f'Bearer {token}'}. For API keys, either add them as query parameters using the params dictionary or as custom headers like 'X-API-Key'. Sessions can persist authentication across multiple requests.
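For instance, preparing (not sending) a request against a hypothetical URL shows the header that Basic authentication generates under the hood:

```python
import requests

# The auth tuple becomes a base64-encoded Authorization header.
req = requests.Request('GET', 'https://api.example.com/protected',  # hypothetical URL
                       auth=('user', 'pass')).prepare()

print(req.headers['Authorization'])  # Basic dXNlcjpwYXNz
```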
What are sessions in Python requests and when should I use them?
Sessions maintain configuration (headers, cookies, authentication) across multiple requests to the same server. Use sessions when making multiple requests to an API, when you need to maintain login state with cookies, or when you want to reuse TCP connections for better performance. Create a session with session = requests.Session() and use session.get() instead of requests.get().
Conclusion
The Python requests library is an indispensable tool for HTTP communication in Python. Its elegant API transforms complex HTTP operations into simple, readable code. From basic GET requests to advanced features like authentication, sessions, file uploads, and error handling, requests provides everything you need for robust HTTP interactions.
By mastering the patterns and best practices in this guide—setting timeouts, implementing retry logic, handling errors gracefully, and respecting rate limits—you'll build reliable applications that communicate effectively with web services and APIs. Whether you're consuming REST APIs, scraping websites, or building automation tools, the requests library makes HTTP operations straightforward and maintainable.