How to Easily Search for a Value in a Pandas DataFrame Column
Searching for values in a Pandas DataFrame column is a fundamental operation for filtering, cleaning, and analyzing data. Pandas offers many powerful ways to search—including boolean indexing, isin(), query(), and string operations—that make this task fast and intuitive.
This updated guide covers the most useful and modern techniques for searching values in a DataFrame, along with examples you can directly apply in real workflows.
Pandas DataFrame Basics
A Pandas DataFrame is a two-dimensional tabular structure with labeled rows and columns. Here's a simple example:
import pandas as pd
data = {
    'Name': ['John', 'Emma', 'Peter', 'David', 'Sophie'],
    'Age': [27, 21, 24, 30, 29],
    'Gender': ['Male', 'Female', 'Male', 'Male', 'Female'],
    'City': ['New York', 'London', 'Paris', 'Tokyo', 'Rio de Janeiro']
}
df = pd.DataFrame(data)
print(df)
Output:
     Name  Age  Gender            City
0    John   27    Male        New York
1    Emma   21  Female          London
2   Peter   24    Male           Paris
3   David   30    Male           Tokyo
4  Sophie   29  Female  Rio de Janeiro
Searching for Values in a DataFrame Column
1. Search for an Exact Match (Boolean Indexing)
result = df[df['Age'] == 27]
print(result)
Output:
   Name  Age Gender      City
0  John   27   Male  New York
You can use any comparison operator:
| Operator | Meaning |
|---|---|
| == | equal |
| != | not equal |
| > / < | greater / less |
| >= / <= | greater or equal / less or equal |
Example: find rows where Age ≥ 25
df[df['Age'] >= 25]
2. Search for Multiple Values Using isin()
cities = ['Paris', 'Tokyo']
df[df['City'].isin(cities)]
Output:
    Name  Age Gender   City
2  Peter   24   Male  Paris
3  David   30   Male  Tokyo
Use ~df['col'].isin(values) to exclude values.
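As a minimal sketch of the exclusion form, the snippet below rebuilds the sample DataFrame from this guide and negates the isin() mask with ~. The variable name excluded is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Emma', 'Peter', 'David', 'Sophie'],
    'Age': [27, 21, 24, 30, 29],
    'Gender': ['Male', 'Female', 'Male', 'Male', 'Female'],
    'City': ['New York', 'London', 'Paris', 'Tokyo', 'Rio de Janeiro'],
})

cities = ['Paris', 'Tokyo']

# isin() returns a boolean Series; ~ flips every True/False in it,
# so only rows whose City is NOT in the list are kept.
excluded = df[~df['City'].isin(cities)]
print(excluded['Name'].tolist())
```

The parentheses-free form works here because ~ applies to the whole isin() result; when combining it with other conditions via & or |, wrap each condition in parentheses.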
3. Search Using query() (Readable & Fast)
query() lets you filter rows using an SQL-like syntax—great for readability.
df.query("Age == 27")
Or multiple conditions:
df.query("Age > 25 and Gender == 'Female'")
This often results in cleaner code than nested Boolean indexing.
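query() expressions can also refer to Python variables by prefixing them with @, which avoids building the query string by hand. The sketch below assumes the sample DataFrame from this guide; min_age and wanted are hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Emma', 'Peter', 'David', 'Sophie'],
    'Age': [27, 21, 24, 30, 29],
    'Gender': ['Male', 'Female', 'Male', 'Male', 'Female'],
    'City': ['New York', 'London', 'Paris', 'Tokyo', 'Rio de Janeiro'],
})

min_age = 25          # hypothetical threshold
wanted = ['Female']   # hypothetical list of accepted values

# @name looks up a variable from the surrounding Python scope;
# "in" inside query() behaves like isin().
result = df.query("Age > @min_age and Gender in @wanted")
print(result['Name'].tolist())
```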
4. Search for String Patterns (str.contains())
Useful for filtering text columns.
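Two details of str.contains() are worth seeing on a small example before using it on real data: how missing values are handled, and the fact that the pattern is treated as a regular expression by default. This is a sketch on a hypothetical Series (not the sample DataFrame) so that a missing value and a literal dot are present.

```python
import pandas as pd

# Hypothetical data with one missing entry and one value containing a dot.
cities = pd.Series(['New York', 'London', None, 'St. Louis'])

# na=False treats missing entries as non-matches, so the mask is purely
# boolean and safe to use for row selection.
has_on = cities.str.contains('on', case=False, na=False)

# The pattern is a regular expression by default, where '.' matches any
# character. regex=False searches for the literal text instead.
has_dot = cities.str.contains('.', regex=False, na=False)

print(has_on.tolist())
print(has_dot.tolist())
```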
Contains substring
df[df['City'].str.contains('on', case=False, na=False)]
Starts or ends with
df[df['Name'].str.startswith('J')]
df[df['City'].str.endswith('o')]
5. Search for Missing/Non-Missing Values
df[df['City'].isna()] # missing
df[df['City'].notna()] # non-missing
6. Search Across Multiple Columns
Find rows where any column matches a value:
df[df.eq('Male').any(axis=1)]
Find rows where all conditions match:
df[(df['Gender'] == 'Female') & (df['Age'] > 25)]
Performance Tips (Realistic & Accurate)
Some performance tips were previously misunderstood in older tutorials. Here is the accurate guidance:
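✔ 1. Convert columns to category if they have repeated values
When a column has few distinct values relative to its length, the category dtype reduces memory use and can speed up comparisons:
df['City'] = df['City'].astype('category')
✔ 2. Use NumPy arrays for very large datasets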
import numpy as np
ages = df['Age'].to_numpy()
df[ages == 27]
✔ 3. Avoid apply() for searching
Vectorized operations (boolean indexing, isin(), query()) are almost always faster than apply(), which calls a Python function once per element.
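The sketch below shows the same search written both ways on the sample DataFrame. It only checks that the two masks agree; it makes no timing claim, and the gap is best measured on your own data (for example with timeit).

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Emma', 'Peter', 'David', 'Sophie'],
    'Age': [27, 21, 24, 30, 29],
})

# apply() runs the lambda once for every element in Python.
mask_apply = df['Age'].apply(lambda age: age >= 25)

# The vectorized comparison evaluates the whole column in one operation.
mask_vec = df['Age'] >= 25

# Both produce the same boolean mask, so the vectorized form is a
# drop-in replacement.
same = bool((mask_apply == mask_vec).all())
print(same)
print(df[mask_vec]['Name'].tolist())
```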
❌ Removed: “.loc[] is faster than boolean indexing”
This is incorrect—they behave the same under the hood.
.loc[] is for label-based selection, not a speed enhancement.
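A short sketch of the point above, using the sample DataFrame: the same boolean mask gives identical rows with or without .loc[]. What .loc[] adds is the ability to pick columns in the same call.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Emma', 'Peter', 'David', 'Sophie'],
    'Age': [27, 21, 24, 30, 29],
})

mask = df['Age'] > 25

plain = df[mask]          # boolean indexing
with_loc = df.loc[mask]   # same mask passed through .loc[]

# The two selections contain exactly the same rows and values.
identical = plain.equals(with_loc)

# .loc[] can also select a column at the same time as filtering rows.
names = df.loc[mask, 'Name']
print(identical, names.tolist())
```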
⚠ About searchsorted()
searchsorted() only works properly if the column is sorted and does not confirm that the value exists.
For example:
df_sorted = df.sort_values('Age')
idx = df_sorted['Age'].searchsorted(27)
This finds the insertion position, not necessarily the row with Age = 27.
Use it only for advanced workflows.
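To make the caveat concrete, the sketch below calls searchsorted() for a value that exists and one that does not, then adds an explicit existence check. The helper contains_sorted is a hypothetical name written for this illustration, not a pandas function.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Emma', 'Peter', 'David', 'Sophie'],
    'Age': [27, 21, 24, 30, 29],
})

# Sort and reset the index so positions and labels line up: 21, 24, 27, 29, 30
ages = df.sort_values('Age')['Age'].reset_index(drop=True)

pos_hit = ages.searchsorted(27)   # position of an existing value
pos_miss = ages.searchsorted(28)  # 28 is absent, yet a position is still returned


def contains_sorted(sorted_values, value):
    """Hypothetical helper: True only if value is actually present."""
    pos = sorted_values.searchsorted(value)
    # The position may equal len(sorted_values) when value exceeds every
    # element, so check the bound before comparing.
    return bool(pos < len(sorted_values) and sorted_values.iloc[pos] == value)


print(int(pos_hit), int(pos_miss))
print(contains_sorted(ages, 27), contains_sorted(ages, 28))
```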
Conclusion
Searching for values in a Pandas column is essential for data exploration and cleaning. Pandas provides many efficient ways to search:
- Boolean indexing for exact matches
- isin() for multiple values
- query() for clean, SQL-like filtering
- String searching with str.contains()
- Missing value filtering
- Multi-column filtering
These methods help you extract the exact data you need quickly, accurately, and cleanly.
Frequently Asked Questions
- How do I search for a specific value in a DataFrame column? Use boolean indexing:
  df[df['Age'] == 27]
- How do I retrieve a specific value from a column? Use the row label and column name with .loc, which avoids chained indexing:
  df.loc[0, 'Age']
- How do I get a single value quickly? Use .at (label-based) or .iat (position-based):
  df.at[0, 'Age']
  df.iat[0, 1]
