Python List Comprehensions — The Complete Practical Guide

Once list comprehensions click, you'll rarely write a for loop that only appends to a list. They're not just syntactic sugar: they signal intent clearly, and in CPython they usually run faster than the equivalent loop because of how they are compiled to bytecode. According to the official Python tutorial, comprehensions provide a concise way to create lists based on existing sequences or other iterables. This article builds the mental model first, then covers the common patterns: filtering, nesting, dict and set comprehensions, and the cases where you should reach for a plain for loop instead.

The Basic Pattern

The anatomy of a list comprehension is [expression for item in iterable]. Three parts: the output expression (what each element becomes), the loop variable, and the iterable to pull from. Start with a familiar loop and collapse it down.

python

# Plain for loop — building a list of file sizes in KB
file_sizes_bytes = [1024, 204800, 51200, 3145728, 8192]

sizes_kb = []
for size in file_sizes_bytes:
    sizes_kb.append(size / 1024)

# print(sizes_kb)  →  [1.0, 200.0, 50.0, 3072.0, 8.0]

# Same result as a list comprehension — one line, same meaning
sizes_kb = [size / 1024 for size in file_sizes_bytes]

The comprehension reads almost like English: "give me size / 1024 for every size in file_sizes_bytes." That clarity is the real win — a reader doesn't have to trace an append call to understand what you're building.

python

# Another common pattern: deriving one list from another
usernames = ["alice", "bob", "carol"]

# Build a list of display names
display_names = [name.capitalize() for name in usernames]
# ['Alice', 'Bob', 'Carol']

# Or extract a single field from a list of dicts
users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob",   "role": "viewer"},
    {"id": 3, "name": "Carol", "role": "editor"},
]

names = [user["name"] for user in users]
# ['Alice', 'Bob', 'Carol']
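The iterable does not have to be a list. As a minimal sketch (all sample data here is made up for illustration), the same three-part pattern works over a range, a string, the pairs produced by zip(), and dict.items():

```python
# Illustrative data only; these names are not from the examples above.
squares = [n * n for n in range(5)]

letters = [ch.upper() for ch in "abc"]

names = ["alice", "bob"]
ages = [31, 27]
labels = [f"{name} ({age})" for name, age in zip(names, ages)]

quota_mb = {"alice": 512, "bob": 2048}
pairs = [f"{name}={mb}" for name, mb in quota_mb.items()]

print(squares)  # [0, 1, 4, 9, 16]
print(letters)  # ['A', 'B', 'C']
print(labels)   # ['alice (31)', 'bob (27)']
print(pairs)    # ['alice=512', 'bob=2048']
```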

Filtering with if

Add a condition at the end and only elements that pass make it into the output list: [expression for item in iterable if condition]. This is the pattern that replaces a for + if + append combination.

python

# Sample data: each user carries an "active" flag
users = [
    {"name": "Alice", "active": True},
    {"name": "Bob",   "active": False},
    {"name": "Carol", "active": True},
    {"name": "Dave",  "active": False},
]

# Loop version: extract only active users
active_users = []
for user in users:
    if user["active"]:
        active_users.append(user["name"])

# Comprehension version: identical result
active_users = [user["name"] for user in users if user["active"]]
# ['Alice', 'Carol']

python

# Filtering a raw CSV row: drop blanks and whitespace-only values
# (the address is a placeholder)
raw_row = ["alice@example.com", "", "  ", "editor", " "]

clean_row = [field.strip() for field in raw_row if field.strip()]
# ['alice@example.com', 'editor']

# Filtering a list of log levels
log_lines = [
    "INFO  server started",
    "DEBUG loading config",
    "ERROR database timeout",
    "DEBUG query took 450ms",
    "ERROR disk space low",
]

errors = [line for line in log_lines if line.startswith("ERROR")]
# ['ERROR database timeout', 'ERROR disk space low']
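The clean_row example calls field.strip() twice per element, once in the filter and once in the output. One way to avoid the repeat, assuming Python 3.8 or later, is an assignment expression (the := operator) in the filter. The variable name stripped and the sample data below are illustrative choices, and this is a sketch rather than the only idiom:

```python
# Sample row, invented for this sketch
raw_row = ["  alice ", "", "  ", "editor", " "]

# The filter runs first for each element, so `stripped` is already bound
# when the output expression is evaluated.
clean_row = [stripped for field in raw_row if (stripped := field.strip())]
print(clean_row)  # ['alice', 'editor']

# Several conditions can be combined with `and`
log_lines = ["ERROR database timeout", "ERROR disk space low", "INFO ok"]
disk_errors = [
    line for line in log_lines
    if line.startswith("ERROR") and "disk" in line
]
print(disk_errors)  # ['ERROR disk space low']
```

If the walrus form feels too dense for your team, calling strip() twice is perfectly acceptable for short rows.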

Working with Strings

String processing is where comprehensions earn their keep. The combination of Python's rich string methods and comprehension syntax keeps transformation pipelines readable without intermediate variables.

python

# Strip whitespace from tags coming out of a form field
raw_tags = ["  python ", "data science", " machine-learning ", "API"]

tags = [tag.strip().lower() for tag in raw_tags]
# ['python', 'data science', 'machine-learning', 'api']

# Normalise email addresses from a signup CSV (placeholder addresses)
raw_emails = ["Alice@Example.com", "  BOB@example.com  ", "carol@EXAMPLE.COM"]

emails = [e.strip().lower() for e in raw_emails]
# ['alice@example.com', 'bob@example.com', 'carol@example.com']

# Extract file extensions from a list of uploaded filenames
filenames = ["report.pdf", "avatar.PNG", "data.CSV", "archive.tar.gz", "notes.txt"]

extensions = [name.rsplit(".", 1)[-1].lower() for name in filenames if "." in name]
# ['pdf', 'png', 'csv', 'gz', 'txt']

Tip: When normalising user input, always strip as well as lowercase. The order of the two calls doesn't change the result, but skipping .strip() means a value like " Alice@Example.com " is lowercased correctly yet still carries leading and trailing spaces that will break a database lookup or JSON key match.
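To see why the stripping step matters, here is a small sketch with a hypothetical set of known addresses (the example.com values are placeholders). The unstripped value fails the membership test even after lowercasing:

```python
# Placeholder addresses, used only for illustration
known_emails = {"alice@example.com", "bob@example.com"}

submitted = "  Alice@Example.com "

lowered_only = submitted.lower()
normalised = submitted.strip().lower()

print(lowered_only in known_emails)  # False: the spaces are still there
print(normalised in known_emails)    # True

# The same normalisation applied to a whole list
raw = ["  Alice@Example.com ", "BOB@example.com", "dave@example.com "]
matches = [e.strip().lower() for e in raw if e.strip().lower() in known_emails]
print(matches)  # ['alice@example.com', 'bob@example.com']
```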

Dict and Set Comprehensions

The same idea extends to dicts and sets. A dict comprehension uses curly braces with a key: value pair: {key: value for item in iterable}. A set comprehension drops the colon and produces a deduplicated collection: {expression for item in iterable}.

python

# Invert a dict — swap keys and values
permissions = {"alice": "admin", "bob": "viewer", "carol": "editor"}

by_role = {role: name for name, role in permissions.items()}
# {'admin': 'alice', 'viewer': 'bob', 'editor': 'carol'}

# Build a fast lookup dict from a list of user records
users = [
    {"id": 101, "name": "Alice", "active": True},
    {"id": 102, "name": "Bob",   "active": False},
    {"id": 103, "name": "Carol", "active": True},
]

# O(1) lookups by ID — much faster than scanning the list every time
user_by_id = {user["id"]: user for user in users}
# {101: {...}, 102: {...}, 103: {...}}

# Access a user directly
user_by_id[102]["name"]  # 'Bob'
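One thing to watch when inverting a dict this way: keys must be unique, so if two names share a role, the later one silently replaces the earlier one. A minimal sketch with made-up data, followed by one possible way to keep every name by grouping with a plain loop:

```python
# Made-up data: two users share the "viewer" role
permissions = {"alice": "admin", "bob": "viewer", "dave": "viewer"}

by_role = {role: name for name, role in permissions.items()}
print(by_role)  # {'admin': 'alice', 'viewer': 'dave'}  (bob is lost)

# Grouping keeps every name; a loop with setdefault is one simple option
names_by_role = {}
for name, role in permissions.items():
    names_by_role.setdefault(role, []).append(name)
print(names_by_role)  # {'admin': ['alice'], 'viewer': ['bob', 'dave']}
```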

python

# Set comprehension — deduplicate a list of file extensions
uploads = ["report.pdf", "data.csv", "summary.pdf", "export.CSV", "notes.txt"]

unique_extensions = {name.rsplit(".", 1)[-1].lower() for name in uploads if "." in name}
# {'pdf', 'csv', 'txt'}  — order not guaranteed

One caveat with sets: order is not preserved. If you need to deduplicate a list while keeping the original order, a set comprehension is the wrong tool — use list(dict.fromkeys(items)) instead, which leverages the insertion-ordered behaviour of dicts in Python 3.7+.
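As a quick sketch of that order-preserving idiom (the tag list is hypothetical):

```python
# Hypothetical list of tags with repeats
tags = ["python", "api", "python", "json", "api"]

# dict.fromkeys keeps the first occurrence of each key, in insertion order
unique_in_order = list(dict.fromkeys(tags))
print(unique_in_order)  # ['python', 'api', 'json']

# A set comprehension removes the repeats too, but its order is arbitrary
unique_set = {tag for tag in tags}
print(unique_set == {"python", "api", "json"})  # True
```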

Nested Comprehensions

You can nest comprehensions to iterate over nested structures. The most common use case is flattening a list of lists — a matrix from a CSV parse, chunked API response pages, or grouped query results.

python

# Flatten a 2D list (e.g. paginated API results)
pages = [
    [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}],
    [{"id": 3, "name": "Carol"}],
    [{"id": 4, "name": "Dave"}, {"id": 5, "name": "Eve"}],
]

all_users = [user for page in pages for user in page]
# [{'id': 1, ...}, {'id': 2, ...}, {'id': 3, ...}, {'id': 4, ...}, {'id': 5, ...}]

# The order mirrors what nested for loops would produce:
# for page in pages:
#     for user in page:
#         all_users.append(user)

Read nested comprehensions left to right — the outer loop comes first, the inner loop second. That matches the order of nested for loops.
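A filter can be attached to a nested comprehension too. The sketch below, using invented page data, keeps only active users while flattening and checks that the result matches the nested-loop version:

```python
# Invented data for illustration
pages = [
    [{"name": "Alice", "active": True}, {"name": "Bob", "active": False}],
    [{"name": "Carol", "active": True}],
]

# Clauses appear in the same order as the equivalent nested statements
active_names = [
    user["name"]
    for page in pages
    for user in page
    if user["active"]
]

expected = []
for page in pages:
    for user in page:
        if user["active"]:
            expected.append(user["name"])

print(active_names)              # ['Alice', 'Carol']
print(active_names == expected)  # True
```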

python

# Two levels — fine and readable
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [cell for row in matrix for cell in row]
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Three levels — stop here. Use a plain loop or a helper function.
# This is the point where readability dies:
cube = [[[1,2],[3,4]],[[5,6],[7,8]]]

# Don't do this:
flat3 = [val for layer in cube for row in layer for val in row]

# Do this instead:
flat3 = []
for layer in cube:
    for row in layer:
        flat3.extend(row)

Generator expressions use the same syntax with parentheses instead of brackets: (user["id"] for user in users). They evaluate lazily — elements are produced one at a time rather than building the full list in memory. Use them when you only need to iterate once, or when you're passing the result straight to sum(), max(), any(), or similar functions. See the generator expressions reference for the full details.
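A short sketch of both points, using made-up user records: passing a generator expression directly to a consuming function, and the fact that a generator can only be iterated once.

```python
# Made-up records for illustration
users = [
    {"id": 1, "active": True},
    {"id": 2, "active": False},
    {"id": 3, "active": True},
]

# Passed straight to a consuming function: no intermediate list is built
total_ids = sum(user["id"] for user in users)
has_inactive = any(not user["active"] for user in users)
print(total_ids, has_inactive)  # 6 True

# A generator can be consumed only once
ids = (user["id"] for user in users)
first_pass = list(ids)
second_pass = list(ids)
print(first_pass)   # [1, 2, 3]
print(second_pass)  # []  (already exhausted)
```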

The if/else Ternary Inside a Comprehension

When you need to transform elements into one of two values — rather than drop them — use a ternary expression in the output position. The position matters: the ternary belongs at the start, not after the iterable.

python

# Correct: ternary is the output expression
users = [
    {"name": "Alice", "active": True},
    {"name": "Bob",   "active": False},
    {"name": "Carol", "active": True},
]

status = ["active" if user["active"] else "inactive" for user in users]
# ['active', 'inactive', 'active']

# Normalise a config value — replace None with a default
raw_config = ["/var/log", None, "/tmp", None, "/etc/app"]

paths = [path if path is not None else "/var/log/default" for path in raw_config]
# ['/var/log', '/var/log/default', '/tmp', '/var/log/default', '/etc/app']

python

# Common mistake — putting the ternary after the iterable (SyntaxError)
# status = ["active" for user in users if user["active"] else "inactive"]  # WRONG

# The trailing "if" is a filter — it drops non-matching items entirely.
# The ternary ("if ... else ...") in the output expression transforms all items.
# They serve different purposes. You can combine them:

# Keep everyone except Bob, and show their status
status = ["active" if user["active"] else "inactive"
          for user in users
          if user["name"] != "Bob"]
# ['active', 'active']  — Bob was filtered out entirely
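If you want to confirm the SyntaxError without breaking your own script, one option is to hand the expression to the built-in compile() as a string. This is only a sketch for experimenting; the variable names are illustrative:

```python
users = [{"name": "Alice", "active": True}, {"name": "Bob", "active": False}]

# The misplaced ternary, kept as a string so this file itself still parses
wrong = '["active" for user in users if user["active"] else "inactive"]'
try:
    compile(wrong, "<example>", "eval")
    parsed = True
except SyntaxError:
    parsed = False
print(parsed)  # False: a trailing filter `if` cannot take an `else`

# The ternary in the output position parses and runs
status = ["active" if user["active"] else "inactive" for user in users]
print(status)  # ['active', 'inactive']
```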

When to Use a for Loop Instead

List comprehensions are for building lists. If what you're really doing is running side effects, use a for loop — not a comprehension. This matters beyond style: a comprehension that discards its result wastes memory building a list nobody uses, and it hides intent from the reader.

python

import requests  # third-party package: pip install requests

# The URL below is a placeholder; requests needs an absolute URL with a scheme.
NOTIFY_URL = "https://example.com/api/notify"

# Bad: comprehension for side effects (writing to a file, printing, calling an API)
[print(f"Processing user {user['name']}") for user in users]   # don't do this
[requests.post(NOTIFY_URL, json=user) for user in users]       # definitely don't do this

# Good: plain for loop makes the intent obvious
for user in users:
    print(f"Processing user {user['name']}")

for user in users:
    requests.post(NOTIFY_URL, json=user)
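To make the wasted list concrete, here is a tiny sketch (with invented names) showing what a side-effect comprehension actually builds:

```python
names = ["Alice", "Bob", "Carol"]

# print() returns None, so the comprehension builds a list of Nones
leftover = [print(f"Processing user {name}") for name in names]
print(leftover)  # [None, None, None]
```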

python

# Bad — comprehension with logic complex enough to need comments or multiple steps
result = [
    user["name"].strip().lower()
    if user.get("active") and user.get("email_verified")
    else user["name"].strip().lower() + " (unverified)"
    for user in users
    if user.get("role") in ("admin", "editor") and user.get("last_login") is not None
]

# Good — break it out when the logic is this involved
result = []
for user in users:
    if user.get("role") not in ("admin", "editor"):
        continue
    if user.get("last_login") is None:
        continue
    name = user["name"].strip().lower()
    if not (user.get("active") and user.get("email_verified")):
        name += " (unverified)"
    result.append(name)

Use a comprehension when you are building a new list from an existing iterable with a clear expression and an optional filter.
Use a for loop when you are running side effects — I/O, network calls, printing, mutating external state.
Use a for loop when the transformation logic needs multiple lines, intermediate variables, or comments to be understood.
Use a generator expression when you only need to iterate once or pass directly to sum(), any(), max() — no list needed in memory.
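A possible middle ground is to move the involved logic into small named functions and keep the comprehension itself short. The function names is_eligible and display_name below are hypothetical, as is the sample data; treat this as a sketch of the idea rather than a recommended structure:

```python
def is_eligible(user):
    """Admins and editors who have logged in at least once."""
    return (
        user.get("role") in ("admin", "editor")
        and user.get("last_login") is not None
    )


def display_name(user):
    """Lowercased name, tagged when the account is not fully verified."""
    name = user["name"].strip().lower()
    if not (user.get("active") and user.get("email_verified")):
        name += " (unverified)"
    return name


# Invented records for illustration
users = [
    {"name": " Alice ", "role": "admin", "last_login": "2024-01-05",
     "active": True, "email_verified": True},
    {"name": "Bob", "role": "viewer", "last_login": "2024-01-02",
     "active": True, "email_verified": True},
    {"name": "Carol", "role": "editor", "last_login": "2024-01-03",
     "active": True, "email_verified": False},
    {"name": "Dave", "role": "editor", "last_login": None,
     "active": True, "email_verified": True},
]

result = [display_name(user) for user in users if is_eligible(user)]
print(result)  # ['alice', 'carol (unverified)']
```

The comprehension stays a one-liner, and each rule has a name and a docstring that can be tested on its own.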

Performance Note

List comprehensions are usually faster than an equivalent for + append loop in CPython. The reason is bytecode: a comprehension appends with a dedicated LIST_APPEND opcode, whereas the loop has to look up and call result.append on every iteration. The Python performance tips wiki covers this. Gaps of roughly 10–40% are typical for pure-Python workloads, but the exact figure depends on the interpreter version, the list size, and how much work each element needs, so measure before relying on it.
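One way to check the bytecode claim on your own interpreter is the standard-library dis module. The sketch below assumes CPython. The helper opnames is a hypothetical name, and it walks nested code objects because CPython versions before 3.12 compile a comprehension into a separate code object:

```python
import dis


def with_loop(data):
    result = []
    for x in data:
        result.append(x * 2)
    return result


def with_comprehension(data):
    return [x * 2 for x in data]


def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested code object
            names |= opnames(const)
    return names


comp_ops = opnames(with_comprehension.__code__)
loop_ops = opnames(with_loop.__code__)

print("LIST_APPEND" in comp_ops)  # True on CPython
print("LIST_APPEND" in loop_ops)  # False on CPython
```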

python

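import timeit

data = list(range(100_000))

# for + append
def with_loop():
    result = []
    for x in data:
        result.append(x * 2)
    return result

# list comprehension
def with_comprehension():
    return [x * 2 for x in data]

# generator expression: no list built at all
# (it returns a sum rather than a list, so it is not a like-for-like comparison)
def with_generator():
    return sum(x * 2 for x in data)

# Time 100 calls of each function and report the mean per call
runs = 100
for fn in (with_loop, with_comprehension, with_generator):
    seconds = timeit.timeit(fn, number=runs)
    print(f"{fn.__name__}(): {seconds / runs * 1000:.1f} ms")

# Reported results on CPython 3.12 (indicative only; they vary by machine and version):
# with_loop():          ~7.2 ms
# with_comprehension(): ~4.8 ms  (~33% faster)
# with_generator():     ~4.1 ms  (and uses O(1) memory vs O(n))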

If you don't need a materialised list — you're passing the result to sum(), any(), max(), or iterating it once — use a generator expression instead. It uses constant memory regardless of input size, which matters when processing large CSV exports or JSON payloads in a tight loop.
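A rough way to see the memory difference is sys.getsizeof. This is a sketch under the assumption of a 64-bit CPython build, and the variable names are illustrative:

```python
import sys

n = 100_000
as_list = [x * 2 for x in range(n)]
as_gen = (x * 2 for x in range(n))

# getsizeof reports the container only, not the integers it refers to,
# but that is enough to show the difference in scale.
list_is_much_bigger = sys.getsizeof(as_list) > 100 * sys.getsizeof(as_gen)
print(list_is_much_bigger)  # True on CPython

# Both produce the same total
same_total = sum(as_list) == sum(as_gen)
print(same_total)  # True
```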

Wrapping Up

List comprehensions are one of those Python features that feel awkward for about a week and then become impossible to live without. The mental model is simple: output expression, loop variable, iterable, optional filter. Stick to that and you'll write readable, idiomatic Python. When the logic gets complex, that's the signal to fall back to a plain loop — not a failure, just the right tool for the job.

If you're working with JSON data in Python — turning API responses into lists of values, extracting fields from records, building lookup dicts — the tools on this site pair well with what you've just learned. Try the JSON Formatter to inspect and pretty-print JSON payloads before you write the comprehension that processes them, or the CSV Formatter to validate CSV data before parsing it into a list of rows. For the full language reference, the Python docs on list comprehensions and PEP 202 (the original proposal) are worth a read.
