This article contains 25 Python comprehension exercises covering list, dict, and set comprehensions and generator expressions, from beginner to expert level, to help you write more efficient and Pythonic code.
Each coding challenge includes a Practice Problem, Hint, Solution code, and detailed Explanation, ensuring you don’t just copy code, but genuinely practice and understand how and why it works.
- All solutions have been fully tested on Python 3.
- Use our Online Code Editor to solve these exercises in real time.
Table of contents
- Exercise 1: Squares List
- Exercise 2: Even Numbers
- Exercise 3: String Lengths
- Exercise 4: Uppercase Converter
- Exercise 5: Flatten a 2D List
- Exercise 6: Filter & Transform Together
- Exercise 7: Word Frequency Dict
- Exercise 8: Invert a Dictionary
- Exercise 9: Unique Vowels
- Exercise 10: Square Mapping
- Exercise 11: FizzBuzz with Comprehension
- Exercise 12: Matrix Transposition
- Exercise 13: Cartesian Product
- Exercise 14: Extract Digits
- Exercise 15: Nested Filtering
- Exercise 16: Conditional Dict Comprehension
- Exercise 17: Grouped Dict from Two Lists
- Exercise 18: Set of Common Elements
- Exercise 19: Character Frequency Set
- Exercise 20: Lazy Squares
- Exercise 21: Infinite Fibonacci Generator
- Exercise 22: Chained Generator Pipeline
- Exercise 23: CSV Row Parser
- Exercise 24: Dict of Grouped Anagrams
- Exercise 25: Comprehension vs Generator Benchmark
Exercise 1: Squares List
Problem Statement: Generate a list of squares for every integer from 1 to 20 using a list comprehension. Do not use a for loop with append().
Purpose: This exercise introduces the list comprehension syntax as a concise and readable replacement for the classic accumulate-and-append loop. Mastering the basic [expression for item in iterable] form is the first step toward writing idiomatic Python and understanding the more powerful filtered and nested variations that follow in later exercises.
Given Input: Integers from 1 to 20 (inclusive).
Expected Output: [1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400]
▼ Hint
Use range(1, 21) as the iterable and n ** 2 as the expression: [n ** 2 for n in range(1, 21)]. range(1, 21) produces integers from 1 up to and including 20.
▼ Solution & Explanation
Explanation:
- [n ** 2 for n in range(1, 21)]: The three parts of a list comprehension are the output expression (n ** 2), the loop variable (n), and the iterable (range(1, 21)). Python evaluates the expression for each value of n and collects all results into a new list in one step.
- range(1, 21): Generates integers from 1 up to but not including 21, so the last value is 20. This is a common off-by-one point: the stop argument is always exclusive in Python ranges.
- Equivalent loop: The comprehension replaces three lines (squares = [], a for loop, and squares.append(n ** 2)) with a single expression. Both produce identical results, but the comprehension communicates intent more directly: “a list of squares.”
- Performance: List comprehensions are generally faster than equivalent append loops in CPython because the interpreter can optimise the internal list-building operation. For most code the difference is negligible, but the readability benefit alone justifies preferring comprehensions for straightforward transformations.
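The points above can be sketched as follows. This is a minimal example that follows the hint; the name squares_loop is introduced here only to compare the comprehension against the loop it replaces.

```python
# Comprehension form: output expression, loop variable, iterable.
squares = [n ** 2 for n in range(1, 21)]

# Equivalent accumulate-and-append loop, shown only for comparison.
squares_loop = []
for n in range(1, 21):
    squares_loop.append(n ** 2)

print(squares)
print(squares == squares_loop)  # True
```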
Exercise 2: Even Numbers
Problem Statement: Given a list of integers, use a list comprehension with a conditional clause to extract only the even numbers and collect them into a new list.
Purpose: Adding an if clause to a list comprehension turns it into a combined filter-and-collect operation. This pattern replaces the common if-inside-a-loop idiom with a single readable expression and is one of the most frequently used comprehension forms in everyday Python code.
Given Input: numbers = [3, 7, 2, 14, 9, 8, 11, 6, 5, 10]
Expected Output: [2, 14, 8, 6, 10]
▼ Hint
Append an if condition after the for clause: [n for n in numbers if n % 2 == 0]. Only elements for which the condition evaluates to True are included in the output list.
▼ Solution & Explanation
Explanation:
- if n % 2 == 0: The condition is evaluated for every element before the output expression is applied. When it is True the element passes through; when it is False the element is silently skipped. The output expression and the filter condition are independent, so you can transform and filter in the same comprehension.
- Order of clauses: In a list comprehension the order is always [expression for variable in iterable if condition]. The if clause comes last and acts as a gate. Placing the condition before the for keyword (as in a Python ternary expression) would mean something different: it would be a conditional expression on the output value, not a filter.
- Original order preserved: The output list contains elements in the same order they appeared in the input. Unlike a set, a list comprehension does not sort or deduplicate; it simply iterates and selects.
- Equivalent loop: This replaces evens = []; for n in numbers: if n % 2 == 0: evens.append(n). The comprehension version is not only shorter but also signals the reader immediately that the result is a filtered subset of the input.
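To make the difference between a trailing filter and a conditional output expression concrete, here is a small sketch. The labels list is an illustrative addition, not part of the exercise.

```python
numbers = [3, 7, 2, 14, 9, 8, 11, 6, 5, 10]

# Trailing `if` is a filter: odd numbers are dropped from the result.
evens = [n for n in numbers if n % 2 == 0]

# `if ... else` before `for` is a conditional expression: every element
# still produces an output, so the result keeps the input length.
labels = ["even" if n % 2 == 0 else "odd" for n in numbers]

print(evens)        # [2, 14, 8, 6, 10]
print(len(labels))  # 10
```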
Exercise 3: String Lengths
Problem Statement: Given a list of words, use a list comprehension to create a new list where each element is the length of the corresponding word from the original list.
Purpose: This exercise reinforces that the output expression in a list comprehension can be any valid Python expression — including a function call. Applying len() to each element is a simple example of the broader pattern of projecting a list of objects onto one of their properties or derived values.
Given Input: words = ["python", "list", "comprehension", "is", "powerful"]
Expected Output: [6, 4, 13, 2, 8]
▼ Hint
Use len(word) as the output expression: [len(word) for word in words]. The built-in len() function returns an integer, so the resulting list will contain integers in the same order as the input words.
▼ Solution & Explanation
Explanation:
- len(word): Called once per element, it returns the number of characters in the string. Because len() accepts any sequence, the same comprehension pattern works unchanged if the input list contained tuples, sub-lists, or any other sized objects.
- Positional correspondence: The output list has exactly the same number of elements as the input and in the same order. lengths[0] is the length of words[0], and so on. This one-to-one mapping is guaranteed by list comprehensions when there is no if filter clause.
- Pairing with zip(): To display word-length pairs side by side you can write list(zip(words, lengths)), which produces [('python', 6), ('list', 4), ...]. Alternatively, [(word, len(word)) for word in words] produces the same result in a single comprehension.
- Real-world use: Projecting a list of objects onto a property is one of the most common comprehension patterns: extracting email addresses from a list of user dictionaries, pulling timestamps from log entries, or converting a list of file paths to their base names with os.path.basename().
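A short sketch of the solution and the two pairing approaches described above. The names pairs_zip and pairs_comp are chosen here for illustration.

```python
words = ["python", "list", "comprehension", "is", "powerful"]

# Project each word onto its length.
lengths = [len(word) for word in words]

# Two ways to pair each word with its length.
pairs_zip = list(zip(words, lengths))
pairs_comp = [(word, len(word)) for word in words]

print(lengths)                  # [6, 4, 13, 2, 8]
print(pairs_zip == pairs_comp)  # True
```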
Exercise 4: Uppercase Converter
Problem Statement: Given a list of lowercase strings, use a list comprehension to produce a new list where every string has been converted to uppercase. The original list must remain unchanged.
Purpose: This exercise shows that the output expression in a comprehension can call a method on each element, not just a standalone function. It also reinforces the non-destructive nature of list comprehensions: they always produce a new list, leaving the source data untouched.
Given Input: fruits = ["apple", "banana", "cherry", "date", "elderberry"]
Expected Output: ['APPLE', 'BANANA', 'CHERRY', 'DATE', 'ELDERBERRY']
▼ Hint
Call the string method directly on the loop variable: [fruit.upper() for fruit in fruits]. String methods return a new string and do not modify the original, so the source list stays unchanged.
▼ Solution & Explanation
Explanation:
- fruit.upper(): Calls the str.upper() method on each element. Python strings are immutable, so upper() always returns a brand-new string rather than modifying the original. The comprehension collects these new strings into a new list.
- Original list unchanged: The comprehension creates a completely separate list object. Printing fruits after the comprehension confirms that the source data is unaffected. This is a key property to understand: comprehensions are non-destructive by design.
- Method chaining in the expression: The output expression can chain multiple method calls: [fruit.strip().upper().replace("A", "@") for fruit in fruits] is valid and still reads as a single comprehension. Keep chaining reasonable to preserve readability.
- Generalising the pattern: Any string transformation method works the same way: .lower(), .title(), .strip(), .replace(). The same pattern also applies to objects: [user.get_full_name() for user in users] extracts a derived value from each object in a list.
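The following sketch shows the basic solution together with the chained variant mentioned above. The variable names upper_fruits and masked are assumptions made for this example.

```python
fruits = ["apple", "banana", "cherry", "date", "elderberry"]

# Each call to upper() returns a new string; a new list collects them.
upper_fruits = [fruit.upper() for fruit in fruits]

# Chained method calls are still a single output expression.
masked = [fruit.strip().upper().replace("A", "@") for fruit in fruits]

print(upper_fruits)  # ['APPLE', 'BANANA', 'CHERRY', 'DATE', 'ELDERBERRY']
print(masked[1])     # B@N@N@
print(fruits[0])     # apple (the source list is unchanged)
```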
Exercise 5: Flatten a 2D List
Problem Statement: Given a matrix represented as a list of lists (a 2D list), use a nested list comprehension to flatten it into a single one-dimensional list. Each inner list can be of a different length.
Purpose: Nested list comprehensions unlock a second dimension of iteration within the same expression. Flattening is the canonical example: it requires iterating over outer rows and inner elements simultaneously. This pattern extends naturally to matrix transposition, Cartesian products, and any operation that needs to visit every cell in a grid.
Given Input: matrix = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
Expected Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
▼ Hint
- Use two for clauses inside one comprehension: [item for row in matrix for item in row]. Read it left to right: the outer loop iterates over rows, the inner loop iterates over items within each row.
- The order of the for clauses in a nested comprehension mirrors the order you would write them as nested for loops: outer loop first, inner loop second.
▼ Solution & Explanation
Explanation:
- for row in matrix: The outer loop iterates over the three sublists. On the first pass row = [1, 2, 3], on the second row = [4, 5], and on the third row = [6, 7, 8, 9].
- for item in row: The inner loop iterates over every element within the current row. Each item is appended to the output list by the leading expression. The two for clauses together visit every element across all rows in order.
- Reading order vs. nesting order: A common point of confusion is that [item for row in matrix for item in row] reads left to right with the outermost loop first, the same order as the equivalent nested for loops written top to bottom. This is different from nested list comprehensions that generate lists of lists, where the inner comprehension is written inside square brackets.
- Jagged rows are handled automatically: Because the inner loop iterates over whatever is in row, sublists of different lengths work without any special handling. [4, 5] contributes two elements and [6, 7, 8, 9] contributes four, and the output reflects that naturally.
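As a sketch of the reading-order point, the comprehension and its nested-loop equivalent are shown side by side below. The name flat_loop is introduced only for the comparison.

```python
matrix = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# Two for clauses: outer loop first, inner loop second.
flat = [item for row in matrix for item in row]

# The same clauses written as nested loops, top to bottom.
flat_loop = []
for row in matrix:
    for item in row:
        flat_loop.append(item)

print(flat)               # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(flat == flat_loop)  # True
```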
Exercise 6: Filter & Transform Together
Problem Statement: Given a list of integers, use a single list comprehension to produce a new list that contains the squares of only the odd numbers. Even numbers should be excluded entirely from the output.
Purpose: This exercise combines both halves of the comprehension syntax — a transformation expression and a filter condition — in one concise statement. Performing a filter and a transform simultaneously in a single pass is one of the most practical and common real-world uses of list comprehensions.
Given Input: numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Expected Output: [1, 9, 25, 49, 81]
▼ Hint
Combine the squaring expression with an odd-number condition: [n ** 2 for n in numbers if n % 2 != 0]. The if clause filters first; only elements that pass the condition have the expression applied to them.
▼ Solution & Explanation
Explanation:
- if n % 2 != 0: The filter is evaluated first for each element. Numbers where the remainder after dividing by 2 is not zero are odd. Even numbers (where n % 2 == 0) never reach the squaring step and are absent from the output entirely.
- Single-pass efficiency: The comprehension visits each element once, applies the condition, and computes the square only for those that pass. This is more efficient than first filtering into an intermediate list and then mapping a square operation over it, because no temporary list is created.
- Alternative odd check: In Python, n % 2 == 1 is equivalent to n % 2 != 0 for every integer, including negatives, because the result of % takes the sign of the divisor (-3 % 2 is 1). n % 2 != 0 is still the safer habit: it states the intent ("not even") directly and stays correct in languages such as C or Java, where -3 % 2 is -1 and an == 1 test would miss negative odd numbers.
- Template for real-world use: The pattern [transform(x) for x in data if condition(x)] is one of the most reusable templates in Python. Examples include extracting discounted prices only for items in stock, converting only non-empty strings, or formatting only rows that meet a threshold, all expressible in a single readable line.
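The sketch below shows the single-pass solution next to the two-step version it avoids, and checks how Python's % operator treats a negative odd number. The list mixed is hypothetical test data added for this example.

```python
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Filter and transform in one pass.
odd_squares = [n ** 2 for n in numbers if n % 2 != 0]

# Two-step version: builds an intermediate list first.
odds = [n for n in numbers if n % 2 != 0]
odd_squares_two_step = [n ** 2 for n in odds]

# In Python the remainder takes the sign of the divisor.
mixed = [-3, -2, 3]
odd_by_ne = [n for n in mixed if n % 2 != 0]
odd_by_eq = [n for n in mixed if n % 2 == 1]

print(odd_squares)             # [1, 9, 25, 49, 81]
print(-3 % 2)                  # 1
print(odd_by_ne == odd_by_eq)  # True
```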
Exercise 7: Word Frequency Dict
Problem Statement: Given a list of words that may contain duplicates, use a dictionary comprehension to build a dictionary that maps each unique word to the number of times it appears in the list.
Purpose: Dictionary comprehensions extend the comprehension concept to key-value pairs, allowing you to build a dict in a single expression rather than with a loop and manual assignment. Word frequency counting is a classic application that demonstrates the pattern clearly and mirrors real-world text processing tasks such as building search indexes, analysing logs, and computing term frequency for NLP.
Given Input: words = ["apple", "banana", "apple", "cherry", "banana", "apple", "date"]
Expected Output: {'apple': 3, 'banana': 2, 'cherry': 1, 'date': 1}
▼ Hint
- Use set(words) as the iterable to get each unique word exactly once, then use words.count(word) as the value expression: {word: words.count(word) for word in set(words)}.
- Note that iterating over a set does not guarantee insertion order, so the dictionary key order may differ from the original list. If order matters, use dict.fromkeys(words) or a sorted iterable instead.
▼ Solution & Explanation
Explanation:
- {word: words.count(word) for word in set(words)}: The dict comprehension syntax is {key_expr: value_expr for var in iterable}. Here the key is each unique word and the value is how many times that word appears in the original list. Using set(words) as the iterable ensures each word is processed exactly once as a key.
- words.count(word): Scans the entire list and returns the number of occurrences of word. It is called once per unique word, making the overall approach O(n * k) where n is the list length and k is the number of unique words.
- More efficient alternative (collections.Counter): from collections import Counter; Counter(words) produces the same result in a single O(n) pass and is the idiomatic tool for frequency counting in Python. The comprehension approach is shown here to demonstrate the dict comprehension pattern; use Counter in production code.
- Key order: From Python 3.7 onward, dictionaries maintain insertion order. However, set(words) is unordered, so the keys of frequency may appear in any sequence. To preserve the first-occurrence order of words from the original list, replace set(words) with dict.fromkeys(words), which yields unique words in their original order.
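A sketch of the order-preserving variant and the Counter alternative discussed above. It uses dict.fromkeys(words) rather than set(words) so that the printed key order is deterministic; the name counts is an assumption of this example.

```python
from collections import Counter

words = ["apple", "banana", "apple", "cherry", "banana", "apple", "date"]

# dict.fromkeys(words) yields each unique word once, in first-seen order.
frequency = {word: words.count(word) for word in dict.fromkeys(words)}

# Counter builds the same mapping in a single pass over the list.
counts = Counter(words)

print(frequency)                  # {'apple': 3, 'banana': 2, 'cherry': 1, 'date': 1}
print(dict(counts) == frequency)  # True
```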
Exercise 8: Invert a Dictionary
Problem Statement: Given a dictionary, use a dict comprehension to produce a new dictionary where the original keys become values and the original values become keys. The original dictionary must remain unchanged.
Purpose: Dictionary inversion is a common operation when you need to look up data in the reverse direction — for example, finding a country from a country code when your data is keyed by country name. This exercise shows that dict comprehensions are not limited to building new data; they can restructure existing mappings with a single readable expression.
Given Input: codes = {"USD": "US Dollar", "EUR": "Euro", "GBP": "British Pound", "JPY": "Japanese Yen"}
Expected Output: {'US Dollar': 'USD', 'Euro': 'EUR', 'British Pound': 'GBP', 'Japanese Yen': 'JPY'}
▼ Hint
- Iterate over codes.items() to get each key-value pair, then swap them in the comprehension: {value: key for key, value in codes.items()}.
- Inversion only works correctly when all values in the original dictionary are unique and hashable. If two keys share the same value, the later one will silently overwrite the earlier one in the inverted dict.
▼ Solution & Explanation
Explanation:
- codes.items(): Returns a view of all key-value pairs as two-element tuples. Unpacking each pair into key, value in the for clause gives direct access to both components, making the swap in the expression natural to read.
- {value: key ...}: Places the original value on the left of the colon (making it the new key) and the original key on the right (making it the new value). The comprehension produces a fresh dictionary object; the original codes is untouched.
- Uniqueness requirement: Dictionary keys must be unique. If the original dict has duplicate values (for example, two currencies both named "Dollar"), the last one encountered during iteration wins and the earlier mapping is lost without any warning. Always verify value uniqueness before inverting.
- Hashability requirement: New keys must be hashable. Strings, numbers, and tuples are hashable; lists and dicts are not. If the original values are lists, a direct inversion will raise TypeError. In that case, convert values to tuples first: {tuple(v): k for k, v in d.items()}.
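The two pitfalls above can be seen in a short sketch. The dictionaries dupes and tagged are hypothetical data made up for this illustration; only codes comes from the exercise.

```python
codes = {"USD": "US Dollar", "EUR": "Euro", "GBP": "British Pound", "JPY": "Japanese Yen"}

# Swap keys and values; the original dict is left untouched.
inverted = {value: key for key, value in codes.items()}

# Duplicate values (hypothetical data): the later key silently wins.
dupes = {"USD": "Dollar", "CAD": "Dollar"}
inverted_dupes = {value: key for key, value in dupes.items()}

# Unhashable list values (hypothetical data): convert to tuples first.
tagged = {"a": [1, 2], "b": [3]}
by_tuple = {tuple(v): k for k, v in tagged.items()}

print(inverted["Euro"])  # EUR
print(inverted_dupes)    # {'Dollar': 'CAD'}
print(by_tuple)          # {(1, 2): 'a', (3,): 'b'}
```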
Exercise 9: Unique Vowels
Problem Statement: Given a string, use a set comprehension to extract all unique vowels (a, e, i, o, u) that appear in it. The result should be a set: duplicates are discarded automatically and order is not guaranteed.
Purpose: Set comprehensions apply the same concise syntax as list comprehensions but produce a set, giving automatic deduplication for free. This exercise introduces the set comprehension form and demonstrates its natural fit for tasks where uniqueness matters more than order — a common requirement in text analysis, data cleaning, and membership testing.
Given Input: sentence = "the quick brown fox jumps over the lazy dog"
Expected Output: {'a', 'e', 'i', 'o', 'u'} (order may vary)
▼ Hint
- Use curly braces instead of square brackets to create a set comprehension: {char for char in sentence if char in "aeiou"}.
- The in operator checks membership in the vowel string. Because the sentence uses lowercase letters throughout, no case normalisation is needed here, but in general you would add .lower() to handle mixed-case input.
▼ Solution & Explanation
Explanation:
- {...} vs. [...]: Replacing the square brackets with curly braces changes the output from a list to a set. The rest of the syntax is identical. A set stores each unique value once, so even though the letter e appears many times in the sentence, it appears only once in the result.
- char in "aeiou": Python’s in operator tests membership in any iterable, including strings. For a single character this is equivalent to writing char == 'a' or char == 'e' or ... but far more concise. For case-insensitive matching, preprocess with sentence.lower() before the comprehension.
- Output order: Sets are unordered in Python. The printed representation may show the vowels in any sequence. If you need a sorted result, wrap the set in sorted(): sorted(unique_vowels) returns a list in alphabetical order.
- Distinguishing from a dict comprehension: Both dict and set comprehensions use curly braces. Python tells them apart by the colon: {k: v for ...} is a dict comprehension, while {expr for ...} without a colon is a set comprehension. An empty {} is always a dict, never an empty set; use set() for that.
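A minimal sketch of the points above. The string mixed_case is a hypothetical input added to show the .lower() preprocessing step.

```python
sentence = "the quick brown fox jumps over the lazy dog"

# Curly braces without a colon make a set comprehension.
unique_vowels = {char for char in sentence if char in "aeiou"}

# Sets are unordered, so sort before printing for a stable result.
print(sorted(unique_vowels))  # ['a', 'e', 'i', 'o', 'u']

# Mixed-case input (hypothetical): normalise before filtering.
mixed_case = "An Owl"
vowels_ci = {char for char in mixed_case.lower() if char in "aeiou"}
print(sorted(vowels_ci))      # ['a', 'o']

# An empty {} is a dict; use set() for an empty set.
print(type({}).__name__, type(set()).__name__)  # dict set
```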
Exercise 10: Square Mapping
Problem Statement: Use a dict comprehension to create a dictionary that maps each integer from 1 to 10 to its square. The keys should be the integers and the values should be their squares.
Purpose: This exercise reinforces dict comprehension syntax by pairing range() with a mathematical expression as the value. The resulting lookup table is a practical data structure: instead of recomputing a square every time it is needed, you build the mapping once and retrieve values by key in O(1) time.
Given Input: Integers from 1 to 10 (inclusive).
Expected Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100}
▼ Hint
Use range(1, 11) as the iterable and write the key-value pair as n: n ** 2: {n: n ** 2 for n in range(1, 11)}. The colon separates the key expression from the value expression inside the curly braces.
▼ Solution & Explanation
Explanation:
- {n: n ** 2 for n in range(1, 11)}: The key expression (n) and value expression (n ** 2) are separated by a colon, and both are evaluated for each item produced by range(1, 11). The result is a fully populated dictionary in a single line.
- Lookup table use case: Once built, squares_map[7] returns 49 instantly. If squaring were a slow operation (say, loading a pre-trained model result for each input), building the table once and reusing it would be a significant optimisation. This is the memoisation pattern applied manually.
- Key and value can be different expressions: The key and value expressions are entirely independent. You could write {n: n ** 3 for n in range(1, 11)} for cubes, or {n: f"square of {n} is {n**2}" for n in range(1, 11)} for a string-valued map. Any combination of hashable key expression and arbitrary value expression is valid.
- Filtering in a dict comprehension: Like list comprehensions, dict comprehensions accept an if clause: {n: n ** 2 for n in range(1, 11) if n % 2 == 0} produces a map of only the even numbers. The filter, loop, and key-value expressions all live in the same single-line structure.
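The basic mapping and the filtered variant described above, as a short sketch. The name even_squares is an assumption of this example.

```python
# Map each integer 1..10 to its square.
squares_map = {n: n ** 2 for n in range(1, 11)}

# Same structure with a trailing filter: even keys only.
even_squares = {n: n ** 2 for n in range(1, 11) if n % 2 == 0}

print(squares_map[7])  # 49
print(even_squares)    # {2: 4, 4: 16, 6: 36, 8: 64, 10: 100}
```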
Exercise 11: FizzBuzz with Comprehension
Problem Statement: Produce the classic FizzBuzz sequence for integers 1 to 50 in a single list comprehension. Each element should be "FizzBuzz" if the number is divisible by both 3 and 5, "Fizz" if divisible by 3 only, "Buzz" if divisible by 5 only, and the number itself (as an integer) otherwise.
Purpose: FizzBuzz requires branching logic with multiple conditions — traditionally written as a cascade of if/elif/else statements. Expressing it inside a comprehension using nested ternary expressions shows how conditional output expressions work, and forces you to think carefully about condition order to get correct results.
Given Input: Integers from 1 to 50 (inclusive).
Expected Output:
[1, 2, 'Fizz', 4, 'Buzz', 'Fizz', 7, 8, 'Fizz', 'Buzz', 11, 'Fizz', 13, 14, 'FizzBuzz', 16, 17, 'Fizz', 19, 'Buzz', 'Fizz', 22, 23, 'Fizz', 'Buzz', 26, 'Fizz', 28, 29, 'FizzBuzz', 31, ...]
▼ Hint
- Use a nested ternary expression as the output value: "FizzBuzz" if n % 15 == 0 else "Fizz" if n % 3 == 0 else "Buzz" if n % 5 == 0 else n. The divisibility by both 3 and 5 is equivalent to divisibility by 15, so check that first to avoid it being masked by the individual checks.
- Order matters: if you check n % 3 == 0 before n % 15 == 0, numbers divisible by 15 will match the "Fizz" branch and never reach "FizzBuzz". Always test the most specific condition first.
▼ Solution & Explanation
Explanation:
- Nested ternary as the output expression: Python evaluates the ternary chain from left to right, taking the first branch whose condition is true. Writing the expression across multiple indented lines does not change the logic; it only improves readability. The entire multi-line block before for n in range(1, 51) is still one expression.
- n % 15 == 0 must come first: A number divisible by 15 is also divisible by 3 and by 5. If n % 3 == 0 were tested first, multiples of 15 would match that branch and return "Fizz" instead of "FizzBuzz". Testing the most specific condition first prevents it from being swallowed by a less specific one.
- Mixed output types: The list contains both strings and integers. Python lists are heterogeneous, so this is valid. The integer fallback n is returned as-is rather than converted to a string; str(n) would make the list uniformly typed if that is preferred.
- Readability vs. cleverness: Nested ternaries inside comprehensions can become hard to parse quickly. For anything beyond three branches, a helper function is clearer: def fizzbuzz_label(n): ... called as [fizzbuzz_label(n) for n in range(1, 51)]. The comprehension stays clean and the logic lives in a testable, named function.
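The sketch below writes the ternary chain across several lines and compares it with a helper-function version. The helper name fizzbuzz_label comes from the text above, but its body is an assumption filled in for illustration.

```python
# One expression spread over several lines, most specific test first.
fizzbuzz = [
    "FizzBuzz" if n % 15 == 0
    else "Fizz" if n % 3 == 0
    else "Buzz" if n % 5 == 0
    else n
    for n in range(1, 51)
]


def fizzbuzz_label(n):
    """Return the FizzBuzz label for n (illustrative body, same branch order)."""
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return n


fizzbuzz_helper = [fizzbuzz_label(n) for n in range(1, 51)]

print(fizzbuzz[:15])
print(fizzbuzz == fizzbuzz_helper)  # True
```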
Exercise 12: Matrix Transposition
Problem Statement: Given a 3×3 matrix represented as a list of lists, use a nested list comprehension to produce its transpose — a new matrix where rows and columns are swapped. Row i, column j of the original becomes row j, column i of the result.
Purpose: Matrix transposition is a fundamental operation in linear algebra, image processing, and data manipulation (it is the operation that pandas.DataFrame.T exposes). Implementing it with a nested list comprehension, where the outer loop produces rows and the inner loop fills columns, deepens understanding of how nested comprehensions build two-dimensional structures rather than flattening them.
Given Input: matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Expected Output:
Original: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Transposed: [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
▼ Hint
- The outer comprehension iterates over column indices (j in range(3)) to produce each new row. The inner comprehension iterates over row indices (i in range(3)) to collect the elements for that new row: [[matrix[i][j] for i in range(3)] for j in range(3)].
- Think of it this way: the transposed row j is made up of column j from every original row. The inner comprehension picks matrix[0][j], matrix[1][j], matrix[2][j], which together form the entire j-th column.
▼ Solution & Explanation
Explanation:
- Structure of the nested comprehension: [[inner] for j in range(3)] is the outer comprehension; it produces three rows. [matrix[i][j] for i in range(3)] is the inner comprehension; it produces the elements of each row by walking down column j across all original rows. The result is a list of lists, not a flat list.
- Flattening vs. nesting: Exercise 5 used [item for row in matrix for item in row] (two for clauses in one comprehension) to produce a flat list. Transposition uses [[inner] for outer] (an inner comprehension nested inside square brackets) to produce a list of lists. The placement of the inner brackets is the key structural difference.
- Generalising beyond 3×3: Replace the hardcoded 3 with len(matrix) and len(matrix[0]) to handle any rectangular matrix: [[matrix[i][j] for i in range(len(matrix))] for j in range(len(matrix[0]))].
- Built-in alternative: list(map(list, zip(*matrix))) is the idiomatic one-liner for transposition. zip(*matrix) unpacks the rows as arguments to zip, which groups elements by column position, and map(list, ...) converts each zip tuple to a list. The comprehension approach is shown here because it makes the row-column index logic explicit.
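The three variants discussed above, as one sketch. The 2×3 matrix rect is hypothetical data added to exercise the generalised form.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Outer comprehension: one new row per column index j.
# Inner comprehension: walk down column j across every original row.
transposed = [[matrix[i][j] for i in range(3)] for j in range(3)]

# Generalised to any rectangular matrix (hypothetical 2x3 input).
rect = [[1, 2, 3], [4, 5, 6]]
rect_t = [[rect[i][j] for i in range(len(rect))] for j in range(len(rect[0]))]

# zip-based alternative.
zipped = list(map(list, zip(*matrix)))

print("Original:  ", matrix)
print("Transposed:", transposed)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
print(rect_t)                     # [[1, 4], [2, 5], [3, 6]]
```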
Exercise 13: Cartesian Product
Problem Statement: Given two lists, use a nested list comprehension to generate every possible (x, y) pair where x comes from the first list and y comes from the second. This is the Cartesian product of the two lists. Do not use itertools.product.
Purpose: The Cartesian product is needed any time you want to try every combination of two sets of values — grid coordinates, test parameter combinations, pairing sizes with colours in a catalogue, or generating game board positions. A nested comprehension produces it naturally and clearly, and understanding it is the key to recognising when itertools.product is the right tool for more complex cases.
Given Input: xs = [1, 2, 3] and ys = ["a", "b", "c"]
Expected Output:
[(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a'), (2, 'b'), (2, 'c'), (3, 'a'), (3, 'b'), (3, 'c')]
▼ Hint
- Use two for clauses in a single list comprehension: [(x, y) for x in xs for y in ys]. The outer loop fixes x and the inner loop cycles through all values of y before x advances, which is exactly how the Cartesian product is ordered.
- The output expression (x, y) is a tuple literal. The parentheses are required here: a bare x, y as the output expression, as in [x, y for x in xs for y in ys], is a SyntaxError.
▼ Solution & Explanation
Explanation:
- for x in xs for y in ys: Two for clauses in a single comprehension produce the same result as two nested for loops. The left-most loop is outermost: x takes the value 1 while y cycles through "a", "b", "c"; then x advances to 2, and so on. The total number of pairs is len(xs) * len(ys).
- Tuple output expression: (x, y) creates a tuple for each combination. Tuples are the natural container for fixed-size heterogeneous pairs. If you need a list of lists instead, replace (x, y) with [x, y].
- Adding a filter: You can exclude certain pairs with an if clause: [(x, y) for x in xs for y in ys if x != 2] omits all pairs where x is 2. The condition can reference both loop variables, making it possible to enforce diagonal or relational constraints.
- When to use itertools.product: For two lists the comprehension is perfectly clear. For three or more lists, itertools.product(xs, ys, zs) is cleaner and avoids deeply nested for clauses. It also works lazily, which matters when the product is very large.
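A sketch of the solution and the filtered variant. itertools.product appears here only as a cross-check of the ordering, not as part of the solution, since the exercise asks you not to use it.

```python
from itertools import product

xs = [1, 2, 3]
ys = ["a", "b", "c"]

# Left-most for clause is the outer loop.
pairs = [(x, y) for x in xs for y in ys]

# Trailing filter: drop every pair whose x is 2.
filtered = [(x, y) for x in xs for y in ys if x != 2]

print(pairs)
print(len(filtered))                   # 6
print(pairs == list(product(xs, ys)))  # True (cross-check only)
```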
Exercise 14: Extract Digits
Problem Statement: Given a list of strings that contain a mix of letters and digits, use a nested list comprehension to extract every numeric character from every string and collect them all into a single flat list of digit characters.
Purpose: This exercise combines nested iteration with a filter condition in one comprehension to simultaneously flatten and filter a two-level structure. It mirrors a real pattern in data cleaning: stripping non-numeric characters from product codes, phone numbers, or identifiers spread across a collection of strings.
Given Input: strings = ["abc123", "hello", "42px", "year2024", "no-digits"]
Expected Output: ['1', '2', '3', '4', '2', '2', '0', '2', '4']
▼ Hint
- Use two for clauses and one if clause: [char for s in strings for char in s if char.isdigit()]. The outer loop iterates over strings, the inner loop over characters within each string, and the condition keeps only digit characters.
- str.isdigit() returns True for more than the ASCII characters 0–9: it also accepts other Unicode digits, including superscript digits such as ². char.isdecimal() is stricter (it rejects superscripts) but still accepts non-ASCII decimal digits such as Arabic-Indic numerals. For ASCII digits only, char in "0123456789" is the precise test.
▼ Solution & Explanation
Explanation:
- for s in strings: The outer loop visits each string in the input list. Strings with no digits, like "hello" and "no-digits", contribute zero elements to the output because none of their characters pass the if condition. The comprehension handles sparse input gracefully without any special-case logic.
- for char in s: The inner loop iterates over every character in the current string. Python strings are iterable sequences of single-character strings, so this works without any explicit indexing or slicing.
- char.isdigit(): Returns True for characters that represent a digit. For standard ASCII input this is equivalent to checking char in "0123456789". The result characters are kept as single-character strings; convert with int(char) in the output expression if you need integers: [int(char) for s in strings for char in s if char.isdigit()].
- Three-clause comprehension pattern: [expr for outer in collection for inner in outer if condition] is the general template for simultaneously flattening and filtering a nested structure. It combines the techniques from Exercise 5 (flattening with two for clauses) and Exercise 2 (filtering with an if clause) into one expression.
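The flatten-and-filter pattern above can be sketched as follows. The superscript-two check at the end is an illustrative addition showing why isdigit() and isdecimal() are not interchangeable.

```python
strings = ["abc123", "hello", "42px", "year2024", "no-digits"]

# Flatten (two for clauses) and filter (one if clause) in one expression.
digits = [char for s in strings for char in s if char.isdigit()]

# Same traversal, converting each kept character to an integer.
as_ints = [int(char) for s in strings for char in s if char.isdigit()]

print(digits)   # ['1', '2', '3', '4', '2', '2', '0', '2', '4']
print(as_ints)  # [1, 2, 3, 4, 2, 2, 0, 2, 4]

# Superscript two (U+00B2) passes isdigit() but not isdecimal().
superscript_two = "\u00b2"
print(superscript_two.isdigit(), superscript_two.isdecimal())  # True False
```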
Exercise 15: Nested Filtering
Problem Statement: Given a list of lists of numbers, use a list comprehension to extract only those sub-lists where every element is positive. Each qualifying sub-list should appear in the output intact — do not flatten or modify the inner lists.
Purpose: This exercise separates nested filtering from nested iteration. Rather than looping inside a sub-list to transform its contents, you are evaluating a condition that considers the sub-list as a whole and either keeps it or discards it. This pattern appears whenever you want to select complete rows, groups, or records based on a property of all their members.
Given Input: groups = [[1, 2, 3], [-1, 4, 5], [6, 7, 8], [0, 9, 10], [-3, -1, 2], [4, 5, 6]]
Expected Output: [[1, 2, 3], [6, 7, 8], [4, 5, 6]]
▼ Hint
- Use all() in the filter condition: [group for group in groups if all(n > 0 for n in group)]. all() returns True only when every element of the iterable it receives is truthy; here, that means only when every number in the sub-list is strictly greater than zero.
- Note that 0 is not positive, so [0, 9, 10] should be excluded. If you want to include zero, change the condition to n >= 0.
▼ Solution & Explanation
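A minimal sketch of the approach from the hint, assuming the input list shown above; the name positive_groups is illustrative and not taken from the original listing.

```python
groups = [[1, 2, 3], [-1, 4, 5], [6, 7, 8], [0, 9, 10], [-3, -1, 2], [4, 5, 6]]

# Keep a sub-list only when every one of its numbers is strictly positive.
# The sub-list itself is the output expression, so it is kept intact.
positive_groups = [group for group in groups if all(n > 0 for n in group)]

print(positive_groups)  # [[1, 2, 3], [6, 7, 8], [4, 5, 6]]
```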
Explanation:
- all(n > 0 for n in group): The argument to all() is a generator expression that yields True or False for each element. all() short-circuits on the first False it encounters, so it stops iterating as soon as a non-positive number is found, which makes it efficient for sub-lists that fail early.
- Output expression is the whole sub-list: The output expression is simply group, not any transformation of it. The comprehension acts purely as a filter: it decides whether each sub-list is included, not how it looks. This is the same [x for x in data if condition] form from Exercise 2, applied one level up in a nested structure.
- all() on an empty sub-list: By mathematical convention, all() returns True for an empty iterable (vacuous truth). This means an empty list [] would pass the filter and appear in the output. To exclude empty sub-lists, add a truthiness check on group to the condition: if group and all(n > 0 for n in group).
- Counterpart with any(): Replacing all() with any() keeps sub-lists where at least one element is positive. The two built-ins cover the two most common whole-group conditions and eliminate the need for manual flag variables or nested loops.
Exercise 16: Conditional Dict Comprehension
Problem Statement: Given a dictionary mapping student names to their exam scores, use a dict comprehension with an if clause to produce a new dictionary containing only the students who passed — defined as a score of 50 or above.
Purpose: Filtering a dictionary down to a relevant subset is one of the most common data-processing tasks in Python: extracting active users, keeping only in-stock products, or selecting records above a threshold. A dict comprehension with an if clause handles this in a single readable line, replacing a manual loop with conditional assignment.
Given Input: scores = {"Alice": 82, "Bob": 45, "Charlie": 91, "Diana": 37, "Eve": 55, "Frank": 49}
Expected Output: {'Alice': 82, 'Charlie': 91, 'Eve': 55}
▼ Hint
Iterate over scores.items() and add a condition on the value: {name: score for name, score in scores.items() if score >= 50}. The if clause in a dict comprehension works identically to the one in a list comprehension — only key-value pairs that satisfy the condition are included.
▼ Solution & Explanation
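The filter described in the hint might look like this. It is a small sketch under the stated pass mark of 50; the name passed is an assumption.

```python
scores = {"Alice": 82, "Bob": 45, "Charlie": 91, "Diana": 37, "Eve": 55, "Frank": 49}

# Unpack each (name, score) pair and keep only the pairs whose score is 50 or above.
passed = {name: score for name, score in scores.items() if score >= 50}

print(passed)  # {'Alice': 82, 'Charlie': 91, 'Eve': 55}
```

The original scores dictionary is left untouched, so other filtered views can be derived from it later.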
Explanation:
- scores.items(): Returns all key-value pairs as (name, score) tuples. Unpacking into two variables in the for clause gives direct, readable access to both the key and the value, avoiding the less clear pattern of looking up the value separately with scores[name].
- if score >= 50: The condition is evaluated against the value (score), but it could reference the key, the value, or both. For example, if score >= 50 and name != "Eve" would additionally exclude Eve. The full key-value context is available throughout the comprehension.
- Original dictionary unchanged: Like all comprehensions, this produces a new dictionary object. The original scores dict is unmodified, so you can derive multiple filtered views from the same source without any risk of data loss.
- Transforming the value in the same step: You can filter and transform simultaneously: {name: score - 50 for name, score in scores.items() if score >= 50} produces a dict of passing margins. The key expression, value expression, and filter condition are all independent and can each be as complex as needed.
Exercise 17: Grouped Dict from Two Lists
Problem Statement: Given a list of keys and a list of values of equal length, use a dict comprehension with zip() to pair them into a dictionary. Where a value is None, substitute it with the string "N/A".
Purpose: Combining two parallel lists into a dictionary is a common data-wrangling task — turning a list of column headers and a list of row values into a record, for example. Adding the None-to-"N/A" substitution shows how a conditional expression on the value side of a dict comprehension handles missing data inline, without a separate preprocessing step.
Given Input: keys = ["name", "age", "city", "email"] and values = ["Alice", 30, None, None]
Expected Output: {'name': 'Alice', 'age': 30, 'city': 'N/A', 'email': 'N/A'}
▼ Hint
- Use zip(keys, values) to pair elements by position, then apply a conditional value expression: {k: (v if v is not None else "N/A") for k, v in zip(keys, values)}.
- Use is not None rather than != None for the check. The is operator tests identity (whether the object is literally the None singleton), which is the correct and idiomatic way to test for None in Python.
▼ Solution & Explanation
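One way to write the pairing and substitution from the hint is sketched below; the name record is illustrative.

```python
keys = ["name", "age", "city", "email"]
values = ["Alice", 30, None, None]

# Pair keys and values by position; replace only the literal None with "N/A".
record = {k: (v if v is not None else "N/A") for k, v in zip(keys, values)}

print(record)  # {'name': 'Alice', 'age': 30, 'city': 'N/A', 'email': 'N/A'}
```

Because the check is an identity test against None, falsy values such as 0 or an empty string would be kept as they are.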
Explanation:
- zip(keys, values): Pairs elements at the same index from both lists, producing ("name", "Alice"), ("age", 30), ("city", None), ("email", None). Unpacking each pair into k, v in the for clause gives clean named access to both. zip() stops at the shorter list if the two differ in length.
- v if v is not None else "N/A": This is a ternary expression used as the value side of the dict comprehension. When v holds an actual value (including 0, False, or an empty string) it is kept as-is. Only the literal None is replaced. Using is not None instead of a truthiness check (if v) is important here: 0 and False are falsy but should not be replaced with "N/A".
- Shorter alternative with or: v or "N/A" is a tempting shortcut, but it replaces any falsy value (including 0, False, and empty strings), not just None. This is almost always the wrong behaviour for missing-data substitution. The explicit ternary with is not None is safer and communicates intent clearly.
- Building records from CSV rows: This pattern scales directly to CSV parsing: headers comes from the first row and row_values from each subsequent row. A dict comprehension with zip turns each row into a named record, which can then be appended to a list or passed to downstream processing.
Exercise 18: Set of Common Elements
Problem Statement: Given two lists of integers that may contain duplicates, use a set comprehension to find all numbers that appear in both lists. Each number should appear only once in the result regardless of how many times it occurs in either list.
Purpose: Finding the intersection of two collections is a fundamental operation in data analysis — matching customer IDs across two datasets, finding common tags between posts, or identifying shared product codes. This exercise shows how a set comprehension with an in check produces the intersection naturally, and sets up an intuition for when Python’s built-in set operations are the right tool.
Given Input: list_a = [1, 2, 3, 4, 5, 3, 2] and list_b = [3, 4, 5, 6, 7, 4, 5]
Expected Output: {3, 4, 5} (order may vary)
▼ Hint
- Iterate over one list and check membership in the other: {n for n in list_a if n in list_b}. Because the result is a set, duplicate values in list_a that pass the filter appear only once in the output.
- For large lists, convert list_b to a set first (set_b = set(list_b)) before the comprehension. Membership testing with in is O(1) on average for sets and O(n) for lists, so this dramatically reduces the overall time for long inputs.
▼ Solution & Explanation
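A short sketch of the set comprehension from the hint, using the pre-converted set for membership checks; the names set_b and common are assumptions.

```python
list_a = [1, 2, 3, 4, 5, 3, 2]
list_b = [3, 4, 5, 6, 7, 4, 5]

# Convert once so each membership test is a hash lookup instead of a list scan.
set_b = set(list_b)

# The set container drops duplicates from list_a automatically.
common = {n for n in list_a if n in set_b}

print(common)  # {3, 4, 5} (order may vary)
```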
Explanation:
- {n for n in list_a if n in list_b}: Iterates over every element of list_a and includes it in the output set only if it also exists in list_b. The set container automatically discards duplicates, so even though 3 appears twice in list_a, only one copy ends up in the result. (2 also appears twice, but it is filtered out because it is not in list_b.)
- n in list_b (performance note): Checking membership in a list is O(n): Python scans from the start until it finds a match or exhausts the list. For a short list_b this is fine, but for thousands of elements the quadratic behaviour becomes noticeable. Replacing list_b with set(list_b) reduces each membership test to O(1) on average.
- Built-in intersection: set(list_a) & set(list_b) produces the same result using Python’s optimised set intersection. Once both sets are built, which takes O(len(a) + len(b)), the intersection itself runs in O(min(len(a), len(b))) time on average. Prefer this in production code. The comprehension form is shown here because it makes the filter logic explicit and is easier to extend, for example by adding a second condition such as if n in list_b and n > 2.
- Difference and union with comprehensions: Set difference (elements in A but not B) is {n for n in list_a if n not in list_b}. Union (elements in either) is harder to express purely as a comprehension; the built-in set(list_a) | set(list_b) is clearer and more efficient for that case.
Exercise 19: Character Frequency Set
Problem Statement: Given a sentence, use a set comprehension with a condition to build a set of all characters — excluding spaces — that appear more than once in the sentence. The result should contain each such character exactly once.
Purpose: This exercise combines set comprehensions with a condition that requires querying the source data (the sentence’s character counts) to decide which elements to include. It shows that the filter condition in a comprehension is not limited to simple comparisons — it can call any function, including str.count(), which turns a comprehension into a concise frequency-based analysis tool.
Given Input: sentence = "comprehension makes python powerful"
Expected Output: A set of characters that each appear at least twice: 'o', 'e', 'n', 's', 'p', 'h', 'm', and 'r' (order may vary).
▼ Hint
- Iterate over the sentence and use sentence.count(char) as the condition: {char for char in sentence if char != " " and sentence.count(char) > 1}.
- Because the result is a set, each qualifying character appears only once regardless of how many times it is visited during iteration. You do not need to deduplicate manually.
▼ Solution & Explanation
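The sketch below shows the str.count() version from the hint next to a Counter-based variant for comparison. The names repeated and repeated_fast are illustrative, not taken from the original listing.

```python
from collections import Counter

sentence = "comprehension makes python powerful"

# Direct version: one str.count() scan per non-space character visited.
repeated = {char for char in sentence if char != " " and sentence.count(char) > 1}

# Counter version: count every character once, then filter the totals.
counts = Counter(sentence)
repeated_fast = {char for char, count in counts.items() if char != " " and count > 1}

print(sorted(repeated))  # ['e', 'h', 'm', 'n', 'o', 'p', 'r', 's']
```

Both sets are equal; sorted() is used only to make the printed order predictable.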
Explanation:
- sentence.count(char) > 1: str.count() scans the entire string and returns how many times the substring appears. Calling it inside the comprehension condition means it is invoked once per non-space character visited, repeats included (the and short-circuits on spaces), so for this 35-character sentence with three spaces it is called 32 times. For short strings this is perfectly acceptable; for very long text, precompute a Counter first and look up counts in O(1).
- char != " ": Explicitly excludes the space character, which typically appears many times in a sentence. This condition can be extended with char.isalpha() to also exclude punctuation and digits if you want only letter characters.
- Set deduplication does the heavy lifting: The comprehension visits every character in the sentence, including many repeats. Because the container is a set, each character that passes the filter is stored only once. There is no need to check whether the character is already in the result; the set handles that automatically.
- Efficient alternative with Counter: from collections import Counter; {char for char, count in Counter(sentence).items() if char != " " and count > 1} computes all character frequencies in a single O(n) pass and then filters the result. This avoids the repeated str.count() scans and is the recommended approach for longer strings.
Exercise 20: Lazy Squares
Problem Statement: Create a generator expression that yields the squares of integers from 1 to 1,000,000. Retrieve only the first 10 values using itertools.islice(). Then compare the memory footprint of the generator against an equivalent list comprehension using sys.getsizeof() to make the efficiency difference concrete.
Purpose: A list comprehension materialises every value into memory immediately. A generator expression computes each value only when requested and stores nothing beyond the current position. When only a small portion of a large sequence is needed, the generator uses a constant amount of memory regardless of the sequence’s theoretical length — a critical distinction for large-scale data processing.
Given Input: Integers from 1 to 1,000,000.
Expected Output (exact byte counts vary with Python version and platform):
First 10 squares: [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
Generator size: 208 bytes
List size: 8,000,056 bytes
▼ Hint
- A generator expression uses parentheses instead of square brackets: (n ** 2 for n in range(1, 1_000_001)). No values are computed at this point; the generator is just a plan.
- Use list(itertools.islice(gen, 10)) to pull the first 10 values. Then create the equivalent list comprehension [n ** 2 for n in range(1, 1_000_001)] and compare both with sys.getsizeof().
▼ Solution & Explanation
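The steps in the hint can be sketched as below. The names first_ten and squares_list are assumptions, and the printed byte counts depend on the interpreter, so no specific sizes are claimed here.

```python
import itertools
import sys

# Generator expression: nothing is computed until values are requested.
gen = (n ** 2 for n in range(1, 1_000_001))

# Pull only the first 10 squares; the generator pauses after that.
first_ten = list(itertools.islice(gen, 10))

# Equivalent list comprehension: all one million squares are built immediately.
squares_list = [n ** 2 for n in range(1, 1_000_001)]

print("First 10 squares:", first_ten)
print(f"Generator size: {sys.getsizeof(gen):,} bytes")
print(f"List size: {sys.getsizeof(squares_list):,} bytes")
```

Note that sys.getsizeof() reports only the container itself, not the integer objects the list refers to.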
Explanation:
- Parentheses create a generator, brackets create a list: The only syntactic difference between (n ** 2 for n in range(...)) and [n ** 2 for n in range(...)] is the delimiter. The parentheses produce a generator object that holds a reference to the iteration state and the expression to evaluate; no computed values are stored. The brackets build a fully materialised list in memory immediately.
- itertools.islice(gen, 10): Pulls values from the generator one at a time until 10 have been retrieved, then stops. The generator is paused at that point: if you called islice(gen, 5) again on the same object, you would get squares 11 through 15 because the generator remembers where it was.
- sys.getsizeof(): Returns the memory footprint of the Python object itself in bytes. For the generator this is a fixed ~200 bytes regardless of how many values remain. For the list it scales with the number of elements: roughly 8 bytes per integer reference plus a small header, giving about 8 MB for one million integers.
- Practical guidance: Use a generator expression when you only need to iterate over the results once and do not need random access or a known length. Use a list comprehension when you need to index into the result, check its length, iterate it multiple times, or pass it to a function that requires a sequence rather than an iterator.
Exercise 21: Infinite Fibonacci Generator
Problem Statement: Write a generator function fibonacci() that yields Fibonacci numbers indefinitely. Then use itertools.takewhile() to lazily consume from it, stopping as soon as a value reaches or exceeds a given limit. Collect the results into a list and print them.
Purpose: An infinite generator paired with takewhile() is the functional-style alternative to a while loop with a break condition. It separates the concern of producing values from the concern of deciding when to stop, making both halves independently reusable and testable. This pattern is fundamental to stream processing and lazy pipelines in Python.
Given Input: All Fibonacci numbers up to but not including 200.
Expected Output: [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
▼ Hint
- Write fibonacci() as a generator function using yield. Maintain two variables a, b = 1, 1 and update them with simultaneous assignment a, b = b, a + b on each iteration inside an infinite while True loop.
- Apply itertools.takewhile(lambda n: n < 200, fibonacci()) to create a lazy iterator that yields values from the generator as long as the condition holds, and stops the moment it does not.
- Wrap the whole thing in list() to force evaluation and collect the results.
▼ Solution & Explanation
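A minimal sketch of the generator and the takewhile() call described in the hint. The names limit and result are illustrative; the starting pair 1, 1 follows the hint.

```python
import itertools


def fibonacci():
    """Yield Fibonacci numbers forever, starting 1, 1, 2, 3, ..."""
    a, b = 1, 1
    while True:
        yield a
        # Both right-hand sides are evaluated before either name is rebound.
        a, b = b, a + b


limit = 200

# takewhile() stops at the first value that fails the predicate.
result = list(itertools.takewhile(lambda n: n < limit, fibonacci()))

print(result)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
```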
Explanation:
- yield a inside while True: Each time the caller requests the next value, execution resumes after the yield, updates a and b with the simultaneous assignment, and loops back to yield again. The simultaneous assignment a, b = b, a + b evaluates both right-hand sides before either variable is updated, so the old value of a is used correctly in the sum.
- itertools.takewhile(predicate, iterable): Pulls values from iterable one at a time. While the predicate returns True, the value is passed through to the caller. The moment the predicate returns False for any value, takewhile stops immediately; it does not check subsequent values. The stopping value itself is consumed and discarded.
- Separation of concerns: The generator knows only how to produce Fibonacci numbers. The takewhile knows only the stopping condition. Either can be swapped independently: change the limit without touching the generator, or reuse the generator with itertools.islice(fibonacci(), 20) to take a fixed count instead.
- Memory efficiency: At any moment, only the current a and b values exist in memory regardless of how many numbers have been yielded. Collecting the result with list() does allocate all qualifying values at once; if the limit is very large, consider processing them one at a time with a for loop over the takewhile iterator instead.
Exercise 22: Chained Generator Pipeline
Problem Statement: Build a three-stage generator pipeline: the first stage yields integers from a given range, the second filters only prime numbers from that stream, and the third formats each prime as a string like "Prime: 7". Chain all three stages together without materialising any intermediate list. Retrieve and print the first 10 results.
Purpose: This exercise shows the full power of chaining generator expressions into a multi-stage pipeline. Each stage is a lazy wrapper around the previous one — values flow through all three transforms one at a time, with no intermediate collections ever built in memory. This pattern is the Python equivalent of Unix pipes and is the foundation of memory-efficient data processing at scale.
Given Input: Integers from 2 to 100.
Expected Output:
['Prime: 2', 'Prime: 3', 'Prime: 5', 'Prime: 7', 'Prime: 11', 'Prime: 13', 'Prime: 17', 'Prime: 19', 'Prime: 23', 'Prime: 29']
▼ Hint
- Write a helper function is_prime(n) that returns True if n is prime. A simple trial-division check (testing divisibility by every integer in range(2, int(n ** 0.5) + 1)) is sufficient for this range.
- Create the three stages as separate generator expressions, each referencing the previous one as its source: numbers = (n for n in range(2, 101)), then primes = (n for n in numbers if is_prime(n)), then formatted = (f"Prime: {n}" for n in primes).
- Use list(itertools.islice(formatted, 10)) to pull exactly 10 formatted primes through the entire pipeline in one pass.
▼ Solution & Explanation
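The three stages from the hint can be sketched as follows. The trial-division body of is_prime() is one straightforward implementation and the name first_ten is an assumption.

```python
import itertools


def is_prime(n):
    """Return True if n is prime, using simple trial division."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))


# Stage 1: source of integers. Stage 2: prime filter. Stage 3: formatting.
numbers = (n for n in range(2, 101))
primes = (n for n in numbers if is_prime(n))
formatted = (f"Prime: {n}" for n in primes)

# Nothing has run yet; islice pulls 10 values through all three stages.
first_ten = list(itertools.islice(formatted, 10))

print(first_ten)
```

The pipeline is left paused after the tenth prime, so further values can still be pulled from formatted.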
Explanation:
- Each stage wraps the previous one: numbers, primes, and formatted are three nested generator objects. Defining them creates no computation; each is just a description of what to do when values are requested. Only when islice pulls a value from formatted does the chain wake up: formatted asks primes for the next value, which asks numbers, which yields an integer, which is tested for primality, and if it passes, is formatted and returned.
- Single-pass, no intermediate storage: At any moment exactly one integer exists in the pipeline as it moves through the three stages. There is no list of all integers and no list of all primes, just the current value being evaluated. The memory footprint is O(1) regardless of the range size.
- is_prime as a helper function: The primality check is too complex for a single expression inside a generator condition, so it lives in a named function. This is good practice: keep generator expressions simple and delegate logic to named, testable helpers. The generator pipeline remains readable while the helper can be unit-tested independently.
- Extending the pipeline: Adding a fourth stage is as simple as writing another generator expression that references formatted. No existing stage needs to change. This open-ended composability (adding stages without modifying the pipeline) is the key architectural advantage over a loop-based approach where all stages are interleaved in one block of code.
Exercise 23: CSV Row Parser
Problem Statement: Given a multi-line CSV string, use a generator expression to lazily parse it row by row. Each yielded value should be a list of stripped string values produced by splitting on commas. Skip the header row. Consume the generator with a for loop and print each parsed row.
Purpose: In real applications, CSV files can be gigabytes in size — far too large to load into memory at once. A generator expression over the lines of a file (or a string in this exercise) reads and parses one row at a time, keeping memory usage constant. This exercise builds the habit of thinking lazily about row-by-row data processing before reaching for a library like pandas.
Given Input: A CSV string with a header row and four data rows containing name, age, and city fields.
Expected Output:
['Alice', '30', 'New York']
['Bob', '25', 'London']
['Charlie', '35', 'Tokyo']
['Diana', '28', 'Paris']
▼ Hint
- Split the CSV string into lines with csv_data.strip().splitlines(). Slice off the header with [1:] to skip it, then build a generator expression that strips and splits each remaining line: (line.strip().split(",") for line in lines).
- To strip whitespace from individual field values as well as the line itself, use a nested list comprehension inside the generator: ([field.strip() for field in line.split(",")] for line in lines).
▼ Solution & Explanation
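A sketch of the lazy parser from the hint. The exact CSV text is not shown in the exercise, so the csv_data string below is an assumption reconstructed from the expected output (with padded fields to exercise the per-field stripping).

```python
# Assumed sample input: a header row plus four data rows.
csv_data = """
name, age, city
Alice, 30, New York
Bob, 25, London
Charlie, 35, Tokyo
Diana, 28, Paris
"""

# Drop surrounding blank lines, split into lines, and skip the header row.
lines = csv_data.strip().splitlines()[1:]

# Lazy outer generator; each row is parsed only when the loop asks for it.
rows = ([field.strip() for field in line.split(",")] for line in lines)

for row in rows:
    print(row)
```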
Explanation:
- splitlines()[1:]: splitlines() splits the string on newline characters and returns a list of lines without the newline characters themselves. Slicing with [1:] discards the header row so the generator only processes data rows. This list of lines is small and created once; the lazy evaluation happens in the generator expression that iterates over it.
- Nested list comprehension inside a generator expression: The outer structure (... for line in lines) is a generator expression: lazy, O(1) memory. The inner structure [field.strip() for field in line.split(",")] is a list comprehension that fully evaluates one row at a time as the generator is consumed. This combination is idiomatic: the row-level list is small and finite, while the line-level iteration stays lazy.
- Field-level stripping: Applying .strip() to each individual field after splitting handles CSV files where values are padded with spaces around commas ("Alice, 30, New York"). Stripping only the whole line would leave leading spaces on fields after the first comma.
- Scaling to real files: Replace the string with an open file object and the same generator pattern works unchanged: rows = ([field.strip() for field in line.split(",")] for line in open("data.csv")). Python’s file objects are themselves lazy iterators, so the entire pipeline (open, iterate, split, strip) processes one line at a time with constant memory. For production CSV parsing with quoting and escaping, use the standard library’s csv.reader, which wraps the same lazy pattern.
Exercise 24: Dict of Grouped Anagrams
Problem Statement: Given a list of words, group all anagrams together into a dictionary where each key is the sorted tuple of letters shared by a group, and each value is a list of all words from the input that are anagrams of one another. Use a dict comprehension and sorted() to build the grouping in a single pass.
Given Input: words = ["eat", "tea", "tan", "ate", "nat", "bat", "tab"]
Expected Output:
{('a', 'e', 't'): ['eat', 'tea', 'ate'],
('a', 'n', 't'): ['tan', 'nat'],
('a', 'b', 't'): ['bat', 'tab']}
▼ Hint
- Two words are anagrams if and only if sorting their characters produces the same result. Use tuple(sorted(word)) as the grouping key: it is hashable (unlike a sorted list) and identical for all anagrams.
- A plain dict comprehension cannot accumulate multiple words per key. If you write {tuple(sorted(w)): w for w in words}, each key will only hold the last word that hashed to it. To collect groups, build the key-to-list mapping using defaultdict(list) with a loop, then show that the sorted() key idea is what makes the grouping work.
▼ Solution & Explanation
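The grouping described in the hint can be sketched with defaultdict(list), alongside a groupby-based dict comprehension for comparison. The helper anagram_key and the names anagrams and by_groupby are hypothetical, introduced only for this illustration.

```python
import itertools
from collections import defaultdict

words = ["eat", "tea", "tan", "ate", "nat", "bat", "tab"]


def anagram_key(word):
    """Canonical, hashable fingerprint shared by all anagrams of a word."""
    return tuple(sorted(word))


# Accumulate words under their shared key in a single pass.
groups = defaultdict(list)
for word in words:
    groups[anagram_key(word)].append(word)
anagrams = dict(groups)

# Alternative: sort by the key first, then let groupby collect each run.
by_groupby = {
    key: list(group)
    for key, group in itertools.groupby(sorted(words, key=anagram_key), key=anagram_key)
}

print(anagrams)
```

The two dictionaries hold the same groups; only the key order differs, since the groupby version emits keys in sorted order.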
Explanation:
- tuple(sorted(word)): sorted(word) returns a list of the word’s characters in alphabetical order, which is the same list for every anagram in the group. Wrapping it in tuple() makes it hashable so it can be used as a dictionary key. For example, "eat", "tea", and "ate" all produce ('a', 'e', 't').
- Why a plain dict comprehension cannot group: A dict comprehension evaluates one key-value pair per iteration and stores the result immediately. If two iterations produce the same key, the second overwrites the first, so there is no way to accumulate values under a shared key. defaultdict(list) solves this by initialising a new empty list automatically the first time a key is seen, allowing .append() to build up the group across multiple iterations.
- The comprehension’s role here: While defaultdict does the accumulation, the key insight (using tuple(sorted(word)) as a canonical fingerprint) is the comprehension-style thinking. The same key expression can drive other grouping tools, such as itertools.groupby.
- Alternative with itertools.groupby: Sorting the word list by the canonical key first and then applying groupby produces the same groups without a defaultdict: {k: list(v) for k, v in itertools.groupby(sorted(words, key=lambda w: tuple(sorted(w))), key=lambda w: tuple(sorted(w)))}. This is a genuine dict comprehension that groups correctly, because groupby hands each run of consecutive equal-key items to the comprehension as a single group. The trade-offs are that the sort costs O(n log n) and the keys come out in sorted order rather than first-seen order.
Exercise 25: Comprehension vs Generator Benchmark
Problem Statement: Write the same transformation — squaring all even numbers from 1 to 100,000 — as both a list comprehension and a generator expression. Use sys.getsizeof() to compare their memory footprints and timeit.timeit() to measure construction time and full iteration time for each. Print a clear report and explain when each form is preferable.
Purpose: Knowing that generators are more memory-efficient than lists is theoretical until you measure it. This exercise makes both differences concrete with real numbers, and frames the decision not as “generators are always better” but as a trade-off: generators win on memory and initial construction speed, while lists win when you need random access, multiple iterations, or a known length. Understanding this trade-off is essential for writing Python that is both correct and efficient at scale.
Given Input: Even numbers from 1 to 100,000, squared.
Expected Output (sample run; sizes and timings vary by machine and Python version):
--- Memory ---
List size: 434,944 bytes
Gen size: 208 bytes
--- Construction time (1000 runs) ---
List build: 0.4521s
Gen build: 0.0003s
--- Full iteration time (1000 runs) ---
List iter: 0.1843s
Gen iter: 0.5102s
▼ Hint
- Use timeit.timeit(stmt, number=1000) to time each operation. Pass the code as a string and use the setup parameter for any imports. Alternatively, wrap each operation in a lambda: timeit.timeit(lambda: list(...), number=1000).
- For the generator iteration time, you must force full consumption to measure it fairly. Inside the timed call, pass the generator to sum() or to collections.deque(gen, maxlen=0) to drain it without building a list.
- Remember that a generator can only be iterated once. Create a fresh generator inside each timed call; do not reuse the same object across timing runs.
▼ Solution & Explanation
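A compact sketch of the benchmark described in the hint. The helper names build_list and build_gen are hypothetical, and RUNS is set to 100 rather than the exercise's 1000 purely to keep the sketch quick; raise it to reproduce the full report. No specific timings are claimed, since they depend on the machine.

```python
import sys
import timeit
from collections import deque

RUNS = 100  # assumption: reduced from the exercise's 1000 runs for a quick demo


def build_list():
    return [n ** 2 for n in range(1, 100_001) if n % 2 == 0]


def build_gen():
    return (n ** 2 for n in range(1, 100_001) if n % 2 == 0)


squares_list = build_list()

print("--- Memory ---")
print(f"List size: {sys.getsizeof(squares_list):,} bytes")
print(f"Gen size:  {sys.getsizeof(build_gen()):,} bytes")

print(f"--- Construction time ({RUNS} runs) ---")
list_build = timeit.timeit(build_list, number=RUNS)
gen_build = timeit.timeit(build_gen, number=RUNS)
print(f"List build: {list_build:.4f}s")
print(f"Gen build:  {gen_build:.4f}s")

print(f"--- Full iteration time ({RUNS} runs) ---")
# deque(..., maxlen=0) drains an iterable without storing any items.
# A fresh generator is created on every run because generators are single-use.
list_iter = timeit.timeit(lambda: deque(squares_list, maxlen=0), number=RUNS)
gen_iter = timeit.timeit(lambda: deque(build_gen(), maxlen=0), number=RUNS)
print(f"List iter: {list_iter:.4f}s")
print(f"Gen iter:  {gen_iter:.4f}s")
```

Note that the generator iteration figure includes computing each square, while the list iteration figure only walks values that were computed at construction time.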
Explanation:
- Memory: generator wins decisively: sys.getsizeof() on a list returns the size of the list object plus its internal pointer array, roughly 8 bytes per element. For 50,000 even-number squares that is around 400 KB. The generator object is a fixed ~200 bytes regardless of how many values it would produce, because it stores only the iteration state (the current position in range) and the expression to evaluate, not the values themselves.
- Construction time: generator wins: Creating a list comprehension computes all 50,000 squares immediately, allocates memory for them, and stores every result. Creating a generator expression takes microseconds because nothing is computed; it just records the recipe. This difference matters when the pipeline is set up many times but each instance is consumed partially or not at all.
- Full iteration time: list wins slightly: Once you need every value, a list is faster to iterate because its element references are stored contiguously in memory (cache-friendly) and Python’s list iterator is implemented in optimised C. A generator incurs a small per-value overhead from resuming the generator frame and re-evaluating the expression each time. For like-for-like work the gap is modest, typically 10-30% slower for generators, but it is real and worth knowing. In the sample output the generator figure is larger than that because draining a fresh generator also computes every square, a cost the list already paid at construction time.
- Decision guide: Use a generator expression when you need to iterate once, process a large or infinite sequence, or embed the expression inside a function like sum(), max(), or any() that consumes it in one pass. Use a list comprehension when you need to iterate multiple times, index by position, check the length, pass the result to a function that requires a sequence, or when the full result must be available before any downstream processing can begin.
