Python Iterators
Master the art of iteration and create memory-efficient programs
🔄 What are Iterators?
Iterators are objects that implement the iterator protocol, allowing you to traverse through collections one element at a time. They provide a memory-efficient way to process large datasets without loading everything into memory at once.
# Basic iterator example
numbers = [1, 2, 3, 4, 5] # Create an iterable list
iterator = iter(numbers) # Get iterator from list
# Using next() to iterate
print(next(iterator)) # Output: 1
print(next(iterator)) # Output: 2
# String iterator
text = "Hello"
text_iterator = iter(text)
print(next(text_iterator)) # Output: H
Understanding Iterator Theory
Before diving into code, let's understand the fundamental concepts behind iterators:
🧠 Core Concepts
Iterator Protocol
An object must implement the __iter__() and __next__() methods to be considered an iterator.
Lazy Evaluation
Iterators generate values on-demand, not all at once, making them memory efficient.
One-Time Use
Most iterators can only be traversed once. After exhaustion, they need to be recreated.
StopIteration
When an iterator has no more items, it raises a StopIteration exception.
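The four concepts above can be seen together by writing out roughly what a for loop does behind the scenes. This is a minimal sketch; the helper name manual_for_loop is made up for illustration and is not part of Python.

```python
def manual_for_loop(iterable):
    """Collect items roughly the way a for loop would, using the protocol directly."""
    iterator = iter(iterable)       # asks the iterable for an iterator
    collected = []
    while True:
        try:
            item = next(iterator)   # asks the iterator for its next value
        except StopIteration:
            break                   # exhaustion is signalled by StopIteration
        collected.append(item)
    return collected

letters = manual_for_loop("abc")
print(letters)  # ['a', 'b', 'c']

# One-time use: an exhausted iterator stays exhausted
iterator = iter([1, 2])
first_pass = list(iterator)
second_pass = list(iterator)
print(first_pass, second_pass)  # [1, 2] []
```

Values are only produced when next() is called, which is what makes the evaluation lazy.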
💡 Key Benefits
- Memory Efficiency: Process large datasets without loading everything into memory
- Performance: Generate values only when needed (lazy evaluation)
- Composability: Chain iterators together for complex data processing
- Infinite Sequences: Create iterators that generate infinite sequences
- Clean Code: Write more readable and Pythonic code
Built-in Iterators
Python provides many built-in iterators. Let's start with the basics:
# Lists are iterable, but not iterators themselves
numbers = [1, 2, 3, 4, 5]
# Get an iterator from a list
number_iterator = iter(numbers)
print(f"Iterator object: {number_iterator}")
# Use next() to get values one by one
print(next(number_iterator)) # 1
print(next(number_iterator)) # 2
print(next(number_iterator)) # 3
# Iterate through remaining items
for num in number_iterator:
    print(num) # 4, 5
# String iteration
text = "Hello"
text_iter = iter(text)
print(next(text_iter)) # 'H'
print(next(text_iter)) # 'e'
🎯 Pro Tips
- Check if iterable: Use hasattr(obj, '__iter__'), or call iter(obj) and catch TypeError, which also covers sequence-style objects that only define __getitem__
- Safe iteration: Use next(iterator, default_value) to avoid StopIteration
- Convert to list: Use list(iterator) to consume all values
- Iterator vs Iterable: Lists are iterable, but iter(list) returns an iterator
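The tips above can be tried out in a few lines. This is a small sketch; the helper is_iterable is a hypothetical name chosen for this example, not a built-in.

```python
def is_iterable(obj):
    """Return True if iter() accepts obj (illustrative helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True

print(is_iterable([1, 2, 3]))  # True
print(is_iterable(42))         # False

# next() with a default never raises StopIteration
iterator = iter([10, 20])
first = next(iterator, None)   # 10
second = next(iterator, None)  # 20
third = next(iterator, None)   # None, because the iterator is exhausted

# A list is iterable but is not its own iterator
numbers = [1, 2, 3]
list_is_own_iterator = iter(numbers) is numbers   # False
number_iterator = iter(numbers)
iterator_returns_itself = iter(number_iterator) is number_iterator  # True

# list() consumes whatever is left
remaining = list(number_iterator)  # [1, 2, 3]
```

If None is a legitimate value in your data, a dedicated sentinel object may be a safer default than None.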
Creating Custom Iterators
Learn to create your own iterator classes by implementing the iterator protocol:
# Simple counter iterator
class Counter:
    def __init__(self, start, end):
        self.current = start
        self.end = end
    def __iter__(self):
        return self
    def __next__(self):
        if self.current >= self.end:
            raise StopIteration
        value = self.current
        self.current += 1
        return value
# Usage
counter = Counter(1, 4)
for num in counter:
    print(num) # 1, 2, 3
# Even numbers iterator
class EvenNumbers:
    def __init__(self, limit):
        self.limit = limit
        self.current = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        result = self.current
        self.current += 2
        return result
# Usage
evens = EvenNumbers(10)
print(list(evens)) # [0, 2, 4, 6, 8]
🔍 Iterator Protocol Deep Dive
__iter__(): Returns the iterator object itself. This makes the object iterable.
__next__(): Returns the next value in the sequence. When no more items are available, it raises StopIteration.
Best Practice: Always implement both methods for a complete iterator.
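An iterator whose __iter__() returns itself can only be traversed once. When a collection should support repeated loops, one common alternative is to split the roles: the container's __iter__() returns a fresh iterator object each time. The sketch below assumes two hypothetical classes, Countdown and CountdownIterator, invented for this illustration.

```python
class Countdown:
    """Iterable container: every __iter__ call returns a new iterator."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator that holds the traversal state for a single pass."""
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self  # iterators return themselves

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(list(countdown))  # [3, 2, 1]
print(list(countdown))  # [3, 2, 1] again, because each pass gets a fresh iterator
```

Here the container never tracks a position, so two loops over the same Countdown do not interfere with each other.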
Generator Functions - Simplified Iterators
Generators provide an easier way to create iterators using the yield keyword:
# Simple generator function
def count_up_to(max_count):
    count = 1
    while count <= max_count:
        yield count
        count += 1
# Usage
counter = count_up_to(3)
print(next(counter)) # 1
print(next(counter)) # 2
print(next(counter)) # 3
# Fibonacci generator
def fibonacci(n):
    a, b = 0, 1
    count = 0
    while count < n:
        yield a
        a, b = b, a + b
        count += 1
# Usage
fib = fibonacci(5)
for num in fib:
    print(num) # 0, 1, 1, 2, 3
# Square numbers generator
def squares(limit):
    for i in range(limit):
        yield i ** 2
# Usage
square_gen = squares(5)
print(list(square_gen)) # [0, 1, 4, 9, 16]
🚀 Generator Advantages
- Simpler Syntax: No need to implement __iter__ and __next__
- Automatic State Management: Python handles the state between yields
- Memory Efficient: Values are generated on-demand
- Readable Code: More intuitive than class-based iterators
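To make the comparison concrete, here is a sketch of the class-based Counter from earlier rewritten as a generator function, plus an unbounded generator capped with itertools.islice. The function names counter and naturals are chosen for this example only.

```python
from itertools import islice

def counter(start, end):
    """Yield start, start + 1, ..., end - 1, like the class-based Counter."""
    current = start
    while current < end:
        yield current      # execution pauses here until the next value is requested
        current += 1

def naturals():
    """Yield 0, 1, 2, ... without end; callers must bound the consumption."""
    n = 0
    while True:
        yield n
        n += 1

print(list(counter(1, 4)))          # [1, 2, 3]

# A generator object already implements the iterator protocol
gen = counter(1, 4)
print(iter(gen) is gen)             # True

# Take a finite prefix of an infinite sequence
print(list(islice(naturals(), 5)))  # [0, 1, 2, 3, 4]
```

The local variables current and n take the place of the instance attributes that the class version has to manage by hand.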
Generator Expressions
Create generators using a concise syntax similar to list comprehensions:
# Basic generator expression
squares = (x**2 for x in range(5))
print(type(squares)) # <class 'generator'>
print(list(squares)) # [0, 1, 4, 9, 16]
# Filtered generator
even_squares = (x**2 for x in range(10) if x % 2 == 0)
print(list(even_squares)) # [0, 4, 16, 36, 64]
# String processing
words = ['hello', 'world', 'python', 'programming']
lengths = (len(word) for word in words)
print(list(lengths)) # [5, 5, 6, 11]
# Memory comparison
import sys
# List comprehension (stores all values)
list_comp = [x**2 for x in range(1000)]
print(f"List size: {sys.getsizeof(list_comp)} bytes")
# Generator expression (stores only its iteration state, not the values)
gen_exp = (x**2 for x in range(1000))
print(f"Generator size: {sys.getsizeof(gen_exp)} bytes")
📊 List vs Generator Comparison
List Comprehension
- ✅ Random access to elements
- ✅ Can iterate multiple times
- ❌ Uses more memory
- ❌ Builds every element up front, even if only a few are needed
Generator Expression
- ✅ Memory efficient
- ✅ Lazy evaluation
- ❌ One-time iteration
- ❌ No random access
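The trade-offs in this comparison are easy to observe directly. The following sketch uses illustrative variable names and shows indexing and repeated iteration on both kinds of object.

```python
squares_list = [x ** 2 for x in range(5)]
squares_gen = (x ** 2 for x in range(5))

# Random access works on the list only
print(squares_list[2])  # 4
try:
    squares_gen[2]
except TypeError as error:
    print(f"TypeError: {error}")

# The list can be traversed as often as needed
list_totals = (sum(squares_list), sum(squares_list))  # (30, 30)

# The generator is empty after the first full pass
first_total = sum(squares_gen)   # 30
second_total = sum(squares_gen)  # 0, nothing left to add
print(list_totals, first_total, second_total)
```

If the values are needed more than once, either keep the list or recreate the generator expression for each pass.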
Itertools Module
Python's itertools module provides powerful iterator building blocks:
import itertools
# count() - infinite counting
counter = itertools.count(start=5, step=2)
print(next(counter)) # 5
print(next(counter)) # 7
print(next(counter)) # 9
# cycle() - infinite repetition
colors = itertools.cycle(['red', 'green', 'blue'])
for i, color in enumerate(colors):
    if i >= 6:
        break
    print(color) # red, green, blue, red, green, blue
# repeat() - repeat a value
repeated = itertools.repeat('hello', 3)
print(list(repeated)) # ['hello', 'hello', 'hello']
# chain() - combine iterators
list1 = [1, 2, 3]
list2 = [4, 5, 6]
chained = itertools.chain(list1, list2)
print(list(chained)) # [1, 2, 3, 4, 5, 6]
# islice() - slice an iterator
numbers = itertools.count()
first_five = itertools.islice(numbers, 5)
print(list(first_five)) # [0, 1, 2, 3, 4]
🛠️ Common Itertools Functions
- count(): Infinite arithmetic progression
- cycle(): Infinite repetition of iterable
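- repeat(): Repeat a value n times (or endlessly if n is omitted)
- chain(): Concatenate several iterables into one sequence (chain.from_iterable() flattens one level of nesting)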
- islice(): Slice an iterator efficiently
- takewhile(): Take elements while condition is true
- dropwhile(): Drop elements while condition is true
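takewhile(), dropwhile(), and chain.from_iterable() appear in the list above but not in the earlier code, so here is a brief sketch of each. The sample data is made up for illustration.

```python
import itertools

readings = [1, 3, 5, 8, 2, 9]

# takewhile() stops at the first element that fails the test
head = list(itertools.takewhile(lambda x: x < 6, readings))
print(head)  # [1, 3, 5]

# dropwhile() skips the leading matches, then yields everything else untested
tail = list(itertools.dropwhile(lambda x: x < 6, readings))
print(tail)  # [8, 2, 9]

# chain.from_iterable() flattens one level of nesting
nested = [[1, 2], [3], [4, 5]]
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5]
```

Note that 2 stays in the dropwhile() result even though it is below 6: the predicate is no longer consulted once it has failed for the first time.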
Practical Applications
Here are real-world scenarios where iterators shine:
# Memory-efficient file reading
def read_large_file(file_path):
    """Generator to read large files line by line"""
    with open(file_path, 'r') as file:
        for line in file:
            yield line.strip()
# Usage (doesn't load entire file into memory)
# for line in read_large_file('huge_file.txt'):
#     process_line(line)
# Batch processing iterator
import itertools # needed for islice
def batch_processor(iterable, batch_size):
    """Yield successive lists of up to batch_size items"""
    iterator = iter(iterable)
    while True:
        # islice pulls at most batch_size items from the shared iterator
        batch = list(itertools.islice(iterator, batch_size))
        if not batch:
            break
        yield batch
# Usage
data = range(10)
for batch in batch_processor(data, 3):
    print(f"Processing batch: {batch}")
# Output: [0, 1, 2], [3, 4, 5], [6, 7, 8], [9]
# Data transformation pipeline
def process_numbers(numbers):
    """Multi-step data processing pipeline"""
    # Step 1: Filter even numbers
    evens = (x for x in numbers if x % 2 == 0)
    # Step 2: Square them
    squared = (x**2 for x in evens)
    # Step 3: Keep only values below 100
    filtered = (x for x in squared if x < 100)
    return filtered
# Usage
numbers = range(20)
result = process_numbers(numbers)
print(list(result)) # [0, 4, 16, 36, 64]
Best Practices & Performance Tips
Performance
- Use generators for large datasets
- Prefer generator expressions over list comprehensions when the result is only iterated once
- Chain iterators instead of creating intermediate lists
Memory Management
- A generator holds only its current state, so its own memory footprint stays roughly constant no matter how many items it produces
- Avoid converting generators to lists unless necessary
- Use itertools for memory-efficient operations
Code Quality
- Use descriptive names for generator functions
- Document the expected behavior and limitations
- Handle StopIteration gracefully when calling next() directly (for loops catch it for you)
Common Pitfalls
- Remember generators are one-time use
- Don't modify the underlying data during iteration
- Be careful with infinite generators in loops
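The three pitfalls can be reproduced in a few lines. This is a sketch with made-up sample data, showing each problem next to one possible way around it.

```python
import itertools

# Pitfall 1: a generator is empty after one full pass
evens = (n for n in range(6) if n % 2 == 0)
first_pass = list(evens)   # [0, 2, 4]
second_pass = list(evens)  # []

# Pitfall 2: removing from a list while looping over it skips elements
items = [1, 2, 3, 4]
seen = []
for item in items:
    seen.append(item)
    if item == 1:
        items.remove(item)  # the remaining items shift left, so 2 is never visited
print(seen)  # [1, 3, 4]

# One way around it: loop over a snapshot and modify the original
items = [1, 2, 3, 4]
for item in list(items):
    if item % 2 == 0:
        items.remove(item)
print(items)  # [1, 3]

# Pitfall 3: an infinite iterator needs an explicit bound
bounded = list(itertools.islice(itertools.count(1), 3))
print(bounded)  # [1, 2, 3]
```

Calling list() on itertools.count(1) without the islice() bound would never finish.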