Python Iterators

Master the art of iteration and create memory-efficient programs

🔄 What are Iterators?

Iterators are objects that implement the iterator protocol, allowing you to traverse through collections one element at a time. They provide a memory-efficient way to process large datasets without loading everything into memory at once.


# Basic iterator example
numbers = [1, 2, 3, 4, 5]  # Create an iterable list
iterator = iter(numbers)    # Get iterator from list

# Using next() to iterate
print(next(iterator))      # Output: 1
print(next(iterator))      # Output: 2

# String iterator
text = "Hello"
text_iterator = iter(text)
print(next(text_iterator)) # Output: H
                                    
Key ideas: __iter__ protocol, __next__ method, memory efficiency

Understanding Iterator Theory

Before diving into code, let's understand the fundamental concepts behind iterators:

🧠 Core Concepts

Iterator Protocol

An object is an iterator only if it implements both __iter__(), which returns the iterator itself, and __next__(), which returns the next item.

Lazy Evaluation

Iterators generate values on-demand, not all at once, making them memory efficient.

One-Time Use

An iterator can be traversed only once. After it is exhausted, call iter() on the source again (or create a new iterator object) to get a fresh one.

StopIteration

When an iterator has no more items, it raises a StopIteration exception.
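
The four concepts above can be seen together in a small sketch of what a for loop does behind the scenes. The helper name manual_for_loop is hypothetical and exists only for illustration; the real for statement is implemented by the interpreter, but it follows the same iter(), next(), StopIteration sequence.

```python
def manual_for_loop(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)      # a for statement calls iter() once
    while True:
        try:
            item = next(iterator)  # then calls next() for each pass
        except StopIteration:
            break                  # StopIteration ends the loop quietly
        collected.append(item)
    return collected

letters = manual_for_loop("abc")
print(letters)  # ['a', 'b', 'c']

# One-time use: an exhausted iterator yields nothing on a second pass
it = iter([1, 2, 3])
first_pass = list(it)
second_pass = list(it)
print(first_pass, second_pass)  # [1, 2, 3] []
```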

💡 Key Benefits

  • Memory Efficiency: Process large datasets without loading everything into memory
  • Performance: Generate values only when needed (lazy evaluation)
  • Composability: Chain iterators together for complex data processing
  • Infinite Sequences: Create iterators that generate infinite sequences
  • Clean Code: Write more readable and Pythonic code

Built-in Iterators

Python's built-in collections (lists, strings, tuples, dicts, sets) are all iterable, and iter() returns an iterator over any of them. Let's start with the basics:

Basic Iterator Usage
# Lists are iterable, but not iterators themselves
numbers = [1, 2, 3, 4, 5]

# Get an iterator from a list
number_iterator = iter(numbers)
print(f"Iterator object: {number_iterator}")

# Use next() to get values one by one
print(next(number_iterator))  # 1
print(next(number_iterator))  # 2
print(next(number_iterator))  # 3

# Iterate through remaining items
for num in number_iterator:
    print(num)  # 4, 5

# String iteration
text = "Hello"
text_iter = iter(text)
print(next(text_iter))  # 'H'
print(next(text_iter))  # 'e'

🎯 Pro Tips

  • Check if iterable: Call iter(obj) and catch TypeError; hasattr(obj, '__iter__') misses objects that are iterable only through __getitem__
  • Safe iteration: Use next(iterator, default_value) to avoid StopIteration
  • Convert to list: Use list(iterator) to consume all values
  • Iterator vs Iterable: Lists are iterable, but iter(list) returns an iterator
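
As a rough illustration of these tips, the sketch below uses the abstract base classes in collections.abc to tell iterables from iterators. The helper is_iterable and the class OldStyleSequence are hypothetical names made up for this example.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
number_iterator = iter(numbers)

# A list is iterable but is not itself an iterator
list_is_iterable = isinstance(numbers, Iterable)           # True
list_is_iterator = isinstance(numbers, Iterator)           # False
iter_is_iterator = isinstance(number_iterator, Iterator)   # True

def is_iterable(obj):
    """Hypothetical helper: True if iter(obj) succeeds."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True

class OldStyleSequence:
    """Iterable through __getitem__ only, so it has no __iter__ attribute."""
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10

legacy = OldStyleSequence()
has_dunder_iter = hasattr(legacy, "__iter__")  # False, yet iteration works
legacy_items = list(legacy)                    # [0, 10, 20]

# next() with a default returns the default instead of raising StopIteration
empty = iter([])
fallback = next(empty, "no more items")
print(fallback)  # no more items
```

Trying iter() directly is the more general check, since it covers both the __iter__ protocol and the older __getitem__ sequence protocol.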

Creating Custom Iterators

Learn to create your own iterator classes by implementing the iterator protocol:

Simple Custom Iterator
# Simple counter iterator
class Counter:
    def __init__(self, start, end):
        self.current = start
        self.end = end
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.current >= self.end:
            raise StopIteration
        value = self.current   # remember the value to hand back
        self.current += 1      # advance for the next call
        return value

# Usage
counter = Counter(1, 4)
for num in counter:
    print(num)  # 1, 2, 3

# Even numbers iterator
class EvenNumbers:
    def __init__(self, limit):
        self.limit = limit
        self.current = 0
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        result = self.current
        self.current += 2
        return result

# Usage
evens = EvenNumbers(10)
print(list(evens))  # [0, 2, 4, 6, 8]

🔍 Iterator Protocol Deep Dive

__iter__(): Returns the iterator object itself. This makes the object iterable.

__next__(): Returns the next value in the sequence. When no more items are available, it raises StopIteration.

Best Practice: Always implement both methods for a complete iterator.
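
One consequence of __iter__() returning self is that the object is single-use. A common way around this, sketched below under the assumption that you want a reusable range-like object, is to split the roles: a container whose __iter__() hands out a fresh iterator each time. The class names CounterRange and CounterIterator are hypothetical.

```python
class CounterIterator:
    """Iterator: carries the position and implements both protocol methods."""
    def __init__(self, start, end):
        self.current = start
        self.end = end

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.end:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

class CounterRange:
    """Iterable only: creates a new CounterIterator for every loop."""
    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __iter__(self):
        return CounterIterator(self.start, self.end)

reusable = CounterRange(1, 4)
first = list(reusable)
second = list(reusable)
print(first, second)  # [1, 2, 3] [1, 2, 3]

single_use = CounterIterator(1, 4)
once = list(single_use)
again = list(single_use)
print(once, again)    # [1, 2, 3] []
```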

Generator Functions - Simplified Iterators

Generators provide an easier way to create iterators using the yield keyword:

Generator Functions
# Simple generator function
def count_up_to(max_count):
    count = 1
    while count <= max_count:
        yield count
        count += 1

# Usage
counter = count_up_to(3)
print(next(counter))  # 1
print(next(counter))  # 2
print(next(counter))  # 3

# Fibonacci generator
def fibonacci(n):
    a, b = 0, 1
    count = 0
    while count < n:
        yield a
        a, b = b, a + b
        count += 1

# Usage
fib = fibonacci(5)
for num in fib:
    print(num)  # 0, 1, 1, 2, 3

# Square numbers generator
def squares(limit):
    for i in range(limit):
        yield i ** 2

# Usage
square_gen = squares(5)
print(list(square_gen))  # [0, 1, 4, 9, 16]

🚀 Generator Advantages

  • Simpler Syntax: No need to implement __iter__ and __next__
  • Automatic State Management: Python handles the state between yields
  • Memory Efficient: Values are generated on-demand
  • Readable Code: More intuitive than class-based iterators
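
To make the comparison concrete, here is a sketch of a generator function that behaves like the class-based Counter shown earlier. The function name counter is chosen for this example; the point is that the generator object Python creates already satisfies the iterator protocol.

```python
def counter(start, end):
    """Yield start, start + 1, ... up to but not including end."""
    current = start
    while current < end:
        yield current      # execution pauses here until the next next() call
        current += 1       # local state survives between yields

gen = counter(1, 4)

# A generator object is its own iterator
is_own_iterator = iter(gen) is gen
print(is_own_iterator)   # True

first_value = next(gen)
remaining = list(gen)
after_exhaustion = next(gen, None)
print(first_value, remaining, after_exhaustion)  # 1 [2, 3] None
```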

Generator Expressions

Create generators using a concise syntax similar to list comprehensions:

Generator Expressions
# Basic generator expression
squares = (x**2 for x in range(5))
print(type(squares))  # <class 'generator'>
print(list(squares))  # [0, 1, 4, 9, 16]

# Filtered generator
even_squares = (x**2 for x in range(10) if x % 2 == 0)
print(list(even_squares))  # [0, 4, 16, 36, 64]

# String processing
words = ['hello', 'world', 'python', 'programming']
lengths = (len(word) for word in words)
print(list(lengths))  # [5, 5, 6, 11]

# Memory comparison
import sys

# List comprehension (stores all values)
list_comp = [x**2 for x in range(1000)]
print(f"List size: {sys.getsizeof(list_comp)} bytes")

# Generator expression (stores only its current state, not the values)
gen_exp = (x**2 for x in range(1000))
print(f"Generator size: {sys.getsizeof(gen_exp)} bytes")

📊 List vs Generator Comparison

List Comprehension

  • ✅ Random access to elements
  • ✅ Can iterate multiple times
  • ❌ Uses more memory
  • ❌ Builds every element before the first one can be used, which is costly for large datasets

Generator Expression

  • ✅ Memory efficient
  • ✅ Lazy evaluation
  • ❌ One-time iteration
  • ❌ No random access
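
The trade-offs in this comparison can be checked with a short sketch. Exact byte counts vary by Python version and platform, so the example only compares sizes rather than printing specific numbers. Note that sys.getsizeof reports the container itself, not the integers it refers to.

```python
import sys

squares_list = [x * x for x in range(10_000)]
squares_gen = (x * x for x in range(10_000))

# The list holds 10,000 references; the generator holds only its state
list_is_bigger = sys.getsizeof(squares_list) > sys.getsizeof(squares_gen)
print(list_is_bigger)  # True

# Random access works on the list but not on the generator
third = squares_list[2]
indexing_error = None
try:
    squares_gen[2]
except TypeError as exc:
    indexing_error = type(exc).__name__
print(third, indexing_error)  # 4 TypeError

# The generator can be consumed once; a second pass finds nothing left
total_first = sum(squares_gen)
total_second = sum(squares_gen)
print(total_first == sum(squares_list), total_second)  # True 0
```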

Itertools Module

Python's itertools module provides powerful iterator building blocks:

Itertools Examples
import itertools

# count() - infinite counting
counter = itertools.count(start=5, step=2)
print(next(counter))  # 5
print(next(counter))  # 7
print(next(counter))  # 9

# cycle() - infinite repetition
colors = itertools.cycle(['red', 'green', 'blue'])
for i, color in enumerate(colors):
    if i >= 6:
        break
    print(color)  # red, green, blue, red, green, blue

# repeat() - repeat a value
repeated = itertools.repeat('hello', 3)
print(list(repeated))  # ['hello', 'hello', 'hello']

# chain() - combine iterators
list1 = [1, 2, 3]
list2 = [4, 5, 6]
chained = itertools.chain(list1, list2)
print(list(chained))  # [1, 2, 3, 4, 5, 6]

# islice() - slice an iterator
numbers = itertools.count()
first_five = itertools.islice(numbers, 5)
print(list(first_five))  # [0, 1, 2, 3, 4]

🛠️ Common Itertools Functions

  • count(): Infinite arithmetic progression
  • cycle(): Infinite repetition of iterable
  • repeat(): Repeat a value n times, or forever if no count is given
  • chain(): Join several iterables end to end (chain.from_iterable() flattens one level of nesting)
  • islice(): Slice an iterator efficiently
  • takewhile(): Take elements while condition is true
  • dropwhile(): Skip elements while condition is true, then yield the rest
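
The functions in this list that were not demonstrated above can be sketched as follows. The sample data (readings, nested) is made up for illustration.

```python
import itertools

readings = [1, 3, 5, 8, 2, 9]

# takewhile() stops at the first element that fails the test
leading_small = list(itertools.takewhile(lambda x: x < 6, readings))
print(leading_small)  # [1, 3, 5]

# dropwhile() skips the leading run, then yields everything after it,
# including later elements that would have passed the test (the 2 here)
after_small = list(itertools.dropwhile(lambda x: x < 6, readings))
print(after_small)    # [8, 2, 9]

# chain() joins separate iterables; chain.from_iterable() takes one
# iterable of iterables and flattens a single level
joined = list(itertools.chain([1, 2], [3]))
nested = [[1, 2], [3], [4, 5]]
flattened = list(itertools.chain.from_iterable(nested))
print(joined, flattened)  # [1, 2, 3] [1, 2, 3, 4, 5]

# repeat() with no count is infinite, so bound it with islice()
three = list(itertools.islice(itertools.repeat("x"), 3))
print(three)  # ['x', 'x', 'x']
```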

Practical Applications

Here are real-world scenarios where iterators shine:

File Processing Iterator
import itertools  # needed by batch_processor below

# Memory-efficient file reading
def read_large_file(file_path):
    """Generator to read large files line by line"""
    with open(file_path, 'r') as file:
        for line in file:
            yield line.strip()

# Usage (doesn't load entire file into memory)
# for line in read_large_file('huge_file.txt'):
#     process_line(line)

# Batch processing iterator
def batch_processor(iterable, batch_size):
    """Process items in batches"""
    iterator = iter(iterable)
    while True:
        batch = list(itertools.islice(iterator, batch_size))
        if not batch:
            break
        yield batch

# Usage
data = range(10)
for batch in batch_processor(data, 3):
    print(f"Processing batch: {batch}")
# Output:
# Processing batch: [0, 1, 2]
# Processing batch: [3, 4, 5]
# Processing batch: [6, 7, 8]
# Processing batch: [9]

# Data transformation pipeline
def process_numbers(numbers):
    """Multi-step data processing pipeline"""
    # Step 1: Filter even numbers
    evens = (x for x in numbers if x % 2 == 0)
    # Step 2: Square them
    squared = (x**2 for x in evens)
    # Step 3: Keep only values below 100
    filtered = (x for x in squared if x < 100)
    return filtered

# Usage
numbers = range(20)
result = process_numbers(numbers)
print(list(result))  # [0, 4, 16, 36, 64]
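
The file reader above is shown only with commented-out usage, so here is a runnable sketch that exercises it against a small temporary file. The file contents and the variable names sample_path, non_empty and lengths are assumptions made for the demonstration; read_large_file is repeated so the example stands alone.

```python
import os
import tempfile

def read_large_file(file_path):
    """Generator to read a file line by line without loading it all."""
    with open(file_path, "r") as file:
        for line in file:
            yield line.strip()

# Create a throwaway file to read from (stand-in for a genuinely large file)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\n\nbeta\ngamma\n")
    sample_path = tmp.name

try:
    # Chain a second lazy step onto the reader: skip blank lines
    non_empty = (line for line in read_large_file(sample_path) if line)
    # Only here, when the pipeline is consumed, is the file actually read
    lengths = [len(line) for line in non_empty]
finally:
    os.remove(sample_path)

print(lengths)  # [5, 4, 5]
```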

Best Practices & Performance Tips

Performance

  • Use generators for large datasets
  • Prefer generator expressions over list comprehensions when possible
  • Chain iterators instead of creating intermediate lists
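
A small sketch of why generator expressions can save work: when the consumer stops early, the remaining values are never computed. The function expensive and the list calls are hypothetical, used here only to record which inputs were actually processed.

```python
calls = []

def expensive(x):
    """Stand-in for a costly computation; records each input it sees."""
    calls.append(x)
    return x * x

# any() stops at the first True, so the generator is only advanced
# until a square greater than 10 appears
found = any(expensive(x) > 10 for x in range(1_000_000))

print(found)  # True
print(calls)  # [0, 1, 2, 3, 4]
```

With a list comprehension inside any(), all one million squares would be computed before the check began.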

🧠 Memory Management

  • A generator's own state takes a small, fixed amount of memory regardless of how many values it produces
  • Avoid converting generators to lists unless necessary
  • Use itertools for memory-efficient operations

🔧 Code Quality

  • Use descriptive names for generator functions
  • Document the expected behavior and limitations
  • Handle StopIteration gracefully: catch it around bare next() calls, or pass a default to next()

🚨 Common Pitfalls

  • Remember generators are one-time use
  • Don't modify the underlying data during iteration
  • Be careful with infinite generators in loops
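
Two of these pitfalls are illustrated in the sketch below. The function squares_of_evens is a hypothetical pipeline stage. The first part shows that an "if" filter never terminates on an infinite source, so an explicit stopping rule such as itertools.takewhile() is needed; the second shows one safe way to remove items from a list while looping.

```python
import itertools

def squares_of_evens(numbers):
    """Lazily square the even values of any iterable, finite or not."""
    evens = (x for x in numbers if x % 2 == 0)
    return (x * x for x in evens)

# A filter like (x for x in source if x < 100) would keep testing an
# infinite source forever. takewhile() stops at the first failing value.
endless = squares_of_evens(itertools.count())
small_squares = list(itertools.takewhile(lambda x: x < 100, endless))
print(small_squares)  # [0, 4, 16, 36, 64]

# Removing from a list while iterating over it skips elements.
# Iterating over a copy leaves the loop unaffected by the removals.
items = [1, 2, 3, 4]
for item in list(items):   # list(items) is a snapshot
    if item % 2 == 0:
        items.remove(item)
print(items)  # [1, 3]
```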

🧠 Test Your Knowledge

Test your understanding of Python iterators:

Question 1: Which methods must a class implement to be an iterator?

Question 2: What is the main advantage of generators over lists?

Question 3: What happens when an iterator has no more items?