Medium · Backend Engineer · Data
Explain Python generators, iterators, and the yield keyword — when are they useful in production?
Posted 18/04/2026
by Mehedy Hasan Ador
Question Details
At a data-heavy company:
> "We need to process a 10GB CSV file but only have 2GB of RAM. Loading it all at once crashes. How do generators solve this?"
Suggested Solution
Generators — Lazy Evaluation
Regular function — loads ALL into memory:

```python
def load_all(filepath):
    rows = []
    with open(filepath) as f:
        for line in f:
            rows.append(parse(line))  # parse() stands in for your row parser
    return rows  # 10GB in memory! 💥
```
Generator — yields one row at a time:

```python
def load_lazy(filepath):
    with open(filepath) as f:
        for line in f:
            yield parse(line)  # only ONE row in memory at a time
```

Usage — same interface as a list:

```python
for row in load_lazy("huge.csv"):  # constant memory
    process(row)
```
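Under the hood, a generator is just an iterator: each `next()` call resumes the function body at the last `yield`, and a `for` loop calls `next()` until `StopIteration`. A minimal sketch (`count_up` is an illustrative helper, not part of the CSV example):

```python
def count_up(n):
    """Yield 0..n-1 one value at a time."""
    i = 0
    while i < n:
        yield i
        i += 1

gen = count_up(3)
print(next(gen))  # 0
print(next(gen))  # 1
print(next(gen))  # 2
# A fourth next(gen) raises StopIteration — for-loops catch this for you.

# Generators are single-use: once exhausted, they stay empty.
print(list(gen))  # []
```

This single-use behavior matters in production: you cannot iterate the same generator twice, so re-reading the file means calling the generator function again.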
Generator Expression (like a list comprehension, but lazy)

```python
# List comprehension — builds everything in memory
squares = [x**2 for x in range(10_000_000)]  # ~80MB for the list alone, plus the int objects

# Generator expression — constant memory
squares = (x**2 for x in range(10_000_000))  # ~200 bytes for the generator object
```
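Laziness also means a consumer that short-circuits stops the generator expression early: values after the stopping point are never computed at all. A small illustration (`observe` is a hypothetical tracing helper added for the demo):

```python
seen = []

def observe(x):
    """Hypothetical tracing helper: records every value the genexp pulls."""
    seen.append(x)
    return x

# any() short-circuits on the first truthy result, so the generator
# expression stops after x = 3; the rest of range(1_000_000) is never produced.
found = any(observe(x) > 2 for x in range(1_000_000))
print(found)  # True
print(seen)   # [0, 1, 2, 3]
```

The same expression as a list comprehension would call `observe` a million times before `any()` ever ran.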
yield from (delegating to sub-generators)

```python
def flatten(matrix):
    for row in matrix:
        yield from row  # delegate to each row's iterator

list(flatten([[1, 2], [3, 4], [5]]))  # [1, 2, 3, 4, 5]
```
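`yield from` also composes with recursion, which helps when the nesting depth isn't known in advance. A sketch under that assumption (`deep_flatten` is an illustrative name, not from the example above):

```python
def deep_flatten(items):
    """Recursively flatten nested lists of arbitrary depth."""
    for item in items:
        if isinstance(item, list):
            yield from deep_flatten(item)  # delegate to the recursive generator
        else:
            yield item

print(list(deep_flatten([1, [2, [3, [4]], 5]])))  # [1, 2, 3, 4, 5]
```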
Infinite Sequences

```python
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Take the first 10
from itertools import islice
list(islice(fibonacci(), 10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```
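In production, long or infinite streams are often consumed in fixed-size chunks, e.g. to batch database inserts. A sketch built on `islice` (`batched` here is a hand-rolled helper; Python 3.12+ ships a similar `itertools.batched` that yields tuples):

```python
from itertools import islice

def batched(iterable, size):
    """Yield lists of up to `size` items from any iterable, lazily."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:  # iterator exhausted
            return
        yield batch

print(list(batched(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```

Because `batched` is itself a generator, only one chunk exists in memory at a time, so it composes with the streaming readers above.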
Pipeline Pattern

```python
import json

def read_lines(path):
    with open(path) as f:
        yield from f

def parse_rows(lines):
    for line in lines:
        yield json.loads(line)

def filter_active(rows):
    for row in rows:
        if row["active"]:
            yield row

# Compose generators — each step is lazy
pipeline = filter_active(parse_rows(read_lines("data.jsonl")))
for row in pipeline:
    print(row)
```
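To see the whole pipeline run end to end, here is a self-contained demo against a temporary `.jsonl` file (the sample records and the `"id"`/`"active"` schema are made up for illustration):

```python
import json
import os
import tempfile

# Hypothetical sample data matching the schema the pipeline assumes.
records = [
    {"id": 1, "active": True},
    {"id": 2, "active": False},
    {"id": 3, "active": True},
]

with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")
    path = f.name

def read_lines(path):
    with open(path) as fh:
        yield from fh

def parse_rows(lines):
    for line in lines:
        yield json.loads(line)

def filter_active(rows):
    for row in rows:
        if row["active"]:
            yield row

# Each stage pulls one line at a time — memory stays flat however big the file is.
active_ids = [row["id"] for row in filter_active(parse_rows(read_lines(path)))]
print(active_ids)  # [1, 3]

os.remove(path)
```

Each stage holds only the current row, so the same three functions handle a 10GB file in the same 2GB of RAM — exactly the scenario from the question.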