C++ Views & Pipelines
Composable data processing with filter, transform, take, and drop
🔄 What are Views & Pipelines?
Views and pipelines, introduced with the C++20 ranges library, let you chain data transformations such as filter, transform, take, and drop with the pipe operator (|). The result is a readable data-processing workflow that runs lazily: no work is done until the resulting view is iterated.
#include <ranges>
#include <vector>
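// data is a container; even and square are callables defined elsewhere
auto result = data | std::views::filter(even) | std::views::transform(square) | std::views::take(5);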
Core Pipeline Operations
filter
Keep elements that match condition
data | std::views::filter(predicate)
transform
Apply function to each element
data | std::views::transform(func)
take
Get first N elements
data | std::views::take(n)
drop
Skip first N elements
data | std::views::drop(n)
🔹 Filter Operations
Filtering selects the elements of a sequence that satisfy a predicate, enabling focused data processing.
The eager algorithm std::copy_if copies the matches into a new container, while the range adaptor
std::views::filter produces a lazy view over the original data. Either way, you can extract items that meet
specific criteria, such as even numbers, values above a threshold, or combined conditions. This operation is
foundational for data querying, cleaning datasets, and preparing inputs for further transformations without
modifying the original collection.
#include <ranges>
#include <vector>
#include <iostream>
int main() {
std::vector<int> numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
// Filter even numbers
auto evens = numbers | std::views::filter([](int n) {
return n % 2 == 0;
});
// Filter numbers greater than 5
auto greater_than_five = numbers | std::views::filter([](int n) {
return n > 5;
});
// Filter with multiple conditions
auto even_and_large = numbers | std::views::filter([](int n) {
return n % 2 == 0 && n > 4;
});
std::cout << "Even numbers: ";
for (int n : evens) {
std::cout << n << " ";
}
std::cout << std::endl;
std::cout << "Greater than 5: ";
for (int n : greater_than_five) {
std::cout << n << " ";
}
std::cout << std::endl;
std::cout << "Even and > 4: ";
for (int n : even_and_large) {
std::cout << n << " ";
}
std::cout << std::endl;
return 0;
}
Output:
Even numbers: 2 4 6 8 10
Greater than 5: 6 7 8 9 10
Even and > 4: 6 8 10
🔹 Transform Operations
Transformation applies a function to each element, mapping a range to a new set of values. The eager
std::transform algorithm writes the results into an output range, while std::views::transform computes each
result on demand during iteration. Typical uses are squaring numbers, doubling values, or converting types
(e.g., numbers to strings). Each output corresponds to exactly one input element, which makes the operation
suitable for data normalization, format conversion, or feature computation in pipelines.
#include <ranges>
#include <vector>
#include <string>
#include <iostream>
int main() {
std::vector<int> numbers = {1, 2, 3, 4, 5};
// Square each number
auto squares = numbers | std::views::transform([](int n) {
return n * n;
});
// Double each number
auto doubled = numbers | std::views::transform([](int n) {
return n * 2;
});
// Convert to strings
auto strings = numbers | std::views::transform([](int n) {
return std::to_string(n);
});
std::cout << "Original: ";
for (int n : numbers) {
std::cout << n << " ";
}
std::cout << std::endl;
std::cout << "Squares: ";
for (int n : squares) {
std::cout << n << " ";
}
std::cout << std::endl;
std::cout << "Doubled: ";
for (int n : doubled) {
std::cout << n << " ";
}
std::cout << std::endl;
std::cout << "As strings: ";
for (const auto& s : strings) {
std::cout << s << " ";
}
std::cout << std::endl;
return 0;
}
Output:
Original: 1 2 3 4 5
Squares: 1 4 9 16 25
Doubled: 2 4 6 8 10
As strings: 1 2 3 4 5
🔹 Take and Drop Operations
Take and drop operations control how many leading elements of a sequence are processed. "Take" keeps the first N elements (take), or keeps elements for as long as a condition holds (take_while). "Drop" (or skip) excludes the first N elements (drop), or excludes elements for as long as a condition holds and yields everything from the first failing element onward (drop_while). These are useful for pagination, windowed analysis, or ignoring headers in data streams, and they work efficiently with lazy evaluation to avoid processing unnecessary data.
#include <ranges>
#include <vector>
#include <iostream>
int main() {
std::vector<int> data = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
// Take first 5 elements
auto first_five = data | std::views::take(5);
// Drop first 3 elements
auto skip_three = data | std::views::drop(3);
// Take while condition is true
auto take_while_small = data | std::views::take_while([](int n) {
return n < 60;
});
// Drop while condition is true
auto drop_while_small = data | std::views::drop_while([](int n) {
return n < 60;
});
std::cout << "Original: ";
for (int n : data) {
std::cout << n << " ";
}
std::cout << std::endl;
std::cout << "First 5: ";
for (int n : first_five) {
std::cout << n << " ";
}
std::cout << std::endl;
std::cout << "Skip 3: ";
for (int n : skip_three) {
std::cout << n << " ";
}
std::cout << std::endl;
std::cout << "Take while < 60: ";
for (int n : take_while_small) {
std::cout << n << " ";
}
std::cout << std::endl;
std::cout << "Drop while < 60: ";
for (int n : drop_while_small) {
std::cout << n << " ";
}
std::cout << std::endl;
return 0;
}
Output:
Original: 10 20 30 40 50 60 70 80 90 100
First 5: 10 20 30 40 50
Skip 3: 40 50 60 70 80 90 100
Take while < 60: 10 20 30 40 50
Drop while < 60: 60 70 80 90 100
🔹 Complex Pipeline Examples
Complex pipelines combine several adaptors, such as filter, transform, take, and drop, into a single, expressive data-processing flow. For instance, you might filter sales above a threshold, double them, and keep the first three matches, or skip a prefix, take a window, and apply a bonus. Note that take(3) yields the first three elements in source order, not the three largest; ranking requires an eager sort such as std::ranges::sort, which is an algorithm and not a view adaptor. Chaining views keeps the logic readable and efficient, because elements flow through the stages on demand without intermediate containers.
#include <ranges>
#include <vector>
#include <iostream>
int main() {
std::vector<int> sales_data = {150, 200, 75, 300, 125, 400, 50, 275, 350, 100};
// Pipeline 1: first 3 sales above 100 (in original order), doubled
auto top_sales = sales_data
| std::views::filter([](int sale) { return sale > 100; }) // Above 100
| std::views::transform([](int sale) { return sale * 2; }) // Double them
| std::views::take(3); // First 3 matches, not the 3 largest
std::cout << "Top 3 doubled sales > 100: ";
for (int sale : top_sales) {
std::cout << sale << " ";
}
std::cout << std::endl;
// Pipeline 2: Process middle range data
auto middle_range = sales_data
| std::views::drop(2) // Skip first 2
| std::views::take(6) // Take next 6
| std::views::filter([](int sale) { return sale >= 100; }) // Filter >= 100
| std::views::transform([](int sale) { return sale + 50; }); // Add bonus
std::cout << "Middle range with bonus: ";
for (int sale : middle_range) {
std::cout << sale << " ";
}
std::cout << std::endl;
// Pipeline 3: Complex business logic
auto processed = sales_data
| std::views::enumerate // Add indices (requires C++23)
| std::views::filter([](auto pair) { // Filter by index and value
auto [index, value] = pair;
return index % 2 == 0 && value > 100;
})
| std::views::transform([](auto pair) { // Transform the values
auto [index, value] = pair;
return value * 1.1; // 10% increase
})
| std::views::take(3);
std::cout << "Even indices > 100 with 10% increase: ";
for (double value : processed) {
std::cout << value << " ";
}
std::cout << std::endl;
return 0;
}
Output:
Top 3 doubled sales > 100: 300 400 600
Middle range with bonus: 350 175 450 325
Even indices > 100 with 10% increase: 165 137.5 385
🔹 Performance and Best Practices
Writing efficient pipelines requires attention to memory, computation, and readability to ensure optimal performance. Key strategies include avoiding unnecessary data copies by using views, leveraging lazy evaluation to defer calculations, composing operations for clarity, and reserving memory in advance when sizes are known. These practices reduce overhead, improve cache usage, and make your code more maintainable and scalable.
Best Practices:
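- Lazy Evaluation: Views do no work until they are iterated
- Order Matters: Put filter before transform when possible, so the transform function runs only on elements that pass the filter
- Reusable Views: Store a view in a variable to iterate it more than once; avoid modifying the underlying container between uses, because some views (such as filter) cache their first position
- Memory Efficient: A view over an existing container refers to its elements instead of copying them, so the container must outlive the view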
#include <ranges>
#include <vector>
#include <iostream>
#include <numeric> // std::iota
int main() {
std::vector<int> large_dataset(1000000);
std::iota(large_dataset.begin(), large_dataset.end(), 1);
// Efficient: filter first, then transform
auto efficient = large_dataset
| std::views::filter([](int n) { return n % 1000 == 0; }) // Reduces data first
| std::views::transform([](int n) { return n * n; }) // Then transform less data
| std::views::take(10);
// Store view for reuse
auto reusable_view = large_dataset
| std::views::filter([](int n) { return n % 2 == 0; })
| std::views::take(100);
// Use the view multiple times
auto count = std::ranges::distance(reusable_view);
std::cout << "Count: " << count << std::endl;
// Views are composable
auto extended = reusable_view
| std::views::transform([](int n) { return n / 2; });
return 0;
}
Key Benefits:
Views defer work until a pipeline is iterated, compose with the pipe operator into readable chains, and refer to the source data instead of copying it into intermediate containers.
✓ Lazy evaluation saves computation
✓ Composable and readable code
✓ Memory efficient processing