The Functional Approach to Data Processing
One of the strengths of the DataFrame library is its functional programming approach to data processing. This approach offers several benefits:
- Immutability: Operations create new Series without modifying the original data
- Composability: Operations can be easily combined into complex workflows
- Readability: The intent of the code is clear and follows a declarative style
- Maintainability: Logic is broken down into smaller, reusable functions
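These properties can be sketched with plain TypeScript arrays (an illustrative sketch only; the library's actual Serie API may differ):

```typescript
// Illustrative sketch using plain arrays rather than the library's Serie type.
const values = [1, 2, 3, 4, 5];

// Each step returns a new array; `values` itself is never mutated.
const sumOfOddSquares = values
  .filter((v) => v % 2 === 1)      // keep odd elements: [1, 3, 5]
  .map((v) => v * v)               // square them: [1, 9, 25]
  .reduce((acc, v) => acc + v, 0); // accumulate the sum: 35

console.log(sumOfOddSquares); // 35
console.log(values);          // [1, 2, 3, 4, 5], unchanged
```

Each intermediate result is a new collection, so the original data stays intact and the pipeline reads as a declarative description of intent.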
DataFrame Library API Reference
Core Components
Element Processing
filter
Creates a new Serie containing only elements that satisfy a predicate function.
find
Finds the first element that matches a predicate function.
forEach
Iterates over each element in a Serie to perform an operation without changing the Serie.
map
Transforms each element in a Serie using a callback function, returning a new Serie.
parallel_map
Transforms elements in parallel using multiple threads for better performance.
print
Outputs Serie contents to the console for debugging and inspection.
reduce
Reduces a Serie to a single value by applying a function against an accumulator.
reject
Creates a new Serie excluding elements that satisfy a predicate function.
where
Alternative to filter for selecting elements that match specified criteria.
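How filter, reject, and find relate can be sketched with plain arrays (`reject` here is a hypothetical helper written as the complement of filter; the library's exact signatures are not shown in this reference):

```typescript
// Hypothetical helper mirroring the documented `reject`: the complement of filter.
const reject = <T>(xs: T[], pred: (x: T) => boolean): T[] =>
  xs.filter((x) => !pred(x));

const data = [3, 8, 1, 9, 4];
const isLarge = (v: number) => v > 4;

const large = data.filter(isLarge); // elements satisfying the predicate: [8, 9]
const rest = reject(data, isLarge); // elements excluded by it: [3, 1, 4]
const first = data.find(isLarge);   // first match: 8
```

filter and reject are exact complements: every element lands in one result or the other, never both.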
Control Flow
compose
Creates a new function by composing multiple functions together.
if_then_else
Conditionally transforms elements based on a predicate, with separate true/false paths.
map_if
Conditionally applies a transformation to elements that match a condition.
memoise
Caches function results to avoid redundant computation on repeated calls.
pipe
Composes multiple operations into a reusable processing pipeline.
switch
Selects from multiple transformations based on element values.
whenAll
Executes a function when all conditions are satisfied.
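pipe and memoise follow well-known functional patterns; a minimal generic sketch (these are assumed shapes, not the library's actual signatures):

```typescript
// Minimal generic pipe: applies functions left to right.
const pipe = <T>(...fns: Array<(x: T) => T>) => (x: T): T =>
  fns.reduce((acc, f) => f(acc), x);

// Minimal generic memoise: caches one result per distinct input.
const memoise = <T, R>(f: (x: T) => R): ((x: T) => R) => {
  const cache = new Map<T, R>();
  return (x: T): R => {
    if (!cache.has(x)) cache.set(x, f(x)); // compute only once per input
    return cache.get(x)!;
  };
};

const double = (n: number) => n * 2;
const addOne = (n: number) => n + 1;

// pipe applies left to right: addOne(double(5)) = 11
const doubleThenAdd = pipe(double, addOne);
console.log(doubleThenAdd(5)); // 11

let calls = 0;
const square = memoise((n: number) => { calls++; return n * n; });
square(4);
square(4); // second call hits the cache
console.log(calls); // 1
```

A composed pipeline like `doubleThenAdd` is itself an ordinary function, so it can be reused, passed to map, or composed again.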
Series Creation
Series Manipulation
chain
Combines multiple Series sequentially into a single Serie.
concat
Concatenates multiple Series into a single Serie.
chunk
Splits a Serie into chunks of a specified size.
flatMap
Applies a transformation to each element in a Serie, where the transformation returns a Serie for each element, then flattens all the resulting Series into a single Serie. It is essentially a combination of map followed by a flatten operation.
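The map-then-flatten behavior can be sketched with TypeScript's built-in Array.flatMap (the library's Serie version may differ in signature):

```typescript
const source = [1, 2, 3];

// Each element maps to a small collection, and the results are
// flattened into a single sequence.
const expanded = source.flatMap((v) => [v, v * 10]);
console.log(expanded); // [1, 10, 2, 20, 3, 30]

// Equivalent two-step form: map then flatten.
const twoStep = source.map((v) => [v, v * 10]).flat();
console.log(twoStep);  // [1, 10, 2, 20, 3, 30]
```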
flatten
Converts a Serie of arrays or nested Series into a flat Serie.
merge
Combines multiple Series with a custom merge function.
partition
Divides a Serie into two Series based on a predicate function.
skip
Creates a Serie that skips the first n elements of the source Serie.
slice
Creates a Serie from a subset of elements specified by start and end indices.
split
Divides a Serie into multiple equal-sized parts.
take
Creates a Serie with the first n elements of the source Serie.
unzip
Separates a Serie of tuples into multiple Series.
zip
Combines elements from multiple Series into tuples based on position.
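zip and partition can be sketched generically over plain arrays (these helpers are illustrative assumptions, not the library's actual Serie-based API):

```typescript
// Illustrative zip: pairs elements by position, stopping at the shorter input.
const zip = <A, B>(as: A[], bs: B[]): Array<[A, B]> =>
  as.slice(0, Math.min(as.length, bs.length)).map((a, i): [A, B] => [a, bs[i]]);

// Illustrative partition: one pass of the predicate splits the input in two.
const partition = <T>(xs: T[], pred: (x: T) => boolean): [T[], T[]] =>
  [xs.filter(pred), xs.filter((x) => !pred(x))];

const names = ["a", "b", "c"];
const scores = [10, 20, 30];

const zipped = zip(names, scores);                   // [["a",10], ["b",20], ["c",30]]
const [high, low] = partition(scores, (s) => s >= 20); // [[20, 30], [10]]
```

unzip is the inverse of zip: it turns the paired tuples back into separate positional collections.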
Ordering & Grouping
sort
Creates a sorted copy of a Serie in ascending or descending order.
orderBy
Sorts a Serie based on a key function that determines the sorting order.
groupBy
Groups Serie elements by a key function and returns a map of keys to Series.
unique
Creates a Serie with duplicate elements removed.
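The key-to-groups mapping behind groupBy can be sketched with a plain Map (illustrative only; the library returns a map of keys to Series rather than to arrays):

```typescript
// Illustrative groupBy over plain arrays.
const groupBy = <T, K>(xs: T[], key: (x: T) => K): Map<K, T[]> => {
  const groups = new Map<K, T[]>();
  for (const x of xs) {
    const k = key(x);
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k)!.push(x); // append to the bucket for this key
  }
  return groups;
};

const nums = [1, 2, 3, 4, 5, 6];
const byParity = groupBy(nums, (n) => (n % 2 === 0 ? "even" : "odd"));
console.log(byParity.get("even")); // [2, 4, 6]
console.log(byParity.get("odd"));  // [1, 3, 5]
```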
Formatting
IO
CSV
Read and write CSV files with configurable delimiters, quoting, headers, and automatic type detection.
JSON
Read and write JSON files as arrays of objects, with automatic type mapping and pretty-printing support.
Binary
Platform-independent binary serialization with endianness handling, type safety, and custom type registration.
Machine Learning
RandomForest
The RandomForest class provides an implementation of the Random Forest algorithm integrated with the DataFrame library.
Lime
LIME (Local Interpretable Model-agnostic Explanations) is a technique for explaining the predictions of any machine learning model by approximating it locally with an interpretable model.
Genetic Algorithm
Evolutionary optimization using selection, crossover, and mutation operators for both numerical and combinatorial problems.
Bee Algorithm
Artificial Bee Colony (ABC) optimization inspired by honey bee foraging behavior, for continuous and discrete problems.
Attribute Decomposition
Manager
Manages attribute decomposition workflows on Series.
Decomposer
Base interface for attribute decomposition strategies.
Coordinates
Decomposes vector attributes into coordinate components.
Components
Decomposes matrix/tensor attributes into individual components.
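The idea behind coordinate decomposition can be sketched generically: a flat buffer packing 3D vectors is split into one series per coordinate (illustrative only; the actual Manager/Decomposer API is not detailed in this reference):

```typescript
// Illustrative decomposition of a flat [x0,y0,z0, x1,y1,z1, ...] buffer
// into one series per coordinate.
const decomposeXYZ = (packed: number[]): { x: number[]; y: number[]; z: number[] } => {
  const x: number[] = [], y: number[] = [], z: number[] = [];
  for (let i = 0; i + 2 < packed.length; i += 3) {
    x.push(packed[i]);
    y.push(packed[i + 1]);
    z.push(packed[i + 2]);
  }
  return { x, y, z };
};

const positions = [0, 1, 2, 3, 4, 5]; // two 3D points
const { x, y, z } = decomposeXYZ(positions);
console.log(x); // [0, 3]
console.log(y); // [1, 4]
console.log(z); // [2, 5]
```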
Interpolation
Interpolation Methods
IDW, RBF, Nearest Neighbor, and Natural Neighbor interpolation for scattered 2D/3D point data.
Kriging
Ordinary Kriging with variogram models (Spherical, Exponential, Gaussian, Matérn) for geostatistical interpolation.
Grid Generation
Create 2D/3D Cartesian grids from dimensions or point sets, with RBF-based grid interpolation.
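Of these methods, IDW (Inverse Distance Weighting) is the simplest to sketch: each interpolated value is a distance-weighted average of the sample values, with weights 1/d^p. (Generic sketch only; the library's actual interpolation API is not detailed in this reference.)

```typescript
type Sample = { x: number; y: number; v: number };

// Inverse Distance Weighting at (qx, qy): sum(w_i * v_i) / sum(w_i),
// with w_i = 1 / d_i^p. A sample at zero distance is returned exactly.
function idw(samples: Sample[], qx: number, qy: number, p = 2): number {
  let num = 0, den = 0;
  for (const s of samples) {
    const d = Math.hypot(s.x - qx, s.y - qy);
    if (d === 0) return s.v; // query coincides with a sample point
    const w = 1 / Math.pow(d, p);
    num += w * s.v;
    den += w;
  }
  return num / den;
}

const samples: Sample[] = [
  { x: 0, y: 0, v: 10 },
  { x: 2, y: 0, v: 30 },
];

// The midpoint is equidistant from both samples, so the result is their mean.
console.log(idw(samples, 1, 0)); // 20
```

Larger exponents p make the interpolation more local, weighting nearby samples more heavily relative to distant ones.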