Nested Splits
Note: This feature is currently for internal use only and is not customer-facing.
Flatfile’s split functionality allows you to transform a single field into multiple destination fields. With the introduction of nested splits, you can now create more complex transformations by referencing previewed data in agent tool calls.
Overview
Nested splits enhance Flatfile’s data transformation capabilities by:
- Tracking mapping rules server-side
- Enabling reference to previewed data in agent tool calls
- Supporting complex, multi-level data transformations
- Preserving transformation context across operations
How Nested Splits Work
When you use the split tool in Flatfile, the system now:
- Stores the mapping rules on the server
- Makes these rules available to subsequent agent tool calls
- Allows agents to reference the transformed data
- Maintains the relationship between source and transformed data
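The four behaviors above can be pictured with a small in-memory model. This is only a sketch: `SplitRule`, `RuleStore`, and their methods are hypothetical names invented for illustration, not the preprocessing service's actual API, and a plain Python object stands in for server-side storage.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SplitRule:
    """One recorded split: a source field and the fields it produces (hypothetical)."""
    source: str
    destinations: tuple[str, ...]


@dataclass
class RuleStore:
    """Stand-in for server-side rule tracking; a real service would persist this."""
    original_fields: set[str]
    rules: list[SplitRule] = field(default_factory=list)

    def available_fields(self) -> set[str]:
        # Original fields plus every field produced by an earlier split.
        produced = {d for rule in self.rules for d in rule.destinations}
        return self.original_fields | produced

    def record(self, rule: SplitRule) -> None:
        known = self.available_fields()
        # A later (nested) split may only reference a field that already exists.
        if rule.source not in known:
            raise KeyError(f"unknown source field: {rule.source}")
        # Destination names must be new, so lineage stays unambiguous.
        clashes = known.intersection(rule.destinations)
        if clashes:
            raise ValueError(f"fields already exist: {sorted(clashes)}")
        self.rules.append(rule)

    def origin_of(self, field_name: str) -> str:
        # Walk back through the recorded rules to the original source field.
        for rule in reversed(self.rules):
            if field_name in rule.destinations:
                return self.origin_of(rule.source)
        return field_name


store = RuleStore({"full_address"})
store.record(SplitRule("full_address", ("street_address", "city", "state", "zip")))
# The nested split references a field created by the previous split.
store.record(SplitRule("street_address", ("house_number", "street_name", "unit")))
origin = store.origin_of("unit")
```

Here `origin` resolves to `full_address`, which illustrates how a relationship between source and transformed data could be maintained across two levels of splitting.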
Using Nested Splits
Nested splits are available through the preprocessing service and can be accessed using the split tool. Here’s how to implement nested splits in your data transformation workflow.
Basic Split Operation
A basic split operation transforms a single source field into multiple destination fields.
Nested Split Operation
With nested splits, you can reference the results of previous splits in subsequent transformations.
Example Use Cases
Address Parsing
Split a full address into components, then further split the street address:
- First split: “123 Main St, Apt 4, New York, NY 10001” → [“123 Main St, Apt 4”, “New York”, “NY”, “10001”]
- Nested split: “123 Main St, Apt 4” → [“123”, “Main St”, “Apt 4”]
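The two address splits above can be reproduced in plain Python. The function names and the assumed input layouts ("street, city, ST ZIP" and "number name, unit") are illustrative assumptions, not how the split tool itself parses addresses.

```python
def split_full_address(address: str) -> list[str]:
    """First split: street address, city, state, ZIP (assumes 'street, city, ST ZIP')."""
    parts = [p.strip() for p in address.split(",")]
    state, zip_code = parts[-1].split()
    # Everything before the city belongs to the street address.
    street = ", ".join(parts[:-2])
    return [street, parts[-2], state, zip_code]


def split_street_address(street: str) -> list[str]:
    """Nested split: house number, street name, unit (assumes 'number name, unit')."""
    line, unit = [p.strip() for p in street.split(",", 1)]
    number, name = line.split(" ", 1)
    return [number, name, unit]


first = split_full_address("123 Main St, Apt 4, New York, NY 10001")
# The nested split takes the first output of the previous split as its input.
nested = split_street_address(first[0])
```

Real addresses vary far more than this layout allows, so a sketch like this is useful mainly for seeing how the output of one split feeds the next.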
Name Parsing
Split a full name, then further process components:
- First split: “Dr. John A. Smith Jr.” → [“Dr.”, “John A.”, “Smith”, “Jr.”]
- Nested split: “John A.” → [“John”, “A.”]
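A rough Python equivalent of the name example follows. The prefix and suffix lists and both function names are assumptions chosen for this illustration; they are not taken from Flatfile.

```python
# Small illustrative lookup sets; a real list would be much longer.
PREFIXES = {"Dr.", "Mr.", "Mrs.", "Ms."}
SUFFIXES = {"Jr.", "Sr.", "II", "III"}


def split_full_name(full_name: str) -> list[str]:
    """First split: prefix, given names, family name, suffix."""
    tokens = full_name.split()
    prefix = tokens.pop(0) if tokens and tokens[0] in PREFIXES else ""
    suffix = tokens.pop() if tokens and tokens[-1] in SUFFIXES else ""
    # Treat the last remaining token as the family name.
    last = tokens.pop() if tokens else ""
    return [prefix, " ".join(tokens), last, suffix]


def split_given_names(given: str) -> list[str]:
    """Nested split: first name and middle initial(s)."""
    return given.split()


first = split_full_name("Dr. John A. Smith Jr.")
nested = split_given_names(first[1])
```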
Date and Time Processing
Split a datetime stamp, then further process the date:
- First split: “2025-05-01 14:30:45” → [“2025-05-01”, “14:30:45”]
- Nested split: “2025-05-01” → [“2025”, “05”, “01”]
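The same chain can be written against named fields, which is closer to how a nested split refers to a previously produced field. `split_field` and the field names below are hypothetical helpers for this sketch only.

```python
def split_field(
    record: dict[str, str], source: str, sep: str, destinations: list[str]
) -> dict[str, str]:
    """Return a copy of record with `source` split on `sep` into named destinations."""
    parts = record[source].split(sep)
    if len(parts) != len(destinations):
        raise ValueError(
            f"expected {len(destinations)} parts from {source!r}, got {len(parts)}"
        )
    return {**record, **dict(zip(destinations, parts))}


row = {"timestamp": "2025-05-01 14:30:45"}
row = split_field(row, "timestamp", " ", ["date", "time"])
# The nested split references "date", a field that only exists after the first split.
row = split_field(row, "date", "-", ["year", "month", "day"])
```

Keeping the source field in the returned record is a design choice of this sketch; it leaves every intermediate value available for inspection.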
Implementation Details
The nested splits functionality is implemented in the preprocessing service and leverages several key components:
- Mapping Rules: Rules are now tracked server-side and can be referenced in subsequent operations
- Virtual Machine: Processes the mapping rules and applies them to the data
- Run Class: Manages the application of mapping rules to the data
- Split Tool: Provides the interface for creating split operations
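As a way to picture how these components might fit together, the toy model below gives each one a counterpart. Every name here (`MappingRule`, `execute`, `Run`, `Run.split`) is hypothetical; the real classes and their signatures in the preprocessing service are not described in this document.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MappingRule:
    """Toy mapping rule: split `source` on `separator` into `destinations`."""
    source: str
    separator: str
    destinations: tuple[str, ...]


def execute(rules: list[MappingRule], record: dict[str, str]) -> dict[str, str]:
    """Toy 'virtual machine': run each rule in order against one record."""
    out = dict(record)
    for rule in rules:
        parts = out[rule.source].split(rule.separator)
        # zip stops at the shorter sequence, so extra parts are silently dropped here.
        out.update(zip(rule.destinations, parts))
    return out


@dataclass
class Run:
    """Toy 'run': owns the rule list and applies it to every record."""
    rules: list[MappingRule] = field(default_factory=list)

    def split(self, source: str, separator: str, *destinations: str) -> "Run":
        # Toy 'split tool': the interface through which a new rule is created.
        self.rules.append(MappingRule(source, separator, destinations))
        return self

    def apply(self, records: list[dict[str, str]]) -> list[dict[str, str]]:
        return [execute(self.rules, r) for r in records]


run = (
    Run()
    .split("timestamp", " ", "date", "time")
    .split("date", "-", "year", "month", "day")
)
results = run.apply([{"timestamp": "2025-05-01 14:30:45"}])
```

Because rules run in the order they were added, the second rule can read the `date` field written by the first.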
Best Practices
When working with nested splits:
- Plan Your Transformation Chain: Map out the sequence of splits before implementation
- Use Descriptive Field Names: Clear naming helps track the transformation flow
- Validate Intermediate Results: Check the output of each split before proceeding
- Consider Performance: Complex nested operations may impact processing time
- Test with Sample Data: Verify transformations with representative data samples
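One way to validate intermediate results is to check the shape of each split on sample data before chaining the next one. The helper below is a hypothetical sketch of that idea, not a Flatfile function.

```python
def validate_split(rows: list[list[str]], expected_parts: int) -> list[int]:
    """Return indexes of rows whose split is the wrong length or has an empty part."""
    bad = []
    for i, parts in enumerate(rows):
        if len(parts) != expected_parts or any(not p.strip() for p in parts):
            bad.append(i)
    return bad


samples = ["2025-05-01 14:30:45", "2025-05-01", "2025-05-02 09:00:00"]
first_split = [s.split(" ") for s in samples]
# The second sample has no time component, so it should be flagged.
bad_rows = validate_split(first_split, expected_parts=2)
```

Rows flagged here can be reviewed or corrected before a nested split is applied to their output.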
Limitations
- Deeply nested splits (more than 3-4 levels) may become difficult to manage
- Performance may be affected with very large datasets and complex transformations
- All splits in a chain must be defined within the same agent session
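To keep an eye on the first limitation, the depth of a planned chain can be computed before it is built. The function, the rule representation, and the threshold of 3 are assumptions for this sketch; the source gives only the approximate range "3-4 levels".

```python
MAX_DEPTH = 3  # assumed threshold, taken from the low end of "3-4 levels"


def nesting_depth(rules: list[tuple[str, list[str]]]) -> int:
    """Deepest split level, where splitting an original field counts as level 1.

    Rules must be listed in the order they would be applied.
    """
    level: dict[str, int] = {}
    deepest = 0
    for source, destinations in rules:
        # A source not produced by an earlier rule is an original field (level 0).
        depth = level.get(source, 0) + 1
        for dest in destinations:
            level[dest] = depth
        deepest = max(deepest, depth)
    return deepest


chain = [
    ("full_address", ["street_address", "city", "state", "zip"]),
    ("street_address", ["house_number", "street_name", "unit"]),
    ("street_name", ["name", "street_type"]),
]
depth = nesting_depth(chain)
too_deep = depth > MAX_DEPTH
```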