DataSplits
- class DataSplits(n_splits: int)
Bases: object
Container for DataSplitInfo instances created from the same dataset.
This class provides a fixed-capacity container for multiple DataSplitInfo instances that represent different splits of the same dataset. It accepts only DataSplitInfo instances (add raises TypeError for anything else) and provides index-based access to individual splits.
- Parameters:
- n_splits : int
The number of splits to be stored in the container. Must be a positive integer.
- Attributes:
- _data_splits : List[DataSplitInfo or None]
A list of DataSplitInfo instances, initialized with None values.
- _current_index : int
The index where the next split will be added.
- expected_n_splits : int
The total number of splits the container can hold.
- Raises:
- ValueError
If n_splits is not a positive integer
Notes
The container is initialized with a fixed capacity and fills slots sequentially. Once all slots are filled, no more splits can be added.
Examples
- Create a container for 5 splits:
>>> splits = DataSplits(n_splits=5)
>>> print(len(splits))  # 5
- Add splits to the container:
>>> for i in range(5):
...     split_info = DataSplitInfo(...)  # Create split
...     splits.add(split_info)
- Access a specific split:
>>> split_0 = splits.get_split(0)
>>> split_2 = splits.get_split(2)
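The documented behavior (fixed capacity, sequential filling, validated access) can be illustrated with a minimal stand-in. This is a sketch under stated assumptions, not the library's implementation: SketchSplits and SketchSplitInfo are hypothetical names, the order of the checks inside add is a guess, and len() is assumed to report capacity because the example above prints 5 for an empty container.

```python
from typing import List, Optional


class SketchSplitInfo:
    """Hypothetical stand-in for DataSplitInfo; it only carries a label."""

    def __init__(self, label: str) -> None:
        self.label = label


class SketchSplits:
    """Illustrative re-creation of the documented DataSplits contract."""

    def __init__(self, n_splits: int) -> None:
        if not isinstance(n_splits, int) or n_splits <= 0:
            raise ValueError("n_splits must be a positive integer")
        self.expected_n_splits = n_splits
        # Every slot starts empty (None) and is filled from index 0 upward.
        self._data_splits: List[Optional[SketchSplitInfo]] = [None] * n_splits
        self._current_index = 0

    def __len__(self) -> int:
        # Assumption: len() reports capacity, not the number of filled slots.
        return self.expected_n_splits

    def add(self, split: SketchSplitInfo) -> None:
        if not isinstance(split, SketchSplitInfo):
            raise TypeError("split must be a SketchSplitInfo instance")
        if self._current_index >= self.expected_n_splits:
            raise IndexError("all slots are already filled")
        self._data_splits[self._current_index] = split
        self._current_index += 1

    def get_split(self, index: int) -> SketchSplitInfo:
        if not 0 <= index < self.expected_n_splits:
            raise IndexError("index out of bounds")
        split = self._data_splits[index]
        if split is None:
            raise ValueError("no split assigned at this index")
        return split


splits = SketchSplits(n_splits=3)
splits.add(SketchSplitInfo("fold-0"))
print(len(splits))                # capacity of the container
print(splits.get_split(0).label)  # label of the first stored split
```

Pre-allocating the list with None is what lets get_split tell an out-of-range index (IndexError) apart from an in-range slot that has not been filled yet (ValueError).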
- add(split: DataSplitInfo) → None
Add a DataSplitInfo instance to the container.
Adds a DataSplitInfo instance to the next available slot in the container. The split is added sequentially, starting from index 0.
- Parameters:
- split : DataSplitInfo
The DataSplitInfo instance to add to the container.
- Raises:
- IndexError
If the container already holds expected_n_splits splits, so no slot is left for the new one
- TypeError
If the split is not a DataSplitInfo instance
Notes
The method adds splits sequentially, incrementing the current index after each addition. Once all slots are filled, no more splits can be added.
Examples
- Add a split to the container:
>>> splits = DataSplits(n_splits=3)
>>> split_info = DataSplitInfo(...)
>>> splits.add(split_info)
- Attempt to add more splits than capacity:
>>> for i in range(4):  # More than n_splits=3
...     splits.add(DataSplitInfo(...))  # Raises IndexError once all 3 slots are filled
- get_split(index: int) → DataSplitInfo
Get a DataSplitInfo instance by index.
Retrieves the DataSplitInfo instance at the specified index. Validates that the index is within bounds and that a split exists at that index.
- Parameters:
- index : int
The index of the DataSplitInfo instance to retrieve. Must be between 0 and expected_n_splits - 1.
- Returns:
- DataSplitInfo
The DataSplitInfo instance at the specified index
- Raises:
- IndexError
If the index is out of bounds (not between 0 and expected_n_splits - 1)
- ValueError
If no DataSplitInfo instance has been assigned to the specified index
Notes
The method performs two validation checks:
1. Ensures the index is within the valid range.
2. Ensures a split has been assigned to that index.
Examples
- Get a split by index:
>>> splits = DataSplits(n_splits=3)
>>> splits.add(DataSplitInfo(...))  # Adds to index 0
>>> split_0 = splits.get_split(0)
- Attempt to access out-of-bounds index:
>>> splits.get_split(5) # Raises IndexError
- Attempt to access unassigned index:
>>> splits.get_split(1) # Raises ValueError (no split at index 1)
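Because the two failure modes use different exception types, a caller can treat them differently. The sketch below treats an unassigned slot as "not ready yet" and returns None, while leaving an out-of-bounds index as an error. get_split_or_none and _SlotStub are hypothetical names, and the stub only imitates the documented checks of get_split.

```python
from typing import Any, List, Optional


def get_split_or_none(container: Any, index: int) -> Optional[Any]:
    """Return the split at index, or None if that slot is still unassigned."""
    try:
        return container.get_split(index)
    except ValueError:
        # In range but empty: documented as ValueError.
        return None
    # IndexError (out of bounds) is deliberately left to propagate.


class _SlotStub:
    """Hypothetical stand-in that mimics the two documented checks."""

    def __init__(self, slots: List[Optional[str]]) -> None:
        self._slots = slots

    def get_split(self, index: int) -> str:
        if not 0 <= index < len(self._slots):
            raise IndexError("index out of bounds")
        value = self._slots[index]
        if value is None:
            raise ValueError("no split assigned at this index")
        return value


stub = _SlotStub(["fold-0", None, None])
print(get_split_or_none(stub, 0))  # the assigned split
print(get_split_or_none(stub, 1))  # None, since slot 1 is empty
```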