DataSplits

class DataSplits(n_splits: int)

Bases: object

Container for DataSplitInfo instances created from the same dataset.

This class provides a fixed-capacity container for managing multiple DataSplitInfo instances that represent different splits of the same dataset. It accepts only DataSplitInfo instances (add raises TypeError for anything else) and provides index-based access to individual splits.

Parameters:
n_splits : int

The number of splits to be stored in the container. Must be a positive integer.

Attributes:
_data_splits : List[DataSplitInfo or None]

A list of DataSplitInfo instances, initialized with None values.

_current_index : int

The index where the next split will be added.

expected_n_splits : int

The total number of splits the container can hold.

Raises:
ValueError

If n_splits is not a positive integer

Notes

The container is initialized with a fixed capacity and fills slots sequentially. Once all slots are filled, no more splits can be added.
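The fixed-capacity, sequential-fill behaviour described here can be illustrated with a minimal sketch. FixedSlots and its attribute names are hypothetical stand-ins written for this illustration; they are not the library's implementation, and the exact validation the real constructor performs is an assumption.

```python
class FixedSlots:
    """Hypothetical stand-in for a fixed-capacity container (not the library's code)."""

    def __init__(self, n_slots: int) -> None:
        # Assumed validation: capacity must be a positive integer.
        if not isinstance(n_slots, int) or n_slots <= 0:
            raise ValueError("n_slots must be a positive integer")
        self.expected_n_slots = n_slots
        self._slots = [None] * n_slots  # every slot starts empty
        self._current_index = 0         # index of the next slot to fill

    def add(self, item: object) -> None:
        # Refuse new items once every slot has been filled.
        if self._current_index >= self.expected_n_slots:
            raise IndexError("container is full")
        self._slots[self._current_index] = item
        self._current_index += 1


slots = FixedSlots(3)
slots.add("a")
slots.add("b")
# Two slots are filled in order; the third is still None.
```

Pre-allocating the list with None makes it possible to tell an unassigned slot apart from an out-of-range index, which is the distinction get_split draws between ValueError and IndexError.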

Examples

Create a container for 5 splits:

>>> splits = DataSplits(n_splits=5)
>>> print(len(splits))  # 5

Add splits to the container:

>>> for i in range(5):
...     split_info = DataSplitInfo(...)  # Create split
...     splits.add(split_info)

Access a specific split:

>>> split_0 = splits.get_split(0)
>>> split_2 = splits.get_split(2)
add(split: DataSplitInfo) -> None

Add a DataSplitInfo instance to the container.

Adds a DataSplitInfo instance to the next available slot in the container. The split is added sequentially, starting from index 0.

Parameters:
split : DataSplitInfo

The DataSplitInfo instance to add to the container.

Raises:
IndexError

If the container is already full, that is, expected_n_splits splits have already been added

TypeError

If the split is not a DataSplitInfo instance

Notes

The method adds splits sequentially, incrementing the current index after each addition. Once all slots are filled, further calls raise IndexError.
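A minimal sketch of the add logic, written against a plain list rather than the real class. The helper add_item is hypothetical, and the order of the two checks (type first, then capacity) is an assumption; the documentation does not say which error takes precedence.

```python
from typing import Any, List, Optional


def add_item(slots: List[Optional[Any]], next_index: int, item: Any, item_type: type) -> int:
    """Place item in the next free slot and return the updated index (hypothetical helper)."""
    # Assumed order: the type check runs before the capacity check.
    if not isinstance(item, item_type):
        raise TypeError(f"expected {item_type.__name__}, got {type(item).__name__}")
    if next_index >= len(slots):
        raise IndexError("all slots are already filled")
    slots[next_index] = item
    return next_index + 1


slots = [None, None]
index = 0
index = add_item(slots, index, "first", str)
index = add_item(slots, index, "second", str)

# A third addition exceeds the capacity of two.
try:
    add_item(slots, index, "third", str)
    overflow_error = None
except IndexError as exc:
    overflow_error = exc
```

Callers that build splits in a loop may prefer to iterate exactly expected_n_splits times instead of relying on the IndexError to stop.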

Examples

Add a split to the container:

>>> splits = DataSplits(n_splits=3)
>>> split_info = DataSplitInfo(...)
>>> splits.add(split_info)

Attempt to add more splits than capacity:

>>> for i in range(4):  # More additions than the remaining capacity
...     splits.add(DataSplitInfo(...))  # Raises IndexError once all 3 slots are filled
get_split(index: int) -> DataSplitInfo

Get a DataSplitInfo instance by index.

Retrieves the DataSplitInfo instance at the specified index. Validates that the index is within bounds and that a split exists at that index.

Parameters:
index : int

The index of the DataSplitInfo instance to retrieve. Must be between 0 and expected_n_splits - 1.

Returns:
DataSplitInfo

The DataSplitInfo instance at the specified index

Raises:
IndexError

If the index is out of bounds (not between 0 and expected_n_splits - 1)

ValueError

If no DataSplitInfo instance has been assigned to the specified index

Notes

The method performs two validation checks:

1. Ensures the index is within the valid range
2. Ensures a split has been assigned to that index
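The two checks can be sketched on a plain list of slots. The function get_slot is a hypothetical illustration, not the library's code, and rejecting negative indices is an assumption based on the stated range of 0 to expected_n_splits - 1.

```python
from typing import Any, List, Optional


def get_slot(slots: List[Optional[Any]], index: int) -> Any:
    """Return the item at index after both validation checks (hypothetical helper)."""
    # Check 1: the index must lie in [0, len(slots) - 1].
    if not 0 <= index < len(slots):
        raise IndexError(f"index {index} is out of bounds for {len(slots)} slots")
    # Check 2: the slot must already hold an item.
    item = slots[index]
    if item is None:
        raise ValueError(f"no item has been assigned to index {index}")
    return item


slots = ["split 0", None, None]
first = get_slot(slots, 0)
```

Running the range check first means an out-of-range index is always reported as IndexError, while ValueError is reserved for valid indices whose slot is still empty.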

Examples

Get a split by index:

>>> splits = DataSplits(n_splits=3)
>>> splits.add(DataSplitInfo(...))  # Adds to index 0
>>> split_0 = splits.get_split(0)

Attempt to access out-of-bounds index:

>>> splits.get_split(5)  # Raises IndexError

Attempt to access unassigned index:

>>> splits.get_split(1)  # Raises ValueError (no split at index 1)