Configuration

class Configuration(default_workflow: str, default_algorithms: List[str], categorical_features: Dict[str, List[str]] | None = None, default_workflow_args: Dict[str, Any] | None = None, plot_settings: PlotSettings | None = None)

User interface for defining experiment configurations.

This class provides a simple interface for users to define experiment groups and their configurations. It handles default values, ensures unique group names, and provides validation for configuration parameters.

Parameters:
default_workflow : str

Default workflow name to use for experiment groups

default_algorithms : List[str]

List of algorithm names to use as defaults when none are specified

categorical_features : Dict[str, List[str]], optional

Dictionary mapping dataset identifiers to lists of categorical feature names, by default None

default_workflow_args : Dict[str, Any], optional

Default values to assign as attributes of the Workflow class, by default None

plot_settings : PlotSettings, optional

Plot configuration settings, by default None

Attributes:
default_workflow : str

Default workflow name for experiment groups

experiment_groups : List[ExperimentGroup]

List of configured experiment groups

default_algorithms : List[str]

List of default algorithm names

categorical_features : Dict[str, List[str]]

Mapping of dataset identifiers to categorical feature lists

default_workflow_args : Dict[str, Any]

Default workflow arguments

plot_settings : PlotSettings

Plot configuration settings

Notes

The Configuration class is the main user interface for setting up experiments. Experiment groups are added one call at a time with add_experiment_group(), which validates its arguments and fills in defaults for any option left unset. Because add_experiment_group() returns None, calls cannot be chained.
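
The default value assignment described above can be illustrated with a minimal, standalone sketch. The helper name resolve_group_settings is hypothetical and is not part of this class. It assumes that an option left as None falls back to the corresponding default; that fallback is documented for workflow and algorithms and is only assumed here for workflow_args.

```python
from typing import Any, Dict, List, Optional


def resolve_group_settings(
    default_workflow: str,
    default_algorithms: List[str],
    default_workflow_args: Optional[Dict[str, Any]] = None,
    *,
    workflow: Optional[str] = None,
    algorithms: Optional[List[str]] = None,
    workflow_args: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: fill unset group options from the defaults."""
    defaults_for_args = default_workflow_args or {}
    return {
        # None means "not specified", so the default is used instead.
        "workflow": default_workflow if workflow is None else workflow,
        "algorithms": list(
            default_algorithms if algorithms is None else algorithms
        ),
        # Assumption: workflow_args falls back to default_workflow_args.
        "workflow_args": dict(
            defaults_for_args if workflow_args is None else workflow_args
        ),
    }


# Only algorithms is overridden; the workflow comes from the defaults.
settings = resolve_group_settings(
    "workflow", ["linear", "ridge"], algorithms=["svm"]
)
```

Copying the lists and dictionaries keeps later changes to a group's settings from leaking back into the shared defaults.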

Examples

Create a simple configuration:
>>> config = Configuration(
...     default_workflow="workflow",
...     default_algorithms=["linear", "ridge"]
... )
Add experiment groups:
>>> config.add_experiment_group(
...     name="baseline",
...     datasets=["data1.csv", "data2.csv"],
...     algorithms=["linear", "svm"]
... )
Build the configuration manager:
>>> manager = config.build()
add_experiment_group(*, name: str, datasets: List[str | Tuple[str, str]], data_config: Dict[str, Any] | None = None, algorithms: List[str] | None = None, algorithm_config: Dict[str, Dict[str, Any]] | None = None, description: str | None = '', workflow: str | None = None, workflow_args: Dict[str, Any] | None = None) -> None

Add a new ExperimentGroup to the configuration.

Adds a new experiment group with the specified parameters. Validates the group name uniqueness and dataset format before adding.

Parameters:
name : str

Unique identifier for the experiment group

datasets : List[str | Tuple[str, str]]

List of dataset paths relative to the datasets directory. Each entry is either a string (a dataset file) or a tuple of (dataset_file, table_name) for multi-table databases

data_config : Dict[str, Any], optional

Arguments for the DataManager used by this experiment group, by default None

algorithms : List[str], optional

List of algorithm names to use. If None, default_algorithms is used, by default None

algorithm_config : Dict[str, Dict[str, Any]], optional

Algorithm-specific configurations that override values set in algorithms.py, by default None

description : str, optional

Human-readable description for the experiment group, by default ""

workflow : str, optional

Name of the workflow file to use (without the .py extension). If None, default_workflow is used, by default None

workflow_args : Dict[str, Any], optional

Values to assign as attributes of the Workflow class. Must have the same keys as default_workflow_args, by default None

Raises:
ValueError

If group name already exists or workflow_args keys don’t match default_workflow_args

TypeError

If datasets contains invalid types (must be strings or tuples)

Notes

The method performs the following checks and conversions:

1. Ensures the group name is unique
2. Validates the dataset format (strings or tuples of strings)
3. Validates that the workflow_args keys match default_workflow_args
4. Converts string datasets to (dataset, None) tuples
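
The checks listed above can be sketched as two standalone functions. This is an illustration under stated assumptions, not the method's actual implementation: the names normalize_datasets and check_group are hypothetical, and the error messages are invented for the example.

```python
from typing import Any, Dict, List, Optional, Set, Tuple, Union

DatasetSpec = Union[str, Tuple[str, str]]


def normalize_datasets(
    datasets: List[DatasetSpec],
) -> List[Tuple[str, Optional[str]]]:
    """Hypothetical helper: validate entries and convert strings to tuples."""
    normalized: List[Tuple[str, Optional[str]]] = []
    for entry in datasets:
        if isinstance(entry, str):
            # A bare file name has no table, so the table slot is None.
            normalized.append((entry, None))
        elif (
            isinstance(entry, tuple)
            and len(entry) == 2
            and all(isinstance(part, str) for part in entry)
        ):
            normalized.append(entry)
        else:
            raise TypeError(
                f"datasets entries must be str or (str, str), got {entry!r}"
            )
    return normalized


def check_group(
    name: str,
    existing_names: Set[str],
    workflow_args: Optional[Dict[str, Any]],
    default_workflow_args: Optional[Dict[str, Any]],
) -> None:
    """Hypothetical helper: enforce name uniqueness and matching keys."""
    if name in existing_names:
        raise ValueError(f"experiment group {name!r} already exists")
    if workflow_args is not None:
        expected = set(default_workflow_args or {})
        if set(workflow_args) != expected:
            raise ValueError("workflow_args keys must match default_workflow_args")


pairs = normalize_datasets([("data.xlsx", "Sheet1"), "data2.csv"])
```

Normalizing every entry to a two-element tuple means later code can handle single-file and multi-table datasets the same way.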

Examples

Add a simple experiment group:
>>> config.add_experiment_group(
...     name="baseline",
...     datasets=["data.csv"]
... )
Add group with custom settings:
>>> config.add_experiment_group(
...     name="advanced",
...     datasets=[("data.xlsx", "Sheet1"), "data2.csv"],
...     algorithms=["svm", "rf"],
...     data_config={"test_size": 0.3},
...     description="Advanced experiment with custom settings"
... )
build() -> ConfigurationManager

Build and return a ConfigurationManager instance.

Processes all experiment groups and creates a ConfigurationManager that can execute the experiments. It also exports the configuration parameters so that the experiments can be rerun.

Returns:
ConfigurationManager

Fully configured manager ready to execute experiments

Notes

The build process:

1. Exports configuration parameters for rerun functionality
2. Creates a ConfigurationManager with all experiment groups
3. Sets up data managers, algorithm configurations, and workflows
4. Prepares the complete experiment execution environment

Examples

Build and use the configuration:
>>> config = Configuration("workflow", ["linear", "ridge"])
>>> config.add_experiment_group(name="test", datasets=["data.csv"])
>>> manager = config.build()
>>> # manager is ready to execute experiments
export_params() -> None

Export configuration parameters for rerun functionality.

Serializes the current configuration to a format that can be used to recreate the experiment setup during rerun operations. This includes all experiment groups, categorical features, and plot settings.

Notes

The exported parameters include:

- Default workflow and algorithms
- Categorical features mapping
- Plot settings configuration
- All experiment group configurations
- Dataset metadata for validation

This data is used by the rerun system to ensure experiments can be reproduced with identical configurations.
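
To make the idea of a reproducible export concrete, here is a minimal sketch of a serialization round trip. The export format is not described on this page, so the use of JSON, the function name export_params_sketch, and the dictionary keys are all assumptions made for illustration. Dataset metadata is left out of the sketch.

```python
import json
from typing import Any, Dict


def export_params_sketch(config: Dict[str, Any]) -> str:
    """Hypothetical helper: serialize a configuration snapshot to JSON."""
    # Sorted keys make the output deterministic, which helps when
    # comparing an exported configuration against a rerun.
    return json.dumps(config, sort_keys=True, indent=2)


# Assumed snapshot layout, mirroring the items listed above.
snapshot = {
    "default_workflow": "workflow",
    "default_algorithms": ["linear"],
    "categorical_features": {},
    "plot_settings": None,
    "experiment_groups": [
        {"name": "test", "datasets": [["data.csv", None]]},
    ],
}

serialized = export_params_sketch(snapshot)
restored = json.loads(serialized)
```

Because the snapshot holds only JSON-compatible values, loading the exported text gives back an equal dictionary, which is the property a rerun depends on.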

Examples

Export is called automatically during build():
>>> config = Configuration("workflow", ["linear"])
>>> config.add_experiment_group(name="test", datasets=["data.csv"])
>>> manager = config.build()  # export_params() called automatically
set_services(services: ServiceBundle | None = None)