Rerun Service

class RerunService(name: str, mode: str = 'capture', rerun_config: Dict[str, Any] | None = None)

Bases: BaseService

Main service class for managing rerun functionality.

This service provides comprehensive rerun functionality for the Brisk package, enabling exact reproduction of machine learning experiments by capturing runtime configurations and providing mechanisms to restore them. It implements the Strategy pattern to handle different modes of operation: capture and coordinate.

The service supports two modes:

- Capture mode: Collects and stores all experiment configurations during execution
- Coordinate mode: Reconstructs and provides configurations for exact reproduction

Attributes:
configs: Dict[str, Any]

Dictionary storing all captured configuration data

strategy: RerunStrategy

The current strategy implementation

mode: str

The current mode of operation (“capture” or “coordinate”)

is_coordinating: bool

Boolean flag indicating if the service is in coordinating mode

Notes

The service uses the Strategy pattern to switch between capture and coordinate modes without changing the core implementation. All configuration data is stored in memory and can be exported to a JSON file for later use.
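
The mode switch described in this note can be sketched with stand-in classes. This is a minimal illustration, not the Brisk implementation: `MiniRerunService`, `Capture`, `Coordinate`, and the `"metrics"` key are hypothetical names, and only one of the `handle_load_*` hooks is modelled.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class Strategy(ABC):
    """Stand-in for RerunStrategy with a single hook (hypothetical)."""

    @abstractmethod
    def handle_load_metric_config(self, metric_config: Any) -> Any:
        ...


class Capture(Strategy):
    """Return the loaded object unchanged and record it."""

    def __init__(self, configs: Dict[str, Any]) -> None:
        self.configs = configs

    def handle_load_metric_config(self, metric_config: Any) -> Any:
        self.configs["metrics"] = metric_config  # hypothetical key
        return metric_config


class Coordinate(Strategy):
    """Ignore the loaded object and return the recorded one."""

    def __init__(self, config_data: Dict[str, Any]) -> None:
        self.config_data = config_data

    def handle_load_metric_config(self, metric_config: Any) -> Any:
        return self.config_data["metrics"]


class MiniRerunService:
    """Picks a strategy once, then delegates every call to it."""

    def __init__(self, mode: str = "capture",
                 rerun_config: Optional[Dict[str, Any]] = None) -> None:
        if mode not in ("capture", "coordinate"):
            raise ValueError(f"unknown mode: {mode!r}")
        self.mode = mode
        self.is_coordinating = mode == "coordinate"
        self.configs: Dict[str, Any] = {}
        if self.is_coordinating:
            self.strategy: Strategy = Coordinate(rerun_config or {})
        else:
            self.strategy = Capture(self.configs)

    def handle_load_metric_config(self, metric_config: Any) -> Any:
        # The service never branches on the mode here; the strategy decides.
        return self.strategy.handle_load_metric_config(metric_config)
```

Because callers only ever talk to the service, the same loading code can run unchanged in both modes.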

Examples

>>> import json
>>>
>>> # Capture mode
>>> rerun_service = RerunService("rerun", mode="capture")
>>> # ... run experiments ...
>>> rerun_service.export_and_save(results_dir)
>>>
>>> # Coordinate mode
>>> with open("run_config.json") as f:
...     config_data = json.load(f)
>>> rerun_service = RerunService("rerun", mode="coordinate", rerun_config=config_data)
add_algorithm_config(algorithm_configs: List[Dict[str, Any]]) -> None

Store algorithm configuration data for rerun functionality.

Parameters:
algorithm_configs: List[Dict[str, Any]]

List of algorithm configurations exported from AlgorithmCollection.export_params()

add_base_data_manager(config: Dict[str, Any]) -> None
add_configuration(configuration: Dict[str, Any]) -> None
add_evaluators_config(evaluators_config: Dict[str, Any] | None) -> None

Store evaluators configuration data for rerun functionality.

Parameters:
evaluators_config: Optional[Dict[str, Any]]

Evaluators configuration exported from EvaluationManager.export_evaluators_config(). Can be None if no custom evaluators exist.

add_experiment_groups(groups: List[Dict[str, Any]]) -> None
add_metric_config(metric_configs: List[Dict[str, Any]]) -> None

Store metric configuration data for rerun functionality.

Parameters:
metric_configs: List[Dict[str, Any]]

List of metric configurations exported from MetricManager.export_params()

add_workflow_file(workflow_name: str, class_name: str)
capture_environment() -> None

Capture structured environment information.

collect_dataset_metadata(groups_json: List[Dict[str, Any]]) -> None

Collect metadata about all datasets used in experiment groups for rerun functionality.

Captures dataset metadata including filename, table name, file size, and feature names to verify dataset compatibility during rerun.

Parameters:
groups_json: List[Dict[str, Any]]

List of experiment group configurations containing dataset information
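
As a rough illustration of the metadata fields listed above, the sketch below gathers them for a single CSV file. The helper name `describe_csv` and the dictionary keys are hypothetical, and it assumes that the first row holds the column names and that every column counts as a feature; the real method may differ on all of these points.

```python
import csv
from pathlib import Path
from typing import Any, Dict, Optional


def describe_csv(path: Path, table_name: Optional[str] = None) -> Dict[str, Any]:
    """Collect filename, table name, file size, and feature names for one file."""
    with path.open(newline="") as handle:
        # Assumption: the first row of the file is the header.
        header = next(csv.reader(handle), [])
    return {
        "filename": path.name,
        "table_name": table_name,  # only meaningful for database-backed datasets
        "file_size": path.stat().st_size,  # size in bytes
        "feature_names": header,
    }
```

Comparing such a record against the file found at rerun time is one way to detect that a dataset was replaced or its columns changed.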

export_and_save(results_dir: Path) -> None

Write the run configuration to results/run_config.json.

This method exports all captured configuration data to a JSON file in the results directory. The file can be used later to reproduce the exact experiment configuration.

Parameters:
results_dir: Path

The results directory where the configuration file will be saved

Notes

This method only operates in capture mode. In coordinate mode, the configuration data is already loaded from an existing file. The exported file contains all necessary information to reproduce the experiment including data managers, algorithms, evaluators, workflows, metrics, and environment information.
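
A minimal sketch of the export step, assuming the captured data is a plain dictionary. The function name `save_run_config` and the use of `default=str` for values that are not JSON-serializable are assumptions made for the illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict


def save_run_config(configs: Dict[str, Any], results_dir: Path) -> Path:
    """Write the captured configuration to <results_dir>/run_config.json."""
    results_dir.mkdir(parents=True, exist_ok=True)
    target = results_dir / "run_config.json"
    with target.open("w", encoding="utf-8") as handle:
        # default=str turns Path objects and similar values into strings.
        json.dump(configs, handle, indent=2, default=str)
    return target
```

The file written this way can be read back with `json.load` and passed as `rerun_config` when the service is created in coordinate mode.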

get_configuration_args() -> Dict
get_experiment_groups()
handle_load_algorithms(algorithm_config: Path) -> Any

Delegate to current strategy.

handle_load_base_data_manager(data_manager) -> Any

Delegate to current strategy.

Parameters:
data_manager: DataManager

The DataManager instance loaded by the IOService from data.py

handle_load_custom_evaluators(module, evaluators_file: Path) -> Any

Delegate to current strategy.

handle_load_metric_config(metric_config) -> Any

Delegate to current strategy.

handle_load_workflow(workflow, workflow_name: str) -> Any

Delegate to current strategy.

reconstruct_plot_settings(plot_settings_data: Dict[str, Any]) -> PlotSettings

Reconstruct a PlotSettings instance from exported parameters.

Parameters:
plot_settings_data: dict

Dictionary containing exported PlotSettings data

Returns:
PlotSettings

Reconstructed PlotSettings instance

class RerunStrategy

Bases: ABC

Abstract base class for rerun strategy implementations.

This abstract base class defines the interface for different rerun strategies used by the RerunService. It implements the Strategy pattern to allow different behaviors for capturing and coordinating experiment configurations.

The strategy pattern enables the RerunService to switch between different modes of operation (capture vs coordinate) without changing the core service implementation.

Notes

All concrete strategy implementations must implement all abstract methods to handle the loading and processing of different experiment components.
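
The requirement above is enforced by Python's `abc` machinery rather than by convention. The sketch below uses a hypothetical two-method base class, `TwoHookStrategy`, to show what happens when a subclass leaves one abstract method out.

```python
from abc import ABC, abstractmethod
from typing import Any


class TwoHookStrategy(ABC):
    """Hypothetical, reduced stand-in for RerunStrategy."""

    @abstractmethod
    def handle_load_workflow(self, workflow: Any, workflow_name: str) -> Any:
        ...

    @abstractmethod
    def handle_load_metric_config(self, metric_config: Any) -> Any:
        ...


class Incomplete(TwoHookStrategy):
    """Implements only one of the two abstract methods."""

    def handle_load_workflow(self, workflow: Any, workflow_name: str) -> Any:
        return workflow


message = ""
try:
    Incomplete()  # instantiation fails because one method is still abstract
except TypeError as error:
    message = str(error)
```

The `TypeError` is raised at instantiation time and names the missing method, so an incomplete strategy fails early instead of partway through an experiment.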

Examples

>>> class CustomRerunStrategy(RerunStrategy):
...     def handle_load_base_data_manager(self, data_manager):
...         # Custom implementation
...         return data_manager
...     # ... implement other abstract methods
abstractmethod handle_load_algorithms(algorithm_config: Any) -> Any

Handle loading algorithms.

Parameters:
algorithm_config: Any

The algorithm configuration to be processed

Returns:
Any

The processed algorithm configuration

abstractmethod handle_load_base_data_manager(data_manager: Any) -> Any

Handle loading base data manager.

Parameters:
data_manager: Any

The data manager instance to be processed

Returns:
Any

The processed data manager instance

abstractmethod handle_load_custom_evaluators(module: Any, evaluators_file: Path) -> Any

Handle loading custom evaluators.

Parameters:
module: Any

The evaluators module to be processed

evaluators_file: Path

The path to the evaluators file

Returns:
Any

The processed evaluators module

abstractmethod handle_load_metric_config(metric_config: Any) -> Any

Handle loading metric config.

Parameters:
metric_config: Any

The metric configuration to be processed

Returns:
Any

The processed metric configuration

abstractmethod handle_load_workflow(workflow: Any, workflow_name: str) -> Any

Handle loading workflow.

Parameters:
workflow: Any

The workflow to be processed

workflow_name: str

The name of the workflow

Returns:
Any

The processed workflow

class CaptureStrategy(rerun_service: RerunService)

Bases: RerunStrategy

Strategy for capture mode: stores data for the config file.

This strategy is used during experiment execution to capture and store all configuration data needed for exact reproduction. It loads components normally while simultaneously capturing their configuration parameters and file contents for later use in coordinate mode.

The strategy captures:

- Data manager parameters and preprocessor configurations
- Algorithm configurations from algorithms.py files
- Custom evaluators from evaluators.py files
- Workflow files and class names
- Metric configurations from metrics.py files

Attributes:
rerun_service: RerunService

The rerun service instance for storing captured configurations

Notes

This strategy operates in “pass-through” mode, loading components normally while capturing their configuration data for later reproduction.

Examples

>>> rerun_service = RerunService("rerun", mode="capture")
>>> strategy = CaptureStrategy(rerun_service)
>>> data_manager = strategy.handle_load_base_data_manager(data_manager)
handle_load_algorithms(algorithm_config: Any) -> Any

Load algorithms normally and capture their config.

This method loads the algorithm configuration normally while capturing the complete algorithms.py file content for later reproduction. The file content is stored as-is to preserve all algorithm definitions and configurations.

Parameters:
algorithm_config: Any

The algorithm configuration loaded by the IOService

Returns:
Any

The original algorithm configuration (pass-through behavior)

Notes

The captured file content includes the entire algorithms.py file, preserving all algorithm definitions, hyperparameter grids, and any custom configurations that might be present.

handle_load_base_data_manager(data_manager: Any) -> Any

Load data manager normally and capture its config.

This method loads the data manager normally while capturing its configuration parameters for later reproduction. The configuration includes all data manager parameters and preprocessor settings.

Parameters:
data_manager: Any

The DataManager instance loaded by the IOService from data.py

Returns:
Any

The original data manager instance (pass-through behavior)

Notes

The captured configuration includes test_size, n_splits, split_method, group_column, stratified flag, random_state, and all preprocessor configurations.
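
To make the captured fields concrete, here is a sketch that exports them from a stand-in object. The attribute names follow the list above, but `stratified` and `preprocessors` as attribute names, the output layout, and the helper `export_data_manager` are assumptions; the demo objects are not Brisk classes.

```python
from types import SimpleNamespace
from typing import Any, Dict

# Parameter names taken from the note above; "stratified" is an assumed attribute name.
PARAM_NAMES = ("test_size", "n_splits", "split_method",
               "group_column", "stratified", "random_state")


def export_data_manager(data_manager: Any) -> Dict[str, Any]:
    """Copy split parameters and preprocessor settings into plain dictionaries."""
    params = {name: getattr(data_manager, name, None) for name in PARAM_NAMES}
    preprocessors = [
        {"class_name": type(step).__name__, "params": dict(vars(step))}
        for step in getattr(data_manager, "preprocessors", [])
    ]
    return {"params": params, "preprocessors": preprocessors}


class DemoScaler:
    """Stand-in preprocessor used only for this demonstration."""

    def __init__(self, with_mean: bool = True) -> None:
        self.with_mean = with_mean


demo_manager = SimpleNamespace(
    test_size=0.2, n_splits=5, split_method="shuffle", group_column=None,
    stratified=False, random_state=42, preprocessors=[DemoScaler(with_mean=False)],
)
exported = export_data_manager(demo_manager)
```

Storing class names and constructor arguments, rather than the objects themselves, keeps the result JSON-serializable.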

handle_load_custom_evaluators(module: Any, evaluators_file: Path) -> Any

Load evaluators normally and capture their config.

This method loads the custom evaluators module normally while capturing the complete evaluators.py file content for later reproduction. Custom evaluators often have complex dependencies and user-defined classes that are best replicated as a complete file.

Parameters:
module: Any

The evaluators module loaded by the IOService

evaluators_file: Path

The path to the evaluators.py file

Returns:
Any

The original evaluators module (pass-through behavior)

Notes

The captured file content includes the entire evaluators.py file, preserving all custom evaluator definitions, imports, and any complex dependencies that might be present.
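
The pass-through behavior can be sketched in a few lines: record the file text, then return the already-loaded module unchanged. The function name `capture_evaluators` and the dictionary keys are hypothetical.

```python
from pathlib import Path
from typing import Any, Dict


def capture_evaluators(configs: Dict[str, Any], module: Any,
                       evaluators_file: Path) -> Any:
    """Store the whole file as text, then hand the loaded module back untouched."""
    configs["evaluators"] = {  # hypothetical key layout
        "filename": evaluators_file.name,
        "file_content": evaluators_file.read_text(encoding="utf-8"),
    }
    return module
```

Keeping the file as text means imports, helper functions, and user-defined classes travel with the configuration without needing to be serialized individually.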

handle_load_metric_config(metric_config: Any) -> Any

Load metric config normally and capture its content.

This method loads the metric configuration normally while capturing the complete metrics.py file content for later reproduction. The file content is stored as-is to preserve all metric definitions and configurations.

Parameters:
metric_config: Any

The metric configuration loaded by the IOService

Returns:
Any

The original metric configuration (pass-through behavior)

Notes

The captured file content includes the entire metrics.py file, preserving all metric definitions, display names, and any custom configurations that might be present.

handle_load_workflow(workflow: Any, workflow_name: str) -> Any

Load workflow normally and capture its content.

This method loads the workflow normally while capturing the workflow file content and class name for later reproduction. The workflow file is read from the workflows directory.

Parameters:
workflow: Any

The workflow class loaded by the IOService

workflow_name: str

The name of the workflow file (without .py extension)

Returns:
Any

The original workflow class (pass-through behavior)

Notes

The captured workflow includes the complete file content and the class name, enabling exact reproduction of the workflow.

class CoordinatingStrategy(config_data: Dict[str, Any])

Bases: RerunStrategy

Strategy for coordinating mode: provides data from the config file.

This strategy is used during experiment reproduction to reconstruct all components from previously captured configuration data. It creates temporary files from captured content and reconstructs objects to enable exact reproduction of the original experiment.

The strategy reconstructs:

- Data managers with original parameters and preprocessors
- Algorithm configurations from captured file content
- Custom evaluators from captured file content
- Workflow classes from captured file content
- Metric configurations from captured file content

Attributes:
config_data: Dict[str, Any]

The captured configuration data from the original experiment

_reconstructed_objects: Dict[str, Any]

Cache for reconstructed objects to avoid duplicate reconstruction

_temp_files: List[Path]

List of temporary files created during reconstruction

Notes

This strategy operates in “reconstruction” mode, creating temporary files from captured content and reconstructing objects to match the original experiment configuration exactly.

Examples

>>> import json
>>> with open("run_config.json") as f:
...     config_data = json.load(f)
>>> strategy = CoordinatingStrategy(config_data)
>>> data_manager = strategy.handle_load_base_data_manager(None)
cleanup_temp_files() -> None

Clean up all temporary files created during coordination.

This method removes all temporary files created during the reconstruction process. It should be called when the strategy is no longer needed to free up disk space.

Notes

The method attempts to remove each temporary file and logs warnings for any files that cannot be removed. The _temp_files list is cleared after cleanup.
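
The cleanup behavior described in this note can be sketched as a standalone function. The real method operates on the strategy's `_temp_files` attribute; the free function and the logger set up here are assumptions made to keep the example self-contained.

```python
import logging
from pathlib import Path
from typing import List

logger = logging.getLogger(__name__)


def cleanup_temp_files(temp_files: List[Path]) -> None:
    """Remove every tracked file, warn about failures, then clear the list."""
    for path in temp_files:
        try:
            path.unlink()
        except OSError as error:
            # A missing or locked file should not stop the remaining removals.
            logger.warning("Could not remove temporary file %s: %s", path, error)
    temp_files.clear()
```

Catching `OSError` per file means one failed removal does not leave the remaining temporary files behind.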

handle_load_algorithms(algorithm_config: Any) -> Any

Provide algorithms from config instead of loading from file.

This method reconstructs an AlgorithmCollection instance from the captured algorithms.py file content. It creates a temporary file with the original content and loads the configuration from it.

Parameters:
algorithm_config: Any

Ignored parameter (maintained for interface compatibility)

Returns:
algorithm_collection.AlgorithmCollection

A reconstructed AlgorithmCollection instance with original algorithm definitions and configurations

Raises:
ValueError

If no algorithms configuration is found or if the loaded object is not a valid AlgorithmCollection

FileNotFoundError

If the temporary file cannot be created
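
One way to load a module from captured file text is to write it to a temporary `.py` file and import it with `importlib`. The sketch below shows that technique using only the standard library; the helper name `load_module_from_text` is hypothetical, and the actual method may load the file differently.

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType
from typing import List


def load_module_from_text(source: str, module_name: str,
                          temp_files: List[Path]) -> ModuleType:
    """Write captured source to a temporary file and import it as a module."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".py", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(source)
        temp_path = Path(handle.name)
    # Track the file so it can be removed once the rerun is finished.
    temp_files.append(temp_path)

    spec = importlib.util.spec_from_file_location(module_name, temp_path)
    if spec is None or spec.loader is None:
        raise ValueError(f"cannot build an import spec for {temp_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the captured code
    return module
```

After loading, the caller can look up the expected object on the returned module and validate its type, which corresponds to the `ValueError` case listed above.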

handle_load_base_data_manager(data_manager: Any) -> Any

Provide base data manager from config instead of loading from file.

This method reconstructs a DataManager instance from the captured configuration data, including all original parameters and preprocessors. The reconstructed data manager will have identical behavior to the original one used in the experiment.

Parameters:
data_manager: Any

Ignored parameter (maintained for interface compatibility)

Returns:
data_manager_module.DataManager

A reconstructed DataManager instance with original parameters and preprocessors

Raises:
KeyError

If required configuration data is missing

ValueError

If preprocessor class names are not recognized

Notes

The reconstruction process:

1. Extracts data manager parameters from captured config
2. Reconstructs preprocessor instances with original parameters
3. Creates new DataManager with original configuration
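
The three reconstruction steps can be sketched with stand-in classes. `SimpleDataManager`, `Scaler`, `PREPROCESSOR_REGISTRY`, and the config layout (`"params"`, `"preprocessors"`, `"class_name"`) are all hypothetical; the sketch only illustrates how a missing key surfaces as `KeyError` and an unknown class name as `ValueError`.

```python
from typing import Any, Dict, List


class Scaler:
    """Stand-in preprocessor; not a Brisk class."""

    def __init__(self, with_mean: bool = True) -> None:
        self.with_mean = with_mean


class SimpleDataManager:
    """Stand-in for DataManager with a reduced parameter set."""

    def __init__(self, test_size: float, random_state: int,
                 preprocessors: List[Any]) -> None:
        self.test_size = test_size
        self.random_state = random_state
        self.preprocessors = preprocessors


# Maps captured class names back to classes (hypothetical registry).
PREPROCESSOR_REGISTRY = {"Scaler": Scaler}


def rebuild_data_manager(config: Dict[str, Any]) -> SimpleDataManager:
    # Step 1: extract parameters; a missing key raises KeyError.
    params = config["params"]
    # Step 2: rebuild each preprocessor from its class name and arguments.
    preprocessors = []
    for entry in config["preprocessors"]:
        cls = PREPROCESSOR_REGISTRY.get(entry["class_name"])
        if cls is None:
            raise ValueError(f"unknown preprocessor: {entry['class_name']!r}")
        preprocessors.append(cls(**entry["params"]))
    # Step 3: create a new manager with the original configuration.
    return SimpleDataManager(preprocessors=preprocessors, **params)
```

An explicit registry avoids importing arbitrary class names from the config file, at the cost of having to list every supported preprocessor.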

handle_load_custom_evaluators(module: Any, evaluators_file: Path) -> Any

Provide custom evaluators from config instead of loading from file.

handle_load_metric_config(metric_config) -> Any

Provide metric config from config instead of loading from file.

handle_load_workflow(workflow, workflow_name: str) -> Any

Provide workflow from config instead of loading from file.