Dataset Evaluators

class DatasetMeasureEvaluator(method_name: str, description: str)

Bases: BaseEvaluator

Template for dataset evaluators that calculate measures or plot data.

Abstract base class for evaluators that calculate measures or statistics on datasets. Provides a standardized workflow for dataset evaluation including data processing, metadata generation, result saving, and logging.

Parameters:
method_name : str

The name of the evaluator

description : str

The description of the evaluator output

Attributes:
method_name : str

The name of the evaluator

description : str

The description of the evaluator output

services : ServiceBundle or None

The global services bundle

metric_config : MetricManager or None

The metric configuration manager

primary_color : str

Primary color for plots and visualizations

secondary_color : str

Secondary color for plots and visualizations

accent_color : str

Accent color for plots and visualizations

Notes

This abstract base class provides a template for implementing dataset-level evaluation methods. Subclasses must implement the _calculate_measures method to define the specific evaluation logic.

The class handles the complete evaluation workflow:

1. Calculate measures using the implemented _calculate_measures method
2. Generate metadata for the results
3. Save the results to a JSON file with metadata
4. Log the results

Examples

Create a custom dataset measure evaluator:
>>> class CustomDatasetEvaluator(DatasetMeasureEvaluator):
...     def __init__(self):
...         super().__init__("custom_dataset", "Custom evaluation")
...     
...     def _calculate_measures(self, train_data, test_data, features):
...         # Custom measure calculation logic
...         return {"custom_metric": 0.85}

evaluate(train_data: DataFrame | Series, test_data: DataFrame | Series, feature_names: List[str], filename: str, dataset_name: str, group_name: str) -> Dict[str, Any]

Template for all measure methods to follow.

Executes the complete evaluation workflow for dataset measures. This method orchestrates the evaluation process by calling the abstract _calculate_measures method and handling result processing.

Parameters:
train_data : pd.DataFrame or pd.Series

The training data for evaluation

test_data : pd.DataFrame or pd.Series

The testing data for evaluation

feature_names : List[str]

The names of the features in the dataset

filename : str

The name of the file to save the results to (without extension)

dataset_name : str

The name of the dataset being evaluated

group_name : str

The name of the experiment group

Returns:
Dict[str, Any]

The results of the evaluation containing calculated measures

Notes

This method provides the standard workflow for dataset evaluation:

1. Calculate measures using _calculate_measures
2. Generate metadata for the results
3. Save the results to a JSON file
4. Log the results
5. Return the calculated measures

The method delegates the actual measure calculation to the _calculate_measures method, which must be implemented by subclasses.
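The workflow described in these notes is a template-method pattern. The sketch below is a self-contained stand-in, not the library's implementation: SketchMeasureEvaluator, _build_metadata, the output_dir argument, and the JSON layout are all assumptions made for illustration, and plain lists replace the pandas inputs so the block runs with the standard library alone.

```python
import json
import logging
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict, List

logger = logging.getLogger("sketch_evaluator")


class SketchMeasureEvaluator(ABC):
    """Hypothetical stand-in that mirrors the documented workflow."""

    def __init__(self, method_name: str, description: str, output_dir: Path):
        self.method_name = method_name
        self.description = description
        self.output_dir = output_dir  # assumption: the real class resolves paths itself

    @abstractmethod
    def _calculate_measures(self, train_data, test_data, feature_names) -> Dict[str, Any]:
        """Subclasses supply the actual measure logic."""

    def _build_metadata(self, dataset_name: str, group_name: str) -> Dict[str, str]:
        # Hypothetical helper; the real metadata fields are not documented here.
        return {
            "method": self.method_name,
            "description": self.description,
            "dataset": dataset_name,
            "group": group_name,
        }

    def evaluate(self, train_data, test_data, feature_names: List[str],
                 filename: str, dataset_name: str, group_name: str) -> Dict[str, Any]:
        # Step 1: delegate the measure calculation to the subclass.
        results = self._calculate_measures(train_data, test_data, feature_names)
        # Step 2: generate metadata describing the results.
        metadata = self._build_metadata(dataset_name, group_name)
        # Step 3: save results and metadata to a JSON file.
        out_path = self.output_dir / f"{filename}.json"
        out_path.write_text(json.dumps({"results": results, "_metadata": metadata}, indent=2))
        # Step 4: log the results.
        logger.info("%s results for %s: %s", self.method_name, dataset_name, results)
        # Step 5: return the calculated measures.
        return results


class RowCountEvaluator(SketchMeasureEvaluator):
    """Toy subclass: reports the size of each split."""

    def _calculate_measures(self, train_data, test_data, feature_names):
        return {
            "n_train": len(train_data),
            "n_test": len(test_data),
            "n_features": len(feature_names),
        }


output_dir = Path(tempfile.mkdtemp())
evaluator = RowCountEvaluator("row_count", "Row counts per split", output_dir)
measures = evaluator.evaluate(
    train_data=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    test_data=[[7.0, 8.0]],
    feature_names=["a", "b"],
    filename="row_count",
    dataset_name="toy",
    group_name="demo",
)
saved = json.loads((output_dir / "row_count.json").read_text())
```

Because the subclass overrides only _calculate_measures, every evaluator built this way would share the same saving and logging behaviour, which appears to be the point of the template.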

class DatasetPlotEvaluator(method_name: str, description: str, plot_settings)

Bases: BaseEvaluator

Template for evaluators that plot datasets.

Abstract base class for evaluators that create plots and visualizations from datasets. Provides a standardized workflow for dataset plotting including plot data generation, plot creation, metadata handling, and result saving.

Parameters:
method_name : str

The name of the evaluator

description : str

The description of the evaluator output

plot_settings : PlotSettings

The plot settings containing theme and color configuration

Attributes:
method_name : str

The name of the evaluator

description : str

The description of the evaluator output

theme : Any

The plot theme for styling plots

primary_color : str

Primary color for plots (from plot settings)

secondary_color : str

Secondary color for plots (from plot settings)

accent_color : str

Accent color for plots (from plot settings)

services : ServiceBundle or None

The global services bundle (inherited from BaseEvaluator)

metric_config : MetricManager or None

The metric configuration manager (inherited from BaseEvaluator)

Notes

This abstract base class provides a template for implementing dataset-level plotting methods. Subclasses must implement the _generate_plot_data and _create_plot methods to define the specific plotting logic.

The class handles the complete plotting workflow:

1. Generate plot data using the _generate_plot_data method
2. Create the plot using the _create_plot method
3. Generate metadata for the plot
4. Save the plot with metadata
5. Log the results

The constructor automatically configures matplotlib to use a non-interactive backend for thread safety and applies the provided plot settings.
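As a rough illustration of that configuration step, the sketch below selects a non-interactive matplotlib backend. Whether the library picks Agg specifically is an assumption, and configure_headless_backend is a hypothetical helper name; only matplotlib.use and matplotlib.get_backend are real matplotlib calls.

```python
def configure_headless_backend():
    """Select the non-interactive Agg backend.

    Returns the active backend name, or None when matplotlib is not installed.
    """
    try:
        import matplotlib
    except ImportError:
        return None
    # Agg renders to in-memory raster buffers and never opens a GUI window,
    # so figures can be created and saved from worker threads.
    matplotlib.use("Agg")
    return matplotlib.get_backend()


backend = configure_headless_backend()
```

Calling this before any figure is created avoids switching backends after windows already exist.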

Examples

Create a custom dataset plot evaluator:
>>> class CustomDatasetPlotEvaluator(DatasetPlotEvaluator):
...     def __init__(self, plot_settings):
...         super().__init__("custom", "Custom plot", plot_settings)
...     
...     def _generate_plot_data(self, train_data, test_data, **kwargs):
...         # Custom plot data generation logic
...         return plot_data
...     
...     def _create_plot(self, plot_data, **kwargs):
...         # Custom plot creation logic
...         return plot

plot(train_data: DataFrame | Series, test_data: DataFrame | Series, filename: str, dataset_name: str, group_name: str) -> None

Template for all plot methods to follow.

Executes the complete plotting workflow for dataset plots. This method orchestrates the plotting process by calling the abstract methods and handling plot processing.

Parameters:
train_data : pd.DataFrame or pd.Series

The training data for plotting

test_data : pd.DataFrame or pd.Series

The testing data for plotting

filename : str

The name of the file to save the plot to (without extension)

dataset_name : str

The name of the dataset being plotted

group_name : str

The name of the experiment group

Returns:
None

Notes

This method provides the standard workflow for dataset plotting:

1. Generate plot data using _generate_plot_data
2. Create the plot using _create_plot
3. Generate metadata for the plot
4. Save the plot with metadata
5. Log the results

The method delegates the actual plot data generation and plot creation to the abstract methods, which must be implemented by subclasses.
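The plotting workflow can be sketched in the same way. This is a standard-library stand-in, not the library's code: SketchPlotEvaluator, the output_dir argument, the metadata fields, and the choice of an SVG string as the "plot" object are assumptions made so the example runs without pandas or matplotlib.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict


class SketchPlotEvaluator(ABC):
    """Hypothetical stand-in that mirrors the documented plotting workflow."""

    def __init__(self, method_name: str, description: str, output_dir: Path):
        self.method_name = method_name
        self.description = description
        self.output_dir = output_dir  # assumption: not part of the documented signature

    @abstractmethod
    def _generate_plot_data(self, train_data, test_data, **kwargs) -> Any:
        """Subclasses turn the raw data into whatever the plot needs."""

    @abstractmethod
    def _create_plot(self, plot_data, **kwargs) -> str:
        """Subclasses render the plot; this sketch uses an SVG string."""

    def plot(self, train_data, test_data, filename: str,
             dataset_name: str, group_name: str) -> None:
        # Steps 1 and 2: delegate data preparation and rendering to the subclass.
        plot_data = self._generate_plot_data(train_data, test_data)
        figure = self._create_plot(plot_data)
        # Step 3: generate metadata for the plot.
        metadata = {"method": self.method_name, "dataset": dataset_name, "group": group_name}
        # Step 4: save the plot alongside its metadata.
        (self.output_dir / f"{filename}.svg").write_text(figure)
        (self.output_dir / f"{filename}.json").write_text(json.dumps(metadata))
        # Step 5: log the result (print stands in for the real logger).
        print(f"Saved {self.method_name} plot for {dataset_name}")


class SplitSizePlot(SketchPlotEvaluator):
    """Toy subclass: one bar per split, ten pixels wide per row."""

    def _generate_plot_data(self, train_data, test_data, **kwargs) -> Dict[str, int]:
        return {"train": len(train_data), "test": len(test_data)}

    def _create_plot(self, plot_data, **kwargs) -> str:
        bars = "".join(
            f'<rect x="0" y="{i * 30}" width="{n * 10}" height="20"><title>{name}</title></rect>'
            for i, (name, n) in enumerate(plot_data.items())
        )
        return f'<svg xmlns="http://www.w3.org/2000/svg" width="200" height="60">{bars}</svg>'


output_dir = Path(tempfile.mkdtemp())
plotter = SplitSizePlot("split_size", "Rows per split", output_dir)
result = plotter.plot(
    train_data=[[1, 2], [3, 4], [5, 6]],
    test_data=[[7, 8]],
    filename="split_size",
    dataset_name="toy",
    group_name="demo",
)
svg_text = (output_dir / "split_size.svg").read_text()
plot_metadata = json.loads((output_dir / "split_size.json").read_text())
```

Splitting data preparation from rendering keeps _create_plot free of dataset logic, so the same rendering code could be reused with differently prepared data.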