Dataset Evaluators

class DatasetMeasureEvaluator(method_name: str, description: str)

Bases: BaseEvaluator

Template for dataset evaluators that calculate measures.

Abstract base class for evaluators that calculate measures or statistics on datasets. Provides a standardized workflow for dataset evaluation including data processing, metadata generation, result saving, and logging.

Parameters:

method_name (str): The name of the evaluator
description (str): The description of the evaluator output

Attributes:

method_name (str): The name of the evaluator
description (str): The description of the evaluator output
services (ServiceBundle or None): The global services bundle
metric_config (MetricManager or None): The metric configuration manager
primary_color (str): Primary color for plots and visualizations
secondary_color (str): Secondary color for plots and visualizations
accent_color (str): Accent color for plots and visualizations

Notes

This abstract base class provides a template for implementing dataset-level evaluation methods. Subclasses must implement the _calculate_measures method to define the specific evaluation logic.

The class handles the complete evaluation workflow:

1. Calculate measures using the implemented _calculate_measures method
2. Generate metadata for the results
3. Save results to JSON file with metadata
4. Log the results
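The four steps above follow the template-method pattern. The sketch below is a minimal illustration of that pattern, not the library's implementation: SketchMeasureEvaluator, RowCountEvaluator, the output_dir argument, and the metadata field names are all hypothetical, and plain lists stand in for pandas objects so the block runs on the standard library alone.

```python
import json
import logging
import tempfile
from abc import ABC, abstractmethod
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List

logger = logging.getLogger("measure_sketch")


class SketchMeasureEvaluator(ABC):
    """Hypothetical stand-in for the base class; not the library's code."""

    def __init__(self, method_name: str, description: str, output_dir: Path):
        self.method_name = method_name
        self.description = description
        self.output_dir = output_dir  # assumption: results go to one directory

    @abstractmethod
    def _calculate_measures(self, train_data, test_data, feature_names) -> Dict[str, Any]:
        """Subclasses supply the evaluation logic here."""

    def evaluate(self, train_data, test_data, feature_names: List[str],
                 filename: str, dataset_name: str, group_name: str) -> Dict[str, Any]:
        # 1. Calculate measures via the subclass hook.
        results = self._calculate_measures(train_data, test_data, feature_names)
        # 2. Generate metadata (these field names are illustrative only).
        metadata = {
            "method": self.method_name,
            "description": self.description,
            "dataset": dataset_name,
            "group": group_name,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        # 3. Save results together with metadata to <filename>.json.
        path = self.output_dir / f"{filename}.json"
        path.write_text(json.dumps({**results, "_metadata": metadata}, indent=2))
        # 4. Log the results, then hand the measures back to the caller.
        logger.info("%s saved to %s", self.method_name, path)
        return results


class RowCountEvaluator(SketchMeasureEvaluator):
    """Toy subclass: only the measure logic is written here."""

    def _calculate_measures(self, train_data, test_data, feature_names):
        return {
            "n_train": len(train_data),
            "n_test": len(test_data),
            "n_features": len(feature_names),
        }


output_dir = Path(tempfile.mkdtemp())
evaluator = RowCountEvaluator("row_count", "Row counts per split", output_dir)
results = evaluator.evaluate(
    [[1, 2], [3, 4], [5, 6]], [[7, 8]], ["a", "b"],
    "row_count", "demo", "group_1",
)
saved = json.loads((output_dir / "row_count.json").read_text())
```

The subclass never touches file handling or logging; it only returns a dictionary, which is the division of labour the notes describe.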

Examples

Create a custom dataset measure evaluator:

>>> class CustomDatasetEvaluator(DatasetMeasureEvaluator):
...     def __init__(self):
...         super().__init__("custom_dataset", "Custom evaluation")
...     
...     def _calculate_measures(self, train_data, test_data, features):
...         # Custom measure calculation logic
...         return {"custom_metric": 0.85}

evaluate(train_data: DataFrame | Series, test_data: DataFrame | Series, feature_names: List[str], filename: str, dataset_name: str, group_name: str) → Dict[str, Any]

Template for all measure methods to follow.

Executes the complete evaluation workflow for dataset measures. This method orchestrates the evaluation process by calling the abstract _calculate_measures method and handling result processing.

Parameters:

train_data (pd.DataFrame or pd.Series): The training data for evaluation
test_data (pd.DataFrame or pd.Series): The testing data for evaluation
feature_names (List[str]): The names of the features in the dataset
filename (str): The name of the file to save the results to (without extension)
dataset_name (str): The name of the dataset being evaluated
group_name (str): The name of the experiment group

Returns:

Dict[str, Any]: The results of the evaluation containing calculated measures

Notes

This method provides the standard workflow for dataset evaluation:

1. Calculate measures using _calculate_measures
2. Generate metadata for the results
3. Save results to JSON file
4. Log the results
5. Return the calculated measures

The method delegates the actual measure calculation to the _calculate_measures method, which must be implemented by subclasses.
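As an illustration of what a delegated measure calculation might return, here is a sketch of a function that compares per-feature means between the two splits. The function name mean_shift_measures and the shape of the returned dictionary are assumptions made for this example. Dictionaries of lists stand in for DataFrame columns so the block has no third-party dependencies; with pandas, the analogous call would be the column's mean() method.

```python
from statistics import mean
from typing import Any, Dict, List, Sequence


def mean_shift_measures(
    train_data: Dict[str, Sequence[float]],
    test_data: Dict[str, Sequence[float]],
    feature_names: List[str],
) -> Dict[str, Any]:
    """Per-feature train/test means and their absolute difference."""
    measures: Dict[str, Any] = {}
    for name in feature_names:
        train_mean = mean(train_data[name])
        test_mean = mean(test_data[name])
        measures[name] = {
            "train_mean": train_mean,
            "test_mean": test_mean,
            "abs_difference": abs(train_mean - test_mean),
        }
    return measures


# Hypothetical toy data: one list of values per feature.
train = {"age": [30, 40, 50], "height": [160, 170, 180]}
test = {"age": [35, 55], "height": [170, 180]}
measures = mean_shift_measures(train, test, ["age", "height"])
```

Inside a subclass, the body of this function would sit in _calculate_measures, and the returned dictionary is what evaluate would save and return.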

class DatasetPlotEvaluator(method_name: str, description: str, plot_settings)

Bases: BaseEvaluator

Template for evaluators that plot datasets.

Abstract base class for evaluators that create plots and visualizations from datasets. Provides a standardized workflow for dataset plotting including plot data generation, plot creation, metadata handling, and result saving.

Parameters:

method_name (str): The name of the evaluator
description (str): The description of the evaluator output
plot_settings (PlotSettings): The plot settings containing theme and color configuration

Attributes:

method_name (str): The name of the evaluator
description (str): The description of the evaluator output
theme (Any): The plot theme for styling plots
primary_color (str): Primary color for plots (from plot settings)
secondary_color (str): Secondary color for plots (from plot settings)
accent_color (str): Accent color for plots (from plot settings)
services (ServiceBundle or None): The global services bundle (inherited from BaseEvaluator)
metric_config (MetricManager or None): The metric configuration manager (inherited from BaseEvaluator)

Notes

This abstract base class provides a template for implementing dataset-level plotting methods. Subclasses must implement the _generate_plot_data and _create_plot methods to define the specific plotting logic.

The class handles the complete plotting workflow:

1. Generate plot data using _generate_plot_data method
2. Create the plot using _create_plot method
3. Generate metadata for the plot
4. Save the plot with metadata
5. Log the results

The constructor automatically configures matplotlib to use a non-interactive backend for thread safety and applies the provided plot settings.
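The plotting workflow described in these notes can be sketched with two abstract hooks and one orchestrating method. Everything named below (SketchPlotEvaluator, SplitSizePlot, output_dir, the metadata keys) is hypothetical and only mirrors the documented steps. To stay dependency-free, the "plot" in this sketch is a hand-built SVG string rather than a matplotlib figure, and the plot settings and theme are left out.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any


class SketchPlotEvaluator(ABC):
    """Hypothetical stand-in for the base class; not the library's code."""

    def __init__(self, method_name: str, description: str, output_dir: Path):
        self.method_name = method_name
        self.description = description
        self.output_dir = output_dir  # assumption: plots go to one directory

    @abstractmethod
    def _generate_plot_data(self, train_data, test_data, **kwargs) -> Any:
        """Turn the raw splits into whatever the plot needs."""

    @abstractmethod
    def _create_plot(self, plot_data, **kwargs) -> str:
        """Return the plot; in this sketch a plot is an SVG string."""

    def plot(self, train_data, test_data, filename: str,
             dataset_name: str, group_name: str) -> None:
        # 1. Generate plot data via the first subclass hook.
        plot_data = self._generate_plot_data(train_data, test_data)
        # 2. Create the plot via the second subclass hook.
        figure = self._create_plot(plot_data)
        # 3. Generate metadata (illustrative keys only).
        metadata = {
            "method": self.method_name,
            "dataset": dataset_name,
            "group": group_name,
        }
        # 4. Save the plot, with its metadata in a sidecar JSON file.
        (self.output_dir / f"{filename}.svg").write_text(figure)
        (self.output_dir / f"{filename}.json").write_text(json.dumps(metadata))
        # 5. Log the result (print keeps the sketch short).
        print(f"{self.method_name}: saved {filename}.svg")


class SplitSizePlot(SketchPlotEvaluator):
    """Toy subclass: one bar per split, bar height scaled by row count."""

    def _generate_plot_data(self, train_data, test_data, **kwargs):
        return {"train": len(train_data), "test": len(test_data)}

    def _create_plot(self, plot_data, **kwargs):
        bars = "".join(
            f'<rect x="{10 + 50 * i}" y="{100 - 10 * count}" '
            f'width="40" height="{10 * count}"/>'
            for i, count in enumerate(plot_data.values())
        )
        return (
            '<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="120" height="100">{bars}</svg>'
        )


output_dir = Path(tempfile.mkdtemp())
plotter = SplitSizePlot("split_sizes", "Rows per split", output_dir)
returned = plotter.plot([1, 2, 3], [4], "split_sizes", "demo", "group_1")
svg_text = (output_dir / "split_sizes.svg").read_text()
plot_metadata = json.loads((output_dir / "split_sizes.json").read_text())
```

In a matplotlib-based setup, selecting a non-interactive backend is usually done with matplotlib.use("Agg") before any figures are created; whether the constructor does exactly this is not stated here.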

Examples

Create a custom dataset plot evaluator:

>>> class CustomDatasetPlotEvaluator(DatasetPlotEvaluator):
...     def __init__(self, plot_settings):
...         super().__init__("custom", "Custom plot", plot_settings)
...     
...     def _generate_plot_data(self, train_data, test_data, **kwargs):
...         # Custom plot data generation logic
...         return plot_data
...     
...     def _create_plot(self, plot_data, **kwargs):
...         # Custom plot creation logic
...         return plot

plot(train_data: DataFrame | Series, test_data: DataFrame | Series, filename: str, dataset_name: str, group_name: str) → None

Template for all plot methods to follow.

Executes the complete plotting workflow for dataset plots. This method orchestrates the plotting process by calling the abstract methods and handling plot processing.

Parameters:

train_data (pd.DataFrame or pd.Series): The training data for plotting
test_data (pd.DataFrame or pd.Series): The testing data for plotting
filename (str): The name of the file to save the plot to (without extension)
dataset_name (str): The name of the dataset being plotted
group_name (str): The name of the experiment group

Returns:

None

Notes

This method provides the standard workflow for dataset plotting:

1. Generate plot data using _generate_plot_data
2. Create the plot using _create_plot
3. Generate metadata for the plot
4. Save the plot with metadata
5. Log the results

The method delegates the actual plot data generation and plot creation to the abstract methods, which must be implemented by subclasses.
