Dataset Evaluators
- class DatasetMeasureEvaluator(method_name: str, description: str)
Bases: BaseEvaluator
Template for dataset evaluators that calculate measures or plot data.
Abstract base class for evaluators that calculate measures or statistics on datasets. Provides a standardized workflow for dataset evaluation including data processing, metadata generation, result saving, and logging.
- Parameters:
- method_name : str
The name of the evaluator
- description : str
The description of the evaluator output
- Attributes:
- method_name : str
The name of the evaluator
- description : str
The description of the evaluator output
- services : ServiceBundle or None
The global services bundle
- metric_config : MetricManager or None
The metric configuration manager
- primary_color : str
Primary color for plots and visualizations
- secondary_color : str
Secondary color for plots and visualizations
- accent_color : str
Accent color for plots and visualizations
Notes
This abstract base class provides a template for implementing dataset-level evaluation methods. Subclasses must implement the _calculate_measures method to define the specific evaluation logic.
The class handles the complete evaluation workflow:
1. Calculate measures using the implemented _calculate_measures method
2. Generate metadata for the results
3. Save results to JSON file with metadata
4. Log the results
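The workflow above follows the template-method pattern. The sketch below illustrates that pattern with the standard library only. SketchMeasureEvaluator, RowCountEvaluator, the output_dir argument, and the metadata field names are all hypothetical stand-ins chosen for illustration; the real DatasetMeasureEvaluator may store, name, and save things differently.

```python
import json
import logging
import tempfile
from abc import ABC, abstractmethod
from datetime import datetime, timezone
from pathlib import Path

logger = logging.getLogger("dataset_measure_sketch")


class SketchMeasureEvaluator(ABC):
    """Hypothetical stand-in for DatasetMeasureEvaluator (not the real class)."""

    def __init__(self, method_name, description, output_dir):
        self.method_name = method_name
        self.description = description
        # Assumption: results are written into a caller-supplied directory.
        self.output_dir = Path(output_dir)

    @abstractmethod
    def _calculate_measures(self, train_data, test_data, feature_names):
        """Return a dict of measures; each subclass supplies this logic."""

    def evaluate(self, train_data, test_data, feature_names,
                 filename, dataset_name, group_name):
        # 1. Calculate measures through the subclass hook.
        results = self._calculate_measures(train_data, test_data, feature_names)
        # 2. Generate metadata (these field names are assumptions).
        metadata = {
            "method": self.method_name,
            "description": self.description,
            "dataset": dataset_name,
            "group": group_name,
            "created": datetime.now(timezone.utc).isoformat(),
        }
        # 3. Save the results together with the metadata as JSON.
        path = self.output_dir / f"{filename}.json"
        path.write_text(json.dumps({"_metadata": metadata, **results}, indent=2))
        # 4. Log the outcome, then hand the measures back to the caller.
        logger.info("%s: saved results to %s", self.method_name, path)
        return results


class RowCountEvaluator(SketchMeasureEvaluator):
    """Toy subclass that only counts rows in each split."""

    def __init__(self, output_dir):
        super().__init__("row_count", "Rows per split", output_dir)

    def _calculate_measures(self, train_data, test_data, feature_names):
        return {"train_rows": len(train_data), "test_rows": len(test_data)}


# Usage: plain lists of rows stand in for the pandas objects.
with tempfile.TemporaryDirectory() as tmp:
    evaluator = RowCountEvaluator(tmp)
    results = evaluator.evaluate(
        [[1, 2], [3, 4], [5, 6]], [[7, 8]], ["a", "b"],
        "row_count", "demo", "group_1",
    )
    saved = json.loads((Path(tmp) / "row_count.json").read_text())
```

Keeping the saving and logging steps in the base class means a subclass only has to describe what to measure, which is the design the Notes describe.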
Examples
- Create a custom dataset measure evaluator:
>>> class CustomDatasetEvaluator(DatasetMeasureEvaluator):
...     def __init__(self):
...         super().__init__("custom_dataset", "Custom evaluation")
...
...     def _calculate_measures(self, train_data, test_data, features):
...         # Custom measure calculation logic
...         return {"custom_metric": 0.85}
- evaluate(train_data: DataFrame | Series, test_data: DataFrame | Series, feature_names: List[str], filename: str, dataset_name: str, group_name: str) → Dict[str, Any]
Template for all measure methods to follow.
Executes the complete evaluation workflow for dataset measures. This method orchestrates the evaluation process by calling the abstract _calculate_measures method and handling result processing.
- Parameters:
- train_data : pd.DataFrame or pd.Series
The training data for evaluation
- test_data : pd.DataFrame or pd.Series
The testing data for evaluation
- feature_names : List[str]
The names of the features in the dataset
- filename : str
The name of the file to save the results to (without extension)
- dataset_name : str
The name of the dataset being evaluated
- group_name : str
The name of the experiment group
- Returns:
- Dict[str, Any]
The results of the evaluation containing calculated measures
Notes
This method provides the standard workflow for dataset evaluation:
1. Calculate measures using _calculate_measures
2. Generate metadata for the results
3. Save results to JSON file
4. Log the results
5. Return the calculated measures
The method delegates the actual measure calculation to the _calculate_measures method, which must be implemented by subclasses.
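As an illustration of the kind of logic a subclass might place in _calculate_measures, the sketch below computes a per-feature difference between the test mean and the train mean. The function name mean_shift_measures and the choice of measure are assumptions made for this example, not part of the library. Plain dicts of lists stand in for DataFrames; column access through data[name] should read the same way on a pandas DataFrame.

```python
from statistics import fmean


def mean_shift_measures(train_data, test_data, features):
    """Per-feature difference between the test mean and the train mean.

    Hypothetical helper: in a real evaluator this body would live inside
    a subclass's _calculate_measures method.
    """
    return {
        name: fmean(test_data[name]) - fmean(train_data[name])
        for name in features
    }


# Dicts of lists stand in for DataFrames in this sketch.
train = {"age": [30, 40, 50], "income": [10.0, 20.0, 30.0]}
test = {"age": [40, 50, 60], "income": [20.0, 20.0, 20.0]}

shift = mean_shift_measures(train, test, ["age", "income"])
```

Returning a flat dict of plain floats keeps the result easy to serialize, which matters because the workflow writes the measures to a JSON file.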
- class DatasetPlotEvaluator(method_name: str, description: str, plot_settings)
Bases: BaseEvaluator
Template for evaluators that plot datasets.
Abstract base class for evaluators that create plots and visualizations from datasets. Provides a standardized workflow for dataset plotting including plot data generation, plot creation, metadata handling, and result saving.
- Parameters:
- method_name : str
The name of the evaluator
- description : str
The description of the evaluator output
- plot_settings : PlotSettings
The plot settings containing theme and color configuration
- Attributes:
- method_name : str
The name of the evaluator
- description : str
The description of the evaluator output
- theme : Any
The plot theme for styling plots
- primary_color : str
Primary color for plots (from plot settings)
- secondary_color : str
Secondary color for plots (from plot settings)
- accent_color : str
Accent color for plots (from plot settings)
- services : ServiceBundle or None
The global services bundle (inherited from BaseEvaluator)
- metric_config : MetricManager or None
The metric configuration manager (inherited from BaseEvaluator)
Notes
This abstract base class provides a template for implementing dataset-level plotting methods. Subclasses must implement the _generate_plot_data and _create_plot methods to define the specific plotting logic.
The class handles the complete plotting workflow:
1. Generate plot data using _generate_plot_data method
2. Create the plot using _create_plot method
3. Generate metadata for the plot
4. Save the plot with metadata
5. Log the results
The constructor automatically configures matplotlib to use a non-interactive backend for thread safety and applies the provided plot settings.
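For readers unfamiliar with non-interactive backends, the sketch below shows one common way to select one in matplotlib, using the Agg backend. Whether the constructor uses Agg specifically is an assumption; the source only says the backend is non-interactive. The helper render_histogram is hypothetical.

```python
import tempfile
from pathlib import Path

import matplotlib

# Select the non-interactive Agg backend (assumed here) before importing pyplot,
# so no GUI toolkit or display is required.
matplotlib.use("Agg")

import matplotlib.pyplot as plt


def render_histogram(values, path):
    """Draw a histogram off-screen and write it to `path`."""
    fig, ax = plt.subplots()
    ax.hist(values, bins=5)
    fig.savefig(path)
    plt.close(fig)  # release the figure so memory does not accumulate


backend_name = matplotlib.get_backend()

# Usage: render into a temporary directory and record the file size.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "hist.png"
    render_histogram([1, 2, 2, 3, 3, 3, 4], target)
    saved_size = target.stat().st_size
```

A backend such as Agg renders to an in-memory buffer rather than a window, which is why it suits batch jobs and worker threads.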
Examples
- Create a custom dataset plot evaluator:
>>> class CustomDatasetPlotEvaluator(DatasetPlotEvaluator):
...     def __init__(self, plot_settings):
...         super().__init__("custom", "Custom plot", plot_settings)
...
...     def _generate_plot_data(self, train_data, test_data, **kwargs):
...         # Custom plot data generation logic
...         return plot_data
...
...     def _create_plot(self, plot_data, **kwargs):
...         # Custom plot creation logic
...         return plot
- plot(train_data: DataFrame | Series, test_data: DataFrame | Series, filename: str, dataset_name: str, group_name: str) → None
Template for all plot methods to follow.
Executes the complete plotting workflow for dataset plots. This method orchestrates the plotting process by calling the abstract methods and handling plot processing.
- Parameters:
- train_data : pd.DataFrame or pd.Series
The training data for plotting
- test_data : pd.DataFrame or pd.Series
The testing data for plotting
- filename : str
The name of the file to save the plot to (without extension)
- dataset_name : str
The name of the dataset being plotted
- group_name : str
The name of the experiment group
- Returns:
- None
Notes
This method provides the standard workflow for dataset plotting:
1. Generate plot data using _generate_plot_data
2. Create the plot using _create_plot
3. Generate metadata for the plot
4. Save the plot with metadata
5. Log the results
The method delegates the actual plot data generation and plot creation to the abstract methods, which must be implemented by subclasses.
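The two-hook delegation described above can be sketched without any plotting library. In the example below the "plot" is a small hand-written SVG string so the code stays dependency-free; the real class presumably returns a plotting-library object. SketchPlotEvaluator, SplitSizeBars, the output_dir argument, the sidecar JSON file, and the metadata fields are all hypothetical.

```python
import json
import logging
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path

logger = logging.getLogger("dataset_plot_sketch")


class SketchPlotEvaluator(ABC):
    """Hypothetical stand-in for DatasetPlotEvaluator (not the real class)."""

    def __init__(self, method_name, description, output_dir):
        self.method_name = method_name
        self.description = description
        self.output_dir = Path(output_dir)  # assumption: caller picks the directory

    @abstractmethod
    def _generate_plot_data(self, train_data, test_data):
        """Reduce the raw splits to whatever the plot needs."""

    @abstractmethod
    def _create_plot(self, plot_data):
        """Turn plot data into a drawable object (an SVG string here)."""

    def plot(self, train_data, test_data, filename, dataset_name, group_name):
        plot_data = self._generate_plot_data(train_data, test_data)  # step 1
        figure = self._create_plot(plot_data)                        # step 2
        metadata = {                                                 # step 3
            "method": self.method_name,
            "description": self.description,
            "dataset": dataset_name,
            "group": group_name,
        }
        # Step 4: save the plot; metadata goes to a sidecar file (an assumption).
        (self.output_dir / f"{filename}.svg").write_text(figure)
        (self.output_dir / f"{filename}.json").write_text(json.dumps(metadata))
        logger.info("%s: saved plot %s", self.method_name, filename)  # step 5


class SplitSizeBars(SketchPlotEvaluator):
    """Toy subclass: one bar per split, height proportional to row count."""

    def _generate_plot_data(self, train_data, test_data):
        return {"train": len(train_data), "test": len(test_data)}

    def _create_plot(self, plot_data):
        bars = "".join(
            f'<rect x="{i * 60 + 10}" y="{100 - count * 10}" '
            f'width="40" height="{count * 10}"><title>{label}</title></rect>'
            for i, (label, count) in enumerate(plot_data.items())
        )
        return (
            '<svg xmlns="http://www.w3.org/2000/svg" width="130" height="100">'
            f"{bars}</svg>"
        )


# Usage: plain lists stand in for the pandas objects.
with tempfile.TemporaryDirectory() as tmp:
    evaluator = SplitSizeBars("split_sizes", "Rows per split", tmp)
    returned = evaluator.plot([1, 2, 3], [4], "split_sizes", "demo", "group_1")
    svg_text = (Path(tmp) / "split_sizes.svg").read_text()
    sidecar = json.loads((Path(tmp) / "split_sizes.json").read_text())
```

Splitting data preparation from drawing lets a subclass change how the figure looks without touching how the data is summarized, and the reverse.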