Common Measures
- class EvaluateModel(method_name: str, description: str)
Bases: MeasureEvaluator
Evaluate a model on the provided measures and save the results.
This evaluator calculates specified performance measures for a single trained model on a given dataset. It supports any metric that is configured in the metric configuration manager.
- Parameters:
- method_name : str
The name of the evaluator
- description : str
The description of the evaluator output
- Attributes:
- method_name : str
The name of the evaluator
- description : str
The description of the evaluator output
- services : ServiceBundle or None
The global services bundle
- metric_config : MetricManager or None
The metric configuration manager
The metric configuration manager
Notes
This evaluator calculates performance measures for a single model. It retrieves the function for each requested metric from the metric configuration manager and computes a score for every metric specified.
The evaluator supports both classification and regression metrics, depending on what is configured in the metric configuration manager.
Examples
- Use the model evaluation evaluator:
>>> from brisk.evaluation.evaluators import registry
>>> evaluator = registry.get("brisk_evaluate_model")
>>> evaluator.evaluate(model, X, y, ["accuracy", "f1_score"], "results")
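As a rough illustration of the idea described in the Notes, the sketch below looks up metric functions by name and scores one fitted model. It does not reproduce brisk's internals: the `METRIC_FUNCTIONS` dictionary and the `score_model` helper are hypothetical stand-ins for the metric configuration manager, and scikit-learn is assumed to be installed.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical stand-in for the metric configuration manager:
# maps a metric name to a function taking (y_true, y_pred).
METRIC_FUNCTIONS = {
    "accuracy": accuracy_score,
    "f1_score": f1_score,
}


def score_model(model, X, y, metrics):
    """Return {metric_name: score} for an already fitted model."""
    predictions = model.predict(X)
    return {
        name: float(METRIC_FUNCTIONS[name](y, predictions))
        for name in metrics
    }


# Small synthetic binary classification problem for demonstration only.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

scores = score_model(model, X, y, ["accuracy", "f1_score"])
print(scores)
```

Regression metrics would fit the same pattern, provided the dictionary maps their names to functions with the same `(y_true, y_pred)` signature.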
- report(results: Dict[str, Any]) -> Tuple[List[str], List[List[Any]]]
Generate a report of the evaluation results.
Converts evaluation results into a tabular form suitable for reporting, pairing each metric name with its score.
- Parameters:
- results : Dict[str, Any]
The results of the evaluation
- Returns:
- Tuple[List[str], List[List[Any]]]
A tuple containing:
- List of column headers: ["Metric", "Score"]
- Nested list of rows with metric names and scores
Notes
The report format is designed for easy display in tables or reports, with one row per metric showing the metric name and its corresponding score.
The metadata key is excluded from the report.
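The conversion described above can be sketched as follows. This is a minimal illustration, not the library's code: the helper name `build_report` is hypothetical, and the key `"_metadata"` is an assumption, since the text does not state the actual name of the metadata key.

```python
from typing import Any, Dict, List, Tuple

# Assumption: the metadata entry is stored under this key. The real
# key name is not given in the documentation above.
METADATA_KEY = "_metadata"


def build_report(
    results: Dict[str, Any],
) -> Tuple[List[str], List[List[Any]]]:
    """Turn {metric: score} results into table headers and rows."""
    columns = ["Metric", "Score"]
    # One row per metric; the metadata entry is skipped.
    rows = [
        [metric, score]
        for metric, score in results.items()
        if metric != METADATA_KEY
    ]
    return columns, rows


example_results = {
    "accuracy": 0.91,
    "f1_score": 0.88,
    METADATA_KEY: {"model": "example"},
}
columns, rows = build_report(example_results)
```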
- class EvaluateModelCV(method_name: str, description: str)
Bases: MeasureEvaluator
Evaluate a model using cross-validation and save the scores.
This evaluator calculates performance measures for a model using cross-validation, providing more robust estimates of model performance by averaging scores across multiple train-test splits.
- Parameters:
- method_name : str
The name of the evaluator
- description : str
The description of the evaluator output
- Attributes:
- method_name : str
The name of the evaluator
- description : str
The description of the evaluator output
- services : ServiceBundle or None
The global services bundle
- metric_config : MetricManager or None
The metric configuration manager
The metric configuration manager
Notes
Cross-validation provides a more reliable estimate of model performance by reducing the variance associated with a single train-test split. The evaluator calculates mean scores and standard deviations, and stores all individual fold scores for detailed analysis.
The evaluator uses the utility service to get the appropriate cross-validation splitter based on the data characteristics.
Examples
- Use the cross-validation evaluator:
>>> from brisk.evaluation.evaluators import registry
>>> evaluator = registry.get("brisk_evaluate_model_cv")
>>> evaluator.evaluate(
...     model, X, y, ["accuracy", "f1_score"], "cv_results", cv=5
... )
- evaluate(model: BaseEstimator, X: DataFrame, y: Series, metrics: List[str], filename: str, cv: int = 5) -> None
Evaluate a model using cross-validation and save the scores.
Executes the complete cross-validation evaluation workflow. This includes calculating scores across multiple folds, computing statistics, and saving the results with metadata.
- Parameters:
- model : base.BaseEstimator
The model to evaluate
- X : pd.DataFrame
The input features for evaluation
- y : pd.Series
The target data
- metrics : List[str]
A list of metric names to calculate
- filename : str
The name of the output file (without extension)
- cv : int, optional
The number of cross-validation folds, by default 5
- Returns:
- None
Notes
The cross-validation process uses the utility service to get the appropriate splitter based on the data characteristics (e.g., stratified splits for classification, grouped splits if groups are specified).
Results include mean scores, standard deviations, and all individual fold scores for comprehensive analysis.
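The workflow described in these notes can be approximated with plain scikit-learn, as in the sketch below. It is an illustration under stated assumptions rather than brisk's implementation: the splitter is chosen by hand (`StratifiedKFold`, because the toy problem is classification) instead of through the utility service, the scoring names are scikit-learn's own scorer strings, and the result keys `mean_score`, `std_dev`, and `all_scores` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Small synthetic binary classification problem for demonstration only.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000)

# Stratified folds keep the class proportions similar in every split.
splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

cv_results = {}
for scoring in ["accuracy", "f1"]:  # scikit-learn scorer names
    fold_scores = cross_val_score(model, X, y, scoring=scoring, cv=splitter)
    cv_results[scoring] = {
        "mean_score": float(np.mean(fold_scores)),
        # Population standard deviation (ddof=0); an assumption, since
        # the text does not say which variant is used.
        "std_dev": float(np.std(fold_scores)),
        "all_scores": fold_scores.tolist(),
    }

print(cv_results)
```

For grouped data, `GroupKFold` together with the `groups` argument of `cross_val_score` would play the role of the grouped splitter mentioned above.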
- report(results: Dict[str, Any]) -> Tuple[List[str], List[List[Any]]]
Generate a report of the cross-validation results.
Converts cross-validation results into a format suitable for reporting with mean scores, standard deviations, and all scores.
- Parameters:
- results : Dict[str, Any]
The results of the cross-validation
- Returns:
- Tuple[List[str], List[List[Any]]]
A tuple containing:
- List of column headers: ["Metric", "Mean Score", "All Scores"]
- Nested list of rows with metric statistics
Notes
The report format shows mean scores with standard deviations in parentheses, and all individual fold scores for detailed analysis.
The metadata key is excluded from the report.
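One way such a report could be assembled is sketched below. The helper `build_cv_report`, the metadata key `"_metadata"`, the per-metric keys (`mean_score`, `std_dev`, `all_scores`), and the four-decimal formatting are all assumptions made for illustration. Only the column headers and the "mean (standard deviation)" layout come from the description above.

```python
from typing import Any, Dict, List, Tuple

# Assumption: the metadata entry is stored under this key.
METADATA_KEY = "_metadata"


def build_cv_report(
    results: Dict[str, Any],
) -> Tuple[List[str], List[List[Any]]]:
    """Format cross-validation results as table headers and rows."""
    columns = ["Metric", "Mean Score", "All Scores"]
    rows = []
    for metric, stats in results.items():
        if metric == METADATA_KEY:
            continue  # metadata is not part of the table
        # Mean score followed by the standard deviation in parentheses.
        mean_text = f"{stats['mean_score']:.4f} ({stats['std_dev']:.4f})"
        all_text = ", ".join(f"{s:.4f}" for s in stats["all_scores"])
        rows.append([metric, mean_text, all_text])
    return columns, rows


example_results = {
    "accuracy": {
        "mean_score": 0.9,
        "std_dev": 0.02,
        "all_scores": [0.88, 0.9, 0.92],
    },
    METADATA_KEY: {"model": "example"},
}
columns, rows = build_cv_report(example_results)
```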
- class CompareModels(method_name: str, description: str)
Bases: MeasureEvaluator
Compare multiple models using specified measures.
This evaluator allows comparison of multiple models on the same dataset using specified performance measures. It can optionally calculate differences between model performances for detailed analysis.
- Parameters:
- method_name : str
The name of the evaluator
- description : str
The description of the evaluator output
- Attributes:
- method_name : str
The name of the evaluator
- description : str
The description of the evaluator output
- services : ServiceBundle or None
The global services bundle
- metric_config : MetricManager or None
The metric configuration manager
The metric configuration manager
Notes
This evaluator is particularly useful for model selection and performance comparison. It can compare any number of models on the same dataset using the same metrics, ensuring fair comparison.
When calculate_diff is True, the evaluator calculates pairwise differences between all model pairs for each metric, providing detailed performance comparisons.
Examples
- Compare multiple models:
>>> from brisk.evaluation.evaluators import registry
>>> evaluator = registry.get("brisk_compare_models")
>>> evaluator.evaluate(model1, model2, model3, X=X, y=y,
...                    metrics=["accuracy", "f1_score"],
...                    filename="comparison", calculate_diff=True)
- evaluate(*models: BaseEstimator, X: DataFrame, y: Series, metrics: List[str], filename: str, calculate_diff: bool = False) -> None
Compare multiple models using specified metrics.
Executes the complete model comparison workflow. This includes evaluating each model on the specified metrics and optionally calculating pairwise differences between models.
- Parameters:
- *models : base.BaseEstimator
Models to compare (variable number of arguments)
- X : pd.DataFrame
Input features for evaluation
- y : pd.Series
Target values for evaluation
- metrics : List[str]
Names of metrics to calculate
- filename : str
Name for output file (without extension)
- calculate_diff : bool, optional
Whether to calculate differences between models, by default False
- Returns:
- None
Notes
The method evaluates each model individually on the same dataset using the same metrics, ensuring fair comparison. If calculate_diff is True, it also calculates pairwise differences between all model pairs for each metric.
Results are saved with metadata for later analysis and reporting.
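The pairwise-difference step can be pictured with the short sketch below. It starts from per-model scores that have already been computed and is only an illustration: the helper `pairwise_differences`, the model labels, the `"second - first"` key format, and the sign convention are assumptions, since the text does not specify how differences are labelled or ordered.

```python
from itertools import combinations
from typing import Dict


def pairwise_differences(
    scores: Dict[str, Dict[str, float]],
) -> Dict[str, Dict[str, float]]:
    """Compute score differences for every pair of models, per metric.

    `scores` maps a model label to its {metric: score} dictionary.
    """
    metrics = next(iter(scores.values())).keys()
    differences = {}
    for metric in metrics:
        differences[metric] = {
            # Assumed convention: later model minus earlier model.
            f"{second} - {first}": scores[second][metric] - scores[first][metric]
            for first, second in combinations(scores, 2)
        }
    return differences


# Made-up scores for three models, used only to exercise the helper.
model_scores = {
    "model1": {"accuracy": 0.90, "f1_score": 0.80},
    "model2": {"accuracy": 0.85, "f1_score": 0.82},
    "model3": {"accuracy": 0.95, "f1_score": 0.90},
}
diffs = pairwise_differences(model_scores)
```

With three models this yields three pairs per metric; in general, n models produce n(n-1)/2 pairs, so the output grows quadratically with the number of models compared.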