Reporting Service
- class ReportingService(name: str)
Bases: BaseService
Main service class for creating and managing comprehensive experiment reports.
This service handles the creation and management of experiment reports, including dataset information, model performance metrics, hyperparameter tuning results, and visualization data. It provides caching mechanisms for efficient data storage and retrieval, and supports both individual experiment tracking and group-level analysis.
The service maintains internal caches for images, tables, and tuned parameters, and processes this data into structured ReportData objects suitable for HTML report generation.
- Attributes:
- navbar : report_data.Navbar
Navigation bar information including version and timestamp
- datasets : Dict[str, report_data.Dataset]
Dictionary mapping dataset IDs to Dataset objects
- experiments : Dict[str, report_data.Experiment]
Dictionary mapping experiment IDs to Experiment objects
- experiment_groups : List[report_data.ExperimentGroup]
List of experiment group objects
- data_managers : Dict[str, report_data.DataManager]
Dictionary mapping group names to DataManager objects
- metric_manager : Optional[metric_config.MetricManager]
The metric manager for performance evaluation
- registry : Optional[EvaluatorRegistry]
The evaluator registry for processing evaluation results
- group_to_experiment : Dict[str, List[str]]
Dictionary mapping group names to experiment IDs
- _current_context : Optional[ReportingContext]
The current reporting context
- _image_cache : Dict[Tuple[str, str, str], Tuple[str, Dict[str, str]]]
Cache for storing plot images and metadata
- _table_cache : Dict[Tuple[str, str, str, str], Tuple[Dict[str, Any], Dict[str, str]]]
Cache for storing table data and metadata
- _cached_tuned_params : Dict[str, Any]
Cache for storing hyperparameter tuning results
- test_scores : Dict[str, Dict[str, Dict[str, Dict[str, List[str]]]]]
Nested dictionary storing test scores by group, dataset, and split
- best_score_by_split : Dict[str, Dict[str, Dict[str, Tuple[str, str, str, str]]]]
Nested dictionary storing best scores by group, dataset, and split
- tuning_metric : Optional[Tuple[str, str]]
The tuning metric abbreviation and display name
Notes
The service uses internal caches to store and retrieve data efficiently during report generation. Caches are cleared when new data is added to ensure consistency. Both the metric manager and the evaluator registry must be set before evaluation results are processed.
Examples
>>> from brisk.services.reporting import ReportingService
>>> from brisk.evaluation import metric_manager
>>>
>>> # registry, data_manager, and data_splits are assumed to exist already
>>> # Create and configure reporting service
>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_metric_config(metric_manager)
>>> reporting_service.set_evaluator_registry(registry)
>>>
>>> # Set context and add data
>>> reporting_service.set_context("classification", "iris", 0)
>>> reporting_service.add_data_manager("classification", data_manager)
>>> reporting_service.add_dataset("classification", data_splits)
>>>
>>> # Generate final report
>>> report_data = reporting_service.get_report_data()
- add_data_manager(group_name: str, data_manager: DataManager) -> None
Add a DataManager instance to the report.
This method converts a DataManager instance into a report_data.DataManager object and stores it in the report. The DataManager contains information about data splitting configuration, including test size, number of splits, split method, and other data management parameters.
- Parameters:
- group_name : str
The name of the experiment group this data manager belongs to
- data_manager : DataManager
The DataManager instance containing data splitting configuration
Notes
This method clears all internal caches after adding the data manager to ensure data consistency. The data manager information is used in the final report to document the data splitting methodology.
Examples
>>> from brisk.data import DataManager
>>> reporting_service = ReportingService("reporting")
>>> data_manager = DataManager(test_size=0.2, n_splits=5)
>>> reporting_service.add_data_manager("classification", data_manager)
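The cache-clearing behavior described in the notes above can be pictured with a small stand-in class. This is a minimal sketch rather than the library's implementation: `MiniReportStore`, its attribute names, and the plain-dict configuration are hypothetical and only mirror the documented idea that adding a data manager invalidates cached results.

```python
from typing import Any, Dict, Tuple


class MiniReportStore:
    """Hypothetical stand-in that mimics 'clear caches when data is added'."""

    def __init__(self) -> None:
        self.data_managers: Dict[str, Dict[str, Any]] = {}
        # Caches keyed by tuples of strings, loosely following the attribute list.
        self.image_cache: Dict[Tuple[str, ...], Any] = {}
        self.table_cache: Dict[Tuple[str, ...], Any] = {}
        self.cached_tuned_params: Dict[str, Any] = {}

    def _clear_caches(self) -> None:
        self.image_cache.clear()
        self.table_cache.clear()
        self.cached_tuned_params.clear()

    def add_data_manager(self, group_name: str, config: Dict[str, Any]) -> None:
        # Store a plain-dict summary of the splitting configuration ...
        self.data_managers[group_name] = dict(config)
        # ... then drop cached results so stale entries cannot reach the report.
        self._clear_caches()


store = MiniReportStore()
store.table_cache[("classification", "iris", "0", "evaluate")] = {"accuracy": 0.95}
store.add_data_manager("classification", {"test_size": 0.2, "n_splits": 5})
```

After the call, the table cache is empty while the stored configuration is still available under its group name.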
- add_dataset(group_name: str, data_splits: DataSplits) -> None
Add a dataset to the report.
This method processes a DataSplits instance and creates a comprehensive dataset report including split information, target statistics, correlation matrices, and feature distributions. It analyzes the data to determine whether it is categorical or continuous and generates appropriate statistics.
- Parameters:
- group_name : str
The name of the experiment group this dataset belongs to
- data_splits : DataSplits
The DataSplits instance containing the dataset and split information
Notes
This method performs extensive data analysis, including:
- Split size calculations (total, train, and test observations)
- Target variable statistics (categorical: proportions, entropy; continuous: mean, std, min, max)
- Correlation matrix generation for each split
- Feature distribution analysis and visualization
- Categorical vs. continuous data detection (based on unique value ratio)
Examples
>>> from brisk.data import DataSplits
>>> reporting_service = ReportingService("reporting")
>>> data_splits = DataSplits.load_from_file("iris.csv")
>>> reporting_service.add_dataset("classification", data_splits)
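The target statistics listed in the notes can be illustrated with the standard library alone. The sketch below is not the code used by `add_dataset`: the helper names and the 0.05 unique-value-ratio threshold are assumptions chosen for illustration, since the actual threshold is not documented here.

```python
import math
import statistics
from collections import Counter
from typing import Dict, Sequence, Union

Number = Union[int, float]


def is_categorical(values: Sequence[Number], max_unique_ratio: float = 0.05) -> bool:
    """Treat the target as categorical when few distinct values occur.

    The 0.05 default is an assumption for illustration only.
    """
    return len(set(values)) / len(values) <= max_unique_ratio


def summarize_target(values: Sequence[Number],
                     max_unique_ratio: float = 0.05) -> Dict[str, object]:
    if is_categorical(values, max_unique_ratio):
        counts = Counter(values)
        proportions = {label: n / len(values) for label, n in counts.items()}
        # Shannon entropy in bits: log2(k) for k perfectly balanced classes.
        entropy = -sum(p * math.log2(p) for p in proportions.values())
        return {"kind": "categorical", "proportions": proportions, "entropy": entropy}
    return {
        "kind": "continuous",
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


labels = [0, 1] * 50                         # 2 unique values in 100 rows
measurements = [x / 10 for x in range(100)]  # 100 unique values in 100 rows

label_summary = summarize_target(labels)
measurement_summary = summarize_target(measurements)
```

With these inputs the balanced binary labels are summarized as categorical with an entropy of one bit, while the measurements fall through to the continuous branch.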
- add_experiment(algorithms: Dict) -> None
Add an experiment to the report.
- Parameters:
- algorithms : Dict
The algorithms to add to the experiment
- Returns:
- None
- add_experiment_groups(groups: List) -> None
Add experiment groups to the report.
- Parameters:
- groups : List
The experiment groups to add
- Returns:
- None
- cache_tuned_params(tuned_params: Dict[str, Any]) -> None
Cache the tuned parameters from hyperparameter tuning.
- Parameters:
- tuned_params : Dict[str, Any]
The tuned parameters
- Returns:
- None
- clear_context() -> None
Clear the current reporting context.
This method removes the current reporting context, effectively resetting the context state. This is useful when switching between different experiment groups or datasets.
Notes
After clearing the context, a new context must be set using set_context() before adding new data to the report.
Examples
>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_context("group1", "dataset1", 0)
>>> # Process data for group1/dataset1
>>> reporting_service.clear_context()
>>> reporting_service.set_context("group2", "dataset2", 0)
- get_context() -> Tuple[str, str, int, List[str] | None, List[str] | None]
Get the current reporting context.
This method retrieves the current reporting context as a tuple containing all context information. The context must be set before calling this method.
- Returns:
- Tuple[str, str, int, Optional[List[str]], Optional[List[str]]]
A tuple containing:
- group_name: The name of the experiment group
- dataset_name: The name of the dataset
- split_index: The index of the current split
- feature_names: The names of the features (or None)
- algorithm_names: The names of the algorithms (or None)
- Raises:
- ValueError
If no context is currently set
Examples
>>> report = ReportingService("reporting")
>>> report.set_context("classification", "iris", 0)
>>> group, dataset, split, features, algorithms = report.get_context()
>>> print(f"Processing {group}/{dataset}, split {split}")
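The set, get, and clear lifecycle of the context, including the documented `ValueError`, can be sketched without the library. `ContextHolder` below is a hypothetical stand-in; only the five-element tuple shape and the error condition come from the documentation above, and the error message text is made up.

```python
from typing import List, Optional, Tuple

ContextTuple = Tuple[str, str, int, Optional[List[str]], Optional[List[str]]]


class ContextHolder:
    """Hypothetical holder mirroring the documented set/get/clear behavior."""

    def __init__(self) -> None:
        self._current: Optional[ContextTuple] = None

    def set_context(
        self,
        group_name: str,
        dataset_name: str,
        split_index: int,
        feature_names: Optional[List[str]] = None,
        algorithm_names: Optional[List[str]] = None,
    ) -> None:
        self._current = (
            group_name, dataset_name, split_index, feature_names, algorithm_names
        )

    def clear_context(self) -> None:
        self._current = None

    def get_context(self) -> ContextTuple:
        if self._current is None:
            # Matches the documented contract: no context set means ValueError.
            raise ValueError("No reporting context is set; call set_context() first.")
        return self._current


holder = ContextHolder()
holder.set_context("classification", "iris", 0)
group, dataset, split, features, algorithms = holder.get_context()

holder.clear_context()
try:
    holder.get_context()
    raised = False
except ValueError:
    raised = True
```

Callers that cannot guarantee a context is set may prefer to catch the `ValueError`, as the last lines do, rather than let it propagate.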
- get_report_data() -> ReportData
Get the complete report data object.
This method creates and returns a ReportData object containing all the collected experiment data, including datasets, experiments, experiment groups, and data managers. This is the final data structure used for HTML report generation.
- Returns:
- report_data.ReportData
The complete report data object containing:
- navbar: Navigation information with version and timestamp
- datasets: Dictionary of all processed datasets
- experiments: Dictionary of all experiments
- experiment_groups: List of experiment groups
- data_managers: Dictionary of data managers
Notes
This method should be called after all data has been added to the reporting service. The returned ReportData object can be used with the ReportRenderer to generate HTML reports.
Examples
>>> reporting_service = ReportingService("reporting")
>>> # Add all data...
>>> report_data = reporting_service.get_report_data()
>>> # Use with ReportRenderer to generate HTML
- set_context(group_name: str, dataset_name: str, split_index: int, feature_names: List[str] | None = None, algorithm_names: List[str] | None = None) -> None
Set the current reporting context.
This method establishes the current reporting context, which includes information about the experiment group, dataset, data split, features, and algorithms being processed. The context is used throughout the reporting pipeline to ensure data is properly associated and organized.
- Parameters:
- group_name : str
The name of the experiment group being processed
- dataset_name : str
The name of the dataset being analyzed
- split_index : int
The index of the current data split (0-based)
- feature_names : Optional[List[str]], default=None
The names of the features in the dataset
- algorithm_names : Optional[List[str]], default=None
The names of the algorithms being evaluated
Notes
The context is used internally to determine where to store and retrieve data from the various caches. It should be set before adding data managers, datasets, or experiments to ensure proper data organization.
Examples
>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_context(
...     group_name="classification",
...     dataset_name="iris",
...     split_index=0,
...     feature_names=["sepal_length", "sepal_width"],
...     algorithm_names=["RandomForest", "SVM"]
... )
- set_evaluator_registry(registry: EvaluatorRegistry) -> None
Set the evaluator registry for this reporting service.
This method configures the evaluator registry that will be used for processing evaluation results and generating report data. The registry is required for converting cached data into TableData and PlotData objects during report generation.
- Parameters:
- registry : EvaluatorRegistry
The evaluator registry instance containing evaluator definitions and methods for processing evaluation results
Notes
The evaluator registry is used to:
- Process cached table data into TableData objects
- Process cached image data into PlotData objects
- Generate appropriate descriptions and metadata for reports
- Handle different types of evaluators (measures, plots, etc.)
Examples
>>> from brisk.evaluation.evaluators.registry import EvaluatorRegistry
>>> reporting_service = ReportingService("reporting")
>>> registry = EvaluatorRegistry()
>>> reporting_service.set_evaluator_registry(registry)
- set_metric_config(metric_config: MetricManager) -> None
Set the metric manager for this reporting service.
This method configures the metric manager that will be used for performance evaluation and metric resolution throughout the reporting process. The metric manager is required for processing evaluation results and determining metric properties.
- Parameters:
- metric_config : metric_config.MetricManager
The metric manager instance containing metric definitions and configuration
Notes
The metric manager is used for resolving metric identifiers, determining whether higher values are better for specific metrics, and accessing metric display names and abbreviations.
Examples
>>> from brisk.evaluation import metric_manager
>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_metric_config(metric_manager)
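Knowing whether higher values are better matters when the best score for a split is selected. The sketch below shows that idea in isolation; `best_algorithm` and the `greater_is_better` flag are hypothetical names, the scores are made-up illustration values, and nothing here reflects the real MetricManager interface.

```python
from typing import Dict, Tuple


def best_algorithm(scores: Dict[str, float], greater_is_better: bool) -> Tuple[str, float]:
    """Return the (algorithm, score) pair that is best for the metric's direction."""
    pick = max if greater_is_better else min
    name = pick(scores, key=scores.__getitem__)
    return name, scores[name]


# Hypothetical scores for one split; the numbers are invented for illustration.
accuracy = {"RandomForest": 0.95, "SVM": 0.93}
mean_absolute_error = {"RandomForest": 1.8, "SVM": 1.2}

best_by_accuracy = best_algorithm(accuracy, greater_is_better=True)
best_by_error = best_algorithm(mean_absolute_error, greater_is_better=False)
```

The same scores dictionary yields a different winner depending on the direction flag, which is why the metric's direction has to be resolved before scores are compared.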
- set_tuning_measure(measure: str) -> None
Set the measure used for hyperparameter tuning.
- Parameters:
- measure : str
The measure used for hyperparameter tuning
- Returns:
- None
- store_plot_svg(image: str, metadata: Dict[str, str]) -> None
Store plot SVG data in the image cache.
This method stores SVG plot data along with its metadata in the internal image cache. The plot is associated with the current reporting context (group, dataset, split) and the method name from the metadata.
- Parameters:
- image : str
The SVG image data as a string
- metadata : Dict[str, str]
The metadata dictionary containing method information and other plot-related metadata
Notes
The plot is stored using a key composed of the current context (group_name, dataset_name, split_id) and the method name from metadata. This ensures plots are properly organized and can be retrieved during report generation.
Examples
>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_context("classification", "iris", 0)
>>> svg_data = "<svg>...</svg>"
>>> metadata = {"method": "brisk_correlation_matrix", "type": "plot"}
>>> reporting_service.store_plot_svg(svg_data, metadata)
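Keying a cache by context plus method name can be sketched as follows. The exact key layout used by the service is not fully specified above, so the four-element key, the `split_0`-style split label, and the `store_plot` helper are all assumptions made for illustration.

```python
from typing import Dict, Tuple

CacheKey = Tuple[str, str, str, str]
image_cache: Dict[CacheKey, Tuple[str, Dict[str, str]]] = {}


def store_plot(context: Tuple[str, str, int], image: str,
               metadata: Dict[str, str]) -> CacheKey:
    group_name, dataset_name, split_index = context
    # Assumption: the split is rendered as a string such as "split_0" in the key.
    key = (group_name, dataset_name, f"split_{split_index}", metadata["method"])
    image_cache[key] = (image, metadata)
    return key


context = ("classification", "iris", 0)
plot_metadata = {"method": "brisk_correlation_matrix", "type": "plot"}
key = store_plot(context, "<svg>...</svg>", plot_metadata)
# Storing the same method under the same context overwrites the earlier entry.
store_plot(context, "<svg>updated</svg>", plot_metadata)
```

Because the key is derived from the context, a plot stored under one group, dataset, and split cannot collide with the same method's plot from another.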
- store_table_data(data: Dict[str, Any], metadata: Dict[str, str]) -> None
Store table data in the table cache using current context.
This method stores table data along with its metadata in the internal table cache. The table is associated with the current reporting context (group, dataset, split) and the method name from the metadata.
- Parameters:
- data : Dict[str, Any]
The table data dictionary containing the actual data to be displayed in the report
- metadata : Dict[str, str]
The metadata dictionary containing method information and other table-related metadata
Notes
The table is stored using a key composed of the current context (group_name, dataset_name, split_id) and the method name from metadata. This ensures tables are properly organized and can be retrieved during report generation.
Examples
>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_context("classification", "iris", 0)
>>> table_data = {"accuracy": 0.95, "precision": 0.92, "recall": 0.88}
>>> metadata = {"method": "brisk_evaluate_model", "is_test": "True"}
>>> reporting_service.store_table_data(table_data, metadata)
- class ReportingContext(group_name: str, dataset_name: str, split_index: int, feature_names: List[str] | None = None, algorithm_names: List[str] | None = None)
Bases: object
Context class for tracking current reporting state and parameters.
This class encapsulates the current reporting context, including information about the experiment group, dataset, data split, features, and algorithms being processed. It provides a convenient way to pass context information throughout the reporting pipeline.
- Attributes:
- group_name : str
The name of the experiment group being processed
- dataset_name : str
The name of the dataset being analyzed
- split_index : int
The index of the current data split (0-based)
- feature_names : Optional[List[str]]
The names of the features in the dataset
- algorithm_names : Optional[List[str]]
The names of the algorithms being evaluated
Notes
This class is used internally by the ReportingService to maintain context state during report generation. It helps ensure that data is properly associated with the correct experiment group, dataset, and split.
Examples
>>> context = ReportingContext(
...     group_name="classification",
...     dataset_name="iris",
...     split_index=0,
...     feature_names=[
...         "sepal_length", "sepal_width", "petal_length", "petal_width"
...     ],
...     algorithm_names=["RandomForest", "SVM", "LogisticRegression"]
... )
>>> print(f"Processing {context.group_name} group")
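A context object with these five attributes maps naturally onto a dataclass. The sketch below is an assumption about shape only: `ContextSketch` is a hypothetical name, and whether the real ReportingContext is implemented as a dataclass is not stated in this reference.

```python
from dataclasses import astuple, dataclass
from typing import List, Optional


@dataclass
class ContextSketch:
    """Hypothetical mirror of the documented ReportingContext attributes."""

    group_name: str
    dataset_name: str
    split_index: int
    feature_names: Optional[List[str]] = None
    algorithm_names: Optional[List[str]] = None


ctx = ContextSketch("classification", "iris", 0, algorithm_names=["RandomForest", "SVM"])
# astuple yields the same five-field ordering that get_context() is documented to return.
as_tuple = astuple(ctx)
```

Keeping the field order identical to the `get_context()` return tuple means the object can be unpacked positionally wherever the tuple form is expected.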