Reporting Service

class ReportingService(name: str)

Bases: BaseService

Main service class for creating and managing comprehensive experiment reports.

This service handles the creation and management of experiment reports, including dataset information, model performance metrics, hyperparameter tuning results, and visualization data. It provides caching mechanisms for efficient data storage and retrieval, and supports both individual experiment tracking and group-level analysis.

The service maintains internal caches for images, tables, and tuned parameters, and processes this data into structured ReportData objects suitable for HTML report generation.

Attributes:

navbar (report_data.Navbar): Navigation bar information including version and timestamp
datasets (Dict[str, report_data.Dataset]): Dictionary mapping dataset IDs to Dataset objects
experiments (Dict[str, report_data.Experiment]): Dictionary mapping experiment IDs to Experiment objects
experiment_groups (List[report_data.ExperimentGroup]): List of experiment group objects
data_managers (Dict[str, report_data.DataManager]): Dictionary mapping group names to DataManager objects
metric_manager (Optional[metric_config.MetricManager]): The metric manager for performance evaluation
registry (Optional[EvaluatorRegistry]): The evaluator registry for processing evaluation results
group_to_experiment (Dict[str, List[str]]): Dictionary mapping group names to experiment IDs
_current_context (Optional[ReportingContext]): The current reporting context
_image_cache (Dict[Tuple[str, str, str], Tuple[str, Dict[str, str]]]): Cache for storing plot images and metadata
_table_cache (Dict[Tuple[str, str, str, str], Tuple[Dict[str, Any], Dict[str, str]]]): Cache for storing table data and metadata
_cached_tuned_params (Dict[str, Any]): Cache for storing hyperparameter tuning results
test_scores (Dict[str, Dict[str, Dict[str, Dict[str, List[str]]]]]): Nested dictionary storing test scores by group, dataset, and split
best_score_by_split (Dict[str, Dict[str, Dict[str, Tuple[str, str, str, str]]]]): Nested dictionary storing best scores by group, dataset, and split
tuning_metric (Optional[Tuple[str, str]]): The tuning metric abbreviation and display name

Notes

The service uses internal caches to efficiently store and retrieve data during report generation. Caches are cleared when new data is added to ensure consistency. The service requires both metric manager and evaluator registry to be set before processing evaluation results.

Examples

>>> from brisk.data import DataManager, DataSplits
>>> from brisk.evaluation import metric_manager
>>> from brisk.evaluation.evaluators.registry import EvaluatorRegistry
>>> from brisk.services.reporting import ReportingService
>>>
>>> # Create and configure reporting service
>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_metric_config(metric_manager)
>>> reporting_service.set_evaluator_registry(EvaluatorRegistry())
>>>
>>> # Set context and add data
>>> reporting_service.set_context("classification", "iris", 0)
>>> data_manager = DataManager(test_size=0.2, n_splits=5)
>>> data_splits = DataSplits.load_from_file("iris.csv")
>>> reporting_service.add_data_manager("classification", data_manager)
>>> reporting_service.add_dataset("classification", data_splits)
>>>
>>> # Generate final report
>>> report_data = reporting_service.get_report_data()
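
The cache-clearing behaviour described in the Notes can be pictured with a small stand-alone sketch. It does not import brisk: the `TinyReportStore` class and its method names are hypothetical, and the real service may organise and invalidate its caches differently.

```python
from typing import Any, Dict, Tuple


class TinyReportStore:
    """Hypothetical stand-in for a service that caches results between additions."""

    def __init__(self) -> None:
        self.datasets: Dict[str, Any] = {}
        self._image_cache: Dict[Tuple[str, str, str], str] = {}
        self._table_cache: Dict[Tuple[str, str, str, str], Dict[str, Any]] = {}

    def _clear_caches(self) -> None:
        # Drop everything cached so far so stale entries cannot leak
        # into whatever is processed next.
        self._image_cache.clear()
        self._table_cache.clear()

    def add_dataset(self, dataset_id: str, summary: Any) -> None:
        # Record the new data first, then invalidate the caches.
        self.datasets[dataset_id] = summary
        self._clear_caches()


store = TinyReportStore()
store._table_cache[("group1", "iris", "split_0", "evaluate")] = {"accuracy": 0.95}
store.add_dataset("iris", {"rows": 150})
print(len(store._table_cache))  # 0
```

Clearing on every addition trades some recomputation for the guarantee that cached entries always belong to the data currently being reported.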

add_data_manager(group_name: str, data_manager: DataManager) → None

Add a DataManager instance to the report.

This method converts a DataManager instance into a report_data.DataManager object and stores it in the report. The DataManager contains information about data splitting configuration, including test size, number of splits, split method, and other data management parameters.

Parameters:

group_name (str): The name of the experiment group this data manager belongs to
data_manager (DataManager): The DataManager instance containing data splitting configuration

Notes

This method clears all internal caches after adding the data manager to ensure data consistency. The data manager information is used in the final report to document the data splitting methodology.

Examples

>>> from brisk.data import DataManager
>>> reporting_service = ReportingService("reporting")
>>> data_manager = DataManager(test_size=0.2, n_splits=5)
>>> reporting_service.add_data_manager("classification", data_manager)

add_dataset(group_name: str, data_splits: DataSplits) → None

Add a dataset to the report.

This method processes a DataSplits instance and creates a comprehensive dataset report including split information, target statistics, correlation matrices, and feature distributions. It analyzes the data to determine if it’s categorical or continuous and generates appropriate statistics.

Parameters:

group_name (str): The name of the experiment group this dataset belongs to
data_splits (DataSplits): The DataSplits instance containing the dataset and split information

Notes

This method performs extensive data analysis, including:
- Split size calculations (total, train, and test observations)
- Target variable statistics (categorical: proportions, entropy; continuous: mean, std, min, max)
- Correlation matrix generation for each split
- Feature distribution analysis and visualization
- Categorical vs. continuous data detection (based on unique value ratio)

Examples

>>> from brisk.data import DataSplits
>>> reporting_service = ReportingService("reporting")
>>> data_splits = DataSplits.load_from_file("iris.csv")
>>> reporting_service.add_dataset("classification", data_splits)
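
As a rough illustration of the target analysis listed in the Notes, the sketch below classifies a column by its unique value ratio and then computes the matching statistics. It uses only the standard library; the function names and the 0.05 threshold are assumptions made for this example, not values taken from brisk.

```python
import math
import statistics
from collections import Counter
from typing import Dict, List, Union

Number = Union[int, float]


def is_categorical(values: List[Number], max_unique_ratio: float = 0.05) -> bool:
    """Treat a column as categorical when few distinct values occur.

    The 0.05 threshold is an assumption chosen for illustration.
    """
    return len(set(values)) / len(values) <= max_unique_ratio


def target_stats(values: List[Number]) -> Dict[str, object]:
    """Summarise a target column with statistics suited to its type."""
    if is_categorical(values):
        counts = Counter(values)
        proportions = {label: n / len(values) for label, n in counts.items()}
        # Shannon entropy in bits over the class proportions.
        entropy = -sum(p * math.log2(p) for p in proportions.values())
        return {"type": "categorical", "proportions": proportions, "entropy": entropy}
    return {
        "type": "continuous",
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


labels = [0, 1] * 50                     # 2 unique values in 100 rows
prices = [float(i) for i in range(100)]  # 100 unique values in 100 rows

print(target_stats(labels)["type"])  # categorical
print(target_stats(prices)["type"])  # continuous
```

A ratio-based rule is cheap and needs no type annotations on the data, but it can misclassify small datasets, which is why the threshold is exposed as a parameter here.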

add_experiment(algorithms: Dict) → None

Add an experiment to the report.

Parameters:

algorithms (Dict): The algorithms to add to the experiment

Returns:

None

add_experiment_groups(groups: List) → None

Add experiment groups to the report.

Parameters:

groups (List): The experiment groups to add

Returns:

None

cache_tuned_params(tuned_params: Dict[str, Any]) → None

Cache the tuned parameters from hyperparameter tuning.

Parameters:

tuned_params (Dict[str, Any]): The tuned parameters

Returns:

None

clear_context() → None

Clear the current reporting context.

This method removes the current reporting context, effectively resetting the context state. This is useful when switching between different experiment groups or datasets.

Notes

After clearing the context, a new context must be set using set_context() before adding new data to the report.

Examples

>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_context("group1", "dataset1", 0)
>>> # Process data for group1/dataset1
>>> reporting_service.clear_context()
>>> reporting_service.set_context("group2", "dataset2", 0)

get_context() → Tuple[str, str, int, List[str] | None, List[str] | None]

Get the current reporting context.

This method retrieves the current reporting context as a tuple containing all context information. The context must be set before calling this method.

Returns:

Tuple[str, str, int, Optional[List[str]], Optional[List[str]]]: A tuple containing:
- group_name: The name of the experiment group
- dataset_name: The name of the dataset
- split_index: The index of the current split
- feature_names: The names of the features (or None)
- algorithm_names: The names of the algorithms (or None)

Raises:

ValueError: If no context is currently set

Examples

>>> report = ReportingService("reporting")
>>> report.set_context("classification", "iris", 0)
>>> group, dataset, split, features, algorithms = report.get_context()
>>> print(f"Processing {group}/{dataset}, split {split}")
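
The set, get, and clear lifecycle, including the ValueError raised when no context is set, can be sketched without brisk as follows. `Context` and `ContextHolder` are hypothetical stand-ins written for this example; only the method names and the tuple layout follow the documentation above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Context:
    """Hypothetical stand-in for ReportingContext."""
    group_name: str
    dataset_name: str
    split_index: int
    feature_names: Optional[List[str]] = None
    algorithm_names: Optional[List[str]] = None


class ContextHolder:
    """Hypothetical holder mirroring set_context / get_context / clear_context."""

    def __init__(self) -> None:
        self._current: Optional[Context] = None

    def set_context(self, group_name: str, dataset_name: str, split_index: int,
                    feature_names: Optional[List[str]] = None,
                    algorithm_names: Optional[List[str]] = None) -> None:
        self._current = Context(group_name, dataset_name, split_index,
                                feature_names, algorithm_names)

    def clear_context(self) -> None:
        self._current = None

    def get_context(self) -> Tuple[str, str, int,
                                   Optional[List[str]], Optional[List[str]]]:
        # Fail loudly rather than return a half-empty tuple.
        if self._current is None:
            raise ValueError("No reporting context is set; call set_context() first.")
        c = self._current
        return (c.group_name, c.dataset_name, c.split_index,
                c.feature_names, c.algorithm_names)


holder = ContextHolder()
holder.set_context("classification", "iris", 0)
group, dataset, split, features, algorithms = holder.get_context()

holder.clear_context()
try:
    holder.get_context()
except ValueError as err:
    message = str(err)
```

Raising instead of returning None keeps callers from silently storing data under an undefined group or dataset.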

get_report_data() → ReportData

Get the complete report data object.

This method creates and returns a ReportData object containing all the collected experiment data, including datasets, experiments, experiment groups, and data managers. This is the final data structure used for HTML report generation.

Returns:

report_data.ReportData: The complete report data object containing:
- navbar: Navigation information with version and timestamp
- datasets: Dictionary of all processed datasets
- experiments: Dictionary of all experiments
- experiment_groups: List of experiment groups
- data_managers: Dictionary of data managers

Notes

This method should be called after all data has been added to the reporting service. The returned ReportData object can be used with the ReportRenderer to generate HTML reports.

Examples

>>> reporting_service = ReportingService("reporting")
>>> # Add all data...
>>> report_data = reporting_service.get_report_data()
>>> # Use with ReportRenderer to generate HTML

set_context(group_name: str, dataset_name: str, split_index: int, feature_names: List[str] | None = None, algorithm_names: List[str] | None = None) → None

Set the current reporting context.

This method establishes the current reporting context, which includes information about the experiment group, dataset, data split, features, and algorithms being processed. The context is used throughout the reporting pipeline to ensure data is properly associated and organized.

Parameters:

group_name (str): The name of the experiment group being processed
dataset_name (str): The name of the dataset being analyzed
split_index (int): The index of the current data split (0-based)
feature_names (Optional[List[str]], default=None): The names of the features in the dataset
algorithm_names (Optional[List[str]], default=None): The names of the algorithms being evaluated

Notes

The context is used internally to determine where to store and retrieve data from the various caches. It should be set before adding data managers, datasets, or experiments to ensure proper data organization.

Examples

>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_context(
...     group_name="classification",
...     dataset_name="iris",
...     split_index=0,
...     feature_names=["sepal_length", "sepal_width"],
...     algorithm_names=["RandomForest", "SVM"]
... )

set_evaluator_registry(registry: EvaluatorRegistry) → None

Set the evaluator registry for this reporting service.

This method configures the evaluator registry that will be used for processing evaluation results and generating report data. The registry is required for converting cached data into TableData and PlotData objects during report generation.

Parameters:

registry (EvaluatorRegistry): The evaluator registry instance containing evaluator definitions and methods for processing evaluation results

Notes

The evaluator registry is used to:
- Process cached table data into TableData objects
- Process cached image data into PlotData objects
- Generate appropriate descriptions and metadata for reports
- Handle different types of evaluators (measures, plots, etc.)

Examples

>>> from brisk.evaluation.evaluators.registry import EvaluatorRegistry
>>> reporting_service = ReportingService("reporting")
>>> registry = EvaluatorRegistry()
>>> reporting_service.set_evaluator_registry(registry)

set_metric_config(metric_config: MetricManager) → None

Set the metric manager for this reporting service.

This method configures the metric manager that will be used for performance evaluation and metric resolution throughout the reporting process. The metric manager is required for processing evaluation results and determining metric properties.

Parameters:

metric_config (metric_config.MetricManager): The metric manager instance containing metric definitions and configuration

Notes

The metric manager is used for resolving metric identifiers, determining whether higher values are better for specific metrics, and accessing metric display names and abbreviations.

Examples

>>> from brisk.evaluation import metric_manager
>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_metric_config(metric_manager)
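
To show why a report needs to know whether higher values are better for a metric, here is a minimal sketch of picking the best algorithm per metric. The `HIGHER_IS_BETTER` table and the `best_algorithm` function are hypothetical; in brisk this information comes from the metric manager, whose actual interface is not shown on this page.

```python
from typing import Dict

# Hypothetical lookup standing in for the metric manager.
HIGHER_IS_BETTER: Dict[str, bool] = {
    "accuracy": True,
    "mean_absolute_error": False,
}


def best_algorithm(scores: Dict[str, float], metric: str) -> str:
    """Return the algorithm whose score is best for the given metric."""
    # Maximise for score-like metrics, minimise for error-like metrics.
    pick = max if HIGHER_IS_BETTER[metric] else min
    return pick(scores, key=scores.get)


accuracy_scores = {"RandomForest": 0.95, "SVM": 0.91}
mae_scores = {"RandomForest": 1.8, "SVM": 1.2}

print(best_algorithm(accuracy_scores, "accuracy"))        # RandomForest
print(best_algorithm(mae_scores, "mean_absolute_error"))  # SVM
```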

set_tuning_measure(measure: str) → None

Set the measure used for hyperparameter tuning.

Parameters:

measure (str): The measure used for hyperparameter tuning

Returns:

None

store_plot_svg(image: str, metadata: Dict[str, str]) → None

Store plot SVG data in the image cache.

This method stores SVG plot data along with its metadata in the internal image cache. The plot is associated with the current reporting context (group, dataset, split) and the method name from the metadata.

Parameters:

image (str): The SVG image data as a string
metadata (Dict[str, str]): The metadata dictionary containing method information and other plot-related metadata

Notes

The plot is stored using a key composed of the current context (group_name, dataset_name, split_id) and the method name from metadata. This ensures plots are properly organized and can be retrieved during report generation.

Examples

>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_context("classification", "iris", 0)
>>> svg_data = "<svg>...</svg>"
>>> metadata = {"method": "brisk_correlation_matrix", "type": "plot"}
>>> reporting_service.store_plot_svg(svg_data, metadata)

store_table_data(data: Dict[str, Any], metadata: Dict[str, str]) → None

Store table data in the table cache using current context.

This method stores table data along with its metadata in the internal table cache. The table is associated with the current reporting context (group, dataset, split) and the method name from the metadata.

Parameters:

data (Dict[str, Any]): The table data dictionary containing the actual data to be displayed in the report
metadata (Dict[str, str]): The metadata dictionary containing method information and other table-related metadata

Notes

The table is stored using a key composed of the current context (group_name, dataset_name, split_id) and the method name from metadata. This ensures tables are properly organized and can be retrieved during report generation.

Examples

>>> reporting_service = ReportingService("reporting")
>>> reporting_service.set_context("classification", "iris", 0)
>>> table_data = {"accuracy": 0.95, "precision": 0.92, "recall": 0.88}
>>> metadata = {"method": "brisk_evaluate_model", "is_test": "True"}
>>> reporting_service.store_table_data(table_data, metadata)
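
The key layout described in the Notes (context plus method name) might look like the sketch below. It is a stand-alone approximation: `store_table` and the module-level `table_cache` are hypothetical, and the "split_<n>" format used for the split id is an assumption rather than brisk's documented behaviour.

```python
from typing import Any, Dict, Tuple

TableKey = Tuple[str, str, str, str]
table_cache: Dict[TableKey, Tuple[Dict[str, Any], Dict[str, str]]] = {}


def store_table(context: Tuple[str, str, int], data: Dict[str, Any],
                metadata: Dict[str, str]) -> TableKey:
    """Store a table under (group, dataset, split id, method name)."""
    group_name, dataset_name, split_index = context
    # Assumed split id format; the real service may derive it differently.
    key = (group_name, dataset_name, f"split_{split_index}", metadata["method"])
    table_cache[key] = (data, metadata)
    return key


key = store_table(
    ("classification", "iris", 0),
    {"accuracy": 0.95},
    {"method": "brisk_evaluate_model", "is_test": "True"},
)
print(key)  # ('classification', 'iris', 'split_0', 'brisk_evaluate_model')
```

Because the method name is part of the key, two evaluators run in the same context do not overwrite each other, while rerunning the same evaluator replaces its earlier entry.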

class ReportingContext(group_name: str, dataset_name: str, split_index: int, feature_names: List[str] | None = None, algorithm_names: List[str] | None = None)

Bases: object

Context class for tracking current reporting state and parameters.

This class encapsulates the current reporting context, including information about the experiment group, dataset, data split, features, and algorithms being processed. It provides a convenient way to pass context information throughout the reporting pipeline.

Attributes:

group_name (str): The name of the experiment group being processed
dataset_name (str): The name of the dataset being analyzed
split_index (int): The index of the current data split (0-based)
feature_names (Optional[List[str]]): The names of the features in the dataset
algorithm_names (Optional[List[str]]): The names of the algorithms being evaluated

Notes

This class is used internally by the ReportingService to maintain context state during report generation. It helps ensure that data is properly associated with the correct experiment group, dataset, and split.

Examples

>>> context = ReportingContext(
...     group_name="classification",
...     dataset_name="iris",
...     split_index=0,
...     feature_names=[
...         "sepal_length", "sepal_width", "petal_length", "petal_width"
...     ],
...     algorithm_names=["RandomForest", "SVM", "LogisticRegression"]
... )
>>> print(f"Processing {context.group_name} group")
