Metadata Service

class MetadataService(name: str)

Bases: BaseService

Metadata generation service for evaluators and experiments.

This service generates metadata for the Brisk package: model evaluation metadata, dataset information, and rerun configuration metadata. It keeps metadata tracking and organization consistent throughout the machine learning pipeline.

The service generates standardized metadata dictionaries that include timestamps, method names, and type-specific information for different evaluation contexts.
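As a rough illustration of how calling code might consume these dictionaries, the sketch below dispatches on the type key. The helper name summarize_metadata is hypothetical and not part of Brisk, and the example dictionary is hand-written with an assumed timestamp format. Only the key names and the type values ("model", "dataset", "rerun_config") are taken from the method descriptions on this page.

```python
from typing import Any, Dict


def summarize_metadata(meta: Dict[str, Any]) -> str:
    """Hypothetical helper (not part of Brisk): one-line summary by type."""
    kind = meta["type"]
    prefix = f"[{meta['timestamp']}] {meta['method']}"
    if kind == "model":
        # "models" maps wrapper names to display names.
        names = ", ".join(meta["models"].values())
        return f"{prefix}: models {names} (is_test={meta['is_test']})"
    if kind == "dataset":
        return f"{prefix}: dataset {meta['dataset']} in group {meta['group']}"
    if kind == "rerun_config":
        return f"{prefix}: rerun configuration"
    raise ValueError(f"Unknown metadata type: {kind!r}")


# Hand-written example dictionary; the timestamp format is an assumption.
example = {
    "timestamp": "2024-01-01 12:00:00",
    "method": "analyze",
    "type": "dataset",
    "dataset": "iris",
    "group": "classification",
}
summary = summarize_metadata(example)
```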

Attributes:
algorithm_config : Optional[algorithm_collection.AlgorithmCollection]

The algorithm configuration used for model metadata generation

Notes

The service requires an algorithm configuration to be set before generating model metadata. This can be done using the set_algorithm_config() method.

Examples

>>> from brisk.services.metadata import MetadataService
>>> from brisk.configuration import AlgorithmCollection
>>> 
>>> # Create metadata service
>>> metadata_service = MetadataService("metadata")
>>> # algorithm_config: an AlgorithmCollection defined elsewhere
>>> metadata_service.set_algorithm_config(algorithm_config)
>>> 
>>> # Generate different types of metadata
>>> # models: a BaseEstimator or a list of them, defined elsewhere
>>> model_meta = metadata_service.get_model(models, "evaluate", is_test=False)
>>> dataset_meta = metadata_service.get_dataset("analyze", "iris", "classification")
>>> rerun_meta = metadata_service.get_rerun("save_config")
get_dataset(method_name: str, dataset_name: str, group_name: str) -> Dict[str, Any]

Generate metadata for a dataset evaluation.

This method creates metadata for dataset-related evaluations, including information about the dataset being analyzed and the group context. This is typically used for dataset analysis and visualization operations.

Parameters:
method_name : str

The name of the calling method that is performing the analysis

dataset_name : str

The name of the dataset being analyzed

group_name : str

The name of the group or experiment group the dataset belongs to

Returns:
Dict[str, Any]

Metadata dictionary containing:

- timestamp: When the analysis was performed
- method: Name of the calling method
- type: "dataset" (indicates this is dataset metadata)
- dataset: Name of the dataset
- group: Name of the group
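To make the returned shape concrete, here is a minimal stand-in that builds a dictionary with the documented keys. It is not Brisk's implementation: the function name sketch_dataset_metadata is hypothetical, and the timestamp format is an assumption, since the page does not specify one.

```python
import datetime
from typing import Any, Dict


def sketch_dataset_metadata(
    method_name: str, dataset_name: str, group_name: str
) -> Dict[str, Any]:
    """Hypothetical stand-in mirroring the documented get_dataset() keys."""
    return {
        # Timestamp format is an assumption; Brisk may format it differently.
        "timestamp": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "method": method_name,
        "type": "dataset",
        "dataset": dataset_name,
        "group": group_name,
    }


meta = sketch_dataset_metadata("analyze_dataset", "iris", "classification")
```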

Examples

>>> metadata_service = MetadataService("metadata")
>>> 
>>> # Generate dataset metadata
>>> dataset_meta = metadata_service.get_dataset(
...     method_name="analyze_dataset",
...     dataset_name="iris",
...     group_name="classification"
... )
>>> 
>>> # For visualization metadata
>>> viz_meta = metadata_service.get_dataset(
...     method_name="create_plots",
...     dataset_name="housing",
...     group_name="regression"
... )
get_model(models: BaseEstimator | List[BaseEstimator], method_name: str, is_test: bool = False) -> Dict[str, Any]

Generate metadata for a model evaluation.

This method creates comprehensive metadata for model evaluations, including information about the models used, evaluation context, and timing information. It extracts algorithm wrapper information from the configured algorithm collection.

Parameters:
models : Union[base.BaseEstimator, List[base.BaseEstimator]]

The model(s) to include in metadata. Can be a single model or a list of models

method_name : str

The name of the calling method that is performing the evaluation

is_test : bool, default=False

Whether the evaluation is being performed on test data

Returns:
Dict[str, Any]

Metadata dictionary containing:

- timestamp: When the evaluation was performed
- method: Name of the calling method
- type: "model" (indicates this is model metadata)
- models: Dictionary mapping wrapper names to display names
- is_test: String representation of whether this is test data
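The sketch below illustrates how such a dictionary could be assembled. It uses stand-in objects rather than Brisk classes, and several details are assumptions: the display_name attribute on the wrapper, subscript lookup on the algorithm configuration, the timestamp format, and the function name sketch_model_metadata itself. Only the returned keys and the wrapper_name attribute come from this page.

```python
import datetime
from types import SimpleNamespace
from typing import Any, Dict, List, Mapping, Union


def sketch_model_metadata(
    models: Union[Any, List[Any]],
    method_name: str,
    algorithm_config: Mapping[str, Any],
    is_test: bool = False,
) -> Dict[str, Any]:
    """Hypothetical stand-in mirroring the documented get_model() keys."""
    if not isinstance(models, list):
        models = [models]  # accept a single model or a list of models
    return {
        # Timestamp format is an assumption.
        "timestamp": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "method": method_name,
        "type": "model",
        # display_name and subscript lookup are assumed, not confirmed.
        "models": {
            m.wrapper_name: algorithm_config[m.wrapper_name].display_name
            for m in models
        },
        "is_test": str(is_test),
    }


# Stand-ins for an AlgorithmCollection and for estimators with wrapper_name.
config = {
    "ridge": SimpleNamespace(display_name="Ridge Regression"),
    "lasso": SimpleNamespace(display_name="LASSO"),
}
ridge = SimpleNamespace(wrapper_name="ridge")
lasso = SimpleNamespace(wrapper_name="lasso")

meta = sketch_model_metadata([ridge, lasso], "compare_models", config, is_test=True)
```

Note that is_test is stored as a string, so consumers comparing against a boolean would need to convert it first.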

Raises:
AttributeError

If algorithm_config is not set or a model has no wrapper_name attribute

KeyError

If model wrapper names are not found in algorithm_config
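Callers that cannot guarantee a configured service may want to handle both exceptions explicitly. The wrapper below is one possible pattern, not part of Brisk: try_get_model_metadata is a hypothetical name, and stub_get_model only imitates the documented failure modes so the example runs on its own. In practice the bound method metadata_service.get_model would be passed instead.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, Optional, Tuple


def try_get_model_metadata(
    get_model: Callable[..., Dict[str, Any]],
    models: Any,
    method_name: str,
    is_test: bool = False,
) -> Tuple[Optional[Dict[str, Any]], Optional[str]]:
    """Hypothetical wrapper: return (metadata, None) or (None, reason)."""
    try:
        return get_model(models, method_name, is_test=is_test), None
    except AttributeError as exc:
        return None, f"missing algorithm_config or wrapper_name: {exc}"
    except KeyError as exc:
        return None, f"wrapper name not in algorithm_config: {exc}"


def stub_get_model(models, method_name, is_test=False):
    """Stub imitating the documented errors; for demonstration only."""
    known = {"ridge": "Ridge Regression"}
    return {"models": {models.wrapper_name: known[models.wrapper_name]}}


ok, _ = try_get_model_metadata(
    stub_get_model, SimpleNamespace(wrapper_name="ridge"), "evaluate"
)
# A plain object has no wrapper_name, which triggers AttributeError.
_, attr_reason = try_get_model_metadata(stub_get_model, object(), "evaluate")
# An unknown wrapper name triggers KeyError.
_, key_reason = try_get_model_metadata(
    stub_get_model, SimpleNamespace(wrapper_name="svm"), "evaluate"
)
```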

Examples

>>> metadata_service = MetadataService("metadata")
>>> metadata_service.set_algorithm_config(algorithm_config)
>>> 
>>> # Single model
>>> model_meta = metadata_service.get_model(
...     models=my_model,
...     method_name="evaluate_model",
...     is_test=False
... )
>>> 
>>> # Multiple models
>>> models_meta = metadata_service.get_model(
...     models=[model1, model2],
...     method_name="compare_models",
...     is_test=True
... )
get_rerun(method_name: str) -> Dict[str, Any]

Generate metadata for rerun configuration files.

This method creates metadata for rerun configuration operations, which are used to track and manage experiment reruns and configuration persistence.

Parameters:
method_name : str

The name of the calling method that is handling rerun configuration

Returns:
Dict[str, Any]

Metadata dictionary containing:

- timestamp: When the rerun operation was performed
- method: Name of the calling method
- type: "rerun_config" (indicates this is rerun metadata)
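One plausible use of this dictionary is to store it next to a saved configuration. The sketch below shows that idea with the standard json module. Everything outside the three documented keys is an assumption: the function name sketch_rerun_metadata, the timestamp format, and the payload layout with its "metadata" and "config" keys are illustrative, not Brisk's actual file format.

```python
import datetime
import json
from typing import Any, Dict


def sketch_rerun_metadata(method_name: str) -> Dict[str, Any]:
    """Hypothetical stand-in mirroring the documented get_rerun() keys."""
    return {
        # Timestamp format is an assumption.
        "timestamp": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "method": method_name,
        "type": "rerun_config",
    }


# Illustrative payload layout; not Brisk's actual rerun file format.
payload = {
    "metadata": sketch_rerun_metadata("save_rerun_config"),
    "config": {"random_seed": 42},
}

# All values are plain strings and numbers, so the payload round-trips as JSON.
serialized = json.dumps(payload, indent=2)
restored = json.loads(serialized)
```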

Examples

>>> metadata_service = MetadataService("metadata")
>>> 
>>> # Generate rerun metadata
>>> rerun_meta = metadata_service.get_rerun("save_rerun_config")
>>> 
>>> # For loading rerun configuration
>>> load_meta = metadata_service.get_rerun("load_rerun_config")
set_algorithm_config(algorithm_config: AlgorithmCollection) -> None

Set the algorithm configuration for model metadata generation.

This method sets the algorithm configuration that is used when generating model metadata. The configuration is required to extract algorithm wrapper information and display names for models.

Parameters:
algorithm_config : algorithm_collection.AlgorithmCollection

The algorithm configuration containing wrapper information

Notes

This method must be called before using get_model() to generate model metadata. The algorithm configuration is not required for dataset or rerun metadata generation.
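Because of this ordering requirement, calling code may prefer to fail early with a clear message rather than hit an AttributeError inside get_model(). The guard below is a hypothetical helper, not part of Brisk. It assumes the algorithm_config attribute is None until set_algorithm_config() is called, which the Optional type suggests but this page does not state outright.

```python
from types import SimpleNamespace
from typing import Any


def require_algorithm_config(service: Any) -> None:
    """Hypothetical guard: raise a clear error if no configuration is set."""
    # Assumes algorithm_config is None (or absent) until it has been set.
    if getattr(service, "algorithm_config", None) is None:
        raise RuntimeError(
            "Call set_algorithm_config() before requesting model metadata."
        )


# Stand-in objects; a real MetadataService would be passed instead.
unconfigured = SimpleNamespace(algorithm_config=None)
configured = SimpleNamespace(algorithm_config={"ridge": "Ridge Regression"})

require_algorithm_config(configured)  # passes silently

try:
    require_algorithm_config(unconfigured)
    guard_raised = False
except RuntimeError:
    guard_raised = True
```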

Examples

>>> from brisk.configuration import AlgorithmCollection
>>> metadata_service = MetadataService("metadata")
>>> 
>>> # Set algorithm configuration (algorithm_config is an
>>> # AlgorithmCollection defined elsewhere)
>>> metadata_service.set_algorithm_config(algorithm_config)
>>> 
>>> # Now model metadata can be generated (models is defined elsewhere)
>>> model_meta = metadata_service.get_model(models, "evaluate")