Metadata Service
- class MetadataService(name: str)
Bases: BaseService
Metadata generation service for evaluators and experiments.
This service generates metadata for the Brisk package, covering model evaluations, dataset information, and rerun configuration. It keeps metadata tracking and organization consistent throughout the machine learning pipeline.
The service generates standardized metadata dictionaries that include timestamps, method names, and type-specific information for different evaluation contexts.
- Attributes:
- algorithm_config : Optional[algorithm_collection.AlgorithmCollection]
The algorithm configuration used for model metadata generation.
Notes
The service requires an algorithm configuration to be set before generating model metadata. This can be done using the set_algorithm_config() method.
Examples
>>> from brisk.services.metadata import MetadataService
>>> from brisk.configuration import AlgorithmCollection
>>>
>>> # Create metadata service
>>> metadata_service = MetadataService("metadata")
>>> metadata_service.set_algorithm_config(algorithm_config)
>>>
>>> # Generate different types of metadata
>>> model_meta = metadata_service.get_model(models, "evaluate", is_test=False)
>>> dataset_meta = metadata_service.get_dataset("analyze", "iris", "classification")
>>> rerun_meta = metadata_service.get_rerun("save_config")
- get_dataset(method_name: str, dataset_name: str, group_name: str) -> Dict[str, Any]
Generate metadata for a dataset evaluation.
This method creates metadata for dataset-related evaluations, including information about the dataset being analyzed and the group context. This is typically used for dataset analysis and visualization operations.
- Parameters:
- method_name : str
The name of the calling method that is performing the analysis
- dataset_name : str
The name of the dataset being analyzed
- group_name : str
The name of the group or experiment group the dataset belongs to
- Returns:
- Dict[str, Any]
Metadata dictionary containing:
- timestamp: When the analysis was performed
- method: Name of the calling method
- type: "dataset" (indicates this is dataset metadata)
- dataset: Name of the dataset
- group: Name of the group
Examples
>>> metadata_service = MetadataService("metadata")
>>>
>>> # Generate dataset metadata
>>> dataset_meta = metadata_service.get_dataset(
...     method_name="analyze_dataset",
...     dataset_name="iris",
...     group_name="classification"
... )
>>>
>>> # For visualization metadata
>>> viz_meta = metadata_service.get_dataset(
...     method_name="create_plots",
...     dataset_name="housing",
...     group_name="regression"
... )
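The dictionary that get_dataset() is documented to return can be pictured with a small stand-in function. This is a sketch of the documented shape only, not Brisk's implementation: `sketch_dataset_metadata` is a hypothetical name, and the ISO 8601 timestamp is an assumption, since the format is not specified above.

```python
from datetime import datetime
from typing import Any, Dict


def sketch_dataset_metadata(
    method_name: str, dataset_name: str, group_name: str
) -> Dict[str, Any]:
    """Hypothetical stand-in that mirrors the documented keys."""
    return {
        # Assumption: ISO 8601 string; the real timestamp format may differ.
        "timestamp": datetime.now().isoformat(timespec="seconds"),
        "method": method_name,
        "type": "dataset",  # fixed marker for dataset metadata
        "dataset": dataset_name,
        "group": group_name,
    }


example_meta = sketch_dataset_metadata(
    "analyze_dataset", "iris", "classification"
)
```

Only the timestamp varies between calls with the same arguments; the other four keys are determined entirely by the inputs.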
- get_model(models: BaseEstimator | List[BaseEstimator], method_name: str, is_test: bool = False) -> Dict[str, Any]
Generate metadata for a model evaluation.
This method creates comprehensive metadata for model evaluations, including information about the models used, evaluation context, and timing information. It extracts algorithm wrapper information from the configured algorithm collection.
- Parameters:
- models : Union[base.BaseEstimator, List[base.BaseEstimator]]
The model(s) to include in metadata. Can be a single model or a list of models
- method_name : str
The name of the calling method that is performing the evaluation
- is_test : bool, default=False
Whether the evaluation is being performed on test data
- Returns:
- Dict[str, Any]
Metadata dictionary containing:
- timestamp: When the evaluation was performed
- method: Name of the calling method
- type: "model" (indicates this is model metadata)
- models: Dictionary mapping wrapper names to display names
- is_test: String representation of whether this is test data
- Raises:
- AttributeError
If algorithm_config is not set or models don’t have wrapper_name
- KeyError
If model wrapper names are not found in algorithm_config
Examples
>>> metadata_service = MetadataService("metadata")
>>> metadata_service.set_algorithm_config(algorithm_config)
>>>
>>> # Single model
>>> model_meta = metadata_service.get_model(
...     models=my_model,
...     method_name="evaluate_model",
...     is_test=False
... )
>>>
>>> # Multiple models
>>> models_meta = metadata_service.get_model(
...     models=[model1, model2],
...     method_name="compare_models",
...     is_test=True
... )
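To make the get_model() return value and its two documented exceptions concrete, here is a minimal sketch built from stand-in classes. Everything in it is hypothetical: `FakeWrapper`, `FakeModel`, and `sketch_model_metadata` are illustrative names, a plain dict stands in for the AlgorithmCollection, and the `display_name` attribute and ISO 8601 timestamp are assumptions. It is not Brisk's implementation.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional, Union


class FakeWrapper:
    """Hypothetical config entry; `display_name` is an assumed attribute."""

    def __init__(self, display_name: str) -> None:
        self.display_name = display_name


class FakeModel:
    """Hypothetical model carrying the `wrapper_name` the docs mention."""

    def __init__(self, wrapper_name: str) -> None:
        self.wrapper_name = wrapper_name


def sketch_model_metadata(
    models: Union[FakeModel, List[FakeModel]],
    method_name: str,
    config: Optional[Dict[str, FakeWrapper]],
    is_test: bool = False,
) -> Dict[str, Any]:
    if config is None:
        # Mirrors the documented AttributeError for a missing configuration.
        raise AttributeError("algorithm configuration has not been set")
    if not isinstance(models, list):
        models = [models]  # normalize a single model to a list
    names: Dict[str, str] = {}
    for model in models:
        wrapper_name = model.wrapper_name  # AttributeError if absent
        # KeyError if the wrapper name is unknown to the configuration.
        names[wrapper_name] = config[wrapper_name].display_name
    return {
        # Assumption: ISO 8601 string; the real timestamp format may differ.
        "timestamp": datetime.now().isoformat(timespec="seconds"),
        "method": method_name,
        "type": "model",
        "models": names,
        "is_test": str(is_test),  # documented as a string representation
    }


config = {"ridge": FakeWrapper("Ridge Regression")}
meta = sketch_model_metadata(
    FakeModel("ridge"), "evaluate_model", config, is_test=True
)
```

Normalizing a single model to a one-element list lets both documented input forms share one code path, and storing `is_test` as a string matches the return description above.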
- get_rerun(method_name: str) -> Dict[str, Any]
Generate metadata for rerun configuration files.
This method creates metadata for rerun configuration operations, which are used to track and manage experiment reruns and configuration persistence.
- Parameters:
- method_name : str
The name of the calling method that is handling rerun configuration
- Returns:
- Dict[str, Any]
Metadata dictionary containing:
- timestamp: When the rerun operation was performed
- method: Name of the calling method
- type: "rerun_config" (indicates this is rerun metadata)
Examples
>>> metadata_service = MetadataService("metadata")
>>>
>>> # Generate rerun metadata
>>> rerun_meta = metadata_service.get_rerun("save_rerun_config")
>>>
>>> # For loading rerun configuration
>>> load_meta = metadata_service.get_rerun("load_rerun_config")
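Because get_model(), get_dataset(), and get_rerun() each tag their output with a distinct type value, code that reads metadata back can branch on that field. The sketch below shows one way a consumer might do so; `describe_metadata` is a hypothetical helper, not part of Brisk, and the timestamp in the sample dictionary is a placeholder.

```python
from typing import Any, Dict


def describe_metadata(meta: Dict[str, Any]) -> str:
    """Hypothetical consumer that branches on the documented `type` values."""
    kind = meta.get("type")
    if kind == "model":
        count = len(meta["models"])
        return f"model evaluation by {meta['method']} on {count} model(s)"
    if kind == "dataset":
        return f"dataset analysis of {meta['dataset']} in group {meta['group']}"
    if kind == "rerun_config":
        return f"rerun configuration handled by {meta['method']}"
    raise ValueError(f"unknown metadata type: {kind!r}")


rerun_example = {
    "timestamp": "2024-01-01T00:00:00",  # placeholder value for illustration
    "method": "save_rerun_config",
    "type": "rerun_config",
}
summary = describe_metadata(rerun_example)
```

Raising on an unrecognized type, rather than returning a default, surfaces malformed or future metadata kinds early.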
- set_algorithm_config(algorithm_config: AlgorithmCollection) -> None
Set the algorithm configuration for model metadata generation.
This method sets the algorithm configuration that is used when generating model metadata. The configuration is required to extract algorithm wrapper information and display names for models.
- Parameters:
- algorithm_config : algorithm_collection.AlgorithmCollection
The algorithm configuration containing wrapper information
Notes
This method must be called before using get_model() to generate model metadata. The algorithm configuration is not required for dataset or rerun metadata generation.
Examples
>>> from brisk.configuration import AlgorithmCollection
>>> metadata_service = MetadataService("metadata")
>>>
>>> # Set algorithm configuration
>>> metadata_service.set_algorithm_config(algorithm_config)
>>>
>>> # Now model metadata can be generated
>>> model_meta = metadata_service.get_model(models, "evaluate")