Report Data Models
- class ReportData(*, navbar: Navbar, datasets: Dict[str, Dataset] = <factory>, experiments: Dict[str, Experiment] = <factory>, experiment_groups: List[ExperimentGroup] = <factory>, data_managers: Dict[str, DataManager] = <factory>)
Bases: RoundedModel
Represents the entire machine learning report.
This is the root model that contains all data for a complete machine learning report, including navigation information, datasets, experiments, and data managers.
- Attributes:
- navbar : Navbar
  Navigation bar data with version and timestamp information
- datasets : Dict[str, Dataset]
  Map of dataset IDs to Dataset instances
- experiments : Dict[str, Experiment]
  Map of experiment IDs to Experiment instances
- experiment_groups : List[ExperimentGroup]
  List of experiment groups for organizing related experiments
- data_managers : Dict[str, DataManager]
  Map of data manager IDs to DataManager instances
Examples
>>> report = ReportData(
...     navbar=Navbar(brisk_version="1.0.0", timestamp="2024-01-15"),
...     datasets={"dataset_1": Dataset(...)},
...     experiments={"exp_1": Experiment(...)},
...     experiment_groups=[ExperimentGroup(...)],
...     data_managers={"dm_1": DataManager(...)}
... )
- data_managers: Dict[str, DataManager]
- experiment_groups: List[ExperimentGroup]
- experiments: Dict[str, Experiment]
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class RoundedModel
Bases: BaseModel
Base Pydantic model that enforces rounding of all numbers.
This model automatically rounds all numerical values to 3 decimal places before validation. It uses the _deep_round function to handle nested data structures and special string formats.
- Attributes:
- All attributes are automatically rounded to 3 decimal places
Notes
This class should be used as a base class for all models that need consistent numerical rounding for display purposes.
Examples
>>> class MyModel(RoundedModel):
...     value: float
...     scores: List[float]
>>> model = MyModel(value=1.234567, scores=[0.1, 0.234567])
>>> model.value
1.235
>>> model.scores
[0.1, 0.235]
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
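The recursive rounding that RoundedModel describes can be sketched in plain Python. This is a minimal illustration and not the library's `_deep_round`: the function name `deep_round` and its `ndigits` parameter are hypothetical, and the special string formats mentioned in the class description are not handled here.

```python
from typing import Any


def deep_round(value: Any, ndigits: int = 3) -> Any:
    """Recursively round floats inside nested dicts, lists, and tuples.

    Hypothetical helper for illustration; not part of Brisk.
    """
    if isinstance(value, bool):
        # bool is a subclass of int, so return it unchanged up front.
        return value
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        return {key: deep_round(item, ndigits) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type with rounded members.
        return type(value)(deep_round(item, ndigits) for item in value)
    # Strings, ints, None, and other objects pass through untouched.
    return value


payload = {"value": 1.234567, "scores": [0.1, 0.234567], "name": "demo"}
rounded = deep_round(payload)
print(rounded)
```

In a Pydantic model, a function like this would typically run on the raw input before field validation, which matches the "before validation" behaviour described for RoundedModel.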
- class TableData(*, name: str, description: str | None = None, columns: List[str], rows: List[List[str]])
Bases: RoundedModel
Represents tabular data with columns and rows.
This model is used to structure tabular data for display in reports. It includes metadata like name and description along with the actual table structure.
- Attributes:
- name : str
  The name/title of the table
- description : Optional[str]
  Optional description text displayed below the table
- columns : List[str]
  List of column headers
- rows : List[List[str]]
  List of rows; each row is a list of cell values
Examples
>>> table = TableData(
...     name="Model Performance",
...     description="Cross-validation results",
...     columns=["Algorithm", "Accuracy", "Precision"],
...     rows=[["Random Forest", "0.95", "0.92"], ["SVM", "0.93", "0.89"]]
... )
- columns: List[str]
- description: str | None
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- name: str
- rows: List[List[str]]
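Because `rows` is typed as `List[List[str]]`, numeric results need to be converted to strings before they are passed in. A minimal sketch of such a conversion, assuming the source data is a list of dictionaries; the helper `records_to_table_fields` is hypothetical and not part of Brisk.

```python
from typing import Any, Dict, List


def records_to_table_fields(
    records: List[Dict[str, Any]], columns: List[str]
) -> Dict[str, Any]:
    """Build the `columns` and `rows` fields from a list of records.

    Hypothetical helper: every cell is converted with str(), and a
    missing key becomes an empty string.
    """
    rows = [
        [str(record.get(column, "")) for column in columns]
        for record in records
    ]
    return {"columns": list(columns), "rows": rows}


fields = records_to_table_fields(
    [
        {"Algorithm": "Random Forest", "Accuracy": 0.95},
        {"Algorithm": "SVM", "Accuracy": 0.93},
    ],
    ["Algorithm", "Accuracy"],
)
print(fields)
```

The resulting dictionary could then be unpacked into the constructor, for example `TableData(name="Model Performance", **fields)`.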
- class PlotData(*, name: str, description: str, image: str)
Bases: RoundedModel
Structure for all plots in the report.
This model represents plot data including metadata and the actual plot content (typically as SVG or base64 encoded image data).
- Attributes:
- name : str
  The name/title of the plot
- description : str
  Description of what the plot shows
- image : str
  The plot content, typically as an SVG string or base64-encoded image
Examples
>>> plot = PlotData(
...     name="Feature Importance",
...     description="Shows the importance of each feature",
...     image="<svg>...</svg>"
... )
- description: str
- image: str
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- name: str
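For raster images, the `image` field would hold base64 text rather than SVG markup. The following is a small sketch of that encoding using only the standard library. The function name `encode_image` is hypothetical, and whether the report expects a bare base64 string or a full data URI is not stated here, so the sketch produces the bare string only.

```python
import base64


def encode_image(raw: bytes) -> str:
    """Return the base64 text for raw image bytes (hypothetical helper)."""
    return base64.b64encode(raw).decode("ascii")


# The 8-byte PNG signature stands in for real image bytes.
png_signature = b"\x89PNG\r\n\x1a\n"
encoded = encode_image(png_signature)
print(encoded)
```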
- class FeatureDistribution(*, ID: str, tables: List[TableData], plot: PlotData)
Bases: RoundedModel
Distribution of a feature across train and test splits.
This model represents the distribution analysis of a single feature across different data splits, including both tabular statistics and visual plots.
- Attributes:
- ID : str
  Unique identifier for the feature
- tables : List[TableData]
  List of tables containing distribution statistics
- plot : PlotData
  Plot showing the feature distribution
Examples
>>> feature_dist = FeatureDistribution(
...     ID="feature_1",
...     tables=[TableData(...)],
...     plot=PlotData(...)
... )
- ID: str
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class DataManager(*, ID: str, test_size: float, n_splits: int, split_method: str, group_column: str, stratified: str, random_state: int | None)
Bases: RoundedModel
Represents a DataManager instance configuration.
This model stores the configuration parameters used for data splitting and management in machine learning experiments.
- Attributes:
- ID : str
  Unique identifier for the data manager
- test_size : float
  Proportion of data to use for testing (0.0 to 1.0)
- n_splits : int
  Number of cross-validation splits
- split_method : str
  Method used for splitting data (e.g., ‘random’, ‘stratified’)
- group_column : str
  Column name used for group-based splitting
- stratified : str
  Whether stratification is used, stored as the string ‘True’ or ‘False’
- random_state : int | None
  Random seed for reproducible splits; None if not set
Examples
>>> data_mgr = DataManager(
...     ID="dm_1",
...     test_size=0.2,
...     n_splits=5,
...     split_method="stratified",
...     group_column="group_id",
...     stratified="True",
...     random_state=42
... )
- ID: str
- group_column: str
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- n_splits: int
- random_state: int | None
- split_method: str
- stratified: str
- test_size: float
- class Navbar(*, brisk_version: str, timestamp: str)
Bases: RoundedModel
Data for the navigation bar.
This model contains metadata displayed in the report’s navigation bar, typically including version information and timestamps.
- Attributes:
- brisk_version : str
  Version of the Brisk library used to generate the report
- timestamp : str
  Timestamp when the report was generated
Examples
>>> navbar = Navbar(
...     brisk_version="1.0.0",
...     timestamp="2024-01-15 10:30:00"
... )
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class ExperimentGroup(*, name: str, description: str, datasets: List[str] = <factory>, experiments: List[str] = <factory>, data_split_scores: Dict[str, List[Tuple[str, str | None, str, str | None]]] = <factory>, test_scores: Dict[str, TableData] = <factory>)
Bases: RoundedModel
Data for an ExperimentGroup card on the home page.
This model represents a group of related experiments that are displayed together on the report’s home page. It includes metadata about the group and references to datasets and experiments within the group.
- Attributes:
- name : str
  Name of the experiment group
- description : str
  Description of what the experiment group contains
- datasets : List[str]
  List of dataset IDs included in this group
- experiments : List[str]
  List of experiment IDs included in this group
- data_split_scores : Dict[str, List[Tuple[str, str | None, str, str | None]]]
  Best algorithm and score for each data split, keyed by dataset name
- test_scores : Dict[str, TableData]
  Test data scores indexed by dataset name and split number
Examples
>>> group = ExperimentGroup(
...     name="Classification Experiments",
...     description="Binary classification on various datasets",
...     datasets=["dataset_1", "dataset_2"],
...     experiments=["exp_1", "exp_2"],
...     data_split_scores={"dataset_1": [("XTree", "0.95", "0.92", None)]},
...     test_scores={"dataset_1": TableData(...)}
... )
- data_split_scores: Dict[str, List[Tuple[str, str | None, str, str | None]]]
- datasets: List[str]
- description: str
- experiments: List[str]
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- name: str
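An ExperimentGroup refers to datasets and experiments by ID only, while ReportData holds the actual instances in dictionaries keyed by ID. A quick consistency check between the two can be sketched as follows. The helper `find_missing_ids` is hypothetical, and plain dictionaries and lists stand in for the real models so that the example runs without Brisk installed.

```python
from typing import Iterable, List


def find_missing_ids(referenced: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the referenced IDs that have no matching key (hypothetical helper)."""
    known = set(available)
    return [item for item in referenced if item not in known]


# Stand-ins for ReportData.datasets and ExperimentGroup.datasets.
report_datasets = {"dataset_1": object(), "dataset_2": object()}
group_datasets = ["dataset_1", "dataset_3"]

missing = find_missing_ids(group_datasets, report_datasets)
print(missing)
```

The same call would apply to experiment IDs by passing the group's `experiments` list and the report's `experiments` dictionary.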
- class Experiment(*, ID: str, dataset: str, algorithm: List[str] = <factory>, tuned_params: Dict[str, Any] = <factory>, hyperparam_grid: Dict[str, Any] = <factory>, tables: Dict[str, List[TableData]] = <factory>, plots: Dict[str, List[PlotData]] = <factory>)
Bases: RoundedModel
Results of a single machine learning experiment.
This model represents the complete results of a single experiment, including algorithm information, hyperparameters, and all associated tables and plots.
- Attributes:
- ID : str
  Unique identifier for the experiment
- dataset : str
  Name of the dataset used in this experiment
- algorithm : List[str]
  Display names of algorithms used in the experiment
- tuned_params : Dict[str, Any]
  Tuned hyperparameter names and values
- hyperparam_grid : Dict[str, Any]
  Hyperparameter grid used for tuning
- tables : Dict[str, List[TableData]]
  Tables keyed by split_id (e.g. ‘split_0’, ‘split_1’)
- plots : Dict[str, List[PlotData]]
  Plots keyed by split_id (e.g. ‘split_0’, ‘split_1’)
Examples
>>> experiment = Experiment(
...     ID="exp_1",
...     dataset="iris",
...     algorithm=["Random Forest", "SVM"],
...     tuned_params={"n_estimators": 100, "max_depth": 10},
...     hyperparam_grid={"n_estimators": [50, 100, 200]},
...     tables={"split_0": [TableData(...)]},
...     plots={"split_0": [PlotData(...)]}
... )
- ID: str
- algorithm: List[str]
- dataset: str
- hyperparam_grid: Dict[str, Any]
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- tuned_params: Dict[str, Any]
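Since `tables` and `plots` are keyed by strings such as `split_0`, plain string sorting would place `split_10` before `split_2`. A minimal sketch of ordering the keys numerically, assuming every key follows the `split_<n>` pattern shown in the attribute descriptions; the helper `split_index` is hypothetical.

```python
from typing import Dict, List


def split_index(split_id: str) -> int:
    """Extract the numeric suffix from a key like 'split_3' (hypothetical helper)."""
    return int(split_id.rsplit("_", 1)[1])


# Stand-in for Experiment.tables, with lists of table names as values.
tables: Dict[str, List[str]] = {
    "split_10": ["scores"],
    "split_2": ["scores"],
    "split_0": ["scores"],
}

ordered = sorted(tables, key=split_index)
print(ordered)
```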
- class Dataset(*, ID: str, splits: List[str] = <factory>, split_sizes: Dict[str, Dict[str, int]] = <factory>, split_target_stats: Dict[str, Dict[str, float | dict]] = <factory>, split_corr_matrices: Dict[str, PlotData] = <factory>, data_manager_id: str, features: List[str] = <factory>, split_feature_distributions: Dict[str, List[FeatureDistribution]] = <factory>)
Bases: RoundedModel
Represents a dataset within an ExperimentGroup.
This model contains comprehensive information about a dataset including its splits, feature information, and various statistical analyses.
- Attributes:
- ID : str
  Unique identifier for the dataset
- splits : List[str]
  List of data split indexes (e.g., [“0”, “1”, “2”])
- split_sizes : Dict[str, Dict[str, int]]
  Size of dataset and train/test split for each split
- split_target_stats : Dict[str, Dict[str, Union[float, dict]]]
  Target feature statistics per split
- split_corr_matrices : Dict[str, PlotData]
  Correlation matrix plots per split
- data_manager_id : str
  ID of the associated DataManager
- features : List[str]
  List of feature names in the dataset
- split_feature_distributions : Dict[str, List[FeatureDistribution]]
  Feature distribution analyses per split
Examples
>>> dataset = Dataset(
...     ID="dataset_1",
...     splits=["0", "1", "2"],
...     split_sizes={"0": {"total": 1000, "train": 800, "test": 200}},
...     split_target_stats={"0": {"mean": 0.5, "std": 0.1}},
...     split_corr_matrices={"0": PlotData(...)},
...     data_manager_id="dm_1",
...     features=["feature_1", "feature_2"],
...     split_feature_distributions={"0": [FeatureDistribution(...)]}
... )
- ID: str
- data_manager_id: str
- features: List[str]
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- split_feature_distributions: Dict[str, List[FeatureDistribution]]
- split_sizes: Dict[str, Dict[str, int]]
- split_target_stats: Dict[str, Dict[str, float | dict]]
- splits: List[str]
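The `split_sizes` mapping lends itself to a simple sanity check: for each split, the train and test counts should add up to the total. The sketch below assumes the inner keys are `total`, `train`, and `test`, as in the Dataset example; whether Brisk always uses these key names is an assumption, and the helper `inconsistent_splits` is hypothetical.

```python
from typing import Dict, List


def inconsistent_splits(split_sizes: Dict[str, Dict[str, int]]) -> List[str]:
    """Return split IDs where train + test does not equal total.

    Hypothetical helper; assumes the keys 'total', 'train', and 'test'.
    """
    return [
        split_id
        for split_id, sizes in split_sizes.items()
        if sizes["train"] + sizes["test"] != sizes["total"]
    ]


split_sizes = {
    "0": {"total": 1000, "train": 800, "test": 200},
    "1": {"total": 1000, "train": 790, "test": 200},
}

bad = inconsistent_splits(split_sizes)
print(bad)
```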