Report Data Models

class ReportData(*, navbar: Navbar, datasets: Dict[str, Dataset] = <factory>, experiments: Dict[str, Experiment] = <factory>, experiment_groups: List[ExperimentGroup] = <factory>, data_managers: Dict[str, DataManager] = <factory>)

Bases: RoundedModel

Represents the entire machine learning report.

This is the root model that contains all data for a complete machine learning report, including navigation information, datasets, experiments, and data managers.

Attributes:

navbar (Navbar): Navigation bar data with version and timestamp information
datasets (Dict[str, Dataset]): Map of dataset IDs to Dataset instances
experiments (Dict[str, Experiment]): Map of experiment IDs to Experiment instances
experiment_groups (List[ExperimentGroup]): List of experiment groups for organizing related experiments
data_managers (Dict[str, DataManager]): Map of data manager IDs to DataManager instances

Examples

>>> report = ReportData(
...     navbar=Navbar(brisk_version="1.0.0", timestamp="2024-01-15"),
...     datasets={"dataset_1": Dataset(...)},
...     experiments={"exp_1": Experiment(...)},
...     experiment_groups=[ExperimentGroup(...)],
...     data_managers={"dm_1": DataManager(...)}
... )
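Because datasets, experiments, and data managers are stored in ID-keyed maps while other models refer to them by string ID, it can help to check that those references resolve. The following is a minimal sketch, not part of Brisk: `find_dangling_references` is a hypothetical helper, it works on a plain dict shaped like ReportData, and it assumes the IDs listed in groups and datasets are meant to match the map keys.

```python
from typing import Any, Dict, List


def find_dangling_references(report: Dict[str, Any]) -> List[str]:
    """Return one message per ID that points at a missing entry (hypothetical helper)."""
    problems: List[str] = []
    datasets = report.get("datasets", {})
    experiments = report.get("experiments", {})
    data_managers = report.get("data_managers", {})

    # Each dataset names its data manager through data_manager_id.
    for dataset_id, dataset in datasets.items():
        manager_id = dataset["data_manager_id"]
        if manager_id not in data_managers:
            problems.append(
                f"dataset {dataset_id!r}: unknown data manager {manager_id!r}"
            )

    # Each experiment group lists dataset and experiment IDs as strings.
    for group in report.get("experiment_groups", []):
        for dataset_id in group.get("datasets", []):
            if dataset_id not in datasets:
                problems.append(
                    f"group {group['name']!r}: unknown dataset {dataset_id!r}"
                )
        for experiment_id in group.get("experiments", []):
            if experiment_id not in experiments:
                problems.append(
                    f"group {group['name']!r}: unknown experiment {experiment_id!r}"
                )
    return problems


report = {
    "datasets": {"dataset_1": {"data_manager_id": "dm_1"}},
    "experiments": {"exp_1": {}},
    "experiment_groups": [
        {
            "name": "Group A",
            "datasets": ["dataset_1", "dataset_2"],
            "experiments": ["exp_1"],
        }
    ],
    "data_managers": {"dm_1": {}},
}
problems = find_dangling_references(report)
print(problems)
```

Here the only problem reported is the group's reference to `dataset_2`, which has no entry in `datasets`.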

data_managers: Dict[str, DataManager]

datasets: Dict[str, Dataset]

experiment_groups: List[ExperimentGroup]

experiments: Dict[str, Experiment]

model_config = {}: Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

navbar: Navbar

class RoundedModel

Bases: BaseModel

Base Pydantic model that enforces rounding of all numbers.

This model automatically rounds all numerical values to 3 decimal places before validation. It uses the _deep_round function to handle nested data structures and special string formats.

Attributes:

This class defines no fields of its own. Numeric values in the fields of its subclasses are automatically rounded to 3 decimal places.

Notes

This class should be used as a base class for all models that need consistent numerical rounding for display purposes.

Examples

>>> from typing import List
>>> class MyModel(RoundedModel):
...     value: float
...     scores: List[float]
>>> model = MyModel(value=1.234567, scores=[0.1, 0.234567])
>>> model.value
1.235
>>> model.scores
[0.1, 0.235]
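The recursive rounding described above can be illustrated without Pydantic. This is only a sketch of the general idea: `deep_round` below is a hypothetical stand-in, not Brisk's actual `_deep_round`, and it does not attempt the special string formats the real function is said to handle.

```python
from typing import Any


def deep_round(value: Any, ndigits: int = 3) -> Any:
    """Recursively round floats inside nested containers (hypothetical sketch)."""
    if isinstance(value, float):
        return round(value, ndigits)
    if isinstance(value, dict):
        # Round the values; keys are left untouched.
        return {key: deep_round(item, ndigits) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Rebuild the same container type with rounded items.
        return type(value)(deep_round(item, ndigits) for item in value)
    # Strings, ints, None, and anything else pass through unchanged.
    return value


rounded = deep_round({"value": 1.234567, "scores": [0.1, 0.234567], "label": "rf"})
print(rounded)
```

In a Pydantic model, a function like this would typically run before field validation so that every subclass sees rounded input; how Brisk wires this up is not shown on this page.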

model_config = {}: Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

class TableData(*, name: str, description: str | None = None, columns: List[str], rows: List[List[str]])

Bases: RoundedModel

Represents tabular data with columns and rows.

This model is used to structure tabular data for display in reports. It includes metadata like name and description along with the actual table structure.

Attributes:

name (str): The name/title of the table
description (Optional[str]): Optional description text displayed below the table
columns (List[str]): List of column headers
rows (List[List[str]]): List of rows, each row is a list of cell values

Examples

>>> table = TableData(
...     name="Model Performance",
...     description="Cross-validation results",
...     columns=["Algorithm", "Accuracy", "Precision"],
...     rows=[["Random Forest", "0.95", "0.92"], ["SVM", "0.93", "0.89"]]
... )
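Since `rows` is typed as lists of strings, numeric results need to be converted to text before they are passed in. A minimal sketch of one way to do that follows; `records_to_table_fields` is a hypothetical helper (not part of Brisk) that returns the `columns` and `rows` values as a plain dict.

```python
from typing import Any, Dict, List


def records_to_table_fields(
    records: List[Dict[str, Any]], ndigits: int = 3
) -> Dict[str, Any]:
    """Turn a list of dicts into string-only `columns` and `rows` (hypothetical helper)."""
    # Column order follows the key order of the first record.
    columns = list(records[0].keys()) if records else []
    rows: List[List[str]] = []
    for record in records:
        row: List[str] = []
        for column in columns:
            cell = record.get(column, "")
            # Floats get a fixed number of decimals; everything else is stringified.
            if isinstance(cell, float):
                row.append(f"{cell:.{ndigits}f}")
            else:
                row.append(str(cell))
        rows.append(row)
    return {"columns": columns, "rows": rows}


fields = records_to_table_fields(
    [
        {"Algorithm": "Random Forest", "Accuracy": 0.95, "Precision": 0.92},
        {"Algorithm": "SVM", "Accuracy": 0.93, "Precision": 0.89},
    ],
    ndigits=2,
)
print(fields["columns"])
print(fields["rows"])
```

The resulting values could then be passed on as `TableData(name=..., columns=fields["columns"], rows=fields["rows"])`.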

columns: List[str]

description: str | None

model_config = {}: Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

name: str

rows: List[List[str]]

class PlotData(*, name: str, description: str, image: str)

Bases: RoundedModel

Structure for all plots in the report.

This model represents plot data including metadata and the actual plot content (typically as SVG or base64 encoded image data).

Attributes:

name (str): The name/title of the plot
description (str): Description of what the plot shows
image (str): The plot content, typically as SVG string or base64 encoded image

Examples

>>> plot = PlotData(
...     name="Feature Importance",
...     description="Shows the importance of each feature",
...     image="<svg>...</svg>"
... )
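For raster images, the `image` string has to carry binary data as text. The sketch below shows base64 encoding with the standard library; `encode_image` is a hypothetical helper, and whether Brisk expects the bare base64 text or a data URI prefix is not stated here, so treat the exact format as an assumption.

```python
import base64


def encode_image(raw: bytes) -> str:
    """Encode raw image bytes as ASCII base64 text (hypothetical helper)."""
    return base64.b64encode(raw).decode("ascii")


# The 8-byte PNG signature stands in for real image bytes.
png_signature = b"\x89PNG\r\n\x1a\n"
image_field = encode_image(png_signature)

# Decoding recovers the original bytes unchanged.
assert base64.b64decode(image_field) == png_signature
print(image_field)
```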

description: str

image: str

model_config = {}: Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

name: str

class FeatureDistribution(*, ID: str, tables: List[TableData], plot: PlotData)

Bases: RoundedModel

Distribution of a feature across train and test splits.

This model represents the distribution analysis of a single feature across different data splits, including both tabular statistics and visual plots.

Attributes:

ID (str): Unique identifier for the feature
tables (List[TableData]): List of tables containing distribution statistics
plot (PlotData): Plot showing the feature distribution

Examples

>>> feature_dist = FeatureDistribution(
...     ID="feature_1",
...     tables=[TableData(...)],
...     plot=PlotData(...)
... )

ID: str

model_config = {}: Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

plot: PlotData

tables: List[TableData]

class DataManager(*, ID: str, test_size: float, n_splits: int, split_method: str, group_column: str, stratified: str, random_state: int | None)

Bases: RoundedModel

Represents a DataManager instance configuration.

This model stores the configuration parameters used for data splitting and management in machine learning experiments.

Attributes:

ID (str): Unique identifier for the data manager
test_size (float): Proportion of data to use for testing (0.0 to 1.0)
n_splits (int): Number of cross-validation splits
split_method (str): Method used for splitting data (e.g., ‘random’, ‘stratified’)
group_column (str): Column name used for group-based splitting
stratified (str): Whether stratification is used (‘True’ or ‘False’)
random_state (int | None): Random seed for reproducible splits, None if not set

Examples

>>> data_mgr = DataManager(
...     ID="dm_1",
...     test_size=0.2,
...     n_splits=5,
...     split_method="stratified",
...     group_column="group_id",
...     stratified="True",
...     random_state=42
... )

ID: str

group_column: str

model_config = {}: Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

n_splits: int

random_state: int | None

split_method: str

stratified: str

test_size: float

class Navbar(*, brisk_version: str, timestamp: str)

Bases: RoundedModel

Data for the navigation bar.

This model contains metadata displayed in the report’s navigation bar, typically including version information and timestamps.

Attributes:

brisk_version (str): Version of the Brisk library used to generate the report
timestamp (str): Timestamp when the report was generated

Examples

>>> navbar = Navbar(
...     brisk_version="1.0.0",
...     timestamp="2024-01-15 10:30:00"
... )

brisk_version: str

model_config = {}: Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

timestamp: str

class ExperimentGroup(*, name: str, description: str, datasets: List[str] = <factory>, experiments: List[str] = <factory>, data_split_scores: Dict[str, List[Tuple[str, str | None, str, str | None]]] = <factory>, test_scores: Dict[str, TableData] = <factory>)

Bases: RoundedModel

Data for an ExperimentGroup card on the home page.

This model represents a group of related experiments that are displayed together on the report’s home page. It includes metadata about the group and references to datasets and experiments within the group.

Attributes:

name (str): Name of the experiment group
description (str): Description of what the experiment group contains
datasets (List[str]): List of dataset IDs included in this group
experiments (List[str]): List of experiment IDs included in this group
data_split_scores (Dict[str, List[Tuple[str, str | None, str, str | None]]]): Best algorithm and score for each data split, keyed by dataset name
test_scores (Dict[str, TableData]): Test data scores indexed on dataset name and split number

Examples

>>> group = ExperimentGroup(
...     name="Classification Experiments",
...     description="Binary classification on various datasets",
...     datasets=["dataset_1", "dataset_2"],
...     experiments=["exp_1", "exp_2"],
...     data_split_scores={"dataset_1": [("XTree", "0.95", "0.92", None)]},
...     test_scores={"dataset_1": TableData(...)}
... )

data_split_scores: Dict[str, List[Tuple[str, str | None, str, str | None]]]

datasets: List[str]

description: str

experiments: List[str]

model_config = {}: Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

name: str

test_scores: Dict[str, TableData]

class Experiment(*, ID: str, dataset: str, algorithm: List[str] = <factory>, tuned_params: Dict[str, Any] = <factory>, hyperparam_grid: Dict[str, Any] = <factory>, tables: List[TableData] = <factory>, plots: List[PlotData] = <factory>)

Bases: RoundedModel

Results of a single machine learning experiment.

This model represents the complete results of a single experiment, including algorithm information, hyperparameters, and all associated tables and plots.

Attributes:

ID (str): Unique identifier for the experiment
dataset (str): Name of the dataset used in this experiment
algorithm (List[str]): Display names of algorithms used in the experiment
tuned_params (Dict[str, Any]): Tuned hyperparameter names and values
hyperparam_grid (Dict[str, Any]): Hyperparameter grid used for tuning
tables (List[TableData]): List of tables containing experiment results
plots (List[PlotData]): List of plots visualizing experiment results

Examples

>>> experiment = Experiment(
...     ID="exp_1",
...     dataset="iris",
...     algorithm=["Random Forest", "SVM"],
...     tuned_params={"n_estimators": 100, "max_depth": 10},
...     hyperparam_grid={"n_estimators": [50, 100, 200]},
...     tables=[TableData(...)],
...     plots=[PlotData(...)]
... )
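To make the relationship between `hyperparam_grid` and `tuned_params` concrete, the sketch below expands a grid into its candidate settings. It assumes an exhaustive grid search, which this page does not confirm for Brisk, and `expand_grid` is a hypothetical helper rather than a Brisk function.

```python
from itertools import product
from typing import Any, Dict, Iterator, List


def expand_grid(grid: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every combination of values in the grid (hypothetical helper)."""
    names = list(grid)
    # product() walks the Cartesian product of the per-parameter value lists.
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


hyperparam_grid = {"n_estimators": [50, 100, 200], "max_depth": [5, 10]}
candidates = list(expand_grid(hyperparam_grid))

# Under this assumption, tuned_params would be one of the candidates.
tuned_params = {"n_estimators": 100, "max_depth": 10}
print(len(candidates), tuned_params in candidates)
```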

ID: str

algorithm: List[str]

dataset: str

hyperparam_grid: Dict[str, Any]

model_config = {}: Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

plots: List[PlotData]

tables: List[TableData]

tuned_params: Dict[str, Any]

class Dataset(*, ID: str, splits: List[str] = <factory>, split_sizes: Dict[str, Dict[str, int]] = <factory>, split_target_stats: Dict[str, Dict[str, float | dict]] = <factory>, split_corr_matrices: Dict[str, PlotData] = <factory>, data_manager_id: str, features: List[str] = <factory>, split_feature_distributions: Dict[str, List[FeatureDistribution]] = <factory>)

Bases: RoundedModel

Represents a dataset within an ExperimentGroup.

This model contains comprehensive information about a dataset including its splits, feature information, and various statistical analyses.

Attributes:

ID (str): Unique identifier for the dataset
splits (List[str]): List of data split indexes (e.g., [“0”, “1”, “2”])
split_sizes (Dict[str, Dict[str, int]]): Size of dataset and train/test split for each split
split_target_stats (Dict[str, Dict[str, Union[float, dict]]]): Target feature statistics per split
split_corr_matrices (Dict[str, PlotData]): Correlation matrix plots per split
data_manager_id (str): ID of the associated DataManager
features (List[str]): List of feature names in the dataset
split_feature_distributions (Dict[str, List[FeatureDistribution]]): Feature distribution analyses per split

Examples

>>> dataset = Dataset(
...     ID="dataset_1",
...     splits=["0", "1", "2"],
...     split_sizes={"0": {"total": 1000, "train": 800, "test": 200}},
...     split_target_stats={"0": {"mean": 0.5, "std": 0.1}},
...     split_corr_matrices={"0": PlotData(...)},
...     data_manager_id="dm_1",
...     features=["feature_1", "feature_2"],
...     split_feature_distributions={"0": [FeatureDistribution(...)]}
... )
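The `split_sizes` entry in the example is consistent with a DataManager whose `test_size` is 0.2. The arithmetic can be sketched as follows; `compute_split_sizes` is a hypothetical helper, and rounding to the nearest integer is an assumption, since the exact rule Brisk applies is not documented here.

```python
from typing import Dict


def compute_split_sizes(total: int, test_size: float) -> Dict[str, int]:
    """Derive train/test counts from a test proportion (hypothetical helper)."""
    if not 0.0 <= test_size <= 1.0:
        raise ValueError("test_size must be between 0.0 and 1.0")
    # Assumption: the test count is rounded to the nearest integer.
    test = round(total * test_size)
    return {"total": total, "train": total - test, "test": test}


sizes = compute_split_sizes(1000, 0.2)
print(sizes)
```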

ID: str

data_manager_id: str

features: List[str]

model_config = {}: Configuration for the model. Should be a dictionary conforming to pydantic.config.ConfigDict.

split_corr_matrices: Dict[str, PlotData]

split_feature_distributions: Dict[str, List[FeatureDistribution]]

split_sizes: Dict[str, Dict[str, int]]

split_target_stats: Dict[str, Dict[str, float | dict]]

splits: List[str]
