CLI Commands

create(*args: t.Any, **kwargs: t.Any) -> t.Any

Create a new project directory with template files.

Initializes a new Brisk project with all necessary configuration files and directory structure. Creates template files for algorithms, metrics, data management, workflows, and evaluators.

Parameters:
project_name : str

Name of the project directory to create

Notes

Creates the following structure:

  • .briskconfig : Project configuration file
  • settings.py : Configuration settings with default experiment groups
  • algorithms.py : Algorithm definitions using Brisk’s built-in algorithms
  • metrics.py : Metric definitions using Brisk’s built-in metrics
  • data.py : Data management setup with default parameters
  • evaluators.py : Template for custom evaluators
  • workflows/ : Directory for workflow definitions
      • workflow.py : Template workflow class
  • datasets/ : Directory for data storage

The created files contain working examples and can be customized for specific project needs.
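The layout above can be checked from a script. The following is a minimal sketch and not part of Brisk: `missing_project_files` is a hypothetical helper, and it assumes that `workflow.py` sits inside `workflows/`, as the nesting in the notes suggests.

```python
from pathlib import Path

# Entry names are taken from the notes above. The helper itself is a
# hypothetical illustration and is not provided by Brisk.
EXPECTED_ENTRIES = [
    ".briskconfig",
    "settings.py",
    "algorithms.py",
    "metrics.py",
    "data.py",
    "evaluators.py",
    "workflows/workflow.py",
    "datasets",
]


def missing_project_files(project_dir):
    """Return the expected entries that do not exist under project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

An empty list means every entry named in the notes is present. The check says nothing about the contents of the files.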

run(*args: t.Any, **kwargs: t.Any) -> t.Any

Run experiments using experiment groups in settings.py.

Executes machine learning experiments based on configuration defined in the project’s settings.py file. Can run from scratch or rerun from a saved configuration.

Parameters:
results_name : str, optional

Custom name for results directory. If not provided, uses timestamp format: DD_MM_YYYY_HH_MM_SS

config_file : str, optional

Name of the results folder to run from saved configuration. If provided, reruns experiments using the saved configuration.

disable_report : bool, default=False

Whether to disable HTML report generation after experiments complete

verbose : bool, default=False

Whether to enable verbose logging output

Raises:
FileNotFoundError

If project root not found or required files are missing

FileExistsError

If results directory already exists

ValueError

If experiment groups are missing workflow mappings or the configuration contains errors

Notes

The function automatically:

  1. Finds the project root directory
  2. Creates a results directory with timestamp or custom name
  3. Loads algorithms, metrics, and configuration from project files
  4. Executes experiments according to the workflow
  5. Generates an HTML report (unless disabled)
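Step 2 and the FileExistsError case can be illustrated with the standard library. This is a sketch under stated assumptions, not Brisk's implementation: `default_results_name` and `create_results_dir` are hypothetical helpers, and the parent directory is passed in because the location of the results folder is not specified here.

```python
from datetime import datetime
from pathlib import Path


def default_results_name(now=None):
    """Format a timestamp as DD_MM_YYYY_HH_MM_SS (hypothetical helper)."""
    now = now or datetime.now()
    return now.strftime("%d_%m_%Y_%H_%M_%S")


def create_results_dir(parent, results_name=None):
    """Create the results directory, failing if it already exists."""
    path = Path(parent) / (results_name or default_results_name())
    # exist_ok=False makes mkdir raise FileExistsError for an existing
    # directory, matching the error documented above.
    path.mkdir(parents=True, exist_ok=False)
    return path


print(default_results_name(datetime(2024, 1, 15, 14, 30, 0)))
# prints 15_01_2024_14_30_00
```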

load_data(*args: t.Any, **kwargs: t.Any) -> t.Any

Load a scikit-learn dataset into the project.

Downloads and saves a scikit-learn dataset as a CSV file in the project’s datasets directory. Automatically handles feature names and target variable formatting.

Parameters:
dataset : {‘iris’, ‘wine’, ‘breast_cancer’, ‘diabetes’, ‘linnerud’}

Name of the scikit-learn dataset to load

dataset_name : str, optional

Custom name for the saved dataset file. If not provided, uses the original dataset name

Raises:
FileNotFoundError

If project root directory is not found

Notes

Saves the dataset as a CSV file in the project’s datasets directory. The CSV includes:

  • Feature columns with proper names (or feature_0, feature_1, etc.)
  • Target column named ‘target’
  • No index column

Available datasets:

  • iris: 150 samples, 4 features, 3 classes
  • wine: 178 samples, 13 features, 3 classes
  • breast_cancer: 569 samples, 30 features, 2 classes
  • diabetes: 442 samples, 10 features, regression target
  • linnerud: 20 samples, 3 features, 3 targets
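The CSV layout described above can be sketched with the standard library alone. `dataset_to_csv` is a hypothetical helper, not a Brisk function, and the sample values are illustrative. The fallback to `feature_0`, `feature_1`, ... is assumed to apply when no feature names are supplied.

```python
import csv
import io


def dataset_to_csv(rows, targets, feature_names=None):
    """Render feature rows and targets as CSV text with a 'target' column."""
    n_features = len(rows[0])
    # Assumption: generic names are used only when none are supplied.
    if feature_names is None:
        feature_names = [f"feature_{i}" for i in range(n_features)]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([*feature_names, "target"])  # header row, no index column
    for features, target in zip(rows, targets):
        writer.writerow([*features, target])
    return buffer.getvalue()


print(dataset_to_csv([[5.1, 3.5], [4.9, 3.0]], [0, 1]))
# prints:
# feature_0,feature_1,target
# 5.1,3.5,0
# 4.9,3.0,1
```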

create_data(*args: t.Any, **kwargs: t.Any) -> t.Any

Create synthetic data and add it to the project.

Generates synthetic datasets for testing and experimentation using scikit-learn’s data generation functions. Supports both classification and regression datasets with configurable parameters.

Parameters:
data_type : {‘classification’, ‘regression’}

Type of dataset to generate

n_samples : int, default=100

Number of samples to generate

n_features : int, default=20

Number of features to generate

n_classes : int, default=2

Number of classes for classification datasets

random_state : int, default=42

Random seed for reproducibility

dataset_name : str, default=’synthetic_dataset’

Name for the output CSV file (without extension)

Raises:
FileNotFoundError

If project root directory is not found

ValueError

If data_type is not ‘classification’ or ‘regression’

Notes

For classification datasets:
  • 80% informative features

  • 20% redundant features

  • No repeated features

  • Balanced class distribution

For regression datasets:
  • 80% informative features

  • 0.1 noise level

  • Linear relationship between features and target

The generated dataset is saved as a CSV file in the project’s datasets directory with feature columns and a ‘target’ column.
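The proportions in the notes map onto keyword arguments of scikit-learn's `make_classification` and `make_regression`. The sketch below only builds those arguments. `generator_kwargs` is a hypothetical helper, and both the rounding of the 80% share and the treatment of all remaining features as redundant are assumptions rather than documented behaviour.

```python
def generator_kwargs(data_type, n_samples=100, n_features=20,
                     n_classes=2, random_state=42):
    """Build keyword arguments for scikit-learn's dataset generators."""
    # 80% informative features; rounding down is an assumption.
    n_informative = n_features * 8 // 10
    common = {
        "n_samples": n_samples,
        "n_features": n_features,
        "n_informative": n_informative,
        "random_state": random_state,
    }
    if data_type == "classification":
        # Remaining features are treated as redundant, none are repeated.
        return {
            **common,
            "n_redundant": n_features - n_informative,
            "n_repeated": 0,
            "n_classes": n_classes,
        }
    if data_type == "regression":
        return {**common, "noise": 0.1}
    raise ValueError("data_type must be 'classification' or 'regression'")
```

With scikit-learn installed, the result can be unpacked into the matching generator, for example `make_classification(**generator_kwargs("classification"))`. Leaving the `weights` argument unset keeps the classes balanced, which is that function's default.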

export_env(*args: t.Any, **kwargs: t.Any) -> t.Any

Export environment requirements from a previous run.

Creates a requirements.txt file from the environment captured during a previous experiment run. By default, only includes critical packages that affect computation results.

Parameters:
run_id : str

The run ID to export environment from (e.g., ‘2024_01_15_14_30_00’)

output : str, optional

Output path for requirements file. If not provided, saves as ‘requirements_{run_id}.txt’ in the project root

include_all : bool, default=False

Include all packages from the original environment, not just critical ones (numpy, pandas, scikit-learn, scipy, joblib)

Raises:
FileNotFoundError

If run configuration file is not found

Notes

The generated requirements.txt file includes:

  • Header comments with generation timestamp
  • Python version information
  • Critical packages section (always included)
  • Other packages section (if include_all=True)
  • Proper package version pinning for reproducibility
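The sections listed above could be assembled roughly as follows. This is an illustrative sketch, not Brisk's implementation: `render_requirements` is a hypothetical helper, the `{name: version}` mapping stands in for whatever Brisk captures during a run, and the exact header wording is an assumption.

```python
from datetime import datetime

# Critical package names as listed for the include_all parameter.
CRITICAL_PACKAGES = ("numpy", "pandas", "scikit-learn", "scipy", "joblib")


def render_requirements(packages, python_version, include_all=False, now=None):
    """Render pinned requirements text from a {name: version} mapping."""
    now = now or datetime.now()
    lines = [
        f"# Generated on {now.isoformat(timespec='seconds')}",
        f"# Python {python_version}",
        "",
        "# Critical packages",
    ]
    # Pin each captured critical package to its exact version.
    lines += [f"{name}=={packages[name]}"
              for name in CRITICAL_PACKAGES if name in packages]
    if include_all:
        others = sorted(set(packages) - set(CRITICAL_PACKAGES))
        lines += ["", "# Other packages"]
        lines += [f"{name}=={packages[name]}" for name in others]
    return "\n".join(lines) + "\n"
```

Pinning with `==` is what makes the file reproducible: installing from it restores the exact versions recorded for the run.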

Examples

Export critical packages only:

brisk export-env my_run_20240101_120000

Export all packages to custom file:

brisk export-env my_run_20240101_120000 --output my_requirements.txt --include-all

check_env(*args: t.Any, **kwargs: t.Any) -> t.Any

Check environment compatibility with a previous run.

Compares the current Python environment with the environment used in a previous experiment run. Identifies version differences and potential compatibility issues that could affect reproducibility.

Parameters:
run_id : str

The run ID to check environment against (e.g., ‘2024_01_15_14_30_00’)

verbose : bool, default=False

Show detailed compatibility report with all package differences. If False, shows only summary information

Raises:
FileNotFoundError

If run configuration file is not found

Notes

The compatibility check examines:

  • Python version compatibility (major.minor version must match)
  • Critical package versions (numpy, pandas, scikit-learn, scipy, joblib)
  • Non-critical package differences
  • Missing or extra packages

Compatibility rules:

  • Critical packages: major.minor version must match exactly
  • Non-critical packages: major version must match
  • Missing critical packages: breaks compatibility
  • Python version: major.minor must match
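The compatibility rules can be expressed as a small comparison on version strings. The sketch below is an illustration under stated assumptions, not Brisk's code: the helper names are hypothetical, versions are assumed to be plain dot-separated strings, and a missing non-critical package is assumed to be tolerated because the rules do not cover that case.

```python
CRITICAL_PACKAGES = {"numpy", "pandas", "scikit-learn", "scipy", "joblib"}


def _release(version, parts):
    """Return the first `parts` dot-separated components of a version."""
    return tuple(version.split(".")[:parts])


def is_compatible(name, saved, current):
    """Apply the package rules above to one package (hypothetical helper)."""
    if current is None:
        # A missing critical package breaks compatibility. A missing
        # non-critical package is assumed to be acceptable.
        return name not in CRITICAL_PACKAGES
    # Critical packages compare major.minor, all others compare major only.
    parts = 2 if name in CRITICAL_PACKAGES else 1
    return _release(saved, parts) == _release(current, parts)


def python_compatible(saved, current):
    """Python versions are compatible when major.minor match."""
    return _release(saved, 2) == _release(current, 2)
```

Under these rules, numpy 1.26.4 and 1.26.0 are compatible while 1.26.4 and 1.25.2 are not. A non-critical package may move between minor versions freely but not across a major version.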

Examples

Quick compatibility check:

brisk check-env my_run_20240101_120000

Detailed compatibility report:

brisk check-env my_run_20240101_120000 --verbose