CLI Commands

create(*args: t.Any, **kwargs: t.Any) → t.Any

Create a new project directory with template files.

Initializes a new Brisk project with all necessary configuration files and directory structure. Creates template files for algorithms, metrics, data management, workflows, and evaluators.

Parameters:

project_name (str): Name of the project directory to create

Notes

Creates the following structure:

- .briskconfig : Project configuration file
- settings.py : Configuration settings with default experiment groups
- algorithms.py : Algorithm definitions using Brisk’s built-in algorithms
- metrics.py : Metric definitions using Brisk’s built-in metrics
- data.py : Data management setup with default parameters
- evaluators.py : Template for custom evaluators
- workflows/ : Directory for workflow definitions
  - workflow.py : Template workflow class
- datasets/ : Directory for data storage

The created files contain working examples and can be customized for specific project needs.
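As a rough illustration of the layout listed in these Notes, the following sketch checks whether a directory contains the expected entries. The helper name `missing_project_entries` and the two constants are hypothetical and not part of Brisk; only the file and directory names come from the documentation.

```python
from pathlib import Path

# Entries taken from the documented project layout, relative to the project root.
EXPECTED_FILES = [
    ".briskconfig",
    "settings.py",
    "algorithms.py",
    "metrics.py",
    "data.py",
    "evaluators.py",
    "workflows/workflow.py",
]
EXPECTED_DIRS = ["workflows", "datasets"]


def missing_project_entries(project_root):
    """Return documented files and directories that are absent.

    Hypothetical helper for illustration; Brisk does not provide it.
    """
    root = Path(project_root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

Calling `missing_project_entries("my_project")` on a freshly created project would be expected to return an empty list, assuming the layout matches the description.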

run(*args: t.Any, **kwargs: t.Any) → t.Any

Run experiments using experiment groups in settings.py.

Executes machine learning experiments based on configuration defined in the project’s settings.py file. Can run from scratch or rerun from a saved configuration.

Parameters:

results_name (str, optional): Custom name for results directory. If not provided, uses timestamp format: DD_MM_YYYY_HH_MM_SS
config_file (str, optional): Name of the results folder to run from saved configuration. If provided, reruns experiments using the saved configuration.
disable_report (bool, default=False): Whether to disable HTML report generation after experiments complete
verbose (bool, default=False): Whether to enable verbose logging output

Raises:

FileNotFoundError: If the project root is not found or required files are missing
FileExistsError: If the results directory already exists
ValueError: If experiment groups are missing workflow mappings or the configuration contains errors

Notes

The function automatically:

1. Finds the project root directory
2. Creates a results directory with timestamp or custom name
3. Loads algorithms, metrics, and configuration from project files
4. Executes experiments according to the workflow
5. Generates an HTML report (unless disabled)
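The default results name and the FileExistsError behaviour described for this command can be sketched with the standard library. This is a minimal illustration, not Brisk's implementation: both function names are hypothetical, and the parent directory passed to `create_results_dir` is an assumption, since the documentation does not say where results are stored.

```python
from datetime import datetime
from pathlib import Path


def default_results_name(now=None):
    """Build a name in the documented DD_MM_YYYY_HH_MM_SS format."""
    now = now or datetime.now()
    return now.strftime("%d_%m_%Y_%H_%M_%S")


def create_results_dir(parent, results_name=None):
    """Create the results directory, failing if it already exists.

    Hypothetical helper: the location of `parent` is an assumption.
    """
    target = Path(parent) / (results_name or default_results_name())
    # exist_ok=False makes mkdir raise FileExistsError for an existing directory.
    target.mkdir(parents=True, exist_ok=False)
    return target


print(default_results_name(datetime(2024, 1, 15, 14, 30, 0)))  # 15_01_2024_14_30_00
```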

load_data(*args: t.Any, **kwargs: t.Any) → t.Any

Load a scikit-learn dataset into the project.

Downloads and saves a scikit-learn dataset as a CSV file in the project’s datasets directory. Automatically handles feature names and target variable formatting.

Parameters:

dataset ({‘iris’, ‘wine’, ‘breast_cancer’, ‘diabetes’, ‘linnerud’}): Name of the scikit-learn dataset to load
dataset_name (str, optional): Custom name for the saved dataset file. If not provided, uses the original dataset name

Raises:

FileNotFoundError: If project root directory is not found

Notes

Saves the dataset as a CSV file in the project’s datasets directory. The CSV includes:

- Feature columns with proper names (or feature_0, feature_1, etc.)
- Target column named ‘target’
- No index column

Available datasets:

- iris: 150 samples, 4 features, 3 classes
- wine: 178 samples, 13 features, 3 classes
- breast_cancer: 569 samples, 30 features, 2 classes
- diabetes: 442 samples, 10 features, regression target
- linnerud: 20 samples, 3 features, 3 targets
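The CSV layout described in these Notes (named feature columns or a feature_0, feature_1 fallback, a ‘target’ column, and no index column) can be illustrated with a small standard-library sketch. The function `write_dataset_csv` is hypothetical and only mirrors the documented file shape; it does not load scikit-learn data or reproduce how Brisk writes the file.

```python
import csv
import io


def write_dataset_csv(rows, targets, feature_names=None):
    """Serialize feature rows and targets to CSV text.

    Hypothetical helper: writes a header, a 'target' column, and no index column.
    """
    n_features = len(rows[0])
    if feature_names is None:
        # Fallback naming from the Notes: feature_0, feature_1, ...
        feature_names = [f"feature_{i}" for i in range(n_features)]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(list(feature_names) + ["target"])
    for features, target in zip(rows, targets):
        writer.writerow(list(features) + [target])
    return buffer.getvalue()


text = write_dataset_csv([[5.1, 3.5], [4.9, 3.0]], [0, 0])
print(text.splitlines()[0])  # feature_0,feature_1,target
```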

create_data(*args: t.Any, **kwargs: t.Any) → t.Any

Create synthetic data and add it to the project.

Generates synthetic datasets for testing and experimentation using scikit-learn’s data generation functions. Supports both classification and regression datasets with configurable parameters.

Parameters:

data_type ({‘classification’, ‘regression’}): Type of dataset to generate
n_samples (int, default=100): Number of samples to generate
n_features (int, default=20): Number of features to generate
n_classes (int, default=2): Number of classes for classification datasets
random_state (int, default=42): Random seed for reproducibility
dataset_name (str, default=‘synthetic_dataset’): Name for the output CSV file (without extension)

Raises:

FileNotFoundError: If project root directory is not found
ValueError: If data_type is not ‘classification’ or ‘regression’

Notes

For classification datasets:

- 80% informative features
- 20% redundant features
- No repeated features
- Balanced class distribution

For regression datasets:

- 80% informative features
- 0.1 noise level
- Linear relationship between features and target

The generated dataset is saved as a CSV file in the project’s datasets directory with feature columns and a ‘target’ column.
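To make the percentages in these Notes concrete, the sketch below derives keyword arguments that could be passed to scikit-learn's `make_classification` or `make_regression`. The helper `generator_kwargs` is hypothetical, and rounding down when the feature count is not a multiple of ten is an assumption; the documentation does not state how Brisk rounds.

```python
def generator_kwargs(data_type, n_samples=100, n_features=20, n_classes=2, random_state=42):
    """Derive generator arguments from the documented defaults.

    Hypothetical helper: the floor rounding of the 80%/20% split is an assumption.
    """
    common = {
        "n_samples": n_samples,
        "n_features": n_features,
        "n_informative": n_features * 8 // 10,  # 80% informative features
        "random_state": random_state,
    }
    if data_type == "classification":
        return {
            **common,
            "n_redundant": n_features * 2 // 10,  # 20% redundant features
            "n_repeated": 0,                      # no repeated features
            "n_classes": n_classes,
        }
    if data_type == "regression":
        return {**common, "noise": 0.1}           # documented noise level
    raise ValueError("data_type must be 'classification' or 'regression'")


print(generator_kwargs("classification")["n_informative"])  # 16
```

With the default of 20 features, this yields 16 informative and 4 redundant features for classification.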

export_env(*args: t.Any, **kwargs: t.Any) → t.Any

Export environment requirements from a previous run.

Creates a requirements.txt file from the environment captured during a previous experiment run. By default, only includes critical packages that affect computation results.

Parameters:

run_id (str): The run ID to export environment from (e.g., ‘2024_01_15_14_30_00’)
output (str, optional): Output path for requirements file. If not provided, saves as ‘requirements_{run_id}.txt’ in the project root
include_all (bool, default=False): Include all packages from the original environment, not just critical ones (numpy, pandas, scikit-learn, scipy, joblib)

Raises:

FileNotFoundError: If run configuration file is not found

Notes

The generated requirements.txt file includes:

- Header comments with generation timestamp
- Python version information
- Critical packages section (always included)
- Other packages section (if include_all=True)
- Proper package version pinning for reproducibility
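The file contents listed in these Notes might look roughly like the output of the following sketch. The function `render_requirements` and the exact comment wording are assumptions made for illustration; only the list of critical packages, the section order, and the use of pinned versions come from the documentation.

```python
from datetime import datetime

# Critical packages as named in the documentation.
CRITICAL_PACKAGES = ("numpy", "pandas", "scikit-learn", "scipy", "joblib")


def render_requirements(packages, python_version, include_all=False, generated_at=None):
    """Render pinned requirements text from a {name: version} mapping.

    Hypothetical helper: the header wording is an assumption.
    """
    generated_at = generated_at or datetime.now()
    lines = [
        f"# Generated on {generated_at.isoformat(timespec='seconds')}",
        f"# Python {python_version}",
        "",
        "# Critical packages",
    ]
    lines += [f"{name}=={packages[name]}" for name in CRITICAL_PACKAGES if name in packages]
    if include_all:
        others = sorted(name for name in packages if name not in CRITICAL_PACKAGES)
        if others:
            lines += ["", "# Other packages"]
            lines += [f"{name}=={packages[name]}" for name in others]
    return "\n".join(lines) + "\n"


captured = {"numpy": "1.26.4", "pandas": "2.2.1", "tqdm": "4.66.2"}
print(render_requirements(captured, "3.11.8", include_all=True))
```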

Examples

Export critical packages only:

    brisk export-env my_run_20240101_120000

Export all packages to custom file:

    brisk export-env my_run_20240101_120000 --output my_requirements.txt --include-all

check_env(*args: t.Any, **kwargs: t.Any) → t.Any

Check environment compatibility with a previous run.

Compares the current Python environment with the environment used in a previous experiment run. Identifies version differences and potential compatibility issues that could affect reproducibility.

Parameters:

run_id (str): The run ID to check environment against (e.g., ‘2024_01_15_14_30_00’)
verbose (bool, default=False): Show detailed compatibility report with all package differences. If False, shows only summary information

Raises:

FileNotFoundError: If run configuration file is not found

Notes

The compatibility check examines:

- Python version compatibility (major.minor version must match)
- Critical package versions (numpy, pandas, scikit-learn, scipy, joblib)
- Non-critical package differences
- Missing or extra packages

Compatibility rules:

- Critical packages: major.minor version must match exactly
- Non-critical packages: major version must match
- Missing critical packages: breaks compatibility
- Python version: major.minor must match
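The compatibility rules above can be expressed as a short comparison of version prefixes. This is a sketch under stated assumptions rather than Brisk's actual check: the function names are hypothetical, versions are assumed to be dot-separated strings, and a missing non-critical package is assumed not to break compatibility because the documentation does not cover that case.

```python
# Critical packages as named in the documentation.
CRITICAL_PACKAGES = {"numpy", "pandas", "scikit-learn", "scipy", "joblib"}


def version_prefix(version, parts):
    """Return the first `parts` dot-separated components of a version string."""
    return tuple(version.split(".")[:parts])


def is_compatible(name, saved_version, current_version):
    """Apply the documented rules to one package (hypothetical helper).

    Pass current_version=None for a package missing from the current environment.
    """
    critical = name in CRITICAL_PACKAGES
    if current_version is None:
        # Documented: a missing critical package breaks compatibility.
        # Assumption: a missing non-critical package does not.
        return not critical
    # Critical packages compare major.minor; others compare major only.
    parts = 2 if critical else 1
    return version_prefix(saved_version, parts) == version_prefix(current_version, parts)


def python_compatible(saved_version, current_version):
    """The Python version must match on major.minor."""
    return version_prefix(saved_version, 2) == version_prefix(current_version, 2)


print(is_compatible("numpy", "1.26.4", "1.26.0"))  # True
print(is_compatible("numpy", "1.26.4", "1.25.2"))  # False
```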

Examples

Quick compatibility check:

    brisk check-env my_run_20240101_120000

Detailed compatibility report:

    brisk check-env my_run_20240101_120000 --verbose
