Brisk CLI

How to use the Brisk CLI

Brisk provides a command-line interface (CLI) for creating and running machine learning projects, managing datasets, and keeping experiments reproducible. The CLI is installed automatically when you install Brisk into a virtual environment; once that environment is active, it is available as the brisk command in your terminal.

To see all available commands and options, run:

brisk --help

Available Commands

Brisk provides six main commands for managing your machine learning workflow:

  • create - Initialize a new project with template files

  • run - Execute experiments based on your configuration

  • load_data - Load scikit-learn datasets into your project

  • create_data - Generate synthetic datasets for testing

  • export-env - Export environment requirements from previous runs

  • check-env - Check environment compatibility with previous runs

create

The create command initializes a new Brisk project with all the necessary files and directory structure:

brisk create -n <project_name>

Arguments:

  • -n, --project_name (required): Name of the project directory to create

Example:

brisk create -n my_regression_project

This creates a new project directory containing the following files and directories:

  • .briskconfig: Project configuration file

  • settings.py: Configuration settings for experiments

  • algorithms.py: Algorithm definitions

  • metrics.py: Metric definitions

  • data.py: Data management setup

  • evaluators.py: Template for custom evaluators

  • workflows/: Directory for workflow files

  • datasets/: Directory for data storage

run

The run command executes the experiments defined in settings.py. You can use one workflow for all experiment groups (by setting a default_workflow) or assign different workflows to specific experiment groups. Because all workflow assignments are configured in settings.py, the command takes no workflow argument; run it from your project root.

brisk run [OPTIONS]

Arguments:

  • -n, --results_name (optional): Custom name for the results directory. Defaults to a timestamp in the format DD_MM_YYYY_HH_MM_SS

  • -f, --config_file (optional): Name of a previous results folder. If provided, Brisk reruns the experiments using the configuration saved in that folder

  • --disable_report (optional): Flag to disable HTML report generation after experiments complete

  • --verbose (optional): Flag to enable verbose logging output

Examples:

# Run experiments with custom results name
brisk run -n experiment_1_results

# Run experiments with verbose output and no report
brisk run --verbose --disable_report

# Rerun experiments from a previous configuration
brisk run -f previous_run
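The default results name follows the DD_MM_YYYY_HH_MM_SS pattern listed above. As a rough illustration of what such a name looks like, the sketch below formats a timestamp with Python's standard datetime module. The function name and the format constant are hypothetical and are not taken from Brisk's source; Brisk's own implementation may differ.

```python
from datetime import datetime

# Assumed strftime equivalent of the documented DD_MM_YYYY_HH_MM_SS pattern.
RESULTS_NAME_FORMAT = "%d_%m_%Y_%H_%M_%S"


def default_results_name(now=None):
    """Return a timestamp-style results name for the given (or current) time.

    Hypothetical helper for illustration only; not part of Brisk.
    """
    moment = now if now is not None else datetime.now()
    return moment.strftime(RESULTS_NAME_FORMAT)


# 15 January 2024, 14:30:00
print(default_results_name(datetime(2024, 1, 15, 14, 30, 0)))  # 15_01_2024_14_30_00
```

Because the day comes first, names in this format do not sort chronologically as plain strings, which is one reason to pass an explicit name with -n when you plan to compare many runs.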

load_data

The load_data command wraps the load_sklearn_dataset function, which loads one of scikit-learn’s built-in datasets, and saves the dataset as a CSV file in the project’s datasets directory.

brisk load_data --dataset <dataset_name> --dataset_name <custom_name>

Arguments:

  • --dataset (required): Name of the sklearn dataset to load (options: iris, wine, breast_cancer, diabetes, linnerud)

  • --dataset_name (optional): Custom name to save the dataset as

Example:

brisk load_data --dataset diabetes --dataset_name diabetes_data

This loads the diabetes dataset from scikit-learn and saves it as “diabetes_data.csv” in your project’s datasets directory.

create_data

The create_data command generates synthetic datasets for testing:

brisk create_data --data_type <type> [options]

Arguments:

  • --data_type (required): Type of dataset to generate (classification or regression)

  • --n_samples (optional): Number of samples to generate (default: 100)

  • --n_features (optional): Number of features to generate (default: 20)

  • --n_classes (optional): Number of classes for classification (default: 2)

  • --random_state (optional): Random seed for reproducibility (default: 42)

  • --dataset_name (optional): Name for the dataset file (default: synthetic_dataset)

Example:

brisk create_data --data_type regression --n_samples 500 --n_features 10 --dataset_name synthetic_regression

This creates a synthetic regression dataset with 500 samples and 10 features, saving it as “synthetic_regression.csv” in your project’s datasets directory.
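To make the role of --n_samples, --n_features, and --random_state concrete, here is a minimal standard-library sketch of seeded synthetic regression data. It is not Brisk's generator: the linear-plus-noise model, the function names, and the column names (feature_0 ... target) are all assumptions chosen for illustration.

```python
import csv
import io
import random


def make_regression_rows(n_samples=100, n_features=20, random_state=42):
    """Generate rows of [features..., target] from a seeded linear model.

    Illustrative stand-in only; Brisk's create_data may use a different method.
    """
    rng = random.Random(random_state)  # the same seed reproduces the same data
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
    rows = []
    for _ in range(n_samples):
        features = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        # Target is a weighted sum of the features plus small Gaussian noise.
        target = sum(w * x for w, x in zip(weights, features)) + rng.gauss(0.0, 0.1)
        rows.append(features + [target])
    return rows


def to_csv_text(rows):
    """Serialize rows to CSV text with hypothetical column names."""
    n_features = len(rows[0]) - 1
    header = [f"feature_{i}" for i in range(n_features)] + ["target"]
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


rows = make_regression_rows(n_samples=500, n_features=10)
print(len(rows), len(rows[0]))  # 500 11
```

Calling make_regression_rows twice with the same random_state returns identical rows, which is the property the --random_state option is meant to provide.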

export-env

The export-env command creates a requirements.txt file from the environment captured during a previous experiment run. This helps with reproducibility by allowing you to recreate the exact environment used for specific experiments.

brisk export-env <run_id> [OPTIONS]

Arguments:

  • run_id (required): The run ID to export environment from (e.g., ‘2024_01_15_14_30_00’)

  • -o, --output (optional): Output path for requirements file. If not provided, saves as ‘requirements_{run_id}.txt’ in the project root

  • --include-all (optional): Flag to include all packages from the original environment, not just critical ones (numpy, pandas, scikit-learn, scipy, joblib)

Examples:

# Export critical packages only
brisk export-env my_run_20240101_120000

# Export all packages to custom file
brisk export-env my_run_20240101_120000 --output my_requirements.txt --include-all

# Export to specific location
brisk export-env my_run_20240101_120000 -o /path/to/requirements.txt

This generates a requirements.txt file with proper package version pinning for reproducibility, including header comments with generation timestamp and Python version information.
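The following sketch shows roughly what such a file could contain: header comments followed by name==version pins, restricted to the critical packages unless everything is requested. The function name, the header wording, and the sample versions are hypothetical; the file Brisk actually writes may be laid out differently.

```python
from datetime import datetime
import platform

# Critical packages named in the documentation above.
CRITICAL_PACKAGES = ("numpy", "pandas", "scikit-learn", "scipy", "joblib")


def render_requirements(packages, include_all=False,
                        python_version=None, generated_at=None):
    """Render pinned requirements text from a {name: version} mapping.

    Hypothetical helper: the header layout is an assumption, not Brisk's format.
    """
    python_version = python_version or platform.python_version()
    generated_at = generated_at or datetime.now()
    lines = [
        f"# Generated: {generated_at.isoformat(timespec='seconds')}",
        f"# Python: {python_version}",
    ]
    for name in sorted(packages):
        # Mirror --include-all: keep only critical packages unless asked otherwise.
        if include_all or name in CRITICAL_PACKAGES:
            lines.append(f"{name}=={packages[name]}")
    return "\n".join(lines) + "\n"


# Made-up versions standing in for an environment captured during a run.
captured = {"numpy": "1.26.4", "pandas": "2.2.1", "requests": "2.31.0"}
text = render_requirements(captured, python_version="3.11.8",
                           generated_at=datetime(2024, 1, 15, 14, 30, 0))
print(text)
```

With include_all left at its default, the requests pin is omitted; passing include_all=True adds it, mirroring the effect described for --include-all.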

check-env

The check-env command compares the current Python environment with the environment used in a previous experiment run. It identifies version differences and potential compatibility issues that could affect reproducibility.

brisk check-env <run_id> [OPTIONS]

Arguments:

  • run_id (required): The run ID to check environment against (e.g., ‘2024_01_15_14_30_00’)

  • -v, --verbose (optional): Flag to show detailed compatibility report with all package differences. If not provided, shows only summary information

Examples:

# Quick compatibility check
brisk check-env my_run_20240101_120000

# Detailed compatibility report
brisk check-env my_run_20240101_120000 --verbose

The compatibility check examines the Python version and the critical package versions (numpy, pandas, scikit-learn, scipy, joblib), and identifies missing or extra packages. Critical packages must have matching major.minor versions to be considered compatible.
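The major.minor rule can be sketched in a few lines of Python. This is an illustration of the rule as stated above, not Brisk's implementation: the function names and the report layout are hypothetical, the sample versions are made up, and the parser assumes plain release strings with at least a major and minor component (for example 1.26.4).

```python
# Critical packages named in the documentation above.
CRITICAL_PACKAGES = ("numpy", "pandas", "scikit-learn", "scipy", "joblib")


def major_minor(version):
    """Return (major, minor) from a plain release string such as '1.26.4'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def compare_environments(saved, current):
    """Compare two {package: version} mappings (hypothetical report layout)."""
    shared = set(saved) & set(current)
    return {
        # In the saved run's environment but not installed now.
        "missing": sorted(set(saved) - set(current)),
        # Installed now but absent from the saved run's environment.
        "extra": sorted(set(current) - set(saved)),
        # Critical packages whose major.minor differs between the two.
        "incompatible": sorted(
            name for name in shared
            if name in CRITICAL_PACKAGES
            and major_minor(saved[name]) != major_minor(current[name])
        ),
    }


saved = {"numpy": "1.26.4", "pandas": "2.2.1", "scipy": "1.12.0"}
current = {"numpy": "1.26.0", "pandas": "2.1.4", "joblib": "1.3.2"}
report = compare_environments(saved, current)
print(report)
# {'missing': ['scipy'], 'extra': ['joblib'], 'incompatible': ['pandas']}
```

Here numpy passes because 1.26.4 and 1.26.0 share the same major.minor pair, while pandas is flagged because 2.2 and 2.1 differ in the minor component.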

Working with the CLI

The Brisk CLI is designed to be run from the root of your project directory. When you run a command, Brisk looks for the .briskconfig file to identify the project root.
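Marker-file lookup of this kind is commonly implemented by checking the current directory and then each parent in turn. The sketch below shows that pattern with pathlib. Whether Brisk searches parent directories or only the current one is an assumption here, and find_project_root is a hypothetical name rather than a Brisk function.

```python
from pathlib import Path
import tempfile


def find_project_root(start=None):
    """Return the nearest directory at or above `start` containing .briskconfig.

    Returns None when no marker file is found. Illustrative only; Brisk's own
    lookup may behave differently.
    """
    current = (Path(start) if start is not None else Path.cwd()).resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".briskconfig").is_file():
            return candidate
    return None


# Demonstration in a throwaway directory laid out like a new project.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / ".briskconfig").touch()
    nested = root / "workflows"
    nested.mkdir()
    found = find_project_root(nested)
    print(found == root)  # True
```

A lookup that only inspects the current directory would return None for the workflows/ subdirectory in this demonstration, which is why running commands from the project root is the safe default.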