CLI Commands
- create(*args: t.Any, **kwargs: t.Any) -> t.Any
Create a new project directory with template files.
Initializes a new Brisk project with all necessary configuration files and directory structure. Creates template files for algorithms, metrics, data management, workflows, and evaluators.
- Parameters:
- project_name : str
Name of the project directory to create
Notes
Creates the following structure:
- .briskconfig : Project configuration file
- settings.py : Configuration settings with default experiment groups
- algorithms.py : Algorithm definitions using Brisk’s built-in algorithms
- metrics.py : Metric definitions using Brisk’s built-in metrics
- data.py : Data management setup with default parameters
- evaluators.py : Template for custom evaluators
- workflows/ : Directory for workflow definitions
  - workflow.py : Template workflow class
- datasets/ : Directory for data storage
The created files contain working examples and can be customized for specific project needs.
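As a quick sanity check after running the command, the layout above can be verified with a few lines of Python. This is a minimal sketch, not part of Brisk: the helper name `missing_project_files` is hypothetical, and the path list is simply read off the structure listed above.

```python
from pathlib import Path

# Paths read off the structure listed above, relative to the project root.
EXPECTED_PATHS = [
    ".briskconfig",
    "settings.py",
    "algorithms.py",
    "metrics.py",
    "data.py",
    "evaluators.py",
    "workflows/workflow.py",
    "datasets",
]


def missing_project_files(project_root):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

An empty list from `missing_project_files("my_project")` would indicate that every template file and directory is in place.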
- run(*args: t.Any, **kwargs: t.Any) -> t.Any
Run experiments using experiment groups in settings.py.
Executes machine learning experiments based on configuration defined in the project’s settings.py file. Can run from scratch or rerun from a saved configuration.
- Parameters:
- results_name : str, optional
Custom name for results directory. If not provided, uses timestamp format: DD_MM_YYYY_HH_MM_SS
- config_file : str, optional
Name of the results folder to run from saved configuration. If provided, reruns experiments using the saved configuration.
- disable_report : bool, default=False
Whether to disable HTML report generation after experiments complete
- verbose : bool, default=False
Whether to enable verbose logging output
- Raises:
- FileNotFoundError
If project root not found or required files are missing
- FileExistsError
If results directory already exists
- ValueError
If experiment groups are missing workflow mappings or the configuration contains errors
Notes
The function automatically:
1. Finds the project root directory
2. Creates a results directory with timestamp or custom name
3. Loads algorithms, metrics, and configuration from project files
4. Executes experiments according to the workflow
5. Generates an HTML report (unless disabled)
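The results-directory step can be illustrated with the standard library. The sketch below is an assumption about how such a step might look, not Brisk's actual implementation: `make_results_dir` and its `results_root` argument are hypothetical names, and only the DD_MM_YYYY_HH_MM_SS default and the FileExistsError behaviour come from the description above.

```python
from datetime import datetime
from pathlib import Path


def make_results_dir(results_root, results_name=None, now=None):
    """Create and return a results directory, refusing to overwrite one."""
    if results_name is None:
        now = now or datetime.now()
        # DD_MM_YYYY_HH_MM_SS, the default name format described above.
        results_name = now.strftime("%d_%m_%Y_%H_%M_%S")
    results_dir = Path(results_root) / results_name
    # exist_ok=False makes mkdir raise FileExistsError on a name clash.
    results_dir.mkdir(parents=True, exist_ok=False)
    return results_dir
```

Passing an explicit `results_name` skips the timestamp, and calling the function twice with the same name raises FileExistsError, which matches the documented error for an existing results directory.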
- load_data(*args: t.Any, **kwargs: t.Any) -> t.Any
Load a scikit-learn dataset into the project.
Loads a scikit-learn dataset and saves it as a CSV file in the project’s datasets directory. Automatically handles feature names and target variable formatting.
- Parameters:
- dataset : {'iris', 'wine', 'breast_cancer', 'diabetes', 'linnerud'}
Name of the scikit-learn dataset to load
- dataset_name : str, optional
Custom name for the saved dataset file. If not provided, uses the original dataset name
- Raises:
- FileNotFoundError
If project root directory is not found
Notes
Saves the dataset as a CSV file in the project’s datasets directory. The CSV includes:
- Feature columns with proper names (or feature_0, feature_1, etc.)
- Target column named 'target'
- No index column
Available datasets:
- iris: 150 samples, 4 features, 3 classes
- wine: 178 samples, 13 features, 3 classes
- breast_cancer: 569 samples, 30 features, 2 classes
- diabetes: 442 samples, 10 features, regression target
- linnerud: 20 samples, 3 features, 3 targets
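The CSV layout described above (named feature columns with a feature_N fallback, a final 'target' column, no index column) can be sketched with the standard library alone. The helper `dataset_to_csv` is hypothetical and only mirrors the documented file layout; it does not show how Brisk itself writes the file.

```python
import csv
import io


def dataset_to_csv(rows, targets, feature_names=None):
    """Render feature rows and targets as CSV text with a 'target' column."""
    n_features = len(rows[0])
    if feature_names is None:
        # Fallback names when the dataset provides none.
        feature_names = [f"feature_{i}" for i in range(n_features)]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(list(feature_names) + ["target"])
    for row, target in zip(rows, targets):
        # No index column: each line holds only the features and the target.
        writer.writerow(list(row) + [target])
    return buffer.getvalue()
```

For example, `dataset_to_csv([[5.1, 3.5]], [0])` produces a header line `feature_0,feature_1,target` followed by `5.1,3.5,0`.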
- create_data(*args: t.Any, **kwargs: t.Any) -> t.Any
Create synthetic data and add it to the project.
Generates synthetic datasets for testing and experimentation using scikit-learn’s data generation functions. Supports both classification and regression datasets with configurable parameters.
- Parameters:
- data_type : {'classification', 'regression'}
Type of dataset to generate
- n_samples : int, default=100
Number of samples to generate
- n_features : int, default=20
Number of features to generate
- n_classes : int, default=2
Number of classes for classification datasets
- random_state : int, default=42
Random seed for reproducibility
- dataset_name : str, default='synthetic_dataset'
Name for the output CSV file (without extension)
- Raises:
- FileNotFoundError
If project root directory is not found
- ValueError
If data_type is not ‘classification’ or ‘regression’
Notes
- For classification datasets:
  - 80% informative features
  - 20% redundant features
  - No repeated features
  - Balanced class distribution
- For regression datasets:
  - 80% informative features
  - 0.1 noise level
  - Linear relationship between features and target
The generated dataset is saved as a CSV file in the project’s datasets directory with feature columns and a ‘target’ column.
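To make the feature split concrete, the sketch below turns the documented defaults into keyword arguments for scikit-learn's `make_classification` and `make_regression`. The helper `generation_kwargs` is hypothetical, and rounding the 80% figure down is an assumption; the notes above give only the percentages.

```python
def generation_kwargs(data_type, n_samples=100, n_features=20,
                      n_classes=2, random_state=42):
    """Build keyword arguments for scikit-learn's dataset generators."""
    # Rounding down is an assumption; the notes only state the 80% figure.
    n_informative = int(n_features * 0.8)
    common = {
        "n_samples": n_samples,
        "n_features": n_features,
        "n_informative": n_informative,
        "random_state": random_state,
    }
    if data_type == "classification":
        # Remaining features are redundant and none are repeated. Leaving
        # `weights` unset keeps make_classification's balanced classes.
        return {**common,
                "n_redundant": n_features - n_informative,
                "n_repeated": 0,
                "n_classes": n_classes}
    if data_type == "regression":
        return {**common, "noise": 0.1}
    raise ValueError("data_type must be 'classification' or 'regression'")
```

With the defaults, the classification case yields 16 informative and 4 redundant features. The resulting dictionary can be unpacked into `sklearn.datasets.make_classification(**kwargs)` or `sklearn.datasets.make_regression(**kwargs)` respectively.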
- export_env(*args: t.Any, **kwargs: t.Any) -> t.Any
Export environment requirements from a previous run.
Creates a requirements.txt file from the environment captured during a previous experiment run. By default, only includes critical packages that affect computation results.
- Parameters:
- run_id : str
The run ID to export environment from (e.g., ‘2024_01_15_14_30_00’)
- output : str, optional
Output path for requirements file. If not provided, saves as ‘requirements_{run_id}.txt’ in the project root
- include_all : bool, default=False
Include all packages from the original environment, not just critical ones (numpy, pandas, scikit-learn, scipy, joblib)
- Raises:
- FileNotFoundError
If run configuration file is not found
Notes
The generated requirements.txt file includes:
- Header comments with generation timestamp
- Python version information
- Critical packages section (always included)
- Other packages section (if include_all=True)
- Proper package version pinning for reproducibility
Examples
- Export critical packages only:
brisk export-env my_run_20240101_120000
- Export all packages to custom file:
brisk export-env my_run_20240101_120000 --output my_requirements.txt --include-all
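The shape of the exported file can be sketched in Python. This is an illustrative approximation only: `build_requirements`, its arguments, and the exact comment wording are hypothetical, while the critical-package list and the section order follow the notes for export_env.

```python
from datetime import datetime

# Critical packages as listed for the include_all option.
CRITICAL = ("numpy", "pandas", "scikit-learn", "scipy", "joblib")


def build_requirements(packages, python_version, include_all=False,
                       generated_at=None):
    """Render pinned requirements from a {name: version} mapping."""
    generated_at = generated_at or datetime.now()
    lines = [
        f"# Generated: {generated_at.isoformat(timespec='seconds')}",
        f"# Python version: {python_version}",
        "",
        "# Critical packages",
    ]
    # Critical packages are always written, pinned with '=='.
    lines += [f"{name}=={packages[name]}" for name in CRITICAL
              if name in packages]
    if include_all:
        others = sorted(set(packages) - set(CRITICAL))
        lines += ["", "# Other packages"]
        lines += [f"{name}=={packages[name]}" for name in others]
    return "\n".join(lines) + "\n"
```

Calling it with `include_all=False` leaves every non-critical package out of the output, which corresponds to the default behaviour of the command.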
- check_env(*args: t.Any, **kwargs: t.Any) -> t.Any
Check environment compatibility with a previous run.
Compares the current Python environment with the environment used in a previous experiment run. Identifies version differences and potential compatibility issues that could affect reproducibility.
- Parameters:
- run_id : str
The run ID to check environment against (e.g., ‘2024_01_15_14_30_00’)
- verbose : bool, default=False
Show detailed compatibility report with all package differences. If False, shows only summary information
- Raises:
- FileNotFoundError
If run configuration file is not found
Notes
The compatibility check examines:
- Python version compatibility (major.minor version must match)
- Critical package versions (numpy, pandas, scikit-learn, scipy, joblib)
- Non-critical package differences
- Missing or extra packages
Compatibility rules:
- Critical packages: major.minor version must match exactly
- Non-critical packages: major version must match
- Missing critical packages: breaks compatibility
- Python version: major.minor must match
Examples
- Quick compatibility check:
brisk check-env my_run_20240101_120000
- Detailed compatibility report:
brisk check-env my_run_20240101_120000 --verbose
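The compatibility rules for check_env can be expressed as a short sketch. The function names here are hypothetical and the version comparison is a plain split on dots, so pre-release or local version suffixes are not handled. Treating a missing non-critical package as compatible is an assumption, since the rules only state that missing critical packages break compatibility.

```python
CRITICAL = {"numpy", "pandas", "scikit-learn", "scipy", "joblib"}


def _release(version, parts):
    """Return the first `parts` dot-separated components of a version."""
    return tuple(version.split(".")[:parts])


def python_compatible(saved, current):
    """Python versions are compatible when major.minor matches."""
    return _release(saved, 2) == _release(current, 2)


def packages_compatible(saved, current):
    """Compare two {name: version} mappings using the rules above."""
    for name, saved_version in saved.items():
        critical = name in CRITICAL
        current_version = current.get(name)
        if current_version is None:
            if critical:
                return False  # a missing critical package breaks compatibility
            continue  # assumption: a missing non-critical package is tolerated
        # Critical packages compare major.minor, others compare major only.
        parts = 2 if critical else 1
        if _release(saved_version, parts) != _release(current_version, parts):
            return False
    return True
```

Under these rules a saved numpy 1.26.4 is compatible with a current 1.26.0 but not with 1.25.2, while a non-critical package may change its minor version freely.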