.. _brisk_cli: Brisk CLI ========= How to use the Brisk CLI ------------------------ Brisk provides a comprehensive command-line interface (CLI) to help you create and run machine learning projects, manage datasets, and ensure reproducible experiments. The CLI is installed automatically when you install Brisk into a virtual environment, making it available through the ``brisk`` command in your terminal. To see all available commands and options, run: .. code-block:: bash brisk --help Available Commands ------------------ Brisk provides six main commands for managing your machine learning workflow: * ``create`` - Initialize a new project with template files * ``run`` - Execute experiments based on your configuration * ``load_data`` - Load scikit-learn datasets into your project * ``create_data`` - Generate synthetic datasets for testing * ``export-env`` - Export environment requirements from previous runs * ``check-env`` - Check environment compatibility with previous runs create ^^^^^^ The ``create`` command initializes a new Brisk project with all the necessary files and directory structure: .. code-block:: bash brisk create -n **Arguments:** * ``-n, --project_name`` (required): Name of the project directory to create **Example:** .. code-block:: bash brisk create -n my_regression_project This creates a new directory structure with the following files: * ``.briskconfig``: Project configuration file * ``settings.py``: Configuration settings for experiments * ``algorithms.py``: Algorithm definitions * ``metrics.py``: Metric definitions * ``data.py``: Data management setup * ``evaluators.py``: Template for custom evaluators * ``workflows/``: Directory for workflow files * ``datasets/``: Directory for data storage run ^^^ The ``run`` command runs the experiments defined in the ``settings.py`` file. You can either use one workflow for all experiment groups (by setting a ``default_workflow``) or assign different workflows to specific experiment groups. All workflow assignments are configured in ``settings.py``, so you just run the command from your project root. .. code-block:: bash brisk run [OPTIONS] **Arguments:** * ``-n, --results_name`` (optional): Custom name for the results directory. If not provided, uses timestamp format DD_MM_YYYY_HH_MM_SS * ``-f, --config_file`` (optional): Name of the results folder to run from saved configuration. If provided, reruns experiments using the saved configuration * ``--disable_report`` (optional): Flag to disable HTML report generation after experiments complete * ``--verbose`` (optional): Flag to enable verbose logging output **Examples:** .. code-block:: bash # Run experiments with custom results name brisk run -n experiment_1_results # Run experiments with verbose output and no report brisk run --verbose --disable_report # Rerun experiments from a previous configuration brisk run -f previous_run load_data ^^^^^^^^^ The ``load_data`` command wraps the ``load_sklearn_dataset`` function from scikit-learn and saves the dataset as a CSV file in the project's datasets directory. .. code-block:: bash brisk load_data --dataset --dataset_name **Arguments:** * ``--dataset`` (required): Name of the sklearn dataset to load (options: iris, wine, breast_cancer, diabetes, linnerud) * ``--dataset_name`` (optional): Custom name to save the dataset as **Example:** .. code-block:: bash brisk load_data --dataset diabetes --dataset_name diabetes_data This downloads the diabetes dataset from scikit-learn and saves it as "diabetes_data.csv" in your project's datasets directory. create_data ^^^^^^^^^^^ The ``create_data`` command generates synthetic datasets for testing: .. code-block:: bash brisk create_data --data_type [options] **Arguments:** * ``--data_type`` (required): Type of dataset to generate (classification or regression) * ``--n_samples`` (optional): Number of samples to generate (default: 100) * ``--n_features`` (optional): Number of features to generate (default: 20) * ``--n_classes`` (optional): Number of classes for classification (default: 2) * ``--random_state`` (optional): Random seed for reproducibility (default: 42) * ``--dataset_name`` (optional): Name for the dataset file (default: synthetic_dataset) **Example:** .. code-block:: bash brisk create_data --data_type regression --n_samples 500 --n_features 10 --dataset_name synthetic_regression This creates a synthetic regression dataset with 500 samples and 10 features, saving it as "synthetic_regression.csv" in your project's datasets directory. export-env ^^^^^^^^^^ The ``export-env`` command creates a requirements.txt file from the environment captured during a previous experiment run. This helps with reproducibility by allowing you to recreate the exact environment used for specific experiments. .. code-block:: bash brisk export-env [OPTIONS] **Arguments:** * ``run_id`` (required): The run ID to export environment from (e.g., '2024_01_15_14_30_00') * ``-o, --output`` (optional): Output path for requirements file. If not provided, saves as 'requirements_{run_id}.txt' in the project root * ``--include-all`` (optional): Flag to include all packages from the original environment, not just critical ones (numpy, pandas, scikit-learn, scipy, joblib) **Examples:** .. code-block:: bash # Export critical packages only brisk export-env my_run_20240101_120000 # Export all packages to custom file brisk export-env my_run_20240101_120000 --output my_requirements.txt --include-all # Export to specific location brisk export-env my_run_20240101_120000 -o /path/to/requirements.txt This generates a requirements.txt file with proper package version pinning for reproducibility, including header comments with generation timestamp and Python version information. check-env ^^^^^^^^^ The ``check-env`` command compares the current Python environment with the environment used in a previous experiment run. It identifies version differences and potential compatibility issues that could affect reproducibility. .. code-block:: bash brisk check-env [OPTIONS] **Arguments:** * ``run_id`` (required): The run ID to check environment against (e.g., '2024_01_15_14_30_00') * ``-v, --verbose`` (optional): Flag to show detailed compatibility report with all package differences. If not provided, shows only summary information **Examples:** .. code-block:: bash # Quick compatibility check brisk check-env my_run_20240101_120000 # Detailed compatibility report brisk check-env my_run_20240101_120000 --verbose The compatibility check examines Python version compatibility, critical package versions (numpy, pandas, scikit-learn, scipy, joblib), and identifies missing or extra packages. Critical packages must have matching major.minor versions for compatibility. Working with the CLI -------------------- The Brisk CLI is designed to be used from the root of your project directory. When running commands, Brisk will look for the `.briskconfig` file to identify the project root.