I/O Service

class IOService(name: str, results_dir: Path, output_dir: Path)

Bases: BaseService

I/O service for file operations, data loading, and plot management.

This service provides the I/O functionality for the Brisk package: saving and loading data files, generating and saving plots, dynamically loading project modules, and loading configuration. It reads CSV, Excel, and SQL database files, writes JSON files and plots with attached metadata, and handles errors that occur while saving.

The service maintains separate directories for results (static) and output (dynamic), allowing for organized file management throughout experiments.

Attributes:
results_dir : Path

The root directory for all results; it does not change at runtime

output_dir : Path

The current output directory; it may change at runtime

format : str

Default format for saving plots (default: “png”)

width : int

Default plot width in inches (default: 10)

height : int

Default plot height in inches (default: 8)

dpi : int

Default plot DPI (default: 300)

transparent : bool

Whether to save plots with transparent background (default: False)

Notes

The service automatically creates output directories as needed and integrates with the reporting service to store plot data for reports.

Examples

>>> from brisk.services.io import IOService
>>> from pathlib import Path
>>> 
>>> # Create I/O service
>>> io_service = IOService("io", Path("results"), Path("output"))
>>> 
>>> # Save data
>>> data = {"accuracy": 0.95, "precision": 0.92}
>>> io_service.save_to_json(data, Path("results.json"), {})
>>> 
>>> # Save plot
>>> io_service.save_plot(Path("plot.png"), plot=my_plot)
>>> 
>>> # Load data
>>> df = io_service.load_data("data.csv")
load_algorithms(algorithm_file: Path)
load_base_data_manager(data_file: Path)
load_custom_evaluators(evaluators_file: Path)

Load the register_custom_evaluators() function from evaluators.py.

static load_data(data_path: str, table_name: str | None = None) -> DataFrame

Load data from CSV, Excel, or SQL database files.

This static method loads data from various file formats into a pandas DataFrame. It automatically detects the file format based on the file extension and handles the appropriate loading method.

Parameters:
data_path : str

Path to the dataset file

table_name : Optional[str], default=None

Name of the table in SQL database (required for SQL files)

Returns:
pd.DataFrame

The loaded dataset as a pandas DataFrame

Raises:
ValueError

If file format is unsupported or table_name is missing for SQL database

Examples

>>> from brisk.services.io import IOService
>>> 
>>> # Load CSV file
>>> df = IOService.load_data("data.csv")
>>> 
>>> # Load Excel file
>>> df = IOService.load_data("data.xlsx")
>>> 
>>> # Load SQL database
>>> df = IOService.load_data("data.db", table_name="my_table")
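
The extension-based dispatch described above can be sketched as follows. This is a minimal illustration, not Brisk's implementation: `detect_loader` and `_SUFFIX_TO_KIND` are hypothetical names, and the mapping covers only the three suffixes used in the examples above. The real method may accept others.

```python
from pathlib import Path
from typing import Optional

# Hypothetical suffix-to-loader mapping; limited to the suffixes that
# appear in the documented examples (.csv, .xlsx, .db).
_SUFFIX_TO_KIND = {".csv": "csv", ".xlsx": "excel", ".db": "sql"}


def detect_loader(data_path: str, table_name: Optional[str] = None) -> str:
    """Return the kind of loader a path would be routed to (illustrative)."""
    suffix = Path(data_path).suffix.lower()
    kind = _SUFFIX_TO_KIND.get(suffix)
    if kind is None:
        # Mirrors the documented ValueError for unsupported formats.
        raise ValueError(f"Unsupported file format: {suffix!r}")
    if kind == "sql" and table_name is None:
        # Mirrors the documented ValueError for a missing table name.
        raise ValueError("table_name is required for SQL database files")
    return kind
```

In a full loader, each returned kind would presumably select a pandas reader such as `pandas.read_csv`, `pandas.read_excel`, or `pandas.read_sql`; that pairing is an assumption here.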
load_metric_config(metric_file)
static load_module_object(project_root: str, module_filename: str, object_name: str, required: bool = True) -> object | None

Dynamically load an object from a specified module file.

This static method loads a Python object from a module file at runtime. It’s useful for loading configuration objects, custom evaluators, or other dynamic components from project files.

Parameters:
project_root : str

Path to project root directory

module_filename : str

Name of the module file (e.g., “algorithms.py”)

object_name : str

Name of the object to load from the module

required : bool, default=True

Whether to raise an error if the object is not found

Returns:
Union[object, None]

The loaded object, or None if not found and not required

Raises:
FileNotFoundError

If the module file is not found

AttributeError

If the required object is not found in the module

Examples

>>> from brisk.services.io import IOService
>>> 
>>> # Load a configuration object
>>> config = IOService.load_module_object(
...     "/path/to/project", "algorithms.py", "ALGORITHM_CONFIG"
... )
>>> 
>>> # Load optional object (returns None if not found)
>>> optional = IOService.load_module_object(
...     "/path/to/project", "optional.py", "OPTIONAL_OBJ",
...     required=False
... )
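
One way such runtime loading can be done with the standard library is shown below. It is a sketch under stated assumptions, not the Brisk source: `load_object_from_file` is a hypothetical name, and using `importlib.util.spec_from_file_location` is an assumed technique that reproduces the documented behaviour (the two exceptions and the `required` flag).

```python
import importlib.util
from pathlib import Path
from typing import Optional


def load_object_from_file(
    project_root: str,
    module_filename: str,
    object_name: str,
    required: bool = True,
) -> Optional[object]:
    """Import a module from a file path and return one named object from it."""
    module_path = Path(project_root) / module_filename
    if not module_path.exists():
        raise FileNotFoundError(f"Module file not found: {module_path}")

    # Build an import spec directly from the file, then execute the module.
    spec = importlib.util.spec_from_file_location(module_path.stem, module_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot create an import spec for {module_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    if hasattr(module, object_name):
        return getattr(module, object_name)
    if required:
        raise AttributeError(
            f"{object_name!r} is not defined in {module_filename}"
        )
    return None
```

Executing the module runs its top-level code, so this approach is only appropriate for trusted project files.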
load_workflow(workflow_name: str)
save_plot(output_path: Path, metadata: Dict[str, Any] | None = None, plot: ggplot | Figure | None = None, **kwargs) -> None

Save plot to file with metadata and SVG conversion.

This method saves a plot to a file in the specified format, with automatic SVG conversion for report generation. It supports multiple plot types including matplotlib, plotnine, and plotly figures.

Parameters:
output_path : Path

Path where the plot file will be saved

metadata : Optional[Dict[str, Any]], default=None

Metadata to include with the plot

plot : Optional[pn.ggplot | go.Figure], default=None

Plot object to save (plotnine or plotly figure)

**kwargs

Additional plot parameters (height, width, etc.)

Notes

The method automatically converts plots to SVG format for reports and handles different plot types. If no plot is provided, it saves the current matplotlib figure.

Examples

>>> io_service = IOService("io", Path("results"), Path("output"))
>>> # Save plotnine plot
>>> io_service.save_plot(Path("plot.png"), plot=my_plotnine_plot)
>>> 
>>> # Save plotly plot
>>> io_service.save_plot(Path("plot.png"), plot=my_plotly_figure)
>>> 
>>> # Save current matplotlib figure
>>> import matplotlib.pyplot as plt
>>> plt.plot([1, 2, 3], [1, 4, 9])
>>> io_service.save_plot(Path("plot.png"))
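
The default plot settings listed under Attributes and the `**kwargs` overrides might combine along these lines. This is a sketch, not Brisk's code: `PlotSettings` and `resolve_plot_settings` are hypothetical names, and both the rule that per-call keyword arguments take precedence over the defaults and the choice to ignore unrecognized keys are assumptions.

```python
from dataclasses import dataclass, fields, replace
from typing import Any


@dataclass(frozen=True)
class PlotSettings:
    """Defaults taken from the documented IOService attributes."""
    format: str = "png"
    width: int = 10        # inches
    height: int = 8        # inches
    dpi: int = 300
    transparent: bool = False


def resolve_plot_settings(defaults: PlotSettings, **kwargs: Any) -> PlotSettings:
    """Return a copy of the defaults with recognized kwargs applied (assumed rule)."""
    known_names = {f.name for f in fields(defaults)}
    overrides = {k: v for k, v in kwargs.items() if k in known_names}
    return replace(defaults, **overrides)
```

For example, `resolve_plot_settings(PlotSettings(), width=12, height=6)` keeps the default DPI and format while changing only the figure size.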
save_rerun_config(data: Dict, metadata: Dict, output_path: Path | str)
save_to_json(data: Dict[str, Any], output_path: Path | str, metadata: Dict[str, Any]) -> None

Save dictionary to JSON file with metadata.

This method saves a dictionary to a JSON file together with its metadata. It automatically creates parent directories if they don’t exist and handles NumPy data types through the NumpyEncoder. The data is also stored in the reporting service for report generation.

Parameters:
data : Dict[str, Any]

Dictionary containing the data to save

output_path : Union[Path, str]

Path where the JSON file will be saved

metadata : Dict[str, Any]

Metadata to include with the data (stored as “_metadata” key)

Notes

The method automatically creates parent directories and handles NumPy data types. If saving fails, an error is logged but no exception is raised.

Examples

>>> io_service = IOService("io", Path("results"), Path("output"))
>>> data = {"accuracy": 0.95, "precision": 0.92}
>>> metadata = {"experiment": "exp_1", "timestamp": "2024-01-15"}
>>> io_service.save_to_json(data, Path("results.json"), metadata)
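
The behaviour described above (parent directories created on demand, metadata stored under a “_metadata” key, failures logged rather than raised) can be approximated with the standard library alone. This sketch is an assumption about the shape of the logic, not the Brisk source: `save_dict_with_metadata` is a hypothetical name, placing “_metadata” at the top level of the saved dictionary is assumed, and the NumpyEncoder and reporting-service steps are omitted.

```python
import json
import logging
from pathlib import Path
from typing import Any, Dict, Union

logger = logging.getLogger(__name__)


def save_dict_with_metadata(
    data: Dict[str, Any],
    output_path: Union[Path, str],
    metadata: Dict[str, Any],
) -> None:
    """Write data plus a "_metadata" entry to a JSON file; log on failure."""
    path = Path(output_path)
    try:
        # Create missing parent directories before writing.
        path.parent.mkdir(parents=True, exist_ok=True)
        payload = {**data, "_metadata": metadata}
        with path.open("w", encoding="utf-8") as f:
            json.dump(payload, f, indent=4)
    except (OSError, TypeError, ValueError) as exc:
        # Documented behaviour: log the error, do not raise.
        logger.error("Failed to save JSON to %s: %s", path, exc)
```

Because errors are only logged, callers that need to know whether the file was written would have to check for it afterwards.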
set_io_settings(io_settings: Dict[str, Any]) -> None

Set settings to use when saving plots.

set_output_dir(output_dir: Path) -> None

Set the current output directory.

This method updates the current output directory where files will be saved. This is typically called when starting a new experiment to organize outputs by experiment.

Parameters:
output_dir : pathlib.Path

The new output directory path

Examples

>>> io_service = IOService("io", Path("results"), Path("output"))
>>> io_service.set_output_dir(Path("experiment_1"))
>>> # Subsequent saves go to the experiment_1 directory
class NumpyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

Bases: JSONEncoder

Custom JSON encoder for NumPy data types.

This encoder extends the standard JSON encoder to handle NumPy data types that are not natively JSON serializable. It converts NumPy integers, floats, and arrays to their Python equivalents.

Notes

This encoder is used automatically when saving data with NumPy arrays or scalars to JSON files through the IOService.

Examples

>>> import json
>>> import numpy as np
>>> from brisk.services.io import NumpyEncoder
>>> 
>>> data = {
...     "accuracy": np.float64(0.95),
...     "scores": np.array([0.1, 0.2, 0.3])
... }
>>> json_str = json.dumps(data, cls=NumpyEncoder)
>>> print(json_str)  # {"accuracy": 0.95, "scores": [0.1, 0.2, 0.3]}
default(o: Any) -> Any

Convert NumPy objects to JSON-serializable types.

Parameters:
o : Any

The object to convert

Returns:
Any

JSON-serializable representation of the object
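
A `default` override of this kind follows the standard `json.JSONEncoder` extension pattern. The sketch below is not the NumpyEncoder source: `ToListEncoder` is a hypothetical name, and it uses duck typing on `tolist()` (a method that NumPy arrays and NumPy scalars both provide) in place of explicit NumPy type checks, which keeps the example free of third-party imports.

```python
import json
from typing import Any


class ToListEncoder(json.JSONEncoder):
    """Encode any object exposing tolist(), such as NumPy arrays and scalars."""

    def default(self, o: Any) -> Any:
        tolist = getattr(o, "tolist", None)
        if callable(tolist):
            # Converts the object to plain Python lists and scalars.
            return tolist()
        # Fall back to the base class, which raises TypeError.
        return super().default(o)
```

The encoder is passed through the `cls` argument, for example `json.dumps(data, cls=ToListEncoder)`; objects without a `tolist()` method still raise `TypeError` as with the standard encoder.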