I/O Service
- class IOService(name: str, results_dir: Path, output_dir: Path)
Bases: BaseService
I/O service for file operations, data loading, and plot management.
This service provides the I/O functionality for the Brisk package, including saving and loading data files, generating and saving plots, dynamic module loading, and configuration management. It reads CSV, Excel, and SQL database files, writes JSON, and accepts metadata alongside the data and plots it saves.
The service maintains separate directories for results (static) and output (dynamic), allowing for organized file management throughout experiments.
- Attributes:
- results_dir : Path
The root directory for all results. It does not change at runtime.
- output_dir : Path
The current output directory. It can be changed at runtime with set_output_dir.
- format : str
Default format for saving plots (default: “png”)
- width : int
Default plot width in inches (default: 10)
- height : int
Default plot height in inches (default: 8)
- dpi : int
Default plot DPI (default: 300)
- transparent : bool
Whether to save plots with transparent background (default: False)
Notes
The service automatically creates output directories as needed and integrates with the reporting service to store plot data for reports.
Examples
>>> from brisk.services.io import IOService
>>> from pathlib import Path
>>>
>>> # Create I/O service
>>> io_service = IOService("io", Path("results"), Path("output"))
>>>
>>> # Save data
>>> data = {"accuracy": 0.95, "precision": 0.92}
>>> io_service.save_to_json(data, Path("results.json"), {})
>>>
>>> # Save plot
>>> io_service.save_plot(Path("plot.png"), plot=my_plot)
>>>
>>> # Load data
>>> df = io_service.load_data("data.csv")
- load_custom_evaluators(evaluators_file: Path)
Load the register_custom_evaluators() function from evaluators.py
- static load_data(data_path: str, table_name: str | None = None) -> DataFrame
Load data from CSV, Excel, or SQL database files.
This static method loads data from a CSV, Excel, or SQL database file into a pandas DataFrame. It detects the file format from the file extension and uses the matching loading method.
- Parameters:
- data_path : str
Path to the dataset file
- table_name : Optional[str], default=None
Name of the table in SQL database (required for SQL files)
- Returns:
- pd.DataFrame
The loaded dataset as a pandas DataFrame
- Raises:
- ValueError
If file format is unsupported or table_name is missing for SQL database
Examples
>>> from brisk.services.io import IOService
>>>
>>> # Load CSV file
>>> df = IOService.load_data("data.csv")
>>>
>>> # Load Excel file
>>> df = IOService.load_data("data.xlsx")
>>>
>>> # Load SQL database
>>> df = IOService.load_data("data.db", table_name="my_table")
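The extension-based detection described for load_data can be sketched as a small dispatch step. This is a minimal, dependency-free illustration under stated assumptions, not Brisk's implementation: the helper name `detect_loader` and the extension table are hypothetical (only `.csv`, `.xlsx`, and `.db` appear in the examples), and the function returns a label instead of calling a pandas reader.

```python
from pathlib import Path
from typing import Optional

# Assumed mapping from file extension to loader kind (hypothetical, not from Brisk).
LOADER_BY_SUFFIX = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".xls": "excel",
    ".db": "sql",
    ".sqlite": "sql",
}


def detect_loader(data_path: str, table_name: Optional[str] = None) -> str:
    """Return the loader kind for a path, validating SQL arguments."""
    # Compare extensions case-insensitively so "DATA.CSV" also matches.
    suffix = Path(data_path).suffix.lower()
    kind = LOADER_BY_SUFFIX.get(suffix)
    if kind is None:
        raise ValueError(f"Unsupported file format: {suffix!r}")
    # A database file can hold many tables, so the caller must name one.
    if kind == "sql" and table_name is None:
        raise ValueError("table_name is required for SQL database files")
    return kind
```

In a full loader, each label would typically select a reader such as `pandas.read_csv`, `pandas.read_excel`, or `pandas.read_sql_query`.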
- static load_module_object(project_root: str, module_filename: str, object_name: str, required: bool = True) -> object | None
Dynamically load an object from a specified module file.
This static method loads a Python object from a module file at runtime. It’s useful for loading configuration objects, custom evaluators, or other dynamic components from project files.
- Parameters:
- project_root : str
Path to project root directory
- module_filename : str
Name of the module file (e.g., “algorithms.py”)
- object_name : str
Name of the object to load from the module
- required : bool, default=True
Whether to raise an error if the object is not found
- Returns:
- Union[object, None]
The loaded object, or None if not found and not required
- Raises:
- FileNotFoundError
If the module file is not found
- AttributeError
If the required object is not found in the module
Examples
>>> from brisk.services.io import IOService
>>>
>>> # Load a configuration object
>>> config = IOService.load_module_object(
...     "/path/to/project", "algorithms.py", "ALGORITHM_CONFIG"
... )
>>>
>>> # Load optional object (returns None if not found)
>>> optional = IOService.load_module_object(
...     "/path/to/project", "optional.py", "OPTIONAL_OBJ",
...     required=False
... )
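Runtime loading of this kind is commonly built on the standard library's `importlib.util`. The following is a minimal sketch of that general technique, assuming behaviour that matches the documented parameters and exceptions; the function name `load_object` is hypothetical and the actual Brisk implementation may differ.

```python
import importlib.util
from pathlib import Path


def load_object(project_root, module_filename, object_name, required=True):
    """Load a named object from a Python file located under project_root."""
    module_path = Path(project_root) / module_filename
    if not module_path.exists():
        raise FileNotFoundError(
            f"{module_filename} not found in {project_root}"
        )

    # Build a module spec from the file path, then execute the file so its
    # top-level names become attributes of the new module object.
    spec = importlib.util.spec_from_file_location(module_path.stem, module_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    if hasattr(module, object_name):
        return getattr(module, object_name)
    if required:
        raise AttributeError(
            f"{object_name} is not defined in {module_filename}"
        )
    return None
```

Executing the module runs its top-level code, so this approach should only be pointed at trusted project files.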
- save_plot(output_path: Path, metadata: Dict[str, Any] | None = None, plot: ggplot | Figure | None = None, **kwargs) -> None
Save plot to file with metadata and SVG conversion.
This method saves a plot to a file in the specified format, with automatic SVG conversion for report generation. It supports multiple plot types including matplotlib, plotnine, and plotly figures.
- Parameters:
- output_path : Path
Path where the plot file will be saved
- metadata : Optional[Dict[str, Any]], default=None
Metadata to include with the plot
- plot : Optional[pn.ggplot | go.Figure], default=None
Plot object to save (plotnine or plotly figure)
- **kwargs
Additional plot parameters (height, width, etc.)
Notes
The method automatically converts plots to SVG format for reports and handles different plot types. If no plot is provided, it saves the current matplotlib figure.
Examples
>>> import matplotlib.pyplot as plt
>>> io_service = IOService("io", Path("results"), Path("output"))
>>> # Save plotnine plot
>>> io_service.save_plot(Path("plot.png"), plot=my_plotnine_plot)
>>>
>>> # Save plotly plot
>>> io_service.save_plot(Path("plot.png"), plot=my_plotly_figure)
>>>
>>> # Save current matplotlib figure
>>> plt.plot([1, 2, 3], [1, 4, 9])
>>> io_service.save_plot(Path("plot.png"))
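The source does not state how `**kwargs` interacts with the service-level defaults (format, width, height, dpi, transparent). One plausible reading is that per-call values override those defaults. The sketch below illustrates that assumption only; `PlotSettings` and `resolve_plot_settings` are hypothetical names, not part of Brisk.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PlotSettings:
    """Plot defaults, using the default values listed for IOService."""
    format: str = "png"
    width: int = 10        # inches
    height: int = 8        # inches
    dpi: int = 300
    transparent: bool = False


def resolve_plot_settings(defaults: PlotSettings, **kwargs) -> PlotSettings:
    """Return a copy of the defaults with per-call overrides applied."""
    # Assumption: keyword arguments take precedence over the defaults.
    # replace() raises TypeError for names that are not PlotSettings fields.
    return replace(defaults, **kwargs)


settings = resolve_plot_settings(PlotSettings(), width=12, dpi=150)
```

Here `settings` keeps the default height and format while using the overridden width and DPI.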
- save_to_json(data: Dict[str, Any], output_path: Path | str, metadata: Dict[str, Any]) -> None
Save dictionary to JSON file with metadata.
This method saves a dictionary to a JSON file together with its metadata. It automatically creates parent directories if they don’t exist and handles NumPy data types through the NumpyEncoder. The data is also stored in the reporting service for report generation.
- Parameters:
- data : Dict[str, Any]
Dictionary containing the data to save
- output_path : Union[Path, str]
Path where the JSON file will be saved
- metadata : Dict[str, Any]
Metadata to include with the data (stored as “_metadata” key)
Notes
If saving fails, an error is logged but no exception is raised, so callers that depend on the file should check that it exists afterwards.
Examples
>>> io_service = IOService("io", Path("results"), Path("output"))
>>> data = {"accuracy": 0.95, "precision": 0.92}
>>> metadata = {"experiment": "exp_1", "timestamp": "2024-01-15"}
>>> io_service.save_to_json(data, Path("results.json"), metadata)
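The documented behaviour (parent directories created, metadata stored under a “_metadata” key, failures logged rather than raised) can be sketched with the standard library alone. This is an illustration under assumptions, not Brisk's code: the function name `save_json_with_metadata` is hypothetical, placing “_metadata” at the top level of the saved object is assumed, and the NumpyEncoder and reporting-service steps are omitted.

```python
import json
import logging
from pathlib import Path
from typing import Any, Dict, Union

logger = logging.getLogger(__name__)


def save_json_with_metadata(
    data: Dict[str, Any],
    output_path: Union[Path, str],
    metadata: Dict[str, Any],
) -> None:
    """Write data plus a "_metadata" entry to a JSON file, logging failures."""
    path = Path(output_path)
    # Copy so the caller's dictionary is not mutated.
    payload = dict(data)
    payload["_metadata"] = metadata
    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        # Serialize before opening the file so a serialization failure
        # does not leave a half-written file behind.
        text = json.dumps(payload, indent=4)
        path.write_text(text, encoding="utf-8")
    except (OSError, TypeError, ValueError) as exc:
        # Mirror the documented behaviour: log the error, do not raise.
        logger.error("Failed to save JSON to %s: %s", path, exc)
```

Because errors are swallowed, a caller cannot tell from the return value whether the write succeeded.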
- set_output_dir(output_dir: Path) -> None
Set the current output directory.
This method updates the current output directory where files will be saved. This is typically called when starting a new experiment to organize outputs by experiment.
- Parameters:
- output_dir : pathlib.Path
The new output directory path
Examples
>>> io_service = IOService("io", Path("results"), Path("output"))
>>> io_service.set_output_dir(Path("experiment_1"))
>>> # Now all saves will go to experiment_1 directory
- class NumpyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
Bases: JSONEncoder
Custom JSON encoder for NumPy data types.
This encoder extends the standard JSON encoder to handle NumPy data types that are not natively JSON serializable. It converts NumPy integers, floats, and arrays to their Python equivalents.
Notes
This encoder is used automatically when saving data with NumPy arrays or scalars to JSON files through the IOService.
Examples
>>> import json
>>> import numpy as np
>>> from brisk.services.io import NumpyEncoder
>>>
>>> data = {
...     "accuracy": np.float64(0.95),
...     "scores": np.array([0.1, 0.2, 0.3])
... }
>>> json_str = json.dumps(data, cls=NumpyEncoder)
>>> print(json_str)  # {"accuracy": 0.95, "scores": [0.1, 0.2, 0.3]}
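An encoder like this works by overriding `json.JSONEncoder.default`, which the `json` module calls for any object it cannot serialize natively. The sketch below shows that mechanism without depending on NumPy: it duck-types on a `tolist()` method, which NumPy arrays and scalars both provide. The names `ArrayLikeEncoder` and `FakeArray` are hypothetical, and the actual NumpyEncoder may check NumPy types explicitly instead.

```python
import json


class ArrayLikeEncoder(json.JSONEncoder):
    """Serialize any object exposing tolist() as its plain-Python value."""

    def default(self, obj):
        # default() is only called for objects json cannot handle itself.
        to_list = getattr(obj, "tolist", None)
        if callable(to_list):
            return to_list()
        # Defer to the base class, which raises TypeError.
        return super().default(obj)


class FakeArray:
    """Hypothetical stand-in for a NumPy array, used only in this demo."""

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        return list(self._values)


encoded = json.dumps({"scores": FakeArray([0.1, 0.2, 0.3])}, cls=ArrayLikeEncoder)
print(encoded)  # {"scores": [0.1, 0.2, 0.3]}
```

Objects without a `tolist()` method still raise `TypeError`, so unsupported types are not silently dropped.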