Scaling

class ScalingPreprocessor(method: str = 'standard', **kwargs)

Bases: BasePreprocessor

Preprocessor for scaling numerical features.

Provides various scaling methods for numerical features while preserving categorical features in their original form. Supports standard, min-max, robust, max-abs, and normalizer scaling methods.

Parameters:
method : str, default="standard"

Scaling method: "standard", "minmax", "robust", "maxabs", or "normalizer"

Attributes:
scaler : sklearn.preprocessing scaler

The fitted scaler object

_scaled_features : list

List of feature names that were scaled during fit

is_fitted : bool

Whether the preprocessor has been fitted

Notes

Categorical features are excluded from scaling and keep their original form (see the categorical_features argument of fit). Only continuous numerical features are scaled with the selected method.
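The five method names line up naturally with scalers in sklearn.preprocessing. The sketch below assumes that mapping (the `ASSUMED_SCALERS` dictionary is illustrative and not taken from this library's source) and applies each scaler to a small single-column array to show how the methods differ.

```python
import numpy as np
from sklearn.preprocessing import (
    MaxAbsScaler,
    MinMaxScaler,
    Normalizer,
    RobustScaler,
    StandardScaler,
)

# Assumed mapping from method name to scikit-learn scaler class.
# It mirrors the documented method names; the real lookup may differ.
ASSUMED_SCALERS = {
    "standard": StandardScaler,  # zero mean, unit variance per column
    "minmax": MinMaxScaler,      # rescale each column to [0, 1]
    "robust": RobustScaler,      # center on the median, scale by the IQR
    "maxabs": MaxAbsScaler,      # divide each column by its max absolute value
    "normalizer": Normalizer,    # rescale each ROW to unit L2 norm
}

X = np.array([[1.0], [2.0], [3.0]])

# Fit and apply every scaler with its default settings.
results = {
    name: scaler_cls().fit_transform(X).ravel()
    for name, scaler_cls in ASSUMED_SCALERS.items()
}

for name, values in results.items():
    print(name, np.round(values, 3))
```

Note that Normalizer works per row rather than per column, so with a single column every value is rescaled to 1. The other four scalers learn per-column statistics.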

Examples

Standard scaling:

>>> preprocessor = ScalingPreprocessor(method="standard")

Min-max scaling:

>>> preprocessor = ScalingPreprocessor(method="minmax")

Robust scaling:

>>> preprocessor = ScalingPreprocessor(method="robust")

export_params() -> Dict[str, Any]

Export parameters for serialization.

Returns:
Dict[str, Any]

Dictionary containing all parameters
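The exact layout of the exported dictionary is not documented here. As a rough sketch only, the example below builds a JSON-serializable dictionary from a fitted StandardScaler; every key name (`method`, `scaled_features`, `is_fitted`, `scaler_state`) is a hypothetical choice for illustration, not the library's actual schema.

```python
import json

import numpy as np
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler().fit(np.array([[1.0], [2.0], [3.0]]))

# Hypothetical export layout; the keys below are illustrative assumptions.
params = {
    "method": "standard",
    "scaled_features": ["x"],
    "is_fitted": True,
    "scaler_state": {
        # ndarray -> list so that json can encode the values
        "mean": scaler.mean_.tolist(),
        "scale": scaler.scale_.tolist(),
    },
}

# Round-trip through JSON to confirm the dictionary is serializable.
restored = json.loads(json.dumps(params))
print(restored["scaler_state"]["mean"])
```

Converting NumPy arrays to plain lists is the step that usually matters for serialization, since json cannot encode ndarray objects directly.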

fit(X: DataFrame, y: Series | None = None, categorical_features: List[str] | None = None) -> ScalingPreprocessor

Fit the scaler to the data.

Learns scaling parameters from the training data, excluding categorical features from scaling.

Parameters:
X : pd.DataFrame

Training data

y : pd.Series, optional

Target values (not used for scaling)

categorical_features : List[str], optional

List of categorical feature names to exclude from scaling

Returns:
self : ScalingPreprocessor

Fitted preprocessor

Notes

The scaler is fitted only on continuous features, excluding any categorical features specified in categorical_features. This ensures that categorical features remain in their original form while continuous features are properly scaled.
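The column selection described above can be sketched with pandas and scikit-learn directly. This is an assumption about how the exclusion might work, not the library's implementation; the variable `scaled_features` plays the role of the documented `_scaled_features` attribute.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

X = pd.DataFrame(
    {
        "age": [20.0, 30.0, 40.0],
        "income": [1000.0, 2000.0, 3000.0],
        "region": [0, 1, 2],  # integer-encoded categorical column
    }
)
categorical_features = ["region"]

# Columns to scale: everything not declared categorical.
scaled_features = [c for c in X.columns if c not in categorical_features]

# Fit only on the continuous columns, so no statistics are learned for "region".
scaler = StandardScaler().fit(X[scaled_features])

print(scaled_features)
print(scaler.mean_)
```

Declaring `region` as categorical matters here because it is stored as integers: without the explicit list, a dtype-based selection would treat it as numeric and scale it.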

fit_transform(X: DataFrame, y: Series | None = None, categorical_features: List[str] | None = None) -> Tuple[DataFrame, Series | None]

Fit the scaler and transform the data.

Parameters:
X : pd.DataFrame

Data to fit and transform

y : pd.Series, optional

Target values (not used for scaling)

categorical_features : List[str], optional

List of categorical feature names to exclude from scaling

Returns:
Tuple[pd.DataFrame, Optional[pd.Series]]

Tuple (scaled_X, y). y is returned unchanged.

Notes

This method combines fit and transform operations, scaling only the continuous features while preserving categorical features.

get_feature_names(feature_names: List[str] | None = None) -> List[str]

Get the feature names after transformation.

Parameters:
feature_names : List[str], optional

Original feature names

Returns:
List[str]

Feature names after transformation (same as input)

Notes

Scaling does not change feature names, so the original names are returned unchanged.

transform(X: DataFrame, y: Series | None = None) -> Tuple[DataFrame, Series | None]

Transform the data using the fitted scaler.

Applies scaling to continuous features while preserving categorical features in their original form.

Parameters:
X : pd.DataFrame

Features to transform

y : pd.Series, optional

Target values (passed through unchanged)

Returns:
Tuple[pd.DataFrame, Optional[pd.Series]]

Tuple (scaled_X, y). y is returned unchanged because this preprocessor only scales features.

Notes

Only features that were scaled during fit are transformed. Categorical features remain unchanged. The target variable y is not modified by this preprocessor.
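A minimal sketch of this behavior, assuming a scaler fitted on the continuous columns only. The helper `transform_sketch` is a hypothetical stand-in written for illustration and is not the library's code.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

X_train = pd.DataFrame({"age": [20.0, 30.0, 40.0], "region": ["n", "s", "n"]})
X_new = pd.DataFrame({"age": [25.0, 40.0], "region": ["s", "n"]})
y_new = pd.Series([0, 1])

scaled_features = ["age"]  # recorded at fit time in this sketch
scaler = MinMaxScaler().fit(X_train[scaled_features])


def transform_sketch(X, y=None):
    """Hypothetical stand-in for transform: scale known columns, pass y through."""
    X_out = X.copy()  # leave the caller's frame untouched
    X_out[scaled_features] = scaler.transform(X[scaled_features])
    return X_out, y


scaled_X, y_out = transform_sketch(X_new, y_new)
print(scaled_X)
```

The new data is scaled with the minimum and maximum learned from `X_train`, not recomputed from `X_new`, which is why the scaler is fitted once and reused.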