Scaling

class ScalingPreprocessor(method: str = 'standard', **kwargs)

Bases: BasePreprocessor

Preprocessor for scaling numerical features.

Scales numerical features with a selectable method and leaves categorical features in their original form. Supported methods are standard, min-max, robust, max-abs, and normalizer scaling.

Parameters:

method (str, default="standard"): Scaling method. One of "standard", "minmax", "robust", "maxabs", or "normalizer".
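As a rough illustration of what each method computes, the sketch below reproduces the arithmetic in plain Python. It assumes that the five method names map to scikit-learn's StandardScaler, MinMaxScaler, RobustScaler, MaxAbsScaler, and Normalizer with default settings. The source only says that the scaler is a sklearn.preprocessing object, so that mapping is an assumption, and the helper functions are hypothetical names used for illustration.

```python
import math
import statistics


def standard_scale(values):
    """Center on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def minmax_scale(values):
    """Map the column linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    median = statistics.median(values)
    return [(v - median) / (q3 - q1) for v in values]


def maxabs_scale(values):
    """Divide by the largest absolute value, so results lie in [-1, 1]."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values]


def normalize_row(row):
    """Rescale one sample (a row, not a column) to unit Euclidean length."""
    norm = math.hypot(*row)
    return [v / norm for v in row]


# Note: this sketch does not guard against constant columns (division by zero).
column = [1.0, 2.0, 3.0, 4.0, 5.0]
print(minmax_scale(column))
print(robust_scale(column))
print(normalize_row([3.0, 4.0]))
```

The first four helpers work per column, while the normalizer works per row, so under this assumption "normalizer" changes the relationship between features within a sample rather than the distribution of a single feature.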

Attributes:

scaler (sklearn.preprocessing scaler): The fitted scaler object
_scaled_features (list): List of feature names that were scaled during fit
is_fitted (bool): Whether the preprocessor has been fitted

Notes

The preprocessor automatically excludes categorical features from scaling to preserve their original form. Only continuous numerical features are scaled using the specified method.

Examples

Standard scaling:

>>> preprocessor = ScalingPreprocessor(method="standard")

Min-max scaling:

>>> preprocessor = ScalingPreprocessor(method="minmax")

Robust scaling:

>>> preprocessor = ScalingPreprocessor(method="robust")

export_params() → Dict[str, Any]

Export parameters for serialization.

Returns:

Dict[str, Any]: Dictionary containing all parameters

fit(X: DataFrame, y: Series | None = None, categorical_features: List[str] | None = None) → ScalingPreprocessor

Fit the scaler to the data.

Learns scaling parameters from the training data, excluding categorical features from scaling.

Parameters:

X (pd.DataFrame): Training data
y (pd.Series, optional): Target values (not used for scaling)
categorical_features (List[str], optional): List of categorical feature names to exclude from scaling

Returns:

self (ScalingPreprocessor): Fitted preprocessor

Notes

The scaler is fitted only on continuous features, excluding any categorical features specified in categorical_features. This ensures that categorical features remain in their original form while continuous features are properly scaled.
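The column selection described above can be sketched with pandas and scikit-learn as follows. This is a minimal illustration, not the class's actual implementation: `fit_scaler` and the `SCALERS` mapping from method names to scikit-learn classes are hypothetical, and the sketch assumes every column not listed in categorical_features is numeric.

```python
import pandas as pd
from sklearn.preprocessing import (
    MaxAbsScaler,
    MinMaxScaler,
    Normalizer,
    RobustScaler,
    StandardScaler,
)

# Assumed mapping from method names to scikit-learn classes
# (not confirmed by the source documentation).
SCALERS = {
    "standard": StandardScaler,
    "minmax": MinMaxScaler,
    "robust": RobustScaler,
    "maxabs": MaxAbsScaler,
    "normalizer": Normalizer,
}


def fit_scaler(X, method="standard", categorical_features=None):
    """Fit a scaler on every column of X that is not listed as categorical."""
    categorical = set(categorical_features or [])
    # Preserve the original column order of the features to be scaled.
    scaled_features = [col for col in X.columns if col not in categorical]
    scaler = SCALERS[method]()
    scaler.fit(X[scaled_features])
    return scaler, scaled_features


X = pd.DataFrame(
    {
        "age": [20.0, 30.0, 40.0],
        "income": [1000.0, 2000.0, 3000.0],
        "region": [0, 1, 2],  # integer-coded categorical column
    }
)
scaler, scaled_features = fit_scaler(X, categorical_features=["region"])
print(scaled_features)
```

Passing the categorical names explicitly matters for integer-coded categories such as `region` here, which would otherwise be indistinguishable from a continuous column by dtype alone.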

fit_transform(X: DataFrame, y: Series | None = None, categorical_features: List[str] | None = None) → Tuple[DataFrame, Series | None]

Fit the scaler and transform the data.

Parameters:

X (pd.DataFrame): Data to fit and transform
y (pd.Series, optional): Target values (not used for scaling)
categorical_features (List[str], optional): List of categorical feature names to exclude from scaling

Returns:

Tuple[pd.DataFrame, Optional[pd.Series]]: Tuple (scaled_X, y); y is returned unchanged.

Notes

This method combines fit and transform operations, scaling only the continuous features while preserving categorical features.

get_feature_names(feature_names: List[str] | None = None) → List[str]

Get the feature names after transformation.

Parameters:

feature_names (List[str], optional): Original feature names

Returns:

List[str]: Feature names after transformation (same as input)

Notes

Scaling does not change feature names, so the original names are returned unchanged.

transform(X: DataFrame, y: Series | None = None) → Tuple[DataFrame, Series | None]

Transform the data using the fitted scaler.

Applies scaling to continuous features while preserving categorical features in their original form.

Parameters:

X (pd.DataFrame): Features to transform
y (pd.Series, optional): Target values (passed through unchanged)

Returns:

Tuple[pd.DataFrame, Optional[pd.Series]]: Tuple (scaled_X, y); y is returned unchanged because this preprocessor only scales features.

Notes

Only features that were scaled during fit are transformed. Categorical features remain unchanged. The target variable y is not modified by this preprocessor.
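The behaviour described in these notes might look roughly like the sketch below, which applies an already fitted scikit-learn scaler to the previously scaled columns and passes everything else through. `transform_features` is a hypothetical helper written for illustration, and MinMaxScaler stands in for whichever scaler the preprocessor holds; neither is taken from the library's source.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def transform_features(X, scaler, scaled_features, y=None):
    """Scale only the columns seen during fit; return (scaled_X, y)."""
    X_scaled = X.copy()  # leave the caller's DataFrame untouched
    X_scaled[scaled_features] = scaler.transform(X[scaled_features])
    return X_scaled, y  # y is passed through unchanged


train = pd.DataFrame({"height": [150.0, 160.0, 170.0], "group": ["a", "b", "a"]})
test = pd.DataFrame({"height": [155.0, 180.0], "group": ["b", "a"]})
y_test = pd.Series([0, 1])

# Fit on the training data only, then reuse the learned parameters.
scaled_features = ["height"]
scaler = MinMaxScaler().fit(train[scaled_features])

X_out, y_out = transform_features(test, scaler, scaled_features, y_test)
print(X_out)
```

Because the scaling parameters come from the training data, a test value above the training maximum (180 here, against a maximum of 170) is mapped above 1 rather than clipped.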
