Scaling
- class ScalingPreprocessor(method: str = 'standard', **kwargs)
Bases: BasePreprocessor
Preprocessor for scaling numerical features.
Provides various scaling methods for numerical features while preserving categorical features in their original form. Supports standard, min-max, robust, max-abs, and normalizer scaling methods.
- Parameters:
- method : str, default="standard"
Scaling method. One of "standard", "minmax", "robust", "maxabs", or "normalizer".
- Attributes:
- scaler : sklearn.preprocessing scaler
The fitted scaler object
- _scaled_features : list
List of feature names that were scaled during fit
- is_fitted : bool
Whether the preprocessor has been fitted
Notes
The preprocessor automatically excludes categorical features from scaling to preserve their original form. Only continuous numerical features are scaled using the specified method.
Examples
- Standard scaling:
>>> preprocessor = ScalingPreprocessor(method="standard")
- Min-max scaling:
>>> preprocessor = ScalingPreprocessor(method="minmax")
- Robust scaling:
>>> preprocessor = ScalingPreprocessor(method="robust")
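The attribute description says the fitted object is a sklearn.preprocessing scaler, so the five method names plausibly correspond to scikit-learn classes. The sketch below shows one way such a lookup could work. The mapping, the `make_scaler` helper, and the forwarding of `**kwargs` to the scaler are assumptions for illustration and are not confirmed by this page.

```python
from sklearn.preprocessing import (
    MaxAbsScaler,
    MinMaxScaler,
    Normalizer,
    RobustScaler,
    StandardScaler,
)

# Assumed mapping from method name to scikit-learn class (not confirmed here).
SCALERS = {
    "standard": StandardScaler,
    "minmax": MinMaxScaler,
    "robust": RobustScaler,
    "maxabs": MaxAbsScaler,
    "normalizer": Normalizer,
}


def make_scaler(method: str = "standard", **kwargs):
    """Hypothetical helper: build the scikit-learn scaler for a method name."""
    try:
        scaler_cls = SCALERS[method]
    except KeyError:
        raise ValueError(f"Unknown scaling method: {method!r}") from None
    # Extra keyword arguments are passed straight to the scaler constructor.
    return scaler_cls(**kwargs)
```

Note that scikit-learn's Normalizer rescales each row to unit norm rather than each column, so "normalizer" behaves differently in kind from the other four options.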
- export_params() → Dict[str, Any]
Export parameters for serialization.
- Returns:
- Dict[str, Any]
Dictionary containing all parameters
- fit(X: DataFrame, y: Series | None = None, categorical_features: List[str] | None = None) → ScalingPreprocessor
Fit the scaler to the data.
Learns scaling parameters from the training data, excluding categorical features from scaling.
- Parameters:
- X : pd.DataFrame
Training data
- y : pd.Series, optional
Target values (not used for scaling)
- categorical_features : List[str], optional
List of categorical feature names to exclude from scaling
- Returns:
- self : ScalingPreprocessor
Fitted preprocessor
Notes
The scaler is fitted only on continuous features, excluding any categorical features specified in categorical_features. This ensures that categorical features remain in their original form while continuous features are properly scaled.
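The column selection described in this note can be sketched with scikit-learn directly. This is a minimal illustration, not the library's implementation: `fit_continuous` is a hypothetical helper, StandardScaler stands in for whichever method is configured, and every column not listed in `categorical_features` is assumed to be numeric.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def fit_continuous(X: pd.DataFrame, categorical_features=None):
    """Hypothetical helper: fit a scaler on the non-categorical columns only."""
    categorical = set(categorical_features or [])
    # Keep the original column order, dropping the declared categoricals.
    columns = [c for c in X.columns if c not in categorical]
    scaler = StandardScaler().fit(X[columns])
    return scaler, columns


X = pd.DataFrame(
    {
        "age": [20.0, 30.0, 40.0],
        "income": [1000.0, 2000.0, 3000.0],
        "region": [0, 1, 2],  # integer-encoded categorical, left unscaled
    }
)
scaler, scaled_columns = fit_continuous(X, categorical_features=["region"])
```

Here `scaled_columns` plays the role that the `_scaled_features` attribute appears to play: it records which columns the scaler has seen, so the same subset can be selected again at transform time.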
- fit_transform(X: DataFrame, y: Series | None = None, categorical_features: List[str] | None = None) → Tuple[DataFrame, Series | None]
Fit the scaler and transform the data.
- Parameters:
- X : pd.DataFrame
Data to fit and transform
- y : pd.Series, optional
Target values (not used for scaling)
- categorical_features : List[str], optional
List of categorical feature names to exclude from scaling
- Returns:
- Tuple[pd.DataFrame, Optional[pd.Series]]
Tuple containing (scaled_X, y). The y is returned unchanged
Notes
This method combines fit and transform operations, scaling only the continuous features while preserving categorical features.
- get_feature_names(feature_names: List[str] | None = None) → List[str]
Get the feature names after transformation.
- Parameters:
- feature_names : List[str], optional
Original feature names
- Returns:
- List[str]
Feature names after transformation (same as input)
Notes
Scaling does not change feature names, so the original names are returned unchanged.
- transform(X: DataFrame, y: Series | None = None) → Tuple[DataFrame, Series | None]
Transform the data using the fitted scaler.
Applies scaling to continuous features while preserving categorical features in their original form.
- Parameters:
- X : pd.DataFrame
Features to transform
- y : pd.Series, optional
Target values (passed through unchanged)
- Returns:
- Tuple[pd.DataFrame, Optional[pd.Series]]
Tuple containing (scaled_X, y). The y is returned unchanged as this preprocessor only scales features
Notes
Only features that were scaled during fit are transformed. Categorical features remain unchanged. The target variable y is not modified by this preprocessor.
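The behaviour described here (scale the fitted columns, leave the rest alone, hand y back untouched) might look roughly like the following. `transform_fitted` is a hypothetical stand-in written against scikit-learn and pandas. It assumes the scaled columns already hold floats, and it is not the library's actual code.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def transform_fitted(X, scaler, scaled_features, y=None):
    """Hypothetical helper: scale only the fitted columns and pass y through."""
    X_out = X.copy()  # avoid mutating the caller's frame
    X_out[scaled_features] = scaler.transform(X[scaled_features])
    return X_out, y


train = pd.DataFrame({"age": [20.0, 30.0, 40.0], "region": [0, 1, 2]})
scaler = MinMaxScaler().fit(train[["age"]])

new = pd.DataFrame({"age": [25.0, 40.0], "region": [2, 0]})
y = pd.Series([1, 0])
X_scaled, y_out = transform_fitted(new, scaler, ["age"], y)
```

Because the scaler was fitted on the training range of 20 to 40, the new ages are mapped relative to that range, while `region` and `y` come back exactly as they went in.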