Linear Models¶

`sklego.linear_model.LowessRegression` ¶

Bases: BaseEstimator, RegressorMixin

LowessRegression estimator: LOWESS (Locally Weighted Scatterplot Smoothing) is a type of local regression.

Warning

This model can get expensive to predict. In fact the prediction step needs to compute the distance between each sample to predict x_i with all the training samples.

Parameters:

Name	Type	Description	Default
`sigma`	`float`	The bandwidth parameter that determines the width of the smoothing.	`1.0`
`span`	`float \| None`	The fraction of data points to consider during smoothing.	`None`

Attributes:

Name	Type	Description
`X_`	`np.ndarray of shape (n_samples, n_features)`	The training data.
`y_`	`np.ndarray of shape (n_samples,)`	The target (training) values.

Examples:

from sklego.linear_model import LowessRegression
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=100, n_features=2, noise=10)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

lowess = LowessRegression(sigma=1, span=0.5)
lowess.fit(X_train, y_train)

y_pred = lowess.predict(X_test)
print(y_pred)

Source code in sklego/linear_model.py

class LowessRegression(BaseEstimator, RegressorMixin):
    """`LowessRegression` estimator: LOWESS (Locally Weighted Scatterplot Smoothing) is a type of
    [local regression](https://en.wikipedia.org/wiki/Local_regression).

    !!! warning
        This model *can* get expensive to predict.
        In fact the prediction step needs to compute the distance between each sample to predict `x_i` with all the
        training samples.

    Parameters
    ----------
    sigma : float, default=1.0
        The bandwidth parameter that determines the width of the smoothing.
    span : float | None, default=None
        The fraction of data points to consider during smoothing.

    Attributes
    ----------
    X_ : np.ndarray of shape (n_samples, n_features)
        The training data.
    y_ : np.ndarray of shape (n_samples,)
        The target (training) values.


    Examples
    --------
    ```python
    from sklego.linear_model import LowessRegression
    from sklearn.datasets import make_regression
    from sklearn.model_selection import train_test_split

    X, y = make_regression(n_samples=100, n_features=2, noise=10)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    lowess = LowessRegression(sigma=1, span=0.5)
    lowess.fit(X_train, y_train)

    y_pred = lowess.predict(X_test)
    print(y_pred)
    ```
    """

    def __init__(self, sigma=1, span=None):
        self.sigma = sigma
        self.span = span

    def fit(self, X, y):
        """Fit the estimator on training data `X` and `y` by storing them in `self.X_` and `self.y_`, and
        validating the parameters.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features )
            The training data.
        y : array-like of shape (n_samples,)
            The target values.

        Returns
        -------
        self : LowessRegression
            The fitted estimator.

        Raises
        ------
        ValueError
            - If `span` is not between 0 and 1.
            - If `sigma` is negative.
        """
        X, y = check_X_y(X, y, estimator=self, dtype=FLOAT_DTYPES)
        if self.span is not None:
            if not 0 <= self.span <= 1:
                raise ValueError(f"Param `span` must be 0 <= span <= 1, got: {self.span}")
        if self.sigma < 0:
            raise ValueError(f"Param `sigma` must be >= 0, got: {self.sigma}")
        self.X_ = X
        self.y_ = y
        self.n_features_in_ = X.shape[1]
        return self

    def _calc_wts(self, x_i):
        """Calculate the weights for a single point `x_i` using the training data `self.X_` and the parameters
        `self.sigma` and `self.span`. The weights are calculated as `np.exp(-(distances**2) / self.sigma)`,
        where distances are the distances between `x_i` and all the training samples.

        If `self.span` is not None, then the weights are multiplied by
        `(distances <= np.quantile(distances, q=self.span))`.
        """
        distances = np.linalg.norm(self.X_ - x_i, axis=1)
        weights = np.exp(-(distances**2) / self.sigma)
        if self.span:
            weights = weights * (distances <= np.quantile(distances, q=self.span))
        return weights

    def predict(self, X):
        """Predict target values for `X` using fitted estimator. This process is expensive because it needs to compute
        the distance between each sample `x_i` with all the training samples.

        Then it calculates the weights for **each sample** `x_i` as `np.exp(-(distances**2) / self.sigma)` and finally
        it computes the weighted average of the `y` values weighted by these weights.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The data to predict.

        Returns
        -------
        array-like of shape (n_samples,)
            The predicted values.
        """
        X = check_array(X, estimator=self, dtype=FLOAT_DTYPES)
        check_is_fitted(self, ["X_", "y_"])

        try:
            results = np.stack([np.average(self.y_, weights=self._calc_wts(x_i=x_i)) for x_i in X])
        except ZeroDivisionError:
            msg = (
                "Weights, resulting from `np.exp(-(distances**2) / self.sigma)`, are all zero. "
                "Try to increase the value of `sigma` or to normalize the input data.\n\n"
                "`distances` refer to the distance between each sample `x_i` with all the"
                "training samples."
            )
            raise ValueError(msg)

        return results

`fit(X, y)` ¶

Fit the estimator on training data X and y by storing them in self.X_ and self.y_, and validating the parameters.

Parameters:

Name	Type	Description	Default
`X`	`array-like of shape (n_samples, n_features )`	The training data.	required
`y`	`array-like of shape (n_samples,)`	The target values.	required

Returns:

Name	Type	Description
`self`	`LowessRegression`	The fitted estimator.

Raises:

Type	Description
`ValueError`	If `span` is not between 0 and 1. If `sigma` is negative.

Source code in sklego/linear_model.py

def fit(self, X, y):
    """Fit the estimator on training data `X` and `y` by storing them in `self.X_` and `self.y_`, and
    validating the parameters.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features )
        The training data.
    y : array-like of shape (n_samples,)
        The target values.

    Returns
    -------
    self : LowessRegression
        The fitted estimator.

    Raises
    ------
    ValueError
        - If `span` is not between 0 and 1.
        - If `sigma` is negative.
    """
    X, y = check_X_y(X, y, estimator=self, dtype=FLOAT_DTYPES)
    if self.span is not None:
        if not 0 <= self.span <= 1:
            raise ValueError(f"Param `span` must be 0 <= span <= 1, got: {self.span}")
    if self.sigma < 0:
        raise ValueError(f"Param `sigma` must be >= 0, got: {self.sigma}")
    self.X_ = X
    self.y_ = y
    self.n_features_in_ = X.shape[1]
    return self

`predict(X)` ¶

Predict target values for X using fitted estimator. This process is expensive because it needs to compute the distance between each sample x_i with all the training samples.

Then it calculates the weights for each sample x_i as np.exp(-(distances**2) / self.sigma) and finally it computes the weighted average of the y values weighted by these weights.

Parameters:

Name	Type	Description	Default
`X`	`array-like of shape (n_samples, n_features)`	The data to predict.	required

Returns:

Type	Description
`array-like of shape (n_samples,)`	The predicted values.

Source code in sklego/linear_model.py

def predict(self, X):
    """Predict target values for `X` using fitted estimator. This process is expensive because it needs to compute
    the distance between each sample `x_i` with all the training samples.

    Then it calculates the weights for **each sample** `x_i` as `np.exp(-(distances**2) / self.sigma)` and finally
    it computes the weighted average of the `y` values weighted by these weights.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features)
        The data to predict.

    Returns
    -------
    array-like of shape (n_samples,)
        The predicted values.
    """
    X = check_array(X, estimator=self, dtype=FLOAT_DTYPES)
    check_is_fitted(self, ["X_", "y_"])

    try:
        results = np.stack([np.average(self.y_, weights=self._calc_wts(x_i=x_i)) for x_i in X])
    except ZeroDivisionError:
        msg = (
            "Weights, resulting from `np.exp(-(distances**2) / self.sigma)`, are all zero. "
            "Try to increase the value of `sigma` or to normalize the input data.\n\n"
            "`distances` refer to the distance between each sample `x_i` with all the"
            "training samples."
        )
        raise ValueError(msg)

    return results

`sklego.linear_model.ProbWeightRegression` ¶

Bases: BaseEstimator, RegressorMixin

ProbWeightRegression assumes that all input signals in X need to be reweighted with weights that sum up to one in order to predict y.

This can be very useful in combination with sklego.meta.EstimatorTransformer because it allows to construct an ensemble.

Parameters:

Name	Type	Description	Default
`non_negative`	`bool`	If True, forces all weights to be non-negative.	`True`

Attributes:

Name	Type	Description
`n_features_in_`	`int`	The number of features seen during `fit`.
`coef_`	`(ndarray, shape(n_columns))`	The learned coefficients after fitting the model.
`coefs_`	`(ndarray, shape(n_columns))`	Deprecated, please use `coef_` instead.

Examples:

import numpy as np
from sklego.linear_model import ProbWeightRegression

X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array([1, 2, 3, 4])

pwr = ProbWeightRegression().fit(X, y)

# The weights sum up to 1
assert np.isclose(pwr.coef_.sum(), 1)

X_test = np.array([[5, 6], [6, 7]])

# The prediction is positive (all weights are positive, and features are positive)
assert all(pwr.predict(X_test) > 0)

# The weights are positive
assert all(pwr.coef_ > -1e-8)

Info

This model requires cvxpy to be installed. If you don't have it installed, you can install it with:

pip install cvxpy
# or pip install scikit-lego"[cvxpy]"

Source code in sklego/linear_model.py

class ProbWeightRegression(BaseEstimator, RegressorMixin):
    """`ProbWeightRegression` assumes that all input signals in `X` need to be reweighted with weights that sum up to
    one in order to predict `y`.

    This can be very useful in combination with `sklego.meta.EstimatorTransformer` because it allows to construct
    an ensemble.

    Parameters
    ----------
    non_negative : bool, default=True
        If True, forces all weights to be non-negative.

    Attributes
    ----------
    n_features_in_ : int
        The number of features seen during `fit`.
    coef_ : np.ndarray, shape (n_columns,)
        The learned coefficients after fitting the model.
    coefs_ : np.ndarray, shape (n_columns,)
        Deprecated, please use `coef_` instead.

    Examples
    --------
    ```python
    import numpy as np
    from sklego.linear_model import ProbWeightRegression

    X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
    y = np.array([1, 2, 3, 4])

    pwr = ProbWeightRegression().fit(X, y)

    # The weights sum up to 1
    assert np.isclose(pwr.coef_.sum(), 1)

    X_test = np.array([[5, 6], [6, 7]])

    # The prediction is positive (all weights are positive, and features are positive)
    assert all(pwr.predict(X_test) > 0)

    # The weights are positive
    assert all(pwr.coef_ > -1e-8)
    ```

    !!! info

        This model requires [`cvxpy`](https://www.cvxpy.org/) to be installed. If you don't have it installed, you can
        install it with:

        ```bash
        pip install cvxpy
        # or pip install scikit-lego"[cvxpy]"
        ```
    """

    def __init__(self, non_negative=True):
        self.non_negative = non_negative

    def fit(self, X, y):
        r"""Fit the estimator on training data `X` and `y` by solving the following convex optimization problem:

        $$\begin{array}{ll}{\operatorname{minimize}} & {\sum_{i=1}^{N}\left(\mathbf{x}_{i}
        \boldsymbol{\beta}-y_{i}\right)^{2}} \\
        {\text { subject to }} & {\sum_{j=1}^{p} \beta_{j}=1} \\
        {(\text{If non_negative=True})} & {\beta_{j} \geq 0, \quad j=1, \ldots, p} \end{array}$$

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features )
            The training data.
        y : array-like of shape (n_samples,)
            The target values.

        Returns
        -------
        self : ProbWeightRegression
            The fitted estimator.
        """
        X, y = check_X_y(X, y, estimator=self, dtype=FLOAT_DTYPES)

        # Construct the problem.
        betas = cp.Variable(X.shape[1])
        objective = cp.Minimize(cp.sum_squares(X @ betas - y))
        constraints = [sum(betas) == 1]
        if self.non_negative:
            constraints.append(0 <= betas)

        # Solve the problem.
        prob = cp.Problem(objective, constraints)
        prob.solve()
        self.coef_ = betas.value
        self.n_features_in_ = X.shape[1]

        return self

    def predict(self, X):
        """Predict target values for `X` using fitted estimator by multiplying `X` with the learned coefficients.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The data to predict.

        Returns
        -------
        array-like of shape (n_samples,)
            The predicted data.
        """
        X = check_array(X, estimator=self, dtype=FLOAT_DTYPES)
        check_is_fitted(self, ["coef_"])
        return np.dot(X, self.coef_)

    @property
    def coefs_(self):
        warn(
            "Please use `coef_` instead of `coefs_`, `coefs_` will be deprecated in future versions",
            DeprecationWarning,
        )
        return self.coef_

`fit(X, y)` ¶

Fit the estimator on training data X and y by solving the following convex optimization problem:

\[\begin{array}{ll}{\operatorname{minimize}} & {\sum_{i=1}^{N}\left(\mathbf{x}_{i} \boldsymbol{\beta}-y_{i}\right)^{2}} \\ {\text { subject to }} & {\sum_{j=1}^{p} \beta_{j}=1} \\ {(\text{If non_negative=True})} & {\beta_{j} \geq 0, \quad j=1, \ldots, p} \end{array}\]

Parameters:

Name	Type	Description	Default
`X`	`array-like of shape (n_samples, n_features )`	The training data.	required
`y`	`array-like of shape (n_samples,)`	The target values.	required

Returns:

Name	Type	Description
`self`	`ProbWeightRegression`	The fitted estimator.

Source code in sklego/linear_model.py

def fit(self, X, y):
    r"""Fit the estimator on training data `X` and `y` by solving the following convex optimization problem:

    $$\begin{array}{ll}{\operatorname{minimize}} & {\sum_{i=1}^{N}\left(\mathbf{x}_{i}
    \boldsymbol{\beta}-y_{i}\right)^{2}} \\
    {\text { subject to }} & {\sum_{j=1}^{p} \beta_{j}=1} \\
    {(\text{If non_negative=True})} & {\beta_{j} \geq 0, \quad j=1, \ldots, p} \end{array}$$

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features )
        The training data.
    y : array-like of shape (n_samples,)
        The target values.

    Returns
    -------
    self : ProbWeightRegression
        The fitted estimator.
    """
    X, y = check_X_y(X, y, estimator=self, dtype=FLOAT_DTYPES)

    # Construct the problem.
    betas = cp.Variable(X.shape[1])
    objective = cp.Minimize(cp.sum_squares(X @ betas - y))
    constraints = [sum(betas) == 1]
    if self.non_negative:
        constraints.append(0 <= betas)

    # Solve the problem.
    prob = cp.Problem(objective, constraints)
    prob.solve()
    self.coef_ = betas.value
    self.n_features_in_ = X.shape[1]

    return self

`predict(X)` ¶

Predict target values for X using fitted estimator by multiplying X with the learned coefficients.

Parameters:

Name	Type	Description	Default
`X`	`array-like of shape (n_samples, n_features)`	The data to predict.	required

Returns:

Type	Description
`array-like of shape (n_samples,)`	The predicted data.

Source code in sklego/linear_model.py

def predict(self, X):
    """Predict target values for `X` using fitted estimator by multiplying `X` with the learned coefficients.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features)
        The data to predict.

    Returns
    -------
    array-like of shape (n_samples,)
        The predicted data.
    """
    X = check_array(X, estimator=self, dtype=FLOAT_DTYPES)
    check_is_fitted(self, ["coef_"])
    return np.dot(X, self.coef_)

`sklego.linear_model.DeadZoneRegressor` ¶

Bases: BaseEstimator, RegressorMixin

The DeadZoneRegressor estimator implements a regression model that incorporates a dead zone effect for improving the robustness of regression predictions.

The dead zone effect allows the model to reduce the impact of small errors in the training data on the regression results, which can be particularly useful when dealing with noisy or unreliable data.

The estimator minimizes the following loss function using gradient descent:

\[\frac{1}{n} \sum_{i=1}^{n} \text{deadzone}\left(\left|X_i \cdot w - y_i\right|\right)\]

where:

\[\text{deadzone}(e) = \begin{cases} 1 & \text{if } e > \text{threshold} \text{ & effect="constant"} \\ e & \text{if } e > \text{threshold} \text{ & effect="linear"} \\ e^2 & \text{if } e > \text{threshold} \text{ & effect="quadratic"} \\ 0 & \text{otherwise} \end{cases} \]

Parameters:

Name	Type	Description	Default
`threshold`	`float`	The threshold value for the dead zone effect.	`0.3`
`relative`	`bool`	If True, the threshold is relative to the target value. Namely the dead zone effect is applied to the relative error between the predicted and target values.	`False`
`effect`	`Literal[linear, quadratic, constant]`	The type of dead zone effect to apply. It can be one of the following: "linear": the errors within the threshold have no impact (their contribution is effectively zero), and errors outside the threshold are penalized linearly. "quadratic": the errors within the threshold have no impact (their contribution is effectively zero), and errors outside the threshold are penalized quadratically (squared). "constant": the errors within the threshold have no impact, and errors outside the threshold are penalized with a constant value.	`"linear"`
`n_iter`	`int`	The number of iterations to run the gradient descent algorithm.	`2000`
`stepsize`	`float`	The step size for the gradient descent algorithm.	`0.01`
`check_grad`	`bool`	If True, check the gradients numerically, just to be safe.	`False`

Attributes:

Name	Type	Description
`coef_`	`(ndarray, shape(n_columns))`	The learned coefficients after fitting the model.
`coefs_`	`(ndarray, shape(n_columns))`	Deprecated, please use `coef_` instead.

Examples:

import numpy as np
from sklego.linear_model import DeadZoneRegressor

X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
y = np.array([1, 2, 3, 4])

dzr = DeadZoneRegressor(threshold=0.5, relative=False, effect="quadratic").fit(X, y)

X_test = np.array([[5, 6], [6, 7]])
y_pred = dzr.predict(X_test)

print(y_pred)

Source code in sklego/linear_model.py

class DeadZoneRegressor(BaseEstimator, RegressorMixin):
    r"""The `DeadZoneRegressor` estimator implements a regression model that incorporates a _dead zone effect_ for
    improving the robustness of regression predictions.

    The dead zone effect allows the model to reduce the impact of small errors in the training data on the regression
    results, which can be particularly useful when dealing with noisy or unreliable data.

    The estimator minimizes the following loss function using gradient descent:

    $$\frac{1}{n} \sum_{i=1}^{n} \text{deadzone}\left(\left|X_i \cdot w - y_i\right|\right)$$

    where:

    $$\text{deadzone}(e) =
    \begin{cases}
    1 & \text{if } e > \text{threshold} \text{ & effect="constant"} \\
    e & \text{if } e > \text{threshold} \text{ & effect="linear"} \\
    e^2 & \text{if } e > \text{threshold} \text{ & effect="quadratic"} \\
    0 & \text{otherwise}
    \end{cases}
    $$

    Parameters
    ----------
    threshold : float, default=0.3
        The threshold value for the dead zone effect.
    relative : bool, default=False
        If True, the threshold is relative to the target value. Namely the _dead zone effect_ is applied to the
        relative error between the predicted and target values.
    effect : Literal["linear", "quadratic", "constant"], default="linear"
        The type of dead zone effect to apply. It can be one of the following:

        - "linear": the errors within the threshold have no impact (their contribution is effectively zero), and errors
            outside the threshold are penalized linearly.
        - "quadratic": the errors within the threshold have no impact (their contribution is effectively zero), and
            errors outside the threshold are penalized quadratically (squared).
        - "constant": the errors within the threshold have no impact, and errors outside the threshold are penalized
            with a constant value.
    n_iter : int, default=2000
        The number of iterations to run the gradient descent algorithm.
    stepsize : float, default=0.01
        The step size for the gradient descent algorithm.
    check_grad : bool, default=False
        If True, check the gradients numerically, _just to be safe_.

    Attributes
    ----------
    coef_ : np.ndarray, shape (n_columns,)
        The learned coefficients after fitting the model.
    coefs_ : np.ndarray, shape (n_columns,)
        Deprecated, please use `coef_` instead.

    Examples
    --------

    ```python
    import numpy as np
    from sklego.linear_model import DeadZoneRegressor

    X = np.array([[1, 2], [2, 3], [3, 4], [4, 5]])
    y = np.array([1, 2, 3, 4])

    dzr = DeadZoneRegressor(threshold=0.5, relative=False, effect="quadratic").fit(X, y)

    X_test = np.array([[5, 6], [6, 7]])
    y_pred = dzr.predict(X_test)

    print(y_pred)
    ```


    """

    _ALLOWED_EFFECTS = ("linear", "quadratic", "constant")

    def __init__(
        self,
        threshold=0.3,
        relative=False,
        effect="linear",
    ):
        self.threshold = threshold
        self.relative = relative
        self.effect = effect

    def fit(self, X, y):
        """Fit the estimator on training data `X` and `y` by optimizing the loss function using gradient descent.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features )
            The training data.
        y : array-like of shape (n_samples,)
            The target values.

        Returns
        -------
        self : DeadZoneRegressor
            The fitted estimator.

        Raises
        ------
        ValueError
            If `effect` is not one of "linear", "quadratic" or "constant".
        """
        X, y = check_X_y(X, y, estimator=self, dtype=FLOAT_DTYPES)
        if self.effect not in self._ALLOWED_EFFECTS:
            raise ValueError(f"effect {self.effect} must be in {self._ALLOWED_EFFECTS}")

        def deadzone(errors):
            if self.effect == "constant":
                error_weight = errors.shape[0]
            elif self.effect == "linear":
                error_weight = errors
            elif self.effect == "quadratic":
                error_weight = errors**2

            return np.where(errors > self.threshold, error_weight, 0.0)

        def training_loss(weights):
            prediction = np.dot(X, weights)
            errors = np.abs(prediction - y)

            if self.relative:
                errors /= np.abs(y)

            loss = np.mean(deadzone(errors))
            return loss

        def deadzone_derivative(errors):
            if self.effect == "constant":
                error_weight = 0.0
            elif self.effect == "linear":
                error_weight = 1.0
            elif self.effect == "quadratic":
                error_weight = 2 * errors

            return np.where(errors > self.threshold, error_weight, 0.0)

        def training_loss_derivative(weights):
            prediction = np.dot(X, weights)
            errors = np.abs(prediction - y)

            if self.relative:
                errors /= np.abs(y)

            loss_derivative = deadzone_derivative(errors)
            errors_derivative = np.sign(prediction - y)

            if self.relative:
                errors_derivative /= np.abs(y)

            derivative = np.dot(errors_derivative * loss_derivative, X) / X.shape[0]

            return derivative

        self.n_features_in_ = X.shape[1]

        minimize_result = minimize(
            training_loss,
            x0=np.zeros(self.n_features_in_),  # np.random.normal(0, 1, n_features_)
            tol=1e-20,
            jac=training_loss_derivative,
        )

        self.convergence_status_ = minimize_result.message
        self.coef_ = minimize_result.x
        return self

    def predict(self, X):
        """Predict target values for `X` using fitted estimator by multiplying `X` with the learned coefficients.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The data to predict.

        Returns
        -------
        array-like of shape (n_samples,)
            The predicted data.
        """
        X = check_array(X, estimator=self, dtype=FLOAT_DTYPES)
        check_is_fitted(self, ["coef_"])
        return np.dot(X, self.coef_)

    @property
    def coefs_(self):
        warn(
            "Please use `coef_` instead of `coefs_`, `coefs_` will be deprecated in future versions",
            DeprecationWarning,
        )
        return self.coef_

    @property
    def allowed_effects(self):
        warn(
            "Please use `_ALLOWED_EFFECTS` instead of `allowed_effects`,"
            "`allowed_effects` will be deprecated in future versions",
            DeprecationWarning,
        )
        return self._ALLOWED_EFFECTS

`fit(X, y)` ¶

Fit the estimator on training data X and y by optimizing the loss function using gradient descent.

Parameters:

Name	Type	Description	Default
`X`	`array-like of shape (n_samples, n_features )`	The training data.	required
`y`	`array-like of shape (n_samples,)`	The target values.	required

Returns:

Name	Type	Description
`self`	`DeadZoneRegressor`	The fitted estimator.

Raises:

Type	Description
`ValueError`	If `effect` is not one of "linear", "quadratic" or "constant".

Source code in sklego/linear_model.py

def fit(self, X, y):
    """Fit the estimator on training data `X` and `y` by optimizing the loss function using gradient descent.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features )
        The training data.
    y : array-like of shape (n_samples,)
        The target values.

    Returns
    -------
    self : DeadZoneRegressor
        The fitted estimator.

    Raises
    ------
    ValueError
        If `effect` is not one of "linear", "quadratic" or "constant".
    """
    X, y = check_X_y(X, y, estimator=self, dtype=FLOAT_DTYPES)
    if self.effect not in self._ALLOWED_EFFECTS:
        raise ValueError(f"effect {self.effect} must be in {self._ALLOWED_EFFECTS}")

    def deadzone(errors):
        if self.effect == "constant":
            error_weight = errors.shape[0]
        elif self.effect == "linear":
            error_weight = errors
        elif self.effect == "quadratic":
            error_weight = errors**2

        return np.where(errors > self.threshold, error_weight, 0.0)

    def training_loss(weights):
        prediction = np.dot(X, weights)
        errors = np.abs(prediction - y)

        if self.relative:
            errors /= np.abs(y)

        loss = np.mean(deadzone(errors))
        return loss

    def deadzone_derivative(errors):
        if self.effect == "constant":
            error_weight = 0.0
        elif self.effect == "linear":
            error_weight = 1.0
        elif self.effect == "quadratic":
            error_weight = 2 * errors

        return np.where(errors > self.threshold, error_weight, 0.0)

    def training_loss_derivative(weights):
        prediction = np.dot(X, weights)
        errors = np.abs(prediction - y)

        if self.relative:
            errors /= np.abs(y)

        loss_derivative = deadzone_derivative(errors)
        errors_derivative = np.sign(prediction - y)

        if self.relative:
            errors_derivative /= np.abs(y)

        derivative = np.dot(errors_derivative * loss_derivative, X) / X.shape[0]

        return derivative

    self.n_features_in_ = X.shape[1]

    minimize_result = minimize(
        training_loss,
        x0=np.zeros(self.n_features_in_),  # np.random.normal(0, 1, n_features_)
        tol=1e-20,
        jac=training_loss_derivative,
    )

    self.convergence_status_ = minimize_result.message
    self.coef_ = minimize_result.x
    return self

`predict(X)` ¶

Predict target values for X using fitted estimator by multiplying X with the learned coefficients.

Parameters:

Name	Type	Description	Default
`X`	`array-like of shape (n_samples, n_features)`	The data to predict.	required

Returns:

Type	Description
`array-like of shape (n_samples,)`	The predicted data.

Source code in sklego/linear_model.py

def predict(self, X):
    """Predict target values for `X` using fitted estimator by multiplying `X` with the learned coefficients.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features)
        The data to predict.

    Returns
    -------
    array-like of shape (n_samples,)
        The predicted data.
    """
    X = check_array(X, estimator=self, dtype=FLOAT_DTYPES)
    check_is_fitted(self, ["coef_"])
    return np.dot(X, self.coef_)

`sklego.linear_model.DemographicParityClassifier` ¶

Bases: BaseEstimator, LinearClassifierMixin

DemographicParityClassifier is a logistic regression classifier which can be constrained on demographic parity (p% score).

It minimizes the log loss while constraining the correlation between the specified sensitive_cols and the distance to the decision boundary of the classifier.

Warning

This classifier only works for binary classification problems.

\[\begin{array}{cl}{\operatorname{minimize}} & -\sum_{i=1}^{N} \log p\left(y_{i} | \mathbf{x}_{i}, \boldsymbol{\theta}\right) \\ {\text { subject to }} & {\frac{1}{N} \sum_{i=1}^{N}\left(\mathbf{z}_{i}-\overline{\mathbf{z}}\right) d_\boldsymbol{\theta}\left(\mathbf{x}_{i}\right) \leq \mathbf{c}} \\ {} & {\frac{1}{N} \sum_{i=1}^{N}\left(\mathbf{z}_{i}-\overline{\mathbf{z}}\right) d_{\boldsymbol{\theta}}\left(\mathbf{x}_{i}\right) \geq-\mathbf{c}}\end{array}\]

Parameters:

Name	Type	Description	Default
`covariance_threshold`	`float \| None`	The maximum allowed covariance between the sensitive attributes and the distance to the decision boundary. If set to None, no fairness constraint is enforced.	required
`sensitive_cols`	`List[str] \| List[int] \| None`	List of sensitive column names (if X is a dataframe) or a list of column indices (if X is a numpy array).	`None`
`C`	`float`	Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization.	`1.0`
`penalty`	`Literal[l1, l2, none, None]`	The type of penalty to apply to the model. "l1" applies L1 regularization, "l2" applies L2 regularization, while None (or "none") disables regularization.	`"l1"`
`fit_intercept`	`bool`	Whether or not a constant term (a.k.a. bias or intercept) should be added to the decision function.	`True`
`max_iter`	`int`	Maximum number of iterations taken for the solvers to converge.	`100`
`train_sensitive_cols`	`bool`	Indicates whether the model should use the sensitive columns in the fit step.	`False`
`multi_class`	`Literal[ovr, ovo]`	The method to use for multiclass predictions.	`"ovr"`
`n_jobs`	`int \| None`	The amount of parallel jobs that should be used to fit the model.	`1`

Source

M. Zafar et al. (2017), Fairness Constraints: Mechanisms for Fair Classification

Examples:

from sklego.linear_model import DemographicParityClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=100,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    n_clusters_per_class=1,
)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

dp = DemographicParityClassifier(
    covariance_threshold=0.1, sensitive_cols=[0]
).fit(X_train, y_train)

y_pred = dp.predict_proba(X_test)

print(y_pred)

Source code in sklego/linear_model.py

class DemographicParityClassifier(BaseEstimator, LinearClassifierMixin):
    r"""`DemographicParityClassifier` is a logistic regression classifier which can be constrained on demographic
    parity (p% score).

    It minimizes the log loss while constraining the correlation between the specified `sensitive_cols` and the
    distance to the decision boundary of the classifier.

    !!! warning
        This classifier only works for binary classification problems.

    $$\begin{array}{cl}{\operatorname{minimize}} & -\sum_{i=1}^{N} \log p\left(y_{i} | \mathbf{x}_{i},
        \boldsymbol{\theta}\right) \\
        {\text { subject to }} & {\frac{1}{N} \sum_{i=1}^{N}\left(\mathbf{z}_{i}-\overline{\mathbf{z}}\right)
        d_\boldsymbol{\theta}\left(\mathbf{x}_{i}\right) \leq \mathbf{c}} \\
        {} & {\frac{1}{N} \sum_{i=1}^{N}\left(\mathbf{z}_{i}-\overline{\mathbf{z}}\right)
        d_{\boldsymbol{\theta}}\left(\mathbf{x}_{i}\right) \geq-\mathbf{c}}\end{array}$$

    Parameters
    ----------
    covariance_threshold : float | None
        The maximum allowed covariance between the sensitive attributes and the distance to the decision boundary.
        If set to None, no fairness constraint is enforced.
    sensitive_cols : List[str] | List[int] | None, default=None
        List of sensitive column names (if X is a dataframe) or a list of column indices (if X is a numpy array).
    C : float, default=1.0
        Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values
        specify stronger regularization.
    penalty : Literal["l1", "l2", "none", None], default="l1"
        The type of penalty to apply to the model. "l1" applies L1 regularization, "l2" applies L2 regularization,
        while None (or "none") disables regularization.
    fit_intercept : bool, default=True
        Whether or not a constant term (a.k.a. bias or intercept) should be added to the decision function.
    max_iter : int, default=100
        Maximum number of iterations taken for the solvers to converge.
    train_sensitive_cols : bool, default=False
        Indicates whether the model should use the sensitive columns in the fit step.
    multi_class : Literal["ovr", "ovo"], default="ovr"
        The method to use for multiclass predictions.
    n_jobs : int | None, default=1
        The amount of parallel jobs that should be used to fit the model.

    Source
    ------
    M. Zafar et al. (2017), Fairness Constraints: Mechanisms for Fair Classification


    Examples
    --------
    ```python
    from sklego.linear_model import DemographicParityClassifier
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    X, y = make_classification(
        n_samples=100,
        n_features=2,
        n_informative=2,
        n_redundant=0,
        n_clusters_per_class=1,
    )

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    dp = DemographicParityClassifier(
        covariance_threshold=0.1, sensitive_cols=[0]
    ).fit(X_train, y_train)

    y_pred = dp.predict_proba(X_test)

    print(y_pred)
    ```
    """

    def __new__(cls, *args, multi_class="ovr", n_jobs=1, **kwargs):
        multiclass_meta = {"ovr": OneVsRestClassifier, "ovo": OneVsOneClassifier}[multi_class]
        return multiclass_meta(_DemographicParityClassifier(*args, **kwargs), n_jobs=n_jobs)

`sklego.linear_model.EqualOpportunityClassifier` ¶

Bases: BaseEstimator, LinearClassifierMixin

EqualOpportunityClassifier is a logistic regression classifier which can be constrained on equal opportunity score.

It minimizes the log loss while constraining the correlation between the specified sensitive_cols and the distance to the decision boundary of the classifier for those examples that have a y_true of 1.

Warning

This classifier only works for binary classification problems.

\[\begin{array}{cl}{\operatorname{minimize}} & -\sum_{i=1}^{N} \log p\left(y_{i} | \mathbf{x}_{i}, \boldsymbol{\theta}\right) \\ {\text { subject to }} & {\frac{1}{POS} \sum_{i=1}^{POS}\left(\mathbf{z}_{i}-\overline{\mathbf{z}}\right) d_\boldsymbol{\theta}\left(\mathbf{x}_{i}\right) \leq \mathbf{c}} \\ {} & {\frac{1}{POS} \sum_{i=1}^{POS}\left(\mathbf{z}_{i}-\overline{\mathbf{z}}\right) d_{\boldsymbol{\theta}}\left(\mathbf{x}_{i}\right) \geq-\mathbf{c}}\end{array}\]

where POS is the subset of the population where \(\text{y_true} = 1\)

Parameters:

Name	Type	Description	Default
`covariance_threshold`	`float \| None`	The maximum allowed covariance between the sensitive attributes and the distance to the decision boundary. If set to None, no fairness constraint is enforced.	required
`positive_target`	`int`	The name of the class which is associated with a positive outcome	required
`sensitive_cols`	`List[str] \| List[int] \| None`	List of sensitive column names (if X is a dataframe) or a list of column indices (if X is a numpy array).	`None`
`C`	`float`	Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values specify stronger regularization.	`1.0`
`penalty`	`Literal[l1, l2, none, None]`	The type of penalty to apply to the model. "l1" applies L1 regularization, "l2" applies L2 regularization, while None (or "none") disables regularization.	`"l1"`
`fit_intercept`	`bool`	Whether or not a constant term (a.k.a. bias or intercept) should be added to the decision function.	`True`
`max_iter`	`int`	Maximum number of iterations taken for the solvers to converge.	`100`
`train_sensitive_cols`	`bool`	Indicates whether the model should use the sensitive columns in the fit step.	`False`
`multi_class`	`Literal[ovr, ovo]`	The method to use for multiclass predictions.	`"ovr"`
`n_jobs`	`int \| None`	The amount of parallel jobs that should be used to fit the model.	`1`

Examples:

from sklego.linear_model import EqualOpportunityClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=100,
    n_features=2,
    n_informative=2,
    n_redundant=0,
    n_clusters_per_class=1,
)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

eo = EqualOpportunityClassifier(
    covariance_threshold=0.1, positive_target=1, sensitive_cols=[0]
).fit(X_train, y_train)

y_pred = eo.predict_proba(X_test)

print(y_pred)

Source code in sklego/linear_model.py

class EqualOpportunityClassifier(BaseEstimator, LinearClassifierMixin):
    r"""`EqualOpportunityClassifier` is a logistic regression classifier which can be constrained on equal opportunity
    score.

    It minimizes the log loss while constraining the correlation between the specified `sensitive_cols` and the
    distance to the decision boundary of the classifier for those examples that have a y_true of 1.

    !!! warning
        This classifier only works for binary classification problems.

    $$\begin{array}{cl}{\operatorname{minimize}} & -\sum_{i=1}^{N} \log p\left(y_{i} | \mathbf{x}_{i},
        \boldsymbol{\theta}\right) \\
        {\text { subject to }} & {\frac{1}{POS} \sum_{i=1}^{POS}\left(\mathbf{z}_{i}-\overline{\mathbf{z}}\right)
        d_\boldsymbol{\theta}\left(\mathbf{x}_{i}\right) \leq \mathbf{c}} \\
        {} & {\frac{1}{POS} \sum_{i=1}^{POS}\left(\mathbf{z}_{i}-\overline{\mathbf{z}}\right)
        d_{\boldsymbol{\theta}}\left(\mathbf{x}_{i}\right) \geq-\mathbf{c}}\end{array}$$

    where POS is the subset of the population where $\text{y_true} = 1$

    Parameters
    ----------
    covariance_threshold : float | None
        The maximum allowed covariance between the sensitive attributes and the distance to the decision boundary.
        If set to None, no fairness constraint is enforced.
    positive_target : int
        The name of the class which is associated with a positive outcome
    sensitive_cols : List[str] | List[int] | None, default=None
        List of sensitive column names (if X is a dataframe) or a list of column indices (if X is a numpy array).
    C : float, default=1.0
        Inverse of regularization strength; must be a positive float. Like in support vector machines, smaller values
        specify stronger regularization.
    penalty : Literal["l1", "l2", "none", None], default="l1"
        The type of penalty to apply to the model. "l1" applies L1 regularization, "l2" applies L2 regularization,
        while None (or "none") disables regularization.
    fit_intercept : bool, default=True
        Whether or not a constant term (a.k.a. bias or intercept) should be added to the decision function.
    max_iter : int, default=100
        Maximum number of iterations taken for the solvers to converge.
    train_sensitive_cols : bool, default=False
        Indicates whether the model should use the sensitive columns in the fit step.
    multi_class : Literal["ovr", "ovo"], default="ovr"
        The method to use for multiclass predictions.
    n_jobs : int | None, default=1
        The amount of parallel jobs that should be used to fit the model.

    Examples
    --------

    ```python
    from sklego.linear_model import EqualOpportunityClassifier
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    X, y = make_classification(
        n_samples=100,
        n_features=2,
        n_informative=2,
        n_redundant=0,
        n_clusters_per_class=1,
    )

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

    eo = EqualOpportunityClassifier(
        covariance_threshold=0.1, positive_target=1, sensitive_cols=[0]
    ).fit(X_train, y_train)

    y_pred = eo.predict_proba(X_test)

    print(y_pred)
    ```
    """

    def __new__(cls, *args, multi_class="ovr", n_jobs=1, **kwargs):
        multiclass_meta = {"ovr": OneVsRestClassifier, "ovo": OneVsOneClassifier}[multi_class]
        return multiclass_meta(_EqualOpportunityClassifier(*args, **kwargs), n_jobs=n_jobs)

`sklego.linear_model.BaseScipyMinimizeRegressor` ¶

Bases: BaseEstimator, RegressorMixin, ABC

Abstract base class for regressors relying on Scipy's minimize method to minimize a (custom) loss function.

Derive a class from this one and give it the function to be minimized. The derived class should implement the _get_objective method, which should return the loss function and its gradient.

Info

This implementation uses scipy.optimize.minimize.

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Constant that multiplies the penalty terms.	`0.0`
`l1_ratio`	`float`	The ElasticNet mixing parameter, with `0 <= l1_ratio <= 1`: `l1_ratio = 0` is equivalent to an L2 penalty. `l1_ratio = 1` is equivalent to an L1 penalty. `0 < l1_ratio < 1` is the combination of L1 and L2.	`0.0`
`fit_intercept`	`bool`	Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).	`True`
`copy_X`	`bool`	If True, `X` will be copied; else, it may be overwritten.	`True`
`positive`	`bool`	When set to True, forces the coefficients to be positive.	`False`
`method`	`Literal[SLSQP, TNC, L - BFGS - B]`	Type of solver to use for optimization.	`"SLSQP"`

Attributes:

Name	Type	Description
`coef_`	`np.ndarray of shape (n_features,)`	Estimated coefficients of the model.
`intercept_`	`float`	Independent term in the linear model. Set to 0.0 if `fit_intercept = False`.
`n_features_in_`	`int`	Number of features seen during `fit`.

Source code in sklego/linear_model.py

class BaseScipyMinimizeRegressor(BaseEstimator, RegressorMixin, ABC):
    """Abstract base class for regressors relying on Scipy's
    [minimize method](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html) to minimize a
    (custom) loss function.

    Derive a class from this one and give it the function to be minimized. The derived class should implement the
    `_get_objective` method, which should return the loss function and its gradient.

    !!! info
        This implementation uses
        [scipy.optimize.minimize](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html).

    Parameters
    ----------
    alpha : float, default=0.0
        Constant that multiplies the penalty terms.
    l1_ratio : float, default=0.0
        The ElasticNet mixing parameter, with `0 <= l1_ratio <= 1`:

        - `l1_ratio = 0` is equivalent to an L2 penalty.
        - `l1_ratio = 1` is equivalent to an L1 penalty.
        - `0 < l1_ratio < 1` is the combination of L1 and L2.
    fit_intercept : bool, default=True
        Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations
        (i.e. data is expected to be centered).
    copy_X : bool, default=True
        If True, `X` will be copied; else, it may be overwritten.
    positive : bool, default=False
        When set to True, forces the coefficients to be positive.
    method : Literal["SLSQP", "TNC", "L-BFGS-B"], default="SLSQP"
        Type of solver to use for optimization.

    Attributes
    ----------
    coef_ : np.ndarray of shape (n_features,)
        Estimated coefficients of the model.
    intercept_ : float
        Independent term in the linear model. Set to 0.0 if `fit_intercept = False`.
    n_features_in_ : int
        Number of features seen during `fit`.
    """

    def __init__(
        self,
        alpha=0.0,
        l1_ratio=0.0,
        fit_intercept=True,
        copy_X=True,
        positive=False,
        method="SLSQP",
    ):
        self.alpha = alpha
        self.l1_ratio = l1_ratio
        self.fit_intercept = fit_intercept
        self.copy_X = copy_X
        self.positive = positive
        if method not in ("SLSQP", "TNC", "L-BFGS-B"):
            raise ValueError(f'method should be one of "SLSQP", "TNC", "L-BFGS-B", ' f"got {method} instead")
        self.method = method

    @abstractmethod
    def _get_objective(self, X, y, sample_weight):
        """Produce the loss function to be minimized, and its gradient to speed up computations.

        Parameters
        ----------
        X : np.ndarray of shape (n_samples, n_features)
            The training data.
        y : np.ndarray of shape (n_samples,)
            The target values.
        sample_weight : np.ndarray of shape (n_samples,) | None, default=None
            Individual weights for each sample.

        Returns
        -------
        loss : Callable[[np.ndarray], float]
            The loss function to be minimized.
        grad_loss : Callable[[np.ndarray], np.ndarray]
            The gradient of the loss function. Speeds up finding the minimum.
        """
        ...

    def _regularized_loss(self, params):
        return +self.alpha * self.l1_ratio * np.sum(np.abs(params)) + 0.5 * self.alpha * (1 - self.l1_ratio) * np.sum(
            params**2
        )

    def _regularized_grad_loss(self, params):
        return +self.alpha * self.l1_ratio * np.sign(params) + self.alpha * (1 - self.l1_ratio) * params

    def fit(self, X, y, sample_weight=None):
        """Fit the linear model on training data `X` and `y` by optimizing the loss function using gradient descent.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features )
            The training data.
        y : array-like of shape (n_samples,)
            The target values.
        sample_weight : array-like of shape (n_samples,) | None, default=None
            Individual weights for each sample.

        Returns
        -------
        self : BaseScipyMinimizeRegressor
            Fitted linear model.
        """
        X_, grad_loss, loss = self._prepare_inputs(X, sample_weight, y)

        d = X_.shape[1] - self.n_features_in_  # This is either zero or one.
        bounds = self.n_features_in_ * [(0, np.inf)] + d * [(-np.inf, np.inf)] if self.positive else None
        minimize_result = minimize(
            loss,
            x0=np.zeros(self.n_features_in_ + d),
            bounds=bounds,
            method=self.method,
            jac=grad_loss,
            tol=1e-20,
        )
        self.convergence_status_ = minimize_result.message

        if self.fit_intercept:
            *self.coef_, self.intercept_ = minimize_result.x
        else:
            self.coef_ = minimize_result.x
            self.intercept_ = 0.0

        self.coef_ = np.array(self.coef_)

        return self

    def _prepare_inputs(self, X, sample_weight, y):
        """Prepare the inputs for the optimization problem.

        This method is called by `fit` to prepare the inputs for the optimization problem. It adds an intercept column
        to `X` if `fit_intercept=True`, and returns the loss function and its gradient.
        """
        X, y = check_X_y(X, y, y_numeric=True)
        sample_weight = _check_sample_weight(sample_weight, X)
        self.n_features_in_ = X.shape[1]

        n = X.shape[0]
        if self.copy_X:
            X_ = X.copy()
        else:
            X_ = X
        if self.fit_intercept:
            X_ = np.hstack([X_, np.ones(shape=(n, 1))])

        loss, grad_loss = self._get_objective(X_, y, sample_weight)

        return X_, grad_loss, loss

    def predict(self, X):
        """Predict target values for `X` using fitted linear model by multiplying `X` with the learned coefficients.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The data to predict.

        Returns
        -------
        array-like of shape (n_samples,)
            The predicted data.
        """
        check_is_fitted(self)
        X = check_array(X)

        return X @ self.coef_ + self.intercept_

`fit(X, y, sample_weight=None)` ¶

Fit the linear model on training data X and y by optimizing the loss function using gradient descent.

Parameters:

Name	Type	Description	Default
`X`	`array-like of shape (n_samples, n_features )`	The training data.	required
`y`	`array-like of shape (n_samples,)`	The target values.	required
`sample_weight`	`array-like of shape (n_samples,) \| None`	Individual weights for each sample.	`None`

Returns:

Name	Type	Description
`self`	`BaseScipyMinimizeRegressor`	Fitted linear model.

Source code in sklego/linear_model.py

def fit(self, X, y, sample_weight=None):
    """Fit the linear model on training data `X` and `y` by optimizing the loss function using gradient descent.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features )
        The training data.
    y : array-like of shape (n_samples,)
        The target values.
    sample_weight : array-like of shape (n_samples,) | None, default=None
        Individual weights for each sample.

    Returns
    -------
    self : BaseScipyMinimizeRegressor
        Fitted linear model.
    """
    X_, grad_loss, loss = self._prepare_inputs(X, sample_weight, y)

    d = X_.shape[1] - self.n_features_in_  # This is either zero or one.
    bounds = self.n_features_in_ * [(0, np.inf)] + d * [(-np.inf, np.inf)] if self.positive else None
    minimize_result = minimize(
        loss,
        x0=np.zeros(self.n_features_in_ + d),
        bounds=bounds,
        method=self.method,
        jac=grad_loss,
        tol=1e-20,
    )
    self.convergence_status_ = minimize_result.message

    if self.fit_intercept:
        *self.coef_, self.intercept_ = minimize_result.x
    else:
        self.coef_ = minimize_result.x
        self.intercept_ = 0.0

    self.coef_ = np.array(self.coef_)

    return self

`predict(X)` ¶

Predict target values for X using fitted linear model by multiplying X with the learned coefficients.

Parameters:

Name	Type	Description	Default
`X`	`array-like of shape (n_samples, n_features)`	The data to predict.	required

Returns:

Type	Description
`array-like of shape (n_samples,)`	The predicted data.

Source code in sklego/linear_model.py

def predict(self, X):
    """Predict target values for `X` using fitted linear model by multiplying `X` with the learned coefficients.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features)
        The data to predict.

    Returns
    -------
    array-like of shape (n_samples,)
        The predicted data.
    """
    check_is_fitted(self)
    X = check_array(X)

    return X @ self.coef_ + self.intercept_

`sklego.linear_model.ImbalancedLinearRegression` ¶

Bases: BaseScipyMinimizeRegressor

Linear regression where overestimating is overestimation_punishment_factor times worse than underestimating.

A value of overestimation_punishment_factor=5 implies that overestimations by the model are penalized with a factor of 5 while underestimations have a default factor of 1. The formula optimized for is

\[\frac{1}{2 N} \|s \circ (y - Xw) \|_2^2 + \alpha \cdot l_1 \cdot\|w\|_1 + \frac{\alpha}{2} \cdot (1-l_1)\cdot \|w\|_2^2\]

where \(\circ\) is component-wise multiplication and

\[ s = \begin{cases} \text{overestimation_punishment_factor} & \text{if } y - Xw < 0 \\ 1 & \text{otherwise} \end{cases} \]

ImbalancedLinearRegression fits a linear model to minimize the residual sum of squares between the observed targets in the dataset, and the targets predicted by the linear approximation. Compared to normal linear regression, this approach allows for a different treatment of over or under estimations.

Info

This implementation uses scipy.optimize.minimize.

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Constant that multiplies the penalty terms.	`0.0`
`l1_ratio`	`float`	The ElasticNet mixing parameter, with `0 <= l1_ratio <= 1`: `l1_ratio = 0` is equivalent to an L2 penalty. `l1_ratio = 1` is equivalent to an L1 penalty. `0 < l1_ratio < 1` is the combination of L1 and L2.	`0.0`
`fit_intercept`	`bool`	Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).	`True`
`copy_X`	`bool`	If True, `X` will be copied; else, it may be overwritten.	`True`
`positive`	`bool`	When set to True, forces the coefficients to be positive.	`False`
`method`	`Literal[SLSQP, TNC, L - BFGS - B]`	Type of solver to use for optimization.	`"SLSQP"`
`overestimation_punishment_factor`	`float`	Factor to punish overestimations more (if the value is larger than 1) or less (if the value is between 0 and 1).	`1.0`

Attributes:

Name	Type	Description
`coef_`	`np.ndarray of shape (n_features,)`	Estimated coefficients of the model.
`intercept_`	`float`	Independent term in the linear model. Set to 0.0 if `fit_intercept = False`.
`n_features_in_`	`int`	Number of features seen during `fit`.

Examples:

import numpy as np
from sklego.linear_model import ImbalancedLinearRegression

np.random.seed(0)
X = np.random.randn(100, 4)
y = X @ np.array([1, 2, 3, 4]) + 2*np.random.randn(100)

over_bad = ImbalancedLinearRegression(overestimation_punishment_factor=50).fit(X, y)
over_bad.coef_
# array([0.36267036, 1.39526844, 3.4247146 , 3.93679175])

under_bad = ImbalancedLinearRegression(overestimation_punishment_factor=0.01).fit(X, y)
under_bad.coef_
# array([0.73519586, 1.28698197, 2.61362614, 4.35989806])

Source code in sklego/linear_model.py

class ImbalancedLinearRegression(BaseScipyMinimizeRegressor):
    r"""Linear regression where overestimating is `overestimation_punishment_factor` times worse than underestimating.

    A value of `overestimation_punishment_factor=5` implies that overestimations by the model are penalized with a
    factor of 5 while underestimations have a default factor of 1. The formula optimized for is

    $$\frac{1}{2 N} \|s \circ (y - Xw) \|_2^2 + \alpha \cdot l_1 \cdot\|w\|_1 + \frac{\alpha}{2} \cdot (1-l_1)\cdot
    \|w\|_2^2$$

    where $\circ$ is component-wise multiplication and

    $$ s = \begin{cases}
    \text{overestimation_punishment_factor} & \text{if } y - Xw < 0 \\
    1 & \text{otherwise}
    \end{cases}
    $$

    `ImbalancedLinearRegression` fits a linear model to minimize the residual sum of squares between the observed
    targets in the dataset, and the targets predicted by the linear approximation.
    Compared to normal linear regression, this approach allows for a different treatment of over or under estimations.

    !!! info
        This implementation uses
        [scipy.optimize.minimize](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html).

    Parameters
    ----------
    alpha : float, default=0.0
        Constant that multiplies the penalty terms.
    l1_ratio : float, default=0.0
        The ElasticNet mixing parameter, with `0 <= l1_ratio <= 1`:

        - `l1_ratio = 0` is equivalent to an L2 penalty.
        - `l1_ratio = 1` is equivalent to an L1 penalty.
        - `0 < l1_ratio < 1` is the combination of L1 and L2.
    fit_intercept : bool, default=True
        Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations
        (i.e. data is expected to be centered).
    copy_X : bool, default=True
        If True, `X` will be copied; else, it may be overwritten.
    positive : bool, default=False
        When set to True, forces the coefficients to be positive.
    method : Literal["SLSQP", "TNC", "L-BFGS-B"], default="SLSQP"
        Type of solver to use for optimization.
    overestimation_punishment_factor : float, default=1.0
        Factor to punish overestimations more (if the value is larger than 1) or less (if the value is between 0 and 1).

    Attributes
    ----------
    coef_ : np.ndarray of shape (n_features,)
        Estimated coefficients of the model.
    intercept_ : float
        Independent term in the linear model. Set to 0.0 if `fit_intercept = False`.
    n_features_in_ : int
        Number of features seen during `fit`.

    Examples
    --------
    ```py
    import numpy as np
    from sklego.linear_model import ImbalancedLinearRegression

    np.random.seed(0)
    X = np.random.randn(100, 4)
    y = X @ np.array([1, 2, 3, 4]) + 2*np.random.randn(100)

    over_bad = ImbalancedLinearRegression(overestimation_punishment_factor=50).fit(X, y)
    over_bad.coef_
    # array([0.36267036, 1.39526844, 3.4247146 , 3.93679175])

    under_bad = ImbalancedLinearRegression(overestimation_punishment_factor=0.01).fit(X, y)
    under_bad.coef_
    # array([0.73519586, 1.28698197, 2.61362614, 4.35989806])
    ```
    """

    def __init__(
        self,
        alpha=0.0,
        l1_ratio=0.0,
        fit_intercept=True,
        copy_X=True,
        positive=False,
        method="SLSQP",
        overestimation_punishment_factor=1.0,
    ):
        super().__init__(alpha, l1_ratio, fit_intercept, copy_X, positive, method)
        self.overestimation_punishment_factor = overestimation_punishment_factor

    def _get_objective(self, X, y, sample_weight):
        def imbalanced_loss(params):
            return 0.5 * np.average(
                np.where(X @ params > y, self.overestimation_punishment_factor, 1) * np.square(y - X @ params),
                weights=sample_weight,
            ) + self._regularized_loss(params)

        def grad_imbalanced_loss(params):
            return (
                -(sample_weight * np.where(X @ params > y, self.overestimation_punishment_factor, 1) * (y - X @ params))
                @ X
                / sample_weight.sum()
            ) + self._regularized_grad_loss(params)

        return imbalanced_loss, grad_imbalanced_loss

`sklego.linear_model.QuantileRegression` ¶

Bases: BaseScipyMinimizeRegressor

Compute quantile regression. This can be used for computing confidence intervals of linear regressions.

QuantileRegression fits a linear model to minimize a weighted residual sum of absolute deviations between the observed targets in the dataset and the targets predicted by the linear approximation, i.e.

\[\frac{\text{switch} \cdot ||y - Xw||_1}{2 N} + \alpha \cdot l_1 \cdot ||w||_1 + \frac{\alpha}{2} \cdot (1 - l_1) \cdot ||w||^2_2\]

where

\[\text{switch} = \begin{cases} \text{quantile} & \text{if } y - Xw < 0 \\ 1-\text{quantile} & \text{otherwise} \end{cases}\]

The regressor defaults to LADRegression for its default value of quantile=0.5.

Compared to linear regression, this approach is robust to outliers.

Info

This implementation uses scipy.optimize.minimize.

Warning

If, while fitting the model, sample_weight contains any zero values, some solvers may not converge properly. We would expect that a sample weight of zero is equivalent to removing the sample, however unittests tell us that this is always the case only for method='SLSQP' (our default)

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Constant that multiplies the penalty terms.	`0.0`
`l1_ratio`	`float`	The ElasticNet mixing parameter, with `0 <= l1_ratio <= 1`: `l1_ratio = 0` is equivalent to an L2 penalty. `l1_ratio = 1` is equivalent to an L1 penalty. `0 < l1_ratio < 1` is the combination of L1 and L2.	`0.0`
`fit_intercept`	`bool`	Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).	`True`
`copy_X`	`bool`	If True, `X` will be copied; else, it may be overwritten.	`True`
`positive`	`bool`	When set to True, forces the coefficients to be positive.	`False`
`method`	`Literal[SLSQP, TNC, L - BFGS - B]`	Type of solver to use for optimization.	`"SLSQP"`
`quantile`	`float`	The line output by the model will have a share of approximately `quantile` data points under it. It should be a value between 0 and 1. A value of `quantile=1` outputs a line that is above each data point, for example. `quantile=0.5` corresponds to LADRegression.	`0.5`

Attributes:

Name	Type	Description
`coef_`	`np.ndarray of shape (n_features,)`	Estimated coefficients of the model.
`intercept_`	`float`	Independent term in the linear model. Set to 0.0 if `fit_intercept = False`.
`n_features_in_`	`int`	Number of features seen during `fit`.

Examples:

import numpy as np
from sklego.linear_model import QuantileRegression

np.random.seed(0)
X = np.random.randn(100, 4)
y = X @ np.array([1, 2, 3, 4])

model = QuantileRegression().fit(X, y)
model.coef_
# array([1., 2., 3., 4.])

y = X @ np.array([-1, 2, -3, 4])
model = QuantileRegression(quantile=0.8).fit(X, y)
model.coef_
# array([-1.,  2., -3.,  4.])

Source code in sklego/linear_model.py

class QuantileRegression(BaseScipyMinimizeRegressor):
    r"""Compute quantile regression. This can be used for computing confidence intervals of linear regressions.

    `QuantileRegression` fits a linear model to minimize a weighted residual sum of absolute deviations between
    the observed targets in the dataset and the targets predicted by the linear approximation, i.e.

    $$\frac{\text{switch} \cdot ||y - Xw||_1}{2 N} + \alpha \cdot l_1 \cdot ||w||_1
        + \frac{\alpha}{2} \cdot (1 - l_1) \cdot ||w||^2_2$$

    where

    $$\text{switch} = \begin{cases}
    \text{quantile} & \text{if } y - Xw < 0 \\
    1-\text{quantile} & \text{otherwise}
    \end{cases}$$

    The regressor defaults to `LADRegression` for its default value of `quantile=0.5`.

    Compared to linear regression, this approach is robust to outliers.

    !!! info
        This implementation uses
        [scipy.optimize.minimize](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html).

    !!! warning
        If, while fitting the model, `sample_weight` contains any zero values, some solvers may not converge properly.
        We would expect that a sample weight of zero is equivalent to removing the sample, however unittests tell us
        that this is always the case only for `method='SLSQP'` (our default)

    Parameters
    ----------
    alpha : float, default=0.0
        Constant that multiplies the penalty terms.
    l1_ratio : float, default=0.0
        The ElasticNet mixing parameter, with `0 <= l1_ratio <= 1`:

        - `l1_ratio = 0` is equivalent to an L2 penalty.
        - `l1_ratio = 1` is equivalent to an L1 penalty.
        - `0 < l1_ratio < 1` is the combination of L1 and L2.
    fit_intercept : bool, default=True
        Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations
        (i.e. data is expected to be centered).
    copy_X : bool, default=True
        If True, `X` will be copied; else, it may be overwritten.
    positive : bool, default=False
        When set to True, forces the coefficients to be positive.
    method : Literal["SLSQP", "TNC", "L-BFGS-B"], default="SLSQP"
        Type of solver to use for optimization.
    quantile : float, default=0.5
        The line output by the model will have a share of approximately `quantile` data points under it. It  should
        be a value between 0 and 1.

        A value of `quantile=1` outputs a line that is above each data point, for example.
        `quantile=0.5` corresponds to LADRegression.

    Attributes
    ----------
    coef_ : np.ndarray of shape (n_features,)
        Estimated coefficients of the model.
    intercept_ : float
        Independent term in the linear model. Set to 0.0 if `fit_intercept = False`.
    n_features_in_ : int
        Number of features seen during `fit`.

    Examples
    --------
    ```py
    import numpy as np
    from sklego.linear_model import QuantileRegression

    np.random.seed(0)
    X = np.random.randn(100, 4)
    y = X @ np.array([1, 2, 3, 4])

    model = QuantileRegression().fit(X, y)
    model.coef_
    # array([1., 2., 3., 4.])

    y = X @ np.array([-1, 2, -3, 4])
    model = QuantileRegression(quantile=0.8).fit(X, y)
    model.coef_
    # array([-1.,  2., -3.,  4.])
    ```
    """

    def __init__(
        self,
        alpha=0.0,
        l1_ratio=0.0,
        fit_intercept=True,
        copy_X=True,
        positive=False,
        method="SLSQP",
        quantile=0.5,
    ):
        super().__init__(alpha, l1_ratio, fit_intercept, copy_X, positive, method)
        self.quantile = quantile

    def _get_objective(self, X, y, sample_weight):
        def quantile_loss(params):
            return np.average(
                np.where(X @ params < y, self.quantile, 1 - self.quantile) * np.abs(y - X @ params),
                weights=sample_weight,
            ) + self._regularized_loss(params)

        def grad_quantile_loss(params):
            return (
                -(sample_weight * np.where(X @ params < y, self.quantile, 1 - self.quantile) * np.sign(y - X @ params))
                @ X
                / sample_weight.sum()
            ) + self._regularized_grad_loss(params)

        return quantile_loss, grad_quantile_loss

    def fit(self, X, y, sample_weight=None):
        """Fit the estimator on training data `X` and `y` by minimizing the quantile loss function.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            The training data.
        y : array-like of shape (n_samples,)
            The target values.
        sample_weight : array-like of shape (n_samples,) | None, default=None
            Individual weights for each sample.

        Returns
        -------
        self : QuantileRegression
            The fitted estimator.

        Raises
        ------
        ValueError
            If `quantile` is not between 0 and 1.
        """
        if 0 <= self.quantile <= 1:
            super().fit(X, y, sample_weight)
        else:
            raise ValueError("Parameter `quantile` should be between zero and one.")

        return self

`fit(X, y, sample_weight=None)` ¶

Fit the estimator on training data X and y by minimizing the quantile loss function.

Parameters:

Name	Type	Description	Default
`X`	`array-like of shape (n_samples, n_features)`	The training data.	required
`y`	`array-like of shape (n_samples,)`	The target values.	required
`sample_weight`	`array-like of shape (n_samples,) \| None`	Individual weights for each sample.	`None`

Returns:

Name	Type	Description
`self`	`QuantileRegression`	The fitted estimator.

Raises:

Type	Description
`ValueError`	If `quantile` is not between 0 and 1.

Source code in sklego/linear_model.py

def fit(self, X, y, sample_weight=None):
    """Fit the estimator on training data `X` and `y` by minimizing the quantile loss function.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features)
        The training data.
    y : array-like of shape (n_samples,)
        The target values.
    sample_weight : array-like of shape (n_samples,) | None, default=None
        Individual weights for each sample.

    Returns
    -------
    self : QuantileRegression
        The fitted estimator.

    Raises
    ------
    ValueError
        If `quantile` is not between 0 and 1.
    """
    if 0 <= self.quantile <= 1:
        super().fit(X, y, sample_weight)
    else:
        raise ValueError("Parameter `quantile` should be between zero and one.")

    return self

`sklego.linear_model.LADRegression` ¶

Bases: QuantileRegression

Least absolute deviation Regression.

LADRegression fits a linear model to minimize the residual sum of absolute deviations between the observed targets in the dataset, and the targets predicted by the linear approximation, i.e.

\[\frac{1}{N}\|y - Xw \|_1 + \alpha \cdot l_1 \cdot\|w\|_1 + \frac{\alpha}{2} \cdot (1-l_1)\cdot \|w\|^2_2\]

Compared to linear regression, this approach is robust to outliers. You can even optimize for the lowest MAPE (Mean Average Percentage Error), by providing sample_weight=np.abs(1/y_train) when fitting the regressor.

Info

This implementation uses scipy.optimize.minimize.

Warning

If, while fitting the model, sample_weight contains any zero values, some solvers may not converge properly. We would expect that a sample weight of zero is equivalent to removing the sample, however unittests tell us that this is always the case only for method='SLSQP' (our default)

Parameters:

Name	Type	Description	Default
`alpha`	`float`	Constant that multiplies the penalty terms.	`0.0`
`l1_ratio`	`float`	The ElasticNet mixing parameter, with `0 <= l1_ratio <= 1`: `l1_ratio = 0` is equivalent to an L2 penalty. `l1_ratio = 1` is equivalent to an L1 penalty. `0 < l1_ratio < 1` is the combination of L1 and L2.	`0.0`
`fit_intercept`	`bool`	Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered).	`True`
`copy_X`	`bool`	If True, `X` will be copied; else, it may be overwritten.	`True`
`positive`	`bool`	When set to True, forces the coefficients to be positive.	`False`
`method`	`Literal[SLSQP, TNC, L - BFGS - B]`	Type of solver to use for optimization.	`"SLSQP"`
`quantile`	`float`	The line output by the model will have a share of approximately `quantile` data points under it. It should be a value between 0 and 1. A value of `quantile=1` outputs a line that is above each data point, for example. `quantile=0.5` corresponds to LADRegression.	`0.5`

Attributes:

Name	Type	Description
`coef_`	`np.ndarray of shape (n_features,)`	Estimated coefficients of the model.
`intercept_`	`float`	Independent term in the linear model. Set to 0.0 if `fit_intercept = False`.
`n_features_in_`	`int`	Number of features seen during `fit`.

Examples:

import numpy as np
from sklego.linear_model import LADRegression

np.random.seed(0)
X = np.random.randn(100, 4)
y = X @ np.array([1, 2, 3, 4])

model = LADRegression().fit(X, y)
model.coef_
# array([1., 2., 3., 4.])

y = X @ np.array([-1, 2, -3, 4])
model = LADRegression(positive=True).fit(X, y)
model.coef_
# array([7.39575926e-18, 1.42423304e+00, 2.80467827e-17, 4.29789588e+00])

Source code in sklego/linear_model.py

class LADRegression(QuantileRegression):
    r"""Least absolute deviation Regression.

    `LADRegression` fits a linear model to minimize the residual sum of absolute deviations between the observed targets
    in the dataset, and the targets predicted by the linear approximation, i.e.

    $$\frac{1}{N}\|y - Xw \|_1 + \alpha \cdot l_1 \cdot\|w\|_1 + \frac{\alpha}{2} \cdot (1-l_1)\cdot \|w\|^2_2$$

    Compared to linear regression, this approach is robust to outliers. You can even optimize for the lowest MAPE
    (Mean Average Percentage Error), by providing `sample_weight=np.abs(1/y_train)` when fitting the regressor.

    !!! info
        This implementation uses
        [scipy.optimize.minimize](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html).

    !!! warning
        If, while fitting the model, `sample_weight` contains any zero values, some solvers may not converge properly.
        We would expect that a sample weight of zero is equivalent to removing the sample, however unittests tell us
        that this is always the case only for `method='SLSQP'` (our default)

    Parameters
    ----------
    alpha : float, default=0.0
        Constant that multiplies the penalty terms.
    l1_ratio : float, default=0.0
        The ElasticNet mixing parameter, with `0 <= l1_ratio <= 1`:

        - `l1_ratio = 0` is equivalent to an L2 penalty.
        - `l1_ratio = 1` is equivalent to an L1 penalty.
        - `0 < l1_ratio < 1` is the combination of L1 and L2.
    fit_intercept : bool, default=True
        Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations
        (i.e. data is expected to be centered).
    copy_X : bool, default=True
        If True, `X` will be copied; else, it may be overwritten.
    positive : bool, default=False
        When set to True, forces the coefficients to be positive.
    method : Literal["SLSQP", "TNC", "L-BFGS-B"], default="SLSQP"
        Type of solver to use for optimization.
    quantile : float, default=0.5
        The line output by the model will have a share of approximately `quantile` data points under it. It  should
        be a value between 0 and 1.

        A value of `quantile=1` outputs a line that is above each data point, for example.
        `quantile=0.5` corresponds to LADRegression.

    Attributes
    ----------
    coef_ : np.ndarray of shape (n_features,)
        Estimated coefficients of the model.
    intercept_ : float
        Independent term in the linear model. Set to 0.0 if `fit_intercept = False`.
    n_features_in_ : int
        Number of features seen during `fit`.

    Examples
    --------
    ```py
    import numpy as np
    from sklego.linear_model import LADRegression

    np.random.seed(0)
    X = np.random.randn(100, 4)
    y = X @ np.array([1, 2, 3, 4])

    model = LADRegression().fit(X, y)
    model.coef_
    # array([1., 2., 3., 4.])

    y = X @ np.array([-1, 2, -3, 4])
    model = LADRegression(positive=True).fit(X, y)
    model.coef_
    # array([7.39575926e-18, 1.42423304e+00, 2.80467827e-17, 4.29789588e+00])
    ```
    """

    def __init__(
        self,
        alpha=0.0,
        l1_ratio=0.0,
        fit_intercept=True,
        copy_X=True,
        positive=False,
        method="SLSQP",
    ):
        super().__init__(alpha, l1_ratio, fit_intercept, copy_X, positive, method, quantile=0.5)

Linear Models¶

sklego.linear_model.LowessRegression ¶

fit(X, y) ¶

predict(X) ¶

sklego.linear_model.ProbWeightRegression ¶

fit(X, y) ¶

predict(X) ¶

sklego.linear_model.DeadZoneRegressor ¶

fit(X, y) ¶

predict(X) ¶

sklego.linear_model.DemographicParityClassifier ¶

sklego.linear_model.EqualOpportunityClassifier ¶

sklego.linear_model.BaseScipyMinimizeRegressor ¶

fit(X, y, sample_weight=None) ¶

predict(X) ¶

sklego.linear_model.ImbalancedLinearRegression ¶

sklego.linear_model.QuantileRegression ¶

fit(X, y, sample_weight=None) ¶

sklego.linear_model.LADRegression ¶

`sklego.linear_model.LowessRegression` ¶

`fit(X, y)` ¶

`predict(X)` ¶

`sklego.linear_model.ProbWeightRegression` ¶

`fit(X, y)` ¶

`predict(X)` ¶

`sklego.linear_model.DeadZoneRegressor` ¶

`fit(X, y)` ¶

`predict(X)` ¶

`sklego.linear_model.DemographicParityClassifier` ¶

`sklego.linear_model.EqualOpportunityClassifier` ¶

`sklego.linear_model.BaseScipyMinimizeRegressor` ¶

`fit(X, y, sample_weight=None)` ¶

`predict(X)` ¶

`sklego.linear_model.ImbalancedLinearRegression` ¶

`sklego.linear_model.QuantileRegression` ¶

`fit(X, y, sample_weight=None)` ¶

`sklego.linear_model.LADRegression` ¶