Metrics
sklego.metrics.correlation_score(column)
The correlation score measures how well the estimator's predictions correlate with a given column.
This is especially useful in situations where "fairness" is a theme.
`correlation_score` takes a column on which to calculate the correlation and returns a metric function.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`column` | `str \| int` | Name of the column (when X is a dataframe) or the index of the column (when X is a numpy array) to score against. | required |
Returns:

Type | Description |
---|---|
`Callable[..., float]` | A function which calculates the negative correlation between the estimator's predictions and the given column. |
Examples:
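A minimal sketch (not taken from the library's docs): it assumes hypothetical toy data with a column named "sensitive" and that the returned metric follows scikit-learn's `(estimator, X, y)` scorer signature.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklego.metrics import correlation_score

# Hypothetical toy data: "sensitive" is the column the predictions
# should not correlate with.
rng = np.random.default_rng(0)
X = pd.DataFrame({"sensitive": rng.integers(0, 2, size=100), "x2": rng.normal(size=100)})
y = rng.integers(0, 2, size=100)

clf = LogisticRegression().fit(X, y)

# The returned metric is a callable with the scorer signature (estimator, X, y).
scorer = correlation_score("sensitive")
print(scorer(clf, X, y))
```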
sklego.metrics.equal_opportunity_score(sensitive_column, positive_target=1)
The equal opportunity score calculates the ratio between the probability of a true positive outcome given the sensitive attribute (column) being true and the same probability given the sensitive attribute being false.
This is especially useful in situations where "fairness" is a theme.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`sensitive_column` | `str \| int` | Name of the column containing the binary sensitive attribute (when X is a dataframe) or the index of the column (when X is a numpy array). | required |
`positive_target` | `int` | The name of the class which is associated with a positive outcome. | `1` |
Returns:

Type | Description |
---|---|
`Callable[..., float]` | A function which calculates the equal opportunity score for z = sensitive_column. |
Examples:
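A minimal sketch, under the same assumptions as above (hypothetical toy data with a binary sensitive column "x1", and the `(estimator, X, y)` scorer signature for the returned metric):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklego.metrics import equal_opportunity_score

# Hypothetical toy data with a binary sensitive attribute "x1".
rng = np.random.default_rng(1)
X = pd.DataFrame({"x1": rng.integers(0, 2, size=200), "x2": rng.normal(size=200)})
y = rng.integers(0, 2, size=200)

clf = LogisticRegression().fit(X, y)

# Ratio of true positive probabilities for x1 == 1 vs x1 == 0; values near 1
# indicate the two groups are treated similarly.
scorer = equal_opportunity_score(sensitive_column="x1", positive_target=1)
print(scorer(clf, X, y))
```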
Source: M. Hardt, E. Price and N. Srebro (2016), Equality of Opportunity in Supervised Learning
sklego.metrics.p_percent_score(sensitive_column, positive_target=1)
The p_percent score calculates the ratio between the probability of a positive outcome given the sensitive attribute (column) being true and the same probability given the sensitive attribute being false.
This is especially useful in situations where "fairness" is a theme.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`sensitive_column` | `str \| int` | Name of the column containing the binary sensitive attribute (when X is a dataframe) or the index of the column (when X is a numpy array). | required |
`positive_target` | `int` | The name of the class which is associated with a positive outcome. | `1` |
Returns:

Type | Description |
---|---|
`Callable[..., float]` | A function which calculates the p percent score for z = sensitive_column. |
Examples:
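A minimal sketch: because the returned metric is a plain callable with the scorer signature, it can also be plugged in as a `scoring` argument to scikit-learn utilities (the toy data and the column name "x1" are hypothetical):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklego.metrics import p_percent_score

# Hypothetical toy data with a binary sensitive attribute "x1".
rng = np.random.default_rng(2)
X = pd.DataFrame({"x1": rng.integers(0, 2, size=200), "x2": rng.normal(size=200)})
y = rng.integers(0, 2, size=200)

# Direct use on a fitted estimator...
clf = LogisticRegression().fit(X, y)
print(p_percent_score("x1")(clf, X, y))

# ...or as the scoring argument of a scikit-learn utility.
print(cross_val_score(LogisticRegression(), X, y, scoring=p_percent_score("x1"), cv=5))
```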
Source: M. Zafar et al. (2017), Fairness Constraints: Mechanisms for Fair Classification
sklego.metrics.subset_score(subset_picker, score, **kwargs)
Return a method that applies the passed score only to a specific subset.

The subset picker is a method that is passed the corresponding `X` and `y_true` and returns a one-dimensional boolean vector where every element corresponds to a row in the data. Only the elements with a `True` value are taken into account for the passed score, so the vector acts as a filter.

This gives users an easy way to measure metrics over different slices of the population, which can provide insight into model performance, either specifically for fairness or in general.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
`subset_picker` | `Callable` | Method that returns a boolean mask used for slicing the samples. | required |
`score` | `Callable[..., T]` | The score that needs to be applied to the subset. | required |
`kwargs` | `dict` | Additional keyword arguments to pass to `score`. | `{}` |
Returns:

Type | Description |
---|---|
`Callable[..., T]` | A function which calculates the passed score for the subset. |
Examples:
```python
from sklearn.metrics import accuracy_score
from sklego.metrics import subset_score

...  # fit a classifier `clf` on features `X` and labels `y`

# Accuracy measured only on the rows where X['column'] == 'A'
subset_score(lambda X, y_true: X['column'] == 'A', accuracy_score)(clf, X, y)
```
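A fuller self-contained sketch, with hypothetical toy data and a pipeline so the string-valued group column can pass through the estimator; it assumes `subset_score` accepts a pandas boolean mask, as the example above suggests:

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklego.metrics import subset_score

# Hypothetical toy data: "column" marks the group each row belongs to.
rng = np.random.default_rng(3)
X = pd.DataFrame({"column": rng.choice(["A", "B"], size=200), "x2": rng.normal(size=200)})
y = rng.integers(0, 2, size=200)

# One-hot encode the string column so the estimator can consume X directly.
clf = make_pipeline(
    make_column_transformer((OneHotEncoder(), ["column"]), remainder="passthrough"),
    LogisticRegression(),
).fit(X, y)

# Accuracy over all rows vs. only the rows in group "A".
accuracy_on_A = subset_score(lambda X, y_true: X["column"] == "A", accuracy_score)
print(accuracy_score(y, clf.predict(X)), accuracy_on_A(clf, X, y))
```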