from hulearn.outlier import *
FunctionOutlierDetector
This class lets you pass a function that detects the outliers you're interested in. Note that the function must return an array whose values are either -1 or 1, where -1 denotes an outlier.
Parameters
Name | Type | Description | Default |
---|---|---|---|
func | | the function that returns an array of outlier labels (-1 for outliers, 1 otherwise) | required |
**kwargs | | extra keyword arguments that will be passed to the function; these can be grid-searched | {} |
The functions that are passed need to be pickle-able. That means no lambda functions!
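As a rough illustration of the constraints above, the sketch below defines a module-level function that returns -1/1 labels and checks why a lambda would not qualify. The names `flag_large_values`, `is_picklable` and the `threshold` keyword are hypothetical, and plain lists stand in for the arrays a real function would return.

```python
import pickle


def flag_large_values(X, threshold=3.0):
    """Hypothetical outlier rule: -1 if the first column exceeds `threshold`, else 1."""
    return [-1 if row[0] > threshold else 1 for row in X]


def is_picklable(obj):
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# Toy data: three rows with a single feature each.
labels = flag_large_values([[1.0], [5.0], [2.5]], threshold=3.0)
print(labels)

# A lambda has no importable name, so pickle cannot serialize it.
print(is_picklable(lambda X: X))

# Assumed wiring, following the parameter table above (not executed here):
# from hulearn.outlier import FunctionOutlierDetector
# detector = FunctionOutlierDetector(flag_large_values, threshold=3.0)
```

Because `threshold` is passed as a keyword argument, it is the kind of value the `**kwargs` row above describes as grid-search-able.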
fit(self, X, y=None)
Fit the classifier. No-Op.
partial_fit(self, X, y=None)
Fit the classifier partially. No-Op.
predict(self, X)
Make predictions using the passed function.
InteractiveOutlierDetector
This tool allows you to take a drawn model and use it as an outlier detector. A datapoint that does not fall inside any of the drawn polygons becomes a candidate outlier.
Parameters
Name | Type | Description | Default |
---|---|---|---|
json_desc | | python dictionary that contains drawn data | required |
threshold | | the minimum number of polygons a point needs to be in to not be considered an outlier | 1 |
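To make the `threshold` rule concrete, here is a minimal sketch that counts how many polygons contain each point and labels the point -1 when the count falls below the threshold. It is an illustration under assumptions, not the library's implementation: the helper names `point_in_polygon` and `label_points`, the ray-casting test, and the -1/1 output convention are all chosen for this example.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside `polygon`, a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_points(points, polygons, threshold=1):
    """Label a point 1 if it lies in at least `threshold` polygons, otherwise -1."""
    labels = []
    for x, y in points:
        hits = sum(point_in_polygon(x, y, poly) for poly in polygons)
        labels.append(1 if hits >= threshold else -1)
    return labels


# Two overlapping squares standing in for drawn polygons.
polygons = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(1, 1), (3, 1), (3, 3), (1, 3)],
]
points = [(0.5, 0.5), (1.5, 1.5), (5.0, 5.0)]

print(label_points(points, polygons, threshold=1))
print(label_points(points, polygons, threshold=2))
```

With `threshold=1` only the point outside both squares is flagged; raising the threshold to 2 also flags the point that sits in just one square.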
Usage:
from sklego.datasets import load_penguins
from hulearn.experimental.interactive import InteractiveCharts
df = load_penguins(as_frame=True)
charts = InteractiveCharts(df, labels="species")
# Next notebook cell
charts.add_chart(x="bill_length_mm", y="bill_depth_mm")
# Next notebook cell
charts.add_chart(x="flipper_length_mm", y="body_mass_g")
# After drawing a model, export the data
json_data = charts.data()
# You can now use your drawn intuition as a model!
from hulearn.outlier import InteractiveOutlierDetector
clf = InteractiveOutlierDetector(json_data)
X, y = df.drop(columns=['species']), df['species']
# This doesn't do anything. But scikit-learn demands it.
clf.fit(X, y)
# This makes predictions, based on your drawn model.
# It can also be used in `GridSearchCV` for benchmarking!
clf.predict(X)
fit(self, X, y=None)
Fit the classifier. This is a formality required by the scikit-learn API; it does not do anything specific.
from_json(path, threshold=1)
(classmethod)
Load the classifier from json stored on disk.
Parameters
Name | Type | Description | Default |
---|---|---|---|
path | | path of the json file | required |
threshold | | the minimum number of polygons a point needs to be in to not be considered an outlier | 1 |
Usage:
from hulearn.outlier import InteractiveOutlierDetector
InteractiveOutlierDetector.from_json("path/to/file.json")
predict(self, X)
Predicts, for each data point, whether it is an outlier based on the drawn polygons.
Usage:
from hulearn.outlier import InteractiveOutlierDetector
# Assuming a variable `clf_data` that contains the drawn polygons.
clf = InteractiveOutlierDetector(clf_data)
X, y = load_data(...)
# This doesn't do anything. But scikit-learn demands it.
clf.fit(X, y)
# This makes predictions, based on your drawn model.
clf.predict(X)