CohereEncoder
embetter.external.CohereEncoder
Encoder that can numerically encode sentences.
Note that this is an external embedding provider. If their API breaks, so will this component.
Parameters
Name | Type | Description | Default |
---|---|---|---|
client | | cohere client with key | required |
model | | name of model, can be "small" or "large" | 'large' |
Usage:
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from cohere import Client

from embetter.grab import ColumnGrabber
from embetter.external import CohereEncoder

client = Client("APIKEY")

# Let's suppose this is the input dataframe
dataf = pd.DataFrame({
    "text": ["positive sentiment", "super negative"],
    "label_col": ["pos", "neg"]
})

# This pipeline grabs the `text` column from a dataframe
# which then gets fed into Cohere's endpoint
text_emb_pipeline = make_pipeline(
    ColumnGrabber("text"),
    CohereEncoder(client=client, model="large")
)
X = text_emb_pipeline.fit_transform(dataf, dataf['label_col'])

# This pipeline can also be trained to make predictions, using
# the embedded features.
text_clf_pipeline = make_pipeline(
    text_emb_pipeline,
    LogisticRegression()
)

# Prediction example
text_clf_pipeline.fit(dataf, dataf['label_col']).predict(dataf)
transform(self, X, y=None)
Transforms the text into a numeric representation.
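To make the idea concrete, here is a minimal offline sketch of what a transform step backed by an external embedding provider might look like. It is not the library's implementation. `FakeClient`, `FakeEmbedResponse`, `batched`, `transform_sketch` and the `batch_size` argument are all hypothetical names introduced for illustration, and the sketch assumes the client exposes an `embed` method whose response carries an `embeddings` list with one vector per input text.

```python
from typing import Iterable, List, Sequence


class FakeEmbedResponse:
    """Hypothetical stand-in for the response object of an embedding API."""

    def __init__(self, embeddings: List[List[float]]):
        self.embeddings = embeddings


class FakeClient:
    """Hypothetical offline client; a real client would call a remote API."""

    def embed(self, texts: Sequence[str]) -> FakeEmbedResponse:
        # Toy embedding: [number of characters, number of words] per text.
        return FakeEmbedResponse(
            [[float(len(t)), float(len(t.split()))] for t in texts]
        )


def batched(items: Sequence[str], size: int) -> Iterable[Sequence[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def transform_sketch(client, texts: Sequence[str], batch_size: int = 2) -> List[List[float]]:
    """Send texts to the client batch by batch and collect one vector per text."""
    vectors: List[List[float]] = []
    for batch in batched(list(texts), batch_size):
        # Assumption: the response exposes `.embeddings`, one vector per text.
        vectors.extend(client.embed(batch).embeddings)
    return vectors


vectors = transform_sketch(FakeClient(), ["positive sentiment", "super negative", "meh"])
print(vectors)
```

Each input text yields exactly one output row, in input order, so the result can be passed straight to a downstream estimator. Batching keeps individual requests small, which is a common choice when the embeddings come from a remote endpoint.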