TimmEncoder
Use a pretrained vision model to generate embeddings. The models
are provided via the lovely timm (PyTorch Image Models)
library.
You can find a list of available models here.
Parameters
Name | Type | Description | Default |
---|---|---|---|
name | | name of the model to use | 'mobilenetv3_large_100' |
encode_predictions | | output the model's predictions instead of the pooled embedding from the layer before them | False |
Usage:
import pandas as pd
from sklearn.pipeline import make_pipeline
from embetter.grab import ColumnGrabber
from embetter.vision import ImageLoader, TimmEncoder
# Let's say we start with a csv file with filepaths
data = {"filepaths": ["tests/data/thiscatdoesnotexist.jpeg"]}
df = pd.DataFrame(data)
# Let's build a pipeline that grabs the column, turns it
# into an image and embeds it.
pipe = make_pipeline(
    ColumnGrabber("filepaths"),
    ImageLoader(),
    TimmEncoder(name="mobilenetv3_large_100")
)
# This pipeline can now encode each image in the dataframe
pipe.fit_transform(df)
transform(self, X, y=None)
Transforms grabbed images into numeric representations.