whatlies.transformers.Tsne

This transformer transforms all vectors in an EmbeddingSet by means of t-SNE. The implementation uses scikit-learn.

Important

TSNE cannot learn a transformation once and re-use it on new data. It must be refit every time it sees data. It is also relatively slow. Both limitations come with the algorithm and cannot be worked around here.
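The limitation described above is visible in scikit-learn's estimator itself. As a minimal sketch, assuming scikit-learn and NumPy are installed and using random data in place of real embeddings, the following contrasts PCA, which exposes a reusable transform method, with TSNE, which only offers fit_transform:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Random stand-in for 30 embeddings of dimension 10 (not real word vectors).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))

# PCA can be fit once and then applied to unseen rows via transform().
pca = PCA(n_components=2).fit(X)
pca_has_transform = hasattr(pca, "transform")

# TSNE has no standalone transform(); each fit_transform() call refits.
tsne = TSNE(n_components=2, perplexity=5, random_state=0)
tsne_has_transform = hasattr(tsne, "transform")
X_2d = tsne.fit_transform(X)

print(pca_has_transform, tsne_has_transform, X_2d.shape)
```

Because there is no fitted projection to re-use, adding even one new word means recomputing the layout for the whole set.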

Parameters

Name Type Description Default
n_components the number of components to create/add 2
**kwargs keyword arguments passed to the scikit-learn TSNE implementation, such as perplexity {}
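As a rough illustration of how these keyword arguments reach scikit-learn, here is a sketch built around a hypothetical stand-in class, TsneSketch. It is not the whatlies implementation; it only assumes that every argument besides n_components is forwarded unchanged to sklearn.manifold.TSNE:

```python
from sklearn.manifold import TSNE


class TsneSketch:
    """Hypothetical stand-in showing kwargs forwarding; not the whatlies class."""

    def __init__(self, n_components=2, **kwargs):
        self.n_components = n_components
        # Everything besides n_components goes straight to scikit-learn.
        self.tfm = TSNE(n_components=n_components, **kwargs)


sketch = TsneSketch(3, perplexity=5, random_state=42)
params = sketch.tfm.get_params()
print(params["n_components"], params["perplexity"], params["random_state"])
```

With whatlies itself the equivalent call would presumably be Tsne(3, perplexity=5). Setting perplexity explicitly matters for small word lists: scikit-learn's default is 30, and recent scikit-learn releases are expected to reject a perplexity that is not smaller than the number of samples.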

Usage:

from whatlies.language import SpacyLanguage
from whatlies.transformers import Tsne

words = ["prince", "princess", "nurse", "doctor", "banker", "man", "woman",
         "cousin", "niece", "king", "queen", "dude", "guy", "gal", "fire",
         "dog", "cat", "mouse", "red", "blue", "green", "yellow", "water",
         "person", "family", "brother", "sister"]

lang = SpacyLanguage("en_core_web_md")
emb = lang[words]

emb.transform(Tsne(3)).plot_interactive_matrix(0, 1, 2)

transform(self, embset)

Source code in transformers/_tsne.py
    def transform(self, embset):
        names, X = embset.to_names_X()
        # We are re-writing the transform method here because TSNE cannot .fit().transform().
        # Check the docs here: https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html#sklearn.manifold.TSNE
        new_vecs = self.tfm.fit_transform(X)
        new_dict = new_embedding_dict(names, new_vecs, embset)
        return EmbeddingSet(new_dict, name=f"{embset.name}.tsne({self.n_components})")

Transform the given EmbeddingSet instance.

Parameters

Name Type Description Default
embset an EmbeddingSet instance to be transformed. required
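
To make the flow of transform concrete, here is a sketch that mirrors its three steps (split names from vectors, refit t-SNE, rebuild the name-to-vector mapping) using a plain dict in place of an EmbeddingSet. The function name tsne_transform_sketch and the dict representation are assumptions for illustration, not part of the whatlies API:

```python
import numpy as np
from sklearn.manifold import TSNE


def tsne_transform_sketch(vectors, n_components=2, **kwargs):
    """Refit t-SNE on a {name: vector} dict and return a new {name: vector} dict."""
    # Step 1: separate the names from the matrix of vectors.
    names = list(vectors)
    X = np.array([vectors[name] for name in names])
    # Step 2: fit and transform in one go; nothing fitted is kept around.
    new_vecs = TSNE(n_components=n_components, **kwargs).fit_transform(X)
    # Step 3: pair each name with its reduced vector again.
    return dict(zip(names, new_vecs))


# Random stand-in data: 20 fake "words" with 8-dimensional vectors.
rng = np.random.default_rng(0)
fake = {f"word{i}": rng.normal(size=8) for i in range(20)}
reduced = tsne_transform_sketch(fake, perplexity=5, random_state=0)
```

Each call builds a fresh TSNE estimator, which matches the behaviour described in the source comment: the reduced coordinates are only meaningful relative to the other items passed in the same call.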