whatlies.language.TFHubLanguage

This class provides the ability to load and use text-embedding models from TensorFlow Hub to retrieve Embeddings or EmbeddingSets from them. A list of supported models is available here; note, however, that only models which operate directly on raw text (i.e. which don't require pre-processing such as tokenization) are currently supported, so models such as BERT or ALBERT are not. TF-Hub compatible models from other repositories (i.e. other than tfhub.dev) are also supported.

Important

This object will automatically download a large file if it is not cached yet.

This language model does not contain a vocabulary, so it cannot be used to retrieve similar tokens. Use an EmbeddingSet instead.
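Because there is no built-in vocabulary, a similarity lookup has to run over a candidate list that you supply yourself. The following is a minimal sketch of that idea in plain Python, not the library's implementation: `rank_similar`, `cosine_similarity` and `toy_embed` are hypothetical names, and `toy_embed` (letter counts) only stands in for a real model so the example runs without downloading anything. With a real backend, `embed` would be a function that returns the vector for a text, for example by reading the vector off `lang[text]` (an assumption about how you wire it up).

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(
    query: str,
    candidates: Sequence[str],
    embed: Callable[[str], Sequence[float]],
    n: int = 3,
) -> List[Tuple[str, float]]:
    """Return the n candidates most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(text, cosine_similarity(query_vec, embed(text))) for text in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real model: letter counts, for illustration only."""
    return [float(text.count(ch)) for ch in "abcdefghijklmnopqrstuvwxyz"]


# The candidate list plays the role of the missing vocabulary.
candidates = ["withdraw some money", "take out cash", "cash out funds", "today is a gift"]
ranking = rank_similar("take out cash", candidates, toy_embed, n=2)
```

The key design point is that the search space is explicit: only the texts you embed can ever be returned as neighbours.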

This language backend might require you to manually install extra dependencies unless you installed via either:

pip install whatlies[tfhub]
pip install whatlies[all]

Further, note that this language model mainly supports TensorFlow 2.x models (i.e. the TF2 SavedModel format), although TensorFlow 1.x models might be supported to some extent as well (see the hub.load documentation and the model compatibility guide).

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| url | str | The url or local directory path of the model. | required |
| tags | Optional[List[str]] | A set of strings specifying the graph variant to use, if loading from a TF1 module. It is passed to hub.load function. | None |
| signature | Optional[str] | An optional signature of the model to use. | None |

Usage:

> from whatlies.language import TFHubLanguage
> lang = TFHubLanguage("https://tfhub.dev/google/nnlm-en-dim50/2")
> # A single string returns an Embedding
> lang['today is a gift']
> # A list of strings returns an EmbeddingSet
> lang[['withdraw some money', 'take out cash', 'cash out funds']]

__getitem__(self, query)

Show source code in language/_tfhub_lang.py
    def __getitem__(
        self, query: Union[str, List[str]]
    ) -> Union[Embedding, EmbeddingSet]:
        """
        Retrieve a single embedding or a set of embeddings.

        Arguments:
            query: single string or list of strings

        **Usage**

        ```python
        > from whatlies.language import TFHubLanguage
        > lang = TFHubLanguage("https://tfhub.dev/google/nnlm-en-dim50/2")
        > # A single string returns an Embedding
        > lang['today is a gift']
        > # A list of strings returns an EmbeddingSet
        > lang[['withdraw some money', 'take out cash', 'cash out funds']]
        ```
        """
        if isinstance(query, str):
            return self._get_embedding(query)
        return EmbeddingSet(*[self._get_embedding(q) for q in query])

Retrieve a single embedding or a set of embeddings.

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| query | Union[str, List[str]] | single string or list of strings | required |

Usage

> from whatlies.language import TFHubLanguage
> lang = TFHubLanguage("https://tfhub.dev/google/nnlm-en-dim50/2")
> # A single string returns an Embedding
> lang['today is a gift']
> # A list of strings returns an EmbeddingSet
> lang[['withdraw some money', 'take out cash', 'cash out funds']]
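The return type depends only on whether the query is a `str`. The sketch below mirrors that dispatch with stand-in classes so it runs without TensorFlow; `ToyEmbedding`, `ToyEmbeddingSet` and `ToyLanguage` are hypothetical names for illustration, and the two-number "vector" is made up rather than produced by any model.

```python
from typing import Dict, List, Union


class ToyEmbedding:
    """Hypothetical stand-in for a single embedding: a name plus a vector."""

    def __init__(self, name: str, vector: List[float]) -> None:
        self.name = name
        self.vector = vector


class ToyEmbeddingSet:
    """Hypothetical stand-in for a collection of embeddings keyed by name."""

    def __init__(self, *embeddings: ToyEmbedding) -> None:
        self.embeddings: Dict[str, ToyEmbedding] = {e.name: e for e in embeddings}

    def __len__(self) -> int:
        return len(self.embeddings)


class ToyLanguage:
    """Mirrors the str-versus-list dispatch shown in the source above."""

    def _get_embedding(self, text: str) -> ToyEmbedding:
        # Made-up features (length and number of spaces), not a real model.
        return ToyEmbedding(text, [float(len(text)), float(text.count(" "))])

    def __getitem__(
        self, query: Union[str, List[str]]
    ) -> Union[ToyEmbedding, ToyEmbeddingSet]:
        if isinstance(query, str):
            return self._get_embedding(query)
        return ToyEmbeddingSet(*[self._get_embedding(q) for q in query])


lang = ToyLanguage()
single = lang["today is a gift"]        # str in, one embedding out
many = lang[["take out cash", "cash out funds"]]  # list in, a set out
wrapped = lang[["today is a gift"]]     # a one-item list still yields a set
```

One consequence worth noting: wrapping a single string in a list changes the return type, so code that expects a set should always pass a list, even for one text.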