whatlies.language.ConveRTLanguage

This object is used to fetch Embeddings or EmbeddingSets from a ConveRT model. It is meant for retrieval, not plotting.

Important

This object will automatically download a large file if it is not cached yet.

This language model does not contain a vocabulary, so it cannot be used to retrieve similar tokens. Use an EmbeddingSet instead.
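Because there is no vocabulary to search, any similarity lookup has to run over candidates you embedded yourself. The following is a minimal sketch of that idea in plain Python, not the library's implementation: the helper names `cosine` and `rank_similar` are hypothetical, and the three-dimensional vectors are made-up placeholders rather than real ConveRT output.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_similar(query_vec, candidates, n=2):
    """Return the names of the n candidates closest to query_vec.

    candidates maps a name to its vector (hypothetical stand-in for
    embeddings fetched one by one from a language backend).
    """
    scored = sorted(
        candidates.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:n]]


# Made-up vectors, for illustration only.
candidates = {
    "river bank": [0.9, 0.1, 0.0],
    "money in the bank": [0.1, 0.9, 0.1],
    "bank": [0.5, 0.5, 0.0],
}
query = [0.8, 0.2, 0.0]
top = rank_similar(query, candidates, n=2)
print(top)
```

In practice the candidate list would come from embedding a list of strings with the language object, and the ranking would be delegated to the resulting EmbeddingSet.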

This language backend might require you to manually install extra dependencies unless you installed whatlies via either of the following:

pip install whatlies[tfhub]
pip install whatlies[all]

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model_id | str | Identifier used for loading the corresponding TFHub module; currently only 'convert' is allowed. | 'convert' |

Usage:

> from whatlies.language import ConveRTLanguage
> lang = ConveRTLanguage()
> lang['bank']

__getitem__(self, query)

Source code in language/_convert_lang.py
    def __getitem__(
        self, query: Union[str, List[str]]
    ) -> Union[Embedding, EmbeddingSet]:
        """
        Retrieve a single embedding or a set of embeddings.

        Arguments:
            query: single string or list of strings

        **Usage**

        ```python
        > from whatlies.language import ConveRTLanguage
        > lang = ConveRTLanguage()
        > lang['bank']
        > lang = ConveRTLanguage()
        > lang[['bank of the river', 'money on the bank', 'bank']]
        ```
        """
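        if isinstance(query, str):
            # Wrap the single string in a batch of size one.
            query_tensor = tf.convert_to_tensor([query])
            encoding = self.model(query_tensor)
            if self.signature == "encode_sequence":
                # Collapse the sequence encoding into one vector by summing over axis 1.
                vec = encoding["sequence_encoding"].numpy().sum(axis=1)[0]
            else:
                vec = encoding["default"].numpy()[0]
            return Embedding(query, vec)
        # A list of strings: embed each item and collect the results.
        return EmbeddingSet(*[self[tok] for tok in query])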

Retrieve a single embedding or a set of embeddings.

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| query | Union[str, List[str]] | single string or list of strings | required |

Usage

> from whatlies.language import ConveRTLanguage
> lang = ConveRTLanguage()
> lang['bank']
> lang = ConveRTLanguage()
> lang[['bank of the river', 'money on the bank', 'bank']]
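
In the source of `__getitem__`, the `encode_sequence` branch reduces the model output with `.sum(axis=1)[0]`. The sketch below mimics that reduction on a nested list, assuming the output is shaped (batch, tokens, dimensions); that shape is inferred from the summed axis, not confirmed by the source. The helper name `pool_sequence` and the toy values are hypothetical.

```python
def pool_sequence(sequence_encoding):
    """Mimic `.sum(axis=1)[0]` for a nested list shaped (batch, tokens, dim)."""
    # Take the first (and here only) item of the batch: a list of token vectors.
    first_item = sequence_encoding[0]
    # Sum the token vectors dimension by dimension.
    return [sum(column) for column in zip(*first_item)]


# One query, three tokens, two dimensions; the values are made up.
toy_encoding = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]
pooled = pool_sequence(toy_encoding)
print(pooled)  # [9.0, 12.0]
```

Summing rather than averaging means the magnitude of the resulting vector grows with the number of tokens, which is worth keeping in mind when comparing queries of very different lengths with a metric that is sensitive to vector length.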