USE
UniversalSentenceLanguage(variant='base', version=None)
Source code in `language/_sentence_encode_lang.py`:
def UniversalSentenceLanguage(variant: str = "base", version: Union[int, None] = None):
    """
    Retrieve a [universal sentence encoder](https://tfhub.dev/google/collections/universal-sentence-encoder/1) model from tfhub.
    You can download specific versions for specific variants. The variants that we support are listed below.

    - `"base"`: the base variant (915MB) [link](https://tfhub.dev/google/universal-sentence-encoder/4)
    - `"large"`: the large variant (523MB) [link](https://tfhub.dev/google/universal-sentence-encoder-large/5)
    - `"qa"`: the variant based on question/answer (528MB) [link](https://tfhub.dev/google/universal-sentence-encoder-qa/3)
    - `"multi"`: the multi-language variant (245MB) [link](https://tfhub.dev/google/universal-sentence-encoder-multilingual/3)
    - `"multi-large"`: the large multi-language variant (303MB) [link](https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/3)
    - `"multi-qa"`: the multi-language qa variant (310MB) [link](https://tfhub.dev/google/universal-sentence-encoder-multilingual-qa/3)

    TFHub reports that the multi-language models support Arabic, Chinese-simplified, Chinese-traditional,
    English, French, German, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Spanish, Thai, Turkish and Russian.

    Important:
        This object will automatically download a large file if it is not cached yet.

        This language model does not contain a vocabulary, so it cannot be used
        to retrieve similar tokens. Use an `EmbeddingSet` instead.

        This language backend might require you to manually install extra dependencies
        unless you installed via either:

        ```
        pip install whatlies[tfhub]
        pip install whatlies[all]
        ```

    Arguments:
        variant: select a specific variant
        version: select a specific version; if kept `None` we'll assume the most recent version
    """
    urls = {
        "base": "https://tfhub.dev/google/universal-sentence-encoder/",
        "large": "https://tfhub.dev/google/universal-sentence-encoder-large/",
        "qa": "https://tfhub.dev/google/universal-sentence-encoder-qa/",
        "multi": "https://tfhub.dev/google/universal-sentence-encoder-multilingual/",
        "multi-large": "https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/",
        "multi-qa": "https://tfhub.dev/google/universal-sentence-encoder-multilingual-qa/",
    }
    versions = {
        "base": 4,
        "large": 5,
        "qa": 3,
        "multi": 3,
        "multi-large": 3,
        "multi-qa": 3,
    }
    # Fall back to the most recent known version for the chosen variant.
    version = versions[variant] if not version else version
    url = urls[variant] + str(version)
    return TFHubLanguage(url=url)
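As a quick illustration of typical use, here is a minimal sketch. It assumes the standard `whatlies` language API, where indexing a language object with a string returns an `Embedding` and indexing with a list of strings returns an `EmbeddingSet`; the example sentences are purely illustrative.

```python
from whatlies.language import UniversalSentenceLanguage

# Downloads (and caches) the multilingual variant from TFHub on first use.
lang = UniversalSentenceLanguage(variant="multi")

# A single string yields one embedding; a list of strings yields an EmbeddingSet.
emb = lang["have you paid the bill?"]
embset = lang[["have you paid the bill?", "heb je de rekening betaald?"]]

print(emb.vector.shape)  # typically (512,) for the universal sentence encoder
```

Because the model is fetched from TFHub, the first call can take a while and requires the extra dependencies installed via `whatlies[tfhub]` or `whatlies[all]` as noted above.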
Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| `variant` | `str` | select a specific variant | `'base'` |
| `version` | `Optional[int]` | select a specific version; if kept `None` we'll assume the most recent version | `None` |
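To make the `version` argument concrete: left as `None`, the function falls back to the most recent version it knows about for the chosen variant (the `versions` dictionary in the source above), while passing an integer pins an explicit TFHub release. A minimal sketch:

```python
# Pin an explicit TFHub release of the multilingual variant for reproducibility ...
lang_pinned = UniversalSentenceLanguage("multi", version=3)

# ... or let the loader fall back to its default version for the variant.
lang_default = UniversalSentenceLanguage("multi")
```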