What Are Embeddings

Imaginary Tokens

Let's make a few word embeddings. The basic building block for this is the Embedding object.

from whatlies import Embedding

foo = Embedding("foo", [0.5, 0.1])
bar = Embedding("bar", [0.1, 0.2])
buz = Embedding("buz", [0.3, 0.3])

These are all Embedding objects. Each one has a name and a vector, as well as a readable representation.

foo # Emb[foo]

We can also apply operations to an embedding as if it were a vector.

foo | (bar - buz) # Emb[(foo | (bar - buz))]

Such an operation also produces a new internal vector.

foo.vector                  # array([ 0.50,  0.10])
(foo | (bar - buz)).vector  # array([ 0.06, -0.12])
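Behind the scenes, `e1 | e2` amounts to removing from `e1` its projection onto `e2`. A minimal NumPy sketch of that idea (not the library's actual implementation) reproduces the vector above:

```python
import numpy as np

def reject(v, axis):
    # Remove from v its component along axis: v - ((v.axis)/(axis.axis)) * axis
    v, axis = np.asarray(v, dtype=float), np.asarray(axis, dtype=float)
    return v - (v @ axis) / (axis @ axis) * axis

foo = np.array([0.5, 0.1])
bar = np.array([0.1, 0.2])
buz = np.array([0.3, 0.3])

print(reject(foo, bar - buz))  # roughly [ 0.06 -0.12]
```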

But why read when we can plot? The whole point of this package is to make it visual.

for t in [foo, bar, buz]:
    t.plot(kind="scatter").plot(kind="text");

Meaning

Let's come up with imaginary embeddings for man, woman, king and queen.

We will plot them using the arrow plotting type.

man   = Embedding("man", [0.5, 0.1])
woman = Embedding("woman", [0.5, 0.6])
king  = Embedding("king", [0.7, 0.33])
queen = Embedding("queen", [0.7, 0.9])

import matplotlib.pyplot as plt

man.plot(kind="arrow", color="blue")
woman.plot(kind="arrow", color="red")
king.plot(kind="arrow", color="blue")
queen.plot(kind="arrow", color="red")
plt.axis('off');

King - Man + Woman

We can confirm the classic approximation that everybody likes to mention.

man.plot(kind="arrow", color="blue")
woman.plot(kind="arrow", color="red")
king.plot(kind="arrow", color="blue")
queen.plot(kind="arrow", color="red")

(king - man + woman).plot(kind="arrow", color="pink")

plt.axis('off');
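The pink arrow is plain element-wise vector arithmetic. A quick NumPy check with the toy vectors above shows the result landing close to queen:

```python
import numpy as np

man   = np.array([0.5, 0.1])
woman = np.array([0.5, 0.6])
king  = np.array([0.7, 0.33])
queen = np.array([0.7, 0.9])

# king - man + woman lands near queen's position.
analogy = king - man + woman
print(analogy)  # roughly [0.7 0.83], close to queen at [0.7 0.9]
```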

King - Queen

But maybe I am interested in the vector that spans between queen and king. I'll use the - operator here to indicate the connection between the two tokens.

Notice the poetry there...

man.plot(kind="arrow", color="blue")
woman.plot(kind="arrow", color="red")
king.plot(kind="arrow", color="blue")
queen.plot(kind="arrow", color="red")
(queen - king).plot(kind="arrow", color="pink", show_ops=True)
plt.axis('off');

Man | (Queen - King)

But that space queen-king ... we can also filter all that information out of our words. Linear algebra would call this "making it orthogonal". The | operator makes sense here.

man.plot(kind="arrow", color="blue")
woman.plot(kind="arrow", color="red")
king.plot(kind="arrow", color="blue")
queen.plot(kind="arrow", color="red")

(queen - king).plot(kind="arrow", color="pink", show_ops=True)
(man | (queen - king)).plot(kind="arrow", color="pink", show_ops=True)
plt.axis('off');

Embedding Mathematics

This is interesting. We have our original tokens and can filter away the (queen - king) axis. By doing this we get "new" embeddings with different properties. Numerically we can confirm in our example that this new space maps Emb(man) to be very similar to Emb(woman).

(man | (queen - king)).vector    # array([0.5, 0.])
(woman | (queen - king)).vector  # array([0.5, 0.]) up to floating-point noise

The same holds for Emb(queen) and Emb(king).

(queen | (man - woman)).vector   # array([0.7, 0.])
(king | (man - woman)).vector    # array([0.7, 0.])
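We can verify with plain NumPy that the filtered embeddings really are orthogonal to the axis we removed, using the same projection-removal formula that the `|` operator suggests:

```python
import numpy as np

def reject(v, axis):
    # Remove from v its component along axis.
    return v - (v @ axis) / (axis @ axis) * axis

man   = np.array([0.5, 0.1])
woman = np.array([0.5, 0.6])
king  = np.array([0.7, 0.33])
queen = np.array([0.7, 0.9])

# After removing (queen - king), man and woman collapse onto the same point.
print(reject(man, queen - king))    # approx [0.5, 0.]
print(reject(woman, queen - king))  # approx [0.5, 0.]

# The result is orthogonal to the removed axis: the dot product is ~0.
print(reject(man, queen - king) @ (queen - king))
```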

More Operations

Let's consider some other operations. For this we will make new embeddings.

man   = Embedding("man", [0.5, 0.15])
woman = Embedding("woman", [0.35, 0.2])
king  = Embedding("king", [0.2, 0.2])

man.plot(kind="arrow", color="blue")
woman.plot(kind="arrow", color="red")
king.plot(kind="arrow", color="green")

plt.xlim(0, 0.5)
plt.ylim(0, 0.5)
plt.axis('off');

Mapping Onto Tokens

In the previous example we demonstrated how to map "away" from vectors. But we can also map "onto" vectors. For this we introduce the >> operator.

man.plot(kind="arrow", color="blue")
woman.plot(kind="arrow", color="red")
(woman >> man).plot(kind="arrow", color="red")
(woman >> king).plot(kind="arrow", color="red")
king.plot(kind="arrow", color="green")

plt.xlim(0, 0.5)
plt.ylim(0, 0.5)
plt.axis('off');
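In plain NumPy terms (a sketch, not the library's internals), `>>` is vector projection: scale the target vector by the ratio of two dot products. With the toy vectors above:

```python
import numpy as np

def project(v, axis):
    # Project v onto axis: scale axis by (v.axis)/(axis.axis).
    return (v @ axis) / (axis @ axis) * axis

woman = np.array([0.35, 0.2])
king  = np.array([0.2, 0.2])

# The projection overshoots king, as the arrows in the plot suggest.
print(project(woman, king))  # approx [0.275, 0.275]
```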

Measuring the Mapping

Note that the woman vector in our embedding maps partially onto man and overshoots a bit on king. We can quantify this by measuring what fraction of the target vector is covered. This factor can be retrieved by using the > operator.

woman > king  # 1.3749
woman > man   # 0.7522
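That coverage factor is the scalar projection coefficient, the same number that scales the axis in the `>>` projection. A minimal NumPy sketch with the same toy vectors:

```python
import numpy as np

def scalar_projection(v, axis):
    # How many copies of axis fit in v's projection onto it.
    return (v @ axis) / (axis @ axis)

man   = np.array([0.5, 0.15])
woman = np.array([0.35, 0.2])
king  = np.array([0.2, 0.2])

print(scalar_projection(woman, king))  # approx 1.375
print(scalar_projection(woman, man))   # approx 0.752
```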

Interesting

This suggests that perhaps ... king and man can be used as axes for plotting?

It would also work if the embeddings lived in a very high-dimensional space.

No matter how large the embedding, we could've said woman spans 1.375 of king and 0.752 of man. Given king as the x-axis and man as the y-axis, we can map the token woman to a 2d representation (1.375, 0.752), which is easy to plot.

This is an interesting way of thinking about it. We can plot high-dimensional vectors in 2d as long as we can express them along two axes. An axis could be the vector of a token, or a token that has had operations applied to it.
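A minimal sketch of that idea in plain NumPy, using hypothetical 50-dimensional random vectors in place of real word embeddings: two scalar projections are enough to turn any high-dimensional vector into a plottable 2d point.

```python
import numpy as np

def scalar_projection(v, axis):
    # How many copies of axis fit in v's projection onto it.
    return (v @ axis) / (axis @ axis)

rng = np.random.default_rng(0)
token, x_axis, y_axis = rng.normal(size=(3, 50))  # three 50-dim "embeddings"

# A 50-dimensional vector becomes a 2d coordinate pair.
point = (scalar_projection(token, x_axis), scalar_projection(token, y_axis))
print(point)
```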

Note that this > mapping can also cause negative values.

foo = Embedding("foo", [-0.2, -0.2])

foo.plot(kind="arrow", color="pink")
woman.plot(kind="arrow", color="red")
king.plot(kind="arrow", color="green")
(foo >> woman).plot(kind="arrow", color="red", show_ops=True)

plt.xlim(-.3, 0.4)
plt.ylim(-.3, 0.4)
plt.axis('off');

foo > woman # -0.6769

Plotting High Dimensions

Let's confirm this idea by using some spaCy word-vectors.

import spacy
nlp = spacy.load('en_core_web_md')

words = ["cat", "dog", "fish", "kitten", "man", "woman", "king", "queen", "doctor", "nurse"]
tokens = {t.text: Embedding(t.text, t.vector) for t in nlp.pipe(words)}

x_axis = tokens['man']
y_axis = tokens['woman']
for name, t in tokens.items():
    t.plot(x_axis=x_axis, y_axis=y_axis).plot(kind="text", x_axis=x_axis, y_axis=y_axis)

The interesting thing here is that we can also perform operations on these words before plotting them.

royalty = tokens['king'] - tokens['queen']
gender = tokens['man'] - tokens['woman']
for n, t in tokens.items():
    (t
      .plot(x_axis=royalty, y_axis=gender)
      .plot(kind="text", x_axis=royalty, y_axis=gender))

The idea seems to work. But maybe we can introduce cooler charts and easier ways to deal with collections of embeddings.