Trying and Failing to Interpret Embeddings by tedtimbrell

I was born with congenital anosmia i.e. I cannot and have never been able to smell. Farts, flowers, cookies, and perfume; I have no personal experience of any of these smells. Yet, I can tell you that farts take over a room and cookies smell like home. This is all picked up through context. It’s picked up from watching my friends retching at the stench of the small animal that died in the vents of my middle school. For me, a smell is defined by its relation to other smells and the emotive descriptions of others. This is not altogether different from a sentence embedding.

I want to eventually build a system that can help me interpret smells. Now, I know I could ask an LLM (or a friend) to describe the smell of something but I do wonder if vector addition could provide some unexpected insights. What smells are quite similar but distant in context? I’d also like to try using reduced vectors to generate music or some other synesthetic output.

In this post, I’m going to explore vector addition and vector rotations as a means of modifying and interpreting these embeddings. My explorations are (mostly) a failure although hopefully, my process might save someone else some time.

If you have any ideas or corrections, please email me at ted@timbrell.dev

Background

My inspiration for this comes from “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings”. Except in this case, I’m using sentence embeddings rather than word embeddings. If I want to embed smells, I’ll need to be able to input “the smell of red wine.”

from openai import OpenAI
import numpy as np
import heapq
import pandas as pd
import os


openai_client = OpenAI()


def get_embedding_openai(names):
    response = openai_client.embeddings.create(
        input=names, model="text-embedding-3-small"
    )
    return np.array([d.embedding for d in response.data])

king, queen, man, woman, prince, princess = get_embedding_openai(
    [
        "king of england",
        "queen of england",
        "man",
        "woman",
        "prince of england",
        "princess of england",
    ]
)
son, daughter, actor, actress, steward, stewardess = get_embedding_openai([
    "son",
    "daughter",
    "actor",
    "actress",
    "steward",
    "stewardess",
])

Q: Wait aren’t you supposed to be using smells?

It’s pretty hard to reason about something you can’t experience. I might know fresh coffee smells good in the morning but I can’t tell how similar/dissimilar that is to the smell of dew in the morning. I’ll get to smells in a later post.

I’m already making a jump from word embeddings to sentence embeddings so I believe it’s worth revisiting gender before moving on. It also has the benefit of being easy to generate examples for and is built into the English language. I’ll be using cosine similarity and Euclidean distance to get a sense of the distance between the vectors.

def cosine_similarity(vec1, vec2):
    vec1 = np.array(vec1)
    vec2 = np.array(vec2)
    dot_product = np.dot(vec1, vec2)
    magnitude_vec1 = np.linalg.norm(vec1)
    magnitude_vec2 = np.linalg.norm(vec2)
    if magnitude_vec1 == 0 or magnitude_vec2 == 0:
        return 0.0

    return dot_product / (magnitude_vec1 * magnitude_vec2)


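def euc_dist(a, b):
    # NOTE: despite the name, this is the Manhattan (L1) distance, i.e. the sum
    # of absolute coordinate differences. Every "euclidean_distance" figure
    # reported below was computed with it. A true Euclidean (L2) distance
    # would be np.linalg.norm(a - b).
    return sum(abs(a - b))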
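
A quick aside on the distance helper: summing absolute differences gives the Manhattan (L1) distance, while the Euclidean (L2) distance is the norm of the difference. The sketch below contrasts the two on made-up toy vectors (not real embeddings); `l1_dist`, `l2_dist` and `cos_sim` are names introduced here for illustration only. It also checks the identity that, for unit-length vectors, squared L2 distance equals 2 - 2 * cosine similarity, so L2 distance and cosine similarity rank neighbours identically.

```python
import numpy as np


def l1_dist(a, b):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return float(np.sum(np.abs(np.asarray(a) - np.asarray(b))))


def l2_dist(a, b):
    """Euclidean (L2) distance: straight-line distance."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))


def cos_sim(a, b):
    """Cosine of the angle between a and b."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy unit vectors, made up for illustration.
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.6, 0.8, 0.0])

print(l1_dist(a, b))  # 0.4 + 0.8, so about 1.2
print(l2_dist(a, b))  # sqrt(0.16 + 0.64), so about 0.894
# For unit vectors: ||a - b||^2 == 2 - 2 * cos(a, b)
print(np.isclose(l2_dist(a, b) ** 2, 2 - 2 * cos_sim(a, b)))  # True
```

The two distances live on very different scales in high dimensions, which is worth keeping in mind when reading the distance columns later on.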

Simple example: vector offsets and addition

Let’s try to get the vector for “King” from the vector for “Queen”.

male_offset = man - woman
added_queen = queen + male_offset
print(f"{cosine_similarity(king, queen)=}")
print(f"{cosine_similarity(king, added_queen)=}")

cosine_similarity(king, queen)=np.float64(0.7561968293567973)
cosine_similarity(king, added_queen)=np.float64(0.7436281583952487)

print(f"{euc_dist(king, queen)=}")
print(f"{euc_dist(king, added_queen)=}")

euc_dist(king, queen)=np.float64(21.68610072977549)
euc_dist(king, added_queen)=np.float64(23.88454184561374)

Well, that’s annoying. Unlike what I’d expect from the word embedding paper, our vector for “Queen” plus the gender offset is further away from the vector for “King” both in angle and Euclidean distance.

I’m also surprised by just how little the similarity metrics moved. Then again, the geometry is unclear here. The vector offset might be going in the wrong direction or under/overshooting.
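
One way to separate "wrong direction" from "under/overshooting" is to split the applied offset into the part that lies along the offset we actually need (target minus source) and the part orthogonal to it. This is only a rough sketch on made-up 3-d vectors; `decompose_offset` is a hypothetical helper, not something from the notebook.

```python
import numpy as np


def decompose_offset(applied, needed):
    """Split `applied` into a part along `needed` and an orthogonal remainder.

    Returns (scale, off_axis):
      scale    - fraction of `needed` that `applied` covers
                 (1.0 is exactly right, >1 overshoots, <0 goes the wrong way)
      off_axis - length of the part of `applied` that is orthogonal to `needed`
    """
    applied = np.asarray(applied, dtype=float)
    needed = np.asarray(needed, dtype=float)
    scale = float(applied @ needed / (needed @ needed))
    remainder = applied - scale * needed
    return scale, float(np.linalg.norm(remainder))


# Toy example: we need to move one unit along x, but the applied offset
# only covers half of that and also drifts one unit along y.
needed = np.array([1.0, 0.0, 0.0])
applied = np.array([0.5, 1.0, 0.0])
scale, off_axis = decompose_offset(applied, needed)
print(scale, off_axis)  # 0.5 1.0
```

With the real embeddings this would be called as `decompose_offset(man - woman, king - queen)`: a `scale` near 1 with a large `off_axis` would mean the step length is about right but most of the step is spent in unrelated directions.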

f"man - woman offset magnitude: {np.linalg.norm(male_offset)}", f"King - queen offset magnitude {np.linalg.norm(king - queen)}"

('man - woman offset magnitude: 0.7648199511941038',
 'King - queen offset magnitude 0.6982881772713763')

So we’re moving, roughly, the same distance as we’d need to reach the “king” vector.

f"{np.arccos(cosine_similarity(added_queen, queen))} radians between added_queen and queen"
f"{np.arccos(cosine_similarity(king, queen))} radians between king and queen"

'0.7318444725115928 radians between added_queen and queen'
'0.7133150836556438 radians between king and queen'

And we’re changing our angle by roughly the same amount as expected… just not in the right direction.

Let’s take a look at the cosine similarity between these gendered offsets

gender_vectors = [
    man - woman,
    king - queen,
    prince - princess,
    son - daughter,
    actor - actress,
    steward - stewardess,
]
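for idx in range(len(gender_vectors)):
    # Normalize each offset by its own length (cosine similarity ignores scale,
    # so this does not change the table below).
    gender_vectors[idx] /= np.linalg.norm(gender_vectors[idx])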

res = np.zeros(shape=(len(gender_vectors), len(gender_vectors)))

for r in range(len(gender_vectors)):
    for c in range(len(gender_vectors)):
        res[r, c] = cosine_similarity(gender_vectors[r], gender_vectors[c])
pd.DataFrame(res)

	0	1	2	3	4	5
0	1.000000	0.455847	0.438728	0.244890	0.276235	0.214461
1	0.455847	1.000000	0.657469	0.222574	0.386973	0.355544
2	0.438728	0.657469	1.000000	0.229321	0.337566	0.385467
3	0.244890	0.222574	0.229321	1.000000	0.154418	0.111242
4	0.276235	0.386973	0.337566	0.154418	1.000000	0.232568
5	0.214461	0.355544	0.385467	0.111242	0.232568	1.000000

Even though each pair is supposedly the same concept differing only by gender, the offsets point in quite different directions. son - daughter differs from steward - stewardess by about 1.46 radians (roughly 84 degrees), which is close to orthogonal.
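
For reference, the same pairwise table can be computed in one shot with a matrix product, and converting to degrees makes the spread easier to read. This is a minimal sketch on toy 2-d vectors standing in for the offsets; `pairwise_cosine` is a name made up for this example.

```python
import numpy as np


def pairwise_cosine(vectors):
    """Cosine similarity between every pair of rows in `vectors`."""
    m = np.asarray(vectors, dtype=float)
    # Normalize each row by its own length, then take all dot products at once.
    unit = m / np.linalg.norm(m, axis=1, keepdims=True)
    # Clip to guard against tiny floating point excursions outside [-1, 1].
    return np.clip(unit @ unit.T, -1.0, 1.0)


# Toy stand-ins for the offset vectors (not real embeddings).
toy_offsets = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
sims = pairwise_cosine(toy_offsets)
angles_deg = np.degrees(np.arccos(sims))
print(np.round(angles_deg))
```

On the toy input the off-diagonal angles come out at 45 and 90 degrees; with the real offsets, `np.stack(gender_vectors)` would be the input.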

Rotation

I’m not up to date on research into embeddings, but I find the use of vector addition for these analyses odd. I know that these vectors are generated through a series of additions and activations, but if these models normalize everything to a unit vector and compare everything with cosine similarity, are we not inherently saying that it’s the angles that matter?

To that end, what if I rotate the “queen” vector within the plane spanned by the “man” and “woman” vectors? The vectors for “king” and “queen” have to be offset from our vectors for “man” and “woman”. We also know that embeddings capture more concepts than just the N dimensions represented in the vector. A rotation, while more expensive, could help in the case of an angular difference between the initial vector pair and the compared vector pair.

Below, we try rotating our queen vector with the rotation matrix found from getting to “man” from “woman”.

def compute_nd_rotation_matrix(a, b):
    a_norm = a / np.linalg.norm(a)
    b_norm = b / np.linalg.norm(b)
    
    cos_theta = np.dot(a_norm, b_norm)
    cos_theta = np.clip(cos_theta, -1.0, 1.0)
    angle = np.arccos(cos_theta)
    
    v = b_norm - np.dot(b_norm, a_norm) * a_norm
    v_norm = np.linalg.norm(v)
    
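    if v_norm < 1e-8:  # a and b are collinear, so the rotation plane is undefined
        return np.eye(len(a)), 0.0  # fall back to "no rotation" so callers can still unpack (R, angle)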
    
    v = v / v_norm
    
    identity = np.eye(len(a))
    outer_aa = np.outer(a_norm, a_norm)
    outer_av = np.outer(a_norm, v)
    outer_va = np.outer(v, a_norm)
    outer_vv = np.outer(v, v)
    
    R = (
        identity
        + np.sin(angle) * (outer_va - outer_av)
        + (np.cos(angle) - 1) * (outer_vv + outer_aa)
    )
    
    return R, angle
    

gender_rotation, gender_angle = compute_nd_rotation_matrix(woman, man)

rotated_queen = np.dot(gender_rotation, queen)
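
Before trusting any numbers, it is worth checking that a matrix built this way really is a rotation confined to one plane. The sketch below rebuilds the same formula in a compact, self-contained form (`plane_rotation` is a name introduced here, and the vectors are random stand-ins rather than embeddings) and checks three properties: the matrix is orthogonal, it carries the first direction onto the second, and it leaves everything outside the plane alone.

```python
import numpy as np


def plane_rotation(a, b):
    """Rotation within span(a, b) that takes a's direction to b's direction."""
    a_hat = a / np.linalg.norm(a)
    b_hat = b / np.linalg.norm(b)
    angle = np.arccos(np.clip(a_hat @ b_hat, -1.0, 1.0))
    v = b_hat - (b_hat @ a_hat) * a_hat  # part of b orthogonal to a
    v /= np.linalg.norm(v)               # assumes a and b are not collinear
    return (
        np.eye(len(a))
        + np.sin(angle) * (np.outer(v, a_hat) - np.outer(a_hat, v))
        + (np.cos(angle) - 1) * (np.outer(a_hat, a_hat) + np.outer(v, v))
    )


rng = np.random.default_rng(0)
a, b, other = rng.normal(size=(3, 8))  # three random 8-d vectors
R = plane_rotation(a, b)

# 1) R is orthogonal, so it preserves lengths and angles.
print(np.allclose(R @ R.T, np.eye(8)))  # True
# 2) R carries a's direction onto b's direction.
print(np.allclose(R @ (a / np.linalg.norm(a)), b / np.linalg.norm(b)))  # True
# 3) Anything orthogonal to the a-b plane is untouched.
basis, _ = np.linalg.qr(np.stack([a, b], axis=1))  # orthonormal basis of the plane
outside = other - basis @ (basis.T @ other)        # part of `other` outside the plane
print(np.allclose(R @ outside, outside))  # True
```

The third property is the important one for interpretation: only the two coordinates of “queen” inside the man/woman plane ever change, so if little of the king/queen difference lives in that plane, the rotation cannot close much of the gap.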

def highlight_max(s):
    is_max = s == s.max()
    return ["font-weight: bold" if v else "" for v in is_max]


def highlight_min(s):
    is_min = s == s.min()
    return ["font-weight: bold" if v else "" for v in is_min]


def compute_results(*, target, source, offset, rotation):
    target_norm = target / np.linalg.norm(target)
    source_norm = source / np.linalg.norm(source)
    added_source = source_norm + offset
    added_source /= np.linalg.norm(added_source)

    rotated_source = np.dot(rotation, source_norm)

    rotated_vector_metrics = {
        "cosine_similarity": cosine_similarity(target_norm, rotated_source),
        "euclidean_distance": euc_dist(target_norm, rotated_source),
    }

    summed_vector_metrics = {
        "cosine_similarity": cosine_similarity(target_norm, added_source),
        "euclidean_distance": euc_dist(target_norm, added_source),
    }

    original_vector_metrics = {
        "cosine_similarity": cosine_similarity(target_norm, source_norm),
        "euclidean_distance": euc_dist(target_norm, source_norm),
    }

    df = pd.DataFrame(
        {
            "Original Vector": original_vector_metrics,
            "Summed Vector": summed_vector_metrics,
            "Rotated Vector": rotated_vector_metrics,
        }
    ).T
    return df


def style_results(df):
    styled_df = df.style.apply(highlight_max, subset=["cosine_similarity"])
    styled_df.apply(highlight_min, subset=["euclidean_distance"])
    return styled_df


style_results(
    compute_results(
        target=king, source=queen, offset=male_offset, rotation=gender_rotation
    )
)

	cosine_similarity	euclidean_distance
Original Vector	0.756197	21.686100
Summed Vector	0.743628	22.333814
Rotated Vector	0.800727	19.759377

np.arccos(0.756197)- np.arccos(0.800727)

np.float64(0.07102636138332474)

The rotation helps! Though, it only moves us 0.07 radians (4 degrees) closer.

Let’s try with other gendered titles.

Below we try with “prince” and “princess”:


style_results(compute_results(target=prince, source=princess, rotation=gender_rotation, offset=male_offset))

	cosine_similarity	euclidean_distance
Original Vector	0.798122	19.734372
Summed Vector	0.752733	21.830848
Rotated Vector	0.831910	17.951380

This yields similar results, although this is just another title for royalty.

style_results(compute_results(target=son, source=daughter, rotation=gender_rotation, offset=male_offset))

	cosine_similarity	euclidean_distance
Original Vector	0.506902	30.924019
Summed Vector	0.505972	31.099625
Rotated Vector	0.519920	30.628173

style_results(compute_results(target=actor, source=actress, rotation=gender_rotation, offset=male_offset))

	cosine_similarity	euclidean_distance
Original Vector	0.618884	27.222238
Summed Vector	0.565936	29.213596
Rotated Vector	0.644523	26.165533

style_results(compute_results(target=steward, source=stewardess, rotation=gender_rotation, offset=male_offset))

	cosine_similarity	euclidean_distance
Original Vector	0.753096	21.497546
Summed Vector	0.633975	26.397621
Rotated Vector	0.760474	21.201120

I have two takeaways here, 1) that the summed vector is worse in every pairing and 2) that the rotated vector is encoding some aspect of gender (but the improvement is quite small). Let’s explore each of these.

1. The summed vector is further from the target than the original vector for all title pairs.

I find this result suspicious. Let’s try scaling the offset vector to see if I can get a better result.

from scipy.optimize import minimize

def objective(k, source, offset, target):
    adjusted_vector = source + k * offset
    return -cosine_similarity(adjusted_vector, target)

options = []
for target, source in [
    (king, queen),
    (prince, princess),
    (son, daughter),
    (actor, actress),
    (steward, stewardess),
]:
    initial_k = 0.0

    result = minimize(objective, initial_k, args=(source, male_offset, target))
    optimal_k = result.x[0]
    options.append(optimal_k)
    print(optimal_k)
average_k = sum(options) / len(options)
print(f"Average K: {average_k}")

0.4442216918664982
0.3758034405295738
0.45508087793089264
0.3248524288805894
0.18004962701645405
Average K: 0.3560016132448016

Above, I’m printing the individual best-fit scalar multiplier for our gender offset vector for each pair. Every optimal value is well below 1, so the unscaled offset overshoots in every case.
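
As a cross-check on the optimizer, the per-pair optimum also has a closed form. The direction `source + k * offset` closest in angle to the target is the projection of the target onto the plane spanned by `source` and `offset`, so `k` falls out of a small least-squares solve. This is a sketch under two assumptions: source and offset are not collinear, and the projection has a positive component along source. `best_offset_scale` is a made-up name and the vectors are random stand-ins.

```python
import numpy as np


def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def best_offset_scale(source, offset, target):
    """k maximizing cos(source + k * offset, target), via projection."""
    basis = np.stack([source, offset], axis=1)  # columns span the plane
    # Projection of target onto the plane: alpha * source + beta * offset.
    (alpha, beta), *_ = np.linalg.lstsq(basis, target, rcond=None)
    if alpha <= 0:
        raise ValueError("projection points away from source; no finite maximizer")
    return float(beta / alpha)


# Synthetic data: the target is roughly source + 0.4 * offset, plus noise.
rng = np.random.default_rng(1)
source = rng.normal(size=16)
offset = rng.normal(size=16)
target = source + 0.4 * offset + 0.1 * rng.normal(size=16)

k = best_offset_scale(source, offset, target)
grid = np.linspace(-2, 2, 4001)
grid_best = max(cos_sim(source + g * offset, target) for g in grid)
# The closed-form k should be at least as good as any value on the grid.
print(k, cos_sim(source + k * offset, target) >= grid_best - 1e-12)
```

If the closed-form value and the `minimize` result disagree noticeably for a pair, that usually points at the optimizer stopping early rather than at the geometry.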

For simplicity, let’s average the optimal scalar and recompute the similarity stats. This is not optimal as I should be minimizing on the batch and then testing on out-of-sample data.

It almost goes without saying that a single, consistent magnitude for the offset would have been nice. In the case where we don’t have a known target, that offset would allow us to naively add/subtract the gender offset to a source vector and have confidence in its meaning.
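
For completeness, here is roughly what the more careful version could look like: fit one shared scalar on all pairs but one, then score it on the held-out pair. This is a sketch on synthetic vectors, not the real embeddings; `fit_shared_k` and `leave_one_out` are hypothetical helpers, and the grid search is just the simplest thing that works, not a recommendation.

```python
import numpy as np


def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def fit_shared_k(pairs, offset, grid):
    """Grid-search the single k maximizing mean cosine over (target, source) pairs."""
    scores = [np.mean([cos_sim(s + k * offset, t) for t, s in pairs]) for k in grid]
    return float(grid[int(np.argmax(scores))])


def leave_one_out(pairs, offset, grid):
    """For each pair: fit k on the other pairs, then score on the held-out pair."""
    results = []
    for i, (t, s) in enumerate(pairs):
        train = pairs[:i] + pairs[i + 1:]
        k = fit_shared_k(train, offset, grid)
        results.append((k, cos_sim(s, t), cos_sim(s + k * offset, t)))
    return results


# Synthetic stand-ins: each target is its source plus ~0.4 * offset plus noise.
rng = np.random.default_rng(2)
offset = rng.normal(size=32)
pairs = []
for _ in range(5):
    s = rng.normal(size=32)
    t = s + 0.4 * offset + 0.2 * rng.normal(size=32)
    pairs.append((t, s))

for k, before, after in leave_one_out(pairs, offset, np.linspace(0, 1, 101)):
    print(f"k={k:.2f}  held-out cosine: {before:.3f} -> {after:.3f}")
```

With only five pairs the held-out scores are noisy, but they at least avoid grading the scalar on the same data it was tuned on.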

for target, source in [
    (king, queen),
    (prince, princess),
    (son, daughter),
    (actor, actress),
    (steward, stewardess),
]:
    display(style_results(compute_results(target=target, source=source, rotation=gender_rotation, offset=male_offset * average_k)))
    print()

Queen -> King	cosine_similarity	euclidean_distance
Original Vector	0.756197	21.686100
Summed Vector	0.801380	19.744261
Rotated Vector	0.800727	19.759377

Princess -> Prince	cosine_similarity	euclidean_distance
Original Vector	0.798122	19.734372
Summed Vector	0.833044	18.017618
Rotated Vector	0.831910	17.951380

Daughter -> Son	cosine_similarity	euclidean_distance
Original Vector	0.506902	30.924019
Summed Vector	0.537458	29.976789
Rotated Vector	0.519920	30.628173

Actress -> Actor	cosine_similarity	euclidean_distance
Original Vector	0.618884	27.222238
Summed Vector	0.638717	26.398096
Rotated Vector	0.644523	26.165533

Stewardess -> Steward	cosine_similarity	euclidean_distance
Original Vector	0.753096	21.497546
Summed Vector	0.753192	21.533765
Rotated Vector	0.760474	21.201120

There we go! Our summed vector is now better or at least matches our original vector’s similarity. The summed vector now also matches the performance of the rotated vector, although it required an additional optimization step and K chosen in-sample.

Our largest outlier pair when optimizing for our scalar K was “steward” and “stewardess”. The optimal scalar K for that pair is about half the average. Still, the addition does essentially no harm there: the cosine similarity is flat and the distance is only marginally larger (21.53 vs. 21.50). The rotated vector, though, still makes progress toward the target.

I recognize that only using the “man” and “woman” vectors to generate the offset is a bit silly. Using a broader collection of gendered words, sentences, titles, etc., and averaging them to create average “man” and “woman” vectors before taking the offset is best practice. However, I’m going to have limited data once I get to smells so I’m trying to keep this simple.
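
To make the averaging idea concrete: taking the mean of the "male" vectors minus the mean of the "female" vectors is the same as averaging the individual pair offsets, and the pair-specific noise partly cancels. The sketch below uses synthetic pairs that share one hidden direction plus per-pair noise; `mean_offset` and the data are made up for illustration.

```python
import numpy as np


def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def mean_offset(pairs):
    """mean(a) - mean(b) over (a, b) pairs, i.e. the mean of the a - b offsets."""
    a_stack = np.stack([a for a, _ in pairs])
    b_stack = np.stack([b for _, b in pairs])
    return a_stack.mean(axis=0) - b_stack.mean(axis=0)


# Synthetic pairs: a = b + (shared direction) + (pair-specific noise).
rng = np.random.default_rng(3)
dim = 64
shared = rng.normal(size=dim)
shared /= np.linalg.norm(shared)
pairs = []
for _ in range(20):
    b = rng.normal(size=dim)
    a = b + 2.0 * shared + rng.normal(size=dim)
    pairs.append((a, b))

# How well does a typical single-pair offset align with the shared direction,
# compared with the averaged offset?
single = np.mean([cos_sim(a - b, shared) for a, b in pairs])
averaged = cos_sim(mean_offset(pairs), shared)
print(f"single-pair offsets: {single:.2f}, averaged offset: {averaged:.2f}")
```

The catch, as noted above, is that this needs many pairs, which is exactly what will be in short supply for smells.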

Now that I’ve resolved the issue with vector addition, let’s go back to rotations!

2. The rotated vector is closer to the target!

The rotated vector is helping a little but isn’t closing much of the gap between the vectors. As we saw earlier, we’re rotating by roughly the correct amount, but on the wrong plane.

I think I’m encoding some concept of gender in the rotation, but am I accounting for it entirely? While I might say there are no differences between a King and a Queen, the studies on bias in LLMs show us that isn’t the case. I expect some difference in the transformed vectors no matter what naive transformations are performed. Gender stereotypes encoded into the embedding of “prince of England” might not be encoded into the general embedding for “man” (or are lessened through averaging). So, is the remaining distance due to other features/meanings or am I failing to account for general aspects of gender?

To start with, let’s make this a fair comparison with the offset and optimize the angle (magnitude) of rotation. After all, my hypothesis is that the two-dimensional plane can be treated as a feature, and its angle as a magnitude.

A scalar product for our angle wouldn’t be all that interpretable so instead, I’ll optimize for the angle of rotation directly.

def compute_nd_rotation_matrix(a, b, angle=None):
    a_norm = a / np.linalg.norm(a)
    b_norm = b / np.linalg.norm(b)
    
    cos_theta = np.dot(a_norm, b_norm)
    cos_theta = np.clip(cos_theta, -1.0, 1.0)
    if angle is None:
        angle = np.arccos(cos_theta)
    
    v = b_norm - np.dot(b_norm, a_norm) * a_norm
    v_norm = np.linalg.norm(v)
    
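    if v_norm < 1e-8:  # a and b are collinear, so the rotation plane is undefined
        return np.eye(len(a)), 0.0  # fall back to "no rotation" so callers can still unpack (R, angle)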
    
    v = v / v_norm
    
    identity = np.eye(len(a))
    outer_aa = np.outer(a_norm, a_norm)
    outer_av = np.outer(a_norm, v)
    outer_va = np.outer(v, a_norm)
    outer_vv = np.outer(v, v)
    
    R = (
        identity
        + np.sin(angle) * (outer_va - outer_av)
        + (np.cos(angle) - 1) * (outer_vv + outer_aa)
    )
    
    return R, angle


def objective(m, source, base_source, base_target, target):
    R, angle = compute_nd_rotation_matrix(base_source, base_target, m)    
    adjusted_vector = np.dot(R, source)
    return -cosine_similarity(adjusted_vector, target)

options = []
for target, source in [
    (king, queen),
    (prince, princess),
    (son, daughter),
    (actor, actress),
    (steward, stewardess),
]:
    initial_m = gender_angle

    result = minimize(objective, initial_m, args=(source, woman, man, target))
    optimal_m = result.x[0]
    options.append(optimal_m)
    print(optimal_m)
print()
average_m = sum(options) / len(options)
print(f"Average Optimized Angle: {average_m} (radians)")
print(f"Original Angle: {gender_angle} (radians)")
print(f"Difference: {abs(average_m - gender_angle)} radians, {np.rad2deg(abs(average_m - gender_angle))} degrees")
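
As with the offset scale, the best angle for a single pair can be found without an optimizer. A rotation within a fixed plane only changes the two in-plane coordinates of the source, so the dot product with the target has the form `A * cos(angle) + B * sin(angle) + constant`, which peaks at `atan2(B, A)`. The sketch below assumes the two base vectors are not collinear; `plane_basis`, `rotate_in_plane` and `best_angle` are names made up here, and the inputs are random stand-ins for the embeddings.

```python
import numpy as np


def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def plane_basis(a, b):
    """Orthonormal basis (a_hat, v) of the plane spanned by a and b."""
    a_hat = a / np.linalg.norm(a)
    v = b - (b @ a_hat) * a_hat
    return a_hat, v / np.linalg.norm(v)  # assumes a and b are not collinear


def rotate_in_plane(x, a_hat, v, angle):
    """Rotate x by `angle` within the (a_hat, v) plane; the rest of x is unchanged."""
    xa, xv = x @ a_hat, x @ v
    rest = x - xa * a_hat - xv * v
    return (
        rest
        + (xa * np.cos(angle) - xv * np.sin(angle)) * a_hat
        + (xa * np.sin(angle) + xv * np.cos(angle)) * v
    )


def best_angle(source, target, a_hat, v):
    """Angle maximizing cos(rotated source, target); rotation keeps |source| fixed."""
    xa, xv = source @ a_hat, source @ v
    ta, tv = target @ a_hat, target @ v
    return float(np.arctan2(xa * tv - xv * ta, xa * ta + xv * tv))


rng = np.random.default_rng(4)
base_source, base_target, source, target = rng.normal(size=(4, 16))
a_hat, v = plane_basis(base_source, base_target)
theta = best_angle(source, target, a_hat, v)

best = cos_sim(rotate_in_plane(source, a_hat, v, theta), target)
grid_best = max(
    cos_sim(rotate_in_plane(source, a_hat, v, g), target)
    for g in np.linspace(-np.pi, np.pi, 3601)
)
# The closed-form angle should be at least as good as any angle on the grid.
print(theta, best >= grid_best - 1e-12)
```

With the real vectors, `plane_basis(woman, man)` would give the plane, and `best_angle(queen, king, a_hat, v)` should land close to whatever `minimize` reports for that pair.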
