Cosine Similarity between two sets of vectors?

I have words represented as vectors, so I can compare two words using the cosine similarity between their word vectors.

But now I'd like to extend that and compare two sentences, each being a set of word vectors.

What is the most robust way to go about this?

Should I compute the mean of each set of vectors and then compute the cosine-similarity of each mean vector? Should I be normalizing the vectors first?
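Concretely, I mean something like this (a minimal sketch; mean_vector_similarity is just a name I made up, and each sentence is assumed to be a list of NumPy word vectors):

import numpy as np

def mean_vector_similarity(vecs_a, vecs_b):
    # Average the word vectors of each sentence into a single vector
    a = np.mean(np.vstack(vecs_a), axis=0)
    b = np.mean(np.vstack(vecs_b), axis=0)
    # Cosine similarity of the two mean vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))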

How does that compare to my naive approach of scoring each word pair and then simply taking the mean of the scores as the similarity between the two sets?
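For reference, the naive version I described would look roughly like this (same assumptions; pairwise_mean_similarity is also a hypothetical name):

def pairwise_mean_similarity(vecs_a, vecs_b):
    A = np.vstack(vecs_a)
    B = np.vstack(vecs_b)
    # Normalize the rows so dot products become cosine similarities
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    # Score every word pair, then take the mean of all the scores
    return float(np.mean(A @ B.T))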

Any insights are greatly appreciated. Thanks.


To calculate the cosine similarity between two sentences, I am using this approach:

  1. Calculate the cosine similarity between every word vector in set A and every word vector in set B
  2. For each vector in A, find the pair with the maximum score (its best match in B)
  3. Multiply (or sum) these maximum scores to get the similarity score of A and B

This approach shows much better results for me than vector averaging.

Here is some Python code:

import numpy as np

A = [...]  # list of word vectors for the first sentence
B = [...]  # list of word vectors for the second sentence

# Pairwise score matrix: qd[i, j] = dot(A[i], B[j]).
# If the word vectors are L2-normalized, this is the cosine similarity.
qd = np.dot(np.vstack(A), np.vstack(B).T)

# Take each row's best match in B, then multiply the maxima together
rel = np.prod(np.amax(qd, axis=1))

Use the n_similarity function from the Word2Vec model in the Python package gensim: http://radimrehurek.com/gensim/models/word2vec.html#gensim.models.word2vec.Word2Vec.n_similarity
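A minimal usage sketch (assuming you already have a trained model saved on disk; the file name and word lists are placeholders, and in recent gensim versions the method is accessed through model.wv):

from gensim.models import Word2Vec

model = Word2Vec.load("word2vec.model")  # hypothetical model file

# n_similarity returns the cosine similarity between the mean
# vectors of the two word lists
score = model.wv.n_similarity(["this", "is", "a", "sentence"],
                              ["another", "short", "sentence"])
print(score)

Note that this corresponds to the mean-vector approach discussed in the question.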