Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space, defined as the cosine of the angle between them. Two vectors with the same orientation have a cosine similarity of 1, two perpendicular vectors have a similarity of 0, and two vectors with opposing directions have a similarity of -1, independent of their magnitude. One advantage of cosine similarity is its low complexity, especially for sparse vectors, where only the non-zero dimensions need to be considered, which is a common case in GSgalgoR. When applied to binary data, cosine similarity is also known as Otsuka-Ochiai similarity; this is the case in GSgalgoR, where individual solutions represented as strings of 0's and 1's are compared with this metric.
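As a minimal sketch of the definition above, the cosine of the angle can be computed in base R as the dot product of the two vectors divided by the product of their Euclidean norms. The helper name cosine_sketch is hypothetical and is used only for illustration; it is not necessarily how GSgalgoR implements cosine_similarity.

```r
# Hypothetical helper illustrating the definition; not the package's code.
cosine_sketch <- function(a, b) {
  stopifnot(length(a) == length(b))
  # dot product divided by the product of the Euclidean norms
  sum(a * b) / sqrt(sum(a^2) * sum(b^2))
}

x <- c(1, 0, 0, 1, 0, 0, 1)
same     <- cosine_sketch(x, x)      # same orientation
disjoint <- cosine_sketch(x, 1 - x)  # no shared non-zero positions
scaled   <- cosine_sketch(x, 3 * x)  # same orientation, different magnitude
```

Scaling a vector changes both the dot product and the norm by the same factor, which is why same and scaled agree.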

cosine_similarity(a, b)

Arguments

a, b

Two numeric vectors of equal length. They can also be binary vectors of 0's and 1's.

Value

In practice, the function can return numeric values from -1 to 1 according to the orientation of the vectors: a cosine similarity of 1 implies that the vectors have the same orientation, while -1 implies that they point in opposing directions. In the binary application, values range from 0 to 1, where 0 indicates totally discordant vectors and 1 indicates identical binary vectors.
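To see why binary inputs cannot give a negative value, note that for vectors of 0's and 1's the dot product simply counts the positions where both vectors are 1, and each squared norm counts the 1's in that vector. The sketch below assumes strictly binary input; binary_cosine_sketch is a hypothetical name and is not part of GSgalgoR.

```r
# Hypothetical helper; assumes a and b contain only 0's and 1's.
binary_cosine_sketch <- function(a, b) {
  stopifnot(length(a) == length(b))
  shared <- sum(a == 1 & b == 1)            # positions where both are 1
  shared / sqrt(sum(a == 1) * sum(b == 1))  # normalise by the counts of 1's
}

s1 <- c(1, 0, 0, 1, 0, 0, 1)
s2 <- c(1, 0, 1, 1, 0, 0, 0)
partial <- binary_cosine_sketch(s1, s2)  # 2 shared 1's, 3 ones in each: 2/3
```

Because the count of shared 1's is never negative and never exceeds the count of 1's in either vector, the result stays between 0 and 1.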

Examples

solution1 <- c(1, 0, 0, 1, 0, 0, 1)
solution2 <- solution1
r <- cosine_similarity(solution1, solution2)
# the cosine similarity (r) equals 1
solution2 <- abs(solution1 - 1)
r2 <- cosine_similarity(solution1, solution2)
# the cosine similarity (r2) equals 0