Fruitr

Fruitr is an experiment with the Echonest API.
The code is located here: https://github.com/rishighan/fruitr
The project itself is at: fruitr.rishighan.com

Specifically, Fruitr explores the hotttnesss and familiarity methods. hotttnesss is, in Echonest's own words, a numerical description of how hottt an artist currently is [1]. I was interested in finding out which metrics they actually use to arrive at that rating, especially in light of how those metrics could feed a predictive model that gauges whether an artist is going to be hot based on their past and current work.

Echonest officially says that it bases the rating on the activity it sees on the thousands of websites crawled by its algorithms [2].

With Fruitr, I wanted to start by establishing a correlation between two or more metrics, so that I would have a measurable basis for building the predictive model.

The problem at hand, at least initially, is to calculate the correlation between hotttnesss and familiarity. A strong positive correlation between these two metrics would mean that artists who are hot also tend to be more familiar.

The correlation is a Pearson coefficient of correlation [3] and is calculated by getting:
1. 10 similar [4] artists for a seed value
2. The corresponding hotttnesss and familiarity ratings
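
For reference, here is the computational form of the Pearson coefficient for n paired ratings. The symbols are chosen for illustration: x_i stands for the hotttnesss of the i-th artist and y_i for the familiarity.

```latex
r = \frac{\sum_{i} x_i y_i - \dfrac{\sum_{i} x_i \sum_{i} y_i}{n}}
         {\sqrt{\left(\sum_{i} x_i^2 - \dfrac{\left(\sum_{i} x_i\right)^2}{n}\right)
                \left(\sum_{i} y_i^2 - \dfrac{\left(\sum_{i} y_i\right)^2}{n}\right)}}
```

A value of r close to 1 means the two ratings rise together, a value close to -1 means one falls as the other rises, and a value near 0 means there is no linear relationship.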

Ruby's zip method helps out here. Given two arrays (in our case, each a 1 x n matrix), zip pairs each element of the first array with the element at the same index in the second, creating m two-element subarrays, where m is the length of the first array. Visual aid [5]:

2.1.0 :001 > arr1 = [1,2,3,4]
 => [1, 2, 3, 4] 
2.1.0 :002 > arr2 = [5,6,7,8]
 => [5, 6, 7, 8] 
2.1.0 :003 > arr1.zip(arr2)
 => [[1, 5], [2, 6], [3, 7], [4, 8]] 
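
One detail worth keeping in mind: zip always follows the length of the receiver, so if the second array is shorter, the missing slots are filled with nil. The sketch below uses made-up ratings (not real Echonest data) and hypothetical variable names to show this, along with one possible way of dropping incomplete pairs before doing any arithmetic on them.

```ruby
# Made-up ratings for illustration only
hot = [0.8, 0.6, 0.4]
fam = [0.7, 0.5]            # one rating is missing

# zip keeps the length of the receiver and pads with nil
pairs = hot.zip(fam)

# Drop pairs that have no familiarity value, since nil cannot be multiplied
complete = pairs.reject { |_, f| f.nil? }

p pairs
p complete
```

Whether to drop incomplete pairs or treat them as an error is a design choice; dropping them is only an assumption made here.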

Taking this a step further, we can multiply the two elements of each subarray:

2.1.0 :004 > arr1.zip(arr2).map {|x,y| x * y}
 => [5, 12, 21, 32] 

Putting this all together, we have this method:

# Dot product: multiply the arrays element-wise, then sum the products
def arrayMultiply(arr1, arr2)
  arr1.zip(arr2).map { |x, y| x * y }.reduce(0, :+)
end

This gives us the tools to proceed with calculating our coefficient:

def ruby_pearson(x, y)
  return 0 if x.empty?

  # Use a float so the divisions below are not truncated for integer input
  n = x.length.to_f

  # Sums of each series
  sumx = x.reduce(0, :+)
  sumy = y.reduce(0, :+)

  # Sums of squares
  sumxSq = x.reduce(0) { |r, i| r + i**2 }
  sumySq = y.reduce(0) { |r, i| r + i**2 }

  # Sum of the pairwise products
  pSum = x.zip(y).reduce(0) { |r, (a, b)| r + a * b }

  # Calculate Pearson score
  num = pSum - (sumx * sumy / n)
  den = ((sumxSq - (sumx**2) / n) * (sumySq - (sumy**2) / n))**0.5
  return 0 if den == 0

  num / den
end
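
As a quick sanity check, the same coefficient can also be written entirely in terms of the zip-based dot product. The sketch below is not taken from the Fruitr repository: dot and pearson_via_dot are hypothetical names, and the ratings are made-up numbers rather than real Echonest data.

```ruby
# Dot product built on zip: multiply element-wise, then sum
def dot(a, b)
  a.zip(b).map { |x, y| x * y }.reduce(0, :+)
end

# Pearson coefficient expressed with dot products (hypothetical helper)
def pearson_via_dot(x, y)
  return 0.0 if x.empty?

  n = x.length.to_f
  sum_x = x.reduce(0, :+)
  sum_y = y.reduce(0, :+)

  num = dot(x, y) - sum_x * sum_y / n

  # Clamp at zero so rounding error cannot produce a negative square root
  var_x = [dot(x, x) - sum_x**2 / n, 0.0].max
  var_y = [dot(y, y) - sum_y**2 / n, 0.0].max
  den = Math.sqrt(var_x * var_y)

  den.zero? ? 0.0 : num / den
end

# Made-up ratings for five artists, for illustration only
hotttnesss  = [0.82, 0.61, 0.47, 0.90, 0.33]
familiarity = [0.79, 0.66, 0.52, 0.85, 0.41]

r = pearson_via_dot(hotttnesss, familiarity)
puts "r = #{r.round(3)}"
```

Perfectly linear input such as [1, 2, 3, 4] and [2, 4, 6, 8] should give a coefficient of 1, which makes a convenient test before feeding in real ratings.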