Fruitr is an experiment with the Echonest API.

The code is located here: https://github.com/rishighan/fruitr

The project itself is at: **fruitr.rishighan.com**

Specifically, it looks at the `hotttnesss` and `familiarity` methods. `hotttnesss` is, in Echonest's own words, a numerical description of how hottt an artist currently is^{1}. I was interested in finding out which metrics they actually use to arrive at that rating, particularly with a view to building a predictive model that gauges whether an artist is going to be hot based on their past and current work.

Echonest officially says that it bases the rating on the activity that it sees on the thousands of websites that are crawled by its algorithms^{2}.

With Fruitr, I wanted to start with establishing a correlation between two or more metrics, so that I would have a scientific way of using that information to create the predictive model.

The problem at hand, at least initially, is to calculate the correlation between `hotttnesss` and `familiarity`.

A strong positive correlation between these two metrics would mean that artists who are hot also tend to be more familiar.

The correlation is a Pearson coefficient of correlation^{3} and is calculated by getting:

1. 10 `similar`^{4} artists for a seed value
2. The corresponding `hotttnesss` and `familiarity` ratings
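For reference, the coefficient computed from those two lists of ratings can be written in its computational form. This is the standard textbook expression, not something taken from Echonest; here $x_i$ and $y_i$ stand for the `hotttnesss` and `familiarity` ratings of the $i$-th artist and $n$ is the number of artists:

$$
r = \frac{\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}}{\sqrt{\left(\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right)\left(\sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n}\right)}}
$$

The $\sum x_i y_i$ term is a sum of pairwise products, which is the part that needs the two lists to be walked in step.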

Ruby's `zip` method helps out here. Given two arrays of the same length, `zip` pairs each element of the first array with the element at the same index in the second, creating `m` two-element subarrays, where m is the length of the first array. Visual aid^{5}:

```
2.1.0 :001 > arr1 = [1,2,3,4]
 => [1, 2, 3, 4]
2.1.0 :002 > arr2 = [5,6,7,8]
 => [5, 6, 7, 8]
2.1.0 :003 > arr1.zip(arr2)
 => [[1, 5], [2, 6], [3, 7], [4, 8]]
```
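One caveat worth sketching: because the number of subarrays follows the first array, `zip` pads with `nil` when the second array is shorter, and multiplying by `nil` then fails. The helper name `safe_products` below is hypothetical and only illustrates a length check; it is not part of Fruitr.

```ruby
# zip pads with nil when the second array is shorter than the first.
short_pairs = [1, 2, 3].zip([4, 5])
# short_pairs now ends with the pair [3, nil]

# Hypothetical guard: refuse mismatched lengths before multiplying,
# since nil cannot be multiplied.
def safe_products(a, b)
  unless a.length == b.length
    raise ArgumentError, "arrays must be the same length"
  end
  a.zip(b).map { |x, y| x * y }
end

p safe_products([1, 2, 3, 4], [5, 6, 7, 8])
```

If the ratings always come back in matched pairs for the same artists, the check never fires, but it turns a confusing `nil` error into a clear one.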

Taking this a step further, we can multiply the two elements of each subarray:

```
2.1.0 :004 > arr1.zip(arr2).map {|x,y| x * y}
 => [5, 12, 21, 32]
```

Putting this all together, we have this method:

```ruby
# Dot product of two equal-length arrays:
# multiply element-wise, then sum the products.
def arrayMultiply(arr1, arr2)
  products = arr1.zip(arr2).map { |x, y| x * y }
  products.reduce(:+)
end
```

This gives us the tools to proceed with calculating our coefficient:

```ruby
def ruby_pearson(x, y)
  # Float, so the divisions below are not truncated for integer input
  n = x.length.to_f

  sumx = x.inject(0) { |r, i| r + i }
  sumy = y.inject(0) { |r, i| r + i }

  sumxSq = x.inject(0) { |r, i| r + i**2 }
  sumySq = y.inject(0) { |r, i| r + i**2 }

  prods = []
  x.each_with_index { |this_x, i| prods << this_x * y[i] }
  pSum = prods.inject(0) { |r, i| r + i }

  # Calculate Pearson score
  num = pSum - (sumx * sumy / n)
  den = ((sumxSq - (sumx**2) / n) * (sumySq - (sumy**2) / n))**0.5
  return 0 if den == 0

  num / den
end
```
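As a cross-check, here is a minimal sketch that computes the same coefficient with the `zip`-based dot product doing all the summing of products. The names `dot` and `pearson_with_zip` are illustrative assumptions, not Fruitr's code, and the ratings are made up rather than real Echonest data.

```ruby
# Dot product via zip: multiply element-wise, then sum.
def dot(a, b)
  a.zip(b).map { |x, y| x * y }.reduce(0, :+)
end

# Pearson coefficient in its computational form,
# built on the zip-based dot product.
def pearson_with_zip(x, y)
  n = x.length.to_f # float, to avoid integer division
  sum_x = x.reduce(0, :+)
  sum_y = y.reduce(0, :+)

  num = dot(x, y) - (sum_x * sum_y / n)
  den = Math.sqrt((dot(x, x) - sum_x**2 / n) * (dot(y, y) - sum_y**2 / n))

  # A constant list has no variance, so the coefficient is undefined;
  # return 0.0 in that case.
  den.zero? ? 0.0 : num / den
end

# Made-up ratings for illustration only.
hotttnesss  = [0.2, 0.4, 0.6, 0.8]
familiarity = [0.3, 0.5, 0.7, 0.9]

puts pearson_with_zip(hotttnesss, familiarity).round(4)
```

Because the made-up `familiarity` values are just the `hotttnesss` values shifted by a constant, the result should come out at, or extremely close to, 1. Real ratings would be expected to land somewhere between -1 and 1.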