The multivariate kernel density estimator is an estimate of the probability density function of a random vector, built from a random sample. Let x be a d-dimensional random vector with density function f, and let y_i, for i = 1, 2, …, n, be a random sample drawn from f, where n is the sample size. For any real vector x, the kernel density estimate is:
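In standard notation (a reconstruction of the formula referenced here, consistent with the definitions in the surrounding text):

```latex
\hat{f}_H(x) = \frac{1}{n} \sum_{i=1}^{n} K_H(x - y_i)
```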
where the scaled kernel function is:
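A common form of the scaled kernel, consistent with H being the d-by-d bandwidth matrix described below, is:

```latex
K_H(x) = |H|^{-1/2} \, K\!\left(H^{-1/2} x\right)
```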
and H is the d-by-d bandwidth (variance) matrix. In MATLAB, I used the mvksdensity function, which uses a diagonal bandwidth matrix and a product kernel. That is, H^{1/2} is a square diagonal matrix with the elements of the vector (h_1, h_2, …, h_d) on its main diagonal, and K(x) takes the product form K(x) = k(x_1)k(x_2)⋯k(x_d), where k(·) is a one-dimensional Gaussian kernel function. The multivariate kernel density estimator then becomes:
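With a diagonal bandwidth matrix and a product kernel as described above, the estimator takes the standard form:

```latex
\hat{f}_H(x) = \frac{1}{n \, h_1 h_2 \cdots h_d}
\sum_{i=1}^{n} \prod_{j=1}^{d} k\!\left(\frac{x_j - y_{i,j}}{h_j}\right)
```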
In this part, I used the standard multivariate Gaussian kernel, where H represents the covariance matrix:
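The standard multivariate Gaussian kernel (zero mean, identity covariance) is:

```latex
K(x) = (2\pi)^{-d/2} \exp\!\left(-\tfrac{1}{2}\, x^{\top} x\right)
```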
For each class, we can compare the resulting pdfs, multiplied by the prior probabilities, in the natural logarithm. The discriminant function g_i(x) is:

g_i(x) = ln p(x | w_i) + ln P(w_i)

After iterating over all classes, if g_i(x) > g_j(x) for all j ≠ i, we assign x to class w_i. On a private dataset, recall and precision for the test set:
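The decision rule above can be sketched in a few lines. This is a minimal Python/NumPy illustration (not the original MATLAB mvksdensity code): a Gaussian product-kernel density estimate per class, combined with log-priors, with `gaussian_product_kde` and `classify` being hypothetical helper names introduced here for illustration.

```python
import numpy as np

def gaussian_product_kde(x, samples, h):
    """Product-kernel Gaussian KDE: estimated density at point x.

    x: (d,) query point; samples: (n, d) training sample; h: (d,) bandwidths,
    i.e. the diagonal of H^{1/2}."""
    u = (x - samples) / h                          # (n, d) scaled differences
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)   # 1-D Gaussian kernel per dim
    return np.mean(np.prod(k, axis=1)) / np.prod(h)

def classify(x, class_samples, priors, h):
    """Assign x to the class maximizing g_i(x) = ln p(x | w_i) + ln P(w_i)."""
    scores = [np.log(gaussian_product_kde(x, s, h)) + np.log(p)
              for s, p in zip(class_samples, priors)]
    return int(np.argmax(scores))
```

For example, with two well-separated synthetic clusters and equal priors, a query point near a cluster's center is assigned to that cluster's class.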
For an overall comparison of algorithms on the same private dataset, please see the Hierarchical Clustering with SVM repository.