leoliu1221 / engagement_pure_spark

Pure spark for engagement experimentation

engagement_pure_spark

steps

[ ] Create 500 main clusters
[ ] Take every cluster with size > 200 and recluster it into 10 sub-clusters. E.g. if 50 of the 500 main clusters have size > 200, you will have 450 + 50*10 = 950 clusters in the end
[ ] Create a plot of the final result: centroid distance + max + max, and centroid distance - max - max
[ ] Count1: count of values on the left larger than 0.000001
[ ] Count2: count of values on the right smaller than 1.34
[ ] Count3: count of values on the left larger than 1.48
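
The bookkeeping in the steps above can be sketched in plain Python (not Spark, to keep the example self-contained). This is an illustration under assumptions, not the repository's code: the helper names `final_cluster_count`, `pair_bounds`, `count_larger`, and `count_smaller` are hypothetical, and "max" is assumed to mean the max point-to-center distance of each of the two clusters being compared. The README does not say which plotted quantity is "left" and which is "right", so the counting helpers accept any list of values.

```python
def final_cluster_count(cluster_sizes, size_threshold=200, n_sub=10):
    """Number of clusters after each cluster larger than size_threshold
    is replaced by n_sub sub-clusters (hypothetical helper)."""
    n_large = sum(1 for size in cluster_sizes if size > size_threshold)
    n_small = len(cluster_sizes) - n_large
    return n_small + n_large * n_sub


def pair_bounds(centroid_distance, max_a, max_b):
    """Return (distance - max_a - max_b, distance + max_a + max_b).

    Assumption: max_a and max_b are the max point-to-center distances
    of the two clusters whose centroids are centroid_distance apart.
    """
    return (centroid_distance - max_a - max_b,
            centroid_distance + max_a + max_b)


def count_larger(values, threshold):
    """Count values strictly larger than threshold."""
    return sum(1 for v in values if v > threshold)


def count_smaller(values, threshold):
    """Count values strictly smaller than threshold."""
    return sum(1 for v in values if v < threshold)


# The README's example: 500 main clusters, 50 of them with size > 200.
sizes = [100] * 450 + [300] * 50
total = final_cluster_count(sizes)  # 450 + 50 * 10 = 950
```

Count1 and Count3 would then be `count_larger(values, 0.000001)` and `count_larger(values, 1.48)`, and Count2 would be `count_smaller(values, 1.34)`, each applied to whichever side of the plot the step refers to.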

measures

[ ] Max point to center
[ ] Average point to center
[ ] Cluster sizes
[ ] Cluster centroids
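
A minimal sketch of how these four measures could be computed for a labelled set of points, again in plain Python rather than Spark. It assumes Euclidean distance and takes the centroid to be the coordinate-wise mean of a cluster's points; neither choice is stated in the README, and the function name `cluster_measures` is hypothetical.

```python
import math
from collections import defaultdict


def cluster_measures(points, labels):
    """Per-cluster size, centroid, max and average point-to-center distance.

    Assumptions: Euclidean distance, centroid = mean of the member points.
    """
    # Group points by their cluster label.
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)

    measures = {}
    for label, members in groups.items():
        n = len(members)
        dim = len(members[0])
        # Coordinate-wise mean of the cluster's points.
        centroid = tuple(sum(p[i] for p in members) / n for i in range(dim))
        distances = [math.dist(p, centroid) for p in members]
        measures[label] = {
            "size": n,
            "centroid": centroid,
            "max_to_center": max(distances),
            "avg_to_center": sum(distances) / n,
        }
    return measures


# Tiny example: two points in cluster 0, one point in cluster 1.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
labels = [0, 0, 1]
result = cluster_measures(points, labels)
```

The `max_to_center` values are the per-cluster "max" terms assumed in the centroid distance +/- max +/- max plot.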

License: GNU General Public License v3.0


Languages

Python 96.6%, Shell 3.4%