How can I get the MovieLens dataset in Avro format?
hugstone opened this issue · comments
Hello, I have gone through the photon-ml tutorial on Docker. If I want to run the demo on my local machine, how can I get the MovieLens dataset in Avro format?
@hugstone I am also working with non-Avro formatted data, so I wrote a Python script that transforms CSV into a sort of libsvm format (using a categorical variable as the grouping variable). Then, by modifying the existing dev-scripts provided by the LinkedIn team, you can convert libsvm to Avro.
Take a look at this gist.
Please note that the scripts are set up for legacy Python 2.7.
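For readers on Python 3, here is a minimal sketch of the CSV to libsvm-style step described above. It is not the gist's code: the column names (`userId`, `rating`, `age`, `isAction`), the sample rows, and the function name `csv_to_libsvm` are all assumptions made up for illustration, and it assumes every non-label, non-group column is numeric.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; real MovieLens columns and features will differ.
SAMPLE_CSV = """userId,rating,age,isAction
u1,4,25,1
u1,3,25,0
u2,5,31,1
"""


def csv_to_libsvm(csv_text, label_col, group_col):
    """Turn CSV rows into libsvm-style lines, bucketed by a grouping column.

    Returns (lines_by_group, feature_index), where feature_index maps each
    feature column name to its 1-based libsvm index.
    """
    feature_index = {}
    lines_by_group = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = row.pop(label_col)
        group = row.pop(group_col)
        pairs = []
        for name, raw in row.items():
            value = float(raw)  # assumes all remaining columns are numeric
            if value == 0.0:
                continue  # libsvm is sparse, so zero-valued features are omitted
            index = feature_index.setdefault(name, len(feature_index) + 1)
            pairs.append((index, value))
        pairs.sort()  # libsvm expects ascending feature indices
        tokens = [label] + ["%d:%g" % (i, v) for i, v in pairs]
        lines_by_group[group].append(" ".join(tokens))
    return dict(lines_by_group), feature_index


if __name__ == "__main__":
    grouped, index = csv_to_libsvm(SAMPLE_CSV, "rating", "userId")
    for group, lines in grouped.items():
        for line in lines:
            print(group, line)
```

Keeping the lines bucketed by the grouping variable makes it straightforward to carry that ID into each record later, when the libsvm lines are converted to Avro.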
The tricky part is the GAME.avsc file and getting the Avro schema correct.
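To illustrate what "getting the schema correct" involves, here is a rough sketch of an Avro record schema written as a Python dict, plus a helper that shapes one libsvm-style line into a matching record. The schema below is hypothetical and is not the contents of GAME.avsc; the record and field names (`ExampleRecord`, `response`, `userId`, `features`, `name`, `term`, `value`) are assumptions, so check them against the actual .avsc file you use. Writing the records to an Avro file would additionally need an Avro library, which is not shown here.

```python
import json

# Hypothetical schema for illustration only; NOT the contents of GAME.avsc.
EXAMPLE_SCHEMA = {
    "type": "record",
    "name": "ExampleRecord",
    "fields": [
        {"name": "response", "type": "double"},
        {"name": "userId", "type": "string"},
        {
            "name": "features",
            "type": {
                "type": "array",
                "items": {
                    "type": "record",
                    "name": "ExampleFeature",
                    "fields": [
                        {"name": "name", "type": "string"},
                        {"name": "term", "type": "string"},
                        {"name": "value", "type": "double"},
                    ],
                },
            },
        },
    ],
}


def libsvm_line_to_record(line, group_id):
    """Shape one 'label idx:value ...' line into a dict matching EXAMPLE_SCHEMA."""
    tokens = line.split()
    features = []
    for token in tokens[1:]:
        index, value = token.split(":")
        # The libsvm index is used as the feature name; term is left empty.
        features.append({"name": index, "term": "", "value": float(value)})
    return {"response": float(tokens[0]), "userId": group_id, "features": features}


def has_schema_fields(record, schema):
    """Shallow check: the record's keys equal the schema's top-level field names."""
    return set(record) == {field["name"] for field in schema["fields"]}


if __name__ == "__main__":
    record = libsvm_line_to_record("4 1:25 2:1", "u1")
    print(json.dumps(EXAMPLE_SCHEMA, indent=2))  # text you could save as an .avsc file
    print(record, has_schema_fields(record, EXAMPLE_SCHEMA))
```

The shallow key check catches the most common mismatch, a missing or misnamed top-level field, before the records reach an Avro writer.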