OryxProject / oryx

Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning

Home Page:http://oryx.io

Multiple jobs on oryx

PalanQu opened this issue

Hi sir, I have a problem with the framework. I want to deploy multiple jobs on Oryx, where each job has its own Batch Layer, Speed Layer, and Serving Layer, for pipeline processing. For example, given a document, I want to run a WordCount and then use the WordCount result to find the most similar document. I want to write all of the algorithms myself in Spark; I mean, I don't want to use the existing algorithms. How can I do this?

You need to run separate instances of the layer processes. The layers running the same app all share one configuration, and with it the same app ID and Kafka topics. Different apps have different IDs and topics.
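To make the "one configuration per app" idea concrete, here is a minimal sketch that generates one config file body per app. The app names `WordCount` and `DocSimilarity` are hypothetical, and the key names (`oryx.id`, `oryx.input-topic.message.topic`, `oryx.update-topic.message.topic`) are assumptions based on the Oryx 2 configuration layout; check them against the `reference.conf` of your Oryx version before relying on them.

```python
# Sketch only: app names are hypothetical and the Oryx key names are assumed,
# not confirmed in this thread. Verify against your version's reference.conf.

def app_config(app_id):
    """Return HOCON text for one app; all three layers of that app load the same file."""
    return "\n".join([
        "oryx {",
        f'  id = "{app_id}"',
        # Topic names are derived from the app ID so that apps never share topics.
        f'  input-topic.message.topic = "{app_id}Input"',
        f'  update-topic.message.topic = "{app_id}Update"',
        "}",
    ])

# One configuration per app, keyed by app ID.
configs = {app_id: app_config(app_id) for app_id in ("WordCount", "DocSimilarity")}

for app_id, text in configs.items():
    print(f"# {app_id}.conf")
    print(text)
```

Deriving the topic names from the app ID is just one way to guarantee that two apps never collide; any naming scheme works as long as the ID and topics are unique per app.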

Questions can go to https://groups.google.com/a/cloudera.org/forum/#!forum/oryx-user

Thanks, sir. You mean I need to deploy multiple sets of layers? Could you please show me a simple example?
Or, I have found this answer. Is it correct?

https://groups.google.com/a/cloudera.org/forum/#!topic/oryx-user/4oNVz4JrAt0

Yes, that answer is correct. You simply run a whole different set of layer processes for each app.
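As a rough illustration of "a whole different set of layer processes for each app", the sketch below lists the launch commands for two apps. The config file names are hypothetical, and the `oryx-run.sh <layer> --conf <file>` form is an assumption about the launcher script; confirm the exact options with the script's usage message for your Oryx release.

```python
# Sketch only: file names are hypothetical and the oryx-run.sh invocation form
# is assumed, not confirmed in this thread.

LAYERS = ("batch", "speed", "serving")

def launch_commands(conf_files):
    """Build one launch command per (app config, layer) pair."""
    return [
        f"./oryx-run.sh {layer} --conf {conf}"
        for conf in conf_files
        for layer in LAYERS
    ]

# Two apps with three layers each gives six separate processes.
for cmd in launch_commands(["WordCount.conf", "DocSimilarity.conf"]):
    print(cmd)
```

Each command starts an independent process, so the two apps can be started, stopped, and scaled separately.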