YahooArchive / oozie

Oozie - workflow engine for Hadoop

Home Page: http://yahoo.github.com/oozie/

oozie coordinator input-event does not work

twmht opened this issue

hi,

I have three coordinators A, B and C.

Coordinators B and C depend on the output of A: once the output of A is ready, B and C should run.

So I use an input-event to express this dependency.

Coordinators B and C are structured like this:

<coordinator-app name="B" frequency="1440" start="${start}" end="${end}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
   <datasets>
      <dataset name="input1" frequency="1440" initial-instance="${start}" timezone="UTC">
         <uri-template>hdfs://localhost:9000/tmp/revenue_feed/${YEAR}/${MONTH}/${DAY}</uri-template>
      </dataset>
   </datasets>
   <input-events>
      <data-in name="coordInput1" dataset="input1">
          <instance>${coord:current(0)}</instance>
      </data-in>
   </input-events>
   <action>
      <workflow>
         <app-path>hdfs://localhost:9000/B/workflows</app-path>
      </workflow>
   </action>     
</coordinator-app>

So, once hdfs://localhost:9000/tmp/revenue_feed/${YEAR}/${MONTH}/${DAY}/_SUCCESS is created, coordinators B and C should be triggered to run their workflows.
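
For reference, a dataset can state its readiness marker explicitly through the optional `<done-flag>` element. The fragment below is only a sketch of that setting, not a confirmed fix: when `<done-flag>` is omitted, the coordinator already waits for a `_SUCCESS` file in the resolved directory, and an empty `<done-flag>` makes the existence of the directory itself sufficient. Attribute values are quoted so that the fragment is well-formed XML.

```xml
<datasets>
   <!-- Same dataset as in coordinator B, with the readiness marker spelled out. -->
   <dataset name="input1" frequency="1440" initial-instance="${start}" timezone="UTC">
      <uri-template>hdfs://localhost:9000/tmp/revenue_feed/${YEAR}/${MONTH}/${DAY}</uri-template>
      <!-- _SUCCESS is already the default when done-flag is omitted; it is listed here only to make the dependency visible. -->
      <done-flag>_SUCCESS</done-flag>
   </dataset>
</datasets>
```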

The coordinator of A looks like this:

<coordinator-app name="A" frequency="1440" start="${start}" end="${end}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
   <action>
      <workflow>
         <app-path>hdfs://localhost:9000/A/workflows</app-path>
      </workflow>
   </action>
</coordinator-app>

Its ${start} and ${end} are the same as those of B and C.

The workflow of A creates hdfs://localhost:9000/tmp/revenue_feed/${YEAR}/${MONTH}/${DAY}/_SUCCESS.

However, coordinators B and C keep waiting for hdfs://localhost:9000/tmp/revenue_feed/${YEAR}/${MONTH}/${DAY}/_SUCCESS.

Even when I add an output-event to the coordinator of A, the workflows of B and C are still waiting for the input dataset to be created:

<coordinator-app name="A" frequency="1440" start="${start}" end="${end}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
    <datasets>
        <dataset name="output1" frequency="1440" initial-instance="${start}" timezone="UTC">
            <uri-template>hdfs://localhost:9000/tmp/revenue_feed/${YEAR}/${MONTH}/${DAY}</uri-template>
        </dataset>
    </datasets>

    <output-events>
        <data-out name="coordOutput1" dataset="output1">
            <instance>${coord:current(0)}</instance>
        </data-out>
    </output-events>
   <action>
      <workflow>
         <app-path>hdfs://localhost:9000/A/workflows</app-path>
      </workflow>
   </action>
</coordinator-app>
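
One thing worth checking, offered as an assumption rather than a diagnosis: an `<output-events>` block does not create any data by itself. It only resolves the dataset URI for the nominal time of the action, and the workflow still has to write to that resolved directory. A minimal sketch of passing the resolved path to the workflow of A with the `coord:dataOut` EL function follows. The property name `outputDir` is hypothetical and would have to match whatever parameter the workflow of A actually reads.

```xml
<coordinator-app name="A" frequency="1440" start="${start}" end="${end}" timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
   <datasets>
      <dataset name="output1" frequency="1440" initial-instance="${start}" timezone="UTC">
         <uri-template>hdfs://localhost:9000/tmp/revenue_feed/${YEAR}/${MONTH}/${DAY}</uri-template>
      </dataset>
   </datasets>
   <output-events>
      <data-out name="coordOutput1" dataset="output1">
         <instance>${coord:current(0)}</instance>
      </data-out>
   </output-events>
   <action>
      <workflow>
         <app-path>hdfs://localhost:9000/A/workflows</app-path>
         <configuration>
            <!-- outputDir is a hypothetical property name; the workflow would write its data and _SUCCESS under this path. -->
            <property>
               <name>outputDir</name>
               <value>${coord:dataOut('coordOutput1')}</value>
            </property>
         </configuration>
      </workflow>
   </action>
</coordinator-app>
```

With this wiring, the directory the workflow writes for a given run is the one that `${coord:current(0)}` resolves to in B and C for the same nominal time, provided the start times and frequencies match.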

If I submit the workflow of A directly, without its coordinator, the workflows of B and C are triggered.

I am not sure whether something is missing in my coordinator of A.

Thank you!

Can you please send it to user@oozie.apache.org? It has a bigger community.

--Mohammad

 On Friday, September 4, 2015 7:49 AM, Ming-Hsuan-Tu <notifications@github.com> wrote:
