damavis / airflow-hop-plugin

Apache Hop plugin for Apache Airflow - Orchestrate Apache Hop pipelines and workflows from Airflow


First Test of the Hop Plugin

pauloricardoferreira opened this issue

Hello, I'm running a test with Apache Airflow 2.3.3, Apache Hop 2.0, and the Hop plugin, and I ran into the following problem.
All of the services are running on Docker.
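For context, a task like the one failing in the log below might be declared roughly as follows. This is only a sketch: the operator class and the airflow_hop module path are taken from the log and traceback, the constructor arguments workflow and task_params are assumed from the self.workflow and self.task_params attributes the traceback references, and the .hwf path is a placeholder.

# Hypothetical DAG sketch; only HopWorkflowOperator and the airflow_hop
# module path are confirmed by the log and traceback below.
from datetime import datetime

from airflow import DAG
from airflow_hop.operators import HopWorkflowOperator

with DAG(
    dag_id='dag',
    start_date=datetime(2022, 8, 1),
    schedule_interval=None,
) as dag:
    # 'workflow' and 'task_params' are assumed from the attributes used in
    # the plugin traceback; the .hwf path is made up for illustration.
    gerar_dags = HopWorkflowOperator(
        task_id='tsk-job-gerar-dags',
        workflow='workflows/gerar_dags.hwf',
        task_params={},
    )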

==== Airflow Log ====
*** Reading local file: /opt/airflow/logs/dag_id=dag/run_id=manual__2022-08-11T10:54:53.330151+00:00/task_id=tsk-job-gerar-dags/attempt=1.log
[2022-08-11, 07:54:53 UTC] {taskinstance.py:1179} INFO - Dependencies all met for <TaskInstance: dag.tsk-job-gerar-dags manual__2022-08-11T10:54:53.330151+00:00 [queued]>
[2022-08-11, 07:54:53 UTC] {taskinstance.py:1179} INFO - Dependencies all met for <TaskInstance: dag.tsk-job-gerar-dags manual__2022-08-11T10:54:53.330151+00:00 [queued]>
[2022-08-11, 07:54:53 UTC] {taskinstance.py:1376} INFO -

[2022-08-11, 07:54:53 UTC] {taskinstance.py:1377} INFO - Starting attempt 1 of 1
[2022-08-11, 07:54:53 UTC] {taskinstance.py:1378} INFO -

[2022-08-11, 07:54:53 UTC] {taskinstance.py:1397} INFO - Executing <Task(HopWorkflowOperator): tsk-job-gerar-dags> on 2022-08-11 10:54:53.330151+00:00
[2022-08-11, 07:54:53 UTC] {standard_task_runner.py:52} INFO - Started process 518 to run task
[2022-08-11, 07:54:53 UTC] {standard_task_runner.py:79} INFO - Running: ['', 'tasks', 'run', 'dag', 'tsk-job-gerar-dags', 'manual__2022-08-11T10:54:53.330151+00:00', '--job-id', '15', '--raw', '--subdir', 'DAGS_FOLDER/job-gerar-dags.py', '--cfg-path', '/tmp/tmpe_g3c33k', '--error-file', '/tmp/tmpggoq0hjw']
[2022-08-11, 07:54:53 UTC] {standard_task_runner.py:80} INFO - Job 15: Subtask tsk-job-gerar-dags
[2022-08-11, 07:54:54 UTC] {task_command.py:371} INFO - Running <TaskInstance: dag.tsk-job-gerar-dags manual__2022-08-11T10:54:53.330151+00:00 [running]> on host 9a04be000dc4
[2022-08-11, 07:54:54 UTC] {taskinstance.py:1589} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=

AIRFLOW_CTX_DAG_ID=dag
AIRFLOW_CTX_TASK_ID=tsk-job-gerar-dags
AIRFLOW_CTX_EXECUTION_DATE=2022-08-11T10:54:53.330151+00:00
AIRFLOW_CTX_TRY_NUMBER=1
AIRFLOW_CTX_DAG_RUN_ID=manual__2022-08-11T10:54:53.330151+00:00
[2022-08-11, 07:54:54 UTC] {base.py:68} INFO - Using connection ID 'hop_default' for task execution.
[2022-08-11, 07:54:54 UTC] {taskinstance.py:1909} ERROR - Task failed with exception
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/airflow_hop/operators.py", line 80, in execute
register_rs = conn.register_workflow(self.workflow, self.task_params)
File "/usr/local/lib/python3.8/dist-packages/airflow_hop/hooks.py", line 160, in register_workflow
xml_builder = XMLBuilder(
File "/usr/local/lib/python3.8/dist-packages/airflow_hop/xml.py", line 40, in init
with open(f'{hop_home}/config/hop-config.json', encoding='utf-8') as file:
FileNotFoundError: [Errno 2] No such file or directory: '/opt/apache/hop/config/hop-config.json'
[2022-08-11, 07:54:54 UTC] {taskinstance.py:1415} INFO - Marking task as FAILED. dag_id=dag, task_id=tsk-job-gerar-dags, execution_date=20220811T105453, start_date=20220811T105453, end_date=20220811T105454
[2022-08-11, 07:54:54 UTC] {standard_task_runner.py:92} ERROR - Failed to execute job 15 for task tsk-job-gerar-dags ([Errno 2] No such file or directory: '/opt/apache/hop/config/hop-config.json'; 518)
[2022-08-11, 07:54:54 UTC] {local_task_job.py:156} INFO - Task exited with return code 1
[2022-08-11, 07:54:54 UTC] {local_task_job.py:273} INFO - 0 downstream tasks scheduled from follow-on schedule check

==== Example path: hop-config.json ====
root@45a5fa386e82:/opt/apache/hop/config# cat hop-config.json
{
  "variables" : [ {
    "name" : "HOP_MAX_LOG_SIZE_IN_LINES",
    "value" : "0",
    "description" : "The maximum number of log lines that are kept internally by Hop. Set to 0 to keep all rows (default)"
  }, {
    "name" : "HOP_MAX_LOG_TIMEOUT_IN_MINUTES",

The hop-config.json file is in the correct folder.
The project is at /opt/apache/hop/confi/projects/hop_repo:
dag
airflo_plugins
dag
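Note that the cat above was run in container 45a5fa386e82, while the failing task ran on host 9a04be000dc4 (see the Airflow log), so the Airflow worker and the Hop container most likely do not share that filesystem. A quick hypothetical check, run from inside the Airflow worker container, would confirm whether the path is visible there:

# Hypothetical check: run this inside the Airflow *worker* container
# (host 9a04be000dc4 in the log), not the Hop container.
import os

path = '/opt/apache/hop/config/hop-config.json'
print(path, 'exists:', os.path.exists(path))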

Hi @pauloricardoferreira

The file /opt/apache/hop/config/hop-config.json is expected to be found at that path (which is set in the connection) on your Airflow workers. You will need to sync your Apache Hop home directory to that path on the Airflow workers.

Hope that helps you.
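For reference, the log line "Using connection ID 'hop_default'" shows where the plugin resolves its settings, and the hop_home variable in the traceback is what expands to /opt/apache/hop. Below is a minimal sketch of registering such a connection programmatically; the 'hop_home' key in extra is an assumption based on that variable name (not confirmed against the plugin docs), and host/port are placeholders:

# Hypothetical sketch: create the 'hop_default' connection the plugin uses.
# The 'hop_home' extra key is an assumption, not a documented parameter.
from airflow.models import Connection
from airflow.settings import Session

conn = Connection(
    conn_id='hop_default',
    conn_type='http',
    host='hop-server',  # placeholder: a Hop Server reachable from the workers
    port=8080,          # placeholder port
    extra='{"hop_home": "/opt/apache/hop"}',  # this directory must exist on every worker
)

session = Session()
if not session.query(Connection).filter(Connection.conn_id == conn.conn_id).first():
    session.add(conn)
    session.commit()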

@piffall
It worked!
There are several details to get right.
Now I'm going to organize the settings.
I recently migrated from PDI to Hop.
Thanks