lyft / cartography

Cartography is a Python tool that consolidates infrastructure assets and the relationships between them in an intuitive graph view powered by a Neo4j database.

Home Page: https://lyft.github.io/cartography/

Unexpected query parameters error during AWS EMR sync on version 0.73.0

tmsteere opened this issue

Crash during AWS sync: During the EMR portion of the AWS sync, Cartography crashes with an unexpected query parameters error.

Description:
Cartography crashes during the EMR portion of the AWS sync (see the stack trace under Logs). Expected behavior: the sync runs to completion.

To Reproduce:
Sync AWS using a single account and default settings. The sync completes if EMR is excluded with --aws-requested-syncs.
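
For anyone applying the workaround, here is a minimal sketch of building a value for --aws-requested-syncs that leaves out EMR. It assumes the flag takes a comma-separated list of sync names. The names in `available_syncs` are an illustrative subset, and `syncs_without` is a hypothetical helper, not part of Cartography.

```python
from typing import Iterable, List, Set

# Illustrative subset of AWS sync names (assumption); the full list depends on
# the installed Cartography version.
available_syncs: List[str] = ["ec2:instance", "s3", "iam", "emr"]


def syncs_without(names: Iterable[str], excluded: Set[str]) -> str:
    """Hypothetical helper: join the sync names, skipping any in `excluded`."""
    return ",".join(name for name in names if name not in excluded)


# Value to pass to --aws-requested-syncs so that the EMR sync is skipped.
flag_value = syncs_without(available_syncs, {"emr"})
print(flag_value)
```

The order of the remaining names is preserved, which matters if some syncs are expected to run after others.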

Logs:
ERROR:cartography.sync:Unhandled exception during sync stage 'aws'
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/cartography/sync.py", line 85, in run
stage_func(neo4j_session, config)
File "/usr/local/lib/python3.10/dist-packages/cartography/util.py", line 133, in timed
return method(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/cartography/intel/aws/__init__.py", line 249, in start_aws_ingestion
sync_successful = _sync_multiple_accounts(
File "/usr/local/lib/python3.10/dist-packages/cartography/intel/aws/__init__.py", line 167, in _sync_multiple_accounts
_sync_one_account(
File "/usr/local/lib/python3.10/dist-packages/cartography/intel/aws/__init__.py", line 62, in _sync_one_account
RESOURCE_FUNCTIONS[func_name](**sync_args)
File "/usr/local/lib/python3.10/dist-packages/cartography/util.py", line 133, in timed
return method(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/cartography/intel/aws/emr.py", line 161, in sync
cleanup(neo4j_session, common_job_parameters)
File "/usr/local/lib/python3.10/dist-packages/cartography/util.py", line 133, in timed
return method(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/cartography/intel/aws/emr.py", line 137, in cleanup
cleanup_job = GraphJob.from_node_schema(EMRClusterSchema(), common_job_parameters)
File "/usr/local/lib/python3.10/dist-packages/cartography/graph/job.py", line 147, in from_node_schema
raise ValueError(
ValueError: Expected query params "{'AccountId', 'UPDATE_TAG', 'LIMIT_SIZE'}" but got "{'permission_relationships_file', 'UPDATE_TAG', 'AWS_ID', 'LIMIT_SIZE'}". Please check the value passed to parameters.
Traceback (most recent call last):
File "/usr/local/bin/cartography", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/cartography/cli.py", line 581, in main
sys.exit(CLI(default_sync, prog='cartography').main(argv))
File "/usr/local/lib/python3.10/dist-packages/cartography/cli.py", line 561, in main
return cartography.sync.run_with_config(self.sync, config)
File "/usr/local/lib/python3.10/dist-packages/cartography/sync.py", line 163, in run_with_config
return sync.run(neo4j_driver, config)
File "/usr/local/lib/python3.10/dist-packages/cartography/sync.py", line 85, in run
stage_func(neo4j_session, config)
File "/usr/local/lib/python3.10/dist-packages/cartography/util.py", line 133, in timed
return method(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/cartography/intel/aws/__init__.py", line 249, in start_aws_ingestion
sync_successful = _sync_multiple_accounts(
File "/usr/local/lib/python3.10/dist-packages/cartography/intel/aws/__init__.py", line 167, in _sync_multiple_accounts
_sync_one_account(
File "/usr/local/lib/python3.10/dist-packages/cartography/intel/aws/__init__.py", line 62, in _sync_one_account
RESOURCE_FUNCTIONS[func_name](**sync_args)
File "/usr/local/lib/python3.10/dist-packages/cartography/util.py", line 133, in timed
return method(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/cartography/intel/aws/emr.py", line 161, in sync
cleanup(neo4j_session, common_job_parameters)
File "/usr/local/lib/python3.10/dist-packages/cartography/util.py", line 133, in timed
return method(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/cartography/intel/aws/emr.py", line 137, in cleanup
cleanup_job = GraphJob.from_node_schema(EMRClusterSchema(), common_job_parameters)
File "/usr/local/lib/python3.10/dist-packages/cartography/graph/job.py", line 147, in from_node_schema
raise ValueError(
ValueError: Expected query params "{'AccountId', 'UPDATE_TAG', 'LIMIT_SIZE'}" but got "{'permission_relationships_file', 'UPDATE_TAG', 'AWS_ID', 'LIMIT_SIZE'}". Please check the value passed to parameters.

Please complete the following information:

  • Cartography release version: 0.73.0
  • Python version: 3.10.6
  • OS: Ubuntu 22.04

Thanks for reporting. Ugh, I think I know what's going on: the permission relationships sync does a not-ideal thing where its setting gets passed along via the job params, so the EMR cleanup job receives parameters it doesn't expect. Will fix, sorry. If this is blocking you, for now you can disable the EMR sync using --aws-requested-syncs.
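
To illustrate the mismatch in the ValueError, here is a minimal sketch that compares the two parameter sets from the error message. It assumes the check is a strict equality between expected and supplied parameter names, which is inferred from the message rather than confirmed. `validate_params` is a hypothetical stand-in for whatever GraphJob.from_node_schema does internally, and the parameter values are placeholders.

```python
from typing import Any, Dict, Set


def validate_params(expected: Set[str], supplied: Dict[str, Any]) -> None:
    """Hypothetical check: raise ValueError unless the supplied keys match exactly."""
    got = set(supplied)
    if got != expected:
        missing = sorted(expected - got)
        unexpected = sorted(got - expected)
        raise ValueError(f"missing={missing} unexpected={unexpected}")


# Parameter names copied from the error message; the values are placeholders.
expected = {"AccountId", "UPDATE_TAG", "LIMIT_SIZE"}
supplied = {
    "permission_relationships_file": "placeholder",
    "UPDATE_TAG": 0,
    "AWS_ID": "placeholder",
    "LIMIT_SIZE": 100,
}

try:
    validate_params(expected, supplied)
    message = ""
except ValueError as exc:
    message = str(exc)

print(message)
# prints: missing=['AccountId'] unexpected=['AWS_ID', 'permission_relationships_file']
```

Under that assumption there are two separate differences: the extra permission_relationships_file key, and the account ID arriving as AWS_ID where the cleanup query expects AccountId.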

@tmsteere - Should be fixed in 0.73.1, please write back if you see more issues.