[FR]: Re-write the docker job container code

hacktobeer opened this issue 10 months ago · comments

hacktobeer commented 10 months ago

What would this feature improve or what problem would it solve?

This would:

isolate Jobs in their own docker containers, minimizing dependency problems
minimize the worker image
improve maintainability and the release process
minimize plaso version issues with Timesketch
allow easy rollback of Job container versions when issues arise
make it easy for contributors to update Job code (they can just update their container)

What is the feature you are proposing?

Re-write the existing docker job code, which assumes a shared docker host. We currently run in K8s and docker compose, so the architecture needs to change slightly (e.g. working with a sidecar).

Re-write the docker code so that it no longer depends on already-pulled container images but pulls them on an ad hoc basis when needed. This will make it easier to update to new versions of job containers without having to update/restart the whole Turbinia setup.
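A minimal sketch of the pull-on-demand idea, not the actual Turbinia code: it shells out to the `docker image inspect` and `docker pull` CLI commands. The names `ensure_image`, `Runner` and `_run` are hypothetical, and the injectable `run` parameter is only there so the logic can be exercised without a docker daemon.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical type: a callable that runs a command and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command, discard its output, and return the exit code."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode


def ensure_image(image: str, run: Runner = _run) -> bool:
    """Pull `image` only if it is not available locally.

    Returns True if a pull was performed, False if the image was
    already present. Raises RuntimeError if the pull fails.
    """
    # `docker image inspect` exits non-zero when the image is missing.
    if run(["docker", "image", "inspect", image]) == 0:
        return False
    if run(["docker", "pull", image]) != 0:
        raise RuntimeError(f"could not pull image {image}")
    return True
```

A job would call `ensure_image("<job image>:<tag>")` right before starting its container, so a new tag could be picked up without restarting the worker.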

What alternatives have you considered?

None.

hacktobeer · Answer 1 · Thu Aug 24 2023 20:12:17 GMT+0800 (China Standard Time)

Observations from the docker code paths so far:

the docker code only checks whether an image is already present but does not pull it
the return object of the run/start container API call has changed, so the code no longer works
the worker pre-checks verify that the docker image actually has the program available; this would pull images at a phase where they are not needed. These checks should be removed when images are loaded ad hoc
artifact.py uses image_export.py, which is included with the plaso install bundle. If we want to get rid of these dependencies, we need to rewrite the FileArtifactExtractionTask to be docker enabled
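As an alternative to removing the pre-checks outright, they could be deferred until a job first needs its image. The sketch below illustrates that option only; it is not what the existing code does. `LazyDependencyCheck` and the two injected callables (`ensure_image`, `has_program`) are hypothetical names.

```python
from typing import Callable, Dict, Tuple


class LazyDependencyCheck:
    """Verify a program inside an image on first use, then cache the result.

    Nothing is pulled or checked at worker start-up; the work happens the
    first time a job asks for a given (image, program) pair.
    """

    def __init__(
        self,
        ensure_image: Callable[[str], object],
        has_program: Callable[[str, str], bool],
    ) -> None:
        # Both callables are assumptions: one makes the image available
        # locally, the other reports whether the program exists in it.
        self._ensure_image = ensure_image
        self._has_program = has_program
        self._verified: Dict[Tuple[str, str], bool] = {}

    def verify(self, image: str, program: str) -> bool:
        key = (image, program)
        if key not in self._verified:
            self._ensure_image(image)
            self._verified[key] = self._has_program(image, program)
        return self._verified[key]
```

With this shape the worker starts without touching any job image, and each image is pulled and checked at most once per (image, program) pair for the lifetime of the worker process.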

hacktobeer · Answer 2 · Thu Aug 24 2023 22:57:00 GMT+0800 (China Standard Time)

More observations:

utils.py uses image_export directly in the export_file/export_artifact functions