samply / transFAIR

TransFAIR

Introduction

TransFAIR is a specialized data integration tool for medical institutions. Instead of building their own ETL processes by hand, sites can use it for data integration tasks such as:

  • Extraction from source systems
  • Transformation into target schemata
  • Loading into target system

TransFAIR is designed to

  • minimize data integration effort for personnel at the sites, especially if connected to multiple networks
  • be easily extensible with new dataset/mapping definitions
  • thus accelerate and facilitate rollout of new features and dataset extensions
  • provide more consistent data quality: as long as the source data is sound, errors in TransFAIR's mappings can be fixed centrally

The tool focuses on use cases and IT systems encountered in networked medical research in Germany.

Profiles

TransFAIR is shipped with so-called ETL profiles. Currently, these are:

  • fhircopy - transfer the FHIR resources Organization, Condition, Observation and Specimen, as well as the Patients they reference, unchanged from one FHIR server to another. This can be used to perform filtering and/or pseudonymisation across servers.
  • bbmri2mii - load biosample information from a BBMRI-ERIC Bridgehead, transform it into the MII Core Dataset and load it into a target (e.g. a FHIR store with the MII Core Dataset).
  • mii2bbmri - load the MII Core Dataset (usually from a FHIR server/façade providing the MII Core Dataset), transform it into BBMRI-ERIC profiles and load it into a BBMRI-ERIC Bridgehead.
  • dicom2fhir - load data from a DICOM source, transform it into ImagingStudy resources and load them into a target FHIR store.
  • amr - load data from AMR (ECDC antimicrobial resistance) CSV files, transform it into Patient and Observation resources and load them into a target FHIR store.
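
The fhircopy description above implies that Patients are selected by reference rather than copied wholesale. The following is a minimal sketch of that selection step in plain Java. The `Resource` record and the `referencedPatientIds` method are hypothetical simplifications for illustration; they are not TransFAIR classes and do not use a FHIR library.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ReferencedPatients {

    // Hypothetical, simplified stand-in for a FHIR resource: its type, its id,
    // and the "Patient/<id>" reference it carries (null if it has none).
    record Resource(String type, String id, String patientReference) {}

    // Resource types named in the fhircopy profile description.
    static final Set<String> COPIED_TYPES =
            Set.of("Organization", "Condition", "Observation", "Specimen");

    // Collects the ids of all Patients referenced by resources of the copied types,
    // in first-seen order and without duplicates.
    static Set<String> referencedPatientIds(List<Resource> resources) {
        Set<String> ids = new LinkedHashSet<>();
        for (Resource r : resources) {
            if (!COPIED_TYPES.contains(r.type())) {
                continue; // resource type is not part of the copy
            }
            String ref = r.patientReference();
            if (ref != null && ref.startsWith("Patient/")) {
                ids.add(ref.substring("Patient/".length()));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Resource> resources = List.of(
                new Resource("Condition", "c1", "Patient/p1"),
                new Resource("Specimen", "s1", "Patient/p2"),
                new Resource("Observation", "o1", "Patient/p1"),
                new Resource("Organization", "org1", null));
        System.out.println(referencedPatientIds(resources)); // prints [p1, p2]
    }
}
```

Only the Patients whose ids end up in this set would need to be transferred alongside the other resources.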

Configuration

TransFAIR is configured using environment variables:

| Variable | Description | Default |
|---|---|---|
| FHIR_INPUT_URL | HTTP address of the SOURCE datastore | http://localhost:8080/fhir |
| FHIR_OUTPUT_URL | HTTP address of the TARGET datastore | http://localhost:8090/fhir |
| PROFILE | Identifier of the TransFAIR profile to execute (see Profiles) | mii2bbmri |
| IMGMETA_FROM_FHIR | Get DICOM metadata from the SOURCE datastore | true |
| IMGMETA_DICOM_WEB_URL | Get DICOM metadata from the specified DICOM web URL | |
| IMGMETA_DICOM_FILE_PATH | Get DICOM metadata from the specified DICOM file or directory | |
| AMR_FILE_PATH | Get AMR data from the specified directory | |
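
To illustrate how the documented defaults apply when a variable is unset, here is a minimal sketch in Java. The variable names and default values come from the table above; the `TransfairEnv` class, the `Settings` record and the `resolve` method are hypothetical and only show the fallback logic, not how TransFAIR itself reads its configuration.

```java
import java.util.Map;

public class TransfairEnv {

    // Hypothetical holder for the resolved settings; not a TransFAIR class.
    record Settings(String inputUrl, String outputUrl, String profile, boolean imgMetaFromFhir) {}

    // Resolves each setting from the given environment map, falling back to the
    // defaults listed in the configuration table when a variable is not set.
    static Settings resolve(Map<String, String> env) {
        return new Settings(
                env.getOrDefault("FHIR_INPUT_URL", "http://localhost:8080/fhir"),
                env.getOrDefault("FHIR_OUTPUT_URL", "http://localhost:8090/fhir"),
                env.getOrDefault("PROFILE", "mii2bbmri"),
                Boolean.parseBoolean(env.getOrDefault("IMGMETA_FROM_FHIR", "true")));
    }

    public static void main(String[] args) {
        // Prints the effective settings for the current process environment.
        System.out.println(resolve(System.getenv()));
    }
}
```

Passing the environment in as a map keeps the fallback logic easy to check without changing real environment variables.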

Set up a Development Environment

To set up a development environment, start two FHIR servers on localhost. Fill the first FHIR server with test data and run the batch job in your IDE, then check the second FHIR server to see how your data was transferred. The following configuration matches that setup:

fhir:
  source:
    url: http://localhost:8080/fhir # source store
  target:
    url: http://localhost:8090/fhir # target store
spring:
  profiles:
    active: mii2bbmri # profile to run
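
Before running the batch job, it can help to confirm that both local FHIR servers respond. The sketch below requests each server's capability statement, which FHIR servers expose at `[base]/metadata`, using the JDK's built-in HTTP client. The `FhirServerCheck` class and its methods are hypothetical helpers for illustration and are not part of TransFAIR; the two URLs are the ones from the configuration above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class FhirServerCheck {

    // Builds the URI of the capability statement for a FHIR base URL,
    // tolerating a trailing slash on the base.
    static URI metadataUri(String baseUrl) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(base + "/metadata");
    }

    // Returns true if the server answers the metadata request with HTTP 200,
    // and false on any other status, I/O error or interruption.
    static boolean isReachable(HttpClient client, String baseUrl) {
        HttpRequest request = HttpRequest.newBuilder(metadataUri(baseUrl))
                .timeout(Duration.ofSeconds(5))
                .header("Accept", "application/fhir+json")
                .GET()
                .build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return response.statusCode() == 200;
        } catch (IOException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        String[] servers = {"http://localhost:8080/fhir", "http://localhost:8090/fhir"};
        for (String url : servers) {
            System.out.println(url + " reachable: " + isReachable(client, url));
        }
    }
}
```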

Roadmap

🚧 This tool is still under intensive development. Features on the roadmap are:

  • Pseudonymisation
  • Double-check specimen type mappings
  • Smoking status mappings
  • Body weight mapping
  • BMI mappings
  • Fasting status mappings
  • Log unmappable codes
  • Implement Beacon as a target format
  • Additional code systems for diagnoses beyond ICD10GM
  • Decide on using allowlist/denylist
  • Decide on identifiers instead of IDs in references
  • Incremental transfer, dealing with overwriting
  • Integration tests
  • Scheduler
  • Support standards beyond HL7 FHIR, e.g. OMOP and other well-known SQL, CSV and XML schemata.

To find out if TransFAIR can serve your needs today, we recommend you contact us to become a pilot site.

License

This code is licensed under the Apache License 2.0. For details, please see the LICENSE file.
