Clinical-Genomics / cg

Glue between Clinical Genomics apps

Calculation of case sequencing QC slows down /orders

islean opened this issue

Description

The calculations done by the QualityControllerService are expensive, so calling its methods from the web services, e.g. CiGRID, adds considerable latency. As an example found in #3198, calculating the number of cases failing sequencing QC within orders roughly doubles the response time when rendering the orders page. Since this data is useful, we should find a cheaper way to provide it.

Suggested solution

The easiest solution would probably be to save this data on case and sample level in status db. We would need to refine when these values should be set. At the end of demultiplexing?

This can be closed when

Finding cases which fail sequencing QC can be done quickly.

I think the calculation is fine as long as we store the result.

We want to store it.

  • Should not be repeated in the /orders endpoint.
  • Done in a weird place: each time we check whether a case is ready to be started.

Conclusion

  • Extract a separate service, or update the existing one, and poll with a CLI command to perform sequencing QC.
  • Use a simple database filter for the stored sequencing QC result when checking which cases an analysis can be started for.
  • Same for the /orders endpoint, just use a simple database filter.
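
To illustrate the last two bullets, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for status db. The table layout, the column name sequencing_qc_status (borrowed from the implementation plan further down) and both function names are assumptions made for illustration, not the actual cg schema or API.

```python
import sqlite3

# Hypothetical, simplified stand-in for status db; the real tables differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cases ("
    "internal_id TEXT PRIMARY KEY, "
    "order_id INTEGER, "
    "sequencing_qc_status TEXT)"
)
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?)",
    [
        ("case_a", 1, "passed"),
        ("case_b", 1, "failed"),
        ("case_c", 2, "failed"),
        ("case_d", 2, "pending"),
    ],
)


def count_failed_cases_per_order(connection):
    """Count cases with a stored 'failed' status per order (for /orders)."""
    rows = connection.execute(
        "SELECT order_id, COUNT(*) FROM cases "
        "WHERE sequencing_qc_status = 'failed' "
        "GROUP BY order_id ORDER BY order_id"
    ).fetchall()
    return dict(rows)


def cases_passing_sequencing_qc(connection):
    """Return ids of cases whose stored sequencing QC result is 'passed'."""
    rows = connection.execute(
        "SELECT internal_id FROM cases "
        "WHERE sequencing_qc_status = 'passed' ORDER BY internal_id"
    ).fetchall()
    return [row[0] for row in rows]
```

Both lookups only read a stored value, so neither the /orders endpoint nor the check for startable cases has to repeat the expensive calculation.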

Questions

Q: How should the logic be triggered?
A: A regularly run CLI command which picks up any cases for which sequencing QC has not been done.

Q: Which model should store the sequencing QC result for the case?
A: The Sample model. <- This makes absolutely no sense. We can store the sample QC on the samples, but the case QC belongs on the case model.

Q: How should we handle externally sequenced samples?
A: Always pass? Check the code or ask @karlnyr.

@seallard
Externally sequenced samples would always pass as they do not have a requirement of target reads.

[image: filter for passing samples]

External samples would then be picked up by the above filter for passing samples, combined with the fact that when we run cg add external we set the case status to analyse, so the automation would pick it up.
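
The rule that externally sequenced samples always pass could be sketched roughly as below. This is only an illustration: the Sample dataclass, its field names and the plain "reads >= target_reads" comparison for internally sequenced samples are assumptions, and the real QC thresholds in cg may be computed differently.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical fields, not the actual cg Sample model.
    name: str
    reads: int
    target_reads: int
    is_external: bool = False


def sample_passes_sequencing_qc(sample: Sample) -> bool:
    """Return True if the sample passes sequencing QC."""
    if sample.is_external:
        # External samples have no target reads requirement, so they pass.
        return True
    # Assumed rule for internally sequenced samples: enough reads were produced.
    return sample.reads >= sample.target_reads
```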

Given that the sequencing QC is calculated per case (and can pass for individual samples but not on the case level) and the analyses are started per case (given that the sequencing QC passes), we should store the overall case sequencing QC on the case. We can also store the sample QC on the samples.

Implementation plan

  • Add sequencing_qc_status to the case model as an enum with the values pending, passed and failed (default is pending).
  • Add a new CLI command which picks up cases for which the QC needs to be done and updates the sequencing_qc_status.
  • Refactor the old logic to decouple the sequencing QC from the other logic and filter based on the stored sequencing_qc_status.
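
The three steps above might look roughly like this in plain Python. It is a sketch under stated assumptions, not the cg implementation: the Case dataclass stands in for the database model, update_pending_cases stands in for the body of the new CLI command, and qc_check is a placeholder for the expensive QualityControllerService calculation. All names except sequencing_qc_status and the three enum values are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List


class SequencingQCStatus(str, Enum):
    PENDING = "pending"
    PASSED = "passed"
    FAILED = "failed"


@dataclass
class Case:
    # Hypothetical stand-in for the case model in status db.
    internal_id: str
    sequencing_qc_status: SequencingQCStatus = SequencingQCStatus.PENDING


def update_pending_cases(
    cases: Iterable[Case], qc_check: Callable[[Case], bool]
) -> List[Case]:
    """Run the QC once for every pending case and store the result on it."""
    updated = []
    for case in cases:
        if case.sequencing_qc_status is not SequencingQCStatus.PENDING:
            continue  # Already evaluated, do not recalculate.
        passed = qc_check(case)
        case.sequencing_qc_status = (
            SequencingQCStatus.PASSED if passed else SequencingQCStatus.FAILED
        )
        updated.append(case)
    return updated


def cases_passing_qc(cases: Iterable[Case]) -> List[Case]:
    """Filter on the stored status only; no QC calculation happens here."""
    return [
        case
        for case in cases
        if case.sequencing_qc_status is SequencingQCStatus.PASSED
    ]
```

One open point this sketch glosses over is that a case still being sequenced would presumably need to stay pending rather than be marked failed, so a real qc_check would need a third outcome.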

This is very straightforward to implement. We just need to have a meeting to get everyone on the same page and to ensure the required PRs do not get blocked.

Rename field on case to ready_for_analysis.