understanding modified bases threshold setting in Dorado
DepledgeLab opened this issue
Issue Report
I'm using Dorado v0.7.0 for all-context m6A calling, but I am uncertain how to interpret the --modified-bases-threshold parameter.
--modified-bases-threshold the minimum predicted methylation probability for a modified base to be emitted in an all-context model, [0, 1] [default: 0.05]
Am I correct to interpret this as: Dorado will report a site to be m6A modified if the methylation probability for an individual nucleotide in a single read is 5% or higher? Why is this value set so low?
As a further question, is it possible to switch between DRACH and all-context calling using the rna004_130bps_sup@v5.0.0 model, or can DRACH only be achieved using rna004_130bps_sup@v3.0.1?
Hi @DepledgeLab,
You're not quite right, no. Dorado will emit the probability that a base is modified if that probability passes this threshold. If it is below the threshold, dorado is sufficiently confident that the base is not modified that it simply presumes it to be a canonical base and lists it as being skipped in the MM tag. To put it another way, dorado has to be 95% sure a base is unmodified before it will make the decision itself rather than leaving that level of filtering to the user.
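To make "leaving that level of filtering to the user" concrete, here is a rough sketch of reading the per-read calls back out of the MM/ML tags and applying your own cutoff. It is not Dorado code. It assumes the SAM tags convention that the ML value N stands for a probability in [N/256, (N+1)/256), and it only handles MM entries with a single modification code. The function name `parse_mm_ml`, the example tag values, and the 0.9 cutoff are all made up for illustration.

```python
def parse_mm_ml(mm, ml):
    """Return (base, mod_code, base_index, probability) for each emitted call.

    mm: MM tag value as a string, e.g. "A+a?,5,0,3;"
    ml: ML tag values as a list of ints in 0..255
    base_index counts occurrences of `base` in the read (0-based), not read positions.
    Assumption: one modification code per MM entry.
    """
    calls = []
    ml_pos = 0
    for entry in mm.rstrip(";").split(";"):
        if not entry:
            continue
        header, *deltas = entry.split(",")
        base = header[0]                  # canonical base, e.g. "A"
        code = header[2:].rstrip("?.")    # modification code, e.g. "a" for m6A
        idx = -1
        for delta in deltas:
            # Each delta is the number of `base` occurrences skipped
            # (i.e. not reported) before the next reported one.
            idx += int(delta) + 1
            # Use the midpoint of the bin that the ML byte represents.
            prob = (ml[ml_pos] + 0.5) / 256
            calls.append((base, code, idx, prob))
            ml_pos += 1
    return calls


# Hypothetical read: three reported A bases out of at least eleven.
calls = parse_mm_ml("A+a?,5,0,3;", [255, 12, 200])

# User-side filtering with a stricter, arbitrary cutoff.
confident = [c for c in calls if c[3] >= 0.9]
print(confident)
```

In this example the three reported bases are the A occurrences at indices 5, 6 and 10; the skipped ones are the bases that fell below the emission threshold (or were otherwise not reported), and only the first call survives the 0.9 cutoff.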
There is no DRACH model for v5, no.