nf-core / hic

Analysis of Chromosome Conformation Capture data (Hi-C)

Home Page: https://nf-co.re/hic

Using "--restriction_site" and "--ligation_site" outputs an error

abhratanju opened this issue · comments

Description of the bug

When an enzyme is not already defined in the Nextflow config, nf-core/hic suggests specifying --restriction_site and --ligation_site separately instead of using --digestion. For example, the new Arima version 2 kit is not in the nf-core/hic list of built-in enzymes.

Arima v2, restriction_sites: ^GATC, G^ANTC, C^TNAG, T^TAA
ligation_site='GATCGATC,GATCANTC,GANTGATC,GANTANTC,GATCTNAG,GANTTNAG,CTNAGATC,CTNAANTC,TTAGATC,TTAANTC,TTATTNAG,GATCTAA,GANTTAA,CTNATAA,TTATAA,CTNATNAG'
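
For reference, the relationship between the two lists can be sketched in a few lines of Python. This is only an illustration under an assumption that is not stated in this thread: each enzyme leaves a 5' overhang that is filled in before ligation, so every junction is the filled-in left half of one site followed by the right half of another. The helper names `split_site` and `ligation_sites` are hypothetical and are not part of nf-core/hic.

```python
from itertools import product


def split_site(site):
    """Split a site such as 'A^AGCTT' into its filled-in left and right halves."""
    cut = site.index("^")            # cut position on the top strand
    seq = site.replace("^", "")
    # Assumption: 5' overhang, filled in before ligation.
    left = seq[:len(seq) - cut]      # site minus its last `cut` bases
    right = seq[cut:]                # site minus its first `cut` bases
    return left, right


def ligation_sites(restriction_sites):
    """Combine every left half with every right half across all enzymes."""
    halves = [split_site(s.strip()) for s in restriction_sites.split(",")]
    lefts = [left for left, _ in halves]
    rights = [right for _, right in halves]
    return [left + right for left, right in product(lefts, rights)]


hindiii = ligation_sites("A^AGCTT")
arima = ligation_sites("^GATC,G^ANTC")
arima_v2 = ligation_sites("^GATC,G^ANTC,C^TNAG,T^TAA")

print(",".join(hindiii))  # AAGCTAGCTT
print(",".join(arima))    # GATCGATC,GATCANTC,GANTGATC,GANTANTC
print(len(arima_v2))      # 16
```

Under this rule the four Arima v2 sites give 16 motifs. The T^TAA followed by C^TNAG combination comes out as TTATNAG, so the TTATTNAG entry in the list above may be worth double-checking against the kit documentation.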

But this returns an error message. I found a workaround by adding a new enzyme "arima2" to my nextflow.config and using --digestion. However, this doesn't solve the problem in general: there can be many new enzymes, so the underlying issue needs to be fixed.

Command used and terminal output

No response

Relevant files

No response

System information

No response

I'm having a very similar issue:

I created a local config file and added my restriction enzyme to the list. However, when I passed the local config file with -c and added --digestion 'myenzyme' to my parameters when calling nextflow, I got the same error described above. Interestingly, it doesn't list 'myenzyme' as one of the options.

my_nextflow.config looks like this:


digest {
    'myenzyme' {
        restriction_site='^GATC'
        ligation_site='GATCGATC'
    }
    'hindiii' {
        restriction_site='A^AGCTT'
        ligation_site='AAGCTAGCTT'
    }
    'mboi' {
        restriction_site='^GATC'
        ligation_site='GATCGATC'
    }
    'dpnii' {
        restriction_site='^GATC'
        ligation_site='GATCGATC'
    }
    'arima' {
        restriction_site='^GATC,G^ANTC'
        ligation_site='GATCGATC,GATCANTC,GANTGATC,GANTANTC'
    }
}

The call looks something like this:

nextflow run nf-core/hic -r 2.0.0 -c <config.file.location>/my_nextflow.config --digestion 'myenzyme' . . .

And the error looks like this:


ERROR: Validation of pipeline parameters failed!


* --digestion: 'myenzyme' is not a valid choice (Available choices: hindiii, mboi, dpnii, arima)

If I specify --restriction_site='^GATC' --ligation_site='GATCGATC' in my nextflow call, then I get this:

Ligation motif not found. Please either use the `--digestion` parameters or specify the `--restriction_site` and `--ligation_site`. For DNase Hi-C, please use '--dnase' option

I spent a few hours troubleshooting this, and the cause of my error was that nextflow_schema.json hard-codes the available restriction enzymes.

In a future version, would it be possible to populate the digestion options in the JSON file from the contents of the config file?
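
In the meantime, one possible stopgap is to extend the allowed choices in a local copy of the schema. The Python sketch below illustrates the idea on a trimmed-down stand-in for the file. The nesting and the group name `digestion_options` are assumptions about the layout, not copied from the real nextflow_schema.json, so the helper searches for the `digestion` property instead of relying on a fixed path. The functions `find_property` and `add_enum_choices` are hypothetical helpers, and this is not the fix the maintainers applied.

```python
import json


def find_property(node, name):
    """Return the first object stored under a 'properties' key called `name`."""
    if isinstance(node, dict):
        props = node.get("properties")
        if isinstance(props, dict) and name in props:
            return props[name]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_property(child, name)
        if found is not None:
            return found
    return None


def add_enum_choices(schema, name, choices):
    """Append any missing choices to the enum of parameter `name`, in place."""
    prop = find_property(schema, name)
    if prop is None or "enum" not in prop:
        raise KeyError(f"no enum found for parameter {name!r}")
    for choice in choices:
        if choice not in prop["enum"]:
            prop["enum"].append(choice)
    return schema


# Hypothetical, trimmed-down stand-in for nextflow_schema.json.
schema = {
    "definitions": {
        "digestion_options": {  # assumed group name
            "properties": {
                "digestion": {
                    "type": "string",
                    "enum": ["hindiii", "mboi", "dpnii", "arima"],
                }
            }
        }
    }
}

add_enum_choices(schema, "digestion", ["myenzyme"])
print(json.dumps(find_property(schema, "digestion")["enum"]))
```

For a real run, the same function could be applied to the result of `json.load` on a local copy of the schema and written back with `json.dump`. Whether the pipeline then accepts the new value also depends on the enzyme being defined in the config that is actually loaded.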

Thanks so much for making this code available. Let me know if more information would be useful.

This issue has been resolved in the latest version.