minoda-lab / universc

UniverSC: a flexible cross-platform single-cell data processing pipeline

Home Page:https://genomec.gsc.riken.jp/gerg/UniverSC/UniverSC_app_release/

Errors with indrops-v2

shodais opened this issue · comments

I can run UniverSC, but the following error occurred.
I have 24 fastq files in total for each of read1, read2, index1, and index2.
So, for example, the 12th file is named "f1_S1_L012_R1_001.fastq".
How should I rename my fastq files?

#####Cell Ranger command#####
cellranger count --id=dmf1\
        --fastqs=input4cellranger_dmf1\
        --lanes=1,2,3,4,5,6,7,8,9,L010,L011,L012,L013,L014,L015,L016,L017,L018,L019,L020,L021,L022,L023,L024\
        --r1-length=26\
        --chemistry=SC3Pv2\
        --transcriptome=/share/Users/shodai_s_16/Danio.rerio_genome\
        --sample=f1\
        --description=dmf1\
        \
        --jobmode=local\
        --localcores=20\


##########
error: Invalid value for '--lanes <NUMS>...': invalid digit found in string

For more information try --help
cellranger run complete
***Notice: Cloupe file cannot be computed for indrop-v3
           Cloupe files generated by this pipeline are corrupt
           and cannot be read by the 10x Genomics Loupe Browser.
           We do not provide support for Cloupe files as this
           requires software from 10x Genomics subject to their
           End User License Agreement.
           Cloupe files are disabled in compliance with this.
updating .lock file
 no other jobs currently run by cellranger cellranger-6.1.1 in /home1/projects/ai-med/cellranger-6.1.1/cellranger
 no conflicts: whitelist can now be changed for other technologies
replacing modified barcodes with the original in the output gene barcode matrix
gzip: dmf1/outs/raw_feature_bc_matrix/barcodes.tsv.gz: No such file or directory
gzip: dmf1/outs/filtered_feature_bc_matrix/barcodes.tsv.gz: No such file or directory
sh: 1: cannot create dmf1/outs/raw_feature_bc_matrix/barcodes.tsv.gz: Directory nonexistent
sh: 1: cannot create dmf1/outs/filtered_feature_bc_matrix/barcodes.tsv.gz: Directory nonexistent
barcodes recovered

#####Conversion tool log#####
cellranger cellranger-6.1.1

Original barcode format: indrop-v3 (then converted to 10x)

cellranger runtime: 0s
##########

Hi shodais,

As the error message states, you have an unusual value for "lane" that is being rejected.
So if I understand it correctly, you have 24 fastq files including all R1/2 I1/2 files.
So I am going to assume this means you have 6 samples (4 files for each sample).

Can you try again after you rename each file to something like this?
f1_<S1, ..., S6>_L001_<R1, R2, I1, or I2>_001.fastq
In your case, this should be the simplest representation of your files.
See if this works for you.
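
The rejected value can be reproduced and normalised with a small shell sketch. It assumes the bcl2fastq-style naming used in this thread (`<sample>_S<n>_L<lane>_<read>_001.fastq`) and relies only on the error above, which shows that `--lanes` accepts plain integers and not tokens such as `L010`. The function names `lane_number` and `lanes_argument` are hypothetical and are not part of UniverSC or Cell Ranger.

```shell
# Hypothetical helpers, not part of UniverSC or Cell Ranger.

# Print the lane of a FASTQ file name as a plain integer,
# e.g. f1_S1_L012_R1_001.fastq -> 12. Fails if no lane token is found.
lane_number() {
    lane=$(printf '%s\n' "$1" |
        sed -n 's/.*_L0*\([0-9][0-9]*\)_[RI][12]_[0-9]*\.fastq.*$/\1/p')
    [ -n "$lane" ] || return 1
    printf '%s\n' "$lane"
}

# Build a comma-separated list of unique lane numbers from file names
# read on standard input (one name per line), sorted numerically.
lanes_argument() {
    while IFS= read -r name; do
        lane_number "$name"
    done | sort -n -u | paste -s -d, -
}
```

For example, `ls /path/to/fastqs | lanes_argument` would print a list such as `1,2,3` built from the file names found there (the path is a placeholder). Whether UniverSC itself handles lane numbers above 9 is a separate question that this sketch does not answer.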

By the way, I noticed that you are running cellranger 6.1.1. I strongly advise you to stick to version 3.0.2 or lower, because newer versions will be in conflict with cellranger licensing.

Kai Battenberg

Thank you for your answer.
I’ll try again after I rename my files and install cellranger 3.0.2.

Dear Kai Battenberg,

Errors occurred again.
I'm trying to re-analyze fastq files in indrops-v2 that I got from a paper I am interested in.
I've renamed the fastq files and installed cellranger-3.0.2 as you advised.
However, the following errors occurred.

I'm worried about these errors:

  • f1_S1_L001_R2_001.fastq was recognized as INPUT(R1), and f1_S1_L001_R1_001.fastq was recognized as INPUT(R2).
  • How should I fix this error, /home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo ... barcode and UMI linker removed for indrop-v2: command not found ? What should I install before running UniverSC?
  • From this error, [error] You selected chemistry 'SC3Pv2', which expects the cell barcode sequence in read R1. In input data, an extremely low rate of correct barcodes was observed for this chemistry (0.00 %)., should I set "chemistry"?

Sincerely,
Shodai Suzuki

Running launch_universc.sh in '/home1/shodai_s_16/universc'
UniverSC Copyright (C) 2019 Tom Kelly; Kai Battenberg
This program comes with ABSOLUTELY NO WARRANTY; for details type 'cat LICENSE'. This is free software, and you are welcome to redistribute it under certain conditions; type 'cat LICENSE' for details.
Cell Ranger is called as third-party dependency and is not maintained by this project. Please ensure you comply with the End User License Agreement for all software installed where applicable; for details type 'cat README.md'.
***Note: Make sure that samples are demultiplexed prior to running launch_universc.sh***
    /share/Users/shodai_s_16/f2/f2_S1_L001_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S1_L002_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S1_L003_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S1_L004_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S1_L005_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S2_L001_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S2_L002_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S2_L003_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S2_L004_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S2_L005_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S3_L001_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S3_L002_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S3_L003_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S3_L004_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S3_L005_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S4_L001_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S4_L002_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S4_L003_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S4_L004_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S4_L005_R1_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S1_L001_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S1_L002_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S1_L003_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S1_L004_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S1_L005_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S2_L001_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S2_L002_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S2_L003_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S2_L004_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S2_L005_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S3_L001_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S3_L002_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S3_L003_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S3_L004_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S3_L005_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S4_L001_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S4_L002_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S4_L003_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S4_L004_R2_001.fastq file found
    /share/Users/shodai_s_16/f2/f2_S4_L005_R2_001.fastq file found
***WARNING: technology is set to indrop-v2. barcodes on Read 2 will be used***
***WARNING: whitelist for indrop-v2 is modified from the original barcodes (https://github.com/indrops/indrops/tree/master/ref/barcode_lists), first 8 bp of list1 and list2 are joind to generate a 16 bp barcode***
Using 10x version 2 chemistry to support UMIs
***WARNING: conversion was turned on because directory input4cellranger_dmf2 was not found***
 checking if UniverSC is running already
  checking .lock file
  call accepted: no other cellranger jobs running

#####Input information#####
SETUP and exit: false
FORMAT: indrop-v2
BARCODES: /home1/shodai_s_16/universc/whitelists/inDrop-v2_barcodes.txt
INPUT(R1):
 /share/Users/shodai_s_16/f2/f2_S1_L001_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S1_L002_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S1_L003_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S1_L004_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S1_L005_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S2_L001_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S2_L002_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S2_L003_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S2_L004_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S2_L005_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S3_L001_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S3_L002_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S3_L003_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S3_L004_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S3_L005_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S4_L001_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S4_L002_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S4_L003_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S4_L004_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S4_L005_R1_001.fastq
INPUT(R2):
 /share/Users/shodai_s_16/f2/f2_S1_L001_R1_001.fastq
 /share/Users/shodai_s_16/f2/f2_S1_L002_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S1_L003_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S1_L004_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S1_L005_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S2_L001_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S2_L002_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S2_L003_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S2_L004_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S2_L005_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S3_L001_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S3_L002_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S3_L003_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S3_L004_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S3_L005_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S4_L001_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S4_L002_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S4_L003_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S4_L004_R2_001.fastq
 /share/Users/shodai_s_16/f2/f2_S4_L005_R2_001.fastq
SAMPLE: f2
LANE: 1,2,3,4,5
ID: dmf2
DESCRIPTION: dmf2
***WARNING: no description given, setting to ID value***
REFERENCE: /share/Users/shodai_s_16/D.rerio_genome-3.0.2
NCELLS: (no cell number given)
CHEMISTRY: SC3Pv2
JOBMODE: local
***WARNING: --jobmode "sge" is recommended if running script with qsub***
CONVERSION: true
##########

creating a folder for all Cell Ranger input files ...
 directory input4cellranger_dmf2 created for converted files
moving file to new location
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S1_L001_R2_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S1_L002_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S1_L003_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S1_L004_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S1_L005_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S2_L001_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S2_L002_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S2_L003_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S2_L004_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S2_L005_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S3_L001_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S3_L002_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S3_L003_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S3_L004_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S3_L005_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S4_L001_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S4_L002_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S4_L003_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S4_L004_R1_001.fastq ...
using transcripts in Read 2 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S4_L005_R1_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S1_L001_R1_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S1_L002_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S1_L003_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S1_L004_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S1_L005_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S2_L001_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S2_L002_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S2_L003_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S2_L004_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S2_L005_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S3_L001_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S3_L002_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S3_L003_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S3_L004_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S3_L005_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S4_L001_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S4_L002_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S4_L003_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S4_L004_R2_001.fastq ...
using transcripts in Read 1 for indrop-v2
 handling /share/Users/shodai_s_16/f2/f2_S4_L005_R2_001.fastq ...
converting input files to confer cellranger format ...
 adjustment parameters:
  barcodes: 3 bp at its head
  UMIs: -4 bp at its tail
 making technology-specific modifications ...
  ... remove adapter for indrop-v2
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo  ... barcode and UMI linker removed for indrop-v2: command not found
 adjusting barcodes of R1 files
 handling input4cellranger_dmf2/f2_S1_L001_R1_001.fastq ...
  input4cellranger_dmf2/f2_S1_L001_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S1_L002_R1_001.fastq ...
  input4cellranger_dmf2/f2_S1_L002_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S1_L003_R1_001.fastq ...
  input4cellranger_dmf2/f2_S1_L003_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S1_L004_R1_001.fastq ...
  input4cellranger_dmf2/f2_S1_L004_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S1_L005_R1_001.fastq ...
  input4cellranger_dmf2/f2_S1_L005_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S2_L001_R1_001.fastq ...
  input4cellranger_dmf2/f2_S2_L001_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S2_L002_R1_001.fastq ...
  input4cellranger_dmf2/f2_S2_L002_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S2_L003_R1_001.fastq ...
  input4cellranger_dmf2/f2_S2_L003_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S2_L004_R1_001.fastq ...
  input4cellranger_dmf2/f2_S2_L004_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S2_L005_R1_001.fastq ...
  input4cellranger_dmf2/f2_S2_L005_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S3_L001_R1_001.fastq ...
  input4cellranger_dmf2/f2_S3_L001_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S3_L002_R1_001.fastq ...
  input4cellranger_dmf2/f2_S3_L002_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S3_L003_R1_001.fastq ...
  input4cellranger_dmf2/f2_S3_L003_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S3_L004_R1_001.fastq ...
  input4cellranger_dmf2/f2_S3_L004_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S3_L005_R1_001.fastq ...
  input4cellranger_dmf2/f2_S3_L005_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S4_L001_R1_001.fastq ...
  input4cellranger_dmf2/f2_S4_L001_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S4_L002_R1_001.fastq ...
  input4cellranger_dmf2/f2_S4_L002_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S4_L003_R1_001.fastq ...
  input4cellranger_dmf2/f2_S4_L003_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S4_L004_R1_001.fastq ...
  input4cellranger_dmf2/f2_S4_L004_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S4_L005_R1_001.fastq ...
  input4cellranger_dmf2/f2_S4_L005_R1_001.fastq adjusted
 adjusting UMIs of R1 files
 handling input4cellranger_dmf2/f2_S1_L001_R1_001.fastq ...
  input4cellranger_dmf2/f2_S1_L001_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S1_L002_R1_001.fastq ...
  input4cellranger_dmf2/f2_S1_L002_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S1_L003_R1_001.fastq ...
  input4cellranger_dmf2/f2_S1_L003_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S1_L004_R1_001.fastq ...
  input4cellranger_dmf2/f2_S1_L004_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S1_L005_R1_001.fastq ...
  input4cellranger_dmf2/f2_S1_L005_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S2_L001_R1_001.fastq ...
  input4cellranger_dmf2/f2_S2_L001_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S2_L002_R1_001.fastq ...
  input4cellranger_dmf2/f2_S2_L002_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S2_L003_R1_001.fastq ...
  input4cellranger_dmf2/f2_S2_L003_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S2_L004_R1_001.fastq ...
  input4cellranger_dmf2/f2_S2_L004_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S2_L005_R1_001.fastq ...
  input4cellranger_dmf2/f2_S2_L005_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S3_L001_R1_001.fastq ...
  input4cellranger_dmf2/f2_S3_L001_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S3_L002_R1_001.fastq ...
  input4cellranger_dmf2/f2_S3_L002_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S3_L003_R1_001.fastq ...
  input4cellranger_dmf2/f2_S3_L003_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S3_L004_R1_001.fastq ...
  input4cellranger_dmf2/f2_S3_L004_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S3_L005_R1_001.fastq ...
  input4cellranger_dmf2/f2_S3_L005_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S4_L001_R1_001.fastq ...
  input4cellranger_dmf2/f2_S4_L001_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S4_L002_R1_001.fastq ...
  input4cellranger_dmf2/f2_S4_L002_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S4_L003_R1_001.fastq ...
  input4cellranger_dmf2/f2_S4_L003_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S4_L004_R1_001.fastq ...
  input4cellranger_dmf2/f2_S4_L004_R1_001.fastq adjusted
 handling input4cellranger_dmf2/f2_S4_L005_R1_001.fastq ...
  input4cellranger_dmf2/f2_S4_L005_R1_001.fastq adjusted
running Cell Ranger ...

#####Cell Ranger command#####
cellranger count --id=dmf2\
        --fastqs=input4cellranger_dmf2\
        --lanes=1,2,3,4,5\
        --r1-length=26\
        --chemistry=SC3Pv2\
        --transcriptome=/share/Users/shodai_s_16/D.rerio_genome-3.0.2\
        --sample=f2\
        --description=dmf2\
        \
        --jobmode=local\
        --localcores=20\


##########
/home1/projects/ai-med/cellranger-3.0.2/cellranger-cs/3.0.2/bin
cellranger count (3.0.2)
Copyright (c) 2019 10x Genomics, Inc.  All rights reserved.
-------------------------------------------------------------------------------

Martian Runtime - '3.0.2-v3.2.0'
2021-11-10 10:56:15 [jobmngr] WARNING: configured to use 83GB of local memory, but only 61.0GB is currently available.
Serving UI at http://cpu01:34709?auth=t4Ft9vsko2lew-6LwpqSv5gfiAe0MEEXAGvD1Gzqicc

Running preflight checks (please wait)...
2021-11-10 10:56:16 [runtime] (ready)           ID.dmf2.SC_RNA_COUNTER_CS.EXPAND_SAMPLE_DEF
2021-11-10 10:56:16 [runtime] (run:local)       ID.dmf2.SC_RNA_COUNTER_CS.EXPAND_SAMPLE_DEF.fork0.chnk0.main
2021-11-10 10:56:22 [runtime] (chunks_complete) ID.dmf2.SC_RNA_COUNTER_CS.EXPAND_SAMPLE_DEF
Checking sample info...
Checking FASTQ folder...
Checking reference...
Checking reference_path (/share/Users/shodai_s_16/D.rerio_genome-3.0.2) on cpu01...
Checking chemistry...
Checking read 1 length...
Checking optional arguments...
mrc: '3.0.2-v3.2.0'

mrp: '3.0.2-v3.2.0'

Anaconda: Python 2.7.14 :: Anaconda, Inc.

numpy: 1.14.2

scipy: 1.0.1

pysam: 0.14.1

h5py: 2.8.0

pandas: 0.22.0

STAR: STAR_2.5.1b

samtools: samtools 1.7
Using htslib 1.7
Copyright (C) 2018 Genome Research Ltd.

2021-11-10 10:56:24 [runtime] (ready)           ID.dmf2.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHEMISTRY_DETECTOR.DETECT_CHEMISTRY
2021-11-10 10:56:24 [runtime] (run:local)       ID.dmf2.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHEMISTRY_DETECTOR.DETECT_CHEMISTRY.fork0.chnk0.main
2021-11-10 10:56:24 [runtime] (ready)           ID.dmf2.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.DISABLE_FEATURE_STAGES
2021-11-10 10:56:24 [runtime] (run:local)       ID.dmf2.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.DISABLE_FEATURE_STAGES.fork0.chnk0.main
2021-11-10 10:56:24 [runtime] (ready)           ID.dmf2.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.SC_RNA_ANALYZER.CHOOSE_DIMENSION_REDUCTION
2021-11-10 10:56:24 [runtime] (run:local)       ID.dmf2.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.SC_RNA_ANALYZER.CHOOSE_DIMENSION_REDUCTION.fork0.chnk0.main
2021-11-10 10:56:24 [runtime] (chunks_complete) ID.dmf2.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.DISABLE_FEATURE_STAGES
2021-11-10 10:56:24 [runtime] (chunks_complete) ID.dmf2.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.SC_RNA_ANALYZER.CHOOSE_DIMENSION_REDUCTION
2021-11-10 10:56:24 [runtime] (failed)          ID.dmf2.SC_RNA_COUNTER_CS.SC_RNA_COUNTER.CHEMISTRY_DETECTOR.DETECT_CHEMISTRY

[error] You selected chemistry 'SC3Pv2', which expects the cell barcode sequence in read R1.
In input data, an extremely low rate of correct barcodes was observed for this chemistry (0.00 %).
Please check your input data and chemistry selection. Note: manual chemistry detection is not required in most cases.
Input: {'lanes': [u'1', u'2', u'3', u'4', u'5'], 'sample_names': [u'f2'], 'sample_indices': None, 'fastq_mode': u'ILMN_BCL2FASTQ', 'read_path': u'/share/Users/shodai_s_16/input4cellranger_dmf2', 'interleaved': False}

2021-11-10 10:56:24 Shutting down.
Waiting 6 seconds for UI to do final refresh.
Saving pipestance info to dmf2/dmf2.mri.tgz
For assistance, upload this file to 10x Genomics by running:

cellranger upload <your_email> dmf2/dmf2.mri.tgz

cellranger run complete
***Notice: Cloupe file cannot be computed for indrop-v2
           Cloupe files generated by this pipeline are corrupt
           and cannot be read by the 10x Genomics Loupe Browser.
           We do not provide support for Cloupe files as this
           requires software from 10x Genomics subject to their
           End User License Agreement.
           Cloupe files are disabled in compliance with this.
updating .lock file
 no other jobs currently run by cellranger 3.0.2 in /home1/projects/ai-med/cellranger-3.0.2/cellranger
 no conflicts: whitelist can now be changed for other technologies
replacing modified barcodes with the original in the output gene barcode matrix
gzip: dmf2/outs/raw_feature_bc_matrix/barcodes.tsv.gz: No such file or directory
gzip: dmf2/outs/filtered_feature_bc_matrix/barcodes.tsv.gz: No such file or directory
sh: 1: cannot create dmf2/outs/raw_feature_bc_matrix/barcodes.tsv.gz: Directory nonexistent
sh: 1: cannot create dmf2/outs/filtered_feature_bc_matrix/barcodes.tsv.gz: Directory nonexistent
barcodes recovered

#####Conversion tool log#####
cellranger 3.0.2

Original barcode format: indrop-v2 (then converted to 10x)

cellranger runtime: 13s
##########

Hi Shodai,

With regard to this error message:
"/home1/shodai_s_16/universc/launch_universc.sh: line 2925: echo ... barcode and UMI linker removed for indrop-v2: command not found"
This is a typo on our end, and we can correct it.
It does not affect your results. Thank you for noticing it.
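
The shape of the message suggests that the shell looked up the whole string, including `echo`, as a single command name. The sketch below reproduces that class of error. It is an assumption about the cause, since line 2925 of `launch_universc.sh` is not shown in this thread, and both function names are hypothetical.

```shell
# Hypothetical reproduction; line 2925 of launch_universc.sh is not shown here.

# Quoting the command together with its argument makes the shell search for
# a program literally named "echo  ... done", which does not exist.
quoted_echo_status() {
    ( "echo  ... done" ) 2>/dev/null
    printf 'exit status: %s\n' "$?"
}

# The intended form keeps echo outside the quotes.
plain_echo() {
    echo " ... done"
}
```

Status 127 is the shell's code for a command that was not found. If the failing line only prints a progress message, as the text of the error suggests, the conversion steps around it would be unaffected.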

Regarding the chemistry setting:
You can set a different chemistry, but this is likely not the cause.
Looking at the output, it's clear to me that (A) the whitelist barcodes and (B) the sequences that occur in the first 16 bases of the R1 files under "input4cellranger_dmf2" do not match.

So can you share 4 files with us? (small portions of each should be fine)

  1. The original R2 file.
  2. The R1 file under input4cellranger_dmf2 that corresponds to R2.
  3. The original barcode file you selected.
  4. The barcode file used that is under "cellranger-cs/3.0.2/lib/python/cellranger/barcodes/737K-august-2016.txt"

I think this would be enough for us to figure out what the issue may be.
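
The comparison described above (whitelist barcodes versus the first 16 bases of the converted R1 reads) can be checked locally with a sketch like the following. It assumes an uncompressed FASTQ file with four lines per record and a whitelist with one barcode per line; the function name `barcode_match_rate` is hypothetical and not part of UniverSC or Cell Ranger.

```shell
# Hypothetical helper, not part of UniverSC or Cell Ranger.
# $1: whitelist, one barcode per line (must not be empty)
# $2: uncompressed FASTQ file, four lines per record
# $3: barcode length in bases (defaults to 16)
# Prints "<reads whose prefix is in the whitelist>/<total reads>".
barcode_match_rate() {
    awk -v len="${3:-16}" '
        NR == FNR { ok[$1] = 1; next }      # first file: load the whitelist
        FNR % 4 == 2 {                      # second file: sequence lines only
            total++
            if (substr($0, 1, len) in ok) hit++
        }
        END { printf "%d/%d\n", hit, total }
    ' "$1" "$2"
}
```

A call such as `barcode_match_rate 737K-august-2016.txt input4cellranger_dmf2/f2_S1_L001_R1_001.fastq` would print the matching fraction for one converted file. A result close to zero matches would be consistent with the 0.00 % rate reported by Cell Ranger.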

Thank you for bringing this to our attention.

Kai Battenberg

Dear Kai Battenberg,

Thank you for your time.
Small portions of each file you mentioned are as follows:

  1. The original R2 files
    @SRR6176634.1 NS500422:399:HFH3MBGXY:1:11101:9214:1056_CAGATC length=50
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    +SRR6176634.1 NS500422:399:HFH3MBGXY:1:11101:9214:1056_CAGATC length=50
    ##################################################
    @SRR6176634.2 NS500422:399:HFH3MBGXY:1:11101:21209:1068_CAGAAC length=50
    NCCATNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
    +SRR6176634.2 NS500422:399:HFH3MBGXY:1:11101:21209:1068_CAGAAC length=50
    #AAAA#############################################
    @SRR6176634.3 NS500422:399:HFH3MBGXY:1:11101:13722:1075_CAGCTC length=50
    NGATATTGCCNGAGCNNGTGNATGTNACGCCTTNGGTTGNNNNNNNNNNN
    +SRR6176634.3 NS500422:399:HFH3MBGXY:1:11101:13722:1075_CAGCTC length=50
    #A////EE//#E/EE##//E#//A/#/<<///6#EEA/E###########
    ...

  2. The R1 file under input4cellranger_dmf2 that corresponds to R2.
    There are R1 files under input4cellranger_dmf2, but there is no description in any of the R1 files.

  3. The original barcode file you selected.
    I did not select a barcode file. The command I ran is as follows:

bash /home1/shodai_s_16/universc/launch_universc.sh -t "indrops-v2" --localcores 20 --id dmf2 --reference "/share/Users/shodai_s_16/D.rerio_genome-3.0.2" --read1 "/share/Users/shodai_s_16/f2/f2_S1_L001_R1_001.fastq" "/share/Users/shodai_s_16/f2/f2_S1_L002_R1_001.fastq"...

Should I set barcode files?

  4. The barcode file used that is under "cellranger-cs/3.0.2/lib/python/cellranger/barcodes/737K-august-2016.txt"

AAACAAACAAACAAAC
AAACACGGAAACACGG
AAACACTAAAACACTA
AAACCGCCAAACCGCC
AAACGATCAAACGATC
AAACGTGAAAACGTGA
AAACTACAAAACTACA
AAACTGTGAAACTGTG
AAAGAAAGAAAGAAAG
AAAGAGGCAAAGAGGC
AAAGCCCGAAAGCCCG
...

Best regards,
Shodai Suzuki

Dear Suzuki-san,

Thanks for contacting us. I am pleased to hear that you are interested in applying our software to an interesting problem. Allow me to clarify a few points in case it helps.

You appear to be very resourceful, having installed the correct version and set up a custom reference for Zebrafish. Best of luck troubleshooting this issue. I hope it is a misunderstanding that can be resolved. If necessary, we can make changes to the source code in a future release to avoid this.

  1. UniverSC was originally designed to run on raw in-house data sequenced by yourself.
    For this reason, some problems may arise from using published data. Based on your sample headers, it appears you are using data from the NCBI SRA database: https://www.ncbi.nlm.nih.gov/sra/?term=SRR6176634
    It is possible to run our tool on published data, as demonstrated in the test data. See these scripts for examples.
    https://github.com/minoda-lab/universc/blob/master/test/shared/smartseq3-test/test_auto_index.sh
    https://github.com/minoda-lab/universc/blob/master/test/shared/dropseq-test/prepare_files.sh
    The sequence in your sample headers appears to be the "sample index" rather than the cell barcode, so extracting it is not necessary if your samples are already correctly demultiplexed.
    Please ensure that you have paired-end data from SRA using fastq-dump --split-files <ID>.
    https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=toolkit_doc&f=fastq-dump

  2. I've tested the pipeline on inDrops v3 and made changes for version 2 as documented here:
    https://github.com/indrops/indrops
    https://teichlab.github.io/scg_lib_structs/methods_html/inDrop.html
    You can find the test data I used here:
    https://github.com/minoda-lab/universc/tree/master/test/shared/indrop-v3-test
    To clarify, inDrops v2 only has 2 files R1 and R2. There is no need to rename R1 and R2 to match Cell Ranger, please use the raw R1 and R2 as input without switching their names.

  3. The barcodes are automatically changed before running Cell Ranger. I've packaged them with the pipeline. What Kai is checking is whether this occurred correctly.
    For inDrops v2 the barcode is generated in combinations of pairs from this file:
    https://github.com/indrops/indrops/blob/master/ref/barcode_lists/gel_barcode2_list.txt
    Your barcode configuration therefore appears to be correct.

  4. Since no sequencing technology to my knowledge produces more than 9 lanes, the lane parameter is hard-coded to a single digit in UniverSC. Please ensure all lane numbers are given in the file name as "L00N" with 2 leading zeroes. It may be possible to use "L00NN" for 2-digit input but I am unsure of this. It may be necessary to concatenate your fastq files before running UniverSC:
    cat sample_S1_L001_R1_001.fastq sample_S1_L001_R2_001.fastq sample_S1_L001_R3_001.fastq > fulldata_S1_L001_R1_001.fastq
    Assuming your samples contain unique indexes, the results should be identical.
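As a rough sketch of the concatenation step in point 4 (the helper names and file names below are hypothetical and not part of UniverSC), several FASTQ files for the same read could be merged into one file whose name carries a single-digit lane:

```python
import shutil
from pathlib import Path


def lane_file_name(sample, read, lane=1):
    """Build <sample>_S1_L00N_<read>_001.fastq for a single-digit lane N."""
    if not 1 <= lane <= 9:
        raise ValueError("lane must be a single digit (1-9)")
    return f"{sample}_S1_L00{lane}_{read}_001.fastq"


def concatenate_fastq(inputs, output):
    """Append each input FASTQ to the output file, in the order given."""
    with open(output, "wb") as merged:
        for path in inputs:
            with open(path, "rb") as part:
                shutil.copyfileobj(part, merged)
    return Path(output)
```

The same input order would need to be used for the R1 files and for the R2 files so that read pairs stay aligned in the merged outputs.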

The error you are seeing above from Cell Ranger can occur for a variety of reasons. The chemistry parameter is set automatically by UniverSC for this technology so there is no need to change it. I suspect the problem is either:

  • the barcode conversion (in this case yours has been updated correctly)
  • the input files themselves are not what is expected of the program (please check)
  • the cell barcode for this technology is not where we expected it to be (in which case we shall need to reconfigure it)

Suzuki-san, if you can provide us with the first few lines of the other input file you are using it would help to identify the problem.

Kai, if you wish to reproduce the issue, we have a premade Danio rerio (Zebrafish) reference on our RIKEN servers and the raw files are available on NCBI:
https://trace.ncbi.nlm.nih.gov/Traces/sra/?run=SRR6176634

Thank you,

Tom Kelly

Dear Tom Kelly,
I appreciate your support.

The sequence in your sample headers appears to be the "sample index" rather than the cell barcode so extracting these are not necessary if your samples are already correctly demultiplexed.

You mean R1 includes transcripts and sample index, while R2 has cell barcodes and UMI, right?

As you mention, I got the fastq files through the SRA Toolkit.
I used fasterq-dump, which splits the files automatically, not fastq-dump. I will try again with fastq-dump and the --split-files option.

Suzuki-san, if you can provide us with the first few lines of the other input file you are using it would help to identify the problem.

Yes, SRR6176610 fastq files (also produced by indrops-v2) are as below:

SRR6176610_1.fastq

(base) shoudai@ssuzukiMacBookPro ~ % head -n 48 /Users/shoudai/sratoolkit/fastqfiles/files/SRR6176610_1.fastq
@SRR6176610.1 NS500422:399:HFH3MBGXY:1:11101:6280:1051_ATCACG length=33
CTGGANTCAGAGTACTTTAGTGTGGAAAAGAGA
+SRR6176610.1 NS500422:399:HFH3MBGXY:1:11101:6280:1051_ATCACG length=33
AAAA/#E/E//AAA<AEE//AEEE//<A/AEE/
@SRR6176610.2 NS500422:399:HFH3MBGXY:1:11101:3241:1086_ATCACG length=34
ACGTTCGAAGGGTTTTTCGGAGTTGATGTTCGAC
+SRR6176610.2 NS500422:399:HFH3MBGXY:1:11101:3241:1086_ATCACG length=34
AAAAAA//E<EEE//EEE6AEEEA/AEE/EEE//
@SRR6176610.3 NS500422:399:HFH3MBGXY:1:11101:26605:1087_ATCCCG length=34
CGCGCCATCTCACTCATTCCATCTCACTCCTTTC
+SRR6176610.3 NS500422:399:HFH3MBGXY:1:11101:26605:1087_ATCCCG length=34
/6A////AA/A//A//AA/<AEA6//////EAE/
@SRR6176610.4 NS500422:399:HFH3MBGXY:1:11101:11785:1094_ATCACG length=34
TTCCACTTCTCCTTTGTGTTCAACAGAAGAAACC
+SRR6176610.4 NS500422:399:HFH3MBGXY:1:11101:11785:1094_ATCACG length=34
</AAAEAEEEEAAE//EEE/E/AAE/EAE/EAEE
@SRR6176610.5 NS500422:399:HFH3MBGXY:1:11101:24551:1095_ATCACG length=34
CCTTTTTACCTCCAACCCACAGTTGTATTTACCG
+SRR6176610.5 NS500422:399:HFH3MBGXY:1:11101:24551:1095_ATCACG length=34
AA//AEE/EEE<<E/AEE/A/A</E//EA/E/<E
@SRR6176610.6 NS500422:399:HFH3MBGXY:1:11101:21780:1095_AACACG length=34
TGGCTTTATTTAGATTATTGGTGAGTTTTTACAC
+SRR6176610.6 NS500422:399:HFH3MBGXY:1:11101:21780:1095_AACACG length=34
AAAAAEEEEEEEEAEEEE<<E/EEEEEEEEEE/E
@SRR6176610.7 NS500422:399:HFH3MBGXY:1:11101:26662:1101_ATCCCG length=34
GTTGCTTGAATAGACAATGAATTGTATTACGAAT
+SRR6176610.7 NS500422:399:HFH3MBGXY:1:11101:26662:1101_ATCCCG length=34
AAA/AEA/EEE<AEEEEA/EEEE6EEEEEAEEEE
@SRR6176610.8 NS500422:399:HFH3MBGXY:1:11101:3292:1102_ATCACG length=33
CCTGTGACCTTGGTTGACTTGCTCAAATGCTTT
+SRR6176610.8 NS500422:399:HFH3MBGXY:1:11101:3292:1102_ATCACG length=33
AAA/A/AEAEEE<EEE/EEEEEEEEEEAEEEEE
@SRR6176610.9 NS500422:399:HFH3MBGXY:1:11101:20496:1104_ATCACG length=34
AGTCTTAGAAAAGGAAATGTAGGTGAACAAAGTC
+SRR6176610.9 NS500422:399:HFH3MBGXY:1:11101:20496:1104_ATCACG length=34
AAAAAEE/AAEEEEAEEAEA/EEEEEEEEEEEEE
@SRR6176610.10 NS500422:399:HFH3MBGXY:1:11101:7512:1108_ATCACG length=33
TCCCAAGTAATGAAGGACACAATCTTCATCTAA
+SRR6176610.10 NS500422:399:HFH3MBGXY:1:11101:7512:1108_ATCACG length=33
AAAAAA/EEEEE//EEEEEEE6EEEEE/AEEEE
@SRR6176610.11 NS500422:399:HFH3MBGXY:1:11101:20905:1117_ATCACG length=33
AACAGATCTAATTATTCATTAAAAATCATTAAA
+SRR6176610.11 NS500422:399:HFH3MBGXY:1:11101:20905:1117_ATCACG length=33
AA/A//AEE/AEE6EEA66///A/<E/6/EA//
@SRR6176610.12 NS500422:399:HFH3MBGXY:1:11101:14210:1127_ATCACG length=34
AAGGTGTAGCTATTTAATGCATCTCAATGCTTAT
+SRR6176610.12 NS500422:399:HFH3MBGXY:1:11101:14210:1127_ATCACG length=34
A6AAAEA//EE/EA</EEEEEEEEAA/6/EE/A/

SRR6176610_2.fastq

(base) shoudai@ssuzukiMacBookPro ~ % head -n 48 /Users/shoudai/sratoolkit/fastqfiles/files/SRR6176610_2.fastq
@SRR6176610.1 NS500422:399:HFH3MBGXY:1:11101:6280:1051_ATCACG length=50
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+SRR6176610.1 NS500422:399:HFH3MBGXY:1:11101:6280:1051_ATCACG length=50
##################################################
@SRR6176610.2 NS500422:399:HFH3MBGXY:1:11101:3241:1086_ATCACG length=49
NAGATGGCTGAGTGATTGCTTGTGACTCCTTATAGGTGGCATTACTAAT
+SRR6176610.2 NS500422:399:HFH3MBGXY:1:11101:3241:1086_ATCACG length=49
#A/AA/<E//AE/EEA/E/A/A/AAEEEA</</E//E//E6///EEE//
@SRR6176610.3 NS500422:399:HFH3MBGXY:1:11101:26605:1087_ATCCCG length=50
NAGACGATGGGAGTGATTGCTTGTGACGCCTTTGATGCCCCCGCGTTAAT
+SRR6176610.3 NS500422:399:HFH3MBGXY:1:11101:26605:1087_ATCCCG length=50
#AAAAAEEA//EEE6EAE6EEE/EE/EEAEEAEA/EAEEEEE/E<EE6AE
@SRR6176610.4 NS500422:399:HFH3MBGXY:1:11101:11785:1094_ATCACG length=50
GATTGAGGGTGAGTGATTGCTTGTGACGCCTTAGGTATGACGTGCTTATT
+SRR6176610.4 NS500422:399:HFH3MBGXY:1:11101:11785:1094_ATCACG length=50
AAAAA6EEAEEEEEEEAE6EAE/AE/EEEAAAE6EEEEEEEEEAEEE<EE
@SRR6176610.5 NS500422:399:HFH3MBGXY:1:11101:24551:1095_ATCACG length=50
ACGTATACGAGTGATCGCTAGTGACGCCTTAGAGTCTGGCGGACATTTTT
+SRR6176610.5 NS500422:399:HFH3MBGXY:1:11101:24551:1095_ATCACG length=50
AAA/AEEE//A/AEE/AEA//<EEE/EE/6<<///EEEAE/E/E//EEEE
@SRR6176610.6 NS500422:399:HFH3MBGXY:1:11101:21780:1095_AACACG length=50
GGGGGGGGGGGGGGCGGGGGTGGGGCGGCGTAGGGGGGGGGGGGGTTGGG
+SRR6176610.6 NS500422:399:HFH3MBGXY:1:11101:21780:1095_AACACG length=50
/6AA/////E/E/E///E/6/AEA//E6/<//EE///E/E////<//EEE
@SRR6176610.7 NS500422:399:HFH3MBGXY:1:11101:26662:1101_ATCCCG length=50
AGGGAGCGAGAGTGACCGCATGTGACGCAATCGTATGTCAGAAAATTTTT
+SRR6176610.7 NS500422:399:HFH3MBGXY:1:11101:26662:1101_ATCCCG length=50
/AAA6//EEE<//E/E/E/////E//A/A6///E<//E////6//<A/A/
@SRR6176610.8 NS500422:399:HFH3MBGXY:1:11101:3292:1102_ATCACG length=50
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGTGGG
+SRR6176610.8 NS500422:399:HFH3MBGXY:1:11101:3292:1102_ATCACG length=50
AAAAAEEEEEE/E<666EA/<EA<AEE<A<AEEEAEEEEEEEEEE/AEEE
@SRR6176610.9 NS500422:399:HFH3MBGXY:1:11101:20496:1104_ATCACG length=50
AGAAGTGCCGAGTGATTGCTTGTGACGCCTTATGCGGATCACGACTTTTT
+SRR6176610.9 NS500422:399:HFH3MBGXY:1:11101:20496:1104_ATCACG length=50
AAAAAEEEEEE<EEEE6EEA6E/<EEEEEAAEE<E<EEEEEEE/EEEAAE
@SRR6176610.10 NS500422:399:HFH3MBGXY:1:11101:7512:1108_ATCACG length=50
TGAGATCTGACGAGCGATAGCATGTGACGCCTTGTAATAATTTGAATTAT
+SRR6176610.10 NS500422:399:HFH3MBGXY:1:11101:7512:1108_ATCACG length=50
//A66EEE6//EEE////66///<//<E///A///E//EE/E//EA<AEE
@SRR6176610.11 NS500422:399:HFH3MBGXY:1:11101:20905:1117_ATCACG length=49
ATCAGCGCGAGTGATTGCTTGTGACGCCTTACGAAACGATATTCTTTTA
+SRR6176610.11 NS500422:399:HFH3MBGXY:1:11101:20905:1117_ATCACG length=49
AAAAAEEEAE/E<EAEAEAA<AEEA6EEAEEEE/EEEEAEEEEEEEA<E
@SRR6176610.12 NS500422:399:HFH3MBGXY:1:11101:14210:1127_ATCACG length=50
GAGATCTCGGGAGTGATTGCTTCTGACGCCTTAGTCTAGGAACTTTTTTT
+SRR6176610.12 NS500422:399:HFH3MBGXY:1:11101:14210:1127_ATCACG length=50
AA/AAEEEEEAEEEEEAEAEEE/<EEEEEEEEAEEEEEE/EEEEEEEAAE

It may be necessary to concatenate your fastq files before running UniverSC: cat sample_S1_L001_R1_001.fastq sample_S1_L001_R2_001.fastq sample_S1_L001_R3_001.fastq > fulldata_S1_L001_R1_001.fastq assuming your samples contain unique indexes the results should be identical.

I am trying to use 6 samples from zebrafish brain (referred to as f1-f6).
For example, the f1 corresponds to SRR6176610-SRR6176633 (24 fastq files including R1 and R2 files in total).
Thus, should I concatenate my fastq files of each f1-f6?

Regards,
Shodai Suzuki

Dear Suzuki-san,

Thanks for sharing both files. This is informative.

The linker sequence "GAGTGATTGCTTGTGACGCCTT" expected for indrops v1 and v2 is present in your R2. So your files are correct for inDrops-v2. The sequences before and after this should be your barcodes:

---[8-11 bp barcode1]GAGTGATTGCTTGTGACGCCTT[8-bp barcode2][6-bp UMI]---

For example in this read:

@SRR6176610.4 NS500422:399:HFH3MBGXY:1:11101:11785:1094_ATCACG length=50
GATTGAGGGTGAGTGATTGCTTGTGACGCCTTAGGTATGACGTGCTTATT
+SRR6176610.4 NS500422:399:HFH3MBGXY:1:11101:11785:1094_ATCACG length=50
AAAAA6EEAEEEEEEEAE6EAE/AE/EEEAAAE6EEEEEEEEEAEEE<EE

barcode 1: GATTGAGGGT
barcode 2: AGGTATGA
umi: CGTGCT
sample index (i7): ATCACG
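
The breakdown above can be reproduced with a short sketch. This is only an illustration of the read layout, not UniverSC's own code: the function names are hypothetical, the linker must match exactly, and the sample index is assumed to follow the last underscore in an SRA-style header.

```python
LINKER = "GAGTGATTGCTTGTGACGCCTT"  # inDrops v1/v2 linker quoted above


def split_indrops_v2_read(sequence, linker=LINKER):
    """Return (barcode1, barcode2, umi) for an exact linker match, else None."""
    start = sequence.find(linker)
    if start < 0:
        return None
    end = start + len(linker)
    barcode1 = sequence[:start]           # 8-11 bp before the linker
    barcode2 = sequence[end:end + 8]      # 8 bp after the linker
    umi = sequence[end + 8:end + 14]      # next 6 bp
    if not 8 <= len(barcode1) <= 11 or len(barcode2) < 8 or len(umi) < 6:
        return None
    return barcode1, barcode2, umi


def sample_index_from_header(header):
    """Take the text after the last underscore in the read name."""
    read_name = header.split()[1]
    return read_name.rsplit("_", 1)[1]


read = "GATTGAGGGTGAGTGATTGCTTGTGACGCCTTAGGTATGACGTGCTTATT"
header = "@SRR6176610.4 NS500422:399:HFH3MBGXY:1:11101:11785:1094_ATCACG length=50"
print(split_indrops_v2_read(read))       # ('GATTGAGGGT', 'AGGTATGA', 'CGTGCT')
print(sample_index_from_header(header))  # ATCACG
```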

There is no need to remove the linker sequence, as this is done automatically by UniverSC. Unfortunately, some data will be lost because of poor sequence quality leading to mismatches. Hopefully this does not affect UMI depth much in deep sequencing data, as the same UMI will be sequenced again at higher quality in other reads.

To clarify on usage, you should not be running samples with different sample index i7 sequences in the same run. You need to call UniverSC on each sample separately. In your case, it appears that each sample was demultiplexed into a separate fastq already. So there is no need to use an I2 sample index file as it is better to run each sample one at a time. You should only use multiple "lanes" if the same sample has been sequenced multiple times.

Otherwise everything you are doing appears to be correct. However, I have discovered a problem. The barcode1 sequences are the reverse complement of the barcode sequences as they are sequenced in the opposite direction for version 2. Unfortunately this could be troublesome and will require changes to the UniverSC source code to support inDrops v2 correctly.

This is different to inDrops version 3 that we tested on, although we have added support for legacy technologies to support exactly this use case (taking advantage of existing published data from older protocols). Apologies for the inconvenience.

I will try to make time to correct the source code for UniverSC to support this technology properly in a future release. Sorry, I have moved on to another position and am busy with my new duties, but as the one who added the inDrops technologies, I will take responsibility for this issue.

If you wish to try it yourself in the meantime, these files are included with UniverSC and you can try taking the reverse complement of pairs of sequences to generate correct barcodes yourself. Sorry it may be difficult but it is possible with the current version of UniverSC using the "custom barcodes" feature:
https://github.com/minoda-lab/universc/blob/master/whitelists/inDrop_gel_barcode1_list.txt
https://github.com/minoda-lab/universc/blob/master/whitelists/inDrop_gel_barcode2_list.txt
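
A minimal sketch of how such a custom list might be built is below. The function names are hypothetical, and whether the reverse complement should be applied to one list or to both is an assumption to check against your reads; this is not the code UniverSC uses.

```python
from itertools import product

# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(sequence):
    """Complement each base, then reverse the sequence."""
    return sequence.strip().upper().translate(COMPLEMENT)[::-1]


def reverse_complement_list(barcodes):
    """Reverse-complement every non-empty entry of a barcode list."""
    return [reverse_complement(b) for b in barcodes if b.strip()]


def paired_barcodes(list1, list2):
    """Join every entry of list1 with every entry of list2 (all combinations)."""
    return [first + second for first, second in product(list1, list2)]
```

For example, `reverse_complement("GATTGAGGGT")` gives `"ACCCTCAATC"`. The lists would be read from the two whitelist files linked above, one barcode per line.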

Thank you,

Tom Kelly

Dear Tom Kelly,

Thank you very much for your advice.
I understand that UniverSC didn't work on my fastq files because the fastq files are paired-end.
I will try to make the correct barcode list as you mention.

Thank you,
Shodai Suzuki

@shodais
Dear Suzuki-san,

Sorry, I think you have misunderstood. Note that UniverSC requires paired-end reads. Using either fastq-dump or newer SRA tools to download paired-end FASTQ files is okay.

I have managed to reproduce the problem in inDrops v2 data and have resolved it. The updated version UniverSC 1.1.7 should support this and inDrops v1. In both cases the reverse complement of the barcode sequences was used. I have prepared custom barcode files to automatically resolve this in the next release. Running UniverSC again should detect the existing barcode whitelist and update it correctly. A Docker container for this version is building and should be pushed soon.

This allows partial mismatches in the adapter sequence, which should work even with sequencing errors. If you still have problems, I recommend quality trimming and filtering to remove paired-end reads of poor quality before running UniverSC. Please ensure that a tool that supports paired-end matching is used, so both reads of a pair are kept or removed together.
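
To illustrate what tolerating mismatches in the linker can look like, here is a hedged sketch using a Hamming-distance scan. The function names and the threshold of 2 mismatches are assumptions for illustration; the actual matching rule in UniverSC may differ.

```python
LINKER = "GAGTGATTGCTTGTGACGCCTT"  # inDrops v1/v2 linker


def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_linker(sequence, linker=LINKER, max_mismatches=2, min_start=8, max_start=11):
    """Return the best linker start within the allowed barcode 1 lengths, or None."""
    best = None
    for start in range(min_start, max_start + 1):
        window = sequence[start:start + len(linker)]
        if len(window) < len(linker):
            break
        distance = hamming(window, linker)
        if distance <= max_mismatches and (best is None or distance < best[1]):
            best = (start, distance)
    return best[0] if best else None


# Read SRR6176610.12 from the thread has one mismatch inside the linker.
read = "GAGATCTCGGGAGTGATTGCTTCTGACGCCTTAGTCTAGGAACTTTTTTT"
print(read.find(LINKER))   # -1, an exact search misses it
print(find_linker(read))   # 10
```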

The barcode 1 has length 8-11 bp. I have matched the last 8 bases before the linker sequence. I checked that these sequences are still unique to each cell barcode, even if the first 1-3 bases are removed in some cases. This generates a 16 bp index for each read. The same operations are performed in the barcode whitelist. This updated version runs a Cell Ranger 3.0.2 call without the errors described above.

I have tested this using test data in this project:
https://github.com/indrops/indrops/tree/master/test/seq_runs

I encountered the same issue there as with your data. Hopefully this resolves your problem. Please let us know if it does and close the "issue" (I'll leave the Singularity issue open until I can test it myself).

Thank you,

Tom Kelly

Dear Tom,
I understand now that UniverSC requires paired-end reads.

Thank you for the release of the updated version, UniverSC-1.1.7. I tried UniverSC-1.1.7 using two reads of one sample as a test, and cellranger count ran successfully, generating outputs with count data.
Now I am running it on all the reads. cellranger count is running, although it has not finished yet because there is so much data.

I am relieved to see that cellranger worked and I very much appreciate your support.
To be honest, I'm not specialized in single-cell RNA-seq analysis, but I learned a lot through the communication with you.
I'm closing the "issue".

Best regards,
Shodai Suzuki

Excellent! Glad to hear the updated version runs on your dataset: