Teichlab / bbknn

Batch balanced KNN

Details for pbmc dataset used #30

mohit1997 opened this issue

[Reopening Issue] Thanks for your quick reply! I am having trouble finding the 5' dataset on the 10X Genomics website. Is it no longer available? Can you share the link?

"The input data was downloaded from the 10X Genomics website. The exact 5′ dataset was ‘PBMCs of a healthy donor 5′ gene expression’, under Cell Ranger 2.1.0, under V(D)J + 5′ Gene Expression. The exact 3′ dataset was ‘8k PBMCs from a Healthy Donor’, under Cell Ranger 2.1.0, under Chromium Demonstration (v2 Chemistry)."

I had a bit of a root around the 10x website to see if I could locate the 5' dataset, and it no longer appears to be available. It seems that they discontinued a lot of the 5' data for pre-3.x cellranger. This might be because there was a major VDJ processing redesign in cellranger around that time, so they wanted people to see the new and improved version.

You could theoretically root around the various PBMC 5' datasets and compare the number of overlapping barcodes between the BBKNN object and each 10x download, but I can't imagine why you'd want to do this. The analysis was just meant to illustrate BBKNN overcoming a 3'/5' technical effect in 10x data, and the exact choice of 3'/5' data wasn't of much relevance. It doesn't help that it was conducted by someone who has since left the lab. Still, the fact that it got streamlined out of the final BBKNN publication should be a further testament to how little it mattered.
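For anyone who does want to try the barcode comparison mentioned above, here is a minimal sketch of how it might look. It assumes cell names contain a 16-nucleotide 10x barcode (e.g. `AAACCTGAGAAACCAT-1`), possibly with extra batch prefixes or suffixes added during concatenation. The function names `core_barcode`, `barcode_overlap` and `rank_candidates` are hypothetical helpers made up for illustration, not part of BBKNN or any 10x tool.

```python
import re

# Assumption: every cell name contains one 16-nt barcode, e.g.
# "AAACCTGAGAAACCAT-1", optionally decorated with batch labels.
_BARCODE_RE = re.compile(r"[ACGT]{16}")


def core_barcode(name):
    """Return the first 16-nt nucleotide run in a cell name, or None."""
    match = _BARCODE_RE.search(name)
    return match.group(0) if match else None


def barcode_overlap(reference_names, candidate_names):
    """Fraction of reference barcodes that also occur in the candidate."""
    reference = {core_barcode(n) for n in reference_names} - {None}
    candidate = {core_barcode(n) for n in candidate_names} - {None}
    if not reference:
        return 0.0
    return len(reference & candidate) / len(reference)


def rank_candidates(reference_names, candidates):
    """Sort candidate datasets (label -> cell names) by overlap, best first."""
    scores = {
        label: barcode_overlap(reference_names, names)
        for label, names in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With AnnData objects, the cell names would typically come from `adata.obs_names`, restricted to the 5' batch for the reference. Because 10x barcodes are drawn from a shared whitelist, unrelated datasets can be expected to share a small fraction of barcodes by chance, so only a near-complete overlap would suggest a match.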