AllonKleinLab / SPRING_dev

how to pre-process annotations (custom color tracks and groupings)

cvillamar opened this issue

Dear colleagues,
I'm trying to pre-process the required input for a local SPRING server by following the notebook in data_prep/spring_example_pbmc4k.ipynb.
However, I cannot get the function that generates the required files to include my custom annotations (continuous and categorical data).

Here is how I load the annotation files, which I previously saved in the same format I normally use for the SPRING server hosted by the Allon Klein Lab:

import csv
with open(main_spring_dir + '../../spring.groupings.csv') as csvfile:
    reader = csv.reader(csvfile)
    cell_groupings = {}
    for row in reader:
        key = row[0]
        cell_groupings[key] = row[1:]
with open(main_spring_dir + '../../spring.custom.color.tracks.csv') as csvfile:
    reader = csv.reader(csvfile)
    custom_colors = {}
    for row in reader:
        key = row[0]
        custom_colors[key] = row[1:]

But when I later call the function below to generate the subplots and processed files, it fails:

out = make_spring_subplot(E, gene_list, save_path,
                    normalize = False, tot_counts_final = total_counts,
                    min_counts = 3, min_cells = 3, min_vscore_pctl = 60, show_vscore_plot = True,
                    num_pc = 60,
                    k_neigh = 5,
                    num_force_iter = 500,
                    cell_groupings = cell_groupings,
                    custom_colors = custom_colors)

It displays the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-e5e49c56d331> in <module>()
     11                     num_force_iter = 500,
     12                     cell_groupings = cell_groupings,
---> 13                     custom_colors = custom_colors)
     14 
     15 np.save(save_path + '/cell_filter.npy', np.arange(E.shape[0]))

/restricted/projectnb/crem-bioinfo/project_code/00_pan_project/SPRING_dev/data_prep/spring_helper.pyc in make_spring_subplot(E, gene_list, save_path, base_ix, normalize, exclude_dominant_frac, min_counts, min_cells, min_vscore_pctl, show_vscore_plot, exclude_gene_names, num_pc, sparse_pca, pca_norm, k_neigh, cell_groupings, num_force_iter, output_spring, precomputed_pca, gene_filter, custom_colors, exclude_corr_genes_list, exclude_corr_genes_minCorr, dist_metric, use_approxnn, run_doub_detector, dd_k, dd_frac, dd_approx, tot_counts_final)
    778             save_spring_dir_sparse_hdf5(E, gene_list, save_path, list(links),
    779                             custom_colors = custom_colors,
--> 780                             cell_groupings = cell_groupings)
    781         else:
    782             save_spring_dir_sparse_hdf5(E, gene_list, save_path, list(links),

/restricted/projectnb/crem-bioinfo/project_code/00_pan_project/SPRING_dev/data_prep/spring_helper.pyc in save_spring_dir_sparse_hdf5(E, gene_list, project_directory, edges, custom_colors, cell_groupings)
    654     # save custom colors
    655     custom_colors['Uniform'] = np.zeros(E.shape[0])
--> 656     write_color_tracks(custom_colors, project_directory+'color_data_gene_sets.csv')
    657 
    658     # create and save a dictionary of color profiles to be used by the visualizer

/restricted/projectnb/crem-bioinfo/project_code/00_pan_project/SPRING_dev/data_prep/spring_helper.pyc in write_color_tracks(ctracks, fname)
    598     out = []
    599     for name,score in ctracks.items():
--> 600         line = name + ',' + ','.join(['%.3f' %x for x in score])
    601         out += [line]
    602     out = sorted(out,key=lambda x: x.split(',')[0])

TypeError: float argument required, not str

Here is a preview of the annotation files used as input:

head -n 5 spring.custom.color.tracks.csv| cut -f 1-4 -d ","
nCount_RNA,15703,18128,41380
nFeature_RNA,4231,3411,6802
percent.mt,5.48302872062663,4.08208296557811,3.33977767037216
nCount_SCT,8331,7728,7891
nFeature_SCT,3526,2333,3155

head -n 5 spring.groupings.csv| cut -f 1-4 -d ","
orig.ident,F00431,F01380,F01391
Diagnosis,IPF,IPF,IPF
Sample_Name,TILD001,TILD028,VUILD64
Sample_Source,NTI,NTI,Vanderbilt
Status,ILD,ILD,ILD

Am I reading the annotations in the wrong format?
I was wondering if you had any version of the example notebook that would include annotations (continuous and categorical).
Thanks a lot for your help!
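
Judging only from the traceback above, the failure looks like a type issue rather than a file-format issue: `write_color_tracks` formats each value with `'%.3f' % x`, while `csv.reader` returns every field as a string. Below is a minimal sketch of a loader that converts the continuous tracks to floats and leaves the categorical groupings as strings. The helper names `load_color_tracks` and `load_groupings` are hypothetical (they are not part of `spring_helper`), the sample rows are copied from the preview above, and the code assumes Python 3. It has not been checked against the reply that resolved this thread.

```python
import csv
import io


def load_color_tracks(csvfile):
    """Read rows of `name,v1,v2,...` into {name: [float, ...]}.

    Hypothetical helper: continuous tracks are converted to floats so
    that '%.3f' formatting can be applied to them later.
    """
    tracks = {}
    for row in csv.reader(csvfile):
        if not row:
            continue  # skip blank lines
        tracks[row[0]] = [float(x) for x in row[1:]]
    return tracks


def load_groupings(csvfile):
    """Read rows of `name,label1,label2,...` into {name: [str, ...]}.

    Hypothetical helper: categorical labels are kept as strings.
    """
    groupings = {}
    for row in csv.reader(csvfile):
        if not row:
            continue  # skip blank lines
        groupings[row[0]] = row[1:]
    return groupings


# Sample rows taken from the previews in the question (first three cells only).
color_csv = io.StringIO(
    "nCount_RNA,15703,18128,41380\n"
    "percent.mt,5.48302872062663,4.08208296557811,3.33977767037216\n"
)
grouping_csv = io.StringIO(
    "Diagnosis,IPF,IPF,IPF\n"
    "Sample_Source,NTI,NTI,Vanderbilt\n"
)

custom_colors = load_color_tracks(color_csv)
cell_groupings = load_groupings(grouping_csv)

# The same formatting used on the failing line of the traceback now works,
# because every value is a float.
line = "percent.mt," + ",".join("%.3f" % x for x in custom_colors["percent.mt"])
print(line)  # percent.mt,5.483,4.082,3.340
```

With real files, the two helpers would be called on the opened `spring.custom.color.tracks.csv` and `spring.groupings.csv`, and the resulting dictionaries passed as `custom_colors` and `cell_groupings` to `make_spring_subplot`.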

Thanks a lot, Caleb.
That solved the issue.