h5 file generation problem

Question


evenDDDDD opened this issue 5 years ago · comments

1. Why does the following error appear when generating the h5 file?
"MY_ERROR: Error in CreateSeuratObject (raw.data = object @ exprs, meta.data = object @ obs): The parameter is not useful (raw.data = object @ exprs) \ n".
I ran the same data and R script from the collect part of your GitHub repository, and Seurat is installed.
2. Is cell_ontology necessary?
Thank you!

Zhi-Jie Cao · Answer 1 · Thu Dec 12 2019 18:35:23 GMT+0800 (China Standard Time)

Thanks for your interest! I guess it's a problem with incompatible Seurat versions (v3 changed the API significantly and is incompatible with v2). Our data collection scripts used Seurat v2.3.3. Could you confirm which Seurat version you are using?

lulu deng · Answer 2 · Thu Dec 12 2019 18:53:37 GMT+0800 (China Standard Time)

Thank you very much for responding so quickly. I just checked my Seurat version and it is v3.1.1, so I probably need to install Seurat v2.3.3. As for my other question, is the cell ontology annotation necessary? My data may not come with this information. Thanks again!

…


Zhi-Jie Cao · Answer 3 · Thu Dec 12 2019 19:04:22 GMT+0800 (China Standard Time)

Okay, great! Switching to Seurat v2 should solve the problem.
The cell ontology annotation is unnecessary. Just skip the "cell_ontology" argument when constructing the dataset, it should work fine.

lulu deng · Answer 4 · Sat Dec 14 2019 17:20:27 GMT+0800 (China Standard Time)

Thank you for your answer. That solved the problem, but I have run into another one. After generating the h5 file and training the DIRECTi model, the h5 file contains no latent, tSNE, or UMAP results. I also used the "inference" method. I tried many things and no errors appeared, but the results were still missing. How do these results get into the h5 file? I'm sorry to trouble you.

Zhi-Jie Cao · Answer 5 · Sat Dec 14 2019 21:12:49 GMT+0800 (China Standard Time)

Can you provide a code snippet to illustrate how you are using the model?
To get the latent coordinates and write them to file, you can use data.latent = model.inference(data), and then call data.write_dataset("somefile.h5").

Some docs and examples can be found here: DIRECTi, ExprDataSet, Notebook

Let me know if there are any further issues.

lulu deng · Answer 6 · Tue Dec 17 2019 14:50:05 GMT+0800 (China Standard Time)

Sorry to disturb you again. I would like to ask how to save a "matplotlib.axes._subplots.AxesSubplot object" as a picture, including the tSNE plot and the final Cell BLAST comparison chart, because I have little experience with plotting in Python. For example, when I run "ax = combined_dataset.visualize_latent("study")" in IPython, it only prints "[Info] Computing tSNE ..." and shows no picture. In addition, when using the "visualize_latent" method for visualization, the "cell ontology" information is missing. Can I use "cell type1" instead? Looking forward to your reply!

Zhi-Jie Cao · Answer 7 · Tue Dec 17 2019 22:06:08 GMT+0800 (China Standard Time)

If you are running IPython with access to a graphical interface, or using Jupyter Notebook, the picture should appear automatically when tSNE computation is done (I personally have only used Jupyter Notebook though). Computing tSNE can take a long period of time if the number of cells is large. If that is the case, just wait a few moments and let it finish. If no graphical interface is available (e.g. running IPython over ssh), the picture would not appear. For matplotlib.axes._subplots.AxesSubplot objects, you may use ax.get_figure().savefig("file.pdf") to save it to a file, assuming ax is the returned Axes object. For the Sankey comparison plot, the cb.blast.sankey function returns a plotly dict. The picture should appear automatically if you use Jupyter Notebook, otherwise you may use plotly.io.write_image(d, "file.pdf"), assuming d is the returned plotly dict.
Yes, you can use any type of annotation that you have, not limited to cell ontology.
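The advice above about saving an Axes object when no graphical interface is available can be sketched as follows. This is only an illustration: the scatter data stands in for the Axes that `visualize_latent` would return, and the helper name `save_axes`, the axis labels, and the output path are assumptions made for the example. The `Agg` backend is selected so that the script also works over ssh without a display.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: no display needed (e.g. over ssh)
import matplotlib.pyplot as plt


def save_axes(ax, path):
    """Save the figure that owns `ax`; the file format follows the extension."""
    fig = ax.get_figure()
    fig.savefig(path, bbox_inches="tight")
    return path


# Hypothetical stand-in for the Axes returned by visualize_latent(...).
fig, ax = plt.subplots()
ax.scatter([0.0, 1.0, 2.0], [1.0, 0.5, 2.0])
ax.set_xlabel("tSNE1")
ax.set_ylabel("tSNE2")

out_path = save_axes(ax, os.path.join(tempfile.gettempdir(), "latent_tsne.png"))
```

Changing the extension to `.pdf` would produce a PDF instead, matching the `savefig("file.pdf")` call mentioned in the answer.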

lulu deng · Answer 8 · Mon Jul 06 2020 21:50:40 GMT+0800 (China Standard Time)

Hi！
Your work on defining cell types is excellent. I want to repeat your work, but I ran into a problem.
Here is my code:
`expr_mat <- read.table("./p2_counts.txt",header = TRUE, row.names = 1)
expr_mat1 <- as.matrix(expr_mat)

meta_df <- read.table("./p2_metadata.txt", header = TRUE, row.names = 1, sep='\t')
colnames(meta_df) <- c("cell_type1")

cell_ontology <- read.csv("./p2_cell_ontology.csv", sep='\t')
cell_ontology <- cell_ontology[, c("cell_type1", "cell_ontology_class", "cell_ontology_id")]

construct_dataset("./p22_10x/", as.matrix(expr_mat), meta_df, datasets_meta = NULL, cell_ontology)`

The error is：
Error in validObject(.Object) : invalid class “ExprDataSet” object: FALSE

I can't understand what is happening. My installed Seurat is v2.3.4 and R is 3.6.3.
Looking forward to your reply!

Zhi-Jie Cao · Answer 9 · Tue Jul 07 2020 18:44:50 GMT+0800 (China Standard Time)

It is likely because meta_df differs from expr_mat in terms of row number and row names. Could you validate that:

nrow(expr_mat) == nrow(meta_df)
all(rownames(expr_mat) == rownames(meta_df))

lulu deng · Answer 10 · Tue Jul 07 2020 19:15:53 GMT+0800 (China Standard Time)

Hi! Thank you very much for your reply! I modified my file to make sure that all the names match. Two questions have troubled me for a day:

1. In the query step, I used two different input files and a reference for BLAST, but this error always appears when using the second file:

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

I'm sure my second input file does not contain any NaN. Is this error related to my reference or to the input file that needs to be annotated?

2. After updating to the latest version of Cell BLAST, I get:

AttributeError: 'BLAST' object has no attribute 'build_empirical'

Below is my code:

expr_mat <- read.table("gene_99_counts_2.txt",header = TRUE, row.names = 1)
expr_mat1<-t(expr_mat)
meta_df <- read.table("p2_metadata.txt", header = TRUE, row.names = 1, sep='\t')
meta_df$region = "Hypothalamus"
cell_ontology <- read.csv("p2_cell_ontology.csv", sep='\t')
cell_ontology <- cell_ontology[, c("cell_type1", "cell_ontology_class", "cell_ontology_id")]
construct_dataset("./p22_99/", as.matrix(expr_mat1), meta_df, datasets_meta = NULL, cell_ontology)

blast = cb.blast.BLAST(models3, adata).build_empirical()

Versions: tensorflow v1.8.0, cell blast v0.3.7, R 3.6.3

Zhi-Jie Cao · Answer 11 · Tue Jul 07 2020 21:20:06 GMT+0800 (China Standard Time)

My guess is that the expression matrix contains cells with all-zero expression, or contains negative values (the expression matrix should consist of non-negative raw UMI counts).
The API has changed a little bit since v0.3. Please refer to the new tutorial here.
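For the NaN error, the two conditions guessed at above (cells with all-zero expression, or negative values) can be checked directly on the count matrix before running the query. The sketch below assumes a cells-by-genes NumPy array; the function name `check_expression_matrix` and the toy matrix are hypothetical. A cell with zero total counts would plausibly produce NaN once counts are normalized by the per-cell total, which may explain why the error appears even though the input file itself contains no NaN.

```python
import numpy as np


def check_expression_matrix(mat):
    """Report problems in a cells-by-genes count matrix.

    Returns the number of non-finite entries, the number of negative
    entries, and the row indices of cells whose expression is all zero.
    """
    mat = np.asarray(mat, dtype=np.float64)
    return {
        "non_finite": int((~np.isfinite(mat)).sum()),
        "negative": int((mat < 0).sum()),
        "all_zero_cells": np.flatnonzero((mat == 0).all(axis=1)).tolist(),
    }


# Toy matrix (hypothetical): cell 1 is all zero, cell 2 has one negative value.
counts = np.array([
    [3, 0, 1],
    [0, 0, 0],
    [2, -1, 4],
])
report = check_expression_matrix(counts)
print(report)
```

If the report is clean for the query but not for the reference (or the reverse), that would indicate which of the two inputs needs filtering.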