xiaoyeye / CNNC

convolutional neural network based coexpression analysis

Confusion about how the KEGG edges were filtered

suzannejin opened this issue

Hi!

In the paper, you mentioned:

KEGG contains 290 pathways, and Reactome contains 1,581
pathways. For both, we only select directed edges with either activation or
inhibition edge types and filter out cyclic gene pairs where genes regulate
each other mutually (to allow for a unique label for each pair). In total, we
have 3,057 proteins with outgoing directed edges in KEGG, and the total
number of directed edges is 33,127. For Reactome, the corresponding
numbers are 2,519 and 33,641.

What I am not clear about is whether you removed all the cyclic gene pairs (A->B and B->A) or kept just one direction. For example, if you have both A->B and B->A, did you keep A->B and drop B->A, or did you remove both of them?

Also, if instead of predicting regulatory pairs I only want to predict pairs that are in the same pathway, regardless of causality, can I keep the other KEGG edge types that are not activation or inhibition?

Thanks in advance!!

Hi,
You are welcome.
Training and test data generation is very flexible; it depends on what you want the model to learn.

  1. In the original paper, we removed all cyclic gene pairs (both A->B and B->A).
  2. I think you can try it; just keep in mind that the training and test sets should be built following the same standard.
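
For anyone reproducing this, here is a minimal sketch of the two options discussed above. It is not the code used for the paper. It assumes the KEGG edges have already been parsed into a list of (source, target) gene tuples, and the function names `drop_cyclic_pairs` and `undirected_pairs` are hypothetical. The first function reads "removed all cyclic gene pairs" as dropping both directions; the second is one possible way to build direction-free pairs for the same-pathway setting.

```python
def drop_cyclic_pairs(edges):
    """Drop every directed edge whose reverse edge is also present.

    Both A->B and B->A are removed, so each remaining pair has a
    unique direction label. Self-loops (A->A) are dropped as well,
    since they are their own reverse.
    """
    edge_set = set(edges)
    return [(a, b) for (a, b) in edges if (b, a) not in edge_set]


def undirected_pairs(edges):
    """Collapse directed edges into sorted, de-duplicated unordered pairs.

    Assumption: for a same-pathway task the direction is ignored, so
    A->B and B->A count as the same pair. Self-loops are skipped.
    """
    return sorted({tuple(sorted(e)) for e in edges if e[0] != e[1]})


# Toy edge list (hypothetical gene names, not real KEGG data).
edges = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "D")]

print(drop_cyclic_pairs(edges))  # [('A', 'C'), ('C', 'D')]
print(undirected_pairs(edges))   # [('A', 'B'), ('A', 'C'), ('C', 'D')]
```

Whichever variant is used, the same function should be applied when building both the training and the test pairs.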