The present file, KEGG-GEO.pdf, contains the R code used to develop a platform for analyzing the coherency between thirteen kinds of gene pairs and gene expression profiles at the mRNA level. The project is entitled "Can we assume the gene expression profile as a proxy for signaling network activity". Correlation analysis of all human KEGG signaling gene pairs (July 2017) across 1,969 GEO datasets was conducted to see whether gene expression profiles at the mRNA level are coherent with signaling gene pairs. The file includes nine sections:
The first section describes how the KEGG edge list with 26,490 edges was built. The second section explains how the up-down gene expression profiles for KEGG genes were downloaded and merged. Section three walks through the preprocessing of the expression profiles. In this step, an extensive list containing 1,969 experiments (GDS) was built, and a large expression matrix called Exprtable, with 40,903 samples in columns and 3,187 genes in rows, was constructed. From this matrix, a list called SignalingNet was constructed, with one element for each gene pair in the KEGG edge list. The fourth section covers the elements of SignalingNet, each of which contains the expression values and correlation information for the source and target genes. Section five reports the coherency of the edges and the number of activation and inhibition edges that have a specific p-value and correlation coefficient. In the sixth section, ten sets of 1,000 unconnected node pairs were built, in which the nodes never reach one another (based on KEGG information), and the same correlation analysis was performed on these node pairs. In the seventh section, the number of edges with a specific p-value and correlation coefficient that are engaged in two-edge subgraphs was computed. In the eighth section, the same count was computed for edges engaged in multiple-edge subgraphs. Finally, in the ninth section, the results are summarized in tables.
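To make the per-edge analysis more concrete, the following is a minimal sketch in R, not the code from KEGG-GEO.pdf. It assumes a small synthetic matrix standing in for Exprtable (genes in rows, samples in columns) and a three-edge list standing in for the KEGG edge list. The names expr, edges and edge_correlation, the use of Pearson correlation, the 0.05 p-value cutoff, and the coherency rule (an activation edge is expected to show a positive correlation, an inhibition edge a negative one) are all illustrative assumptions.

```r
# Synthetic stand-in for Exprtable: genes in rows, samples in columns.
set.seed(1)
n_samples <- 30
gene_a <- rnorm(n_samples)
expr <- rbind(
  A = gene_a,
  B = gene_a + rnorm(n_samples, sd = 0.3),   # built to track A
  C = -gene_a + rnorm(n_samples, sd = 0.3),  # built to oppose A
  D = rnorm(n_samples)                       # generated independently of A
)

# Hypothetical edge list: source gene, target gene and edge type.
edges <- data.frame(
  source = c("A", "A", "A"),
  target = c("B", "C", "D"),
  type   = c("activation", "inhibition", "activation"),
  stringsAsFactors = FALSE
)

# Correlate the source and target expression vectors of one edge.
# Pearson correlation is an assumption; the source does not name the method.
edge_correlation <- function(source, target, expr) {
  test <- cor.test(expr[source, ], expr[target, ], method = "pearson")
  c(r = unname(test$estimate), p = test$p.value)
}

# One row of (r, p) per edge.
stats <- t(mapply(edge_correlation, edges$source, edges$target,
                  MoreArgs = list(expr = expr)))
edges$r <- unname(stats[, "r"])
edges$p <- unname(stats[, "p"])

# Assumed coherency rule: the correlation is significant and its sign
# matches the edge type (+ for activation, - for inhibition).
p_cutoff <- 0.05
expected_sign <- ifelse(edges$type == "activation", 1, -1)
edges$coherent <- edges$p < p_cutoff & sign(edges$r) == expected_sign

print(edges)
```

The same edge_correlation helper could be applied to a table of unconnected node pairs to obtain a background distribution of correlation coefficients for comparison with the real edges.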