Nan value in PIDC results #56

Open
WWXkenmo opened this issue Aug 23, 2021 · 2 comments

Comments

@WWXkenmo

Hi Beeline team,
I am currently using your neat pipeline, but I have run into a very weird problem in the rankedEdges.csv file of the PIDC results. On my datasets, the edge weights computed by PIDC are all NaN, like this:
[screenshot: rankedEdges.csv with NaN edge weights]
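
For anyone who wants to check their own output for the same symptom, here is a minimal sketch that counts NaN edge weights. It assumes the ranked-edge file is tab-separated with a header row and the weight in the last column; that layout and the gene names in the sample are assumptions for illustration, not something confirmed in this thread.

```python
import csv
import io
import math


def count_nan_edges(text, delimiter="\t"):
    """Return (nan_count, total_count) for the last column of a ranked-edge table."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    next(reader, None)  # skip the header row
    nan_count = 0
    total = 0
    for row in reader:
        if not row:
            continue  # ignore blank lines
        total += 1
        # float() accepts "nan", "NaN" and "NAN" alike
        if math.isnan(float(row[-1])):
            nan_count += 1
    return nan_count, total


# Hypothetical sample with made-up gene names and weights.
sample = "Gene1\tGene2\tEdgeWeight\nGeneA\tGeneB\tnan\nGeneA\tGeneC\t0.5\n"
nan_count, total = count_nan_edges(sample)
print(f"{nan_count} of {total} edges are NaN")
```

For a real file, the text would come from something like `open("rankedEdges.csv").read()`, with the delimiter adjusted if the file turns out to be comma-separated.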

However, after running a search algorithm I developed myself (it has to run PIDC repeatedly, so it cannot be applied to large-scale scRNA-seq datasets), I found that simply deleting some of the genes (in my case, the 441st, 865th, and 866th genes) brings the edge weights back to normal:
[screenshot: rankedEdges.csv with normal edge weights after deleting those genes]
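
The search algorithm itself is not shown in this thread. As a rough illustration of one generic way to narrow down the offending genes with fewer runs, the sketch below shrinks a failing gene list by dropping chunks of genes while the failure persists. The `fails` callback is hypothetical: in practice it would write the subset to a file, run PIDC, and report whether the output contains NaN. The sketch also assumes the failure is monotone, meaning that adding genes to a failing subset never makes it succeed, which may not hold for PIDC.

```python
def shrink_failing_set(genes, fails):
    """Shrink `genes` to a smaller list for which fails(subset) is still True.

    `fails` is a hypothetical callback; it is assumed to return True for the
    full input list and to be monotone (supersets of a failing set also fail).
    """
    current = list(genes)
    chunk = len(current) // 2
    while chunk >= 1:
        i = 0
        while i < len(current):
            # Try dropping current[i:i + chunk] and see whether it still fails.
            candidate = current[:i] + current[i + chunk:]
            if candidate and fails(candidate):
                current = candidate  # the dropped chunk was not needed
            else:
                i += chunk  # keep this chunk and move on
        chunk //= 2
    return current


# Toy stand-in for "run PIDC and check for NaN": the failure appears
# only when both made-up culprit genes are present.
culprits = {"g3", "g7"}
all_genes = [f"g{i}" for i in range(10)]
print(shrink_failing_set(all_genes, lambda subset: culprits <= set(subset)))
```

Under the monotonicity assumption the genes that remain are all needed to trigger the failure, so deleting any one of them from the full dataset would be a candidate fix worth testing.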

I originally thought these genes might have some unusual statistical characteristics, but unfortunately I did not find anything special about them (e.g. average expression, variance, coefficient of variation).
This happens in most of my datasets, so I think it is really important to figure out, but I have no idea how to solve it.
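
For reference, the per-gene statistics mentioned above can be computed along these lines. This is only a sketch: it assumes a comma-separated expression file with one gene per row, the gene name in the first column, and cell identifiers in the header row, and the sample values are made up.

```python
import csv
import io
import math
import statistics


def gene_summary(text):
    """Map gene name -> (mean, variance, cv) from a genes-by-cells CSV."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # header row holds the cell identifiers
    summary = {}
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        values = [float(v) for v in row[1:]]
        mean = statistics.mean(values)
        var = statistics.pvariance(values)  # population variance
        # The coefficient of variation is undefined when the mean is zero.
        cv = math.sqrt(var) / mean if mean != 0 else float("nan")
        summary[row[0]] = (mean, var, cv)
    return summary


# Hypothetical two-gene sample; GeneB is constant across cells.
expr = "gene,cell1,cell2,cell3\nGeneA,1,2,3\nGeneB,0,0,0\n"
for gene, (mean, var, cv) in gene_summary(expr).items():
    print(gene, mean, var, cv)
```

Genes with zero variance or an undefined coefficient of variation are the first ones I would look at, although nothing in this thread confirms that they are the cause.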

So that your team can look into this problem, I have created a repo and uploaded the ExpressionData.csv: https://github.com/WWXkenmo/PIDC_bug

Best,
Ken

@ktakers
Collaborator

ktakers commented Aug 23, 2021

Thank you for using BEELINE. I was able to reproduce the NaN error in the PIDC output using your example ExpressionData.txt. In the PIDC output I see the following error message for NaN edges:

Gamma distribution failed for Rps3 and Srgn; used normal instead.

I haven't found the root cause of the error yet and will continue looking into this.

@ktakers
Collaborator

ktakers commented Sep 14, 2021

I haven't found any issues in the way that BEELINE prepares the input or parses the output from PIDC. I believe the error is related to a poor fit of the input to gamma or normal distributions, but I haven't identified how this results in NaN values in the output from PIDC. I recommend following up with the maintainers of PIDC at https://github.com/Tchanders/NetworkInference.jl for further root causing.
