Hi,

(1) I think the paper's Methods and the actual code disagree in how the `Ms` score is calculated. In the paper the text reads:

> ... we calculate a score, Ms, for each putative microexon as Ms = 1 − (1 − PsPU2)/n, where PU2 is the probability that the observed U2 score came from the Gaussian with the higher mean and n is the number of matches for a given intron.

The corresponding calculation in the code seems to be at https://github.com/hemberg-lab/MicroExonator/blob/master/src/final_filters3.R (lines 85-95).

So if `ME_final$ME_P_value` should be identified with `Ms`, the code calculates `1 − (1 − Ps(1−PU2))/n` and not the `1 − (1 − PsPU2)/n` defined in the paper, because it takes the probability of the LOWER-mean component rather than the HIGHER-mean one.

Or should `Ms` actually be `1 - ME_final$ME_P_value`? In that case it would not match the paper either, I think ...

(2) More generally, I am not sure I understand the logic behind the expression for the `Ms` score in the paper. High scores are supposed to indicate true microexons, and by definition `Ms = 1 − (1 − PsPU2)/n = 1 − 1/n + PsPU2/n`. Wouldn't you want `Ms` to decrease with increasing `Ps`? The formula does the opposite. Also, if I understood correctly, `n` is the actual number of exact microexon + splice site matches in the given intron sequence. I would expect `Ms` to be higher for lower `n` (n > 0), i.e., to decrease as `n` increases, but again the formula does the opposite. Shouldn't it be something like `(1−Ps)PU2/n`? Or, if you take it as `1 - ME_final$ME_P_value` from the code, it would be `[1 − (1−PU2)Ps]/n`, which would also make sense to me. Am I missing the point completely?

Thank you
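To make the comparison in (2) concrete, here is a minimal numerical sketch of the four expressions discussed above. It is written in Python rather than the repository's R, the function names are hypothetical, and the input values are made up purely for illustration. The "code" variants follow my reading of `final_filters3.R` and may not match what the script actually does.

```python
def ms_paper(ps, pu2, n):
    """Ms as written in the paper Methods: 1 - (1 - Ps*PU2)/n."""
    return 1 - (1 - ps * pu2) / n


def ms_code(ps, pu2, n):
    """ME_P_value as I read the code (assumption): 1 - (1 - Ps*(1 - PU2))/n."""
    return 1 - (1 - ps * (1 - pu2)) / n


def ms_code_complement(ps, pu2, n):
    """1 - ME_P_value under the same reading: (1 - Ps*(1 - PU2))/n."""
    return (1 - ps * (1 - pu2)) / n


def ms_alternative(ps, pu2, n):
    """The alternative suggested in the question: (1 - Ps)*PU2/n."""
    return (1 - ps) * pu2 / n


if __name__ == "__main__":
    pu2 = 0.9  # made-up value, held fixed while Ps and n vary
    formulas = [ms_paper, ms_code, ms_code_complement, ms_alternative]
    print("Ps    n  " + "  ".join(f"{f.__name__:>18}" for f in formulas))
    for ps, n in [(0.1, 1), (0.5, 1), (0.1, 5), (0.5, 5)]:
        values = "  ".join(f"{f(ps, pu2, n):18.3f}" for f in formulas)
        print(f"{ps:<5} {n:<2} {values}")
```

With these inputs, the paper's expression grows with both `Ps` and `n`, whereas `(1−Ps)PU2/n` and `[1 − (1−PU2)Ps]/n` shrink with both, which is the behaviour I would have expected from a score where high values indicate true microexons.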