iocp: fix crash, GetQueuedCompletionStatus() write freed WSAOVERLAPPED memory #4136
Great! I assume you've tested this and it does fix the issue :) I think we also need to run it under some kind of stress test, e.g. the ioqueue test in pjlib-test, to make sure all memory pools (of pending ops) are properly released. Note that after an ioqueue key is unregistered, the key is put into the closing-key list and soon afterwards into the free-key list, to be reused by another socket. We need to make sure that all pending ops have been freed before the key is freed and reused. Next, perhaps we can apply a small optimization: instead of a memory pool for each pending op, use a memory pool per ioqueue key to avoid multiple alloc+free cycles for multiple pending ops, using the same mechanism as the ioqueue key (an additional list that keeps unused pending-op instances for later reuse).
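To make the per-key reuse idea above concrete, here is a minimal sketch of a free list of pending-op records owned by one key. It is not the code from this PR: `pending_op`, `key_op_cache`, and every function name are hypothetical, plain `malloc` stands in for a pjlib pool, and the real record would embed the WSAOVERLAPPED structure. Locking is omitted; a real ioqueue would do all of this under the key's lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pending-op record; the real one would embed a WSAOVERLAPPED. */
typedef struct pending_op {
    struct pending_op *next;   /* link used while the op sits in the free list */
    int                in_use; /* 1 while the OS may still write to this op    */
} pending_op;

/* Hypothetical per-key cache: ops are recycled instead of freed one by one. */
typedef struct key_op_cache {
    pending_op *free_list;     /* unused ops, ready for reuse             */
    size_t      active_count;  /* ops handed out and not yet completed    */
    size_t      alloc_count;   /* total ops ever allocated for this key   */
} key_op_cache;

static void cache_init(key_op_cache *c)
{
    c->free_list = NULL;
    c->active_count = 0;
    c->alloc_count = 0;
}

/* Take an op from the free list, allocating only when the list is empty. */
static pending_op *cache_acquire(key_op_cache *c)
{
    pending_op *op = c->free_list;
    if (op != NULL) {
        c->free_list = op->next;
    } else {
        op = (pending_op *)malloc(sizeof(*op));
        if (op == NULL)
            return NULL;
        c->alloc_count++;
    }
    op->next = NULL;
    op->in_use = 1;
    c->active_count++;
    return op;
}

/* Call only after the completion for this op has been dequeued. */
static void cache_release(key_op_cache *c, pending_op *op)
{
    assert(op->in_use);
    op->in_use = 0;
    op->next = c->free_list;
    c->free_list = op;
    c->active_count--;
}

/* The key may move to the free-key list only when nothing is outstanding. */
static int cache_can_free_key(const key_op_cache *c)
{
    return c->active_count == 0;
}

/* Release all cached ops; valid only when no op is still in flight. */
static void cache_destroy(key_op_cache *c)
{
    assert(c->active_count == 0);
    while (c->free_list != NULL) {
        pending_op *op = c->free_list;
        c->free_list = op->next;
        free(op);
    }
    c->alloc_count = 0;
}
```

The `active_count` check is the part that matters for the crash discussed here: a key whose cache still has active ops must not be recycled, because the OS may still write into those records.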
Note: |
When |
Tried to run
Not sure if this is the same issue, but this assertion does not happen when using ioqueue select. |
@nanangizz without this patch, does this assert still occur?
Yes, same assert without this patch. |
pjlib/src/pj/ioqueue_winnt.c
Outdated
BOOL rc = fnCancelIoEx(key->ioqueue->iocp, (LPOVERLAPPED)&op->pending_key);
if (rc)
    PJ_PERROR(2, (THIS_FILE, PJ_RETURN_OS_ERROR(GetLastError()), "cancel io error"));
- actually when rc == 1, it succeeds
- shouldn't the handle be key->hnd instead of key->ioqueue->iocp?
You are right.
The MSDN doc says:
If this parameter is NULL, all I/O requests for the hFile parameter are canceled.
It's better to use fnCancelIoEx(key->hnd, NULL) to cancel all pending operations. I tested it and it looks okay.
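The two review points above (a nonzero return means success, and the call should target the socket handle rather than the completion port) can be sketched in portable C. This is an illustration only: `cancel_io_fn` is a hypothetical stand-in shaped like CancelIoEx so the logic can be exercised without Windows headers, and `demo_key`, `cancel_all_pending`, and the two fake functions are invented names, not pjlib code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in with the same shape as CancelIoEx:
 * returns nonzero on success and zero on failure. */
typedef int (*cancel_io_fn)(void *file_handle, void *overlapped);

/* Hypothetical, trimmed-down key holding just the two handles under discussion. */
typedef struct demo_key {
    void *hnd;   /* the socket/file handle that owns the pending I/O  */
    void *iocp;  /* the completion port; not the handle to cancel on  */
} demo_key;

/* Cancel every pending operation on the key's own handle.
 * Returns 0 on success, -1 on failure. */
static int cancel_all_pending(const demo_key *key, cancel_io_fn cancel_io)
{
    int rc;

    assert(key != NULL && cancel_io != NULL);

    /* A NULL overlapped pointer asks for all requests on the handle. */
    rc = cancel_io(key->hnd, NULL);
    if (rc == 0) {
        /* Zero is the failure case, so this is where the error is logged. */
        fprintf(stderr, "cancel io error\n");
        return -1;
    }
    return 0;
}

/* Test doubles that record their arguments (for illustration only). */
static void *last_handle;
static void *last_overlapped;

static int fake_cancel_ok(void *h, void *ov)
{
    last_handle = h;
    last_overlapped = ov;
    return 1;
}

static int fake_cancel_fail(void *h, void *ov)
{
    last_handle = h;
    last_overlapped = ov;
    return 0;
}
```

On Windows the real call can also fail simply because there was nothing left to cancel, so a production version would probably want to treat that case as benign rather than log it.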
I found the reason: the key is unregistered twice.
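One common way to make a second unregister call harmless is a "closing" flag that is checked and set once. The sketch below is purely illustrative and is not claimed to be the fix applied in this PR; `guarded_key`, `guarded_unregister`, and the return codes are hypothetical, and a real ioqueue would do the check-and-set while holding the key's lock.

```c
#include <assert.h>
#include <stddef.h>

enum { UNREG_OK = 0, UNREG_ALREADY = 1 };

/* Hypothetical key with just enough state to show the guard. */
typedef struct guarded_key {
    int closing;        /* set once unregistration has started          */
    int release_count;  /* how many times resources were really released */
} guarded_key;

/* Idempotent unregister: only the first call releases resources.
 * Assumption: the caller holds the key's lock, so the check and the
 * set below cannot interleave with another thread. */
static int guarded_unregister(guarded_key *key)
{
    assert(key != NULL);

    if (key->closing)
        return UNREG_ALREADY;   /* second call: nothing left to do */

    key->closing = 1;
    key->release_count++;       /* stands in for closing the handle etc. */
    return UNREG_OK;
}
```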
Thanks @jimying. Honestly I haven't had a chance to reproduce the original issue or test the proposed solution. I believe you are using this ioqueue in the real world, experienced the issue, and found that this solution works, is that correct? Next, here are a few notes about the proposed solution:
Also, this ioqueue has been disabled for quite some time, and some improvements in the ioqueue area may not have been integrated into it, e.g. group lock for key. So please understand that there may still be some steps required to enable this ioqueue again :)
@nanangizz I wrote a simple demo to reproduce the crash in msys2: #4172. I have tested it; with the old code it reproduces the crash 100% of the time. To test the new code, we can git cherry-pick the demo patch onto this branch.
Thanks @jimying. |
…D memory try to fix issue pjsip#985
Hi @jimying, please let us know whether you plan to incorporate @nanangizz's suggestions above.
@sauwming sorry, I will submit it as soon as possible, today or tomorrow.
new commits do:
@@ -133,12 +133,9 @@ struct pj_ioqueue_key_t
    int connecting;
#endif

#if PJ_IOQUEUE_HAS_SAFE_UNREG
    pj_atomic_t *ref_count;
I don't think we should remove all the code related to PJ_IOQUEUE_HAS_SAFE_UNREG. I believe what @nanangizz meant was that we need to make sure PJ_IOQUEUE_HAS_SAFE_UNREG is set to 1.
Since by default PJ_IOQUEUE_HAS_SAFE_UNREG is already 1, IMO adding a check below should be sufficient, just to make sure that users don't accidentally disable it:
#if PJ_IOQUEUE_HAS_SAFE_UNREG == 0
/* IOCP only works with PJ_IOQUEUE_HAS_SAFE_UNREG enabled, otherwise we will get memory error in ... */
#error "PJ_IOQUEUE_HAS_SAFE_UNREG must be enabled to use ioqueue IOCP"
#endif
Another alternative is to force enable it in pjlib/config.h, something like:
#if PJ_IOQUEUE_IMP == PJ_IOQUEUE_IOCP
# define PJ_IOQUEUE_HAS_SAFE_UNREG 1
#endif
But this is more complicated since we need to define PJ_IOQUEUE_IMP similar to PJ_SSL_SOCK_IMP.
The code paths for PJ_IOQUEUE_HAS_SAFE_UNREG = 0 are no longer needed; I think it's better to remove them.
It's still there in all other ioq backends.
Perhaps the unsafe unreg option will not be used in IOCP for a very long time (until there is a way to cancel an op immediately), so perhaps it is okay to remove it.
Btw, I think I will try to join the development, may I commit to the branch directly @jimying ?
@nanangizz ok, no problem
try to fix issue #985