Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable PendingJob use in PJSUA2 swig generated wrapper class #3774

Closed
wants to merge 5 commits into from

Conversation

trengginas
Copy link
Member

In order for PendingJob to be utilized, it must function as a base class. This patch will enable the use of PendingJob in SWIG generated wrapper class (e.g: java, cs, python).

sauwming
sauwming previously approved these changes Nov 8, 2023
@mattdimeo
Copy link

This was originally my suggestion, but I think something more is required, because the job processing code frees the pointer out from under the python (or whatever) interface and causes crashes.

@sauwming sauwming dismissed their stale review November 8, 2023 04:27

Solution not working as expected

@trengginas
Copy link
Member Author

The issue is the registered/added PendingJob will be deleted by the lib (e.g.: after execution). It will crash if the object is deleted by the app (double deletion). The purpose solution is to prevent the auto-deletion of PendingJob by specifying autoDelete to false.

@sauwming
Copy link
Member

sauwming commented Nov 9, 2023

The solution only prevents the deletion by the library, but will place the burden of the memory management of the jobs to the app, which may not be as simple and obvious how.
Perhaps an example is needed?

@nanangizz
Copy link
Member

This PR approach (the auto_delete flag) requires some additional steps for Python app (compared to C/C++ app whose auto-delete):

  1. maintain reference to the object to avoid premature destroy by garbage collector before the callback is called.
  2. delete/release-reference later after the callback is called or when the library is destroyed.
  3. if the destructor calls PJLIB API, be careful of pjlib-unregistered-thread issue.

I think the docs better describe the above, and as @sauwming mentioned, an example is even better.

There may be an alternative approach, using SWIG __disown__() method (see here), an idea is to modify the PendingJob constructor (in pjsua.i) to invoke __disown()__ so Python GC will not cause double destroy. This way the Python app will still need to do point 1 & 3 above. Also not sure if this is also applicable for other languages such as Java.

* If this is set to true, the job will be automatically be deleted
* after the job is executed.
*/
bool autoDelete;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if this is a private member, initialized via constructor param with the default is true (as existing)?

@trengginas
Copy link
Member Author

SWIG has option to set the object ownership, unfortunately the names are different for each language.
e.g: on python using __disown__, on java using swigCMemOwn.
Ideally, we want app to set them uniformly (e.g: using pjsua2.i).
On python, it is possible to use pythonappend to set the object ownership from the class constructor.
However, I'm haven't been able to get the same result with Java.

@@ -1162,11 +1162,24 @@ struct EpConfig : public PersistentObject
/* This represents posted job */
struct PendingJob
{
PendingJob():autoDelete(true){};

PendingJob(bool autoDel):autoDelete(autoDel) {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Instead of having two constructors, why not just one with default param, e.g: PendingJob(bool autoDel=true)?
  • IMO it is better to have the docs about maintaining reference to object PendingJob for Python/Java, manual delete, destructor called from PJLIB thread, and mini sample code (or link to sample) here?

/** Perform the job */
virtual void execute(bool is_pending) = 0;

bool getAutoDelete() { return autoDelete; };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO if this method is only useful for Endpoint, perhaps use class friend feature is cleaner?

@sauwming
Copy link
Member

sauwming commented Nov 15, 2023

The example seems rather complicated and places quite some responsibilities on the application side, that I think it may be worth redesigning the API and class.

Some (rough) ideas:

  • Create a new API: utilAddPendingJob2(PendingJob &job)
  • Get rid of autoDelete, we will be the one doing all the memory management. We will duplicate the job, maintain it, and delete it once done.

The objective is to make it as simple as possible for the user to execute a job in the main thread, so feel free to get creative and modify existing APIs and/or classes. If backward compatibility is a concern, we can simply keep the current APIs, and create a new set of APIs and classes instead (such as PendingJob2 or NewPendingJob).

@wosrediinanatour
Copy link
Contributor

wosrediinanatour commented Nov 15, 2023

In this case, is there a reason to use C++98/03?
It would be so nice to have more modern C++, which allows to use lambdas, like

template <typename Job>
void Endpoint::utilAddPendingJob(Job &&job)
{
... std::forward<Job>(job)...
}
...

ep.utilAddPendingJob([some_string](){cout << "Hello world" << some_string;});

(in C++ you could even use std::invocable<...> concept).
Or "at least" use std::funtion<...> similar to

void Endpoint::utilAddPendingJob(std::function<....> job)
{
...
}

@wosrediinanatour
Copy link
Contributor

  • Create a new API: utilAddPendingJob2(PendingJob &job)

The cool thing about C++ is that it supports overload sets. So there is no reason to use a new unction name, as long as the signature is different.

@nanangizz
Copy link
Member

We will duplicate the job, maintain it, and delete it once done.

IMHO duplicating the job may not be feasible, we don't really know about the job object so we don't know how to duplicate it, even if we could, it can be complicated, for example, if 'the job' has some reference counter to manage, the duplicate won't work with the ref counter?

@sauwming
Copy link
Member

sauwming commented Nov 15, 2023

I believe we can define a virtual clone function (virtual clone()) to achieve it.

But to avoid misunderstanding, it doesn't have to be this way, i.e. it doesn't have to involve duplication and such. The objective is to enable the user to execute a job in the main thread, in the simplest way possible. Surely there must be a solution for this?

The post by @wosrediinanatour suggest some alternatives that can be explored.

@wosrediinanatour
Copy link
Contributor

Just another idea: functionality looks similar to the ASIO's post function template: https://think-async.com/Asio/asio-1.18.0/doc/asio/reference/post.html - a very popular library, which is well designed.

@trengginas
Copy link
Member Author

I tried to implement the suggestion by @sauwming, by copying the input PendingJob.
Changes:

  • Add a new API to Endpoint, e.g: utilAddPendingJob2() which will copy the input so the caller will be safe to delete the job
  • Add a new API to clone the input PendingJob that will be used by utilAddPendingJob2()
  • Caller (PendingJob derived class) needs to implement the clone method and provide the method to copy the object. On python, the copy.copy() can be used instead of a copy constructor. On Java, copy constructor is required.
//Endpoint.hpp
struct PendingJob {
...
   virtual PendingJob *clone() const = 0;
...
};

class Endpoint {
...
   void utilAddPendingJob2(const PendingJob *job);
...
};
//Endpoint.cpp
void Endpoint::utilAddPendingJob2(const PendingJob *job)
{
    PendingJob *newJob = job->clone();
    utilAddPendingJob(newJob);
}

//Existing PendingJob derived class needs to implement clone()
#test.py
class MyJob(pj.PendingJob):
    def clone(self):
        #use copy instead copy constructor
        return copy.copy(self)  
//sample.java
class MyJob extends PendingJob
{
    @Override
    public PendingJob clone() {
        MyJob job = new MyJob(this);
        return job;
    }

   public MyJob(MyJob job) {}
}

@sauwming
Copy link
Member

Looks good! Does it work as expected?

If yes, IMO we can use this approach since it looks simple enough.
One suggestion is to name the new API utilAddPendingJob_Clone() instead; it's more descriptive than utilAddPendingJob2().

@trengginas
Copy link
Member Author

Yes, the test app doesn't crash and the execute() method was called from the derived class.

@nanangizz
Copy link
Member

Try to check if the clone() in Python/Java/C# works fine? E.g: add some attributes & init the values in constructor and check the values in execute(). Also check if both, original & cloned, are destroyed properly.

@trengginas
Copy link
Member Author

Further testing showed that the clone() method needs to return an object that will be safely deleted by the lib.
On python:

class TestJob(pj.PendingJob):
...
    def clone(self):
        newJob = TestJob(self)
        return newJob.__disown__()
...

On java:

class MyJob extends PendingJob
{
    @Override
    public PendingJob clone() {
        MyJob newJob = new MyJob(this);
        return newJob;
    }
    public MyJob(MyJob job) {
        ...
        this.swigCMemOwn = false;
    }
}  

We are back to the original issue, e.g: caller needs to know specific API from SWIG for this to work.
If this is allowed (e.g: caller is required to use specific SWIG generated API), then the simplest solution is to just use UtilAddPendingJob() and passing the object ownership.

@nanangizz
Copy link
Member

Another idea, maintain pending jobs references in a list in each target language, so pending jobs won't be destroyed by GC. Perhaps need to make PendingJob wrapper class, so after the execute() callback is invoked, the wrapper class will delete itself.

@wosrediinanatour
Copy link
Contributor

wosrediinanatour commented Nov 21, 2023

Just another note/question?. Why do you use a function clone() instead of defining a copy-assignment operator, i.e. operator=(const job&). Because this would be the C++ way of doing such things.

If you also need to shallow copy, then just define a move-assignment operator, i.e. operator=(job&&)

@sauwming
Copy link
Member

@wosrediinanatour, the issue is actually not with C++. Current PendingJob mechanism works fine with C++, but we need a solution that can make it work for other languages such as Java, Python, C#.

@trengginas
Copy link
Member Author

I tried using reference counting alternative as described in https://www.swig.org/Doc4.1/SWIGPlus.html#SWIGPlus_ref_unref.
The idea is to create a reference count mechanism (e.g: class) and use it as a base to the PendingJob class to enable the reference count capability. SWIG will call those method if required. (e.g: on object assignment the reference is added).

@sauwming
Copy link
Member

Nice finding.
In that document, there's also a proxy mechanism in 6.5.2 that seems more similar to what we've been trying to do so far (i.e. with disown). So, just wondering, what's the difference between the two mechanisms, and why the ref/unref is chosen here?

@trengginas
Copy link
Member Author

There are 2 alternatives:

  1. reference counting
  2. passing object ownership
  • With passing object ownership, the target language (e.g: python/java) will not be responsible with object destruction. The C++ object will not be destroyed if it's no longer referenced. The C++ object will be destroyed by the native/C++ part. e.g: after Job execute, the object will be deleted.
    With reference counting, the object destruction is based on the reference count. The object destruction can be triggered from the target language (e.g: the object is no longer referenced)
  • The passing object ownership is not uniform. e.g: on python by using __disown__(ref). On Java by using swigCMemOwn ref. C# doesn't mention this on its doc (ref). And I'm unable to make it work on Java and C#.

@sauwming
Copy link
Member

sauwming commented Dec 5, 2023

So is this PR done or still in progress?
Does the reference counting approach work as expected?
Why do the examples seem like not properly implemented (there are unused classes and the testing doesn't really demonstrate if the approach has indeed been proven to work)?

@trengginas
Copy link
Member Author

The reference count works as expected.
The sample will show the example use of PendingJob that will crash without the patch.
e.g: MyJobHub will add PendingJob, and the object will be double destructed.

@sauwming
Copy link
Member

sauwming commented Dec 26, 2023

From my own testing, it doesn't seem that the reference counting helps prevent the premature deletion.

In summary, it still fails both test 2 and test 3 below, the object Job will still be deleted regardless of the reference counting, and it will still crash.

This is the test code:

class TestJob(pj.PendingJob):
    def __init__(self):
        super().__init__()
        self.__id__ = randint(0, 100000)

    def execute(self, is_pending):
        print("executing job id: ", self.__id__, is_pending)

    def __del__(self):
        print("Job deleted", self.__id__)

class MyJobHub() :
    def __init__(self, ep):
        self.__ep__ = ep

    def setNewJob(self):
        print("creating job 2")
        job = TestJob()
        print("adding job 2")
        self.__ep__.utilAddPendingJob(job)

# Function to be executed in a separate thread
def my_function(ep):
    if threading.current_thread().name == "MainThread":
        print("This is the main thread.")
    else:
        print("This is not the main thread.")
    ep.libRegisterThread("thread")
    print("creating job 3")
    job = TestJob()
    print("adding job 3")
    ep.utilAddPendingJob(job)

def ua_pending_job_test():
    jobs = {}
    write("PendingJob test.." + "\r\n")
    ep_cfg = pj.EpConfig()
    ep_cfg.uaConfig.threadCnt = 0
    ep_cfg.uaConfig.mainThreadOnly = True

    ep = pj.Endpoint()
    ep.libCreate()
    ep.libInit(ep_cfg)
    ep.libStart()

    # simple test 1
    # adding job from the main thread-same function
    print("creating job 1")
    job = TestJob()
    print("adding job 1")
    ep.utilAddPendingJob(job)

    # simple test 2
    # adding job from the main thread-from a class' member func
    hub = MyJobHub(ep)
    hub.setNewJob()

    # real test 3
    # adding job from a different thread
    my_thread = threading.Thread(target=my_function, args=(ep,))
    my_thread.start()

    print("handling events")
    ep.libHandleEvents(10)

And here's the result:

creating job 1
add ref 1
adding job 1
add ref 2
executing job id:  72405 False
dec ref 1
creating job 2
add ref 1
adding job 2
add ref 2
executing job id:  28576 False
dec ref 1
Job deleted 28576
dec ref 0
This is not the main thread.
creating job 3
add ref 1
adding job 3
add ref 2
Job deleted 55048
dec ref 1
handling events
zsh: segmentation fault  python3 test.py

@sauwming
Copy link
Member

sauwming commented Jan 2, 2024

Continued in #3817.

@sauwming sauwming closed this Jan 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants