Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify ZooKeeper entry points by removing ZooSession #5124

Closed
ctubbsii opened this issue Nov 29, 2024 · 0 comments · Fixed by #5192
Closed

Simplify ZooKeeper entry points by removing ZooSession #5124

ctubbsii opened this issue Nov 29, 2024 · 0 comments · Fixed by #5192
Assignees
Labels
enhancement This issue describes a new feature, improvement, or optimization.
Milestone

Comments

@ctubbsii
Copy link
Member

I've been spending a lot of time lately looking at the ZooKeeper code, and there's a lot of room for improvement. For the longest time, I really couldn't understand what ZooSession is really doing that's important. After spending some more time on it, I've come to believe that it is merely allowing client object reuse for unclosed ZooKeeper objects globally. However, I believe this is no longer needed, because we are not arbitrarily constructing ZooKeeper instances through static utilities throughout our code like we may have once done.

In current Accumulo code, we now have ClientContext and ServerContext objects that maintain state for clients (AccumuloClient) and server-side processes (AbstractServer and utilities), respectively, and give a lifecycle to that state. There are only a few calls to ZooSession, and they are entirely in ZooReader and ZooReaderWriter, which are lightweight wrappers around ZooKeeper for convenient read/write operations. ZooReader/Writer are instantiated as fields inside ClientContext/ServerContext, so it is not necessary to have all this complexity.

I believe ZooSession can be removed, and ZooReader/Writer can be rewritten as decorators for a ZooKeeper that is directly stored as a field in ClientContext/ServerContext, and this would dramatically simplify much of our ZooKeeper code, as well as clean up some complexity in ClientContext/ServerContext.

Unless there's a good reason why ZooSession should stay, I intend to work on this.

@ctubbsii ctubbsii added the enhancement This issue describes a new feature, improvement, or optimization. label Nov 29, 2024
@ctubbsii ctubbsii added this to the 3.1.0 milestone Nov 29, 2024
@ctubbsii ctubbsii self-assigned this Nov 29, 2024
@ctubbsii ctubbsii linked a pull request Dec 17, 2024 that will close this issue
ctubbsii added a commit to ctubbsii/accumulo that referenced this issue Dec 23, 2024
* Replace existing ZooSession with a new one that is a facade for
  ZooKeeper. The methods from ZooKeeper we need are placed on it and
  internally it maintains a delegate ZooKeeper instance that handles
  automatically creating a new ZooKeeper session (client object)
  when the old session has died.
  This enables us to maintain fewer ZooKeeper objects by keeping only
  one at a time per ClientContext/ServerContext, and to reduce
  complexity substantially, where it was hard to reason about
  which ZooKeeper session we were using at any given moment. This
  no longer requires passing around ZooKeeper objects via ZooReader
  that called to the old ZooSession to construct a ZooKeeper session
  on demand. The new ZooSession is now a singleton field whose lifecycle
  is part of the context, and ZooReader/ZooReaderWriter are
  substantially simplified.
* Lazily construct objects in ServerInfo/ClientInfoImpl to simplify the
  implementation for the various use cases, and to ensure that we don't
  create more objects than needed.
* Improve debugging information to tie the ZooSession instances with
  their purpose (for example, for use with ServerContext, or for use
  with ClientContext, with the user's name)
* Get rid of ZooCacheFactory in favor of a lazily constructed ZooCache
  instance in the ClientContext, to make it more clear when a ZooCache
  is being shared or reused, and to remove a static singleton
* Move instanceId and instanceName lookup logic into ZooUtil
* Make ZookeeperLockChecker use its own ZooSession, because it is still
  a static singleton and must continue to operate after the context that
  constructed it is closed (should be fixed when apache#2301 is done)
* Perform some minor improvements to ZooCache to simplify its
  constructors now that it uses ZooSession, and to change the type of
  external watchers to Consumer, so they aren't as easily confused with
  actual ZooKeeper watchers
* Improve a lot of ZooKeeper related test code

Potential future work after this could include:

* Roll ZooReader and ZooReaderWriter functions directly into ZooSession
* Add support to ZooSession for more ZooKeeper APIs
* Handle KeeperException thrown from delegate that signals the session
  is disconnected, instead of relying only on the verifyConnected()
  method in ZooSession to update the delegate
* Handle InterruptedExceptions directly in ZooSession, so they don't
  propagate everywhere in the code in ways that are inconvenient to
  handle

This fixes apache#5124, apache#2299, and apache#2298
@asfgit asfgit closed this as completed in be22dee Jan 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement This issue describes a new feature, improvement, or optimization.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant