-
Notifications
You must be signed in to change notification settings - Fork 550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
transition to plugins for dedupe variables. #1085
Comments
I don't see a problem with just using a src layout? |
Also, what was "backwards incompatible" with it? Like any custom variables that got registered with dedupe by being implicitly part of the I would say for that I don't currently like how that registration is implicit via subclassing. I think it would be much better to have explicit @dedupe.register_variable decorator that people could put on their own classes, or even use a plugin system like pluggy |
with implicit namespaces you can't have an init.py in the top directory as far as i understand. so everything that's under dedupe right now would have to go into a subdirectory like "core" whatever. the reasons why we use subclassing is that was the main way to do it when we first wrote the code way back. there's not really any path forward that's not going to be a breaking change for existing plugins, so i think we should feel free to do what we want. i'm very open to more modern approaches here. |
@NickCrews, do you have an example of a use of pluggy that would be a good fit for dedupe variables? most of what i'm finding is about hooks, which doesn't seem quite right. |
I don't. And thinking about it more, pluggy seems a bit overkill. What about we do what pandas does with their plotting backend? https://plotly.com/python/pandas-backend/ Eg they say So if you specify the type of a variable as This is better because it is explicit where to go looking for the source code, there isn't some hidden registration behind the scenes like there is now. It is simpler for both variable authors and for dedupe maintainers than pluggy or some other system. This string spec is still not backwards compatible for plugin authors, so no additional benefit there. |
That's an interesting idea! I'm pretty far along on pluggy solution, so
i'll finish that up and then we can evaluate.
…On Sat, Dec 10, 2022 at 1:08 PM Nick Crews ***@***.***> wrote:
I don't. And thinking about it more, pluggy seems a bit overkill.
What about we do what pandas does with their plotting backend?
https://plotly.com/python/pandas-backend/
Eg they say pandas.options.plotting.backend = "plotly" and pandas goes
and tries to import a package named "plotly" and thn use a function called
"plot" in that package.
So if you specify the type of a variable as
"mylib.foo:ZipCodeVariable" then dedupe imports mylib.foo and looks for an
attribute named ZipCodeVariable. Using a colon as a separator as is used as
a convention elsewhere in python eg setuptools entry points.
This is better because it is explicit where to go looking for the source
code, there isn't some hidden registration behind the scenes like there is
now.
It is simpler for both variable authors and for dedupe maintainers than
pluggy or some other system.
I think this should be backwards compatible as well? If the string
contains a colon then do the new method, else use the old method?
—
Reply to this email directly, view it on GitHub
<#1085 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAEDC3OGO6TKEG3JQRI4JOLWMTBJDANCNFSM562GII2A>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
I looked at #1121, some specific comments there. Check out my solution at #1122. Another benefit I realized of using spec-strings instead of pluggy is that it is way more testable. Your PR didn't have tests, I'm guessing because they would be hard to write :) Mine was actually pretty easy to add some tests. |
newer versions of pip/setuptools are now confused by the old-style namespace packages.
i previously tried to migrate to the implicit style namespace packages, but the problem with that i don't think there's a way to do that this not backwards incompatible.
specifically, i don't see how you could use the implicit style namespaces and have something like
from dedupe import Dedupe
work.you could do
from dedupe.api import Dedupe
but that's not what we have guaranteed.so, then maybe we have to move to the src-layout
The text was updated successfully, but these errors were encountered: