Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference plugin variables using "module:class" strings #1122

Closed
wants to merge 5 commits into from

Conversation

NickCrews
Copy link
Contributor

@NickCrews NickCrews commented Dec 11, 2022

Will close #1085

This probably could do with some more polishing, eg removing type from every Variable class, and swapping if variable_type == "Interaction" to parsing the variable type, then checkingif isinstance(variable_type, dedupe.variables.InteractionType)

@coveralls
Copy link

coveralls commented Dec 11, 2022

Coverage Status

Coverage remained the same at 74.136% when pulling bad32f6 on NickCrews:cusotm-vars-str-spec into b3efd1e on dedupeio:main.

@fgregg
Copy link
Contributor

fgregg commented Dec 11, 2022

i like the simplicity of this approach, but it will be a breaking change for data models that include datetime. in this code this is only working because of the surviving name spacing bits we are trying to get away from.

it will also be a be a breaking change for data models that include the name, address, and fuzzy category types that are not bundled with dedupe but are recommended in the docs

while this pattern is nicer for devs, i don’t like that if a library’s file structure changes, that can break a downstream users code. the file structure should not have to be part of an external API (though i grant that it often is)

@NickCrews
Copy link
Contributor Author

i like the simplicity of this approach, but it will be a breaking change for data models that include datetime. in this code this is only working because of the surviving name spacing bits we are trying to get away from.

Hmmm, you're right. Two possible solutions I see:

  1. Merge the datetime package into this library. It's already listed as a dep of dedupe, and I assume that it isn't used by some 3rd package. I don't understand why datetime is given this special treatment but name, address, and fuzzycategory are not?
  2. Update the datetime package so that it exports itself in a more normal way. Then here in variables/__init__.py we can say from dedupe_variable_datetime import DateTimeType as DateTime

it will also be a breaking change for data models that include the name, address, and fuzzy category types that are not bundled with dedupe but are recommended in the docs

Hmm, that's a good point. I was just thinking of breaking changes to people using their own custom plugins. I think the best option here is temporarily have a hardcoded mapping for these edge cases (eg "FuzzyCategorical"->"dedupe_variable_fuzzycategory:FuzzyCategorical"), and if you use them then you get a deprecation warning. We can remove them later, after users get a chance to switch. I think eventually they should follow the same rules as other 3rd party plugin variables, since they are external and need to be pip installed.

while this pattern is nicer for devs, i don’t like that if a library’s file structure changes, that can break a downstream users code. the file structure should not have to be part of an external API (though i grant that it often is)

That's a good point. I think that is the responsibility of the plugin authors and is easily avoided. e.g. in a zipcodelib package, in the root __init__.py, they should from zipcodelib.foo.bar import ZipCodeVariable, so that users can just use "zipcodelib:ZipVariable". Then the plugin author is free to move ZipCodeVariable from zipcodelib/foo/bar.py to zipcodelib/baz.py, adjust the import in the root __init__.py, and everything still works.

fgregg added a commit that referenced this pull request Jun 27, 2024
set up data models like

```python
[
  dedupe.variables.String("name"),
  dedupe.variables.Exact("address")
 ]
 ```

instead of 

```
[
  {"field", "name", "type": "String"},
  {"field", "address", "type": "String"},
 ]
```

supersedes #1122, #1121. will close #1085
@fgregg
Copy link
Contributor

fgregg commented Jun 27, 2024

closed by #1193

@fgregg fgregg closed this Jun 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

transition to plugins for dedupe variables.
3 participants