JSON Schema #10
A couple little things, but generally this looks great and like a solid and important improvement. I'll let @flooie make the final call though, once he's back.
tests.py
Outdated
with open_schema() as schema_f:
    schema_data = schema_f.read()
    schema = json.loads(schema_data)

    try:
        jsonschema.validate(
            instance=instance,
            schema=schema,
        )
    except jsonschema.ValidationError as e:
        self.fail("JSON failed validation against schema")
I think this whole thing can be outdented, right?
Fixed
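For anyone reading along, a minimal sketch of what "outdented" means here: only the file read needs to sit inside the `with` block, and everything that uses the parsed schema can come after it. `load_schema` is a hypothetical helper name, not something from tests.py.

```python
import json


def load_schema(schema_file):
    """Parse a JSON schema from an open file object (hypothetical helper)."""
    with schema_file as schema_f:
        # Only the read needs the file to be open.
        schema = json.loads(schema_f.read())
    # Outdented: the file is closed by now, but the parsed dict is still usable,
    # so validation can happen at this indentation level.
    return schema
```

The same shape applies to the test: read and parse inside the `with`, then run the `try`/`except` around `jsonschema.validate()` one level out.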
tests.py
Outdated
with open_schema() as schema_f:
    schema_data = schema_f.read()
    schema = json.loads(schema_data)

    try:
        jsonschema.validate(
            instance=instance,
            schema=schema,
        )
    except jsonschema.ValidationError as e:
        self.fail("JSON failed validation against schema")
It seems like this validation would fail without a meaningful message. Is there a better way to do this that gives us something more about how the validation failed?
Yes, I'm pretty sure that's possible. jsonschema.validate() typically barfs more useful stuff when it raises its exceptions. I'll look into it.
Should be fixed, let me know what it does for you.
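A rough sketch of one way to get a more useful failure message, not necessarily what the fix in this PR does. It relies on the `message` and `absolute_path` attributes that jsonschema's `ValidationError` carries; the helper names `describe_validation_error` and `validate_or_explain` are made up for illustration.

```python
def describe_validation_error(error):
    """Turn a jsonschema ValidationError into 'where: what went wrong'."""
    # absolute_path is a deque of keys and indexes leading to the bad value.
    location = "/".join(str(part) for part in error.absolute_path) or "<root>"
    return f"{location}: {error.message}"


def validate_or_explain(instance, schema):
    """Return None if the instance validates, else a readable description."""
    # Third-party dependency, imported here so the formatter above has no
    # requirements of its own.
    import jsonschema

    try:
        jsonschema.validate(instance=instance, schema=schema)
    except jsonschema.ValidationError as e:
        return describe_validation_error(e)
    return None
```

In the test this could be used as `problem = validate_or_explain(instance, schema)` followed by `if problem: self.fail(problem)`, so the failure names the offending path instead of a fixed string.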
@@ -17440,7 +17426,7 @@
     }
   ],
   "name": "Decisions of the Federal Communications Commission",
-  "level": "",
+  "level": null,
This is philosophical, but generally I follow Django's approach to empty strings, which is:
> In most cases, it’s redundant to have two possible values for “no data;” the Django convention is to use the empty string, not NULL.
(https://docs.djangoproject.com/en/3.0/ref/models/fields/#django.db.models.Field.null)
I've argued about it before, but I think the convention makes enough sense that it's worth it just to be consistent throughout FLP. I guess I'd listen to opposing views, but I usually appreciate knowing that strings with no value are "" instead of either null or "".
Something tells me courts-db is already using null in this way though....hrm. If that's the case, I feel ambivalent about changing all of courts-db to use "" instead of null, though I still kind of feel like it'd be worth it.
Oof...I'm not well acquainted with Django-land, but that seems like a really weird convention. To me, empty string means empty string, not absence of data. How does that even really help? You're just doing a different check (if x == "" instead of if x is None).
In any event, for the level element here, it is constrained by an enum so you would never have to guess whether you're getting an empty string or a null. The enum in the schema tells you only one of those is valid. See: https://github.com/freelawproject/courts-db/pull/10/files#diff-11aaec965ced488b7af5aa03d35c580dR32-R43
Courts-DB was in fact using a lot of empty strings, but I changed them to nulls in a7297bc.
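To illustrate the enum point with a toy example: the schema fragment below is hypothetical (the real list of level values is in the linked diff), and `allowed_by_enum` is a hand-rolled stand-in for the enum check, not jsonschema itself.

```python
# Hypothetical enum for "level"; null is listed, the empty string is not.
LEVEL_SCHEMA = {"enum": ["iac", "gjc", None]}


def allowed_by_enum(value, schema):
    """Stand-in for JSON Schema's enum rule: the value must equal a listed entry."""
    # The type comparison keeps Python quirks such as 0 == False out of the way.
    return any(
        type(value) is type(option) and value == option
        for option in schema["enum"]
    )


print(allowed_by_enum(None, LEVEL_SCHEMA))
print(allowed_by_enum("", LEVEL_SCHEMA))
```

With an enum like this, null passes and "" is rejected, so a validated file can only ever contain one of the two "no data" spellings for that field.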
> In any event, for the level element here, it is constrained by an enum so you would never have to guess whether you're getting an empty string or a null
That assumes you remember that there's a schema and what's in it. The nice thing about sticking with the convention is that you always know that "no data" is "", no matter where you are in the code base.
> You're just doing a different check (if x == "" instead of if x is None)
In Django, you can almost always do x == "" precisely because the convention is there. Without it, you have to do x == "" or x is None, because "no data" can have multiple values.
If I win you over a bit, let's switch it back. If you feel like I'm wrong, and see no value in Django's approach, very well.
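A small sketch of the two checks being compared, with made-up function names. It only illustrates the argument and is not code from courts-db or Django.

```python
def has_no_data_with_convention(x):
    """With the empty-string convention, one comparison covers 'no data'."""
    return x == ""


def has_no_data_without_convention(x):
    """Without the convention, both spellings of 'no data' must be checked."""
    return x == "" or x is None


# A stray None slips past the single check, which is why the convention
# only helps if it is followed everywhere.
print(has_no_data_with_convention(None))
print(has_no_data_without_convention(None))
```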
I also just noticed that
Tests are all green now, which is weird because I expected the validation test to fail with something like this because of #8:
Looks like the culprit is the CI test configuration here, which isn't running that test: https://github.com/freelawproject/courts-db/blob/master/.github/workflows/tests.yml#L25
Is there a reason to limit what tests are run, or can we just make that bit:
    - name: Run tests
      run: |
        python tests.py
No reason, @anseljh. Want to fix the tests too?
(Or rather, fix the CI test config, is what I meant to say.)
Huh. I just turned on the CLA bot, but didn't realize it'd spam an actual IP lawyer. Ansel, maybe you have thoughts about it. I'm playing with it for this and a couple other smaller repos atm. We should get this merged too. I thought it was merged ages ago!
That's kind of awesome and funny. Happy to have a look at the CLA, but you'll probably need to remind me.
Sure. We'll need to re-run the tests.
Take a look up thread, there's a link where you can agree to it.
Wow. Didn't know there was a CLA-bot. I love it!
On Wed, Jan 19, 2022, 4:26 PM Mike Lissner wrote:
> That's kind of awesome and funny. Happy to have a look at the CLA, but you'll probably need to remind me.
> Take a look up thread, there's a link where you can agree to it.
It's new. I thought I flagged it for you, since I knew how you'd feel. It seems to be working well and uses the CLA you provided in 2009 or so.
On Wed, Jan 19, 2022, 21:29 Brian Carver wrote:
> Wow. Didn't know there was a CLA-bot. I love it!
This PR adds a JSON Schema. It also adds a unit test to tests.py that tries to validate the courts data file against that schema.

I fixed a few typos, and updated some bits of the court data so they would conform to the schema. For example, there were many empty strings that I turned into nulls and other small things like that. You can see these all in commit a7297bc.

Some things I did not fix yet, because I wasn't sure whether the data or the schema should change. Those are:
- level of the Georgia Superior Courts is "gjc & iac", which was odd.
- type of "non-trial". It's an unusual court, but I suggest we code it as "trial".
- type of "trial & iac".

Those three issues should be the last things remaining before the courts file validates.

I did not understand what is supposed to go in the jurisdiction and case_types items, so I left them mostly blank except for a description of "TODO".

Closes #2
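For reference, a sketch of how the empty-string-to-null cleanup described above could be scripted. This is an assumption about one possible approach, not how commit a7297bc was produced, and `empty_strings_to_none` is a hypothetical name.

```python
import json


def empty_strings_to_none(value):
    """Recursively replace "" with None in parsed JSON data."""
    if isinstance(value, str) and value == "":
        return None
    if isinstance(value, dict):
        return {key: empty_strings_to_none(item) for key, item in value.items()}
    if isinstance(value, list):
        return [empty_strings_to_none(item) for item in value]
    # Numbers, booleans, None and non-empty strings pass through unchanged.
    return value


# Made-up record shaped loosely like a court entry.
record = {"name": "Example Court", "level": "", "dates": [{"end": ""}]}
print(json.dumps(empty_strings_to_none(record)))
```

Python's None is written out as JSON null by `json.dumps`, so the cleaned data can be dumped straight back to the data file.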