When the dataset doesn't have a valid URL, it cannot be serialized to n3 and ttl formats #244
Comments
Hi @EricSoroos, could you please take a look at this issue? Logs can be found here.
Looking into this a bit: the error is that ckanext-dcat (and its dependency rdflib) strictly expects the URL to be a valid URL and doesn't trap the error or skip the field when it's invalid. We definitely have metadata that's not a valid URL, so the combination of that metadata not conforming to the field definition and the strict definition of the formats causes the error. 3 options to fix:
I think the response can be shorter. Something like: "Format not supported due to invalid URL". It would be good if you could also catch the exception and log a message that tells you exactly what the issue is, instead of the long stack trace.
That error message is 90% URL. That response is essentially catching the exception and returning something useful to the browser that will explain the situation and prevent a crawler from retaining it. I don't think we need to log it, since we know exactly what's causing it and can find all cases of this with a SQL query.
Ok @EricSoroos, my main reason for the above message was to avoid having so many error logs that don't help. I agree we can just ignore the exception and return a proper response to the user.
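For readers following the thread, the approach agreed on above (catch the serialization failure and return a short response instead of a stack trace) could be sketched roughly as below. This is not the fix that was deployed. The function name `serialize_or_explain`, the 404 status code, and the broad `except Exception` are all assumptions made for illustration; the thread does not say which exception type or status code the real handler uses.

```python
from typing import Callable, Tuple

# Short message, using the wording suggested earlier in this thread.
MESSAGE = "Format not supported due to invalid URL"


def serialize_or_explain(serialize: Callable[[], str]) -> Tuple[int, str]:
    """Run a serializer and return (status, body).

    `serialize` is a hypothetical zero-argument callable standing in for
    the real n3/ttl serialization step.
    """
    try:
        return 200, serialize()
    except Exception:
        # Assumption: any failure here is the invalid-URL case. A real
        # handler would catch the specific exception the library raises.
        # Assumption: 404 is used so that crawlers do not retain the page.
        return 404, MESSAGE


def failing_serializer() -> str:
    """Stand-in for a serializer that rejects an invalid URL."""
    raise ValueError("not a valid URI")
```

Calling `serialize_or_explain(failing_serializer)` yields the short message and no traceback, while a serializer that succeeds has its output passed through unchanged.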
Hi @EricSoroos, do you have any update on this? There have been a few emails regarding errors generated by these URLs.
Hi @EricSoroos, any update on this?
Hi @deirdrelee, do you know when this will get done? As with #249, it is hard to spot real problems in the logs because of the error logs generated by this issue.
I've pushed a fix for this to staging.
Thank you!
This has been fixed and deployed to production by @EricSoroos. Thank you!
Why
There are lots of errors reported because the turtle format and the notation3 format cannot be generated for datasets when the URL is not valid.
What
Notes
The URLs for the 2 formats are available in the source code of the dataset page (e.g. https://www.resourcedata.org/dataset/33a07bf8-35f4-45be-a951-b61aed8287ac).
Examples:
https://www.resourcedata.org/dataset/33a07bf8-35f4-45be-a951-b61aed8287ac
https://www.resourcedata.org/dataset/33a07bf8-35f4-45be-a951-b61aed8287ac.ttl
https://www.resourcedata.org/dataset/33a07bf8-35f4-45be-a951-b61aed8287ac.n3
https://www.resourcedata.org/dataset/f4d3130b-4557-47fb-b609-6b0080b05025
https://www.resourcedata.org/dataset/f4d3130b-4557-47fb-b609-6b0080b05025.ttl
https://www.resourcedata.org/dataset/f4d3130b-4557-47fb-b609-6b0080b05025.n3
https://www.resourcedata.org/dataset/7bbcb65a-653c-42ea-acb0-45943630bbef
https://www.resourcedata.org/dataset/7bbcb65a-653c-42ea-acb0-45943630bbef.ttl
https://www.resourcedata.org/dataset/7bbcb65a-653c-42ea-acb0-45943630bbef.n3
https://www.resourcedata.org/dataset/28350801-8f55-4155-81ca-874b94b0809d
https://www.resourcedata.org/dataset/28350801-8f55-4155-81ca-874b94b0809d.ttl
https://www.resourcedata.org/dataset/28350801-8f55-4155-81ca-874b94b0809d.n3
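Judging from the examples above, each format URL is the dataset page URL with `.ttl` or `.n3` appended. A minimal sketch for building the two URLs to check for a given dataset, where the helper name `rdf_urls` is hypothetical:

```python
from typing import Dict


def rdf_urls(dataset_url: str) -> Dict[str, str]:
    """Return the turtle and notation3 URLs for a dataset page URL.

    Follows the pattern visible in the examples: the format URL is the
    dataset URL plus a `.ttl` or `.n3` suffix.
    """
    base = dataset_url.rstrip("/")
    return {"turtle": base + ".ttl", "notation3": base + ".n3"}
```

For example, `rdf_urls("https://www.resourcedata.org/dataset/33a07bf8-35f4-45be-a951-b61aed8287ac")` gives the second and third URLs in the list above.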