I'm trying to fetch incoming links for pages and some docs cause a crash when calling getIncoming().
Trying to fetch incoming links for the article 'Europe' fails with:
=-=- http response error =-=-=-
https://en.wikipedia.org/w/api.php?action=query&lhnamespace=0&prop=linkshere&lhshow=!redirect&lhlimit=500&format=json&origin=*&redirects=true&titles=Europe&lhcontinue=566556
FetchError: invalid json response body at https://en.wikipedia.org/w/api.php?action=query&lhnamespace=0&prop=linkshere&lhshow=!redirect&lhlimit=500&format=json&origin=*&redirects=true&titles=Europe&lhcontinue=566556 reason: Unexpected token < in JSON at position 0
at X:\node-projects\wiki\node_modules\node-fetch\lib\index.js:273:32
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async getIncoming (X:\node-projects\wiki\node_modules\wtf-plugin-api\builds\wtf-plugin-api.cjs:110:31)
at async X:\node-projects\wiki\index.js:385:22 {
type: 'invalid-json'
}
hey Christoph, thanks for the good issue.
Yeah - I think you're right about a timeout for some pages. The api plugin loops around and fetches things 500 at a time.
I looked into the Python example - the getIncoming method only returns pages that are Wikipedia articles (namespace 0), not other Wikipedia-internal stuff. I think the Python discrepancy comes from User talk pages - haha, people are using this template on their profile pages.
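For anyone following along, here is a rough sketch of what that kind of loop looks like. It is not the plugin's actual code: `collectIncoming` and the injected `fetchJson` helper are made-up names for illustration, and the query parameters are simply copied from the failing URL above. It assumes the usual MediaWiki behaviour of returning a `continue` object until the last batch.

```javascript
// Sketch of a continuation loop over the MediaWiki `linkshere` query.
// `fetchJson` is a hypothetical helper: (url) => Promise<parsed JSON object>.
async function collectIncoming(title, fetchJson, limit = 500) {
  const base = 'https://en.wikipedia.org/w/api.php';
  const links = [];
  let cont = {}; // continuation parameters returned by the previous batch
  while (cont) {
    const params = new URLSearchParams({
      action: 'query',
      prop: 'linkshere',
      lhnamespace: '0', // articles only
      lhshow: '!redirect',
      lhlimit: String(limit),
      format: 'json',
      origin: '*',
      redirects: 'true',
      titles: title,
      ...cont,
    });
    const data = await fetchJson(`${base}?${params}`);
    // Pages are keyed by pageid; each may carry a `linkshere` array.
    for (const page of Object.values(data.query?.pages ?? {})) {
      links.push(...(page.linkshere ?? []));
    }
    cont = data.continue; // undefined once the last batch has been read
  }
  return links;
}
```

With roughly 86,000 incoming links, a page like Europe would need more than 170 round trips at 500 per request, so a single bad response anywhere in the loop is enough to fail the whole call.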
Please let me know if you can track down other cases with missing articles. The Europe case needs some thinking. Maybe we could try lowering the limit down from 500. The code is here if anyone is interested.
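One way the lower-limit idea could be tried out is sketched below. This is untested against the Europe case and not what the plugin does today: `parseJsonBody`, `fetchBatch`, `makeUrl` and `fetchText` are hypothetical names, and the fallback limits are arbitrary. The "Unexpected token <" in the trace suggests the server sent back HTML rather than JSON, so the sketch reports that clearly and retries the same batch with a smaller `lhlimit`.

```javascript
// Parse a body as JSON, but fail with a readable message when the server
// sent something else (for example an HTML error page starting with '<').
function parseJsonBody(text, url) {
  try {
    return JSON.parse(text);
  } catch (err) {
    throw new Error(`Non-JSON response from ${url}: ${text.slice(0, 80)}`);
  }
}

// Fetch one batch, falling back to smaller limits if a request fails.
// `makeUrl` is hypothetical: (limit) => request URL for this batch.
// `fetchText` is hypothetical: (url) => Promise<response body as string>.
async function fetchBatch(makeUrl, fetchText, limits = [500, 250, 100]) {
  let lastError;
  for (const limit of limits) {
    const url = makeUrl(limit);
    try {
      return parseJsonBody(await fetchText(url), url);
    } catch (err) {
      lastError = err; // remember the failure and try the next, smaller limit
    }
  }
  throw lastError;
}
```

If every limit fails, the last error is rethrown, so the caller still sees which URL broke and the first few characters of what came back.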
cheers
getIncoming() does work for other articles, for example 'Javascript' or 'Briefcase'.
I'm guessing this is related to the number of incoming links. The Europe article has 86,136 direct links according to https://linkcount.toolforge.org/?project=en.wikipedia.org&page=Europe&namespaces=
The article Python (programming language) has 9,467 links according to https://linkcount.toolforge.org/?project=en.wikipedia.org&page=Python%20(programming%20language)&namespaces= but I only get back 3,718 pageids when calling getIncoming().
Not a big deal, just thought I'd let you know though.