Add proxy management #34
Comments
I have added a 'proxies' parameter (a dictionary) to the functions .read(), .get() and .refresh(), and passed this parameter on to the requests and urllib library functions. This is implemented in version 0.2.3. I was unable to test this myself, so could you test it?
Thank you very much for that. I just tested it and it doesn't work. For urllib, urlopen(url) doesn't allow passing a proxy:
So I defined a sub that I call in the module.
As I hardcoded it, I only needed to do it once. Thanks again for trying to add this functionality to your project.
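The workaround described above (a small helper installed once) can be sketched with urllib's ProxyHandler. This is a hedged reconstruction, not the user's actual code; the function name and proxy address are placeholders:

```python
import urllib.request

def install_proxy(proxies):
    """Install a process-wide proxy for subsequent urllib.request calls.

    `proxies` is a dict such as {"http": "http://127.0.0.1:8080"}
    (placeholder address), the same shape the requests library uses.
    Once installed, urllib.request.urlopen(url) and
    urllib.request.urlretrieve(url, target_file) both go through the proxy.
    """
    handler = urllib.request.ProxyHandler(proxies)
    opener = urllib.request.build_opener(handler)
    urllib.request.install_opener(opener)
    return opener
```

Because install_opener changes global state, calling this once at module import time matches the "only needed to do it once" approach described in the comment.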
I forgot to add the proxies to requests.head. This is fixed now and available in 0.2.4. In eiopa_data.py, lines 79 to 82, the proxy handler is installed before the urllib.request.urlretrieve (but I had already done this in 0.2.3, so it's strange that it does not work for you). If you still have issues, you can send me the lines of code I have to change to make it work for you, or you can send me a pull request.
Hi wj, thanks for this new attempt. But then I got an error "<urlopen error [Errno 11001] getaddrinfo failed>" (with the read function).
and a call of that function in the get_links and check_if_download functions, but I then get an error: "can't concat str to bytes". I think I'll stay with my own modified 0.22 version that works (I'm not skilled enough in Python to dig deeper). Also a suggestion: a refresh() function with a date argument, so the database doesn't refresh from 2016 every time.
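The refresh-from-a-date suggestion could look something like the following helper, which enumerates month-end reference dates starting from a caller-supplied date instead of always starting from 2016. The name and signature are hypothetical, not the library's actual API:

```python
from datetime import date, timedelta

def month_end_dates(start, end):
    """Month-end reference dates from `start` to `end`, inclusive.

    Hypothetical building block for a refresh(from_date=...) option:
    the database would be rebuilt only from `start` onward, instead of
    re-downloading every reference date since 2016.
    """
    dates = []
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        # First day of the following month, minus one day = month end.
        nxt = date(y + 1, 1, 1) if m == 12 else date(y, m + 1, 1)
        dates.append(nxt - timedelta(days=1))
        y, m = nxt.year, nxt.month
    return dates
```

A refresh(from_date=...) wrapper could then loop over these dates and download only the missing ones.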
Description
Impossible to get the library to work when a proxy is required.
In the scraping.py module, I understand that there is no way to pass the proxies as a variable.
After the "import requests" there should be a way to specify the proxy, or to allow passing the variable to all the requests.get functions.
For example:
And all functions calling scraping.get_links, directly or indirectly, should allow passing the proxy.
I understand eiopa_data.py and rfr.py should also be modified to add proxy management to
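A minimal sketch of the requested change, assuming an optional proxies dict is threaded through to requests.get. The function below is a simplified stand-in for scraping.get_links, and the proxy address is a placeholder:

```python
import requests

def get_links(url, proxies=None):
    """Simplified stand-in for scraping.get_links with proxy support.

    `proxies` follows the requests convention, e.g.
    {"http": "http://127.0.0.1:8080", "https": "http://127.0.0.1:8080"}
    (placeholder address). When None, requests connects directly.
    """
    resp = requests.get(url, proxies=proxies, timeout=30)
    resp.raise_for_status()
    return resp
```

Callers higher up the stack would accept the same optional argument and pass it down unchanged, so a direct connection remains the default.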
resp = urllib.request.urlopen(url)
and
urllib.request.urlretrieve(url, target_file)
What I Did
import solvency2_data
d = solvency2_data.read("2017-12-31")
Traceback