M. Selenium, Chromium and Web Scraping

Bogdan Tudorache edited this page Feb 13, 2021 · 3 revisions

It is surprisingly hard to find a decent driver that lets you headlessly scrape annoying JavaScript web pages. I spent two days trying to install geckodriver and always got stuck on the same OSError: [Errno 8] Exec format error. Something was not right, but what?!

After endless Google searches I found that you can use Chromium as your go-to browser, but again I found myself in the same error rabbit hole 🐇. And what do you know? The same OSError: [Errno 8] Exec format error. I was going mad, but it turns out I was doing it all wrong, so without further ado, here are the steps:

A. Install Selenium

$ sudo pip3 install selenium
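A quick way to confirm the install worked, and to see which Selenium version pip picked, is to ask Python itself. This is only a sketch: the helper name `installed_version` is made up for illustration, and it assumes Python 3.8 or newer (for `importlib.metadata`).

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("selenium")
    if version is None:
        print("selenium is not installed for this Python interpreter")
    else:
        print("selenium version:", version)
```

If this reports that Selenium is missing even though pip3 succeeded, the script is probably being run with a different Python interpreter than the one pip3 installed into.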

B. Install Chromium web driver

You don't need the whole browser (I think); you only need the web driver. I have Chromium installed because it's faster, but since it's not that secure I only use it to browse StackOverflow.

I went through hundreds of web pages, all saying you should install venv, geckodriver, and a bunch of other stuff, but the solution was right there, in my face.

$ sudo apt-get install chromium-chromedriver

$ whereis chromedriver

chromedriver: /usr/bin/chromedriver
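The same lookup can be done from Python. An Exec format error generally means the system refused to execute the file, and a common cause is a driver binary built for a different CPU architecture than the machine it runs on. The sketch below prints the machine architecture next to the driver path so the two can be compared; `locate_driver` is a hypothetical helper name, and it assumes chromedriver is on your PATH.

```python
import platform
import shutil


def locate_driver(name="chromedriver"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = locate_driver()
    # e.g. an ARM board and an x86 desktop report different values here
    print("machine architecture:", platform.machine())
    print("chromedriver:", path if path else "not found on PATH")
```

Installing the driver through apt-get, as above, lets the package manager pick the build that matches your machine, which is likely why this route avoids the error.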

And now we run the script: selenium_headless_test.py
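The script itself is not reproduced on this page, so here is a minimal sketch of what selenium_headless_test.py could look like, not the original file. It assumes Selenium 4 (where the driver path is passed through a `Service` object) and the /usr/bin/chromedriver path reported by whereis; the `fetch_title` helper, the `HEADLESS_ARGS` list and the choice of flags are illustrative assumptions.

```python
import sys

# Path reported by `whereis chromedriver`; adjust if yours differs.
CHROMEDRIVER_PATH = "/usr/bin/chromedriver"
# Assumed flags: run without a visible window.
HEADLESS_ARGS = ["--headless", "--disable-gpu"]


def fetch_title(url, driver_path=CHROMEDRIVER_PATH):
    """Load `url` in headless Chromium and return the page title."""
    # Imported inside the function so the file still loads where Selenium is missing.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    for arg in HEADLESS_ARGS:
        options.add_argument(arg)

    driver = webdriver.Chrome(service=Service(driver_path), options=options)
    try:
        driver.get(url)
        return driver.title
    finally:
        # Always shut the browser down, even if loading the page fails.
        driver.quit()


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: python3 selenium_headless_test.py <url>")
    else:
        print(fetch_title(sys.argv[1]))
```

Run it with a URL of your choice as the only argument. On older Selenium 3 installs the `service=` argument is not accepted, so treat the constructor call as version dependent.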

Results:

[Screenshot: selenium_headless_test]


**Congrats, you're done!**

Conclusion

We have learned how to install Selenium, a portable framework for testing web applications and a powerful automation tool, and most of all the Chromium web driver 🚀

We've created our first web scraping script, which can extract the title of any given page without opening a Chromium window. It's especially helpful for scraping those pesky JavaScript pages.

If you hit a problem or have feedback (which is very welcome), please feel free to get in touch; more details are in the footer.