praveen97uma/Web-Crawler
Required libraries: BeautifulSoup (for parsing). On Ubuntu it can be installed with: sudo easy_install BeautifulSoup

Run 'python crawler.py' in a terminal to start the crawler. A prompt asks for the starting URL. Enter 'www.iitr.ac.in' and the crawler starts crawling, following only URLs in the same domain. The crawler also keeps a log of all discarded, crawled, and already-visited links in log.txt.

*The crawler still needs a lot of improvement.*
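The same-domain rule and the three link categories described above can be sketched roughly as follows. This is an illustration only, not the code in crawler.py: it assumes Python 3 and uses the standard library's html.parser in place of BeautifulSoup so that it runs without extra installs. The names LinkParser, normalize, same_domain, and classify_links are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag (a stand-in for BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def normalize(url):
    """Add a scheme to bare input such as 'www.iitr.ac.in' (assumed behavior)."""
    return url if "://" in url else "http://" + url


def same_domain(url, start_url):
    """True when both URLs have the same host name."""
    return urlparse(url).netloc.lower() == urlparse(start_url).netloc.lower()


def classify_links(html, page_url, start_url, visited):
    """Split the links on one page into to-crawl, discarded, and already-visited."""
    parser = LinkParser()
    parser.feed(html)
    to_crawl, discarded, already_visited = [], [], []
    for href in parser.hrefs:
        link = urljoin(page_url, href)  # resolve relative links against the page
        if not same_domain(link, start_url):
            discarded.append(link)      # outside the starting domain
        elif link in visited:
            already_visited.append(link)
        else:
            visited.add(link)
            to_crawl.append(link)
    return to_crawl, discarded, already_visited


# Small offline demo; the HTML string is made up for illustration.
start = normalize("www.iitr.ac.in")
visited = {start}
page = (
    '<a href="/about">About</a> '
    '<a href="http://example.com/">External</a> '
    '<a href="http://www.iitr.ac.in">Home</a>'
)
print(classify_links(page, start, start, visited))
```

In a full crawler, each of the three returned lists would be appended to log.txt, and the to-crawl links would be fetched in turn and passed through the same function.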
About
Code Arx Project