Web Scraping and Social Media Scraping 2400-DS1WSMS
Topics covered:
- responsible scraping & good practices
- website structure
- scraping static websites using the Beautiful Soup package
- scraping static websites using the Scrapy package
- scraping static and dynamic websites using the Selenium package
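As a rough illustration of the first three topics, the sketch below checks a robots.txt policy before keeping any link found on a static page. It is only a minimal example under stated assumptions: the robots.txt rules, the HTML snippet, the `https://example.org` base URL, the `course-bot` user agent, and the `LinkCollector` and `allowed_links` names are all hypothetical, and the Python standard library (`html.parser`, `urllib.robotparser`) stands in for Beautiful Soup so that it runs without extra installs.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt and page content, inlined so nothing is fetched.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

PAGE_HTML = """\
<html><body>
  <h1>Course list</h1>
  <a href="/courses/1">Web Scraping</a>
  <a href="/private/grades">Grades</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def allowed_links(html, robots_txt, base="https://example.org", agent="course-bot"):
    """Return the links in `html` that `robots_txt` permits `agent` to fetch."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())  # parse the rules without any network access

    collector = LinkCollector()
    collector.feed(html)

    return [link for link in collector.links if rules.can_fetch(agent, base + link)]


if __name__ == "__main__":
    print(allowed_links(PAGE_HTML, ROBOTS_TXT))
```

In a real project the page and the robots.txt file would be downloaded, ideally with a delay between requests, and a parser such as Beautiful Soup would typically replace the hand-written `LinkCollector`.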
Learning outcomes
- the student is familiar with web scraping
- the student is familiar with web scraping tools
- the student is able to use knowledge of web scraping to conduct their own research
- the student is able to collect and process data independently
- the student is able to work in groups
- the student is able to formulate and express their opinion in a discussion
K_W01, K_U01, K_U02, K_U03, K_U04, K_U05, KS_01, K_U06
Assessment criteria
Students will be assessed on: the preparation of project work (40%), its presentation (20%) and a written test (40%). A score of at least 50% in each of the above elements is required to pass. The projects are submitted in the form indicated by the tutors.
Bibliography
R. Mitchell (2018). Web Scraping with Python: Collecting Data from the Modern Web. 2nd Edition. O’Reilly Media.
Additional information
Additional information (registration calendar, class instructors, location and schedule of classes) may be available in the USOSweb system.