Web Scraping and Social Media Scraping 2400-DS1WSMS

Topics covered:
- responsible scraping and good practices
- website structure
- scraping static websites using the Beautiful Soup package
- scraping static websites using the Scrapy package
- scraping static and dynamic websites using the Selenium package
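
As an illustration of the first topic, responsible scraping usually starts with checking a site's robots.txt before requesting any page. The following is a minimal sketch using only Python's standard library, not course material: the robots.txt content, the `course-demo-bot` user agent, and the example.com URLs are all hypothetical, and the rules are inlined so the sketch runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

# Hypothetical user agent name for the scraper.
USER_AGENT = "course-demo-bot"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules allow user_agent to request url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# Check two hypothetical URLs against the rules above.
print(may_fetch(parser, USER_AGENT, "https://example.com/articles/1"))
print(may_fetch(parser, USER_AGENT, "https://example.com/private/data"))

# Delay (in seconds) the site asks crawlers to wait between requests, if any.
print(parser.crawl_delay(USER_AGENT))
```

Against a live site, one would typically call `set_url(...)` and `read()` on the parser instead of `parse(...)`, and pause between requests for at least the reported crawl delay.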

Estimated student workload (hours):

Type of activity             C (contact)   S (independent)
lecture (classes)            0             0
exercises (classes)          15            30
exam                         0             0
consultations                15            0
preparation for exercises    0             15
preparation for lectures     0             0
preparation for test         0             0
preparation for exam         0             0
...                          0             0
Total                        30            45

Total workload: 30 + 45 = 75

Course coordinators

Jacek Lewkowicz
Maciej Świtała
Ewa Weychert

Type of course

obligatory courses

Learning outcomes

- the student is familiar with web scraping
- the student is familiar with the tools of web scraping
- the student is able to use knowledge of web scraping to conduct their own research
- the student is able to collect and process data independently
- the student is able to work in groups
- the student is able to formulate and express their opinion in a discussion
K_W01, K_U01, K_U02, K_U03, K_U04, K_U05, KS_01, K_U06

Assessment criteria

Assessment covers the project work (50%) and its presentation (50%). At least 50% of the points from both of these components is required. The project must be submitted in the form specified by the lecturers.

Bibliography

R. Mitchell (2018). Web Scraping with Python: Collecting Data from the Modern Web. 2nd Edition. O’Reilly Media.