R: intro / data cleaning and imputation R / basics of visualisation 2400-DS1R
The main aim of the course is to familiarise students with advanced statistical software that requires coding skills. The course focuses on:
i) setting up an efficient work environment
ii) preparing and tidying different types of data sets
iii) visualising data.
Students will learn enough of the R language to use R packages and to develop their own applications based on available packages. In addition to IT skills, the course presents empirical applications of R to encourage students to use it in their own areas of interest.
The topics include:
• R and RStudio basics – graphical interface/editor, projects, package installation
• Data import
• Basics of commands, types of objects and data structures
• Basic data management – operations on data frames, handling missing values, factors, combining data, sorting, subsetting
• Basic plots – bar plots, pie charts, histograms, box plots, scatter plots, mosaic plots and correlograms
• R as the analytical solution – descriptive statistics, frequency and contingency tables, correlations, example parametric and nonparametric tests
• Basics of user-written functions, control flow
• RMarkdown report preparation
• Introduction to the %>% pipe in R – basics of the tidyverse
The list may be extended, depending on the initial skills and the interest of the group.
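As an illustration of the data-management topics listed above, here is a minimal base-R sketch. It is not taken from the course materials: the choice of the built-in `airquality` data set, the mean imputation, and the 90-degree threshold are assumptions made only for this example.

```r
# Illustrative sketch only: uses the airquality data set shipped with base R.
aq <- airquality

# Handling missing values: count NAs per column
na_counts <- colSums(is.na(aq))

# Simple mean imputation of Ozone (an assumption for illustration,
# not a recommended general-purpose method)
aq$Ozone[is.na(aq$Ozone)] <- mean(aq$Ozone, na.rm = TRUE)

# Factors: turn the numeric month (5 to 9) into a labelled factor
aq$Month <- factor(aq$Month, levels = 5:9, labels = month.abb[5:9])

# Subsetting and sorting: hot days, warmest first
hot <- aq[aq$Temp > 90, ]
hot <- hot[order(hot$Temp, decreasing = TRUE), ]

# Descriptive statistics: mean ozone level per month
ozone_by_month <- aggregate(Ozone ~ Month, data = aq, FUN = mean)

print(na_counts)
print(head(hot))
print(ozone_by_month)
```

The same steps could also be written with tidyverse verbs and the %>% pipe; base R is used here so that the sketch runs without installing any packages.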
Learning outcomes
After this course the student:
- is familiar with the R/CRAN computational environment
- is able to use different data sets to conduct their own research
- obtains and processes data independently
- is capable of visualizing basic data structures and solving simple research problems
- is able to troubleshoot their own code and find solutions to simple code-related issues
K_U02, K_U05
Assessment criteria
The final grade includes:
• points for tasks solved independently in class and as homework (30 points),
• written exam (70 points),
• extra points for activity.
Grades:
Points      Grade
[0, 50]     ndst (2)
(50, 60]    dst (3)
(60, 70]    dst+ (3+)
(70, 80]    db (4)
(80, 90]    db+ (4+)
(90, 100]   bdb (5)
> 100       bdb! (5!)
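The grading scale above can be expressed as a short R sketch using `cut()`. The function name `points_to_grade` is hypothetical and introduced only for this illustration; the interval boundaries follow the table, with the lowest interval closed on both sides.

```r
# Hypothetical helper (not part of the course materials): maps a numeric
# point total to the grade labels from the table above.
points_to_grade <- function(points) {
  cut(points,
      breaks = c(0, 50, 60, 70, 80, 90, 100, Inf),
      labels = c("2", "3", "3+", "4", "4+", "5", "5!"),
      right = TRUE,           # intervals are closed on the right: (50, 60]
      include.lowest = TRUE)  # makes the first interval [0, 50]
}

# Boundary values fall into the lower grade because the intervals are right-closed
grades <- as.character(points_to_grade(c(0, 50, 50.5, 60, 85, 100, 104)))
print(grades)
```

Totals above 100 are possible because of the extra points for activity, which is why the last break is `Inf`.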
Bibliography
- own materials
Compulsory literature:
• Wickham, H., & Grolemund, G. (2016). R for Data Science.
• Kabacoff, R. (2011). R in Action: Data Analysis and Graphics with R.
Additional (only in Polish):
• Biecek, P. (2008). Przewodnik po pakiecie R. Oficyna Wydawnicza GIS, Wrocław.
Additional information
Additional information (registration calendar, instructors, class locations and schedules) may be available in the USOSweb system.