In this tutorial, you'll learn the basics of web scraping with the rvest package. We'll start with a discussion of the ethics of scraping and the basic structure of an HTML page. You'll then learn about CSS selectors and how you can use them to identify the "rows" and "columns" of the data that you want to extract. Finally, you'll write R code that uses the rvest package to turn web pages into tidy data frames. We'll also see how you can scrape paginated sites by combining rvest with httr2, and learn two techniques for scraping dynamic sites that generate HTML with JavaScript.
Please install the following packages prior to the tutorial:
# install.packages("pak")
pak::pak(c("tidyverse", "chromote"))
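As a small taste of the "rows" and "columns" idea described above, here is a minimal sketch rather than actual tutorial material. The HTML snippet, the class names (`item`, `name`, `price`), and the values are all made up for illustration, and the page is built inline with `minimal_html()` so nothing is fetched from the web. rvest is installed as part of the tidyverse but has to be loaded explicitly.

```r
library(rvest)
library(tibble)

# Hypothetical inline page (invented for this sketch), so no network access is needed
page <- minimal_html('
  <ul>
    <li class="item"><span class="name">Item A</span> <span class="price">10</span></li>
    <li class="item"><span class="name">Item B</span> <span class="price">20</span></li>
  </ul>
')

# "Rows": select one element per observation with a CSS selector
rows <- html_elements(page, "li.item")

# "Columns": extract one value from within each row
items <- tibble(
  name  = rows |> html_element(".name") |> html_text2(),
  price = rows |> html_element(".price") |> html_text2() |> as.numeric()
)

items
```

Using `html_element()` (singular) inside each row returns exactly one result per row, with a missing value where the selector finds nothing, which keeps the columns the same length.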
Registration: To add this tutorial to your registration, log in to your existing registration, click the Modify Registration button, and navigate to the Reg Options page (page 4). Select the tutorial you want to attend.