useR! 2024
In Person
8 - 11 July, 2024

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for useR! 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Central European Summer Time (UTC+02:00). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

The virtual program will take place on 2 July. Please see the virtual schedule page for more information.
Open and reproducible science
Monday, July 8
 

09:00 CEST

Tutorial: Data Anonymisation for Open Science - Jiří Novák & Oscar Thees, UZH, FHNW; Marko Miletic, Bern University of Applied Sciences; Alžběta Beranová, Czech Statistical Office [Pre-Registration Required]
Monday July 8, 2024 09:00 - 12:30 CEST
One of the key elements of open science is open data that are available to a wide spectrum of users. Unfortunately, many datasets cannot be publicly available mostly for privacy reasons because data protection laws fundamentally restrict personal data use. In this tutorial, we will go through methods of statistical disclosure control with different anonymisation approaches that can be used to protect data confidentiality. These methods either modify or synthesise data so that they can be disclosed without revealing confidential information that may be associated with specific respondents. In particular, we will discuss non-perturbation and perturbation methods and also methods for synthetic data generation. For these purposes, the usage of packages sdcMicro, simPop, and synthpop will be shown.

Registration:
To add this tutorial to your registration, log in to your existing registration, click the Modify Registration button, and navigate to the Reg Options page (page 4). Select the tutorial you want to attend.
Speakers

Jiří Novák

Ph.D. student, UZH, FHNW
Jiří Novák received his first doctorate in Statistics from the Prague University of Economics and Business and is currently pursuing a second doctorate at the University of Zurich. In his previous research, he focused on statistical disclosure control for microdata from population...

Marko Miletic

Scientific project collaborator and software developer, Bern University of Applied Sciences
Marko Miletic is a scientific project collaborator and software developer at the Institute for Optimisation and Data Analysis at the Bern University of Applied Sciences where he focusses on the development and advancement of anonymization methods for event and longitudinal data for...

Oscar Thees

FHNW
Oscar Thees is an economist by training and a research associate at the Empirical Economic and Social Research group at the University of Applied Sciences Northwestern Switzerland. He is also doing his PhD at the Vienna University of Technology on the topic of anonymization of event...

Alžběta Beranová

Czech Statistical Office
Monday July 8, 2024 09:00 - 12:30 CEST
Tennegau
 
Wednesday, July 10
 

13:30 CEST

CRANhaven - Your backup repository for recently archived CRAN packages - Lluís Revilla, IrsiCaixa & Henrik Bengtsson, University of California San Francisco (UCSF)
Wednesday July 10, 2024 13:30 - 15:00 CEST
TBD
The Comprehensive R Archive Network (CRAN) provides the R community with more than 20,000 well-tested community-contributed R packages. One cornerstone of R is trust and correctness, which is why all CRAN packages undergo a rich set of checks - when first submitted but also daily.

R introduces new checks regularly, which means existing packages may start failing. If issues are severe enough, the CRAN Team asks the maintainer to submit a corrected version within, typically, two weeks. If not updated in time, the package is “archived” and is no longer available via traditional installation methods. As there is no public notice ahead of time, archiving of packages is a sudden, disruptive, and sometimes also blocking event for users and developers, resulting in wasted time and resources.

We have studied the archival-unarchival of CRAN packages. We will present the most common reasons for packages being archived, and how often and when they are unarchived. Based on these findings, we propose CRANhaven (https://www.cranhaven.org) - a package repository designed to mitigate the negative impact that suddenly archived packages have on the community.
Speakers

Lluís Revilla

Dr, IrsiCaixa
Bioinformatician at IrsiCaixa. Interested in R packages quality and R repositories.

Henrik Bengtsson

Henrik Bengtsson, University of California San Francisco (UCSF)
UCSF, R Foundation, R Consortium. MSc in Computer Science, PhD in Mathematical Statistics. Applied, large-scale research in Bioinformatics and Genomics. R since 2000.
Wednesday July 10, 2024 13:30 - 15:00 CEST
TBD
 
Thursday, July 11
 

10:30 CEST

Performance Testing and Comparative Benchmarking for data.table - Doris Afriyie Amoakohene, Northern Arizona University
Thursday July 11, 2024 10:30 - 10:35 CEST
The data.table package in R is a powerful tool for data analysis, combining efficient C code with user-friendly R syntax. To ensure its long-term sustainability, the NSF POSE program has funded a project from 2023 to 2025 to build a self-sustaining ecosystem around data.table.

In this presentation, we will discuss the importance of performance testing in the development of data.table and present a general approach that can be applied to other R packages. By creating performance tests based on historical regressions, we can track the package's execution time and memory usage over time, ensuring that code changes and new releases do not degrade its performance. We will demonstrate the use of the atime package to benchmark execution time and memory usage, giving developers confidence that performance and reliability are maintained. This approach not only benefits data.table but also serves as a model for other R package developers to enhance the performance and popularity of their own projects.
Speakers

Doris Afriyie Amoakohene

Performance Testing and Comparative Benchmarking for data.table, Northern Arizona University
Doris holds a BSc in Statistics and is currently pursuing a master's degree in Informatics at Northern Arizona University. She is the Founder and CEO of LAG Prestige Foundation. Additionally, Doris is a Research Assistant in a machine learning lab and actively involved...
Thursday July 11, 2024 10:30 - 10:35 CEST
Pongau + Flachgau

11:50 CEST

Fifteen Years of the R Journal - Mark van der Loo, Statistics Netherlands
Thursday July 11, 2024 11:50 - 12:10 CEST
The first issue of the R Journal was published in June 2009. Run by volunteers from academia, government and industry, the journal has grown into an increasingly popular outlet for scientific research on anything related to R. At the time of writing, the Journal has an impact factor of 1.673. In this talk I will look back at the origins and history of the R Journal, the people involved, and the formal organisation of the journal, including associate editors, editors, and the advisory board. We will take a detailed look at the current editorial process and at how issues are produced in HTML and PDF format. This will yield extensive tips and tricks that help aspiring authors get their submissions processed quickly. Finally, we will look into future developments of the R Journal.
Speakers

Mark van der Loo

Senior Researcher, Statistics Netherlands
Mark is a Senior Researcher at Statistics Netherlands and a Research Fellow at the Leiden Institute for Advanced Computer Science at the University of Leiden. Mark published his first package in 2009 and has since co-authored about 20 R packages, a book on statistical data cleaning...
Thursday July 11, 2024 11:50 - 12:10 CEST
Salzburg I
 