useR! 2024: Full Schedule

In Person
8 - 11 July, 2024

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for useR! 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is displayed in Central European Summer Time (UTC+02:00).

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

The virtual program will take place on 2 July. Please see the virtual schedule page for more information.

13:30 CEST

A guide to R packages for synthetic data generation - Michael Kammer, University of Vienna

Wednesday July 10, 2024 13:30 - 15:00 CEST

TBD

Statistical method development is partly driven by applications and the complexities of real-world datasets. However, sharing these datasets is often difficult because of legal, ethical or practical concerns, which makes creating synthetic data that closely reproduce the real-world data an attractive way to circumvent such issues. Similarly, generating realistic data is important for method comparison studies, which are crucial for establishing the evidence base for statistical methods.

Yet there seems to be little consensus on how to actually code data generators. As a first step towards making the coding of simulations more accessible, we provide a systematic scoping review of existing R packages that support data generation (results publicly available on OSF). We will also include our own package, which aims to complement the existing ecosystem by building a library of interesting data generators derived from real-world datasets.

A single tool is not enough to fit all needs, so we will discuss how these tools help you to support open science principles by facilitating sharing of data from your own research, or by generating data for your own methods development.

Speakers

Michael Kammer

Statistical modelling, Poster Session

Combining probabilistic forecasts with the `gamstackr` package - Euan Enticott, University of Bristol

Wednesday July 10, 2024 13:30 - 15:00 CEST

TBD

Ensemble models are increasingly popular tools for capturing heterogeneous information and improving predictive performance. We will present the `gamstackr` R package, which provides tools for aggregating or "stacking" the probabilistic forecasts produced by different models or "experts". In particular, the package implements a versatile, easy-to-use framework for probabilistic stacking that allows users to control the experts’ weights via additive models containing fixed, random or smooth effects. It also provides statistical and computational scalability in the number of experts by exploiting context-specific relationships between them.

We will illustrate the typical workflow of the `gamstackr` package, that is, how to create a heterogeneous set of experts, build and fit several types of stacking models, and visualise the ensemble weights and their relationship with the covariates. The package is currently available at https://github.com/eenticott/gamstackr.
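
To make the idea of covariate-dependent stacking concrete, the following is a minimal Python sketch, not the `gamstackr` API. It assumes two hypothetical Gaussian experts and a simple linear score per expert in place of the additive models described above; all function names and numbers are illustrative assumptions.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a Normal(mean, sd) distribution evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def softmax(scores):
    """Map unconstrained scores to positive weights that sum to one."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def stacked_density(y, x, experts, coefs):
    """Mixture density at y, with expert weights depending on covariate x.

    experts: list of (mean, sd) pairs, one Gaussian forecast per expert.
    coefs:   list of (intercept, slope) pairs, one linear score per expert
             (a stand-in for an additive model; an assumption of this sketch).
    """
    scores = [a + b * x for a, b in coefs]
    weights = softmax(scores)
    return sum(w * normal_pdf(y, m, s) for w, (m, s) in zip(weights, experts))

# Hypothetical experts and score coefficients.
experts = [(0.0, 1.0), (2.0, 0.5)]
coefs = [(0.0, 0.0), (0.0, 1.0)]  # the second expert gains weight as x grows
weights_at_zero = softmax([a + b * 0.0 for a, b in coefs])
```

At `x = 0` both scores are equal, so the two experts share the weight evenly; for larger `x` the stacked forecast leans towards the second expert.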

Speakers

Euan Enticott

Statistical modelling, Poster Session

CompInt: A Package for Interpretable and Comparable Reporting of Effect Sizes - Hannah Schulz-Kümpel, Department of Statistics, LMU Munich

Wednesday July 10, 2024 13:30 - 15:00 CEST

TBD

Ever struggled with how to report and explain the results of a statistical model you just fit? Do not worry, the CompInt R package is here to help you with this all-too-common problem! In fact, misinterpretations of statistical significance and classical effect measures like odds ratios are widespread, even among researchers familiar with their definitions. Moreover, there is currently no uniform gold standard for comparing or aggregating the results of several different models, as multi-analyst studies and meta-analyses aim to do. Based on [Kümpel & Hoffmann](https://arxiv.org/pdf/2211.02621.pdf), the CompInt package implements a general reporting framework that allows for the consistent derivation of effect size measure definitions and visualization techniques aimed at maximizing the interpretability and comparability of regression results. This session will highlight the importance of transparent reporting, explain the possible specifications of the framework, and showcase applications of the CompInt package.

Speakers

Hannah Schulz-Kümpel

M.Sc., Department of Statistics, LMU Munich

After receiving her Bachelor's in Mathematics from Heidelberg University and Master's in Statistics from LMU Munich, Hannah Schulz-Kümpel is now a PhD student at the ‘Konrad Zuse School of Excellence in Reliable AI’ (relAI) under the supervision of Bernd Bischl.

Statistical modelling, Poster Session

Improving the Modeling of Binary Regression Based on New Proposals for Statistical Diagnostics - Alejandra Andrea Tapia Silva, Pontificia Universidad Católica de Chile

Wednesday July 10, 2024 13:30 - 15:00 CEST

TBD

Binary regression models using logit or probit link functions have been widely employed in examining the relationship between binary responses and covariates. However, misspecification of the link function can result in poor model fit and compromise the significance of covariate effects. In this study, we present a local influence diagnostic method associated with a new family of link functions that allows evaluating the sensitivity of symmetric links towards asymmetric ones. This new family offers a comprehensive model that encompasses nested symmetric cases. Furthermore, we present a local influence diagnostic method to evaluate the sensitivity of odds ratios. Monte Carlo simulations are performed to evaluate both the performance of the diagnostic method and the parameter estimation of the overall model, complemented by illustrations using medical data related to menstruation and respiratory problems. The results confirm the effectiveness of our proposal, highlighting the critical role of statistical diagnostics in modeling.

Speakers

Alejandra Andrea Tapia Silva

Dr., Pontificia Universidad Católica de Chile

"I'm Alejandra, an assistant professor in the Statistics Department at PUC, Chile. I'm part of R-Ladies and I love statistical modeling, R, cats, art, and David Bowie."

Statistical modelling, Poster Session

miniSize: An R package to calculate the minimal sample size in balanced ANOVA models - Bernhard Spangl, BOKU University

Wednesday July 10, 2024 13:30 - 15:00 CEST

TBD

We consider balanced one-way, two-way, and three-way ANOVA models to test the hypothesis that the fixed factor A has no effect. The other factors are fixed or random. For most of these models (including all balanced one-way and two-way ANOVA models) an exact F-test exists.

Given a prespecified power, miniSize allows the user to compute the minimal sample size for the above-mentioned ANOVA models, i.e. the minimal number of experiments needed.

This is achieved by determining the noncentrality parameter for the exact F-test, describing its minimal value by a sharp lower bound, and thus guaranteeing the worst-case power of the F-test. Additionally, we provide a structural result for the minimal sample size that we call the "pivot" effect.

We will present the newly developed R package "miniSize" and give some examples of how to use its functionality to calculate the minimal sample size.
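
As an illustration of the worst-case noncentrality idea, here is a minimal Python sketch for the textbook balanced one-way fixed-effects case only. It is not the miniSize interface: the function names are hypothetical, and the bound used is the standard result that, when the largest and smallest group effects differ by at least `delta`, the sum of squared centred effects is at least `delta**2 / 2`.

```python
def noncentrality(effects, n, sigma):
    """Noncentrality parameter of the one-way fixed-effects F-test.

    effects: group effects, one per level of factor A
    n:       observations per group (balanced design)
    sigma:   residual standard deviation
    """
    mean = sum(effects) / len(effects)
    return n * sum((e - mean) ** 2 for e in effects) / sigma ** 2

def worst_case_noncentrality(delta, n, sigma):
    """Smallest noncentrality when max(effects) - min(effects) >= delta.

    The minimum is attained with two effects at +/- delta / 2 and all
    remaining effects at their midpoint.
    """
    return n * delta ** 2 / (2.0 * sigma ** 2)

# Least favourable configuration versus another one with the same range.
least_favourable = noncentrality([-1.0, 0.0, 1.0], n=5, sigma=1.0)
other = noncentrality([-1.0, 1.0, 1.0], n=5, sigma=1.0)
bound = worst_case_noncentrality(delta=2.0, n=5, sigma=1.0)
```

Because power increases with the noncentrality parameter, a sample size that reaches the target power at `bound` reaches it for every configuration of effects with that range.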

Speakers

Bernhard Spangl

Statistical modelling, Poster Session

Multilevel Regression with Projection Pursuit Tree - Eun-Kyung Lee & Seowoo Jung, Ewha Womans University

Wednesday July 10, 2024 13:30 - 15:00 CEST

TBD

Multilevel regression and post-stratification (MRP; Gelman & Hill, 2006; Gelman et al., 2020) was developed to process data from demographically diverse groups in complex survey designs. To obtain a representative estimate for a specific group, a multilevel regression model combines an individual-level model using individual-level data with a population-level model using group-level data. MRP is divided into two stages. The first stage is the multilevel regression step, which fits an individual-response model composed of an individual-level model and a population-level model, using priors for the parameters. The multilevel regression model produces estimates for each class, which are then used in the second stage, post-stratification. In the individual-level model, only variables that enable post-stratification can be used. In this study, we address this limitation of MRP, namely that it can use only categorical variables suitable for post-stratification, by proposing a method that incorporates a projection pursuit tree and implementing it in R.
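
The post-stratification stage can be sketched in a few lines of Python. This is a generic illustration under assumed inputs, not the authors' implementation: the per-class estimates would come from the fitted multilevel model, and the class names, estimates and population counts below are hypothetical.

```python
def poststratify(cell_estimates, cell_counts):
    """Population-weighted average of per-class model estimates.

    cell_estimates: dict mapping class label -> estimate from the model
    cell_counts:    dict mapping class label -> population size of the class
    """
    total = sum(cell_counts[c] for c in cell_estimates)
    weighted = sum(cell_estimates[c] * cell_counts[c] for c in cell_estimates)
    return weighted / total

# Hypothetical per-class estimates and known population counts.
estimates = {"young": 0.4, "old": 0.6}
counts = {"young": 300, "old": 700}
population_estimate = poststratify(estimates, counts)
```

The weighting requires a known population count for every class, which is why the individual-level model is restricted to variables available for post-stratification.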

Speakers

Eun-Kyung Lee

Professor, Ewha Womans University

Eun-Kyung Lee is a Professor in the Statistics Department. She earned a Ph.D., majoring in Statistical Computation and Visualization of Multi-variate Data at Iowa State University in the U.S. Currently, she's engaging in projects in medical statistics and statistical computing ar...

Seowoo Jung

Multilevel regression with projection pursuit tree, Ewha Womans University

- Bachelor’s Degree in Statistics, Ewha Womans University (2019-2023)
- Master of Science in Statistics, Ewha Womans University (2023-present)

Statistical modelling, Poster Session

SCM: An R package for Generalized Additive Modelling of Covariance Matrices - Vincenzo Gioia, University of Trieste

Wednesday July 10, 2024 13:30 - 15:00 CEST

TBD

Coupling additive mean vector and covariance matrix modelling for multivariate Gaussian models is a complex task, requiring methodological choices on the model structure, scalability of the model fitting procedures, and a set of tailored inferential and model-checking tools. The SCM (Smoothing for Covariance matrix Modelling) R package enables smooth additive modelling of the elements of the mean vector and of an unconstrained parametrisation of the covariance matrix, while ensuring computational scalability by exploiting model sparsity and using the efficient linear algebra routines provided by the RcppArmadillo package. It also leverages the well-developed inferential methods and the visualization tools provided by the mgcv and mgcViz R packages.

In this talk, we will illustrate the modelling capabilities of the SCM package and we will provide useful insights into the data modelling process on several real-world applications. In particular, we will provide an overview of the main aspects of the model building and checking phases, as well as insights on how to interpret the model output. The SCM package is currently available at https://github.com/VinGioia90/SCM/.

Speakers

Vincenzo Gioia

Ph.D., University of Trieste

Vincenzo Gioia is a research assistant at the Department of Economic, Business, Mathematical and Statistical Sciences, University of Trieste, Italy. He received a PhD in Managerial and Actuarial Sciences from the University of Udine in 2023. His research interests range from asymptotic...

Statistical modelling, Poster Session