useR! 2024
In Person & Virtual
8 - 11 July, 2024

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for useR! 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: All session times are shown in Central European Summer Time (CEST, UTC+2). The schedule is subject to change.
Biostatistics + epidemiology + bioinformatics
Monday, July 8
 

09:00 CEST

Tutorial: Introduction to Machine Learning for Survival Analysis with mlr3 - John Zobolas, Institute for Cancer Research & Lukas Burk, Leibniz Institute for Prevention Research and Epidemiology - BIPS and LMU Munich
This introductory tutorial is designed to equip participants with practical skills and knowledge for performing survival analysis using machine learning techniques. Survival analysis, a fundamental statistical method in biomedical and clinical research, focuses on analyzing time-to-event data, such as the time to disease progression or patient survival. In this tutorial, attendees will work with clinical and gene expression data to build, train, and test survival models. They will learn how to leverage R's mlr3 ecosystem for efficient model development, incorporating sophisticated machine learning models such as penalized linear models and random forests to enhance the accuracy of the survival predictions. Participants will also explore survival metrics and model validation techniques to assess the quality and reliability of their models in the context of real-world data. Whether you're new to survival analysis or seeking to enhance your skills, this workshop offers valuable insights and hands-on experience for tackling challenging clinical and biomedical questions.

Speakers
John Zobolas

PhD, Institute for Cancer Research, Oslo University Hospital
My background is in computer science, with diverse expertise in computational modeling, software engineering, survival analysis and statistical/machine learning. Being an engineer at heart, my strongest quality is careful, analytical thinking. A productive workday consists of writing...
Lukas Burk

M.Sc., Leibniz Institute for Prevention Research and Epidemiology - BIPS and LMU Munich
Studied Public Health and Biostatistics before starting a PhD in Statistics and Machine Learning


Monday July 8, 2024 09:00 - 12:30 CEST
Pinzgau
 
Tuesday, July 9
 

11:00 CEST

Create Your Own Recipes Steps for Omics Data: The Scimo Package - Antoine Bichat, Servier
The rise of advanced high-throughput sequencing technologies has led to a massive increase in the production of omics data, encompassing genomics, transcriptomics, proteomics, metagenomics, and more. To effectively explore and analyze omics data, specialized preprocessing techniques are essential, including feature normalization, selection, and aggregation. However, many of these specific methods were not initially available in the original 'recipes' package. As a response, we have developed an extension package, 'scimo', designed to seamlessly integrate these techniques into the 'tidymodels' ecosystem. 'scimo' offers a comprehensive suite of preprocessing steps tailored for omics data analysis, while remaining adaptable to other data types. During this presentation, we will showcase the capabilities of 'scimo' and provide insights into creating your own 'recipes' extension package. Additionally, we will discuss strategies to navigate the potential pitfalls that we have encountered during the development process. https://github.com/abichat/scimo

Speakers
Antoine Bichat

PhD, Servier
Antoine Bichat is a data scientist at Servier, where he works on pediatric oncology projects within the computational medicine team. He holds a PhD in biostatistics and has also worked in a biotech specialized in metagenomics. Antoine loves dataviz, teaching R and experimenting with...


Tuesday July 9, 2024 11:00 - 11:20 CEST
Attersee

11:20 CEST

MIEP: Make-It-Easy-Pipeline - Alberto Corradin, Veneto Institute of Oncology IOV – IRCCS, Padova, Italy
Make-it-easy-pipeline (MIEP) is an integrated, interactive, user-friendly pipeline for RNA-seq data. This new R package helps researchers develop testable hypotheses and select targets for wet lab functional testing. MIEP performs statistical testing, annotates sequences, corrects biases, and summarizes results in HTML tables, volcano plots, and heat maps. Shiny apps allow users to modify default settings, select a data shrinkage method, and set thresholds to identify differentially expressed features. Use of MIEP in a cancer research project facilitated the identification of phenotype-linked signal transduction pathways whose biological relevance was experimentally verified. MIEP’s functions include:
- Dimensionality reduction and visualizations by PCA, UMAP, tSNE, SVM
- Enrichment of Gene Ontology (GO) terms
- Calculation of features’ importance subsequent to classification (conditional random forests)
- Gene set editing based on the ranking of GO terms or features’ importance
- Analyses focused on gene sets and user-friendly graphical representations.
These characteristics and careful handling of exceptions make MIEP an easy-to-use tool for biologists with basic programming skills.

Speakers
Alberto Corradin

Dr., Veneto Institute of Oncology IOV – IRCCS, Padova, Italy
Alberto Corradin is a Data Scientist with expertise in statistics and data analysis, development of mathematical models, machine learning and artificial intelligence. He is currently a researcher at Veneto Institute of Oncology IOV – IRCCS, Padova, Italy. He is co-author of 14 peer-reviewed...


Tuesday July 9, 2024 11:20 - 11:40 CEST
Attersee

11:40 CEST

Teal - an Open Source Framework for Data Exploration in Clinical Trials and Beyond - Pawel Rucki, Roche
{teal}, an open-source framework for Shiny app development, was designed to accelerate the data exploration process within clinical trials. Over the years, it has grown into a robust and versatile solution, gaining recognition from various companies in the industry. In this talk, I will introduce you to the product and its core concepts and features. I will showcase a few practical applications from the clinical trials context and beyond.

Speakers
Pawel Rucki

Ms, Roche
Pawel graduated in 2015 from the University of Warsaw in Econometrics and Quantitative Economics. Having worked with R for almost 10 years, Pawel has applied it in the fields of geospatial data analysis, credit risk assessment, financial provisions calculation and clinical trial data analysis...


Tuesday July 9, 2024 11:40 - 12:00 CEST
Attersee

12:00 CEST

MS Meets R: Unravelling Cellular Lipid Networks by Integrative Analysis & Untangling Ether Lipids Th - Jakob Koch, Medical University of Innsbruck
Membrane balance relies on specific metabolic lipid interactions. We investigated how fatty acyl side chains from different lipid classes influence each other using artificial neural networks (ANNs). We analyzed profiles of different phospholipids from 15 mouse tissues and found tissue-specific patterns. With this data we trained 362 ANN model architectures in R and were able to predict mitochondrial cardiolipin remodelling based on fatty acyl pools in phospholipids. Our analysis revealed key players in mammalian cardiolipin remodelling: high oleic acid increased lipid diversity, while linoleic acid favoured uniformity. Additionally, to reliably discriminate between plasmanyl/plasmenyl lipids in mouse tissues we conducted lipidomics experiments with ion mobility spectrometry (IMS). All data integration and analysis steps were performed in R. Statistical analysis in R confirmed the validity of this IMS approach for lipid subclass separation, which is especially powerful when combined with accurate retention time characteristics.

Speakers
Jakob Koch

MSc., Medical University of Innsbruck, Biochemical Genetics Laboratory
JK did his BSc and MSc in Chemistry at the LFU Innsbruck. Since the beginning of 2019 he has been a PhD student in the Biochemical Genetics Laboratory headed by Dr. MA Keller at the Institute of Human Genetics at the Medical University of Innsbruck. In his PhD he focuses on both the implementation...


Tuesday July 9, 2024 12:00 - 12:20 CEST
Attersee

13:25 CEST

A Bayesian Approach to Decision Making in Early Development Clinical Trials: An R Solution - Audrey Yeo, Roche
Showcasing a new statistical software that supports decision making on whether a novel cancer treatment demonstrates sufficient safety and efficacy signals to warrant further investment.

Speakers
Audrey Yeo

Statistical Software Engineer && Biostatistician, Roche
Audrey Yeo has been a Statistical Software Engineer and Clinical Trial Biostatistician at F. Hoffmann-La Roche since 2021. Together with the statistician engineering team, they are creating a state-of-the-art engineering tool to enhance decision making for early development. Audrey has a pharma...


Tuesday July 9, 2024 13:25 - 13:30 CEST
Pinzgau + Tennegau

13:30 CEST

Generate Raw Synthetic Dataset for Clinical Trial - Binod Jung Bogati, Numeric Mind
Obtaining synthetic raw datasets, particularly for clinical trials, poses significant challenges. The reliance on manual data entry in Electronic Data Capture (EDC) systems, along with the creation of test data scenarios for generating Study Data Tabulation Model (SDTM) datasets and for other clinical programming tasks, adds complexity. syngenR, an R package, addresses these challenges by generating customized synthetic raw datasets for clinical trials. This presentation introduces an alternative to conventional test data generation and entry methods, addressing specific limitations and challenges of the current approach. By automating the creation of synthetic data that accurately reflects real-world variability, syngenR aims to improve the reliability and efficiency of SDTM generation and other clinical programming tasks while avoiding the inaccuracies associated with manual data entry. The presentation also covers the package's use in educational settings, its capability to test various clinical trial scenarios, and its potential to significantly reduce the time and effort required for clinical trial preparation and execution.

Speakers
Binod Jung Bogati

Associate Manager - Data Science, Numeric Mind
Binod Jung Bogati has been a Statistical Programmer at Numeric Mind since 2020. Apart from work, he is also an rOpenSci 2023/24 Champion and R User Group Nepal's organizer, and he hosts R community events. He loves working with data and is currently focusing on Clinical Data Science / Life Science.


Tuesday July 9, 2024 13:30 - 13:35 CEST
Pinzgau + Tennegau

13:30 CEST

ScMitoMut: Single Cell Lineage Informative Mitochondrial Mutation Calling Tool - Wenjie Sun, Institut Curie
Cells originate from other cells; tracing their lineage from a common ancestor (lineage tracing) is pivotal for the exploration of development, tumors, and stem cell biology. This is particularly important in answering questions about stem cell potency and cancer cell plasticity. Within scATACSeq or single-cell multiomics sequencing, mitochondrial DNA is enriched due to its histone-free nature, with somatic mutations in mitochondria acting as endogenous markers for following cell lineage at the single-cell level while profiling open chromatin. We introduce scMitoMut (available on Bioconductor), an R package that leverages a statistical model to accurately identify mitochondrial mutations at the single-cell level. scMitoMut is designed to enable users to analyze large datasets on personal computers. In the implementation phase, we have addressed the challenge of handling large scATACSeq datasets on personal computers by creating an HDF5-based object to store raw data and intermediate results. To speed up statistical model fitting for both binomial-mixture and beta-binomial distributions, we have utilized Rcpp for efficient computation and implemented parallel processing techniques.
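For readers unfamiliar with the beta-binomial distribution named in the abstract, the following is a minimal sketch of its log-probability mass function, written in Python for illustration only. It is not scMitoMut's implementation (the abstract states that the package uses R and Rcpp). Interpreting `k` as the number of reads supporting a variant and `n` as the read depth at a site is an assumption made here; the function and parameter names are hypothetical.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b), computed via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def beta_binomial_logpmf(k, n, alpha, beta):
    """Log P(K = k) for K ~ BetaBinomial(n, alpha, beta).

    Assumed reading (not taken from the abstract): k is the number of
    variant-supporting reads out of n reads covering a site.
    """
    return (
        log_choose(n, k)
        + log_beta(k + alpha, n - k + beta)
        - log_beta(alpha, beta)
    )


# Sanity check: the probabilities over k = 0..n should sum to 1.
total = sum(exp(beta_binomial_logpmf(k, 20, 2.0, 5.0)) for k in range(21))
print(round(total, 6))
```

Working on the log scale with `lgamma` avoids overflow for large read depths, which is one common reason to prefer it over computing the Beta function directly.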

Speakers
Wenjie Sun

PostDoc, Institut Curie
As a Postdoctoral Researcher at the Institut Curie, he now concentrates on employing various statistical and computational methods to conduct DNA sequence-based cellular lineage tracing, integrating this with single-cell omics data.


Tuesday July 9, 2024 13:30 - 13:35 CEST
Attersee

13:30 CEST

Table Talk: Designing a Workflow for Reproducible Table Creation in R for Epidemiological Research - Reiko Okamoto, Bruyère Research Institute/Ottawa Hospital Research Institute
Summary tables are ubiquitous in scientific manuscripts reporting epidemiological and clinical studies. Table 1 often contains key demographic information about the study population, including the mean and standard deviation for continuous variables and frequency and proportion for categorical variables. Table 2 may present the association between the explanatory and outcome variables under investigation, and so on. Since there are endless ways (some more robust than others) to create analytical and summary tables in R, our research group wanted to define a workflow that could be adopted by colleagues of varying levels of proficiency in R to reproducibly create these tables from raw data. In this talk, I will share our approach in designing this workflow and what we discovered along the way. This will include a discussion on the current landscape of table-generating packages in R and how we overcame the limitations of existing software alongside other challenges (e.g., incorporating existing metadata). These strategies will not only be useful to researchers in epidemiology but also relevant for those in other health and social science disciplines.
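The "Table 1" summaries described in this abstract (mean and standard deviation for continuous variables, frequency and proportion for categorical ones) can be sketched in a few lines. The example below is in Python purely for illustration and is not the speaker's R workflow; the toy cohort, field names, and helper functions are all hypothetical.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical toy cohort; field names are illustrative assumptions.
cohort = [
    {"age": 34, "sex": "F"},
    {"age": 51, "sex": "M"},
    {"age": 47, "sex": "F"},
    {"age": 62, "sex": "F"},
]


def summarize_continuous(rows, field):
    """Format a continuous variable as 'mean (SD)' using the sample SD."""
    values = [row[field] for row in rows]
    return f"{mean(values):.1f} ({stdev(values):.1f})"


def summarize_categorical(rows, field):
    """Format each level of a categorical variable as 'n (percent%)'."""
    counts = Counter(row[field] for row in rows)
    total = sum(counts.values())
    return {
        level: f"{n} ({100 * n / total:.1f}%)"
        for level, n in sorted(counts.items())
    }


table1 = {
    "Age, mean (SD)": summarize_continuous(cohort, "age"),
    "Sex, n (%)": summarize_categorical(cohort, "sex"),
}
for label, value in table1.items():
    print(label, value)
```

Keeping the summary logic in small functions that take raw records as input is one way to make a table reproducible from raw data, since the same code can be rerun whenever the data change.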

Speakers
Reiko Okamoto

Methodologist, Bruyère Research Institute/Ottawa Hospital Research Institute
Reiko has a background in the life sciences with experience conducting data analysis in academia and the public sector. She is always eager to make analysis more open, transparent, and reproducible. Originally from the west coast of Canada, she completed a BSc in Microbiology and...


Tuesday July 9, 2024 13:30 - 13:35 CEST
Salzburg I
 
Wednesday, July 10
 

13:30 CEST

Dupseqr: Disentangling Genomic Aberrations Made Easy - Ekaterina Akimova & Philine Hoven, Laboratory for Immunological and Molecular Cancer Research
Aberrant repair of DNA double strand breaks is a prominent feature of various cancers. It can result in deletions, duplications, translocations and insertions. In our previous work, we analyzed amplicon-sequencing data with our custom pipeline to detect templated insertions at the DNA damage sites (Akimova et al. 2021, doi:10.1093/nar/gkab051). Here we present dupseqr, an R package that bundles several functions for sequential tracing of insertions, duplications and inversions. On the one hand, dupseqr wraps the existing bash commands in a pipe function for the pre-processing of the FASTQ files and BLAST search, followed by precise trimming and filtering of mapped sequences in order to identify insertions. On the other hand, it includes a novel function to detect and depict duplications and inversions directly from your DNA sequences, while the input and the output can be adjusted depending on your initial data structure and your final goal. All in all, dupseqr provides a quick way to elucidate aberrations, such as short duplications, inversions and insertions from distant genomic sites, using sequencing data.

Speakers
Ekaterina Akimova

Dr. rer. nat., Laboratory for Immunological and Molecular Cancer Research
I completed my PhD at LIMCR, investigating DNA damage in cancer. During this time, I worked on various projects, including the development of R-based analysis pipelines, and found my passion for coding. In 2023, I finished the doctorate, but continued my research endeavours as a PostDoc...
Philine Hoven

MSc, Laboratory for Immunological and Molecular Cancer Research
I am a PhD student of Natural and Life Sciences, currently working on the characterization of templated sequence insertions in the cancer background. My work involves wet lab techniques as well as data analytics with R.


Wednesday July 10, 2024 13:30 - 15:00 CEST
TBD

13:30 CEST

Exploring the Within-Individual Variability of Human Motor Learning Using GAMLSS - Julia Wood, The University of Queensland
The neural correlates of learning are frequently explored in neuroscience research, typically through learning-induced changes in the mean of a response variable. Motor skill learning can enhance neural communication between the brain and the trained muscle. This communication is typically assessed by inducing muscle contractions in the trained pathway and measuring changes in mean size over time, with larger measurements suggesting enhanced communication. Motor learning may also improve the efficiency of this communication, possibly reflected by more consistent muscle contractions and a reduction in the within-individual variability of these measurements over time. This study explored how motor skill learning and a subsequent intervention (active vs. placebo) influenced changes in the mean size and within-individual variability of these measurements. Effects were estimated by fitting a location and scale model using the GAMLSS package in R. GAMLSS fits a distributional model, which can estimate all parameters for the specified distribution. The results and analysis pipeline from this study will be discussed, emphasising the utility of the GAMLSS model in this research.

Speakers
Julia Wood

Miss, The University of Queensland
After working as an R&D chemist for several years, I became intrigued by why we sleep and how we form new memories. This inspired me to pursue a doctoral path in human sleep and memory research. During my PhD, I have discovered deep interests in data analysis, statistical modelling...


Wednesday July 10, 2024 13:30 - 15:00 CEST
TBD

13:30 CEST

MINT+: Web App with R Brains for SDTM Automation - Magdalena Krochmal & Adam Forys, Roche
In the realm of clinical research, a web application known as MINT+ is revolutionizing the process of SDTM automation. At its core, MINT+ utilizes a set of R packages to power the entire solution. Its intuitive React UI empowers users to create custom SDTM mapping specifications, accommodating diverse study requirements. Leveraging DocumentDB for data storage, MINT+ enables easy metadata sharing and facilitates reuse across studies, significantly reducing workload and improving accuracy.
During this session, we will explore the R-based components that power MINT+ and are responsible for data processing and backend processes. The "rmint.sdtm" automates SDTM mappings, "rsaffron.api" serves as the backend API, and "roak" allows customization of mappings. Users can address complex scenarios that often arise in the SDTM mapping creation process, making R packages the preferred choice for overcoming industry challenges.
With advanced algorithms, a user-friendly interface, and seamless integration, MINT+ streamlines SDTM creation workflow, greatly reducing the time and effort required.

Speakers
Magdalena Krochmal

Senior Data Scientist, Roche
Magdalena Krochmal is a Senior Data Scientist based in Basel, Switzerland. With a background in biomedical engineering and a Ph.D. in bioinformatics, she has spent three impactful years at Roche. Magdalena is an expert R developer specializing in SDTM automation. Her work centers...
Adam Forys

Mr., Roche
Adam is a Principal Data Scientist at Roche. He is dedicated to building R packages that empower teams working on SDTM. He is committed to collaboration and enjoys guiding others in overcoming technical obstacles and optimizing their data science workflows.


Wednesday July 10, 2024 13:30 - 15:00 CEST
TBD

13:30 CEST

Use of R in Calibration of Infectious Disease Models - Nicole Swartwood, Harvard TH Chan School of Public Health
Calibration approaches are commonly used in infectious disease modeling, but there has been little study to describe the use of these techniques within the field. Furthermore, R is increasingly used by epidemiologists to understand disease dynamics. As part of a larger scoping review investigating the distribution of calibration methods for models of HIV, TB, and malaria, we will collect data on programming languages and packages/libraries cited in published manuscripts. We aim to identify with which calibration strategies R is most commonly used and ultimately identify any gaps in and potential for development in the available calibration packages within R. We also aim to identify any association with disease, model goal, and/or reducibility.

Speakers
Nicole Swartwood

Senior Research Analyst, Harvard TH Chan School of Public Health
Nicole Anne Swartwood is an infectious disease modeler at the Harvard TH Chan School of Public Health. Her work focuses on tuberculosis and COVID-19 in the United States. She co-founded the Harvard R User Group and remains a co-organizer. She is passionate about empowering junior...


Wednesday July 10, 2024 13:30 - 15:00 CEST
TBD
 
Thursday, July 11
 

10:35 CEST

tRialblazing – advantages of using R in large clinical trials - Piotr Starnawski, Novo Nordisk A/S
Pharmaceutical industry programming has for many years been characterized by "one programming language - take it or leave it". This is reflected in persistent use of established standard programs and closed source languages, due to their prevalence within the field.
However, the transition to open source is well underway and the advantages of using modern languages, such as R, are becoming more common and accepted. Programming of datasets for large clinical trials in R greatly benefits from using i) modern, scalable infrastructure; ii) large speed gains from parallelization paired with new file formats; iii) integrated version control, and iv) DevOps solutions, just to name a few advantages. The nature of open source itself enables tapping into community solutions, e.g. the pharmaverse packages, and, in return, contributing to them with internally developed code.
This presentation will outline the challenges we have been facing while transitioning to R at Novo Nordisk, the expected and often unexpected gains resulting from that change, and the direction in which, in our opinion, clinical trial programming is headed.

Thursday July 11, 2024 10:35 - 10:40 CEST
Pinzgau + Tennegau
 
Venue: Salzburg, Austria