useR! 2024: Full Schedule

In Person
8 - 11 July, 2024
The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for useR! 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is displayed in Central European Summer Time (UTC+02:00). The schedule is subject to change.

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

The virtual program will take place on 2 July. Please see the virtual schedule page for more information.

09:00 CEST

Tutorial: Streamlining R Package Development with GitHub Actions Workflows - Daphne Grasselly & Pawel Rucki, Roche; Dinakar Kulkarni, Genentech [Pre-Registration Required]

GitHub Actions provide automated workflows for continuous integration and deployment, enhancing collaboration and code quality. This tutorial aims to demystify GitHub Actions, offering insights into their fundamentals and guiding participants through the process of crafting reusable actions tailored for R package development. The tutorial begins with an overview of GitHub Actions, explaining their role in automating software workflows and boosting productivity in the R programming ecosystem. Attendees will gain a comprehensive understanding of the basics, including syntax, triggers, and workflow components, paving the way for seamless integration into their development pipelines. Building on this foundation, the tutorial delves into the creation of reusable actions, emphasizing best practices for designing modular, versatile components. The tutorial also showcases the benefits of running both development and CI/CD workflows in a common Docker container environment to guarantee reproducibility. Participants will learn how to encapsulate common tasks and share them across different projects, fostering a culture of code reuse within the R community.

Pre-requisites:
1. Create a GitHub account.
2. (Optional) Have Git and SSH installed on your computer, and have an SSH key ready.

Registration:
To add this tutorial to your registration, log in to your existing registration, click the Modify Registration button, and navigate to the Reg Options page (page 4). Select the tutorial you want to attend.

Speakers

Franciszek Walkowiak

Senior IT Professional at Roche, Roche

DevOps engineer with 4 years of experience in the pharmaceutical industry. I have worked with Amazon Web Services, Google Cloud Platform, and infrastructure as code practices. Currently, I support teams of R software developers by employing DevOps practices and tools such as GitLab...

Daphne Grasselly

Senior Data Scientist - Roche, Roche

I am currently working at Roche as a Senior Data Scientist. My main focus is on enhancing automation workflows for efficient package delivery, particularly in the realm of R development within the pharmaceutical industry. I am passionate about optimizing processes and improving code...

Pawel Rucki

Ms, Roche

Pawel graduated in 2015 from University of Warsaw, Econometrics and Quantitative Economics. Working with R for almost 10 years now, Pawel applied it in the field of geospatial data analysis, credit risk assessment, financial provisions calculation and clinical trial data analysis...

Monday July 8, 2024 09:00 - 12:30 CEST
Wolfgangsee

R workflow + deployment + production, Tutorial

14:00 CEST

Tutorial: Building Effective Docker Images: R Edition - Andrew Collier, Fathom Data [Pre-Registration Required]

Docker is a cornerstone of modern software development and deployment, ensuring reproducibility, scalability, and seamless environment management across different platforms. This tutorial will examine the art and science of crafting efficient and optimised Docker images specifically tailored for R applications.

Description: Docker has revolutionised how we develop, deploy, and run applications by offering a lightweight, portable solution for application containerisation. For the R community, Docker presents a vital tool for addressing common challenges such as "it works on my machine", dependency management, and consistent environments across development and production systems. However, creating effective Docker images that are optimised for R applications requires a nuanced understanding of both the Docker and R ecosystems. This tutorial aims to bridge that gap, providing attendees with the knowledge to build Docker images that are not only functional but also optimised for performance, size, and security.

Preparation:
In order to participate in this tutorial, please have (a) Docker installed and tested* and (b) a text editor on your machine. Here are some resources with further links to installation instructions on the three major operating systems:

*It is imperative that Docker be installed and tested before the session because there will not be any time to resolve problems around this during the tutorial.

Speakers

Andrew Collier

Dr, Fathom Data

Andrew is Lead Data Scientist at Fathom Data. He spends his days tinkering with R, Python and Docker.

Monday July 8, 2024 14:00 - 17:30 CEST
Attersee

R workflow + deployment + production, Tutorial

14:00 CEST

Tutorial: Contributing to R - Gabriel Becker, Consultant & Heather Turner, University of Warwick [Pre-Registration Required]

Have you always wanted to contribute to (base) R but don't know how? This tutorial showcases where and how users have actively contributed to (base) R by submitting bug reports with minimal reproducible examples, and how testing, reading source code, and providing patches to the R source code have helped make R better. A selection of past bug reports is provided for you to practice debugging. For bugs that have been resolved, you can check what happened after the bug was reported.

Speakers

Gabriel Becker

Statistical Computing Consultant

Gabe is a frequent collaborator with R-core, having contributed 7 novel features to R including proposing and subsequently working with Luke Tierney on the internal ALTREP framework. He is the author of multiple R packages, including the rtables package for creating reporting tables...

Heather Turner

Dr, University of Warwick

Heather Turner is a Research Software Engineering Fellow and Associate Professor at the University of Warwick. She is an active member of the R community; in particular, she is on the board of the R Foundation and chairs both the R Contribution Working Group and the Forwards taskforce...

Monday July 8, 2024 14:00 - 17:30 CEST
Flachgau

R workflow + deployment + production, Tutorial

11:00 CEST

Past, Present, and Future of data.table - Tyson Barrett, Highmark Health

This talk will walk through the past, present, and future of the data.table package. The timing of this talk is particularly important, as changes to the governance of the package, aimed at providing a solid foundation for its long-term maintenance, have recently been approved. data.table is a leading data wrangling and cleaning package in the R ecosystem, and the goal of the new governance is to create a broader community that can more easily engage with the development of the package and find support for its use.

Speakers

Tyson Barrett

Manager Research Analytics and Enablement, Highmark Health

Tyson Barrett, PhD is the current data.table maintainer working with a talented team of developers and a wonderful development community. During his day job, he works with a team of researchers at Highmark Health, a healthcare organization, to improve healthcare outcomes, costs, and...

Tuesday July 9, 2024 11:00 - 11:20 CEST
Pinzgau + Tennegau

R workflow + deployment + production

11:20 CEST

WebR, and the Future of Building Web Applications with R - Colin Fay, ThinkR

One of the great joys of being a software engineer is that things keep moving. New technologies, new languages, new frameworks: every now and then, new things emerge that change the way we build software. In the past couple of years in the R world, we've been building and deploying web apps and APIs in a pretty stable way: building {shiny} apps with frameworks like {golem} or {rhino}, APIs with {plumber}, and sending them to a server that can launch R and make our R code available to the world. In the past months, something new has emerged: webR, a version of R compiled for WebAssembly (WASM), which makes it possible to run R in the browser and in NodeJS, with no need for an R installation. This has opened a lot of new doors, JavaScript being the tool of choice when it comes to building web apps and APIs. In this talk, Colin will start by explaining what webR is and how it will change the way we think about building and deploying R code on the web. He will present `webrcli` and `spidyr`, two tools for creating NodeJS apps that can call R code via webR. Finally, Colin will focus on the challenges that will arise with `webR`, and how we'll build web apps with R in the future.

Speakers

Colin Fay

Lead Developer at ThinkR, ThinkR

Colin Fay is a lead developer at ThinkR, a French agency of R experts. During the day, he helps companies by building tools and deploying infrastructure. His main areas of expertise are data & software engineering, web applications (frontend and backend), and R in production. During...

Tuesday July 9, 2024 11:20 - 11:40 CEST
Pinzgau + Tennegau

R workflow + deployment + production

11:40 CEST

Building Bilingual Bridges with Multilingual Manuals - Elio Campitelli, Universidad de Buenos Aires

The vast majority of packages are documented in English, due to the language's status as the de facto lingua franca. But what if your package is designed with a specific demographic in mind that could be better served by documentation in another language? Non-English documentation would make your package more accessible to them at the expense of isolating it from the wider international community. But... ¿por qué no los dos? The rhelpi18n package adds support for multilingual documentation in R so you can have the best of both worlds. Package authors or community projects can create translation modules that users can install to access documentation in their languages directly from R. The talk will include a high-level view of how this package extends R's help system and will explain how people can create, install and use translation modules for R packages.

Speakers

Elio Campitelli

Lic, Universidad de Buenos Aires

I’m a PhD student in atmospheric sciences at the Centre for Ocean and Atmospheric Research, where I study the atmospheric circulation in the Southern Hemisphere and how it affects the weather in South America. I’m also the maintainer for several R packages and give courses.

Tuesday July 9, 2024 11:40 - 12:00 CEST
Pinzgau + Tennegau

R workflow + deployment + production

12:00 CEST

Systems Integration Tests for R Package Cohorts - Franciszek Walkowiak, Roche

One of the challenges for R developers is ensuring that their packages work correctly on an ever-increasing number of operating systems, platforms, and R versions. To aid in this endeavor, we are introducing two tools: Locksmith and Scribe. Their task is to install a cohort of R packages, along with all dependencies, and test the cohort on any kind of system. Locksmith resolves all dependencies of the cohort using provided package repositories and saves the list of all package versions and repositories to a snapshot. Scribe utilizes the snapshot to download, build, install, and check the packages in an efficient and reproducible manner. An older snapshot of packages can be restored by Scribe on a new system to check for any compatibility issues. Both tools are written in Go, making their binaries easily buildable and distributable for different systems and platforms. Go also simplifies concurrent package installation and checking, significantly reducing execution time. As a result, package cohort testing can be performed frequently for various systems, allowing developers to quickly assess the overall health of their packages.
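
The snapshot step described above can be illustrated with a small, language-agnostic sketch. The Python example below is not how Locksmith or Scribe are implemented. The repository index, the package names, and the `resolve_snapshot` helper are hypothetical, and it assumes each package maps to a single version and a list of direct dependencies. It only shows the general idea of resolving a cohort's transitive dependencies into an ordered list that an installer could replay.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical repository index: package -> (version, direct dependencies).
RepoIndex = Dict[str, Tuple[str, List[str]]]


def resolve_snapshot(cohort: List[str], index: RepoIndex) -> List[Tuple[str, str]]:
    """Return (package, version) pairs for the cohort and all of its
    transitive dependencies, with dependencies listed before dependents."""
    ordered: List[Tuple[str, str]] = []
    visiting: Set[str] = set()  # packages on the current resolution path
    done: Set[str] = set()      # packages already written to the snapshot

    def visit(pkg: str) -> None:
        if pkg in done:
            return
        if pkg in visiting:
            raise ValueError(f"dependency cycle involving {pkg!r}")
        if pkg not in index:
            raise KeyError(f"package {pkg!r} not found in the index")
        visiting.add(pkg)
        version, deps = index[pkg]
        for dep in deps:
            visit(dep)
        visiting.discard(pkg)
        done.add(pkg)
        ordered.append((pkg, version))

    for pkg in cohort:
        visit(pkg)
    return ordered


# Made-up packages, used only for illustration.
example_index: RepoIndex = {
    "pkgA": ("1.2.0", ["pkgB", "pkgC"]),
    "pkgB": ("0.9.1", ["pkgC"]),
    "pkgC": ("2.0.0", []),
}
snapshot = resolve_snapshot(["pkgA"], example_index)
```

Because dependencies appear before the packages that need them, a list like this can be installed from top to bottom, and saving it to a file gives a record that can be restored later on another system.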

Speakers

Franciszek Walkowiak

Senior IT Professional at Roche, Roche

Tuesday July 9, 2024 12:00 - 12:20 CEST
Pinzgau + Tennegau

R workflow + deployment + production

13:35 CEST

Roam: Remote Objects with Active-Binding Magic - Yangzhuoran Fin Yang, Monash University

The "roam" package simplifies the creation of R objects that resemble regular objects but are sourced from remote locations. It empowers package developers to incorporate these "roaming" objects, which may surpass the 5MB limit, into their packages. Additionally, it facilitates dataset updates independent of package updates through functions that retrieve data from remote sources. https://github.com/FinYang/roam

Speakers

Yangzhuoran Fin Yang

PhD Candidate, Monash University

Yangzhuoran Fin Yang is a PhD candidate in the Department of Econometrics and Business Statistics at Monash University. His PhD project is on the use of transformations of time series to improve forecasting. Fin is active in research software development, (co)authoring open source...

Tuesday July 9, 2024 13:35 - 13:40 CEST
Pinzgau + Tennegau

R workflow + deployment + production, Lightning Talk

13:40 CEST

Adding the Missing Audit Trail to R - Magnus Mengelbier, Limelogic AB

The R language is increasingly used across the Life Science industry for GxP workloads. The basic architecture of R makes it nearly impossible to add a generic audit trail method and mechanism for all use cases. Different strategies have been developed to provide some level of auditing, from logging conventions to file system audit utilities, but each has its drawbacks and lessons learned.

The ultimate goal is to provide an immutable audit trail compliant with ICH Good Clinical Practice, FDA 21 CFR Part 11 and EU Annex 11, regardless of the R environment. We consider different approaches to implementing auditing functionality with R and how we can incorporate audit trail functionality natively in R or with existing and available external tools and utilities that completely support Life Science best practices, processes and standard procedures for analysis and reporting.

Speakers

Magnus Mengelbier

Managing Director, Limelogic AB

Magnus is currently the Managing Director of Limelogic, a contributor, collaborator and independent consultant based in southern Sweden with over 25 years of experience in the Life Science industry. A keen advocate of simple programming approaches with a focus on GxP, compliance...

Tuesday July 9, 2024 13:40 - 13:45 CEST
Salzburg I

R workflow + deployment + production, Lightning Talk

13:40 CEST

Checklist Improves Collaboration, Quality and Visibility of Your Code - Thierry Onkelinx, Research Institute for Nature and Forest

The checklist package is a set of rules for R packages and R source code projects. The ruleset covers several topics: folder structure, filename conventions, spelling, code style, citation metadata, licence, contribution guidelines, ... Adherence to a common set of rules within an organisation facilitates collaboration between its members. Enforcing citation metadata and an open source licence improves the visibility of projects. Automated checks via GitHub Actions detect problems as soon as possible. Checklist is based on the rcmdcheck, lintr, pkgdown, codemetar and hunspell packages. Where applicable, we use the same rules for projects and packages. The maintainer can choose which parts of the ruleset apply to a project. In the case of an R package, the entire ruleset is mandatory. Publishing code on Zenodo is easy if you link to your GitHub repository. Each release on GitHub triggers a new version on Zenodo with a specific DOI. A GitHub action creates a new release for each new version of the package. Documentation and source code are available at https://inbo.github.io/checklist

Speakers

Thierry Onkelinx

Statistician, Research Institute for Nature and Forest

Statistician at the Research Institute for Nature and Forest

Tuesday July 9, 2024 13:40 - 13:45 CEST
Pinzgau + Tennegau

R workflow + deployment + production, Lightning Talk

14:10 CEST

Spare Cores: Harnessing Unutilized Cloud Compute Resources - Gergely Daroczi, Spare Cores

Spare Cores, an NGI Search funded open-source ecosystem, inventories the compute resources available from public cloud and server providers and benchmarks them for different scenarios, in order to find optimal instance types across vendors and datacenters for containerized jobs (e.g. training ML models or hosting a Shiny app). Among other open-source SDKs, we provide an R package allowing easy access to this public database, complemented by CLI helpers for launching instances in your existing cloud environment. We also briefly showcase a streamlined SaaS solution built on top of the open-source stack for those seeking simplicity and/or unwilling to manage their cloud infrastructure: the managed Spare Cores environment covers the entire life cycle of batch jobs and microservices, eliminating the need for direct cloud vendor engagement.

Speakers

Gergely Daroczi

Project lead, Spare Cores

Gergely Daróczi is an enthusiastic R user and package developer, Ph.D. in Sociology; former assistant professor and founder of an R-based web reporting application at rapporter.net; ex Lead R Developer, then Director of Analytics at CARD.com; later Senior Director of Data Operations...

Tuesday July 9, 2024 14:10 - 14:30 CEST
Attersee

R workflow + deployment + production

14:30 CEST

Layered Design for R Package Development: Meeting the Needs of Pharmaceutical R&D Stakeholders - Jean Muller & Ligia Adamska, MSD Switzerland

In the pharmaceutical industry, we aim to create standard R packages for statisticians, economic modelers, and statistical programmers. The challenge is to have both flexible functions and a systematic structure to support automatic reporting systems. In this presentation, we introduce a layered design for R package development that addresses the requirements of these diverse users. Then, through a case study in Health Technology Assessment (HTA) analysis, we show how our design provides solutions while adhering to good programming and documentation practices. The design consists of two layers. The first is a verb layer, embracing functional programming and built with pipeable functions to provide flexibility for exploratory analysis. The second is a reporting layer, wrapped around the verb layer, with the ability to generate agreed-upon standard analyses with one call to a function, making it easy to repeat analyses for different populations, interventions, comparators, and outcomes (PICOs). Throughout the case study, we illustrate how we leveraged R package development tools to organize, document, and test R code to ensure quality and maintainability.

Speakers

Jean Muller

Senior Scientist, MSD Switzerland

Jean Muller is a Senior Scientist in Statistical Programming at MSD, specializing in data analysis and Health Technology Assessment (HTA) in the pharmaceutical industry. With over 5 years of experience, Jean has a strong background in Biostatistics (MSc) and applied mathematics (BSc...

Ligia Adamska

Associate Director Statistical Programming, MSD Switzerland

Ligia Adamska is an Associate Director in HTA Statistical Programming at MSD. With a strong academic background, Ligia holds a Ph.D. in Engineering Surveying and Space Geodesy from the University of Nottingham (U.K.) and a B.Sc. in Mathematics from the University of East Anglia (U.K...

Tuesday July 9, 2024 14:30 - 14:50 CEST
Attersee

R workflow + deployment + production

14:50 CEST

Building Interoperability in Existing Software Ecosystems with S3 Classes - Hugo Gruson, data.org

It is common for R packages answering the same need to have different input and output formats. This may result in a large amount of time spent reformatting the inputs and outputs whenever a specific part of the data pipeline is swapped out to use a different R package. This time can come at a huge cost whenever results are needed quickly, such as in pandemic response. Using S3 classes that provide standard formats used by all downstream packages may be a good solution to this issue, thus improving the interoperability within the global R package ecosystem. However, this approach comes with technical and social challenges. Here, I present the work we are doing to implement and encourage the adoption of standard S3 classes in epidemiology. I highlight key findings and challenges, such as how to preserve backward compatibility in existing packages, and give recommendations for future similar endeavors.

Tuesday July 9, 2024 14:50 - 15:10 CEST
Attersee

R workflow + deployment + production

15:10 CEST

Deep Dive Into Industry R Package Quality Assessment - Szymon Maksymiuk & Lorenzo Braschi, Roche

Over the past year, Roche/Genentech has been developing tools and infrastructure that facilitate our quality and validation exercises, a necessary part of using R in the regulated pharmaceutical industry. We’ve designed a process around broadly recognized package development best practices, where automated checks diligently assess packages. The process uses several tools that we have open-sourced over the past years. We want to share our approach, with particular emphasis on the core design philosophies we’ve set out to adhere to when building a cohort of packages suitable for use when a certain level of trust is required. Beyond the process we have developed, we would like to present in detail the components we contributed to the R ecosystem. These are rd2markdown, a package designed to convert R package documentation into standard markdown files, and covtracer, which leverages our contributions to the well-known covr package to map any unit tests to the particular functions they evaluate. We believe various applications for these packages render them useful outside the strict quality assessment area and can play a significant role in day-to-day work with R packages.

Speakers

Lorenzo Braschi

Szymon Maksymiuk

Senior R Developer, Roche

I’m a senior R Developer at Roche specializing in R package validation, ensuring robustness and reliability in pharmaceutical data analysis. With a background as a research software engineer at Warsaw University of Technology specializing in Machine Learning, I have experience as...

Tuesday July 9, 2024 15:10 - 15:30 CEST
Attersee

R workflow + deployment + production

13:30 CEST

Distributed GxP Workloads for R - Magnus Mengelbier, Limelogic AB

The broad and constantly evolving GxP use of R within Life Sciences is powerful. As the user base grows across the organization and R capabilities are added and evolved, you are no longer managing a single environment for a particular use case. The workloads naturally become distributed across multiple environments with different architectures tailored to their particular role and use in the business.

We consider a set of common environments and their architectures and how a little bit of {plumber} can enable a simple-to-manage R architecture across dissimilar environments, even those that do not currently or simply cannot support the use of R. This new approach is easily extendable to Good Clinical Practice, and any of the other GxP domains, with a few simple processes and controls.

Speakers

Magnus Mengelbier

Managing Director, Limelogic AB

Wednesday July 10, 2024 13:30 - 15:00 CEST
TBD

R workflow + deployment + production, Poster Session

13:30 CEST

Implementing Behavioral Nudges in Shiny - Shel Kariuki, UCD

Nudging is based on the idea that behavior is influenced by a wide range of environmental factors, some of which can be altered by simple actions. Nudges can be used to boost employee motivation and productivity.

In this talk, I will demonstrate how behavioral nudges can be integrated into a shiny app.

We'll explore a hypothetical scenario involving a company named Nyawanja, which aims to boost the sales of a new product that they've recently developed. Nyawanja has developed a shiny app that their sales people can use to keep track of their performance. They have divided their sales team into five different treatment groups as part of an experiment designed on the app to motivate their staff and hopefully improve performance. Once someone logs into the app, they'll get a different version of the home page based on their assigned treatment group. The rest of the app will be the same for everyone. Nyawanja can then analyse the effect of these different nudges on performance.
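
The group assignment described in this scenario can be sketched in a few lines. The Python example below is a language-agnostic illustration and is not taken from the talk: the group labels, the salt, and the `assign_group` and `home_page_for` helpers are hypothetical. It assumes each user has a stable identifier and hashes it so that the same person always lands in the same one of the five groups.

```python
import hashlib

# Hypothetical labels; the abstract only states that there are five groups.
TREATMENT_GROUPS = ["control", "nudge_a", "nudge_b", "nudge_c", "nudge_d"]


def assign_group(user_id: str, salt: str = "experiment-1") -> str:
    """Deterministically map a user to one of the treatment groups."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(TREATMENT_GROUPS)
    return TREATMENT_GROUPS[index]


def home_page_for(user_id: str) -> str:
    """Pick the home page variant shown after login (names are made up)."""
    return f"home_{assign_group(user_id)}"
```

Hashing rather than drawing a random number at each login keeps the assignment stable across sessions without storing it, and changing the salt reshuffles the groups for a new experiment.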

From this talk, I hope the audience will learn how to design and run experiments in a shiny app. This presentation will benefit those interested in experimental design and data-driven decision making.

Speakers

Shelmith Nyagathiri Kariuki

Shel Kariuki is currently a student pursuing an MSc in Economics and Data Analytics at University College Dublin. She has worked as a data analyst for around 7 years and has in recent years been involved in building shiny apps. Shel is also a retired co-organizer of R-Ladies Nairobi...

Wednesday July 10, 2024 13:30 - 15:00 CEST
TBD

R workflow + deployment + production, Poster Session

10:35 CEST

Managing REDCap Data: The R package REDCapDM - João Carmezim, Germans Trias i Pujol Research Institute and Hospital (IGTP)

REDCap is a secure web application for creating and managing online surveys and databases. The aim of the R package “REDCapDM” is to process REDCap data and provide useful tools to perform all tasks involved in the data cleansing process prior to statistical analysis. The “REDCapDM” package is structured into four dimensions, each serving a specific purpose. Firstly, reading and processing raw data from REDCap or through a REDCap API connection in R. Secondly, data transformation and data organization. Thirdly, identification of queries, specifically missing values, values outside the lower and upper limits of a variable, and other types of inconsistencies in data from REDCap in R. Fourthly, automatic control of queries already resolved or pending resolution. This package fills a gap in the available tools to manage REDCap data, making it an invaluable asset to researchers. The “REDCapDM” package is available on CRAN (https://cran.r-project.org/web/packages/REDCapDM/index.html) and is regularly updated.

Speakers

João Carmezim

Thursday July 11, 2024 10:35 - 10:40 CEST
Attersee

R workflow + deployment + production, Lightning Talk

10:40 CEST

caRdoon – a task queue API for R - Jakob Gepp, statworx GmbH

In this talk, I will introduce caRdoon, a plumber API that provides local task management by enabling the asynchronous execution of arbitrary functions and offering a real-time view of job queues, inspired by Celery. By utilizing the asynchronous setup, one can avoid waiting for long tasks to finish, but still be able to get information on when a task is scheduled in the current queue. The result of each function is stored in a database for later retrieval. This enables the user to run tasks and review the results on demand.
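
The pattern described above (submit a function, get a job identifier back immediately, look up status and stored results later) can be sketched as follows. This Python example is not caRdoon's code or API. The `TaskQueue` class, its method names, and the table layout are hypothetical, and it assumes results are JSON-serializable and that an in-memory SQLite database stands in for the result store.

```python
import json
import sqlite3
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Optional


class TaskQueue:
    """Run functions in background threads and keep job state in SQLite."""

    def __init__(self, workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()  # serialises access to the connection
        self._db = sqlite3.connect(":memory:", check_same_thread=False)
        self._db.execute(
            "CREATE TABLE jobs (id TEXT PRIMARY KEY, status TEXT, result TEXT)"
        )

    def _set(self, job_id: str, status: str, result: Optional[str] = None) -> None:
        with self._lock:
            self._db.execute(
                "UPDATE jobs SET status = ?, result = ? WHERE id = ?",
                (status, result, job_id),
            )
            self._db.commit()

    def _run(self, job_id: str, fn: Callable[..., Any], args: tuple) -> None:
        self._set(job_id, "running")
        try:
            self._set(job_id, "done", json.dumps(fn(*args)))
        except Exception as exc:  # record the failure instead of losing it
            self._set(job_id, "failed", json.dumps(str(exc)))

    def submit(self, fn: Callable[..., Any], *args: Any) -> str:
        """Queue a call and return its job id without waiting for it."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._db.execute(
                "INSERT INTO jobs VALUES (?, 'queued', NULL)", (job_id,)
            )
            self._db.commit()
        self._pool.submit(self._run, job_id, fn, args)
        return job_id

    def status(self, job_id: str) -> str:
        with self._lock:
            row = self._db.execute(
                "SELECT status FROM jobs WHERE id = ?", (job_id,)
            ).fetchone()
        if row is None:
            raise KeyError(job_id)
        return row[0]

    def result(self, job_id: str) -> Any:
        """Return the stored result, or None if the job has not succeeded."""
        with self._lock:
            row = self._db.execute(
                "SELECT status, result FROM jobs WHERE id = ?", (job_id,)
            ).fetchone()
        if row is None:
            raise KeyError(job_id)
        return json.loads(row[1]) if row[0] == "done" else None

    def shutdown(self) -> None:
        """Wait for all queued jobs to finish."""
        self._pool.shutdown(wait=True)


queue = TaskQueue()
job = queue.submit(pow, 2, 10)  # returns immediately with a job id
queue.shutdown()                # block until outstanding jobs are finished
```

After `shutdown()` returns, `queue.status(job)` and `queue.result(job)` read the outcome from the database, which is the "review the results on demand" step in the abstract.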

Speakers

Jakob Gepp

Senior Consultant Data Science, statworx GmbH

After my M.Sc. in Statistics in 2016, I began working at statworx. Here I started providing statistical support in R for companies and private customers. Over the years I got more into the data science aspect, but kept R close to my heart. I developed some internal R packages and last...

Thursday July 11, 2024 10:40 - 10:45 CEST
Attersee

R workflow + deployment + production, Lightning Talk

11:30 CEST

Desert Island Docker: R Edition - Andrew Collier, Fathom Data

What 3 Docker images would you choose if you were shipwrecked on a desert island? Choosing the right images will determine whether you are rescued or end up in a cannibals' cooking pot (R images will make you unpalatable). Docker is an essential tool for survival as an R developer, regardless of whether you are stranded or not. In this talk I'll describe three R Docker images that I consider essential for survival on a desert island. I'll demonstrate how to set up and build a custom image. And finally I'll demonstrate how using a Docker image can simplify CI/CD and deployment.

- What is Docker?
- Food & Shelter: Base Image
- SOS Signal: Shiny Image
- Building a Raft: Custom Image (which includes rJava and uses renv)
- Applications
  - CI/CD
  - Deployment

Speakers

Andrew Collier

Dr, Fathom Data

Andrew is Lead Data Scientist at Fathom Data. He spends his days tinkering with R, Python and Docker.

Thursday July 11, 2024 11:30 - 11:50 CEST
Salzburg II

R workflow + deployment + production

11:50 CEST

Flowchart: An R Package for Creating Participant Flow Diagrams Integrated with Tidyverse - Pau Satorra, Germans Trias i Pujol Research Institute and Hospital (IGTP)

The presentation will be a brief tutorial about a newly released R package on CRAN called {flowchart}, which creates participant flow diagrams directly from a dataframe (https://cran.r-project.org/web/packages/flowchart/index.html). In health research, a patient flowchart is the best way to show the flow of participants in a study when reporting results, as stated by the CONSORT guideline (https://www.bmj.com/content/340/bmj.c332.long). There are several packages in R for drawing flowcharts using different approaches, but generally the programming is quite complex and the numbers need to be manually entered or parameterized beforehand. This new package uses a different approach integrated into the tidyverse framework. It allows you to create many different types of flowcharts in an easy and much more reproducible way because it automatically adapts to the data. This means we don’t have to manually set the flowchart parameters, such as the box coordinates or the numbers to display. The main idea behind the package is to create flowcharts from an initial dataset by combining different basic functions with the pipe operator (|> or %>%).

Speakers

Pau Satorra

Mr, Germans Trias i Pujol Research Institute and Hospital (IGTP)

I'm a biostatistician with a background in mathematics and 4 years of experience in clinical research analysis. In 2019 I graduated in Mathematics from the University of Barcelona (UB). In 2023, I graduated from the Master in Fundamental Principles of Data Science at UB. From 2019...

Thursday July 11, 2024 11:50 - 12:10 CEST
Salzburg II

R workflow + deployment + production

12:10 CEST

Optimising Your Git Workflow - Colin Gillespie, https://jumpingrivers.com/

Everyone(?!) uses Git in their day-to-day R workflow. Very soon, pushing, pulling, cloning and forking become second nature. But once you’ve mastered the basics, what next? This talk discusses the next steps in using Git. Wouldn’t it be nice if our code was automatically formatted? Errors in our packages flagged? Packages deployed to a remote CRAN-like repository? Well, have you considered GitHub Actions? Have you started working with other data scientists? How should you set up your repo to ensure a smooth workflow? How should merge requests be handled? How do you best utilise Git issues? Is your code sensitive? Should you set up GPG keys for commits? How should you ensure your API keys remain hidden? This talk aims to point useRs in the right direction for a friction-free R workflow.

Speakers

Colin Gillespie

CTO, Jumping Rivers

Colin is a Senior Statistics lecturer at Newcastle University and is a co-founder & CTO of Jumping Rivers. He has used R for over twenty years and has been teaching R for the past fifteen years. He co-authored the O’Reilly book on Efficient R Programming.

Thursday July 11, 2024 12:10 - 12:30 CEST
Salzburg II

R workflow + deployment + production

12:30 CEST

Building Large-Scale Simulation Pipelines Using Targets, Git and GitHub Actions - Sergio Olmos, Sanofi

Innovative clinical trial designs typically involve advanced statistical methods and extensive simulations. Building these complex simulation pipelines introduces challenges in reproducibility and transparency not easily addressed by traditional development workflows. In this session we will present how to use the targets R package, a Make-like pipeline tool, to develop efficient and reproducible simulation pipelines for innovative clinical trial designs. We will then show how Git and GitHub Actions can be used to deploy these large-scale simulation pipelines to cloud computing instances/clusters. The combination of these tools results in a robust and efficient workflow, enhancing the reproducibility of complex simulation pipelines. We will provide a detailed walkthrough of our approach, complete with practical examples and best practices, making it a valuable resource for statisticians and research software engineers working on innovative clinical trial designs and beyond.

Speakers

Sergio Olmos

Statistician, Sanofi

Sergio Olmos is a statistician in Sanofi working on the implementation of innovative clinical trial designs within the Statistical Innovation Hub. He is an experienced R developer with experience building reproducible analytical pipelines and creating R packages using software engineering...

Thursday July 11, 2024 12:30 - 12:50 CEST
Salzburg II

R workflow + deployment + production