useR! 2024
In Person & Virtual
8 - 11 July, 2024

Thursday, July 11 • 12:30 - 12:50
Split-Apply-Combine with Dynamic Grouping - Mark van der Loo, Statistics Netherlands


Group-wise aggregation is one of the most common operations in data analysis. There are use cases where the grouping is determined dynamically by collapsing smaller subsets into larger ones, to ensure sufficient support for the target aggregate. Examples include cases where some of the target groups suffer from missing data, or cases where the quality of the target group data is judged to be too low. Often, hierarchical classifications serve as a basis for forming larger groups, but custom 'collapsing schemes' are in use as well. In this presentation we demonstrate the R package 'accumulate' [1], which offers interfaces for defining grouped aggregation where the grouping may be determined dynamically, based on user-defined aggregations, user-defined decision rules, and user-defined collapsing schemes. The package offers several ways to define collapsing schemes, including tabular definitions that can be maintained separately from the aggregation code. It also includes facilities for using hierarchical classifications and for testing the (possibly complex) decision rules that users can create. [1] https://cran.r-project.org/package=accumulate
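
The idea of dynamic grouping described in the abstract can be illustrated with a minimal sketch. This is not the 'accumulate' package or its API; it is a Python illustration, and every name in it (collapse_levels, accumulate_sketch, the "sector" and "turnover" fields, the three-record threshold) is a hypothetical choice made for the example. It assumes a hierarchical classification in which a coarser group is obtained by dropping the last digit of a code, and a decision rule that requires a minimum number of non-missing records.

```python
from statistics import mean


def collapse_levels(code):
    """Yield the code itself, then ever-coarser prefixes.

    Hypothetical collapsing scheme: '112' -> '11' -> '1'.
    """
    for n in range(len(code), 0, -1):
        yield code[:n]


def accumulate_sketch(records, key, value, test, fun):
    """Aggregate each target group over the smallest enclosing group
    whose non-missing values pass the decision rule `test`.

    Returns {target: (collapse_level, aggregate)}; the entry is
    (None, None) when no level passes the test.
    """
    out = {}
    for target in sorted({r[key] for r in records}):
        out[target] = (None, None)
        for level, prefix in enumerate(collapse_levels(target)):
            # Select all non-missing values in the (possibly widened) group.
            subset = [r[value] for r in records
                      if r[key].startswith(prefix) and r[value] is not None]
            if test(subset):
                out[target] = (level, fun(subset))
                break
    return out


# Made-up example data; None marks a missing value.
records = [
    {"sector": "111", "turnover": 10.0},
    {"sector": "111", "turnover": 14.0},
    {"sector": "111", "turnover": 12.0},
    {"sector": "112", "turnover": 20.0},
    {"sector": "121", "turnover": None},
    {"sector": "121", "turnover": 30.0},
]

# Decision rule (assumed): at least three non-missing records.
result = accumulate_sketch(records, "sector", "turnover",
                           test=lambda xs: len(xs) >= 3, fun=mean)

for target, (level, agg) in result.items():
    print(target, level, agg)
```

In this sketch, sector "111" has enough records and is aggregated as is (level 0), "112" is collapsed one step to "11", and "121" is collapsed two steps to "1". Reporting the collapse level next to each aggregate keeps it visible which results rest on a wider group than the one requested.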

Speakers

Mark van der Loo

Senior Researcher, Statistics Netherlands
Mark is a Senior Researcher at Statistics Netherlands and a Research Fellow at the Leiden Institute for Advanced Computer Science at the University of Leiden. Mark published his first package in 2009 and has since co-authored about 20 R packages, a book on statistical data cleaning...


Thursday July 11, 2024 12:30 - 12:50 CEST
Attersee