useR! 2024
8 - 11 July, 2024

Wednesday, July 10 • 15:20 - 15:40
MissForestPredict - Missing Data Imputation in Prediction Settings - Elena Albu, KU Leuven, Belgium

Prediction models are used to predict an outcome based on input variables. Missing data in input variables often occurs at model development and at prediction time. The newly released missForestPredict R package proposes an adaptation of the missForest imputation algorithm that is fast, user-friendly and tailored for prediction settings. The algorithm iteratively imputes variables using random forests until a convergence criterion (unified for continuous and categorical variables and based on the out-of-bag error) is met. The imputation models are saved for each variable and iteration and can be applied later to new observations. The missForestPredict package offers extended error monitoring, control over variables used in the imputation and custom initialization. This allows users to tailor the imputation to their specific needs. The missForestPredict algorithm is further compared to mean/mode imputation, k-nearest neighbours, bagging and two iterative algorithms (miceRanger and IterativeImputer) on 8 simulated datasets with simulated missingness and 8 public datasets using different prediction models. missForestPredict provides satisfactory results within short computation times.
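
The fit-then-apply workflow described above can be illustrated with a minimal Python sketch. It is not the missForestPredict implementation and does not use its API: all function names (fit_stump, predict_stump, fit_imputer, impute_new) and the toy data are hypothetical. Two simplifications are assumed: a single one-split regression tree (a "stump") stands in for the random forest, and an in-sample normalised squared error stands in for the out-of-bag error. Only numeric variables are handled.

```python
from statistics import mean

def fit_stump(rows, target):
    """Fit a one-split regression tree for column `target` (stand-in for a random forest)."""
    y = [r[target] for r in rows]
    best = (None, None, mean(y), mean(y))           # fallback: predict the mean
    best_sse = sum((v - mean(y)) ** 2 for v in y)
    for col in range(len(rows[0])):
        if col == target:
            continue
        for t in sorted({r[col] for r in rows})[:-1]:   # drop max so both sides are non-empty
            left = [r[target] for r in rows if r[col] <= t]
            right = [r[target] for r in rows if r[col] > t]
            lm, rm = mean(left), mean(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if sse < best_sse:
                best, best_sse = (col, t, lm, rm), sse
    return best, best_sse

def predict_stump(stump, row):
    col, t, lm, rm = stump
    return lm if col is None or row[col] <= t else rm

def fit_imputer(data, max_iter=10):
    """Iteratively impute `data` (None = missing) and keep every fitted model."""
    n_cols = len(data[0])
    missing = [[v is None for v in row] for row in data]
    # Initialisation: column means of the observed values (assumed default).
    init = [mean(r[j] for r in data if r[j] is not None) for j in range(n_cols)]
    filled = [[init[j] if v is None else v for j, v in enumerate(row)] for row in data]
    to_impute = [j for j in range(n_cols) if any(m[j] for m in missing)]
    models, prev_err = [], float("inf")
    for _ in range(max_iter):
        candidate = [row[:] for row in filled]
        step, err = {}, 0.0
        for j in to_impute:
            train = [r for r, m in zip(candidate, missing) if not m[j]]
            stump, sse = fit_stump(train, j)
            sst = sum((r[j] - init[j]) ** 2 for r in train)
            err += sse / sst if sst > 0 else 0.0    # in-sample NMSE, not out-of-bag
            for r, m in zip(candidate, missing):
                if m[j]:
                    r[j] = predict_stump(stump, r)
            step[j] = stump
        if err >= prev_err:                         # no improvement: discard this iteration
            break
        models.append(step)
        filled, prev_err = candidate, err
    return {"init": init, "models": models}, filled

def impute_new(imputer, row):
    """Apply the saved initialisation and per-iteration models to one new observation."""
    missing = [v is None for v in row]
    out = [imputer["init"][j] if m else v for j, (v, m) in enumerate(zip(row, missing))]
    for step in imputer["models"]:
        for j, stump in step.items():
            if missing[j]:
                out[j] = predict_stump(stump, out)
    return out

# Toy training data: the second column is 10 for small and 20 for large first-column values.
data = [[1.0, 10.0], [2.0, 10.0], [3.0, 20.0], [4.0, 20.0],
        [1.5, None], [None, 20.0], [3.5, None]]
imputer, filled = fit_imputer(data)
print(impute_new(imputer, [1.2, None]))   # [1.2, 10.0]
```

The point of the sketch is the separation between fitting and applying: the initial values and the models of every retained iteration are stored, so a new observation is imputed by replaying them in the same order, without refitting and without access to the training data.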

Speakers
Elena Albu

Ms., KU Leuven, Belgium
During her career in healthcare IT and data science, she worked with Electronic Health Record (EHR) data and gained knowledge on medical workflows. In 2019, she earned her Master of Science in Statistical Data Analysis at the University of Ghent, Belgium. During this program, her...


Wednesday July 10, 2024 15:20 - 15:40 CEST
Wolfgangsee