Banff or Simputation? Assessing Alternative Approaches to the Imputation of Missing and Erroneous Data on BEA’s Multinational Enterprise Surveys

BEA employs automated data editing and imputation systems to process a subset of its multinational enterprise (MNE) surveys. Until recently, all of the auto-editing systems used at BEA were built around Banff, a data editing and imputation system produced by Statistics Canada that runs in SAS. To enhance BEA’s flexibility, BEA researchers have explored the feasibility of building auto-editing systems around software other than Banff, focusing in particular on a set of R packages created by researchers at Statistics Netherlands. Among these R packages, the Simputation package is responsible for almost all of the imputation functionality. Since previous research has established that Banff produces highly accurate imputations on BEA’s MNE surveys, a key question is whether Simputation produces imputations that are as accurate as Banff’s. This project employs a simulation-based approach to assess the relative accuracy of the imputations produced by Banff and Simputation, using data collected by two different MNE survey instruments. The simulation results indicate that Simputation is sufficiently accurate to make the Statistics Netherlands R packages a viable alternative to Banff. However, the results differ by survey instrument, suggesting that Simputation might produce more accurate imputations when the instrument collects relatively few data items and that Banff might be more accurate for forms that are longer and more complex.
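
The general idea behind a simulation-based accuracy comparison can be illustrated with a minimal Python sketch. It is not the paper's procedure, and it does not call Banff or Simputation: it assumes a generic design in which known values are deleted at random, each candidate method imputes them, and the imputations are scored against the true values. The two imputers (ratio and mean), the field names `employment` and `sales`, the synthetic data, and all function names are hypothetical stand-ins chosen for illustration.

```python
import random
import statistics


def ratio_impute(records, target, aux):
    """Fill missing `target` values with aux * (sum of target / sum of aux),
    where the ratio is computed over records that report `target`."""
    complete = [r for r in records if r[target] is not None]
    ratio = sum(r[target] for r in complete) / sum(r[aux] for r in complete)
    return [
        dict(r, **{target: r[aux] * ratio}) if r[target] is None else dict(r)
        for r in records
    ]


def mean_impute(records, target, aux=None):
    """Fill missing `target` values with the mean of the reported values."""
    mean = statistics.fmean(r[target] for r in records if r[target] is not None)
    return [
        dict(r, **{target: mean}) if r[target] is None else dict(r)
        for r in records
    ]


def simulate(truth, target, aux, imputers, n_rounds=200, missing_rate=0.2, seed=0):
    """Repeatedly blank out true values, impute them with each method, and
    return each method's mean absolute error averaged over all rounds."""
    rng = random.Random(seed)
    errors = {name: [] for name in imputers}
    n_missing = max(1, int(len(truth) * missing_rate))
    for _ in range(n_rounds):
        masked_idx = set(rng.sample(range(len(truth)), n_missing))
        masked = [
            dict(r, **{target: None}) if i in masked_idx else dict(r)
            for i, r in enumerate(truth)
        ]
        for name, imputer in imputers.items():
            filled = imputer(masked, target, aux)
            errors[name].append(
                statistics.fmean(
                    abs(filled[i][target] - truth[i][target]) for i in masked_idx
                )
            )
    return {name: statistics.fmean(errs) for name, errs in errors.items()}


def make_synthetic(n=100, seed=1):
    """Synthetic records (not survey data): sales roughly proportional to employment."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        employment = rng.uniform(10, 1000)
        sales = 2.5 * employment * rng.uniform(0.9, 1.1)
        rows.append({"employment": employment, "sales": sales})
    return rows
```

Calling `simulate(make_synthetic(), "sales", "employment", {"ratio": ratio_impute, "mean": mean_impute})` returns one average error per method, so the methods can be ranked on the same masked records. Fixing the seed makes every method face identical deletion patterns, which keeps the comparison fair.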

Larkin Terrie

JEL Code(s) F23 Published