Household consumption or income surveys typically do not cover refugee populations. In the rare cases where refugees are included, inconsistencies between data sources can undermine the comparability of poverty estimates. We test the performance of a recently developed cross-survey imputation method to estimate poverty for a sample of refugees in Colombia, combining household income surveys collected by the Government of Colombia and administrative (ProGres) data collected by the United Nations High Commissioner for Refugees in 2019 and 2022. We find that certain variable transformation methods can help resolve these inconsistencies. Estimation results with our preferred variable standardization method are robust to different imputation methods, including the normal linear regression method, the empirical distribution of the errors method, and the probit and logit methods. Several common machine learning techniques generally perform worse than our proposed imputation methods. We also find that, for most poverty lines, we can reasonably impute poverty rates using an older household income survey and a more recent ProGres dataset. These results provide relevant inputs for designing better surveys and administrative datasets on refugees in various country settings.
JEL Classification: C15, F22, I32, O15, O20
Keywords: Colombia, imputation, poverty, refugees