Title: Using the Markov Chain Monte Carlo Method to Make Inferences on Items of Data Contaminated by Missing Values
Abstract: Markov Chain Monte Carlo (MCMC) is a method used to estimate parameters of interest under difficult conditions, such as when data are missing or when the underlying distributions do not satisfy the assumptions of maximum likelihood procedures. Its objective is to obtain a probability distribution, known in Bayesian analysis as the posterior distribution, from which target parameters can be estimated. In this paper, we consider a case where data are contaminated with missing values and must therefore be handled with appropriate missing data techniques before inferences are made. We review the mathematics involved in MCMC procedures in the presence of missing data. Furthermore, we use real data to compare inferences obtained from three approaches: multiple imputation based on the multivariate normal (MVN) model, which uses the MCMC procedure; the case deletion (CD) method, which discards subjects with missing values from the analysis; and the fully conditional specification (FCS) multiple imputation method, which uses a sequence of regression models to fill in missing values. Assuming that data are missing completely at random (MCAR) on continuous, normally distributed variables, we obtain the following findings: (1) The higher the proportion of missing data on a variable of interest, the more the relationship between that variable and the dependent variable is distorted, whichever missing data method is applied. (2) The multiple imputation based methods produce similar estimates, and these are better than the estimates from the case deletion method. (3) Once the proportion of missing data becomes high, none of the missing data techniques can preserve a relationship that initially existed between the dependent variable and some of the covariates of interest in the dataset.
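To make the comparison described in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of case deletion versus MVN-based multiple imputation by data augmentation. It rests on several assumptions that are not in the paper: synthetic bivariate normal data rather than the real data, a single covariate with MCAR missingness, a noninformative (Jeffreys) prior, and hypothetical function names (`ols_slope`, `mvn_mcmc_impute`, `rubin_pool`). The burn-in, spacing, and number of imputations are illustrative choices only.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of y on x and its estimated sampling variance."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], cov[1, 1]

def mvn_mcmc_impute(x, y, m=5, burn_in=200, spacing=50, rng=None):
    """Data augmentation for a bivariate normal model; x may contain NaN.

    Alternates a P-step (draw mu, Sigma given completed data) with an
    I-step (draw missing x given y, mu, Sigma). Returns m completed x's.
    """
    rng = np.random.default_rng() if rng is None else rng
    miss = np.isnan(x)
    n = len(x)
    x_cur = np.where(miss, np.nanmean(x), x)  # crude starting values
    completed = []
    for it in range(burn_in + m * spacing):
        data = np.column_stack([x_cur, y])
        # P-step under a Jeffreys prior (an assumption of this sketch):
        # Sigma^{-1} ~ Wishart(n - 1, S^{-1}), mu | Sigma ~ N(mean, Sigma / n)
        center = data.mean(axis=0)
        dev = data - center
        S = dev.T @ dev
        L = np.linalg.cholesky(np.linalg.inv(S))
        Z = rng.standard_normal((n - 1, 2)) @ L.T
        Sigma = np.linalg.inv(Z.T @ Z)
        mu = center + np.linalg.cholesky(Sigma / n) @ rng.standard_normal(2)
        # I-step: conditional normal distribution of x given observed y
        b = Sigma[0, 1] / Sigma[1, 1]
        cond_mean = mu[0] + b * (y[miss] - mu[1])
        cond_sd = np.sqrt(Sigma[0, 0] - b * Sigma[0, 1])
        x_cur = x_cur.copy()
        x_cur[miss] = rng.normal(cond_mean, cond_sd)
        # Keep every `spacing`-th draw after burn-in as one imputation
        if it >= burn_in and (it - burn_in + 1) % spacing == 0:
            completed.append(x_cur.copy())
    return completed

def rubin_pool(estimates, variances):
    """Combine m point estimates and variances with Rubin's rules."""
    m = len(estimates)
    within = np.mean(variances)
    between = np.var(estimates, ddof=1)
    return np.mean(estimates), within + (1 + 1 / m) * between

# Synthetic example: y depends on x with true slope 2; x is MCAR-missing.
rng = np.random.default_rng(0)
n, miss_rate = 500, 0.3
x_full = rng.normal(size=n)
y = 1.0 + 2.0 * x_full + rng.normal(size=n)
x_obs = x_full.copy()
x_obs[rng.random(n) < miss_rate] = np.nan

# Case deletion: fit only the rows where x is observed
keep = ~np.isnan(x_obs)
cd_slope, cd_var = ols_slope(x_obs[keep], y[keep])

# Multiple imputation: fit each completed dataset, then pool
imputed = mvn_mcmc_impute(x_obs, y, m=5, rng=rng)
fits = [ols_slope(xi, y) for xi in imputed]
mi_slope, mi_var = rubin_pool([f[0] for f in fits], [f[1] for f in fits])

print(f"case deletion:       slope={cd_slope:.3f}, se={np.sqrt(cd_var):.3f}")
print(f"multiple imputation: slope={mi_slope:.3f}, se={np.sqrt(mi_var):.3f}")
```

Raising `miss_rate` in this sketch is one way to explore how the estimated slope behaves as the proportion of missing data grows; the numbers it prints come from simulated data and say nothing about the results reported in the paper.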