Background: Gene expression matrices produced by DNA microarray technology inevitably
contain multiple missing entries due to experimental problems. Predicting the missing values in a gene
expression matrix is essential, as algorithms that analyze gene expression typically need a matrix without
missing values.
Objective: The objective of this paper is to present a novel bicluster-based sequential interpolation
imputation method to predict missing values in gene expression data.
Method: For each missing entry, the method first generates a bicluster by selecting a number of
genes and samples correlated with that missing position, and then applies an interpolation-based
approximation technique to that bicluster. Imputation starts from the gene with the
minimum number of missing values and proceeds sequentially, reusing the already imputed values.
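
The steps described under Method can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the function name `bicluster_impute`, the parameters `k_genes` and `k_samples`, the use of absolute Pearson correlation to pick genes, Euclidean distance to pick samples, piecewise-linear interpolation (`np.interp`) as the interpolation step, and the column-mean fallback are all assumptions made for the example.

```python
import numpy as np

def bicluster_impute(X, k_genes=2, k_samples=4):
    """Fill NaNs in X (rows = genes, columns = samples), easiest genes first."""
    X = np.array(X, dtype=float)
    # Process genes in order of increasing number of missing values.
    order = np.argsort(np.isnan(X).sum(axis=1), kind="stable")
    for g in order:
        for s in np.flatnonzero(np.isnan(X[g])):
            obs = np.flatnonzero(~np.isnan(X[g]))  # samples known for gene g
            # Candidate genes must be known at s and at all of gene g's known samples.
            cand = [h for h in range(X.shape[0])
                    if h != g and not np.isnan(X[h, s])
                    and not np.isnan(X[h, obs]).any()]
            if len(obs) < 2 or not cand:
                X[g, s] = np.nanmean(X[:, s])  # fallback (assumption): column mean
                continue
            # Bicluster rows: the k_genes genes most correlated with gene g.
            corr = [abs(np.nan_to_num(np.corrcoef(X[g, obs], X[h, obs])[0, 1]))
                    for h in cand]
            rows = [cand[i] for i in np.argsort(corr)[::-1][:k_genes]]
            # Bicluster columns: the k_samples samples closest to sample s on those rows.
            dist = np.linalg.norm(X[np.ix_(rows, obs)] - X[rows, s][:, None], axis=0)
            cols = obs[np.argsort(dist)[:k_samples]]
            # Interpolate gene g as a piecewise-linear function of each selected gene.
            estimates = []
            for h in rows:
                idx = np.argsort(X[h, cols])
                estimates.append(np.interp(X[h, s], X[h, cols][idx], X[g, cols][idx]))
            X[g, s] = float(np.mean(estimates))  # written back, so later steps reuse it
    return X
```

Because each estimate is written back into `X` before the next missing position is handled, later biclusters can draw on earlier imputations, which mirrors the sequential reuse described above.
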
Results: The proposed method is compared with seven well-known existing estimation
techniques over nine different data sets. The metrics used to compare performance are the normalized
root mean square error (NRMSE) and the average distance between partition errors (ADBPE).
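
The abstract does not state the NRMSE formula. One definition commonly used in imputation studies divides the mean squared error over the artificially removed entries by the variance of their true values and takes the square root; the sketch below assumes that definition, and the function name `nrmse` is hypothetical.

```python
import numpy as np

def nrmse(true_values, imputed_values):
    """NRMSE over the entries that were removed and then imputed."""
    true_values = np.asarray(true_values, dtype=float)
    imputed_values = np.asarray(imputed_values, dtype=float)
    # Mean squared difference between imputed and true values.
    mse = np.mean((imputed_values - true_values) ** 2)
    # Normalize by the (population) variance of the true values.
    return float(np.sqrt(mse / np.var(true_values)))
```

Under this definition a perfect imputation scores 0, and replacing every entry with the mean of the true values scores 1, so lower values indicate better estimates.
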
Conclusion: The proposed method is observed to perform better than the well-known methods
on a variety of data sets. The novelty of this approach lies in applying an interpolation technique within the
identified local structure (bicluster) to predict missing values.