Entropy Filtering Method and Insertion/Deletion Robust Algorithm for Multiple Local Sequence Alignment

Jun Xie
Ph.D., 2000
Advisor: Ker-Chau Li

Bayesian models have been developed for finding ungapped motifs in multiple protein sequences (Liu, Neuwald and Lawrence 1995). In this article we extend the model to allow for deletions and insertions in the motifs. Direct generalization of the ungapped algorithm, which is based on Gibbs sampling, proves unsuccessful because the configuration space becomes much larger. To alleviate this difficulty, we introduce a method called entropy filtering, which allows us to find a better starting point. In addition to Gibbs sampling, we also provide a Metropolis-Hastings algorithm, which shows more stable performance. The significance of the alignment is discussed at the end.