Bayesian Inference on Allele Group Structure for High Order Interactions in Genome-Wide Association Studies

Albert Wong
M.S., 2013
Advisor: Qing Zhou
Sophisticated Bayesian methods are often used to identify a collection of alleles that are jointly associated with a particular disease. A disease might not be expressed when only one of these alleles is present, but the associated alleles might interact with one another in a complicated way, causing the disease to be expressed. In investigating a patient’s susceptibility to a disease, it is often useful to group the collection of associated alleles according to their risk factors. Our goal is to find the most likely grouping structure of alleles $C_1, \ldots, C_m$ associated with Rheumatoid Arthritis given case-control data. The number of ways to group these $m$ alleles is given by the $m$th Bell number $B_m$, defined recursively by $B_m = \sum_{k=0}^{m-1} \binom{m-1}{k} B_k$ with $B_0 = B_1 = 1$. For $m = 10$ alleles, this translates to 115,975 groupings. For $m = 15$, there are over a billion ways to group $C_1, \ldots, C_m$. Clearly, computing the probability of every grouping quickly becomes intractable as $m$ grows. A combination of the Metropolis-Hastings algorithm and a local search algorithm is proposed to accomplish this task. The strategy is first applied to simulated data with a sufficiently large sample size and a known grouping structure, where it recovers the correct grouping. When the algorithm is run multiple times on Rheumatoid Arthritis data, the results are stable.
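The growth of the search space can be checked directly from the Bell number recurrence stated above. The following is a minimal Python sketch for illustration only; the function name `bell_numbers` is hypothetical and is not taken from the thesis.

```python
from math import comb


def bell_numbers(m_max):
    """Return [B_0, ..., B_{m_max}] via B_m = sum_{k=0}^{m-1} C(m-1, k) * B_k."""
    bell = [1]  # B_0 = 1
    for m in range(1, m_max + 1):
        # Each new term only needs the previously computed B_0, ..., B_{m-1}.
        bell.append(sum(comb(m - 1, k) * bell[k] for k in range(m)))
    return bell


bell = bell_numbers(15)
print(bell[10])  # 115975 groupings for 10 alleles
print(bell[15])  # 1382958545 groupings for 15 alleles
```

Going from 10 to 15 alleles multiplies the number of candidate groupings by roughly 12,000, which is why exhaustive enumeration is replaced by a sampling and search strategy.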