phenotyping


Methods to identify gene-disease associations primarily rely on clinical trials or observational cohorts and, more recently, on electronic medical record (EMR)-linked DNA biobanks. At Vanderbilt, we have used an EMR-linked DNA biobank called BioVU to derive case and control populations, using data within the EMR to define clinical phenotypes. Genetic data from these EMR-linked association studies are redeposited into BioVU for future EMR-linked studies. This has opened the possibility of "reverse GWAS," or phenome-wide association studies (PheWAS).

We replicated known genetic associations for five diseases. We genotyped the first 10,000 samples accrued into BioVU (the Vanderbilt EMR-associated DNA biobank) at twenty-one loci that had been associated with five common diseases (reported odds ratios 1.14-2.36) in at least two previous studies. We developed automated phenotype identification algorithms that combined NLP techniques (to identify key findings, medication names, and family history), billing code queries, and structured data elements (such as laboratory results) to identify cases (n=70-698) and controls (n=808-3818).
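To make the idea of combining several EMR evidence sources concrete, here is a minimal sketch of a rule-based case/control classifier. It is not the algorithm used in the study. The record layout, the `PhenotypeRule` fields, the placeholder codes, and the "two or more evidence types means case, none means control, anything else is excluded" rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    CASE = "case"
    CONTROL = "control"
    EXCLUDED = "excluded"  # ambiguous evidence, kept out of both groups


@dataclass
class PatientRecord:
    # Hypothetical record layout; a real EMR extract would look different.
    billing_codes: set = field(default_factory=set)
    nlp_medications: set = field(default_factory=set)  # names extracted from notes
    labs: dict = field(default_factory=dict)  # lab name -> highest observed value


@dataclass
class PhenotypeRule:
    # All fields are illustrative placeholders, not values from the study.
    case_codes: set
    case_medications: set
    lab_name: str
    lab_threshold: float
    min_evidence: int = 2  # assumed: require two independent evidence types


def classify(patient: PatientRecord, rule: PhenotypeRule) -> Status:
    """Count how many evidence types support the phenotype, then assign a status."""
    evidence = 0
    if patient.billing_codes & rule.case_codes:
        evidence += 1
    if patient.nlp_medications & rule.case_medications:
        evidence += 1
    lab_value = patient.labs.get(rule.lab_name)
    if lab_value is not None and lab_value >= rule.lab_threshold:
        evidence += 1

    if evidence >= rule.min_evidence:
        return Status.CASE
    if evidence == 0:
        return Status.CONTROL
    return Status.EXCLUDED


# Usage with placeholder codes, drug names, and thresholds.
rule = PhenotypeRule(
    case_codes={"CODE_A"},
    case_medications={"drug_x"},
    lab_name="lab_y",
    lab_threshold=6.5,
)
likely_case = PatientRecord(billing_codes={"CODE_A"}, nlp_medications={"drug_x"})
likely_control = PatientRecord()
ambiguous = PatientRecord(labs={"lab_y": 7.0})

for p in (likely_case, likely_control, ambiguous):
    print(classify(p, rule).value)
```

Excluding patients with only partial evidence, rather than forcing them into one group, is one way to keep both the case and the control sets clean, at the cost of smaller sample sizes.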