PheWAS - phenome-wide association studies


Current methods to identify gene-disease associations primarily rely on clinical trials or observational cohorts to identify patients.  At Vanderbilt, we have used an EMR-linked DNA biobank called BioVU to derive case and control populations, using data within the electronic medical record (EMR) to define clinical phenotypes.  Genetic data for these EMR-linked association studies are redeposited into BioVU for future EMR-linked studies.  This has opened the possibility of "reverse GWAS", or "phenome-wide association studies" (PheWAS), in which a given genetic variant is tested for association across many EMR-derived phenotypes rather than against a single disease.


PheWAS using ICD9 codes
Our initial studies in PheWAS have been performed using a custom-developed grouping of International Classification of Diseases, 9th Revision (ICD9) codes.  These groupings loosely follow the 3-digit (category) and section groupings defined within the ICD9 code system itself, but vary to include, for example, all hypertension codes (401-405) as one grouping.  Each custom PheWAS code group also has an associated control group that excludes patients with other related conditions (e.g., a patient with psoriatic arthritis cannot be a control for rheumatoid arthritis).  Such groupings are based on other similar work.
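The case, control, and exclusion logic described above can be sketched roughly as follows. This is a minimal illustration, not the code from phewas.pl: the function names, the in-memory range representation, and the specific numeric ranges used for rheumatoid arthritis (714.x) and psoriatic arthritis (696.0) are assumptions made for the example, and the real code translation file may be laid out differently. Only the hypertension range (401-405) is taken from the text.

```python
from typing import Iterable, List, Optional, Tuple

Range = Tuple[float, float]  # inclusive (low, high) bounds on numeric ICD9 codes


def parse_numeric_icd9(code: str) -> Optional[float]:
    """Return the code as a number, or None for V/E codes and malformed input."""
    try:
        return float(code)
    except ValueError:
        return None


def matches(codes: Iterable[str], ranges: List[Range]) -> bool:
    """True if any of the patient's numeric codes falls inside any range."""
    for code in codes:
        value = parse_numeric_icd9(code)
        if value is None:
            continue  # this sketch ignores non-numeric (V/E) codes
        if any(low <= value <= high for low, high in ranges):
            return True
    return False


def classify(codes: Iterable[str],
             case_ranges: List[Range],
             exclude_ranges: List[Range]) -> str:
    """Label a patient 'case', 'excluded', or 'control' for one phewas code."""
    codes = list(codes)
    if matches(codes, case_ranges):
        return "case"
    if matches(codes, exclude_ranges):
        return "excluded"  # related condition: not usable as a control
    return "control"


# Hypothetical group definitions, for illustration only.
HYPERTENSION: List[Range] = [(401.0, 405.99)]
RA: List[Range] = [(714.0, 714.99)]
RA_EXCLUDE: List[Range] = [(696.0, 696.0)]  # psoriatic arthritis

if __name__ == "__main__":
    print(classify(["401.9", "250.00"], HYPERTENSION, []))
    print(classify(["696.0"], RA, RA_EXCLUDE))
```

Keeping "excluded" as a third label, rather than dropping those patients silently, makes it easy to report how many individuals were removed from the control pool for each phewas code.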

Performing PheWAS using ICD9 codes replicated previously known gene-disease associations for 4 of the 7 diseases tested (see publication): multiple sclerosis, rheumatoid arthritis, Crohn's disease, and ischemic heart disease.

The files necessary to perform PheWAS are available below:

  • code translation file: This file groups ICD9 codes into "phewas codes" of like ICD9 codes. It also defines control ranges ("phewas_exclude_range") for each "phewas code".
  • phewas.pl: A Perl script that takes as its input tab-delimited genotype files, a file containing all ICD9 codes for each individual, and a file with race and gender for each individual. Its options are documented in the header of the script.
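To give a sense of what a phenome-wide scan involves once each individual has been labeled per phewas code, here is a rough sketch for a single variant. It is not a description of what phewas.pl does internally: the allelic chi-square test, the function names, and the dictionary-based inputs are all assumptions chosen for illustration, and the sketch ignores the race and gender covariates mentioned above.

```python
import math
from typing import Dict, Tuple


def allelic_chi2(case_alleles: Tuple[int, int],
                 control_alleles: Tuple[int, int]) -> Tuple[float, float]:
    """Chi-square test (1 df) on a 2x2 table of (minor, major) allele counts.

    Returns (chi2, p). A table with an empty row or column yields (0.0, 1.0).
    """
    a, b = case_alleles
    c, d = control_alleles
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


def phewas_scan(genotypes: Dict[str, int],
                status_by_phenotype: Dict[str, Dict[str, str]]
                ) -> Dict[str, Tuple[float, float]]:
    """Test one variant against every phenotype.

    genotypes: person id -> minor-allele count (0, 1, or 2).
    status_by_phenotype: phenotype -> person id -> 'case' / 'control' / 'excluded'.
    """
    results = {}
    for phenotype, status in status_by_phenotype.items():
        counts = {"case": [0, 0], "control": [0, 0]}
        for person, label in status.items():
            if label in counts and person in genotypes:
                g = genotypes[person]
                counts[label][0] += g        # minor alleles
                counts[label][1] += 2 - g    # major alleles
        results[phenotype] = allelic_chi2(tuple(counts["case"]),
                                          tuple(counts["control"]))
    return results
```

Because the same variant is tested against many phenotypes, the resulting p-values would normally need some form of multiple-testing correction before being interpreted.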

 

Creative Commons License
PheWAS by Josh Denny, MD MS is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.

Users should reference: Denny JC, Ritchie MD, Basford M, Pulley J, Bastarache L, Brown-Gentry K, Wang D, Masys DR, Roden DM, Crawford DC. PheWAS: Demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics. 2010 Mar 24. [Epub ahead of print] PMID: 20335276