Background:
Over the last few years the availability and use of densely spaced SNPs (Single Nucleotide Polymorphisms) in genetic mapping and association studies have exploded. The candidate genes suggested by full genome scans or by functional studies need to be tested in separate epidemiological studies to weed out false positives. For the SNPs to serve as fully informative markers for a deleterious gene, one needs to reconstruct which SNP variants lie on the same chromosome (the SNP haplotypes).
Aims:
The standard models, such as logistic regression (for case-control data) and the transmission-disequilibrium test (for triads, highly useful in perinatal epidemiology), must be extended to accommodate SNP haplotype data. Since there is a strong demand for verification of results using both designs in parallel, methods should be developed to combine effect estimates.
Haplotype information should be combined with classical epidemiological data on environmental exposures to evaluate gene-environment interactions.
The modeling effort should be shifted from the traditional p-value-based testing approach to computation of effect estimates like relative risks, supplied with confidence intervals, in keeping with modern epidemiological thinking.
To aid researchers in choosing between different designs, modeling and software development should also include sample size computations.
The computational challenges posed by the rapidly increasing SNP information should be met by more efficient software implementations of the developed methodology, allowing a full analysis of haplotypes for candidate genes "saturated" with SNPs.
To test the practicability of the methods, we will apply them to data on orofacial clefts in Norway, in collaboration with experts in the field.
The project is a collaboration between the Norwegian Institute of Public Health, the University of Bergen, the National Institute of Environmental Health Sciences (North Carolina), and the Department of Pediatrics, University of Iowa.