Nuala Sheehan

Professor of Statistical Genetics



Department of Health Sciences
University of Leicester
Centre For Medicine, University Road
Leicester, LE1 7RH
Tel: 00 44 116 229 7271


Research Interests


My interests are in the development of statistical methodology motivated by complex problems in genetics. This work is highly interdisciplinary: it draws heavily on ideas from statistics, computer science, genetics and epidemiology, and I work closely with experts in all these areas. The main threads of my work, which all interact in various ways, are:

  1. Inferring causality from observational epidemiological data. Because subjects have not been randomized to an epidemiological exposure of interest, associations found in observational data may or may not reflect causal relationships. To inform public health interventions, it is important to differentiate between association and causation. The main difficulty is that it is hard to rule out the possibility that the association of interest has been confounded by some unmeasured factor affecting both the exposure and the health outcome. Mendelian randomization is a method for drawing causal inferences in the presence of unobserved confounding by exploiting the known functionality of a genetic variant. Causal inference in this setting is particularly difficult, especially for binary outcome data.
  2. Estimating relationships from genetic data. Large population biobanks of unrelated individuals that aim to investigate the genetic risk factors underlying the common complex diseases of major public health concern do not have sufficient statistical power to discover rarer genes or genes with relatively modest effects. When rare variants are of interest, it is more efficient to identify sets of relatives, as they are more likely to share genomic regions around disease susceptibility loci. Extensive population biobanks, although designed to recruit unrelated individuals, undoubtedly contain numerous relatives, so the ability to infer pedigrees easily from such data could greatly extend their use.
  3. Confidentiality issues for the sharing of genetic study data.  Although the importance of pooling data from different studies is beyond scientific dispute, there are many ethical and legal obstacles to achieving this in practice. Moreover, recent methods have shown that even summary statistics (hitherto assumed to be “safe”) can be revealing in certain circumstances. A proper understanding of how patient confidentiality can be breached and how to combine information without violating ethics permissions is essential to progress.
  4. Graphical modelling of complex problems in genetics. The formal graphical representation of a complex model facilitates assessment of computational difficulties, for both exact and approximate calculations. It also enables the use of fast message-passing algorithms that have been developed for expert systems to provide exact calculations of probabilities and likelihoods on general graphs that are intractable for existing genetic software.
  5. Estimating probabilities in complex pedigrees. Standard peeling algorithms from statistical genetics for calculating probabilities and likelihoods break down on large multi-generational pedigrees featuring individuals who are involved in multiple marriages and marriages between relatives. Markov chain Monte Carlo (McMC) methods can provide approximations of the required quantities. The Gibbs sampler, the most popular McMC algorithm, is easy to apply to pedigree data, but the underlying Markov chain may not be irreducible, so block-updating samplers may be required.

Research Projects

Statistical and computational methods for relatedness and relationship inference from Genetic Marker data. With James Cussens (York) as co-applicant.  Funding from the International Centre for Mathematical Sciences to organise a workshop in Edinburgh, September 2014.

A graphical model approach to pedigree construction using constrained optimization. With James Cussens (York), Paul Burton (Leicester) and George Davey Smith (Bristol). Medical Research Council Project Grant G1002312 for three years from October 2011.

Statistical Methods for High Density Genetic Data. Leverhulme Research Fellowship. October 2009-September 2011.

Inferring epidemiological causality using Mendelian randomisation. With Vanessa Didelez, Debbie Lawlor, Jonathan Sterne and Frank Windmeijer (Bristol) and John Thompson (Leicester). Medical Research Council Project Grant G0601625 for three years from October 2007.

Statistical methods for genetic epidemiology. With Elizabeth Thompson (Seattle) and Max Baur (Bonn) as co-applicants. Funding from the International Centre for Mathematical Sciences to organise a workshop in Edinburgh, May 2007.

The application and development of methods to combine information in epidemiological studies of cardiovascular traits of major public health importance. With Martin Tobin. A PhD studentship funded by the British Heart Foundation, October 2007-September 2010.

Bayesian networks for forensic inference from genetic markers. Awarded to 13 individuals in 8 institutions in 5 countries: Philip Dawid, Hilde Wilkinson-Herbots, Vanessa Didelez (University College, London, UK); Robert G. Cowell (City University, London, UK); Nuala A. Sheehan, Paul R. Burton (University of Leicester, UK); Julia Mortera, Paola Vicard (University of Rome Three, Italy); Vincenzo L. Pascali, Marina Dobosz (Catholic University of Rome, Italy); Steffen L. Lauritzen (Aalborg University, Denmark); Thore Egeland (University of Oslo, Norway); Peter Mostad (Chalmers University, Gothenburg, Sweden). Funded by the Leverhulme Trust (Research Interchange Grant (F/071134/K)) from October 2001 to September 2004.

Value in People Award, as Senior Research Fellow in the Departments of Genetics and Health Sciences. Funded by the Wellcome Trust from November 2003 to October 2004.

Professional Activities

Associate Editor of Stat, the ISI’s new Journal for the Rapid Dissemination of Statistics Research

Fellow of the Royal Statistical Society (and vice-chair of the Medical Section committee)

Member of the International Genetic Epidemiology Society 

Recent Workshop/Meeting Organisation:

Steering committee member for the annual UK Causal Inference Meeting (UK-CIM). The 2015 meeting was in Bristol, April 15-17.

Joint organiser (with James Cussens, York) of an ICMS workshop on Statistical and computational methods for relatedness and relationship inference from Genetic Marker data, Edinburgh, September 2014.

Joint organiser (with Jim Smith, Warwick) of the CRiSM-funded workshop on Graphical Models and Genetic Applications, Warwick, April 2009.

Workshop on "Statistical Methods in Genetic Epidemiology" with Elizabeth Thompson (Seattle) and Max Baur (Bonn) at the International Centre for Mathematical Sciences (ICMS) in Edinburgh, May 2007.

Postgraduate Supervision and Teaching

Chin Yang Shapland (current). Mendelian Randomization in the genomewide era. Departmental studentship commencing October 2013.

Meng Sun (current). Identifying relationships and relatedness from genetic marker data. College studentship commencing October 2011.

Nick Masca (2011). The application and development of methods to combine and infer information from genetic epidemiological studies of cardiovascular and other complex traits.  Funded by the BHF.

Martin Tobin (2005). The genetic epidemiology of blood pressure in human populations. Funded by the MRC. 

Selected Publications

Sun M, Jobling M A, Taliun D, Pramstaller P P, Egeland T and N A Sheehan (2016). On the use of dense SNP marker data for the identification of distant relative pairs. Theoretical Population Biology 107: 14-25.

Del Greco F, Minelli C, Sheehan N A and J R Thompson (2015). Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous disease trait. Statistics in Medicine, 34: 2926-2940.

Sheehan N A, Bartlett M and J Cussens (2014). Improved maximum likelihood reconstruction of complex multi-generational pedigrees. Theoretical Population Biology, 97: 11-19.

Egeland T, Dorum G, Vigeland M D and N A Sheehan (2014). Mixtures with relatives: a pedigree perspective. Forensic Science International: Genetics, 10: 49-54.

Jones E M, Sheehan N A, Gaye A, Laflamme P and P Burton (2013). Combined analysis of correlated data when data cannot be pooled. Stat, 2: 72-85.

Cussens J, Bartlett M, Jones E M and N A Sheehan (2013). Maximum likelihood pedigree reconstruction using integer linear programming. Genetic Epidemiology, 37: 69-83.

Harbord R, Didelez V, Palmer T M, Meng S, Sterne J A C and N A Sheehan (2013). Severity of bias of a simple estimator of the causal odds ratio in Mendelian randomization studies. Statistics in Medicine, 32: 1246-1258. 

Janss L, de los Campos G, Sheehan N and D Sorensen (2012). Inferences from genomic models in stratified populations. Genetics, 192(2): 693-704.

Jones, E.M., Thompson, J., Didelez, V. and N.A. Sheehan (2012). On the choice of parameterisation and priors for the Bayesian analyses of Mendelian randomisation studies. Statistics in Medicine, 31: 1483–1501.

Palmer, T M, Lawlor, D A, Harbord, R M, Sheehan, N A, Tobias, J H, Timpson, N J, Davey Smith, G and J A C Sterne (2012). Using multiple genetic variants as instrumental variables for modifiable risk factors. Statistical Methods in Medical Research, 21: 223–242.

Masca, N, Burton, P.R. and N.A. Sheehan (2011). Participant identification in genetic association studies: improved methods and practical implications. International Journal of Epidemiology, 40: 1629–1642.

Palmer, T.M, Didelez, V, Ramsahai, R and N.A. Sheehan (2011). Nonparametric bounds for the causal effect in a binary instrumental variable model. Stata Journal, 11 (3): 345–367.

Palmer, T M, Sterne, J A C, Harbord, R M, Lawlor, D A, Sheehan, N A, Meng, S, Granell, R, Davey Smith, G and V Didelez (2011). Instrumental variable estimation of the causal risk ratio and causal odds ratio in Mendelian randomization analyses. American Journal of Epidemiology, 173(12): 1392–1403.

Masca, N, Sheehan, N A and M D Tobin (2011). Pharmacogenetic interactions and their potential effects on genetic analyses of blood pressure. Statistics in Medicine, 30: 769–783.

Sheehan, N A, Meng, S. and V Didelez (2011). Mendelian randomisation: a tool for assessing causality in observational epidemiology. In “Genetic Epidemiology”, editor Dawn Teare. Series on Methods in Molecular Biology, Volume 713, pp 153-166. Humana Press Inc.

Didelez, V, Meng, S and N A Sheehan (2010). Assumptions of IV methods for observational epidemiology. Statistical Science, 25: 22-40.

Skare, Ø, Sheehan, N and T Egeland (2009). Identification of distant family relationships. Bioinformatics, 18: 2376-2382.

Sheehan, N.A, Didelez, V., Burton, P.R. and M.D. Tobin (2008). Mendelian randomisation and causal inference in observational epidemiology. PLoS Medicine 5 (8) e177.

Egeland, T. and N.A. Sheehan (2008). On identification problems requiring linked autosomal markers. Forensic Science International: Genetics 2: 219-225.

Sheehan, N.A. and T. Egeland (2008). Adjusting for founder relatedness in a linkage analysis using prior information. Human Heredity 65: 221-231.

Sheehan, N.A. and T. Egeland (2007). Structured incorporation of prior information in relationship identification problems. Annals of Human Genetics 71: 501-518.

Didelez, V. and N.A. Sheehan (2007). Mendelian randomisation as an instrumental variable approach to causal inference. Statistical Methods in Medical Research 16: 309-330.

Sheehan, N.A., Guldbrandtsen, B. and D.A. Sorensen (2007). Evaluating the performance of a block updating MCMC sampler in a simple genetic application. Journal of Agricultural, Biological and Environmental Statistics, 12: 272–299.

Didelez, V. and N A Sheehan (2007). Mendelian randomisation: why epidemiology requires a formal language for causality. In "Causality and Probability in the Sciences", Texts in Philosophy Volume 5, eds. F. Russo and J. Williamson, London College Publications, 263-292.

Lauritzen, S L and N A Sheehan (2003). Graphical models for genetic analyses. Statistical Science, 18: 489-514.

Cannings, C. and N A Sheehan (2002). On a misconception about irreducibility of the single-site Gibbs sampler in a pedigree application. Genetics, 162: 993-996.

Sheehan, N A, Guldbrandtsen, B, Lund, M S and D A Sorensen (2002). Bayesian MCMC mapping of quantitative trait loci in a half-sib design: a graphical model perspective. International Statistical Review, 70: 241-267.

Sheehan, N A (2000). On the application of Markov chain Monte Carlo methods to genetic analyses on complex pedigrees. International Statistical Review, 68: 83-110.
