Nuala Sheehan

Professor of Statistical Genetics

Department of Health Sciences
University of Leicester
Centre For Medicine, University Road
Leicester, LE1 7RH

Tel: +44 (0)116 229 7271


Personal details

Professional activities

  • Associate Editor of Stat, the ISI’s new Journal for the Rapid Dissemination of Statistics Research
  • Fellow of the Royal Statistical Society and chair of the Medical Section committee to January 2017.

Recent workshop/meeting organisation

  • Steering committee member for the annual UK Causal Inference Meeting (UK-CIM). The 2017 meeting was in Exeter, April 4-7
  • Joint organiser (with James Cussens, York) of an ICMS workshop on Statistical and computational methods for relatedness and relationship inference from Genetic Marker data, Edinburgh September 2014
  • Joint organiser (with Jim Smith, Warwick) of the CRiSM funded workshop on Graphical Models and Genetic Applications. Warwick, April 2009
  • Workshop on 'Statistical Methods in Genetic Epidemiology' with Elizabeth Thompson (Seattle) and Max Baur (Bonn) at the International Centre for Mathematical Sciences (ICMS) in Edinburgh, May 2007


Selected publications

Sun M, Jobling M A, Taliun D, Pramstaller P P, Egeland T and N A Sheehan (2016). On the use of dense SNP marker data for the identification of distant relative pairs. Theoretical Population Biology 107: 14-25.

Del Greco F, Minelli C, Sheehan N A and J R Thompson (2015). Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous disease trait. Statistics in Medicine 34: 2926-2940.

Sheehan N A, Bartlett M and J Cussens (2014). Improved maximum likelihood reconstruction of complex multi-generational pedigrees. Theoretical Population Biology 97: 11-19.

Egeland T, Dorum G, Vigeland M D and N A Sheehan (2014). Mixtures with relatives: a pedigree perspective. Forensic Science International: Genetics 10: 49-54.

Jones E M, Sheehan N A, Gaye A, Laflamme P and P Burton (2013). Combined analysis of correlated data when data cannot be pooled. Stat 2: 72-85.

Cussens J, Bartlett M, Jones E M and N A Sheehan (2013). Maximum likelihood pedigree reconstruction using integer linear programming. Genetic Epidemiology 37: 69-83.

Harbord R, Didelez V, Palmer T M, Meng S, Sterne J A C and N A Sheehan (2013). Severity of bias of a simple estimator of the causal odds ratio in Mendelian randomization studies. Statistics in Medicine 32: 1246-1258.

Janss L, de los Campos G, Sheehan N and D Sorensen (2012). Inferences from genomic models in stratified populations. Genetics 192(2): 693-704.

Jones, E.M., Thompson, J., Didelez, V. and N.A. Sheehan (2012). On the choice of parameterisation and priors for the Bayesian analyses of Mendelian randomisation studies. Statistics in Medicine 31: 1483–1501.

Palmer, T M, Lawlor, D A, Harbord, R M, Sheehan, N A, Tobias, J H, Timpson, N J, Davey Smith, G and J A C Sterne (2012). Using multiple genetic variants as instrumental variables for modifiable risk factors. Statistical Methods in Medical Research 21: 223–242.

Masca, N, Burton, P.R. and N.A. Sheehan (2011). Participant identification in genetic association studies: improved methods and practical implications. International Journal of Epidemiology 40: 1629–1642.

Palmer, T.M, Didelez, V, Ramsahai, R and N.A. Sheehan (2011). Nonparametric bounds for the causal effect in a binary instrumental variable model. Stata Journal 11(3): 345–367.

Palmer, T, Sterne, J A C, Harbord, R M, Lawlor, D A, Sheehan, N A, Meng, S, Granell, R, Davey Smith, G and V Didelez (2011). Instrumental variable estimation of the causal risk ratio and causal odds ratio in Mendelian randomization analyses. American Journal of Epidemiology 173(12): 1392–1403.

Masca, N, Sheehan, N A and M D Tobin (2011). Pharmacogenetic interactions and their potential effects on genetic analyses of blood pressure. Statistics in Medicine 30: 769–783.

Sheehan, N A, Meng, S. and V Didelez (2011). Mendelian randomisation: a tool for assessing causality in observational epidemiology. In “Genetic Epidemiology”, editor Dawn Teare. Series on Methods in Molecular Biology, Volume 713, pp 153-166. Humana Press Inc.

Didelez, V, Meng, S and N A Sheehan (2010). Assumptions of IV methods for observational epidemiology. Statistical Science 25: 22-40.

Skare, Ø, Sheehan, N and T Egeland (2009). Identification of distant family relationships. Bioinformatics 18: 2376-2382.

Sheehan, N.A, Didelez, V., Burton, P.R. and M.D. Tobin (2008). Mendelian randomisation and causal inference in observational epidemiology. PLoS Medicine 5 (8) e177.

Egeland, T. and N.A. Sheehan (2008). On identification problems requiring linked autosomal markers. Forensic Science International: Genetics 2: 219-225.

Sheehan, N.A. and T. Egeland (2008). Adjusting for founder relatedness in a linkage analysis using prior information. Human Heredity 65: 221-231

Sheehan, N.A. and T. Egeland (2007). Structured incorporation of prior information in relationship identification problems. Annals of Human Genetics 71: 501-518.

Didelez, V. and N.A. Sheehan (2007). Mendelian randomisation as an instrumental variable approach to causal inference. Statistical Methods in Medical Research 16: 309-330.

Sheehan, N.A., Guldbrandtsen, B. and D.A. Sorensen (2007). Evaluating the performance of a block updating MCMC sampler in a simple genetic application. Journal of Agricultural, Biological and Environmental Statistics 12: 272–299.

Didelez, V. and N A Sheehan (2007). Mendelian randomisation: why epidemiology requires a formal language for causality in "Causality and Probability in the Sciences", Texts in Philosophy Volume 5, eds. F. Russo and J. Williamson, London College Publications, 263-292.

Lauritzen, S L and N A Sheehan (2003). Graphical models for genetic analyses. Statistical Science 18, 489-514.

Cannings, C. and N A Sheehan (2002). On a misconception about irreducibility of the single-site Gibbs sampler in a pedigree application. Genetics 162, 993-996.

Sheehan, N A, Guldbrandtsen, B, Lund, M S and D A Sorensen (2002). Bayesian MCMC mapping of quantitative trait loci in a half-sib design: a graphical model perspective. International Statistical Review 70, 241-267.

Sheehan, N A (2000). On the application of Markov chain Monte Carlo methods to genetic analyses on complex pedigrees. International Statistical Review 68, 83-110.


My interests are in statistical methodology development motivated by complex problems in genetics.

This work is highly interdisciplinary in nature, draws heavily on ideas from statistics, computer science, genetics and epidemiology and I work closely with experts in all these areas.

The main threads of my work, which all interact in various ways, comprise:

Inferring causality from observational epidemiological data

Because subjects have not been randomised to an epidemiological exposure of interest, associational findings from observational data may, or may not, be causal relationships. In order to inform public health interventions, it is important to differentiate between association and causation. The main problem with inferring causality from observational epidemiological data is that it is difficult to rule out the possibility that the association of interest has been confounded by some unmeasured factor that is affecting both exposure and health outcome of interest. Mendelian randomization is a method for drawing causal inferences in the presence of unobserved confounding by exploiting the known functionality of a genetic variant. This is a particularly difficult problem in causal inference, especially for binary outcome data.
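As an illustration of the idea, the following is a minimal simulation sketch in Python and not a reproduction of any published analysis. It assumes the simplest linear setting with a continuous outcome (so it sidesteps the binary-outcome difficulties mentioned above), a single variant that affects the outcome only through the exposure and is independent of the confounder, and the standard ratio (Wald) estimator. All names and numerical values (`simulate`, the allele frequency 0.3, the effect size 0.3) are hypothetical choices made for the example.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def simulate(n, causal_effect, seed=2017):
    """Simulate genotype G, exposure X and outcome Y with an unobserved confounder U."""
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        # Genotype: copies (0, 1 or 2) of a variant with assumed frequency 0.3.
        gi = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        u = rng.gauss(0.0, 1.0)  # unmeasured confounder, affects both X and Y
        xi = 1.0 * gi + u + rng.gauss(0.0, 1.0)
        yi = causal_effect * xi + u + rng.gauss(0.0, 1.0)
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y


def naive_slope(x, y):
    """Regression slope of Y on X; biased here because U is ignored."""
    return covariance(x, y) / covariance(x, x)


def ratio_estimate(g, x, y):
    """Ratio (Wald) estimator: gene-outcome association over gene-exposure association."""
    return covariance(g, y) / covariance(g, x)


g, x, y = simulate(n=20000, causal_effect=0.3)
naive = naive_slope(x, y)
iv = ratio_estimate(g, x, y)
print(f"true effect 0.30, naive {naive:.2f}, ratio estimate {iv:.2f}")
```

In this simulated setting the naive slope is pulled away from the true effect by the confounder, while the ratio estimate stays close to it, provided the stated instrumental variable assumptions hold.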

Estimating relationships from genetic data

Large population Biobanks of unrelated individuals that aim to investigate the genetic risk factors underlying the common complex diseases of major public health concern do not have sufficient statistical power to discover rarer genes or genes with relatively modest effects. When rare variants are of interest, it is more efficient to identify sets of relatives as they are more likely to share genomic regions around disease susceptibility loci. Extensive population Biobanks, although designed to recruit unrelated individuals, undoubtedly contain numerous relatives, so the ability to infer pedigrees easily from such data could greatly extend their use.

Confidentiality issues for the sharing of genetic study data

Although the importance of pooling data from different studies is beyond scientific dispute, there are many ethical and legal obstacles to achieving this in practice. Moreover, recent methods have shown that even summary statistics (hitherto assumed to be 'safe') can be revealing in certain circumstances. A proper understanding of how patient confidentiality can be breached and how to combine information without violating ethics permissions is essential to progress.

Graphical modelling of complex problems in genetics

The formal graphical representation of a complex model facilitates assessment of computational difficulties, for both exact and approximate calculations. It also enables the use of fast message-passing algorithms that have been developed for expert systems to provide exact calculations of probabilities and likelihoods on general graphs that are intractable for existing genetic software.
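The benefit of message passing can be sketched on the simplest possible graph, a single line of descent. This is a hedged toy example in Python, not the expert-system algorithms themselves: it assumes a diallelic locus, random mating for the untyped spouses, and hypothetical values for the allele frequency and penetrances. It compares a likelihood obtained by passing one message per generation with the same likelihood obtained by summing over every joint genotype configuration.

```python
from itertools import product

ALLELE_FREQ = 0.2                 # assumed population allele frequency
PENETRANCE = [0.05, 0.50, 0.90]   # assumed P(affected | 0, 1, 2 copies)


def founder_prior(p):
    """Hardy-Weinberg probabilities for 0, 1 or 2 copies of the allele."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def transmission(p):
    """T[g][c] = P(child has c copies | parent has g copies, spouse allele drawn at frequency p)."""
    rows = []
    for g in range(3):
        t = g / 2  # probability that the parent transmits the allele
        rows.append([(1 - t) * (1 - p), t * (1 - p) + (1 - t) * p, t * p])
    return rows


def evidence(status):
    """Likelihood of the observed disease status for each genotype (None = unobserved)."""
    if status is None:
        return [1.0, 1.0, 1.0]
    return [f if status else 1 - f for f in PENETRANCE]


def likelihood_brute_force(statuses, p=ALLELE_FREQ):
    """Sum over all 3**n genotype configurations: exponential in the number of generations."""
    prior, T = founder_prior(p), transmission(p)
    total = 0.0
    for genos in product(range(3), repeat=len(statuses)):
        term = prior[genos[0]]
        for i, g in enumerate(genos):
            if i > 0:
                term *= T[genos[i - 1]][g]
            term *= evidence(statuses[i])[g]
        total += term
    return total


def likelihood_message_passing(statuses, p=ALLELE_FREQ):
    """Pass one message down the chain per generation: linear in the number of generations."""
    prior, T = founder_prior(p), transmission(p)
    e0 = evidence(statuses[0])
    message = [prior[g] * e0[g] for g in range(3)]
    for status in statuses[1:]:
        e = evidence(status)
        message = [e[c] * sum(message[g] * T[g][c] for g in range(3)) for c in range(3)]
    return sum(message)


statuses = [True, None, None, False, None, True]  # hypothetical observations
fast = likelihood_message_passing(statuses)
slow = likelihood_brute_force(statuses)
print(fast, slow)
```

The two functions agree up to rounding, but the brute-force sum grows as 3 to the power of the number of individuals whereas the message-passing version grows linearly. On graphs with loops the same principle applies after the graph has been organised into a suitable tree of clusters, which this sketch does not attempt.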

Estimating probabilities in complex pedigrees

Standard statistical genetics peeling algorithms for calculating probabilities and likelihoods break down on large multi-generational pedigrees featuring individuals who are involved in multiple marriages and marriages between relatives. Markov chain Monte Carlo (McMC) methods can provide approximations of the required quantities. The Gibbs sampler, the most popular McMC algorithm, is easy to apply to pedigree data but the underlying Markov chain may not be irreducible and block updating samplers may be required.
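The irreducibility problem can be seen in a very small hypothetical family. The Python sketch below assumes a three-allele locus (labelled A, B and O), two untyped parents, and two children whose genotypes AB and OO are taken as observed. It enumerates the parental genotype configurations that are consistent with the children and counts how many groups of configurations can reach each other when a move may change one parent at a time (single-site updating) or both parents together (block updating). The function names are illustrative only.

```python
from itertools import combinations_with_replacement, product

ALLELES = "ABO"
GENOTYPES = list(combinations_with_replacement(ALLELES, 2))  # 6 unordered genotypes


def can_produce(father, mother, child):
    """True if the child genotype can arise with one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def legal_states(children):
    """All (father, mother) genotype pairs consistent with every observed child."""
    return [(f, m) for f, m in product(GENOTYPES, repeat=2)
            if all(can_produce(f, m, c) for c in children)]


def count_classes(states, block_size):
    """Number of communicating classes when one move may change up to block_size individuals."""
    unvisited = set(states)
    classes = 0
    while unvisited:
        classes += 1
        stack = [unvisited.pop()]
        while stack:
            current = stack.pop()
            neighbours = [s for s in unvisited
                          if sum(u != v for u, v in zip(current, s)) <= block_size]
            for s in neighbours:
                unvisited.remove(s)
                stack.append(s)
    return classes


children = [("A", "B"), ("O", "O")]  # assumed observed child genotypes
states = legal_states(children)
single_site = count_classes(states, block_size=1)
joint_block = count_classes(states, block_size=2)
print(sorted(states), single_site, joint_block)
```

Here the only consistent configurations are father AO with mother BO, and father BO with mother AO. Moving between them requires changing both parents at once, so a sampler that updates one individual at a time stays in whichever configuration it starts in, whereas updating the two parents as a block connects them.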

Research projects

Statistical and computational methods for relatedness and relationship inference from Genetic Marker data

With James Cussens (York) as co-applicant. Funding from the International Centre for Mathematical Sciences to organise a workshop in Edinburgh, September 2014.

A graphical model approach to pedigree construction using constrained optimisation

With James Cussens (York), Paul Burton (Leicester) and George Davey Smith (Bristol). Medical Research Council Project Grant G1002312 for three years from October 2011.

Statistical Methods for High-Density Genetic Data

Leverhulme Research Fellowship. October 2009-September 2011.

Inferring epidemiological causality using Mendelian randomisation

With Vanessa Didelez, Debbie Lawlor, Jonathan Sterne and Frank Windmeijer (Bristol) and John Thompson (Leicester). Medical Research Council Project Grant G0601625 for three years from October 2007.

Statistical methods for genetic epidemiology

With Elizabeth Thompson (Seattle) and Max Baur (Bonn) as co-applicants. Funding from the International Centre for Mathematical Sciences to organise a workshop in Edinburgh, May 2007.

The application and development of methods to combine information in epidemiological studies of cardiovascular traits of major public health importance

With Martin Tobin. A PhD studentship funded by the British Heart Foundation, October 2007-September 2010.

Bayesian networks for forensic inference from genetic markers

Awarded to 13 individuals in eight institutions in five countries:

  • Philip Dawid, Hilde Wilkinson-Herbots, Vanessa Didelez (University College, London, UK)
  • Robert G. Cowell (City University, London, UK)
  • Nuala A. Sheehan, Paul R. Burton (University of Leicester, UK)
  • Julia Mortera, Paola Vicard (University of Rome Three, Italy)
  • Vincenzo L. Pascali, Marina Dobosz (Catholic University of Rome, Italy)
  • Steffen L. Lauritzen (Aalborg University, Denmark)
  • Thore Egeland (University of Oslo, Norway)
  • Peter Mostad (Chalmers University, Gothenburg, Sweden)

Funded by the Leverhulme Trust (Research Interchange Grant (F/071134/K)) from October 2001 to September 2004.

Value in People Award, as Senior Research Fellow in the Departments of Genetics and Health Sciences

Funded by the Wellcome Trust from November 2003 to October 2004.


  • Chin Yang Shapland (2017). Mendelian Randomization in the genomewide era. Departmental studentship.
  • Meng Sun (2015). Identifying relationships and relatedness from genetic marker data. College studentship.
  • Nick Masca (2011). The application and development of methods to combine and infer information from genetic epidemiological studies of cardiovascular and other complex traits. Funded by the BHF
  • Martin Tobin (2005). The genetic epidemiology of blood pressure in human populations. Funded by the MRC
