Large-Scale Inference

Bradley Efron

Language: English

Published: Sep 29, 2013

Description:

We live in a new age for statistical inference, where modern scientific technology such as microarrays and fMRI machines routinely produce thousands and sometimes millions of parallel data sets, each with its own estimation or testing problem. Doing thousands of problems at once is more than repeated application of classical methods. Taking an empirical Bayes approach, Bradley Efron, inventor of the bootstrap, shows how information accrues across problems in a way that combines Bayesian and frequentist ideas. Estimation, testing and prediction blend in this framework, producing opportunities for new methodologies of increased power. New difficulties also arise, easily leading to flawed inferences. This book takes a careful look at both the promise and pitfalls of large-scale statistical inference, with particular attention to false discovery rates, the most successful of the new statistical techniques. Emphasis is on the inferential ideas underlying technical developments, illustrated using a large number of real examples.
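The false discovery rate idea the description highlights can be illustrated with the classical Benjamini-Hochberg step-up procedure, the frequentist starting point for the empirical Bayes "local fdr" methods the book develops. The sketch below is a minimal illustration, not code from the book, and the p-values in the usage example are invented for demonstration: reject the hypotheses with the k smallest p-values, where k is the largest rank satisfying p_(k) <= (k/m)q.

```python
def benjamini_hochberg(pvalues, q=0.10):
    """Return the (sorted) indices of hypotheses rejected at FDR level q.

    A sketch of the Benjamini-Hochberg step-up procedure: sort the m
    p-values, find the largest rank k with p_(k) <= (k/m) * q, and
    reject all hypotheses whose p-value ranks at or below k.
    """
    m = len(pvalues)
    # Sort p-values ascending, remembering original positions.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Step up: largest rank k whose p-value clears its threshold.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * q:
            k_max = rank
    return sorted(order[:k_max])

# Illustrative p-values (not from the book): with q = 0.10 the
# procedure rejects the six smallest, even though the hypotheses at
# ranks 3-4 fail their individual thresholds (the "step-up" feature).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(pvals, q=0.10)
```

Doing this for thousands of tests at once is exactly the large-scale setting of the book, where Efron reinterprets the same calculation in empirical Bayes terms.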

Review


"In the last decade, Efron has played a leading role in laying down the foundations of large-scale inference, not only in bringing back and developing old ideas, but also linking them with more recent developments, including the theory of false discovery rates and Bayes methods. We are indebted to him for this timely, readable and highly informative monograph, a book he is uniquely qualified to write. It is a synthesis of many of Efron's own contributions over the last decade with that of closely related material, together with some connecting theory, valuable comments, and challenges for the future. His avowed aim is "not to have the last word" but to help us deal "with the burgeoning statistical problems of the 21st century". He has succeeded admirably."
Terry Speed, International Statistical Review

"This is the first major text on the methods useful for large-scale inference which are being applied to microarrays and fMRI imaging data. It is the first step in the development of an exciting new branch of statistics addressed at some important modern medical problems. ... The difficulty here is that the scientists want to conduct hundreds or thousands of formal hypothesis tests using the microarray data. The theory of multiple testing came about to correct inference on an individual test due to the fact that it was one of many being tested. ... I recommend the book be read to understand the tests in more detail as well as the applications. I have not looked at Brad Efron's course on large scale inference, which is certainly very new. Nothing like it was available in the 1970s when I was at Stanford. I can say from first-hand experience that Efron is a terrific instructor and that he writes in a very clear and intuitive way."
Michael R. Chernick, Significance

"Undoubtedly, the monograph will contribute to the further development of the simultaneous inference technique and its application to genetics, quality control, sociology, medicine, etc. Moreover, the material of the book opens up a new area in the d-posterior approach to the problem of statistical inference guarantee applications."
I.N. Volodin, Mathematical Reviews

"Max H. Stein Professor of Statistics and Biostatistics at Stanford, Bradley Efron, is one of today's greatest statisticians. The book is actually good reading. The mathematical level of the material presented is not really high, and an undergraduate student with a second course in statistics could follow the text quite easily, even if some linear algebra is necessary to get on with some of the more technical parts. Demonstrations are found at the end of sections so as not to stop the reading flow, whereas examples and exercises are very well used as devices to keep the reader focused and interested. In summary I think this book is a pretty good gateway to the statistics of the future for the future Fishers and Neymans."
Jordi Prats, Significance

"Typical of Efron's work, the book presents fresh ideas, charts new ground, and lays an impressive theory, much of which he developed single-handedly.... We can expect a superb learning experience from Efron, and his pedagogical style delivers. The book is written carefully and thoughtfully, with ample mathematical detail to make the concepts clear. Abundant examples of specific data analyses and simulation studies supplement the mathematical theory, making his points crystal clear. The book is chock full of eye-catching and informative color graphics, and thus is not at all dry. Software (in R) sources are identified so that readers can easily perform the analyses. The technical details and examples are interspersed with "exercises" embedded directly in the text, so that readers can test their knowledge on what they just read in "just-in-time" fashion, rather than waiting to the end of the chapter. The problems are not too difficult. I tried some and gained confidence that I had learned something. Thus, the book will serve well as both a reference book, in which the exercises assist the reader in learning, and as a textbook for a graduate seminar course, where the exercises can be assigned as homework problems."
Peter Westfall, Journal of the American Statistical Association

"Efron focuses on empirical Bayes methodology for large-scale inference, by which he mostly means multiple testing (rather than, say, data mining). As a result, the book is centred on mathematical statistics and is more technical. (Which does not mean it's less of an exciting read!) ... Unsurprisingly, I found the book nicely written, with a wealth of R (colour!) graphs (the R programs and dataset are available on Brad Efron's home page)."
Christian Robert, Xi'an's Og

Book Description

Modern scientific technology (e.g. microarrays, fMRI machines) produces data in vast quantities. Bradley Efron explains the empirical Bayes methods that help make sense of a new statistical world. This is essential reading for professional statisticians and graduate students wishing to use and understand important new techniques like false discovery rates.