Statistical Analytics for Health Data Science with SAS and R, by Jeffrey R. Wilson, Ding-Geng Chen, and Karl E. Peace, is a user-friendly, practical guide to statistics and is especially valuable for those who perform downstream analyses on health data. By organizing issues based on response variables and focusing on hands-on tools rather than complex statistical theories, the authors make advanced analytics accessible. The book steers clear of complicated mathematical language, opting instead for straightforward explanations and simplified sample code for SAS and R that is short enough to fit on a cheat sheet.
Uniquely structured, the book organizes its chapters around scientific questions rather than statistical methodologies, providing an engaging narrative. This approach resonates particularly well with students in Statistics, Biostatistics, and Computational Biology programs, as it showcases the practical applicability of their field of study. The first eight chapters serve as an introductory guide for undergraduates, while the later chapters offer graduate students deeper insights into the practical applications of their statistical expertise. Each chapter commences with a clearly defined framework, directing readers toward the most appropriate statistical tools for their specific research questions.
This book takes readers on a comprehensive journey through statistical analysis. Beginning with “Sampling and Data Collection,” it covers foundational statistics, such as measures of central tendency and spread. As the chapters progress, the content becomes more complex, exploring statistical models for both continuous and binary outcomes. The book treats ANOVA, linear regression, and ANCOVA in depth before moving to specialized models such as standard logistic regression and generalized linear models. Advanced topics include modeling with generalized estimating equations (GEE) and random effects, culminating in hierarchical logistic regression for correlated binary outcomes. Each chapter offers both theory and application to ensure a robust understanding.
Another salient feature of this book is its employment of real-world datasets, either generated or scrutinized under the authors’ expert supervision. This enhances the book’s authenticity and establishes it as a reliable reference for statistical consulting—making it particularly useful for researchers in health departments. Additionally, the book serves as a comprehensive compendium of statistical methods, ranging from fundamental concepts to advanced techniques. This extensive scope is made more accessible by including practical examples executed in both R and SAS. Further enriching the book’s utility is the availability of datasets and computer programs, transforming it from an educational resource into a practical toolkit for research and industrial applications.
In summary, Statistical Analytics for Health Data Science with SAS and R excels in demystifying intricate statistical concepts and offers both theoretical grounding and practical experience. Whether you are an applied data scientist, a graduate student, or a public health researcher, this work by Wilson, Chen, and Peace is an invaluable asset for learning and applying statistics in real-world settings.
The book references additional materials and a dataset. It would be helpful if the authors made these resources, especially the SAS and R files, directly available online.
The George Washington University
Washington, DC