1st Edition
Introduction to R for Social Scientists: A Tidy Programming Approach
Introduction to R for Social Scientists: A Tidy Programming Approach introduces the Tidy approach to programming in R for social science research to help quantitative researchers develop a modern technical toolbox. The Tidy approach is built around consistent syntax, common grammar, and stacked code, which contribute to clear, efficient programming. The authors include hundreds of lines of code to demonstrate a suite of techniques for developing and debugging an efficient social science research workflow. To reinforce Tidy best practices for conducting social science research in R, the authors include numerous examples using real-world data, including the American National Election Study and the World Indicators Data. While no prior experience in R is assumed, readers are expected to be acquainted with common social science research designs and terminology.
Whether they use the book as a reference manual or read it from cover to cover, readers will be equipped with a deeper understanding of R and the Tidyverse, as well as a framework for how best to leverage these powerful tools to write tidy, efficient code for solving problems. To this end, the authors provide many suggestions for additional readings and tools to build on the concepts covered. They use all covered techniques in their own work as scholars and practitioners.
Preface
1. Introduction
Why R?
Why This Book?
Why the Tidyverse?
What tools are needed?
How This Book Can be Used in a Class
Plan for the Book
2. Foundations
Scripting with R
Understanding R
Working directories
Setting Up an R Project
Loading and Using Packages and Libraries
Where to Get Help
Moving Forward
3. Data Management and Manipulation
Loading Our Data
Data Wrangling
Grouping and Summarizing Your Data (and Using “the Pipe”)
Creating New Variables
Combining Data Sets
Basic Descriptive Analysis
Tidying a Data Set
Saving Your Data Set for Later Use
Saving Your Data Set Details for Presentation
4. Visualizing Your Data
The Global Data Set
The Data and Preliminaries
Histograms
Bar Plots
Scatterplots
Combining Multiple Plots
Saving Your Plots
Advanced Visualizations
Parting Thoughts
More Resources
5. Essential Programming
Data Classes
Data Structures
Operators
Conditional Logic
User-Defined Functions
Making your Code Modular
Loops
The map_*() Family from purrr
Concluding Remarks
6. Exploratory Data Analysis
Visual Exploration
Numeric Exploration
Putting it All Together: Skimming Data
Concluding Remarks
7. Essential Statistical Modeling
Loading and Inspecting the Data
t-statistics
Chi-square Test for Contingency Tables
Correlation
Ordinary Least Squares Regression
Binary Response Models
Parting Thoughts
8. Parting Thoughts
Continuing to Learn with R
Where To Go From Here
Final Thought
Biography
Ryan Kennedy is an associate professor of political science at the University of Houston and a research associate for the Hobby Center for Public Policy. His work has appeared in top journals including Science, the American Political Science Review, and Journal of Politics. These articles have won several awards, including best paper in the American Political Science Review, and have been cited over 1,700 times. They have also drawn attention from media outlets like Time, the New York Times, and Smithsonian Magazine.
Philip Waggoner is an assistant instructional professor of computational social science at the University of Chicago and a visiting research scholar at ISERP at Columbia University. He is an Associate Editor at the Journal of Mathematical Sociology and the Journal of Open Research Software, and author of the forthcoming book, Unsupervised Machine Learning for Clustering in Political and Social Research (Cambridge University Press). His work has appeared or is forthcoming in many journals including the Journal of Politics, Journal of Mathematical Sociology, and Journal of Statistical Theory and Practice.
"The authors do a good job of explaining why and how programmers should use R! This book is ideal for social scientists but also good for all industries since it does not assume prior knowledge of R and also addresses R learning pain points. The book examples are based on real-world applications and the R syntax is explained in easy to understand language. The book is unique because it divides exercises into three levels: Easy, Intermediate and Advanced for all levels of R programmers.The step-by-step guide helps new R programmers stay on the workflow as well as apply best practices.The R examples show various options for each function which helps R programmers understand the function better. Finally, the essential programming chapter is great since all R programmers need to learn and master these R concepts."
-Sunil Gupta, SAS, CDISC and R Corporate Trainer and Author, Founder of R-Guru.com