This textbook provides future data analysts with the tools, methods, and skills needed to answer data-focused, real-life questions; to carry out data analysis; and to visualize and interpret results to support better decisions in business, economics, and public policy. Data wrangling and exploration, regression analysis, machine learning, and causal analysis are comprehensively covered, as well as when, why, and how the methods work, and how they relate to each other. As the most effective way to communicate data analysis, running case studies play a central role in this textbook. Each case starts with an industry-relevant question and answers it by using real-world data and applying the tools and methods covered in the textbook. Learning is then consolidated by 360 practice questions and 120 data exercises. Extensive online resources, including raw and cleaned data and codes for all analysis in Stata, R, and Python, can be found at
Reviews in the media

'This exciting new text covers everything today's aspiring data scientist needs to know, managing to be comprehensive as well as accessible. Like a good confidence interval, the Gabors have got you almost completely covered!' Joshua Angrist, Massachusetts Institute of Technology, winner of the Nobel Memorial Prize in Economic Sciences

'This is an excellent book for students learning the art of modern data analytics. It combines the latest techniques with practical applications, replicating the implementation side of classroom teaching that is typically missing in textbooks. For example, they used the World Management Survey data to generate exercises on firm performance for students to gain experience in handling real data, with all its quirks, problems, and issues. For students looking to learn data analysis from one textbook, this is a great way to proceed.' Nicholas Bloom, Stanford University

'I know of few books about data analysis and visualization that are as comprehensive, deep, practical, and current as this one; and I know of almost none that are as fun to read. Gábor Békés and Gábor Kézdi have created a most unusual and most compelling beast: a textbook that teaches you the subject matter well and that, at the same time, you can enjoy reading cover to cover.' Alberto Cairo, University of Miami

'A beautiful integration of econometrics and data science that provides a direct path from data collection and exploratory analysis to conventional regression modeling, then on to prediction and causal modeling. Exactly what is needed to equip the next generation of students with the tools and insights from the two fields.' David Card, University of California, Berkeley, winner of the Nobel Memorial Prize in Economic Sciences

'This textbook is excellent at dissecting and explaining the underlying process of data analysis. Békés and Kézdi have masterfully woven into their instruction a comprehensive range of case studies. The result is a rigorous textbook grounded in real-world learning, at once accessible and engaging to novice scholars and advanced practitioners alike. I have every confidence it will be valued by future generations.' Kerwin K. Charles, Yale School of Management

'This book takes you by the hand in a journey that will bring you to understand the core value of data in the fields of machine learning and economics. The large amount of accessible examples combined with the intuitive explanation of foundational concepts is an ideal mix for anyone who wants to do data analysis. It is highly recommended to anyone interested in the new way in which data will be analyzed in the social sciences in the next years.' Christian Fons-Rosen, Barcelona Graduate School of Economics

'This sophisticatedly simple book is ideal for undergraduate- or Master's-level Data Analytics courses with a broad audience. The authors discuss the key aspects of examining data, regression analysis, prediction, Lasso, and rando...

Gábor Békés is an assistant professor at the Department of Economics and Business of the Central European University, and Director of the Business Analytics Program. He is a senior fellow at KRTK and a research affiliate at the Center for Economic Policy Research (CEPR). He has published in top economics journals on multinational firm activities and productivity, business clusters, and innovation spillovers. He has managed international data collection projects on firm performance and supply chains. He has done policy advising (the European Commission, ECB) as well as private-sector consultancy (in finance, business intelligence, and real estate). He has taught graduate-level data analysis and economic geography courses since 2012.

Gábor Kézdi is a research associate professor at the University of Michigan's Institute for Social Research. He has published in top journals in economics, statistics, and political science on topics including household finances, health, education, demography, and ethnic disadvantages and prejudice. He has managed several data collection projects in Europe; currently, he is co-investigator of the Health and Retirement Study in the US. He has consulted for various governmental and non-governmental institutions on the disadvantage of the Roma minority and the evaluation of social interventions. He has taught data analysis, econometrics, and labor economics from undergraduate to Ph.D. levels since 2002, and supervised a number of MA and Ph.D. students.


Part I. Data Exploration: 1. Origins of data; 2. Preparing data for analysis; 3. Exploratory data analysis; 4. Comparison and correlation; 5. Generalizing from data; 6. Testing hypotheses

Part II. Regression Analysis: 7. Simple regression; 8. Complicated patterns and messy data; 9. Generalizing results of a regression; 10. Multiple linear regression; 11. Modeling probabilities; 12. Regression with time series data

Part III. Prediction: 13. A framework for prediction; 14. Model building for prediction; 15. Regression trees; 16. Random forest and boosting; 17. Probability prediction and classification; 18. Forecasting from time series data

Part IV. Causal Analysis: 19. A framework for causal analysis; 20. Designing and analyzing experiments; 21. Regression and matching with observational data; 22. Difference-in-differences; 23. Methods for panel data; 24. Appropriate control groups for panel data

Bibliography; Index.