Data Analysis with Open Source Tools

(paperback)

by Philipp K Janert

Format: Paperback
Published: 2010-12-14
Language: English

Collecting data is relatively easy, but turning raw information into something useful requires that you know how to extract precisely what you need. With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.

Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.

  • Use graphics to describe data with one, two, or dozens of variables
  • Develop conceptual models using back-of-the-envelope calculations, as well as scaling and probability arguments
  • Mine data with computationally intensive methods such as simulation and clustering
  • Make your conclusions understandable through reports, dashboards, and other metrics programs
  • Understand financial calculations, including the time-value of money
  • Use dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situations
  • Become familiar with different open source programming environments for data analysis

"Finally, a concise reference for understanding how to conquer piles of data." --Austin King, Senior Web Developer, Mozilla

"An indispensable text for aspiring data scientists." --Michael E. Driscoll, CEO/Founder, Dataspora

More books by Philipp K Janert

Gnuplot In Action: Understanding Data From Graphs

Philipp K Janert (paperback)
198:-


Additional information

Philipp K. Janert is Chief Consultant at Principal Value, LLC. He has worked for small start-ups and in large corporate environments, both in the US and overseas, including several years at Amazon.com, where he initiated and led several projects to improve Amazon's order fulfillment processes. Philipp K. Janert has written about software and software development for the O'Reilly Network, IBM developerWorks, IEEE Software, and Linux Magazine. He holds a Ph.D. in Theoretical Physics from the University of Washington. Visit his website at www.principal-value.com.

Table of contents

Preface; Before We Begin; Conventions Used in This Book; Using Code Examples; Safari Books Online; How to Contact Us; Acknowledgments
Chapter 1: Introduction; 1.1 Data Analysis; 1.2 What's in This Book; 1.3 What's with the Workshops?; 1.4 What's with the Math?; 1.5 What You'll Need; 1.6 What's Missing
Graphics: Looking at Data
Chapter 2: A Single Variable: Shape and Distribution; 2.1 Dot and Jitter Plots; 2.2 Histograms and Kernel Density Estimates; 2.3 The Cumulative Distribution Function; 2.4 Rank-Order Plots and Lift Charts; 2.5 Only When Appropriate: Summary Statistics and Box Plots; 2.6 Workshop: NumPy; 2.7 Further Reading
Chapter 3: Two Variables: Establishing Relationships; 3.1 Scatter Plots; 3.2 Conquering Noise: Smoothing; 3.3 Logarithmic Plots; 3.4 Banking; 3.5 Linear Regression and All That; 3.6 Showing What's Important; 3.7 Graphical Analysis and Presentation Graphics; 3.8 Workshop: matplotlib; 3.9 Further Reading
Chapter 4: Time As a Variable: Time-Series Analysis; 4.1 Examples; 4.2 The Task; 4.3 Smoothing; 4.4 Don't Overlook the Obvious!; 4.5 The Correlation Function; 4.6 Optional: Filters and Convolutions; 4.7 Workshop: scipy.signal; 4.8 Further Reading
Chapter 5: More Than Two Variables: Graphical Multivariate Analysis; 5.1 False-Color Plots; 5.2 A Lot at a Glance: Multiplots; 5.3 Composition Problems; 5.4 Novel Plot Types; 5.5 Interactive Explorations; 5.6 Workshop: Tools for Multivariate Graphics; 5.7 Further Reading
Chapter 6: Intermezzo: A Data Analysis Session; 6.1 A Data Analysis Session; 6.2 Workshop: gnuplot; 6.3 Further Reading
Analytics: Modeling Data
Chapter 7: Guesstimation and the Back of the Envelope; 7.1 Principles of Guesstimation; 7.2 How Good Are Those Numbers?; 7.3 Optional: A Closer Look at Perturbation Theory and Error Propagation; 7.4 Workshop: The Gnu Scientific Library (GSL); 7.5 Further Reading
Chapter 8: Models from Scaling Arguments; 8.1 Models; 8.2 Arguments from Scale; 8.3 Mean-Field Approximations; 8.4 Common Time-Evolution Scenarios; 8.5 Case Study: How Many Servers Are Best?; 8.6 Why Modeling?; 8.7 Workshop: Sage; 8.8 Further Reading
Chapter 9: Arguments from Probability Models; 9.1 The Binomial Distribution and Bernoulli Trials; 9.2 The Gaussian Distribution and the Central Limit Theorem; 9.3 Power-Law Distributions and Non-Normal Statistics; 9.4 Other Distributions; 9.5 Optional: Case Study - Unique Visitors over Time; 9.6 Workshop: Power-Law Distributions; 9.7 Further Reading
Chapter 10: What You Really Need to Know About Classical Statistics; 10.1 Genesis; 10.2 Statistics Defined; 10.3 Statistics Explained; 10.4 Controlled Experiments Versus Observational Studies; 10.5 Optional: Bayesian Statistics - The Ot...

Customers who bought "Data Analysis with Open Source Tools" also bought:

Programming Collective Intelligence

Toby Segaran (paperback)
211:-

Think Stats

Allen B Downey (paperback)
166:-

Core Python Applications Programming 3rd Edition

Wesley J Chun (paperback)
267:-

Fractals, Chaos, Power Laws

Manfred R Schroeder
175:-

Principles of Electrodynamics

M M Schwartz (paperback)
138:-

Data Analysis with Open Source Tools (paperback)
  • Title: Data Analysis with Open Source Tools
  • ISBN: 9780596802356
  • Publisher: O'REILLY & ASSOCIATES
  • Country of publication: USA
  • Place of publication: Sebastopol
  • Illustrations: illustrations
  • Edition: 1
  • Number of pages: 509
  • Weight: 703 g
  • Height: 234 mm
  • Number of components: 1
  • Format: Paperback