Data Science for Business: What you need to know about data mining and data-analytic thinking by Foster Provost

Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the “data-analytic thinking” necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today.
Todd N

Feb 01, 2014

rated it
it was amazing


This is probably the most practical book to read if you are looking for an overview of data science, either so you can be in the know when terms like k-means and ROC curves are being bandied about or so you have some context when you start digging deeper into how some of these algorithms are implemented (especially when plowing through a book like The Elements of Statistical Learning: Data Mining, Inference, and Prediction).

I found it to be at just the right level because there is just enough math to explain the fundamental concepts and make them stick in my head. This isn’t a book on implementing these concepts or a bunch of algorithms. (Check out The Elements of Statistical Learning above or Data Mining: Practical Machine Learning Tools and Techniques, Second Edition for that.) This gives the book the advantage of being something you can throw at an intelligent manager or an interested developer, and both can get a lot out of it. And if they are interested in the next level of learning, there are plenty of pointers.

Other chapters cover the business-related aspects, which frankly I’m less interested in. Though I did find the chapter on presenting results through ROC curves, lift curves, etc. pretty interesting.

It would be cool if this book had some more hands-on material, so maybe after reading this one should download Weka and jump to Part 3 of Data Mining: Practical Machine Learning Tools and Techniques, or maybe go to Kaggle and browse around the current and past competitions.

A few minor nits: I felt that Bayesian methods were covered too quickly, even though the book is clear that it’s a pretty large topic in itself. The equation that is finally shown has a big hole in it, in that it can quickly go to zero, so it would be nice to mention that sometimes terms are left out or to mention the Laplace estimator, at least in a sidebar. Also, random forests get only a passing mention. But I’m partly complaining because I think I’d benefit from their explanations of these things too.
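
To make the "goes to zero" problem concrete, here is a minimal sketch in Python. It is not the book's equation; it assumes the usual Naive Bayes form, a class prior multiplied by one likelihood per word, and the function names (`word_likelihoods`, `score`), the toy documents, and the prior of 0.5 are all invented for illustration. With raw frequency estimates, a single word never seen in the class zeroes out the whole product. With the Laplace (add-one) estimator, every word keeps a small nonzero probability.

```python
from collections import Counter


def word_likelihoods(docs, vocabulary, alpha=0.0):
    """Estimate p(word | class) from the tokenized documents of one class.

    alpha=0 gives raw frequency estimates; alpha=1 is the Laplace
    (add-one) estimator.
    """
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts[w] for w in vocabulary)
    denom = total + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / denom for w in vocabulary}


def score(doc, prior, likelihoods):
    """Unnormalized Naive Bayes score: prior times the product of word likelihoods.

    Assumes every word in doc is in the vocabulary used for the likelihoods.
    """
    result = prior
    for word in doc:
        result *= likelihoods[word]
    return result


# Toy "spam" class, made up purely for illustration.
spam_docs = [["free", "offer"], ["free", "prize"]]
vocabulary = ["free", "offer", "prize", "meeting"]

raw = word_likelihoods(spam_docs, vocabulary, alpha=0.0)
smoothed = word_likelihoods(spam_docs, vocabulary, alpha=1.0)

# "meeting" never appears in the spam documents.
new_doc = ["free", "meeting"]
raw_score = score(new_doc, 0.5, raw)            # one zero term wipes out the product
smoothed_score = score(new_doc, 0.5, smoothed)  # small, but no longer zero

print(raw_score, smoothed_score)
```

In practice the product is usually computed as a sum of logarithms to avoid underflow on long documents, which is a separate issue from the zero-count problem shown here.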

Oh, and it was cool to see Topic Models mentioned in the chapter on text because way back in the ’90s I worked for a company that used a very primitive, manual version of this technique for classifying documents.


Jan 17, 2016

rated it
it was amazing


When people say that data science is the way of the future, I break into a bit of a cold sweat because there’s an implication that I’m going to have to read another book filled solely with equations and proofs. It’s rare to find a book where you can get into the grit of a scientific framework without getting too bogged down by endless abstraction. However, Provost and Fawcett manage to soften the blow of overtly academic writing, while simultaneously fostering an intricate understanding and appr

Data Science for Business is all about the conceptual framework of the field as it pertains to the different aspects of entrepreneurship. The authors place a lot of emphasis on the structure of the book: there’s a very clear progression from statistical theory to its application in very detailed examples. Every technical term is immediately connected to real-world applications, making this an invaluable resource for someone looking for a pragmatic use for what would otherwise be esoteric concepts.

Beginners and veterans alike should be advised, though, that the content of this book is a little dense. If you’re a complete beginner, it will take a while to fully digest the content. For the initiated, this will make for excellent reference material. This book explores the idea that data science is more than a perfect machine: it is analytical engineering paired with innovative exploration, but it takes a lot of tweaking to get the results that you need.

If you find the wave of innovation sweeping you towards the field of data science, Fawcett and Provost’s book will serve as a gentle but thorough introduction to the basics. You can then use their very detailed reference list to dive further into the most relevant topics. Or at the very least you could name-drop a few of the references during your next meeting to score some points with the data scientists at work.