- The first chapter of this book is dedicated to an overall definition of data science, big data and similar terms.
- The second chapter introduces "canonical data mining tasks".
- Chapter 3 shows the first steps with supervised segmentation and decision trees.
- The next chapter adds linear regression, support vector machines and logistic regression.
- Chapter 5 - in my opinion the most useful one - defines overfitting. The authors show examples of how one can run into the overfitting problem, and also how to avoid it and deal with potential problems.
- Chapter 6 introduces additional data science tools: similarity, neighbors and clustering methods.
- Chapter 7 focuses on aspects strictly related to applying the previously mentioned tools to business - expected profit. This well-written chapter shows that beneath the pure data tools there is almost always a second layer: the business one.
- A great data scientist at some point has to present results and hypotheses to stakeholders. They can use lots of complicated mathematical formulas, but simple plots with some additional information can also visualize the ideas nicely. Chapter 8 describes some fundamental "curves" that are often used in data science.
- In chapter 9, the authors describe Bayes' rule and discuss its advantages and disadvantages.
- Chapter 10 is dedicated to "text mining". The authors know that they only scratch the surface of this topic. On the other hand, the reader can find here some basic ideas on how to work with text and how to start researching different methods.
- The final evaluation of the example problem used throughout the book is done in chapter 11.
- Chapter 12 discusses other techniques for approaching analytical tasks: co-occurrence and associations (example usage: determining which items are bought together). Profiling, link prediction and data reduction are also discussed, with a nice example from the Netflix Prize. The authors also clearly explain why an ensemble of models can give better results in some cases.
- In chapter 13, the authors show how to think about data science in a business context, and also point out how to work as a data scientist in a business environment.
- The last chapter is dedicated to an overall summary. The authors give hints on how we should ask data-science-related questions and how to think about data science in general.
Actually, I can't say anything bad about this book. Of course, I would love to see a complementary handbook with code in Python or R, but I guess there are plenty of such books already.