Becoming a Data Head

How to Think, Speak, and Understand Data Science, Statistics, and Machine Learning
by Alex J. Gutman, 2021, 288 pages
4.23
363 ratings
Key Takeaways

1. Define the Problem Before Diving into Data

Gutman and Goldmeier filter through much of the noise to break down complex data and statistical concepts we hear today into basic examples and analogies that stick.

Focus on the business problem. Before starting any data project, clearly define the problem you're trying to solve. Avoid getting caught up in the hype of new technologies or methodologies. Instead, focus on the business value and the impact of solving the problem.

Ask key questions. To ensure the problem is well-defined, ask:

  • Why is this problem important?
  • Who does this problem affect?
  • What if we don't have the right data?
  • When is the project over?
  • What if we don't like the results?

Avoid methodology and deliverable focus. Be wary of projects that start with a specific technology or deliverable in mind. Instead, focus on the business problem and then determine the appropriate tools and methods.

2. Data is Encoded Information, Not Just Numbers

In demystifying these complex statistical topics, they have also created a common language that bridges the longstanding communication divide that has — until now — separated data work from business value.

Data vs. Information. Understand the difference between data and information. Data is encoded information, while information is derived knowledge. Data is the raw material, and information is the result of analysis.

Data Types. Be familiar with different data types:

  • Numeric (continuous and count)
  • Categorical (ordered and unordered)
  • Dates

Data Collection. Understand how data is collected (observational vs. experimental) and structured (structured vs. unstructured). This will help you assess its quality and limitations.

3. Statistical Thinking Requires Questioning Everything

Statistical thinking is a different way of thinking that is part detective, skeptical, and involves alternate takes on a problem.

Embrace skepticism. Develop a critical mindset and question the data and results you encounter. Don't take numbers at face value. Be especially skeptical of claims that align with your existing beliefs.

Understand variation. Recognize that there is variation in all things. Not every peak and valley needs an explanation. Differentiate between measurement variation and random variation.

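Probability and Statistics. Use probability and statistics to manage uncertainty. Understand the difference between them: probability drills down, reasoning from a known population to the data you are likely to see, while statistics drills up, reasoning from the data you have to the population it came from.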

4. Argue with the Data's Origin and Representativeness

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.

Data Origin Story. Always ask about the origin of the data. Who collected it? How was it collected? Is it observational or experimental? This will help you assess its reliability and potential biases.

Representativeness. Ensure the data is representative of the population you care about. Is there sampling bias? How were outliers handled? What data are you not seeing? How were missing values dealt with?

Measurement. Can the data measure what you want it to measure? Be wary of proxy measures and indirect approximations.

5. Explore Data to Uncover Relationships and Opportunities

Gutman and Goldmeier offer practical advice for asking the right questions, challenging assumptions, and avoiding common pitfalls.

Embrace the exploratory mindset. Approach data analysis with curiosity and a willingness to iterate. Don't follow a rigid script. Be open to discovering new relationships and opportunities.

Ask guiding questions. As you explore the data, ask:

  • Can the data answer the question?
  • Did you discover any relationships?
  • Did you find new opportunities in the data?

Use visualizations. Use histograms, box plots, bar charts, and scatter plots to explore the data and spot anomalies. Verify noteworthy correlations with visualizations.

6. Probabilities Quantify Uncertainty, Challenge Intuition

Many people’s notion of probability is so impoverished that it admits [one] of only two values: 50-50 and 99%, tossup or essentially certain.

Probability vs. Intuition. Recognize that your intuition can play tricks on you. Don't underestimate variation, especially when dealing with small numbers.
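
To make the point about small numbers concrete, here is a minimal sketch that computes the exact chance that a fair coin shows 70% or more, or 30% or fewer, heads. The function name, thresholds, and sample sizes are illustrative assumptions, not taken from the book.

```python
from math import comb

def prob_extreme_share(n, low=0.3, high=0.7, p=0.5):
    """Exact probability that the observed share of successes in n trials
    is <= low or >= high, when each trial succeeds with probability p."""
    total = 0.0
    for k in range(n + 1):
        share = k / n
        if share <= low or share >= high:
            # Binomial probability of exactly k successes in n trials.
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total

# Same fair coin, different sample sizes.
for n in (10, 100, 1000):
    print(n, prob_extreme_share(n))
```

With 10 flips, a lopsided result like this turns up about 34% of the time; with 100 or 1,000 flips it becomes very rare. The coin is the same in every case, only the sample size changes.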

Rules of the Game. Understand the basic rules of probability:

  • Probabilities range from 0 to 1.
  • The probabilities of all possible outcomes must sum to 1.
  • The chance of any two events happening together cannot be greater than the chance of either event happening by itself.
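
These rules can be checked on a small example. The sketch below enumerates every roll of two fair dice; the dice setup and the event names are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely rolls of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a true/false function of one roll."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def first_is_six(o):
    return o[0] == 6

def total_is_ten_plus(o):
    return o[0] + o[1] >= 10

def both(o):
    return first_is_six(o) and total_is_ten_plus(o)

# Rules 1 and 2: each total has a probability between 0 and 1,
# and the probabilities of all possible totals add up to 1.
total_probs = {t: prob(lambda o, t=t: o[0] + o[1] == t) for t in range(2, 13)}
print(sum(total_probs.values()))  # 1

# Rule 3: P(A and B) cannot exceed P(A) or P(B).
print(prob(first_is_six), prob(total_is_ten_plus), prob(both))  # 1/6 1/6 1/12
```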

Conditional Probability. Know that all probabilities are conditional. Be careful assuming independence. Don't fall for the gambler's fallacy.

7. Challenge Statistics by Understanding Inference

The most clear, concise, and practical characterization of working in corporate analytics that I’ve seen.

Statistical Inference. Understand the process of statistical inference:

  1. Ask a meaningful question.
  2. Formulate a hypothesis test.
  3. Establish a significance level.
  4. Calculate a p-value.
  5. Calculate confidence intervals.
  6. Reject or fail to reject the null hypothesis.
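
The steps above can be sketched on a toy question: is a coin fair if it lands heads 61 times in 100 flips? The data, the function names, and the choice of an exact binomial test with a normal-approximation interval are all assumptions made for illustration; the book does not prescribe this particular test.

```python
from math import comb, sqrt
from statistics import NormalDist

def two_sided_p_value(successes, n):
    """Exact two-sided p-value for the null hypothesis 'success rate = 0.5'.
    Adds up the probability of every count at least as far from n/2 as observed."""
    distance = abs(successes - n / 2)
    tail = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= distance)
    return tail / 2**n

def confidence_interval(successes, n, level=0.95):
    """Normal-approximation (Wald) interval for the success rate."""
    rate = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sqrt(rate * (1 - rate) / n)
    return rate - margin, rate + margin

alpha = 0.05                                  # step 3: significance level
heads, flips = 61, 100                        # hypothetical data
p_value = two_sided_p_value(heads, flips)     # step 4
interval = confidence_interval(heads, flips)  # step 5
reject_null = p_value < alpha                 # step 6
print(p_value, interval, reject_null)
```

Here the p-value falls below 0.05 and the interval excludes 0.5, so the null hypothesis of a fair coin is rejected. Whether a coin this biased matters in practice is a separate question.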

Key Questions. Ask these questions to challenge the statistics:

  • What is the context for these statistics?
  • What is the sample size?
  • What are you testing?
  • What is the null hypothesis?
  • What is the significance level?
  • How many tests are you doing?
  • Can I see the confidence intervals?
  • Is this practically significant?
  • Are you assuming causality?
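
The question about how many tests were run matters because false positives accumulate. A minimal sketch, assuming the tests are independent and every null hypothesis is actually true (the function name is hypothetical):

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests
    when every null hypothesis is actually true."""
    return 1 - (1 - alpha) ** num_tests

for m in (1, 5, 20):
    print(m, round(familywise_error_rate(m), 2))
```

At a 5% significance level, running 20 such tests gives roughly a 64% chance that at least one looks "significant" by luck alone.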

Decision Errors. Balance decision errors (false positives and false negatives).

8. Unsupervised Learning Reveals Hidden Groups

Becoming a Data Head raises the level of education and knowledge in an industry desperate for clarity in thinking.

Unsupervised Learning. Understand the goal of unsupervised learning: to discover hidden patterns and groups in datasets without predefined labels.

Dimensionality Reduction. Learn about dimensionality reduction and principal component analysis (PCA). PCA creates composite features that capture the most variance in the data.
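
As a rough sketch of what "capturing the most variance" means, the code below handles the simplest case, two features, where the principal components come from the eigenvalues of a 2x2 covariance matrix. The function name and the sample points are assumptions; real projects would normally use a library implementation.

```python
from math import sqrt

def first_component_share(points):
    """Share of total variance captured by the first principal component
    of 2-D data, using the eigenvalues of the 2x2 covariance matrix."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Largest eigenvalue of [[var_x, cov_xy], [cov_xy, var_y]].
    half_trace = (var_x + var_y) / 2
    spread = sqrt(((var_x - var_y) / 2) ** 2 + cov_xy**2)
    largest = half_trace + spread
    return largest / (var_x + var_y)

# Two features that move together: one composite feature captures almost everything.
correlated = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
print(first_component_share(correlated))
```

When the two features are strongly correlated, the share is close to 1, which is why one composite feature can stand in for both.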

Clustering. Understand clustering and k-means clustering. K-means groups similar observations together based on a distance metric.
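
A bare-bones sketch of k-means follows, assuming Euclidean distance and starting centers supplied by the caller. The function name, the points, and the starting centers are illustrative; library versions add smarter initialization and stopping rules.

```python
from math import dist

def k_means(points, centers, iterations=10):
    """Plain k-means: assign each point to its nearest center (Euclidean
    distance), then move each center to the mean of its assigned points."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = k_means(points, centers=[(0, 0), (10, 10)])
print(centers)
print(clusters)
```

The number of clusters is set by how many starting centers are passed in, and that choice is left to the analyst.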

9. Regression Models Explain and Predict Relationships

Gutman and Goldmeier have written a book that is as useful for applied statisticians and data scientists as it is for business leaders and technical professionals.

Supervised Learning. Understand the goal of supervised learning: to find relationships in data with inputs and known outputs.

Regression Models. Learn about linear regression and its goal: to find the line of best fit that minimizes the sum of squared errors.
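
For a single feature, the line of best fit has a closed-form solution. The sketch below uses it on made-up data; the function names and numbers are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation = sum((x - mean_x) ** 2 for x in xs)
    slope = covariation / variation
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical measurements
slope, intercept = fit_line(xs, ys)
print(slope, intercept, sum_squared_errors(xs, ys, slope, intercept))
```

Any other line through these points, for example one with a slightly steeper slope, produces a larger sum of squared errors.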

Multiple Regression. Extend linear regression to multiple features. Understand the importance of coefficients and p-values.

10. Classification Models Predict Categories

THE book that business and technology leaders need to read to fully understand the potential, power, AND limitations of data science.

Classification Models. Understand the goal of classification models: to predict a categorical variable (label).

Logistic Regression. Learn about logistic regression and its ability to predict probabilities.
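
A minimal sketch of how logistic regression turns inputs into a probability: a weighted sum is passed through the logistic (sigmoid) function. The weights, bias, and feature values below are invented for illustration; in practice they are learned from data.

```python
from math import exp

def predict_probability(features, weights, bias):
    """Weighted sum of the features, squeezed into (0, 1) by the logistic function."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 / (1 + exp(-score))

# Hypothetical model: two features with made-up weights.
weights = [0.8, -0.5]
bias = -1.0
print(predict_probability([3.0, 1.0], weights, bias))
print(predict_probability([0.5, 2.0], weights, bias))
```

The output is a probability, not a label. Turning it into a yes or no decision requires choosing a cutoff, and that choice depends on the cost of each kind of error.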

Decision Trees. Understand decision trees and their ability to create a flowchart of rules.

Ensemble Methods. Learn about ensemble methods (random forests and gradient boosted trees) and their ability to improve prediction accuracy.

11. Text Analytics Transforms Words into Insights

Gutman and Goldmeier filter through much of the noise to break down complex data and statistical concepts we hear today into basic examples and analogies that stick.

Text Analytics. Understand the goal of text analytics: to extract useful insights from raw text.

Bag of Words. Learn about the bag-of-words model and its limitations.
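
A bag of words is just a count of each word, which a few lines can show. This sketch assumes simple lowercase splitting on whitespace; the function name and sentences are illustrative.

```python
from collections import Counter

def bag_of_words(text):
    """Count how often each word appears, ignoring case and word order."""
    return Counter(text.lower().split())

a = bag_of_words("Dog bites man")
b = bag_of_words("Man bites dog")
print(a)
print(a == b)  # True: the two sentences produce the same bag
```

The two sentences mean very different things but yield identical counts, which illustrates the model's main limitation: word order is thrown away.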

N-grams. Understand N-grams and their ability to capture context.
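
An N-gram is a run of N consecutive words. The sketch below builds two-word N-grams (bigrams), assuming lowercase whitespace splitting; the function name and sentences are illustrative.

```python
def n_grams(text, n=2):
    """Return the consecutive n-word sequences in a text."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

print(n_grams("Dog bites man"))  # [('dog', 'bites'), ('bites', 'man')]
print(n_grams("Man bites dog"))  # [('man', 'bites'), ('bites', 'dog')]
```

Unlike counts of single words, the bigrams differ between the two sentences, so some of the local word order is preserved.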

Word Embeddings. Learn about word embeddings and their ability to represent words as vectors.
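
Once words are vectors, similarity becomes a calculation. The sketch below compares vectors with cosine similarity; the three-number vectors are invented for illustration, whereas real embeddings are learned from large text collections and have many more dimensions.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, made up for this example.
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "banana": [0.1, 0.0, 0.9],
}
print(cosine_similarity(embeddings["king"], embeddings["queen"]))
print(cosine_similarity(embeddings["king"], embeddings["banana"]))
```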

12. Deep Learning Mimics the Brain for Complex Tasks

What is keeping data science from reaching its true potential? It is not slow algorithms, lack of data, lack of computing power, or even lack of data scientists.

Neural Networks. Understand the basic structure of neural networks: neurons, activation functions, and layers.
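
The three building blocks can be sketched in a few lines. This example assumes the ReLU activation function and uses made-up weights; a trained network would learn its weights from data.

```python
def relu(x):
    """Activation function: pass positive values through, clip negatives to 0."""
    return max(0.0, x)

def neuron(inputs, weights, bias, activation=relu):
    """One neuron: weighted sum of the inputs plus a bias, then the activation."""
    return activation(bias + sum(w * x for w, x in zip(weights, inputs)))

def layer(inputs, weight_rows, biases, activation=relu):
    """One layer: several neurons that all read the same inputs."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]

# Hypothetical two-layer network with invented weights.
inputs = [1.0, 2.0]
hidden = layer(inputs, weight_rows=[[0.5, -1.0], [1.0, 1.0]], biases=[0.0, 0.5])
output = layer(hidden, weight_rows=[[2.0, 1.0]], biases=[-1.0])
print(hidden, output)  # [0.0, 3.5] [2.5]
```

Stacking more layers like `hidden` between the inputs and the output is what makes a network "deep".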

Deep Learning. Learn about deep learning and its ability to automate feature engineering.

Convolutional Neural Networks. Understand convolutional neural networks and their application to image analysis.

FAQ

1. What is Becoming a Data Head by Alex J. Gutman about?

  • Comprehensive data literacy guide: The book aims to make data science, statistics, and machine learning accessible to non-experts, bridging the gap between technical and business professionals.
  • Critical thinking focus: It teaches readers to ask the right questions, understand data nuances, and recognize common pitfalls in data projects.
  • Practical, real-world approach: Everyday examples, analogies, and clear explanations help demystify complex topics for a broad audience.
  • Human side of data: The book also addresses communication challenges and team dynamics in data-driven organizations.

2. Why should I read Becoming a Data Head by Alex J. Gutman?

  • Demystifies data science buzzwords: The book breaks down intimidating concepts like AI, machine learning, and big data into understandable language.
  • Bridges communication gaps: It equips non-technical readers to engage meaningfully with data professionals, improving collaboration and decision-making.
  • Prepares for a data-driven future: Readers gain the skills to think critically about data and become effective “Data Heads” in their organizations.
  • Real-world relevance: The book highlights common pitfalls and challenges, making it valuable for business leaders and professionals navigating data projects.

3. What are the key takeaways from Becoming a Data Head by Alex J. Gutman?

  • Ask fundamental questions: Always clarify why a problem matters, who it affects, and whether the right data exists before starting a project.
  • Think statistically: Understand variation, probability, and the difference between data and information to interpret results critically.
  • Challenge assumptions: Learn to question data sources, representativeness, statistical significance, and causality to avoid common analytical errors.
  • Continuous learning: The book encourages ongoing critical thinking and building a shared language between data workers and decision makers.

4. How does Becoming a Data Head by Alex J. Gutman define data and its types?

  • Data vs. information: Data is encoded information, typically structured in datasets with rows (observations) and columns (features or variables).
  • Types of data: Numeric data can be continuous (like temperature) or count-based, while categorical data can be ordered (ordinal) or unordered (nominal).
  • Observational vs. experimental: Observational data is passively collected, while experimental data is gathered under controlled conditions to infer causality.
  • Importance of structure: Understanding data types is crucial for selecting appropriate analysis methods.

5. What is statistical thinking according to Becoming a Data Head by Alex J. Gutman?

  • Embrace variation and uncertainty: All data contains variation; questioning and understanding this is essential.
  • Probability and statistics tools: Use these to manage uncertainty, understand sampling variation, and avoid misconceptions like the law of small numbers.
  • Critical, not cynical: The goal is to appreciate data’s limitations and use it wisely, not to reject it outright.
  • Ask probing questions: Always inquire about data sources, collection methods, and potential biases.

6. What are the main questions to ask when arguing with data in Becoming a Data Head by Alex J. Gutman?

  • Data origin story: Who collected the data, how was it collected, and is it observational or experimental?
  • Representativeness and bias: Is the data representative of the population, and are there sampling biases or outliers?
  • Missing data and measurement: What data is missing, how was it handled, and does the data truly measure the intended concept?
  • Context matters: Always consider the context and limitations of the data before drawing conclusions.

7. How does Becoming a Data Head by Alex J. Gutman explain probability and its common pitfalls?

  • Probability basics: Probability quantifies uncertainty, ranging from 0 to 1, with conditional probabilities depending on other events.
  • Common traps: Avoid assuming independence incorrectly, falling for the gambler’s fallacy, and confusing conditional probabilities.
  • Bayes’ theorem: The book introduces Bayes’ theorem as a key tool for relating conditional probabilities, using practical examples like virus testing.
  • Critical interpretation: Understanding probability helps avoid misinterpretation and supports better decision-making.
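
A virus-testing calculation of the kind mentioned above can be sketched with Bayes' theorem. The prevalence, sensitivity, and specificity below are made-up numbers, not figures from the book.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(infected | positive test), by Bayes' theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# Hypothetical test: 1% of people infected, 99% sensitivity, 95% specificity.
print(posterior_given_positive(prevalence=0.01, sensitivity=0.99, specificity=0.95))
```

With these numbers, a positive result means only about a 17% chance of infection, because false positives among the many healthy people outnumber true positives among the few infected. Confusing P(positive | infected) with P(infected | positive) is the trap.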

8. What guidance does Becoming a Data Head by Alex J. Gutman provide on challenging statistics and hypothesis testing?

  • Understand context and sample size: Always ask what the statistics mean, the size of the sample, and what question is being tested.
  • Hypothesis testing basics: Learn about null and alternative hypotheses, significance levels, p-values, and confidence intervals.
  • Beware of causality assumptions: Correlation does not imply causation; experimental design is needed to infer causal relationships.
  • Skeptical mindset: Maintain skepticism and seek transparency in statistical claims.

9. What are the key concepts of unsupervised learning explained in Becoming a Data Head by Alex J. Gutman?

  • Dimensionality reduction with PCA: Principal component analysis reduces many correlated features into fewer uncorrelated components, simplifying data analysis.
  • Clustering techniques: Methods like k-means and hierarchical clustering group similar observations without predefined labels, revealing natural groupings.
  • Practical cautions: Unsupervised learning requires careful supervision in choosing the number of components or clusters and interpreting results.
  • Data preparation: Proper scaling and understanding of the data are essential to avoid misleading conclusions.

10. How does Becoming a Data Head by Alex J. Gutman explain supervised learning, regression, and classification models?

  • Supervised learning paradigm: Models are trained on input-output pairs to predict outcomes for new data, with regression for continuous values and classification for categories.
  • Linear and logistic regression: Linear regression predicts continuous outcomes, while logistic regression handles binary classification, outputting probabilities.
  • Decision trees and ensembles: Decision trees split data into rules, while ensemble methods like random forests and gradient boosting improve accuracy but reduce interpretability.
  • Common pitfalls: The book warns about data leakage, overfitting, not splitting data into training and test sets, and misunderstanding accuracy metrics.
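
One way accuracy misleads is on imbalanced data. In this sketch, with hypothetical labels where only 5 of 100 cases are positive, a "model" that always predicts the negative class still scores well on accuracy. The function names and data are assumptions for illustration.

```python
def accuracy(actual, predicted):
    """Share of predictions that match the actual labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def recall(actual, predicted, positive=1):
    """Share of actual positive cases that the model caught."""
    caught = [p for a, p in zip(actual, predicted) if a == positive]
    return sum(p == positive for p in caught) / len(caught)

actual = [1] * 5 + [0] * 95    # 5 positive cases out of 100
always_negative = [0] * 100    # a model that never predicts the positive class

print(accuracy(actual, always_negative))  # 0.95
print(recall(actual, always_negative))    # 0.0
```

The model is 95% accurate and finds none of the cases that matter, so accuracy alone says little when one class is rare.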

11. How does Becoming a Data Head by Alex J. Gutman approach text analytics, deep learning, and unstructured data?

  • Text data challenges: Computers require text to be converted into numbers using methods like bag-of-words, n-grams, and word embeddings.
  • Text analysis methods: Topic modeling and text classification (e.g., Naïve Bayes) are covered, with emphasis on context and domain adaptation.
  • Deep learning basics: Neural networks with hidden layers learn complex patterns, automating feature engineering for tasks like image and language processing.
  • Limitations and requirements: Deep learning needs large labeled datasets, significant computing power, and careful attention to ethical issues like algorithmic bias.

12. What common pitfalls, biases, and communication challenges in data projects does Becoming a Data Head by Alex J. Gutman warn about?

  • Statistical and project biases: The book explains survivorship bias, regression to the mean, Simpson’s paradox, confirmation bias, and algorithmic bias, showing how they can mislead analysis.
  • Project management pitfalls: Misapplied problems, data leakage, overfitting, non-representative samples, and unrealistic expectations are highlighted as common causes of failure.
  • Communication breakdowns: Scenarios like The Postmortem, The Telephone Game, and The Blowhard illustrate how miscommunication derails projects.
  • Data personalities: Understanding and engaging with Data Enthusiasts, Cynics, and Heads is crucial for successful collaboration and project outcomes.

Review Summary

4.23 out of 5
Average of 363 ratings from Goodreads and Amazon.

Becoming a Data Head is highly praised for its accessible introduction to data science concepts. Readers appreciate its clear explanations of complex topics, making it valuable for both beginners and experienced professionals. The book covers a wide range of subjects, from basic statistics to machine learning and AI. Many reviewers found it helpful for understanding data-driven decision-making in business contexts. While some felt it was too basic, most agreed it provides a solid foundation for anyone looking to enhance their data literacy.

About the Author

Alex J. Gutman is a data scientist and author who co-wrote "Becoming a Data Head" with Jordan Goldmeier. The book aims to demystify data science concepts for a broad audience, including business professionals and those new to the field. Gutman's approach focuses on practical applications and real-world examples, helping readers understand how data can be used effectively in various contexts. His writing style is praised for its clarity and ability to explain complex topics in an accessible manner. Gutman's expertise in data science and his skill in communicating technical concepts to non-technical audiences are evident throughout the book.
