Career Guides | 15 min read | 2026-04-25 | Julian Caraulani

Data Analyst Interview Questions — 40 Real Questions & How to Answer Them (2026)

SQL queries they actually ask, business case frameworks, take-home assignment patterns, and the mistakes that get candidates rejected.

Data analyst interviews in 2026 test four distinct skills: SQL proficiency, statistical reasoning, business sense, and communication. The format typically includes a recruiter screen, a SQL assessment (live or take-home), a business case round, and a behavioral interview. 25% of interviews include a take-home assignment. Here are 40 real questions from companies like Amazon, Google, Meta, Stripe, and Uber — with answer guidance for each.

SQL questions — the make-or-break round

SQL is the single most important skill in data analyst interviews. Every company tests it, and most candidates pass or fail based on this round alone. Here are the 10 patterns that appear most frequently:

  • Window functions (DENSE_RANK, ROW_NUMBER, LAG/LEAD): 'Rank the top 2 highest-paid employees per department.' Know the difference between ROW_NUMBER (unique), RANK (gaps), and DENSE_RANK (no gaps) — interviewers WILL ask.
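  • Rolling aggregates: 'Calculate a rolling 7-day average of daily revenue.' Use AVG(revenue) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW), which assumes exactly one row per day with no gaps. Know ROWS vs RANGE: ROWS counts physical rows, while RANGE frames by the ORDER BY value, so tied rows are treated as peers.
  • Self-joins: 'List all employees and their managers' names.' FROM employees e LEFT JOIN employees m ON e.manager_id = m.id (assuming a manager_id column that points at the manager's row). The LEFT JOIN ensures employees without managers still appear.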
  • Anti-joins / EXCEPT: 'Find customers who ordered in January but NOT February.' Both EXCEPT and LEFT JOIN + IS NULL patterns work — show you know multiple approaches.
  • Correlated subqueries: 'Find employees earning above their department average.' Conceptually, the inner query re-executes for each outer row, although many optimizers rewrite it as a join. Understand the performance implications.
  • Recursive CTEs: 'Build an organizational hierarchy.' Need an anchor (base case) + recursive member. Interviewers ask 'what prevents infinite loops?'
  • Running totals: SUM() OVER (PARTITION BY product ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW).
  • HAVING vs WHERE: 'Find cities with above-average home prices.' WHERE filters rows before aggregation, HAVING filters groups after. This distinction is asked in nearly every interview.
  • Duplicate detection: 'Find duplicate transactions within a 10-minute window.' Self-join with a.id < b.id, which prevents matching a row with itself and stops each pair from being returned twice.
  • Month-over-month change: LAG(total_orders) OVER (ORDER BY month). The third argument of LAG is the default when no previous row exists.
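
Several of the patterns above can be practiced without a database server. The following is a minimal sketch, not a reference solution: it runs the SQL through Python's built-in sqlite3 module against an in-memory database. The table names, column names (including manager_id), and every row are hypothetical and invented for illustration. Window functions need SQLite 3.25 or newer, which is assumed here.

```python
import sqlite3

# Hypothetical toy schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT,
                        salary INTEGER, manager_id INTEGER);
INSERT INTO employees VALUES
  (1, 'Ana',  'Eng',   150, NULL),
  (2, 'Ben',  'Eng',   150, 1),
  (3, 'Cleo', 'Eng',   120, 1),
  (4, 'Dev',  'Eng',   100, 2),
  (5, 'Eli',  'Sales',  90, 1),
  (6, 'Fay',  'Sales',  80, 5);
CREATE TABLE orders (customer TEXT, order_date TEXT);
INSERT INTO orders VALUES
  ('acme', '2026-01-05'), ('acme', '2026-02-10'),
  ('bolt', '2026-01-20'), ('cord', '2026-02-03');
""")


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Window function: top 2 salary levels per department.
# DENSE_RANK keeps both tied 150 earners at rank 1 and puts 120 at rank 2.
top2 = run("""
    SELECT dept, name, salary
    FROM (SELECT dept, name, salary,
                 DENSE_RANK() OVER (PARTITION BY dept
                                    ORDER BY salary DESC) AS rnk
          FROM employees) AS ranked
    WHERE rnk <= 2
    ORDER BY dept, salary DESC, name
""")

# Self-join: LEFT JOIN keeps Ana, who has no manager (manager name is NULL).
managers = run("""
    SELECT e.name, m.name
    FROM employees e
    LEFT JOIN employees m ON e.manager_id = m.id
    ORDER BY e.id
""")

# Anti-join, version 1: LEFT JOIN + IS NULL.
jan_not_feb_join = run("""
    SELECT DISTINCT j.customer
    FROM orders j
    LEFT JOIN orders f
      ON f.customer = j.customer
     AND f.order_date >= '2026-02-01' AND f.order_date < '2026-03-01'
    WHERE j.order_date >= '2026-01-01' AND j.order_date < '2026-02-01'
      AND f.customer IS NULL
""")

# Anti-join, version 2: EXCEPT.
jan_not_feb_except = run("""
    SELECT customer FROM orders
    WHERE order_date >= '2026-01-01' AND order_date < '2026-02-01'
    EXCEPT
    SELECT customer FROM orders
    WHERE order_date >= '2026-02-01' AND order_date < '2026-03-01'
""")

# Recursive CTE: anchor = employees with no manager, recursive member = reports.
# The depth guard is one answer to 'what prevents infinite loops?' if the
# manager data ever contains a cycle.
hierarchy = run("""
    WITH RECURSIVE org(id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, org.depth + 1
        FROM employees e
        JOIN org ON e.manager_id = org.id
        WHERE org.depth < 10
    )
    SELECT name, depth FROM org ORDER BY depth, id
""")

for label, rows in [("top2", top2), ("managers", managers),
                    ("jan_not_feb", jan_not_feb_join), ("hierarchy", hierarchy)]:
    print(label, rows)
```

Swapping DENSE_RANK for ROW_NUMBER or RANK in the first query is a quick way to see the tie-handling difference on the two employees who both earn 150.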

Statistics questions

  • 'When would you use mean vs median?' — Mean is sensitive to outliers, median is robust. Household income: median ($70K) is more representative than mean ($100K) because billionaires skew it. Mention skewness.
  • 'Explain correlation vs causation with a business example.' — Ice cream sales and drowning correlate (summer). Ad spend and revenue may correlate, but you need A/B tests to prove causation. Mention confounding variables.
  • 'A/B test shows p-value of 0.04. Should we ship?' — Statistically significant (below 0.05). BUT also check practical significance (is the effect size meaningful?), sample size, and multiple comparison corrections.
  • 'Type I vs Type II errors — which is worse?' — Type I = false positive, Type II = false negative. Answer depends on context: in fraud detection, Type II is worse. In medical testing, Type I may be worse. Show you reason about tradeoffs.
  • 'How do you handle outliers?' — Detect with IQR or Z-score. Do NOT automatically remove. Investigate: data errors (remove), natural extremes (keep), or separate population (segment). State your assumption.
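
The mean-versus-median and outlier answers above are easier to explain with numbers in hand. Here is a small sketch using only Python's statistics module. The income figures are made up for illustration, and the 1.5 x IQR fence is the common textbook convention, assumed here rather than taken from any particular company's rubric.

```python
import statistics

# Hypothetical household incomes in $K; the last value is an extreme earner.
incomes = [42, 48, 55, 61, 70, 74, 83, 95, 2500]

mean_income = statistics.mean(incomes)      # pulled upward by the extreme value
median_income = statistics.median(incomes)  # robust to the extreme value

# Quartiles via the module's default ('exclusive') method.
q1, _, q3 = statistics.quantiles(incomes, n=4)
iqr = q3 - q1

# Conventional 1.5 * IQR fences: values outside are flagged, not deleted.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in incomes if x < low_fence or x > high_fence]

print(f"mean={mean_income:.1f}  median={median_income}")
print(f"fences=({low_fence}, {high_fence})  flagged={outliers}")
```

Flagging is only the first step. Whether a flagged value is a data error, a natural extreme, or a separate population is still a judgment call to state out loud.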

Business case questions

  • 'Sales dropped 25% last month. Investigate.' — Segment by region, product, channel, cohort. Check if it is across-the-board or isolated. Look at external factors. Check data quality. Present a hypothesis tree.
  • 'How would you measure success of a new feature?' — Adoption rate, retention, engagement depth, business impact. Define success criteria BEFORE launch, not after.
  • 'Which city should we expand to next?' — Market size, competitive density, unit economics by geo, operational feasibility, existing organic demand.
  • 'Build a KPI dashboard for a subscription business.' — North Star = MRR. Supporting: churn, LTV, CAC, LTV/CAC ratio. Layer: acquisition, activation, retention funnel.
  • 'PM says engagement is down. What do you do?' — First clarify the metric. DAU? Session length? Actions per session? Then segment by user type, platform, geo. Do NOT accept vague metrics without questioning.
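
For the 'sales dropped 25%' case, the first cut is usually to check whether the drop is across the board or isolated. The sketch below shows one way to do that, assuming revenue is already aggregated by a single dimension. The function name segment_changes and all figures are hypothetical.

```python
def segment_changes(prev, curr):
    """Rank segments by their contribution to the total change between periods."""
    total_change = sum(curr.values()) - sum(prev.values())
    rows = []
    for seg in sorted(set(prev) | set(curr)):
        before, after = prev.get(seg, 0), curr.get(seg, 0)
        change = after - before
        rows.append({
            "segment": seg,
            "change": change,
            # Percent change is undefined for a segment that did not exist before.
            "pct_change": change / before if before else None,
            "share_of_total_change": change / total_change if total_change else None,
        })
    # Most negative change first, so the biggest contributor to a drop leads.
    return sorted(rows, key=lambda r: r["change"])


# Hypothetical revenue by region for two consecutive months.
prev_month = {"North": 400, "South": 300, "East": 200, "West": 100}
curr_month = {"North": 390, "South": 100, "East": 190, "West": 70}

total_pct = sum(curr_month.values()) / sum(prev_month.values()) - 1
ranked = segment_changes(prev_month, curr_month)

print(f"total change: {total_pct:.0%}")
for row in ranked:
    print(row)
```

In this invented data one region accounts for most of the decline, which points to an isolated problem rather than an across-the-board one. The same function can be rerun by product, channel, or cohort to build out the hypothesis tree.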

Take-home assignment patterns

25% of data analyst interviews include a take-home assignment. Four common patterns:

  • Sales funnel analysis: identify best channels, recommend budget reallocation, present in 5 slides.
  • User churn analysis: define active vs churned, build cohort retention, identify features correlated with retention.
  • A/B test analysis: determine statistical significance, calculate effect size, flag issues like sample ratio mismatch.
  • Open-ended 'improve X': no dataset provided; define metrics, outline an analysis plan, suggest experiments.

Timeline is typically 4-7 days. Tools are your choice (SQL, Python, Excel all acceptable). Presentation runs 45-60 minutes with 2-4 interviewers plus Q&A. The biggest mistake: submitting analysis without mentioning missing values, outliers, or assumptions.
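
For the A/B test pattern, the three checks named above (significance, effect size, sample ratio mismatch) fit in a few lines of standard-library Python. This is a sketch under stated assumptions: a two-sided pooled two-proportion z-test, a chi-square test with one degree of freedom for the traffic split, and an intended 50/50 allocation. The function names and all counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (absolute lift, z statistic, two-sided p-value) for B vs A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


def srm_p_value(n_a, n_b, expected_share_a=0.5):
    """Chi-square (1 dof) p-value that the observed split matches the plan."""
    total = n_a + n_b
    exp_a = total * expected_share_a
    exp_b = total - exp_a
    chi2 = (n_a - exp_a) ** 2 / exp_a + (n_b - exp_b) ** 2 / exp_b
    # Survival function of chi-square with 1 dof: erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical experiment: 10,000 users per arm, 10% vs 11% conversion.
lift, z, p_value = two_proportion_z_test(1000, 10_000, 1100, 10_000)
srm_ok = srm_p_value(10_000, 10_000)
srm_bad = srm_p_value(10_000, 9_000)  # one arm is missing 1,000 users

print(f"absolute lift={lift:.3f}  z={z:.2f}  p={p_value:.4f}")
print(f"SRM p-value, balanced split: {srm_ok:.3f}")
print(f"SRM p-value, lopsided split: {srm_bad:.2e}")
```

A very small SRM p-value suggests the assignment or logging is broken, in which case the conversion comparison should not be trusted no matter how significant it looks. The lift is reported alongside the p-value so the write-up can argue practical significance separately.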

Behavioral questions

  • 'Tell me about a time you handled conflicting stakeholder priorities.' — Use STAR format. Show you clarified underlying needs, found common ground in data, and proposed a solution addressing both sides.
  • 'Describe a time your analysis was wrong.' — They want humility. Explain what went wrong, how you caught it, and what you changed to prevent recurrence.
  • 'Tell me about a time you influenced a product decision with data.' — Key word is 'influenced.' Show you changed someone's mind, not just presented a chart.
  • 'How do you handle messy data?' — Acknowledge it is the norm. Triage: assess what is missing, determine impact, document assumptions, communicate confidence levels.
  • 'Why data analytics? Why this company?' — Do not say 'I love data.' Connect to a specific problem the company solves and how analytics drives it.

Excel and visualization questions

  • 'INDEX-MATCH vs VLOOKUP vs XLOOKUP?' — VLOOKUP can only return values from columns to the right of the lookup column. INDEX-MATCH works in either direction. XLOOKUP (2026 standard) replaces both: it looks up in any direction and handles missing matches natively through its if_not_found argument.
  • 'How would you handle 500K+ rows in Excel?' — This is a trap. The correct answer acknowledges Excel's limits (a worksheet caps at 1,048,576 rows and slows down well before that): use Power Query for ETL, Power Pivot for data modeling, or pivot to SQL/Python.
  • 'Which chart for user retention over 12 months?' — Line chart with cohorts as separate lines. NOT a bar chart. Add reference lines for targets.
  • 'Redesign a cluttered dashboard with 15 charts?' — 5-second rule: can the viewer grasp the key message instantly? Reduce to 4-6 charts. Lead with KPI, support with trends, detail on demand via filters.
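  • 'Explain LOD expressions or DAX measures.' — Tableau: {FIXED [Customer] : SUM([Sales])} calculates per customer regardless of the dimensions in the view, and is evaluated before ordinary dimension filters (though after context filters). Power BI: CALCULATE with ALL removes filter context from the specified table or columns.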

9 mistakes that get candidates rejected

  • Coding in silence — interviews are collaborative. Narrate your thought process while writing SQL/Python. Silent typing is the #1 rejection reason.
  • Jumping to code before clarifying — always ask: what time range? How is the metric defined? Are there data quality issues?
  • Confusing statistical significance with practical significance — p < 0.05 does not mean the feature matters if the effect size is trivial.
  • Unstructured case study answers — framework first (clarify, decompose, analyze, recommend), then execute.
  • Saying 'I used advanced Excel' without specifics — pair every skill claim with a concrete example.
  • Not asking questions at the end — ask about team structure, data stack, how analysts influence roadmap.
  • Overlong answers — optimal response length is 1-2 minutes. Rambling for 5 minutes loses the interviewer.
  • Getting defensive when challenged — when an interviewer pushes back, they are testing adaptability. Acknowledge, adjust, show flexibility.
  • Ignoring data quality in take-homes — submitting analysis without mentioning missing values or assumptions signals you would ship unreliable insights.