Practice real-world cases with AI. See where you can grow.

Every case is a real business problem with a real dataset. Work it in an AI-integrated notebook — explore, model, recommend — exactly the way a data scientist works. Then get a coach-style report on where you stand and what to sharpen next.

Expert-crafted cases

Real-world data cases across every industry.

Advanced
FINANCE
Suspicious Transaction Flagging
Anomaly Detection · Precision vs Recall · Cost Analysis
Stripe · PayPal · Square
Practice this case →
Advanced
FINANCE
Payment Decline Rate Diagnosis
Metric Decomposition · Root Cause Analysis · Time Series
Stripe · Adyen · Shopify
Practice this case →
Advanced
E-COMMERCE
Cart Abandonment Root Cause
Funnel Analysis · Session Behavior · Device Segmentation
Shopify · Wayfair · Chewy
Practice this case →
Advanced
E-COMMERCE
Dynamic Pricing Pilot Assessment
A/B Test Evaluation · Diff-in-Diff · Price Elasticity
Amazon · Uber · Instacart
Practice this case →
Advanced
PRODUCT & TECH
Free-to-Paid Conversion Optimization
Funnel Analysis · Behavioral Features · Predictive Signals
Slack · Dropbox · Notion
Practice this case →
Advanced
PRODUCT & TECH
Pricing Tier Restructuring
Segmentation · Willingness to Pay · Usage Analysis
Twilio · HubSpot · Atlassian
Practice this case →
Advanced
ADTECH & MARKETING
Referral Program A/B Test
Growth Metrics · Experiment Design · Network Effects
Uber · Lyft · DoorDash
Practice this case →
Advanced
HEALTHCARE
Claims Denial Pattern Analysis
Pattern Mining · Root Cause Analysis · Process Optimization
Optum · Humana · Cigna
Practice this case →
Foundational
FINANCE
Credit Card Application Approval
Classification · Feature Importance · Class Imbalance
JPMorgan · Capital One · Goldman Sachs
Practice this case →
Foundational
E-COMMERCE
Checkout Flow Drop-off
Funnel Analysis · Device Segmentation
Shopify · Amazon · Etsy
Practice this case →
Foundational
E-COMMERCE
Homepage Banner A/B Test
A/B Testing · Conversion Metrics
Amazon · Shopify · Target
Practice this case →
Foundational
E-COMMERCE
Black Friday Discount True Impact
Causal Reasoning · Pre/Post Analysis
Amazon · Target · Walmart
Practice this case →
Foundational
PRODUCT & TECH
Support Ticket Volume Spike
Metric Decomposition · Root Cause Analysis
Slack · Notion · Atlassian
Practice this case →
Foundational
PRODUCT & TECH
Push Notification Timing Test
A/B Testing · Engagement Metrics
Slack · Dropbox · Notion
Practice this case →
Foundational
ADTECH & MARKETING
Paid Search vs SEO Credit
Attribution · Channel Analysis
Google · Meta · HubSpot
Practice this case →
Foundational
INSURANCE
Insurance Quote Completion Funnel
Funnel Analysis · Segmentation
Progressive · Geico · Allstate
Practice this case →
01
40 min

A real case, not a quiz.

Practice in an AI-integrated notebook — the same shape of work a data scientist does on the job.

02
3 dimensions

Deliverable · AI collaboration · Domain expertise.

The 3 skills that matter most for data jobs in the AI era. Learn how good you are at each — and exactly how to grow.

03
A detailed report

Built to help you grow.

Every report is a coach-style diagnosis you can act on — so each case leaves you better prepared for real-world data jobs and interviews.

What you get from practicing cases

Not a score.
A diagnosis.

Other platforms let you memorize the answers to interview questions — none of them help you truly understand the knowledge. Hands-on analysis practice is the best way to actually learn it.

  • 3-dimension breakdown — Final Deliverable + AI Collaboration + Domain Expertise. These are the skills that get you hired in the AI era.
  • An overall performance summary calibrated to your level — what your work felt like, what you reached, what's next.
  • Cell-level and prompt-level evaluation on your work — see exactly where you performed well or fell short.
  • A detailed 4-week plan plus 10 follow-up questions with answers you can practice against.
Start practicing →
Sample Report
1
Performance overview
7.1/10
Overall
Final Deliverable
Strong
AI Collaboration
On track
Domain & Analytical Expertise
On track
Summary

Your investigation moved methodically through the data and ended with a clear, defensible recommendation. You found the surface story cleanly but stopped short of the deeper subgroup pattern that separates senior from expert work — the next step is hypothesis-first framing, naming what you expect to find before you query.

WHAT YOU DID WELL
Engaged with a serious credit-risk problem
Chose to take on an advanced case involving imbalanced classification and behavioral vs. origination signals — a non-trivial framing.
Synthesized the metrics into a real recommendation
You didn't stop at scattered numbers — you committed to a recommendation a VP of Credit Risk could act on.
Recognized this is a domain-judgment case
Walking away rather than guessing suggests awareness that credit risk recommendations need real grounding, not a plausible-sounding narrative.
WORTH A CLOSER LOOK
A sharper path is hypothesis-first framing before any code
Even 60 seconds writing 'I expect behavioral signals to dominate origination signals' would have anchored the entire 30 minutes.
Reviewers will push on the behavioral vs. origination distinction
Denise needs to know which signals are real-time distress vs. stale risk profile — this split is the spine of the recommendation.
The habit that separates senior from expert is finishing the loop
Even a rough segment-level default rate table beats a blank notebook — partial evidence with calibrated language is the senior move.
2
Case deep dive

Basic analysis leads to a surface story, but with domain expertise and good use of AI, we can find a deeper story that brings more valuable insights.

CASE RECAP

Horizon Consumer Finance has noticed that too many loans are reaching serious delinquency (90+ days past due) before anyone intervenes. Leadership wants an early warning system that flags at-risk loans while there is still time to offer payment relief or restructuring.

THE BUSINESS QUESTION

"Which loans in Horizon's active portfolio are most likely to default in the next 6 months, and what signals should the collections team use to decide who to contact first?"

— Denise Kowalski, VP of Credit Risk

THE SURFACE STORY
WHAT BASIC ANALYSIS SHOWS

A straightforward analysis builds a predictive model on the full dataset and finds that credit score at origination and loan amount are the top two factors associated with default. Borrowers who had lower credit scores when they took out their loan and borrowers with larger loan balances are more likely to default.

THE NAIVE RECOMMENDATION

Flag loans from borrowers who had lower credit scores at origination and carry larger balances for early outreach. Build a simple risk score and use it to rank the portfolio for the collections team.

WHY IT'S INCOMPLETE

This approach has two significant problems. First, it leans on origination features — information that was true when the loan was made but may no longer reflect the borrower's situation today.

THE DEEPER STORY
THE REAL MECHANISM

An expert candidate recognizes that the most important question is not 'who looked risky when they got the loan?' but 'who is showing signs of financial stress right now?' They identify that current credit utilization above 0.80 — meaning a borrower is nearly maxed out on their revolving credit — and a missed payment in the last 6 months are the two strongest individual predictors of near-term default, each roughly 3–5× stronger than credit score at origination.

3
Evaluation detail

In the age of AI, strong analytical work rests on three things — and your report scores each one honestly so you know exactly where to grow.

3-1
Final Deliverable

Did the notebook and your recommendation solve the business problem at the right depth?

Strong

You delivered the surface story cleanly and committed to a recommendation a VP could act on — the next step up is operationalizing the deeper subgroup pattern.

What moved this · 3 observations
Helped · Summary findings and recommendation

Synthesized scattered metrics into a clear recommendation a VP of Credit Risk could act on.

Helped · cell #6 at min 14

Opened with default rate by credit-score band — the correct first segmentation, reached early.

Worth a closer look · cell #11 at min 26

Stopped before segmenting by current utilization and recent missed payments, where the deeper signal lives.

3-2
AI Collaboration

How well you directed the AI and judged what it handed back.

On track

Your prompts gave the AI enough context to be useful from the first message; sharpening them with a hypothesis is the next gear.

What moved this · 2 observations
Helped · msg #2 at min 3

Prompts named the task and the dataset, so the AI was useful from the first message — no blank-page stall.

Worth a closer look · msg #7 at min 17

Prompts stayed task-shaped; leading with a hypothesis would turn the AI from a tool into a thinking partner.

3-3
Domain & Analytical Expertise

The analytical skill and domain knowledge you brought in your own work.

On track

You showed solid command of the fundamentals; the credit-risk nuance this case rewards is within reach with one more pass.

What moved this · 2 observations
Helped · Quiz Q2

Named the ~7% default rate and leaned toward a recall-oriented framing — the right instinct for an early-warning system.

Worth a closer look · Summary findings and recommendation

Current utilization (a live distress signal) and credit score at origination (stale) were not separated.

4
Next steps
YOUR 4-WEEK ACTION PLAN
WEEK 1
Retry with a 5-minute framing ritual
Targets: Final Deliverable
  1. Retry this case. Before touching the data, spend 5 minutes writing the decision Denise needs to make, two competing hypotheses about what drives default, and the one table you'd want to see first.
  2. Force yourself to produce a default-rate-by-segment table within the first 10 minutes: utilization bands × recent-missed-payment recency. Don't optimize, just ship it.
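A default-rate-by-segment table of this kind can be sketched in a few lines of plain Python. This is a minimal illustration, not the platform's method: the six loans are made-up rows, the `utilization` and `defaulted` field names are hypothetical, and the 0.50 band edge is an assumption (only the 0.80 threshold, the 6-month window, and `months_since_last_missed` come from the report).

```python
from collections import defaultdict

def utilization_band(utilization):
    """Bucket utilization; 0.80 is the report's threshold, 0.50 is an assumed extra cut."""
    if utilization > 0.80:
        return "high (>0.80)"
    if utilization > 0.50:
        return "mid (0.50-0.80)"
    return "low (<=0.50)"

def default_rate_by_segment(loans):
    """Return {(utilization band, missed in last 6 months): default rate}."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [defaults, loans]
    for loan in loans:
        months = loan["months_since_last_missed"]  # None means no missed payment on record
        missed_recently = months is not None and months <= 6
        segment = (utilization_band(loan["utilization"]), missed_recently)
        counts[segment][0] += loan["defaulted"]
        counts[segment][1] += 1
    return {segment: d / n for segment, (d, n) in counts.items()}

# Hypothetical rows for illustration only.
loans = [
    {"utilization": 0.92, "months_since_last_missed": 2, "defaulted": 1},
    {"utilization": 0.85, "months_since_last_missed": 4, "defaulted": 0},
    {"utilization": 0.95, "months_since_last_missed": 1, "defaulted": 1},
    {"utilization": 0.88, "months_since_last_missed": None, "defaulted": 0},
    {"utilization": 0.60, "months_since_last_missed": 3, "defaulted": 0},
    {"utilization": 0.30, "months_since_last_missed": None, "defaulted": 0},
]

rates = default_rate_by_segment(loans)
for segment, rate in sorted(rates.items()):
    print(segment, f"{rate:.0%}")
```

With real data the same grouping is one `groupby` in a dataframe library; the point is to get the rough table on screen early.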
WEEK 2
Behavioral vs. static signals in consumer credit
Targets: Domain & Analytical Expertise
  1. Re-read the Deeper Story and the deeper-tagged data patterns in this report. Write down, in your own words, why utilization at 0.80+ is a different KIND of signal than credit_score_at_origination.
  2. Pick another credit, lending, or churn case on the platform. Before opening it, list which features you expect to be 'stale at origination' vs. 'live distress signals', then check yourself against the data.
WEEK 3
Hypothesis-led AI prompting
Targets: AI Collaboration
  1. On your next case attempt, open with a prompt of the form: 'My hypothesis is X because Y. Help me design the cut that would falsify it.' Never start with 'analyze this dataset.'
  2. After every AI response, write one sentence: 'What did this confirm, what did it leave open, what's my next question?' before sending the next prompt.
WEEK 4
Calibrated language under time pressure
Targets: All three
  1. Take any practice case with a 15-minute self-imposed timer. Force yourself to ship one chart, one segment cut, and one calibrated recommendation, even if rough. Build the muscle of finishing the loop.
  2. Re-read your own Section 2 answer afterward and rewrite it as one sentence addressed to a hiring manager. Cut every hedge that isn't earned.
10 practice questions to sharpen on cases like this


01
Denise asks: 'Why shouldn't I just sort the portfolio by credit score and have collections call the bottom decile?' How do you respond?
MODEL ANSWER

Credit score at origination is a snapshot from when the loan was made — it doesn't reflect the borrower's current financial state. A borrower who looked prime at origination but is now near max utilization with a recent missed payment is far higher risk than a near-prime borrower paying steadily.

02
What's the single most actionable signal you'd put in front of the collections team tomorrow morning, and why that one?
MODEL ANSWER

Current revolving utilization combined with recent missed payment recency. Utilization above 0.80 plus a missed payment in the last 6 months is the segment with the highest concentration of imminent defaults.

03
The default rate is roughly 7%. How does that shape your analytical approach?
MODEL ANSWER

It means accuracy is a useless metric: predicting 'no default' for everyone scores 93%. I'd anchor on recall in the top-risk segments and frame the output as a ranked list rather than a binary classifier.
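The accuracy point is easy to verify by hand. The sketch below assumes a hypothetical portfolio of 1,000 loans with exactly 70 defaults (7%) and scores a "model" that predicts no default for every loan.

```python
def accuracy(y_true, y_pred):
    """Share of loans where the prediction matches the outcome."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Share of actual defaults that the prediction caught."""
    positives = sum(y_true)
    caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return caught / positives if positives else 0.0

# Hypothetical portfolio: 70 defaults (1) and 930 non-defaults (0).
y_true = [1] * 70 + [0] * 930
always_no_default = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, always_no_default)
baseline_recall = recall(y_true, always_no_default)
print(f"accuracy={baseline_accuracy:.2f}, recall={baseline_recall:.2f}")
```

The do-nothing baseline reaches 93% accuracy while catching none of the defaults, which is why recall on the flagged segment is the more honest yardstick here.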

04
How would you decide where to set the cutoff between 'call this borrower' and 'leave them alone' on the ranked list?
MODEL ANSWER

Work backwards from collections capacity. If the team can handle 200 calls a week, the cutoff is whatever score puts the top 200 borrowers in the queue. Then track whether those 200 actually reduce next-month delinquency vs. a holdout — that's your ROI signal.
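Working backwards from capacity amounts to a top-k selection. A minimal sketch, assuming risk scores already exist per loan; the loan IDs, scores, and the `capacity_cutoff` helper are hypothetical, and a capacity of 3 stands in for the 200 calls a week.

```python
def capacity_cutoff(scores, capacity):
    """Return (call queue, score cutoff) for the `capacity` highest-scoring loans.

    `scores` maps loan id -> risk score (higher means riskier).
    Ties at the cutoff are broken by input order; a real queue would need a tie rule.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    selected = ranked[:capacity]
    cutoff = selected[-1][1] if selected else None
    return [loan_id for loan_id, _ in selected], cutoff

# Hypothetical scores for illustration only.
scores = {"L001": 0.91, "L002": 0.12, "L003": 0.67, "L004": 0.45, "L005": 0.78}
queue, cutoff = capacity_cutoff(scores, capacity=3)
print(queue, cutoff)
```

The cutoff is an output of capacity, not a tuned threshold, so it moves whenever the team size or the score distribution changes.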

05
What's the risk of basing the model on behavioral signals like utilization?
MODEL ANSWER

They're more volatile, so the score moves week-to-week and a borrower can flip in and out of the high-risk segment. That's actually a feature for an early-warning system — but it means you need a lookback window discipline, otherwise you'll over-call borrowers who briefly spiked and recovered.

06
The dataset has both 'days_past_due' and 'months_since_last_missed.' Which would you trust more for prediction?
MODEL ANSWER

months_since_last_missed, because it captures a pattern of behavior; days_past_due is a real-time state variable, and by the time it is nonzero the borrower is already late. I'd use both: recency for ranking, days past due for triage urgency within the ranked list.

07
If you had to ship something tomorrow with no model, just rules, what's the simplest decision rule you'd give Denise?
MODEL ANSWER

If utilization > 0.80 AND any missed payment in the last 6 months, call. That single rule captures most of the high-risk segment and is explainable in one sentence — which Denise needs for compliance review anyway.
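Written as code, the rule is a single boolean expression. In this sketch the `should_call` function name is hypothetical, and treating `None` as "no missed payment on record" is an assumption about how the data encodes it.

```python
def should_call(utilization, months_since_last_missed):
    """Rule from the answer above: utilization > 0.80 AND a missed payment in the last 6 months."""
    missed_recently = (
        months_since_last_missed is not None and months_since_last_missed <= 6
    )
    return utilization > 0.80 and missed_recently

print(should_call(0.85, 2))     # high utilization, recent miss
print(should_call(0.85, None))  # high utilization, no miss on record
print(should_call(0.60, 2))     # recent miss, moderate utilization
```

Both conditions must hold, so a borrower at exactly 0.80 utilization or with a miss seven months ago is not flagged.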

08
How would you talk about model uncertainty to a non-technical VP?
MODEL ANSWER

I'd anchor on outcomes: 'Of the 200 borrowers we flag, we expect roughly 30-40 to actually default in the next 6 months. That's roughly 2-3× the ~7% base rate. The other 160-170 are time spent on conversations that may or may not need to happen.' Numbers, not confidence intervals.

09
What kind of validation would convince you the model is doing what you think?
MODEL ANSWER

A holdout test where the top decile of predicted defaults actually shows 4-5× the default rate of the bottom decile six months later. Plus a fairness check — the score isn't proxying for protected attributes via correlated features like ZIP code.
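The decile comparison can be sketched as follows. The 100-loan holdout is synthetic and built so that 7 loans default overall; the function name and data are hypothetical and only illustrate the check.

```python
def decile_default_rates(scores, outcomes):
    """Return (top-decile default rate, bottom-decile default rate) by predicted score."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    size = len(ranked) // 10
    top = ranked[:size]
    bottom = ranked[-size:]
    top_rate = sum(outcome for _, outcome in top) / size
    bottom_rate = sum(outcome for _, outcome in bottom) / size
    return top_rate, bottom_rate

# Synthetic holdout: loan i has score i/100; the listed loans defaulted six months later.
defaulted_ids = {99, 97, 95, 92, 60, 41, 3}
scores = [i / 100 for i in range(100)]
outcomes = [1 if i in defaulted_ids else 0 for i in range(100)]

top_rate, bottom_rate = decile_default_rates(scores, outcomes)
# Guard against an empty bottom decile before taking the ratio.
ratio = top_rate / bottom_rate if bottom_rate else float("inf")
print(f"top={top_rate:.0%}, bottom={bottom_rate:.0%}, ratio={ratio:.1f}x")
```

On real data the bottom decile often has zero defaults, so comparing the top decile against the overall base rate is a more stable alternative.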

10
One year in, how would you know it's time to retrain?
MODEL ANSWER

Two signals: the score-to-default lift in the top decile drops below ~3×, or the feature distributions drift (for example, utilization shifts across the whole portfolio because of a rate cycle). Either is a flag to retrain on fresher data.
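Both signals can be monitored with a small check. The answer above does not name a drift statistic; this sketch swaps in the population stability index (PSI) as one common choice. The 0.25 PSI ceiling is an assumed rule of thumb, the band shares are made up, and only the ~3× lift floor comes from the answer.

```python
import math

def population_stability_index(expected_shares, actual_shares, floor=1e-6):
    """PSI between two binned distributions given as lists of shares that sum to 1."""
    psi = 0.0
    for expected, actual in zip(expected_shares, actual_shares):
        e = max(expected, floor)  # floor avoids log(0) for empty bins
        a = max(actual, floor)
        psi += (a - e) * math.log(a / e)
    return psi

def needs_retrain(top_decile_lift, psi, lift_floor=3.0, psi_ceiling=0.25):
    """Flag a retrain when lift falls below the floor or drift exceeds the ceiling."""
    return top_decile_lift < lift_floor or psi > psi_ceiling

# Hypothetical share of loans in low / mid / high utilization bands.
training_shares = [0.5, 0.3, 0.2]
current_shares = [0.3, 0.3, 0.4]

psi = population_stability_index(training_shares, current_shares)
print(f"psi={psi:.3f}")
print(needs_retrain(top_decile_lift=3.4, psi=psi))
print(needs_retrain(top_decile_lift=2.8, psi=psi))
```

Either trigger alone is enough, which matches the "either is a flag" framing; the thresholds themselves should be set from the team's own history rather than taken from this sketch.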

Common questions

What candidates ask.

To prevent cheating, hiring managers have changed how they interview. They go deep into the knowledge and projects a role actually needs — and they can tell right away whether you're reading AI's answers or genuinely understand the work. The only way to stand out is to learn it: practice hands-on cases, understand the knowledge deeply, and see how it's leveraged in the real world. LitMetrics is the only platform that offers that kind of practice.

Real business problems modeled on the work data analysts and scientists actually do — built on synthetic datasets engineered for genuine analytical depth. They span finance, e-commerce, healthcare, gaming, marketplaces, and more.

Yes — and we want it to stay that way. We're here to support every analyst's journey in data, and your feedback is our most valuable asset as we build.

You work in a real Jupyter-style notebook with an AI assistant built right in — the same setup a modern data scientist uses. Explore the data, write and run code, and direct the AI, all in one place.

Yes — unlimited retakes on every case. Each attempt produces a fresh report, so you can watch your judgment and your AI collaboration sharpen over time.

A coach-style diagnosis: how your work scored across three dimensions — the deliverable, your AI collaboration, and the expertise you brought — with the specific evidence behind every call, plus a personalized 4-week plan and practice questions to close the gaps.

Lead with judgment, not code. The strongest work picks a sharp hypothesis early, names the trade-offs, and uses AI deliberately — to pressure-test thinking, not to skip it. Each report gives you cell-level and prompt-level feedback on exactly where that broke down, so you know what to fix.

Start a case

30 minutes between you and a real read.