Prompting Basics

Data Analysis

By Dan Lee
Dec 20, 2025

Data Analysis with AI (That You Can Actually Trust)

AI is great at explaining charts, generating SQL, and summarizing findings—but it’s also great at sounding confident while being subtly wrong. Data analysis lives in the land of details, so your prompts need guardrails.

This guide shows how to prompt for analysis in a way that works for AI engineers and data folks, and also for non-technical teams (sales, ops, finance, marketing, execs) who want clear insights without getting buried in jargon.

The Big Idea

Don’t ask for “insights.” Ask for a method, an output format, and evidence tied to the input data.

Ask Better Questions

A strong analysis prompt states:

  • Goal: what decision are you trying to make?
  • Metric: what matters (conversion, churn, revenue, latency)?
  • Slice: which segment/time window?
  • Output shape: table, bullets, dashboard notes, SQL, etc.
  • Confidence: what’s certain vs assumed?

That last one—confidence—is what separates analysis from storytelling.
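
Put together, the checklist becomes a reusable skeleton. The bracketed fields below are placeholders to fill in, not required wording:

Text
Context: [who you are and what decision this analysis feeds]
Instruction: [the method: compare X vs. Y, compute Z, rank drivers of W]
Input Data: [tables, schemas, or pasted numbers; nothing else may be used]
Output Indicator: [the shape: table, bullets, SQL, plus an Evidence column]
Constraints: [state assumptions explicitly; say “Unknown” where the data is silent]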

Two Habits That Prevent Bad Takes

  1. Require evidence. Ask for an “Evidence” column or direct quotes from the data behind every claim.

  2. Force unknowns. Instruct the model: “If the data can’t support a claim, say ‘Unknown’ and list what’s missing.”
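
Both habits compress into a constraint you can paste into any analysis prompt:

Text
Constraints: Support every claim with specific rows, fields, or quotes from the input data. If the data can’t support a claim, say “Unknown” and list what’s missing.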

Example 1: Technical (SQL + Findings + Checks)

You want month-over-month retention analysis, and you want it reproducible.

Text
Context: You are a senior data scientist. We need a retention readout for a product review.
Instruction: Write BigQuery SQL to compute 30-day retention by signup cohort. Then summarize the results and flag data quality concerns.
Input Data:
Tables:
- users(user_id STRING, signup_ts TIMESTAMP, acquisition_channel STRING)
- events(user_id STRING, event_ts TIMESTAMP, event_name STRING)
Definitions:
- Active in day 30 window = user has >=1 event between day 30 and day 37 after signup.
Output Indicator:
1) FINAL_SQL (single query)
2) RESULTS_SUMMARY (bullets: trend, biggest cohort drop, channel differences)
3) VALIDATION_CHECKS (5 bullets: pitfalls, missing data risks, join/dup risks)
Constraints: Use only the columns provided. If a definition is ambiguous, list assumptions explicitly.

This is the analysis trifecta: compute → interpret → validate. The “use only the columns provided” constraint also blocks “phantom columns” and sloppy joins.
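
For reference, here is one shape a correct FINAL_SQL could take. This is a sketch under stated assumptions, not the model’s guaranteed output: the monthly cohort grain and the half-open [day 30, day 37) window are choices the prompt leaves ambiguous, so a good answer should list them as assumptions.

SQL
-- Sketch only: assumes monthly cohorts and a [signup+30d, signup+37d)
-- activity window, per the definition in the prompt.
WITH cohorts AS (
  SELECT
    user_id,
    signup_ts,
    DATE_TRUNC(DATE(signup_ts), MONTH) AS cohort_month
  FROM users
  -- Exclude cohorts too young to have a complete day-37 window.
  WHERE DATE(signup_ts) <= DATE_SUB(CURRENT_DATE(), INTERVAL 37 DAY)
),
retained AS (
  -- DISTINCT guards against duplicate events inflating the join.
  SELECT DISTINCT c.user_id
  FROM cohorts AS c
  JOIN events AS e
    ON e.user_id = c.user_id
   AND e.event_ts >= TIMESTAMP_ADD(c.signup_ts, INTERVAL 30 DAY)
   AND e.event_ts < TIMESTAMP_ADD(c.signup_ts, INTERVAL 37 DAY)
)
SELECT
  c.cohort_month,
  COUNT(DISTINCT c.user_id) AS cohort_size,
  COUNT(DISTINCT r.user_id) AS retained_users,
  ROUND(COUNT(DISTINCT r.user_id) / COUNT(DISTINCT c.user_id), 3) AS retention_30d
FROM cohorts AS c
LEFT JOIN retained AS r
  ON r.user_id = c.user_id
GROUP BY cohort_month
ORDER BY cohort_month;

If the model’s query differs, that’s fine; the VALIDATION_CHECKS section is where it should defend the differences.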

Example 2: Non-Technical (Sales Funnel Diagnosis)

A sales leader wants to know why conversion is down. Great—make the AI behave like an analyst.

Text
Context: I’m a sales operations manager. We’re trying to diagnose a conversion dip without blaming individuals.
Instruction: Analyze the funnel table and identify the most likely drivers. Provide 3 hypotheses and what data would confirm each.
Input Data:
Funnel (last 2 months):
- Visitors: 120,000 -> 118,000
- Trial signups: 6,000 -> 4,500
- Activated (did key action): 3,300 -> 2,100
- Paid conversions: 660 -> 630
Notes: Website redesign launched mid-month 2. Trial onboarding email sequence was updated week 3 of month 2.
Output Indicator:
- Provide a table: Stage | Change | Likely Cause | Evidence | Next Test
- Then provide 3 experiments (max 1 sentence each) to validate top hypotheses
Constraints: Avoid jargon. Do not invent numbers beyond what’s provided.

This keeps the model grounded: it can hypothesize, but it must propose tests instead of “declaring truth.”
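
Before trusting any narrative, you can also sanity-check the hypotheses with simple arithmetic. Here is a quick query in the same BigQuery dialect as Example 1, with the funnel numbers inlined; the stage-to-stage conversion rates show where the leak actually is:

SQL
-- Stage-to-stage conversion rates, month 1 vs. month 2,
-- using only the numbers given in the prompt.
WITH funnel AS (
  SELECT 1 AS step, 'Visitors' AS stage, 120000 AS m1, 118000 AS m2 UNION ALL
  SELECT 2, 'Trial signups', 6000, 4500 UNION ALL
  SELECT 3, 'Activated', 3300, 2100 UNION ALL
  SELECT 4, 'Paid conversions', 660, 630
)
SELECT
  stage,
  ROUND(100 * m1 / LAG(m1) OVER (ORDER BY step), 1) AS m1_rate_pct,
  ROUND(100 * m2 / LAG(m2) OVER (ORDER BY step), 1) AS m2_rate_pct
FROM funnel
ORDER BY step;
-- Trial signup rate: 5.0% -> 3.8%. Activation rate: 55.0% -> 46.7%.
-- Paid rate: 20.0% -> 30.0%. The drop is upstream of payment, which is
-- consistent with the redesign and onboarding-email timing in the notes.

Note that the activated-to-paid rate actually improved; that is exactly the kind of non-obvious evidence the Evidence column should surface.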

Make It Audit-Friendly

Add “Evidence” and “Next Test” columns to every output format you request. If the model can’t point to the data and propose a check, it’s probably guessing.

Takeaway

AI can accelerate data analysis—if you treat it like an analyst-in-training, not an oracle. Give it clear goals, explicit definitions, and structured outputs. Require evidence, force unknowns, and add a validation step. Do that, and the model won’t just generate insights—it’ll produce analysis you can defend in a meeting.

Dan Lee

DataInterview Founder (Ex-Google)

Dan Lee is an AI tech lead with 10+ years of industry experience across data engineering, machine learning, and applied AI. He founded DataInterview and previously worked as an engineer at Google.