Side project · AI diagramming

From messy AI outputs to clean, editable diagrams.

Bhava is my nights-and-weekends AI tool built with one part-time engineer. I lead everything from product discovery to shipped experiments—driving 60% activation, 5% landing conversion, $150–200 MRR, and −40% cost per diagram in four weeks.

Prompt strategy · Trust signals · Pricing & retention · AI evals

Product discovery

Mapped workflows, ran 8 interviews, and sized activation gaps before touching UI.

Trust-led UX

Turned the prompt into a guided demo, added progress states, and rebuilt onboarding.

Growth experiments

Removed free mode, shipped usage-based pricing, and tracked retention + MRR weekly.

AI evaluation

Logged 100+ failed diagrams, clustered errors, and guided sub-agent strategy to lift accuracy.

60%

Activation rate

1,600+

Early users

25

Paying teams

4

Renewals secured

Bhava hero collage interface preview
Context

Why we built this

I'm a product designer at an ad-tech startup. By day, I'm deep in B2B dashboards. By night, I watch my team waste hours redrawing the same system diagram in Figma, Draw.io, Excalidraw, and Miro.

Same workflow. Four different tools. Different versions. Complete chaos.

So I started building Bhava—an AI tool that generates diagrams instantly. But more importantly, one that doesn't feel like a black box.

This is early stage. We're 4 weeks post-launch with ~1,500 users and $150–200 MRR. I work on this part-time alongside my full-time job. One engineer friend helps part-time. Between us, I handle design, product, evals, UI fixes, pricing experiments, and customer interviews. He handles optimization and infrastructure.

This is the story of how we went from a fuzzy idea to 60% activation—and what I learned about building AI products people actually trust.

The problem

Broken workflows, broken trust

🔀

Diagramming was fragmented

  • Engineers used Draw.io
  • Designers used Figma
  • PMs used Miro
  • Same diagram, 4 places, 30+ minutes each time
🤔

AI tools fell short

  • Vague prompts → broken diagrams
  • No feedback while AI "thought" → mistrust
  • Failures with zero explanation
The activation problem: In early tests, only 38% of users created their first diagram. The other 62% bounced without trying.

Our bet: Build on top of Draw.io (largest user base) and make AI feel reliable, not random.

Approach

Designing for trust

Every design decision mapped back to a trust framework drawn from research on trust in AI.

01

Ability

Can the AI actually do the task?

02

Benevolence

Does it feel like it's helping me?

03

Integrity

Is it honest about what it can and can't do?

04

Reliability

Does it work consistently?

Discovery

Understanding the drop-off

Before redesigning anything, I spent 2 weeks analyzing user behavior—watching session recordings, tracking prompts, and interviewing people who churned.

01

Blank canvas paralysis

Users landed on an empty editor with no guidance, no examples. They froze.

02

Mode confusion

"Intelligent" vs "Basic" results varied wildly. Trust eroded fast.

03

No progress feedback

3–8 seconds of spinner. No updates. Pure anxiety.

04

Hidden export

Only 15% exported their first diagram. The happy path was invisible.

"I don't know what to type, so I just close the tab."

— Product manager, B2B SaaS
What I shipped

Six experiments that moved metrics

Each redesign tackled a specific trust or activation gap. Here's what worked.

Homepage prompt with guided examples
Experiment 01

Homepage prompt became the product demo

Problem: Vague CTAs meant visitors signed up without understanding what to type.

Solution: Put a large prompt box front and center with example chips and a mini walkthrough, so visitors could preview the experience before creating an account.

1% → 5% Landing conversion (+4pp)
45s → 12s Time-to-first-prompt
Guided onboarding cards
Experiment 02

Guided prompt experience replaced the blank chat

Problem: New users froze on an empty chat and churned without generating anything.

Solution: Added diagram-type cards, contextual hints, and a three-step progress indicator that nudges people into action.

38% → 60% Activation (+22pp)
+18% Prompt quality scores
Premium generation screen
Experiment 03

Removing free mode protected trust

Problem: The legacy "Basic" mode produced low-quality diagrams that tanked perceived reliability.

Solution: Sunset the free mode, offered one premium try, and introduced usage-gated access to keep output quality consistent.

30% Day-7 retention (stabilized)
−55% "Bad diagrams" support tickets
Usage-based pricing screen
Experiment 04

Usage-based pricing matched value to spend

Problem: Unlimited $10/month plans were unprofitable and encouraged abuse.

Solution: Swapped to a $10 base plan with transparent credit packs and real-time usage tracking.

$30 First enterprise add-on purchase
−22% → +14% Contribution margin flip
Usage dashboard
Experiment 05

Usage clarity reduced support debt

Problem: Pricing changes created confusion—users couldn't tell where credits went.

Solution: Built an always-available tutorial and a usage dashboard detailing credits, modes, and expiry (sketched below).

−60% Billing questions
+1.2 pts NPS on transparency
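
To make the credit mechanics behind experiments 04 and 05 concrete, here is a minimal sketch of how a usage ledger like this could be modeled. The type names, credit costs, and expiry rules are illustrative assumptions, not Bhava's production code.

```ts
// Minimal sketch of a credit ledger for usage-based pricing.
// Names, credit costs, and expiry rules are illustrative assumptions.

type CreditGrant = {
  credits: number;                      // credits remaining on this grant
  expiresAt: Date;                      // credit packs expire rather than rolling over forever
  source: "base-plan" | "credit-pack";
};

type GenerationEvent = {
  userId: string;
  diagramType: string;
  creditsSpent: number;
  at: Date;
};

class CreditLedger {
  private grants = new Map<string, CreditGrant[]>();
  private events: GenerationEvent[] = [];

  grant(userId: string, grant: CreditGrant) {
    const existing = this.grants.get(userId) ?? [];
    this.grants.set(userId, [...existing, grant]);
  }

  // Spend credits oldest-expiry-first; refuse if the balance is insufficient.
  spend(userId: string, diagramType: string, cost: number, now = new Date()) {
    const active = (this.grants.get(userId) ?? [])
      .filter((g) => g.expiresAt > now && g.credits > 0)
      .sort((a, b) => a.expiresAt.getTime() - b.expiresAt.getTime());

    const balance = active.reduce((sum, g) => sum + g.credits, 0);
    if (balance < cost) throw new Error("Insufficient credits");

    let remaining = cost;
    for (const g of active) {
      const take = Math.min(g.credits, remaining);
      g.credits -= take;
      remaining -= take;
      if (remaining === 0) break;
    }
    this.events.push({ userId, diagramType, creditsSpent: cost, at: now });
  }

  // Powers the usage dashboard: current balance plus a per-type breakdown.
  usageSummary(userId: string, now = new Date()) {
    const balance = (this.grants.get(userId) ?? [])
      .filter((g) => g.expiresAt > now)
      .reduce((sum, g) => sum + g.credits, 0);
    const byType: Record<string, number> = {};
    for (const e of this.events.filter((ev) => ev.userId === userId)) {
      byType[e.diagramType] = (byType[e.diagramType] ?? 0) + e.creditsSpent;
    }
    return { balance, byType };
  }
}
```

A real system would persist events and reconcile against billing, but even this shape answers the question that drove most of the support tickets: where did my credits go?
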
Manual evaluation log
Experiment 06

Manual evals powered sub-agent quality

Problem: Diagram quality varied by type, and we had no clear view of the failure patterns.

Solution: Logged ~100 failed diagrams, clustered errors, and routed high-volume types through specialized sub-agents (sketched below).

+70% Flowchart success rate
Flat costs (thanks to caching)
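
For readers who want the mechanics, here is a rough sketch of the routing idea behind this experiment: failure clusters from the manual eval log decide which diagram types earn a specialized sub-agent prompt, while a shared system prefix stays identical across requests so a provider-side prompt cache can reuse it. All names, prompts, and thresholds here are hypothetical, not the production code.

```ts
// Sketch of eval-informed routing to specialized sub-agents.
// Prompt text, thresholds, and cluster labels are hypothetical.

type DiagramType = "flowchart" | "sequence" | "er" | "architecture" | "other";

type EvalRecord = {
  prompt: string;
  diagramType: DiagramType;
  failureCluster?: "wrong-shape" | "missing-edges" | "bad-layout" | "invalid-xml";
};

// Shared system prefix: kept byte-identical across requests so a
// provider-side prompt cache can reuse it and keep marginal cost flat.
const SHARED_PREFIX = "You generate Draw.io-compatible diagrams as XML...";

// Specialized instructions for the diagram types that failed most often
// in the manual eval log; everything else falls back to a generalist prompt.
const SUB_AGENT_PROMPTS: Partial<Record<DiagramType, string>> = {
  flowchart: "Focus on decision nodes, edge labels, and top-to-bottom layout.",
  sequence: "Preserve actor ordering and message direction.",
};

function pickSubAgent(diagramType: DiagramType): string {
  const specialized = SUB_AGENT_PROMPTS[diagramType];
  return specialized
    ? `${SHARED_PREFIX}\n\n${specialized}`
    : `${SHARED_PREFIX}\n\nProduce the simplest correct diagram.`;
}

// Failure clusters from logged evals show which types deserve a sub-agent next.
function topFailureClusters(log: EvalRecord[], minCount = 5) {
  const counts = new Map<string, number>();
  for (const r of log) {
    if (!r.failureCluster) continue;
    const key = `${r.diagramType}:${r.failureCluster}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, n]) => n >= minCount)
    .sort((a, b) => b[1] - a[1]);
}
```

The code is not the point; the loop is: log failures, cluster them, and only add a sub-agent where a cluster is large enough to justify the extra prompt complexity.
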
4 weeks post-launch

Current metrics

A snapshot of where things stand after the first month of shipping.

Activation

60%

Sign up → First diagram

+22pp from 38%

Landing conversion

5%

Visitors → Sign ups

+4pp from 1%

Cost per diagram

$0.048

After prompt caching

−40% reduction

MRR

$150–200

~30 paying customers

First month baseline

Day-7 retention

30%

Active after one week

Stabilized post-pricing shift

Generation speed

3.2s

p50 latency

7.8s at p95
Reflection

What broke, what worked

Failures

  • Template gallery: zero usage, removed quickly
  • Three free tries: people rotated throwaway emails to reset them, costs spiked
  • Basic mode: poor quality eroded trust
  • Five-step onboarding: too long, trimmed to three

Wins

  • Homepage prompt lifted conversion 4pp
  • Progress indicators stopped mid-gen refreshes
  • Caching reduced costs by 40%
  • Quality-first approach kept users returning

What I'd do differently

  • Set up analytics on day one (not week two)
  • Start manual evals earlier to catch quality issues
  • Test pricing experiments faster—waiting cost us margin
What's next

Future bets

Shipping soon (2–4 weeks)

  • "Explain this diagram" overlays
  • One-click export to PNG, SVG, PDF
  • Diagram versioning

Exploring

  • Team collaboration (comments, shared workspaces)
  • API access (10 inbound requests so far)
  • Template starter packs

Said no to

  • Custom branding (low demand)
  • Slack/Notion integrations (unclear ROI)
  • White-label offering (not before $2K MRR)

12-month thesis: AI will replace 30–40% of manual diagramming. Winners will prioritize speed, transparency, and trust. Draw.io integration gives distribution. Usage-based pricing aligns incentives.

Honesty

Risks I'm owning

  • Four weeks is too early to claim product-market fit
  • Need to track day-30 and day-60 retention before declaring success
  • Export rate stuck at 15%—my next focus area
  • Eval rubric is v1; still manually labeled and somewhat subjective
  • Some metrics estimated; refining instrumentation
  • Pricing tests ongoing; willingness-to-pay still forming

Want to dive deeper?

Let's chat about designing trustworthy AI, running growth experiments, or how I can bring this playbook to your team.