The Challenge
Kreebo is an AI-powered app designed to encourage children to read and create their own stories. On the surface, it has strong educational potential. But potential alone doesn't make a product usable — and nobody had systematically asked where the experience was breaking down, or why.
My goal was to answer exactly that: run a structured, multi-method UX investigation and turn the findings into evidence that could drive real product decisions.
My Approach
Rather than jumping straight into user testing, I started by understanding the ecosystem. I mapped eight stakeholder groups and traced how their needs and constraints interact.
Key Insight from Ecosystem Mapping
Two structural tensions surfaced immediately: Teachers needed personalized assignments, but the app's monolingual design excluded bilingual families. And School Administrators needed content consistency, but the AI's generative nature made that impossible to guarantee. These weren't user complaints — they were design-level contradictions embedded in the product architecture.
From there, I built a learner journey map to pinpoint where those tensions would surface, then ran usability testing with four parents to check the predictions against real behavior.
What I Found
Using Nielsen's Heuristics as my diagnostic framework, I identified four key findings — but the most important insight was the pattern connecting them:
The System Was Silent When It Broke
When the microphone failed, the app showed nothing. All 4 participants were stranded: a failure of Nielsen's Heuristic 1 (visibility of system status) that made the product's most creative feature completely inaccessible.
The AI Was Doing Too Much, Too Fast
Excessive questions during story generation caused 3 of 4 participants to disengage. The AI's flexibility became a burden — no shortcut, no escape.
A Mandatory Task Undermined the Core Experience
A required post-story English summary — deemed too hard for children — generated the highest negative sentiment in the study. A moment of accomplishment became a frustrating obligation.
The Reading Phase Was a Clear Win
27 positive codes, zero negatives. The contrast showed that the core reading experience works; the friction sits in the steps surrounding it.
The Insight That Tied It Together
The WCAG 2.2 audit revealed a critical failure of Success Criterion 3.3.1 (Error Identification): no text-based error descriptions, no ARIA labels, only visual cues.
Triangulation Moment
This single accessibility failure directly explained the microphone crisis. Qualitative behavior (100% stranded), quantitative data (10 "Confused" codes), and the audit all pointed to the same root cause. That's the value of triangulation: it turns observations into evidence.
What I Recommended
- Fix error communication first. Text-based error messages plus ARIA labels resolve the accessibility failure and the mic crisis in one move, making this the highest-leverage fix available.
- Reduce AI friction with defaults. "Quick Generate" templates let users bypass the prompt sequence without removing flexibility.
- Rethink the Summary Task. Replace the mandatory writing task with a simple comprehension check — preserving the learning goal without punishing users for finishing a story.
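
To make the first recommendation concrete, here is a minimal sketch of a text-based, screen-reader-announced microphone error. Everything in it is an assumption for illustration: the error codes, the message copy, the `mic-error` id, and the function names are hypothetical, since the study did not examine Kreebo's actual code. It only shows the general pattern of pairing a plain-language message with `role="alert"` and linking it to the control through `aria-describedby`.

```typescript
// Hypothetical failure modes; the case study does not list Kreebo's actual ones.
type MicErrorCode = "permission-denied" | "no-device" | "unknown";

interface MicErrorCopy {
  what: string; // plain-language description of the failure
  next: string; // what the user can do about it
}

// Illustrative copy only, not the product's real wording.
const MIC_ERROR_COPY: Record<MicErrorCode, MicErrorCopy> = {
  "permission-denied": {
    what: "The microphone is blocked.",
    next: "Allow microphone access in your device settings, then try again.",
  },
  "no-device": {
    what: "No microphone was found.",
    next: "Connect a microphone, then try again.",
  },
  unknown: {
    what: "The microphone stopped working.",
    next: "Please try again.",
  },
};

// Escape text before placing it inside HTML markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the error message as text, marked up so assistive technology
// announces it (role="alert") instead of relying on a visual cue.
function renderMicError(code: MicErrorCode): string {
  const { what, next } = MIC_ERROR_COPY[code];
  return `<p id="mic-error" role="alert">${escapeHtml(what)} ${escapeHtml(next)}</p>`;
}

// Build the record button with an accessible name (icon markup omitted).
// When an error is active, the button points at the message via aria-describedby.
function renderMicButton(error?: MicErrorCode): string {
  const describedBy = error ? ' aria-describedby="mic-error"' : "";
  return `<button type="button" aria-label="Record your story"${describedBy}></button>`;
}
```

Calling `renderMicError("permission-denied")` returns a paragraph that states both what went wrong and what to do next, which is the part all four participants were missing when the microphone failed.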