Yesterday, Microsoft announced three brand new MAI models — MAI-Voice-1, MAI-Image-2, and MAI-Transcribe-1 — available through Microsoft Foundry and explorable via Copilot Labs. These are Microsoft’s own in-house AI models, purpose-built for voice, image, and transcription tasks, and they’re genuinely impressive.
Naturally, I had to build something with them.
What is “Once Upon a Prompt”?
Once Upon a Prompt is a personalised AI bedtime story generator. You tell it about your kids — their names, ages, and a few fun details — pick a story theme, and it creates a fully narrated, multi-voice story with a custom illustration. All generated in seconds.
It’s a proof of concept that showcases what’s possible with these new models, and it’s available for anyone to try whilst the underlying Copilot Labs features remain accessible.
How it works
The experience is simple — three steps:
- Tell us about the kids — Add up to three children with their name, pronouns, and age. Optionally add their favourite colour, a pet or cuddly toy, and any extra story ideas.
- Pick a theme — Choose from ten kid-friendly adventures: Space, Pirates, Enchanted Forest, Dinosaurs, Dragons, and more.
- Generate — Choose a story length (Quick Tale, Bedtime Story, or Epic Adventure) and hit the big button.
Behind the scenes, the app uses MAI-Voice-1 to generate a multi-voice narrated story and MAI-Image-2 to create a whimsical storybook illustration, both personalised to your children and the theme you chose. The voice model is particularly special: it doesn’t just read text aloud; it performs a full dramatic reading with multiple character voices, sound-effect cues, and real emotion.
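To make the flow a little more concrete, here is a minimal sketch of how the form inputs could be turned into one prompt for the story and one for the illustration. It is not the app's actual code: the `Child` class, the `build_prompts` function, the prompt wording, and the decision to treat extra story ideas as a single free-text field are all assumptions made for illustration, and no real MAI API call is shown.

```python
from dataclasses import dataclass

# Everything here is hypothetical: the post does not show the app's data model.
MAX_CHILDREN = 3  # the form accepts up to three children
LENGTHS = ("Quick Tale", "Bedtime Story", "Epic Adventure")


@dataclass
class Child:
    name: str
    pronouns: str
    age: int
    favourite_colour: str = ""  # optional
    companion: str = ""         # optional pet or cuddly toy


def describe(child: Child) -> str:
    """Turn one child's details into a short phrase for the prompt."""
    parts = [f"{child.name} ({child.pronouns}, age {child.age})"]
    if child.favourite_colour:
        parts.append(f"favourite colour {child.favourite_colour}")
    if child.companion:
        parts.append(f"companion {child.companion}")
    return ", ".join(parts)


def build_prompts(children, theme, length, extra_ideas=""):
    """Return (story_prompt, image_prompt) built from the form inputs."""
    if not 1 <= len(children) <= MAX_CHILDREN:
        raise ValueError(f"expected 1 to {MAX_CHILDREN} children")
    if not theme:
        raise ValueError("a theme is required")
    if length not in LENGTHS:
        raise ValueError(f"length must be one of {LENGTHS}")

    cast = "; ".join(describe(c) for c in children)
    names = " and ".join(c.name for c in children)

    story_prompt = (
        f"Write a {length} bedtime story with a {theme} theme. "
        f"Characters: {cast}."
    )
    if extra_ideas:
        story_prompt += f" Include these ideas: {extra_ideas}."

    image_prompt = (
        f"A whimsical storybook illustration with a {theme} theme, "
        f"featuring {names}."
    )
    return story_prompt, image_prompt
```

For example, `build_prompts([Child("Ada", "she/her", 6, companion="a toy rabbit")], "Space", "Quick Tale")` returns a story prompt and an image prompt that both mention the made-up child Ada and the Space theme. Those strings would then be handed to whatever client you use to call the voice and image models.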
What makes the MAI models special?
Without going into heavy technical detail, here’s what stood out to me:
- MAI-Voice-1 produces incredibly natural, expressive speech. It supports multiple voices and styles — from gentle narration to excited character voices — and it handles dialogue brilliantly. The multi-voice story mode is a real standout feature.
- MAI-Image-2 generates high-quality images from text prompts with a focus on safety. That makes it a good fit for kid-friendly content, where the output needs to be appropriate.
- Both models are fast. Story generation (voice + image) typically completes in 10–20 seconds.
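Since a story needs both audio and an image, one way to keep the total wait down is to request the two at the same time rather than one after the other. The sketch below shows that pattern with `asyncio.gather`. Whether the app actually does this is an assumption on my part, and `generate_narration` and `generate_illustration` are hypothetical placeholders that only simulate a delay. They are not real MAI client functions.

```python
import asyncio


async def generate_narration(story_prompt: str) -> bytes:
    # Hypothetical placeholder for a MAI-Voice-1 request; it only simulates latency.
    await asyncio.sleep(0.01)
    return b"audio:" + story_prompt.encode()


async def generate_illustration(image_prompt: str) -> bytes:
    # Hypothetical placeholder for a MAI-Image-2 request; it only simulates latency.
    await asyncio.sleep(0.01)
    return b"image:" + image_prompt.encode()


async def generate_story(story_prompt: str, image_prompt: str):
    # Start both requests together and wait until both have finished.
    audio, image = await asyncio.gather(
        generate_narration(story_prompt),
        generate_illustration(image_prompt),
    )
    return audio, image
```

Calling `asyncio.run(generate_story(story_prompt, image_prompt))` returns the audio and image together. With real network calls, the total time would be roughly that of the slower request rather than the sum of the two.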
You can explore these models yourself at Copilot Labs and read the full announcement at microsoft.ai.
A few things to know
This is a proof of concept built over an afternoon with GitHub Copilot CLI. A few honest caveats:
- Story length is limited — the voice model caps at around 3 minutes for the longest stories. Still perfect for a quick bedtime tale!
- It’s experimental — the underlying Copilot Labs features could change or disappear at any time. Enjoy it while it lasts.
- Content is AI-generated — always listen along with your little ones. The models have safety guardrails, but it’s good practice to be present.
- Dark mode included 🌙 — because bedtime stories should feel cosy.
Try it and share!
I’d love to see what stories you create. Give it a go and share your favourites on social media — screenshot the illustration, record the audio playing, or just tell me about the adventure your kids went on.
Tag me and link back to this post so I can see your creations:
- YouTube: @DamoBird365
- LinkedIn: DamoBird365
- GitHub: DamoBird365
Use the hashtag #OnceUponAPrompt so we can find each other’s stories! 🧸✨
Built with AI, for fun
This whole project — from API exploration to the finished app — was built in a single session using GitHub Copilot CLI as my coding partner. It’s another example of how AI-assisted development is making it possible to go from idea to working product in hours rather than days.
If you’re curious about building with the MAI models yourself, they’re available through Microsoft Foundry for developers, and you can explore what they can do through Copilot Labs.
Happy storytelling! 📖