How to Create Presentation Slides by Speaking Out Loud
Creating presentation slides by speaking out loud sounds intuitive. And mostly it is -- but a few specific habits determine whether you get clean, professional output or a slide deck that needs an hour of cleanup.
This is a practical tutorial. Follow these steps and your first session will produce a usable first draft.
Before You Start: Three Things to Get Right
1. Set Your Context
Before you hit the mic, spend 3 minutes filling in the context layer.
Every good voice-to-slides tool has a way to input context about your presentation before you start. At minimum: your company name, a short description of what you're pitching, team members, and key metrics you'll reference.
This context shapes every slide the AI generates. Without it, the AI fills in placeholder text and guesses your company name from the speech transcript (which means a transcription error early on corrupts your slides). With it, the AI uses your actual data: real names, real numbers, real company description.
Context setup takes 3 minutes. It saves 20 minutes of post-session editing.
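To make the context layer concrete, here is a minimal sketch of the fields listed above as a data structure. The class name, field names, and sample values are hypothetical and used only for illustration; real tools will expose their own form fields.

```python
from dataclasses import dataclass, field


@dataclass
class PitchContext:
    """Hypothetical container for the context fields described above."""
    company_name: str
    description: str
    team: list[str] = field(default_factory=list)
    key_metrics: dict[str, str] = field(default_factory=dict)

    def missing_fields(self) -> list[str]:
        """Return the names of any fields that are still empty."""
        return [name for name, value in vars(self).items() if not value]


context = PitchContext(
    company_name="Acme Voice",  # placeholder, not a real company
    description="Pitch decks created by speaking",
    team=["Alice Smith, lead engineer", "Bob Chen, head of design"],
)

# key_metrics was left empty, so it is reported as missing.
print(context.missing_fields())
```

Running a check like this before you hit the mic is the code equivalent of glancing over the form to confirm nothing was left blank.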
2. Find a Quiet Space
Speech recognition is sensitive to background noise. A coffee shop, an open-plan office, or a room with a TV on all degrade the transcription quality -- which degrades the slide quality.
Find a space where you can speak at normal presentation volume without significant background noise. Your home office, a conference room, a quiet corner. It doesn't need to be a recording studio. It needs to be quiet enough that you could have a phone call there.
3. Have Your Outline Ready
You don't need a script. You do need to know what you want to cover.
Write a quick outline -- on paper, in a notes app, wherever -- that lists the 8-12 major points of your presentation. Problem. Solution. How it works. Market size. Traction. Team. Ask.
You'll speak to each point naturally, but having the outline means you won't forget a section mid-session. Skipping a topic means you have to add it manually later.
The Session: Step-by-Step
Step 1: Open the Tool and Set Up
Open your voice-to-slides tool in a browser. Browser-based products need no installation. Fill in the context fields: company name, description, team, key numbers.
If the tool allows a session title or pitch description, fill that in too. The more specific context you provide, the better the output.
Step 2: Hit the Mic
When you're ready, start the recording. You'll typically see some kind of audio level indicator showing the tool is listening.
Take a breath. Don't start talking immediately. A 2-3 second pause after hitting the mic lets the system initialize cleanly.
Step 3: Speak Your First Section
State the first section of your presentation naturally. Not from a script -- just how you'd say it in a meeting.
"We're building a tool that lets founders create pitch decks by speaking. The problem is that founders spend 3-5 hours in PowerPoint every time they need to update a deck, and most of that time is on formatting, not content."
Then pause. A deliberate pause -- 2 full seconds. That pause signals to the system that the first thought is complete.
Watch for the first slide to appear.
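Why the pause matters can be illustrated with a small sketch of pause-based segmentation. This is an assumption about how a tool might split speech into slide-sized chunks, not a description of any particular product. The function name, the word-timestamp format, and the sample timings are all hypothetical.

```python
PAUSE_SECONDS = 2.0  # the pause length recommended above


def split_on_pauses(words, pause=PAUSE_SECONDS):
    """Group (text, start, end) word tuples into segments separated by long silences."""
    segments = []
    current = []
    previous_end = None
    for text, start, end in words:
        # A silence of at least `pause` seconds closes the current segment.
        if previous_end is not None and start - previous_end >= pause:
            segments.append(" ".join(current))
            current = []
        current.append(text)
        previous_end = end
    if current:
        segments.append(" ".join(current))
    return segments


# Hypothetical word timings in seconds: a 2.2-second silence follows "tool".
words = [
    ("We're", 0.0, 0.3), ("building", 0.3, 0.8), ("a", 0.8, 0.9), ("tool", 0.9, 1.3),
    ("The", 3.5, 3.7), ("problem", 3.7, 4.2), ("is", 4.2, 4.4), ("formatting", 4.4, 5.1),
]
segments = split_on_pauses(words)
print(segments)
```

With a shorter silence between "tool" and "The", both thoughts would land in a single segment, which is how two topics can end up on one slide.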
Step 4: Continue Through Your Outline
Work through your outline point by point. For each major section:
- Speak the point naturally
- Pause for 2 seconds
- Watch the slide appear
- Move to the next point
You'll find a rhythm. It feels like delivering a slow version of your pitch.
A few speaking patterns that produce clean output:
For bullet-point slides: List the points explicitly. "There are three reasons this market is ready now: first, [point one]. Second, [point two]. Third, [point three]." The explicit structure helps the AI format this as a bullets slide rather than a wall of text.
For metric slides: Lead with the number. "We've grown 40% month-over-month for the last 6 months." Leading with numbers signals a metrics layout.
For team slides: Introduce each person with their name and key credential. "Our team: Alice Smith, lead engineer, previously at Stripe. Bob Chen, head of design, previously at Notion." Clear name/credential pairs map well to team slide layouts.
For the problem/solution: Be direct. State the problem as one clear sentence. State the solution as one clear sentence. Don't ramble. The AI extracts the core statement -- if you give it too much, it has to guess which part matters most.
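The speaking patterns above work because they give the system clear cues. The sketch below shows one simple way such cues could be mapped to layouts. It is a toy keyword heuristic written for illustration, with hypothetical layout names and rules, and real tools most likely use a language model rather than regular expressions.

```python
import re

# Cues taken from the speaking patterns above (all rules are illustrative).
ORDINALS = re.compile(r"\b(first|second|third)\b", re.IGNORECASE)
LEADING_NUMBER = re.compile(r"^\D{0,25}\d")  # a digit within the opening words
CREDENTIAL = re.compile(r"\bpreviously at\b", re.IGNORECASE)


def guess_layout(segment: str) -> str:
    """Return a hypothetical layout name for one spoken segment."""
    if CREDENTIAL.search(segment):
        return "team"
    if len(ORDINALS.findall(segment)) >= 2:
        return "bullets"
    if LEADING_NUMBER.match(segment):
        return "metrics"
    return "statement"


examples = [
    "We've grown 40% month-over-month for the last 6 months.",
    "There are three reasons: first, demand. Second, cost. Third, timing.",
    "Our team: Alice Smith, lead engineer, previously at Stripe.",
    "Founders spend hours formatting decks.",
]
for text in examples:
    print(guess_layout(text), "<-", text)
```

The takeaway is the same as in the prose: the more explicit the cue in your speech, the less the system has to guess.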
Step 5: Cover Everything, Then Stop
Keep going until you've covered your entire outline. Don't stop in the middle to edit slides -- that breaks your flow and you'll lose the session rhythm. Finish the full pitch, then stop the recording.
Your session should take 15-25 minutes for a typical 10-12 slide deck. If it's taking longer, you're probably over-explaining each point. At presentation pace, each major section takes 1-2 minutes of speaking.
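As a rough check on pacing, the arithmetic above can be written out. This sketch assumes only the 1-2 minutes per section stated here. The function name is hypothetical, and the estimate counts speaking time only, not pauses or setup.

```python
def session_minutes(sections, min_per_section=1.0, max_per_section=2.0):
    """Return the (low, high) speaking time in minutes for a number of sections."""
    return sections * min_per_section, sections * max_per_section


low, high = session_minutes(12)
print(f"12 sections: roughly {low:.0f}-{high:.0f} minutes of speaking")
```

If your actual session runs well past the high end of the estimate for your outline, that is a sign you are over-explaining.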
After the Session: Editing
Plan for 15-20 minutes of review and editing after every session.
Go through the slides in order:
Check for transcription errors. The speech-to-text (STT) engine is accurate, but not perfect. Proper nouns (company names, people's names, technical terms) are the most common error source. Fix these first.
Check layout appropriateness. Did the AI pick the right layout for each slide? If a slide should be a metrics card but came out as a bullets slide, swap the layout.
Check for missing slides. Did the AI miss a major section? If you spoke about your competitive advantage but no competitor slide appeared, the segment might not have had a clean pause. Add the missing slide manually.
Check for duplicates. If you repeated a point or circled back to a topic, you might have two similar slides. Delete the weaker one.
Check order. Sometimes the AI generates slides out of the order you expected. Reorder as needed.
Trim excess. If a slide has too much text -- you spoke a long paragraph and it all ended up on one slide -- trim it to the key point. Slides should support the pitch, not replace it.
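The duplicate check in the list above is easy to do by eye on a short deck, but it can also be sketched in code. The example below flags pairs of slides with very similar text using Python's standard `difflib` module. The function name, the 0.8 threshold, and the sample slide text are assumptions chosen for illustration, not values from any tool.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_similar_slides(slides, threshold=0.8):
    """Return index pairs of slides whose text similarity meets the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(slides), 2):
        # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((i, j))
    return pairs


# Hypothetical slide text: the first two slides repeat the same point.
slides = [
    "Founders spend hours formatting decks",
    "founders spend hours formatting decks",
    "Our team: Alice Smith and Bob Chen",
]
print(find_similar_slides(slides))
```

A flagged pair is only a candidate. You still decide which slide is the weaker one and delete it.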
Common Mistakes and Fixes
Mistake: Not pausing between sections. Result: Consecutive slides that blur together or one long slide covering two topics. Fix: Deliberately pause 2 full seconds after every major point. It feels slow. Do it anyway.
Mistake: Rushing through the session. Result: The speech-to-text engine falls behind and transcription quality drops. Fix: Speak at presentation pace, slightly slower than normal conversation.
Mistake: Wandering off your outline mid-session. Result: The AI generates slides about your tangent, not your pitch. Fix: Stick to your outline. If you want to cover something not on the outline, note it and add the slide manually after.
Mistake: Skipping context setup. Result: Generic slides with placeholder text. Fix: Always fill in context before starting. Non-negotiable.
What Good Output Looks Like
A well-structured 15-minute session should produce:
- 10-14 slides covering your major sections
- Correct content in most slides (95%+)
- 2-3 layout choices that need adjusting
- A few minor text edits needed
- Total cleanup time: about 15 minutes
You started with nothing. Thirty minutes later (15 of speaking plus 15 of editing), you have a first-draft pitch deck.
That's not the same as a professionally designed deck. But it's a functional, structured, presentable deck that you built by talking through your pitch once.
For a deeper understanding of how the underlying technology works, read how voice-to-slides AI actually works. And for the argument on why speaking produces better pitches than typing, read why you should build your pitch deck by talking, not typing.
Start your first session on Talkpitch -- free tier, no credit card required.