

Speech-to-Text! Audio/video file transcription tools and techniques
Join the Zk Av Club for a simple, hands-on workshop on turning audio and video into text. We will use real recordings from DWeb + Coolab Camp 2023 in Brazil. The examples include Portuguese, English, and Spanish.
What we’ll cover
Cloud tools: Notta, Descript, Dictationer, ElevenLabs, HappyScribe. How to upload a file, get a transcript, export captions (SRT/VTT), and copy text.
Local/offline tools: Whisper, Kdenlive Speech-to-Text, VOSK. Basic install tips, how to run them, and when to use CPU vs. GPU.
Quality check: Compare accuracy, timestamps, speaker labels, and edit speed across PT/EN/ES.
Clean & share: Quick steps to fix punctuation, add or adjust speakers, translate or subtitle, and publish to sites or archives.
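As a taste of the caption export step mentioned above, here is a minimal Python sketch that turns timed transcript segments into SRT text. It assumes your tool gives you segments as (start, end, text) with times in seconds; the function names and the sample segments are hypothetical and invented for illustration, and real tools may structure their output differently.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Turn (start, end, text) tuples into SRT caption text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each SRT cue: a counter, a time range, then the caption text.
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


# Hypothetical segments, made up for this example.
example_segments = [
    (0.0, 2.5, "Bem-vindos ao DWeb Camp."),
    (2.5, 6.04, "Welcome to DWeb Camp."),
]
srt_text = segments_to_srt(example_segments)
print(srt_text)
```

Saving srt_text to a file ending in .srt should give you captions that most video players and editors can load. VTT is similar but uses a "WEBVTT" header line and a period instead of a comma before the milliseconds.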
Who this is for
Editors, community archivists, researchers, and event organizers. Beginners welcome.
What to bring
A computer (mobile devices may work for cloud transcription)
(Optional) a video file to test.
Feel free to use one of the videos from the collection we'll be working with: https://archive.org/details/dweb-stories-brazil-2023
(Optional) pre-install Whisper or Kdenlive if you want to try local tools during the session. You can also just watch.
You’ll leave with
A short checklist for cloud vs. local workflows
Copy-paste commands and export recipes
A clear comparison to help you choose a tool next time
Practical, privacy-aware, and reproducible. Let’s make your recordings searchable and accessible.