Speech Recognition
Local speech-to-text with whisper.cpp and CUDA acceleration. Sub-second latency for real-time voice interaction.
Framework compatibility
Claude, OpenClaw, Kimi Claw
Fetch definition
curl -s https://www.clawsmarket.com/api/skills/speech-recognition/definition | jq
Returns a machine-readable definition with inputs, outputs, instructions, and prompt templates. Works with any agent framework.
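For agents that cannot shell out to curl and jq, the same request can be made from Python's standard library. This is a minimal sketch: the URL is the one from the curl command above, while the helper names (definition_url, fetch_definition) are hypothetical, and it assumes the endpoint returns a JSON object, which the listing implies but does not spell out.

```python
import json
import urllib.request

# Base path taken from the curl command in the listing.
BASE_URL = "https://www.clawsmarket.com/api/skills"


def definition_url(slug: str) -> str:
    """Build the definition URL for a skill slug (hypothetical helper)."""
    return f"{BASE_URL}/{slug}/definition"


def fetch_definition(slug: str, timeout: float = 10.0) -> dict:
    """Download and parse a skill definition.

    Assumption: the response body is a JSON object. Its exact keys are not
    documented in the listing, so inspect the result before relying on it.
    """
    with urllib.request.urlopen(definition_url(slug), timeout=timeout) as resp:
        return json.load(resp)


# Example usage (performs a network request):
#   definition = fetch_definition("speech-recognition")
#   print(json.dumps(definition, indent=2))
```

Printing the parsed result with json.dumps(..., indent=2) gives roughly what piping through jq shows on the command line.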
Inputs
audio (string, required): Base64-encoded audio or file path to a WAV file
language (string): Language code (e.g. 'en')
model (string): Whisper model size
Outputs
text (string): Transcribed text
processing_ms (number): Transcription processing time in ms
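The input and output fields above can be exercised with a short Python sketch. It builds an inputs object with Base64-encoded WAV audio and checks that a result carries the two documented output fields. The function names (make_silent_wav, build_inputs, parse_outputs) are hypothetical, and how the inputs are actually delivered to the skill depends on the agent framework, so no invocation call is shown. The 16 kHz mono 16-bit sample is an assumption based on the format whisper.cpp commonly expects, not something the listing states.

```python
import base64
import io
import wave


def make_silent_wav(seconds: float = 0.1, sample_rate: int = 16000) -> bytes:
    """Return a mono 16-bit PCM WAV of silence, used only as sample input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(2)          # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


def build_inputs(wav_bytes: bytes, language=None, model=None) -> dict:
    """Assemble the skill inputs; only 'audio' is required by the listing."""
    inputs = {"audio": base64.b64encode(wav_bytes).decode("ascii")}
    if language is not None:
        inputs["language"] = language   # e.g. 'en'
    if model is not None:
        inputs["model"] = model         # accepted values are not listed
    return inputs


def parse_outputs(outputs: dict) -> tuple:
    """Check the documented output fields and return (text, processing_ms)."""
    text = outputs["text"]
    processing_ms = outputs["processing_ms"]
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    if isinstance(processing_ms, bool) or not isinstance(processing_ms, (int, float)):
        raise TypeError("processing_ms must be a number")
    return text, float(processing_ms)
```

Leaving language and model out of the inputs when they are not set keeps the payload aligned with the listing, which marks only audio as required.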
Tools