edge-tts: Python TTS Using Microsoft Edge Online Service

edge-tts is a Python CLI tool for generating speech from text using Microsoft Edge's online TTS service, supporting hundreds of voices and languages.

High-quality text-to-speech usually requires a paid cloud API or a complex local model setup. edge-tts, created by rany2, takes a different approach: it uses the online TTS service behind Microsoft Edge's read-aloud feature, giving free access to hundreds of natural-sounding voices across dozens of languages.

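The tool is a Python package with a simple command-line interface that turns text into audio files using the same neural voices as Microsoft Edge's read-aloud feature. It exposes rate, volume, and pitch adjustments and can generate subtitles alongside the audio, which makes it a capable free, open-source TTS option. Note that the project README states that support for custom SSML input was removed, because the service rejects SSML that Edge itself would not generate.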
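As a minimal sketch of what a command-line call looks like, the snippet below assembles an edge-tts invocation from Python. The flags used (--text, --voice, --write-media, --write-subtitles, --rate, --pitch) are the ones the CLI documents; the helper names build_edge_tts_command and run_edge_tts and the file names are hypothetical, chosen only for illustration.

```python
import shlex
import subprocess


def build_edge_tts_command(text, voice, media_path, subtitles_path=None,
                           rate=None, pitch=None):
    """Return the argument list for one edge-tts CLI call (hypothetical helper)."""
    cmd = ["edge-tts", "--text", text, "--voice", voice,
           "--write-media", media_path]
    if subtitles_path:
        cmd += ["--write-subtitles", subtitles_path]
    # The --opt=value form keeps a leading minus sign (e.g. -10%) from
    # being parsed as a separate option.
    if rate:
        cmd.append(f"--rate={rate}")
    if pitch:
        cmd.append(f"--pitch={pitch}")
    return cmd


def run_edge_tts(cmd):
    """Run the command; needs edge-tts installed and network access."""
    subprocess.run(cmd, check=True)


cmd = build_edge_tts_command(
    "Hello, world!", "en-US-AriaNeural", "hello.mp3",
    subtitles_path="hello.srt", rate="+10%",
)
# Print the shell-quoted command instead of running it.
print(shlex.join(cmd))
```

Calling run_edge_tts(cmd) would execute the command, assuming the package has been installed (for example with pip install edge-tts) and the machine is online.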

Voice and Language Support

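| Language | Male Voices | Female Voices | Quality |
|---|---|---|---|
| English (US) | 8 | 10 | Neural high |
| English (UK) | 5 | 6 | Neural high |
| Chinese (Mandarin) | 4 | 5 | Neural high |
| Japanese | 3 | 4 | Neural high |
| Spanish | 4 | 5 | Neural high |
| French | 3 | 4 | Neural high |
| German | 3 | 4 | Neural high |
| Total across 60+ languages | 100+ | 200+ | Neural |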
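The voice catalog changes over time, so counts like those above are best checked against the live list. The sketch below tallies voices by locale and gender. It assumes the metadata shape returned by edge_tts.list_voices() (dictionaries with "ShortName", "Locale", and "Gender" keys); the three sample entries are illustrative only, and count_voices and fetch_voices are hypothetical helper names.

```python
import asyncio
from collections import Counter


def count_voices(voices):
    """Count voices per (locale, gender) from edge-tts voice metadata."""
    return Counter((v["Locale"], v["Gender"]) for v in voices)


async def fetch_voices():
    """Download the live voice list (needs edge-tts and network access)."""
    import edge_tts  # imported lazily so count_voices works offline
    return await edge_tts.list_voices()


# Illustrative sample in the assumed shape of list_voices() entries.
sample = [
    {"ShortName": "en-US-AriaNeural", "Locale": "en-US", "Gender": "Female"},
    {"ShortName": "en-US-GuyNeural", "Locale": "en-US", "Gender": "Male"},
    {"ShortName": "ja-JP-NanamiNeural", "Locale": "ja-JP", "Gender": "Female"},
]
counts = count_voices(sample)
for (locale, gender), n in sorted(counts.items()):
    print(f"{locale:8} {gender:6} {n}")

# With network access, the same tally can be run on live data:
# counts = count_voices(asyncio.run(fetch_voices()))
```

On the command line, edge-tts --list-voices prints the same catalog.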

Audio Generation Pipeline

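The pipeline takes plain text, sends it to the Edge read-aloud service, and receives a stream of audio chunks together with timing metadata. Speaking rate, volume, and pitch are set through options (--rate, --volume, and --pitch on the command line); earlier releases also accepted raw SSML, but the project README states that custom SSML support was removed. The audio stream is saved as MP3 (other formats such as WAV require conversion with an external tool), and subtitles are generated from the timing events that accompany the audio.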
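The flow described above can be sketched with the package's streaming interface. This is a sketch under stated assumptions, not the project's own implementation: it assumes edge_tts.Communicate(...).stream() yields dictionaries whose "type" is "audio" (with a "data" bytes field) or a boundary event ("WordBoundary" or "SentenceBoundary", depending on the release) carrying "offset" and "duration" in 100-nanosecond ticks plus "text". The helpers ticks_to_srt_time, boundaries_to_srt, and synthesize are hypothetical names, and the SRT rendering is hand-written here rather than taken from the library.

```python
def ticks_to_srt_time(ticks):
    """Convert 100-nanosecond ticks to an SRT timestamp (HH:MM:SS,mmm)."""
    ms = ticks // 10_000
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    seconds, ms = divmod(ms, 1_000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{ms:03}"


def boundaries_to_srt(boundaries):
    """Render boundary events (offset, duration, text) as SRT cues."""
    cues = []
    for i, b in enumerate(boundaries, start=1):
        start = ticks_to_srt_time(b["offset"])
        end = ticks_to_srt_time(b["offset"] + b["duration"])
        cues.append(f"{i}\n{start} --> {end}\n{b['text']}\n")
    return "\n".join(cues)


async def synthesize(text, voice, media_path, srt_path,
                     rate="+0%", pitch="+0Hz"):
    """Stream audio to media_path and write timing cues to srt_path.

    Needs edge-tts installed and network access.
    """
    import edge_tts  # imported lazily so the helpers above work offline
    communicate = edge_tts.Communicate(text, voice, rate=rate, pitch=pitch)
    boundaries = []
    with open(media_path, "wb") as audio:
        async for chunk in communicate.stream():
            if chunk["type"] == "audio":
                audio.write(chunk["data"])
            elif chunk["type"] in ("WordBoundary", "SentenceBoundary"):
                boundaries.append(chunk)
    with open(srt_path, "w", encoding="utf-8") as srt:
        srt.write(boundaries_to_srt(boundaries))


# Offline demonstration of the subtitle step with made-up timing events.
demo = [
    {"offset": 1_000_000, "duration": 4_000_000, "text": "Hello"},
    {"offset": 6_000_000, "duration": 5_000_000, "text": "world"},
]
print(boundaries_to_srt(demo))
```

A full run would look like asyncio.run(synthesize("Hello, world!", "en-US-AriaNeural", "hello.mp3", "hello.srt", rate="+10%")). Rate and pitch strings follow the signed forms the package expects, such as "+10%" and "-5Hz".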

Feature Comparison

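| Feature | edge-tts | Google TTS | AWS Polly | ElevenLabs |
|---|---|---|---|---|
| Cost | Free | Free tier limited | Pay per use | Pay per use |
| Voice count | 300+ | 100+ | 50+ | 100+ |
| SSML support | Limited (rate, volume, and pitch options; no custom SSML) | Yes | Yes | Partial |
| Subtitle export | Yes | No | No | No |
| API key required | No | Yes | Yes | Yes |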

Practical Applications

edge-tts suits content creators generating voiceovers, developers prototyping voice features, accessibility tools that need read-aloud voices, language learning applications, and podcast production. Because it needs no API key and has no per-use charges, it is particularly attractive for projects with unpredictable volume or tight budgets. It depends on a service that Microsoft does not document for third-party use, so availability and throttling are outside the project's control.

For more information, visit the edge-tts GitHub repository and the Microsoft Edge TTS voice list.

Frequently Asked Questions

Q: Is edge-tts legal to use? A: It uses the same online service as Microsoft Edge's read-aloud feature and needs no credentials, but that service is not a documented public API. Check Microsoft's terms before any commercial use.

Q: Does it require an internet connection? A: Yes, the TTS processing happens on Microsoft’s servers via the Edge API.

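Q: Can I adjust voice speed and pitch? A: Yes. Rate, volume, and pitch are set with the --rate, --volume, and --pitch options (for example --rate=+20% or --pitch=-5Hz) or the matching arguments in the Python API. According to the project README, custom SSML is no longer accepted.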

Q: What audio formats does it output? A: MP3 is the native output. Other formats such as WAV require converting the MP3 with an external tool such as ffmpeg.

Q: How long can the generated audio be? A: There is no hard limit, but very long texts should be segmented for reliability.
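One way to segment long input, as suggested in the last answer, is to cut at sentence boundaries and keep each chunk under a size limit. The sketch below is a generic approach rather than anything edge-tts prescribes: the split_text helper is hypothetical, and the 2000-character default is an arbitrary assumption, not a documented service limit.

```python
import re


def split_text(text, max_chars=2000):
    """Split text into chunks of at most max_chars, breaking after sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit; close the current chunk first.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


for i, chunk in enumerate(split_text("One. Two. Three.", max_chars=10), 1):
    print(i, repr(chunk))
```

Each chunk can then be synthesized in its own request, so a failed request only needs that chunk to be retried.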