TTS-TextAnalyzer

Inspired by "Introducing Unified Neural Text Analyzer: an innovation for Neural Text-to-Speech pronunciation accuracy improvement", the idea is to build multiple task heads on top of a BERT model to unify the text-analysis tasks of speech synthesis, including word segmentation, part-of-speech tagging, text normalization, and polyphone disambiguation. This project collects information on datasets suitable for each task.
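
One way to picture how a dataset feeds a task head: each task can be framed as per-character labelling over the same input, so a shared encoder could serve one classification head per task. The sketch below illustrates this for word segmentation only, in plain Python. It is a hypothetical illustration, not code from this project or from the article above; the function name `segmentation_to_bmes`, the BMES tag scheme, and the sample sentence are all assumptions.

```python
from typing import List


def segmentation_to_bmes(words: List[str]) -> List[str]:
    """Convert a segmented sentence into per-character BMES tags.

    B = first character of a multi-character word, M = middle,
    E = last character, S = single-character word.
    (Hypothetical helper; the tag scheme is an assumption.)
    """
    tags: List[str] = []
    for word in words:
        if not word:
            continue  # ignore empty strings in the input
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


# Sample sentence (an assumption), already segmented into words.
words = ["语音", "合成", "很", "有趣"]
chars = [ch for word in words for ch in word]
tags = segmentation_to_bmes(words)

# Each character is paired with the label a segmentation head would predict.
for ch, tag in zip(chars, tags):
    print(ch, tag)
```

Labels for the other tasks (a part-of-speech tag per word, or a pinyin label on each polyphonic character) could be aligned to the same character sequence in a similar way.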

Pretrained BERT

Word Segmentation

datasets	code
TODO

Part-of-Speech Tagging

datasets	code
TODO

Text Normalization

datasets / rules	code
rules	WeTextProcessing
Text normalization covering grammars	TextNormalizationCoveringGrammars
TODO

Polyphone Disambiguation

datasets	code
g2pL	https://github.com/whzikaros/g2pL
CPP (g2pM)	https://github.com/kakaobrain/g2pm
TODO

lifeiteng/TTS-TextAnalyzer