Reconstruction of Polish diacritics in a text-to-speech system. Janicki et al., 2005
- Document ID
- 5282686764382462304
- Author
- Janicki A
- Herman P
- Publication year
- 2005
- Publication venue
- INTERSPEECH
Snippet
This paper describes an approach to reconstructing Polish diacritic signs, as needed, e.g., in a speech synthesis system. Some telecommunication services (for example, SMS transmission in GSM) remove diacritics from the text. Without them the text is usually still …
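To make the problem in the snippet concrete, the following minimal sketch shows how stripping Polish diacritics loses information and how a word list can propose candidate restorations. It is an illustration only, not the method of Janicki and Herman: the names `STRIP_MAP`, `strip_diacritics`, `build_index`, `candidates`, and the tiny `LEXICON` are all hypothetical, and a real system would need a much larger lexicon plus some way to choose among candidates from context.

```python
# Polish letters with diacritics mapped to their ASCII base letters.
# Both "ź" and "ż" collapse to "z", so the mapping cannot be inverted directly.
STRIP_MAP = str.maketrans("ąćęłńóśźżĄĆĘŁŃÓŚŹŻ", "acelnoszzACELNOSZZ")

# Hypothetical toy lexicon; a real one would hold many thousands of word forms.
LEXICON = ["laska", "łaska", "sad", "sąd", "żółw"]


def strip_diacritics(text: str) -> str:
    """Simulate a channel that removes Polish diacritics from text."""
    return text.translate(STRIP_MAP)


def build_index(lexicon):
    """Group lexicon words by their diacritic-free form."""
    index = {}
    for word in lexicon:
        index.setdefault(strip_diacritics(word), set()).add(word)
    return index


def candidates(stripped_word, index):
    """Return every known word that could have produced the stripped form.

    Unknown words are returned unchanged as the only candidate.
    """
    return sorted(index.get(stripped_word, {stripped_word}))


index = build_index(LEXICON)

print(strip_diacritics("żółw"))    # zolw
print(candidates("zolw", index))   # ['żółw']  (unambiguous)
print(candidates("sad", index))    # ['sad', 'sąd']  (ambiguous, needs context)
```

An explicit translation table is used here instead of Unicode decomposition because "ł" has no canonical decomposition, so an approach based on NFD normalization alone would leave it untouched.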
- Concept: artificial neural network (sections: abstract, description; count: 15)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
Similar Documents
Publication | Title |
---|---|
EP3125235B1 (en) | Learning templates generated from dialog transcripts |
US7587308B2 (en) | Word recognition using ontologies |
Daland et al. | Learning diphone‐based segmentation |
JP2000353161A (en) | Method and device for controlling style in generation of natural language |
JP2001249922A (en) | Word division system and device |
CN111930914A (en) | Question generation method and device, electronic equipment and computer-readable storage medium |
Li et al. | Normalization of Text Messages Using Character- and Phone-based Machine Translation Approaches. |
Meylan et al. | Word forms-not just their lengths-are optimized for efficient communication |
Ajees et al. | A named entity recognition system for Malayalam using neural networks |
Rajendran et al. | A robust syllable centric pronunciation model for Tamil text to speech synthesizer |
Shahid et al. | Next word prediction for Urdu language using deep learning models |
CN113051388A (en) | Intelligent question and answer method and device, electronic equipment and storage medium |
Janicki et al. | Reconstruction of Polish diacritics in a text-to-speech system. |
Dutta | Word-level language identification using subword embeddings for code-mixed Bangla-English social media data |
CN115455912A (en) | Text analysis method and device, electronic equipment and computer readable storage medium |
KR100784730B1 (en) | Method and apparatus for statistical HMM part-of-speech tagging without tagged domain corpus |
CN116089601A (en) | Dialogue abstract generation method, device, equipment and medium |
CN111090720B (en) | Hot word adding method and device |
Nanayakkara et al. | Context aware back-transliteration from English to Sinhala |
Hlaing et al. | Myanmar Number Normalization for Text-to-Speech |
CN112992128A (en) | Training method, device and system for intelligent voice robot |
Neme | A fully inflected Arabic verb resource constructed from a lexicon of lemmas by using finite-state transducers |
Wong et al. | Linguistic and behavioural studies of Chinese chat language |
CN111159360A (en) | Method and device for obtaining query topic classification model and query topic classification |
CN112580365A (en) | Chapter analysis method, electronic device and storage device |