Janicki et al., 2005

Reconstruction of Polish diacritics in a text-to-speech system.

Document ID: 5282686764382462304
Author: Janicki A; Herman P
Publication year: 2005
Publication venue: INTERSPEECH

Snippet

This paper describes an approach to reconstructing Polish diacritic signs, needed, e.g., in a speech synthesis system. Some telecommunication services (for example, SMS transmission in GSM) remove diacritics from the text. Without them the text is usually still …

Full text (PDF): www.isca-archive.org
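
The diacritic removal mentioned in the snippet can be illustrated with a minimal Python sketch. It is not the paper's method: it only simulates the lossy step that precedes reconstruction. The mapping table and the function name `strip_polish_diacritics` are assumptions made for illustration. An explicit table is used because the letter ł has no Unicode decomposition, so normalization-based stripping would miss it.

```python
# Illustrative only: simulates a channel that drops Polish diacritics.
# The table and function name are hypothetical, not taken from the paper.
POLISH_TO_ASCII = str.maketrans(
    "ąćęłńóśźżĄĆĘŁŃÓŚŹŻ",
    "acelnoszzACELNOSZZ",
)


def strip_polish_diacritics(text: str) -> str:
    """Replace each Polish diacritic letter with its base Latin letter."""
    return text.translate(POLISH_TO_ASCII)


if __name__ == "__main__":
    print(strip_polish_diacritics("Zażółć gęślą jaźń"))  # Zazolc gesla jazn

    # Distinct words can collapse to the same stripped form, which is why
    # reconstruction cannot be a simple per-letter reverse mapping.
    for original in ("sąd", "sad", "łaska", "laska"):
        print(original, "->", strip_polish_diacritics(original))
```

In this sketch "sąd" (court) and "sad" (orchard) both map to "sad", and "łaska" (grace) and "laska" (cane) both map to "laska", so a reconstruction step needs more information than the stripped word alone.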

Concepts: artificial neural network (sections: abstract, description; count: 15)

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
- G06F17/2217—Character encodings
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems

Similar Documents

Publication	Publication Date	Title
EP3125235B1 (en)	2020-09-09	Learning templates generated from dialog transcripts
US7587308B2 (en)	2009-09-08	Word recognition using ontologies
Daland et al.	2011	Learning diphone‐based segmentation
JP2000353161A (en)	2000-12-19	Method and device for controlling style in generation of natural language
JP2001249922A (en)	2001-09-14	Word division system and device
CN111930914A (en)	2020-11-13	Question generation method and device, electronic equipment and computer-readable storage medium
Li et al.	2012	Normalization of Text Messages Using Character-and Phone-based Machine Translation Approaches.
Meylan et al.	2017	Word forms-not just their lengths-are optimized for efficient communication
Ajees et al.	2018	A named entity recognition system for Malayalam using neural networks
Rajendran et al.	2019	A robust syllable centric pronunciation model for Tamil text to speech synthesizer
Shahid et al.	2024	Next word prediction for Urdu language using deep learning models
CN113051388A (en)	2021-06-29	Intelligent question and answer method and device, electronic equipment and storage medium
Janicki et al.	2005	Reconstruction of Polish diacritics in a text-to-speech system.
Dutta	2022	Word-level language identification using subword embeddings for code-mixed Bangla-English social media data
CN115455912A (en)	2022-12-09	Text analysis method and device, electronic equipment and computer readable storage medium
KR100784730B1 (en)	2007-12-12	Method and apparatus for statistical HMM part-of-speech tagging without tagged domain corpus
CN116089601A (en)	2023-05-09	Dialogue abstract generation method, device, equipment and medium
CN111090720B (en)	2023-09-12	Hot word adding method and device
Nanayakkara et al.	2022	Context aware back-transliteration from english to sinhala
Hlaing et al.	2018	Myanmar Number Normalization for Text-to-Speech
CN112992128A (en)	2021-06-18	Training method, device and system for intelligent voice robot
Neme	2013	A fully inflected Arabic verb resource constructed from a lexicon of lemmas by using finite-state transducers
Wong et al.	2006	Linguistic and behavioural studies of Chinese chat language
CN111159360A (en)	2020-05-15	Method and device for obtaining query topic classification model and query topic classification
CN112580365A (en)	2021-03-30	Chapter analysis method, electronic device and storage device