d’Eon et al., 2021 - Google Patents

Musical speech: a transformer-based composition tool

d’Eon et al., 2021

View PDF
Document ID
5455022933948042238
Author
d’Eon J
Dumpla S
Sastry C
Oore D
Oore S
Publication year
Publication venue
NeurIPS 2020 Competition and Demonstration Track

External Links

Snippet

In this paper, we propose a new compositional tool that will generate a musical outline of speech recorded/provided by the user for use as a musical building block in their compositions. The tool allows any user to use their own speech to generate musical …
Continue reading at proceedings.mlr.press (PDF) (other versions)

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003Changing voice quality, e.g. pitch or formants
    • G10L21/007Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013Adapting to target pitch
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • G10L13/02Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033Voice editing, e.g. manipulating the voice of the synthesiser
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • G10L13/06Elementary speech units used in speech synthesisers; Concatenation rules
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids transforming into visible information
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • G10L13/08Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G10L13/10Prosody rules derived from text; Stress or intonation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/26Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H1/00Details of electrophonic musical instruments
    • G10H1/0008Associated control or indicating means
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H2250/00Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/471General musical sound synthesis principles, i.e. sound category-independent synthesis methods
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H7/00Instruments in which the tones are synthesised from a data store, e.g. computer organs
    • G10H7/08Instruments in which the tones are synthesised from a data store, e.g. computer organs by calculating functions or polynomial approximations to evaluate amplitudes at successive sample points of a tone waveform
    • G10H7/10Instruments in which the tones are synthesised from a data store, e.g. computer organs by calculating functions or polynomial approximations to evaluate amplitudes at successive sample points of a tone waveform using coefficients or parameters stored in a memory, e.g. Fourier coefficients

Similar Documents

Publication Publication Date Title
Canazza et al. Modeling and control of expressiveness in music performance
Umbert et al. Expression control in singing voice synthesis: Features, approaches, evaluation, and challenges
Feugère et al. Cantor Digitalis: chironomic parametric synthesis of singing
Reddy et al. Two-stage intonation modeling using feedforward neural networks for syllable based text-to-speech synthesis
Bellegarda et al. Statistical prosodic modeling: from corpus design to parameter estimation
Hill et al. Low-level articulatory synthesis: A working text-to-speech solution and a linguistic tool1
Wada et al. Sequential generation of singing f0 contours from musical note sequences based on wavenet
d’Eon et al. Musical speech: a transformer-based composition tool
Trochidis et al. CAMeL: Carnatic percussion music generation using n-gram models
Stan et al. Generating the Voice of the Interactive Virtual Assistant
Locqueville et al. Voks: Digital instruments for chironomic control of voice samples
Oh et al. LOLOL: Laugh Out Loud On Laptop.
Siegel Timbral Transformations in Kaija Saariaho's From the Grammar of Dreams
Ronanki Prosody generation for text-to-speech synthesis
Yao et al. SR-TTS: a rhyme-based end-to-end speech synthesis system
Blaauw Modeling timbre for neural singing synthesis: methods for data-efficient, reduced effort voice creation, and fast and stable inference
Lomax The Analysis and Synthesis of the Singing Voice
Thompson IV Creating Musical Scores Inspired by the Intersection of Human Speech and Music Through Model-Based Cross Synthesis
Landarech Mayo From voice to virtuosity: DDSP-based timbre transfer
Saiteja Towards building controllable Text to Speech systems
Story TubeTalker: An airway modulation model of human sound production
Mo et al. [Retracted] Analysis of Computer Visualized Sound Parameters of Vocal Music Singing Based on Deep Learning
Morist Emotional speech synthesis for a radio dj: corpus design and expression modeling
McLean Improvising with synthesised vocables, with analysis towards computational creativity
Oralbekova et al. Current advances and algorithmic solutions in speech generation