US9595266B2 - Audio encoding/decoding device using reverberation signal of object audio signal - Google Patents


Info

Publication number
US9595266B2
Authority
US
United States
Prior art keywords
audio signal
signal
audio
bitstream
reverberation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
US14/435,372
Other versions
US20150279376A1 (en)
Inventor
Seung Kwon Beack
Jeong Il Seo
Tae Jin Lee
Jong Mo Sung
Kyeong Ok Kang
Jin Woong Kim
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electronics and Telecommunications Research Institute ETRI
Original Assignee
Electronics and Telecommunications Research Institute ETRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electronics and Telecommunications Research Institute ETRI
Priority claimed from PCT/KR2013/006471 (WO2014058138A1)
Assigned to ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KIM, JIN WOONG; SUNG, JONG MO; BEACK, SEUNG KWON; KANG, KYEONG OK; LEE, TAE JIN; SEO, JEONG IL
Publication of US20150279376A1
Application granted
Publication of US9595266B2
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/167 Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/305 Electronic adaptation of stereophonic audio signals to reverberation of the listening space
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/008 Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels

Definitions

  • the present invention relates to an audio coding and decoding apparatus using a reverberation signal of an object audio signal, and more particularly, to an audio coding and decoding apparatus which encodes and decodes audio using an audio signal including a reverberation signal of an object audio signal.
  • MPEG: moving picture expert group
  • SAOC: spatial audio object coding
  • Dolby Atmos construct a sound scene using an input signal or an object, respectively.
  • MPEG SAOC considers an input audio signal as an object and receives the input audio signal.
  • MPEG SAOC constructs the sound scene only with respect to input rendering information.
  • MPEG SAOC is capable of transmission at a low bit rate and uses a spatial audio coding method as a high compression method.
  • Dolby Atmos refers to a multichannel audio format for theatres. Dolby Atmos transmits or stores a channel signal called ‘Beds’ and an object signal called ‘object’ and constructs the sound scene using metadata.
  • Beds: a channel signal
  • object: an object signal
  • a sound scene not corresponding to an intention of content according to the input audio signal or the object signal may be included. This is because only base signals for constructing the sound scene are included.
  • An aspect of the present invention provides an audio coding and decoding apparatus capable of reproducing an audio signal more efficiently and realistically, using a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
  • Another aspect of the present invention provides an audio coding and decoding apparatus capable of reconstructing a realistic sound scene according to a reverberation signal of an object audio signal, by rendering the object audio signal and the reverberation signal of the object audio signal.
  • an audio coding apparatus including an audio signal encoding unit to encode an audio signal, and a bitstream transmission unit to convert the audio signal into a bitstream and transmit the bitstream, wherein the audio signal comprises a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
  • an audio decoding apparatus including a bitstream receiving unit to receive a bitstream including an encoded audio signal, and an audio signal decoding unit to extract a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal from the bitstream by decoding the audio signal included in the bitstream.
  • the audio decoding apparatus may further include an audio rendering unit to render the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal based on the rendering information included in the bitstream.
  • an audio coding method including encoding an audio signal, and converting the audio signal into a bitstream and transmitting the bitstream, wherein the audio signal comprises a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
  • an audio decoding method including receiving a bitstream including an encoded audio signal, extracting a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal from the bitstream by decoding the audio signal included in the bitstream, and rendering the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal based on rendering information included in the bitstream.
  • an audio coding and decoding apparatus may be capable of reproducing an audio signal more efficiently and realistically, by using a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal, in reproducing a multichannel audio signal.
  • an audio coding and decoding apparatus may be capable of reconstructing a realistic sound scene according to a reverberation signal of an object audio signal, by rendering the object audio signal and the reverberation signal of the object audio signal.
  • FIG. 1 is a diagram illustrating an audio coding and decoding apparatus according to an embodiment.
  • FIG. 2 is a diagram illustrating an audio coding apparatus according to an embodiment.
  • FIG. 3 is a diagram illustrating an audio decoding apparatus according to an embodiment.
  • FIG. 4 is a diagram illustrating the audio coding apparatus of FIG. 2 in detail.
  • FIG. 5 is a diagram illustrating the audio decoding apparatus of FIG. 3 in detail.
  • FIG. 6 is a diagram illustrating a configuration of rendering information.
  • FIG. 7 is a diagram illustrating an audio coding method according to an embodiment.
  • FIG. 8 is a diagram illustrating an audio decoding method according to an embodiment.
  • FIG. 1 is a diagram illustrating an audio coding and decoding apparatus according to an embodiment.
  • an audio coding apparatus 101 may receive an audio signal which includes a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
  • the audio coding apparatus 101 may receive the audio signal by considering the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal as an object.
  • the audio coding apparatus 101 needs to receive an audio signal that includes all three of the foregoing types of audio signal.
  • the audio coding apparatus 101 may receive rendering information.
  • the rendering information may include rendering information based on a gain value and rendering information related to a time delay.
  • the rendering information may be used to construct a sound scene corresponding to the audio signal.
  • the audio coding apparatus 101 may encode the received audio signal, and convert the rendering information into a bit string. For example, the audio coding apparatus 101 may perform binary conversion to convert the rendering information into the bit string. In addition, the audio coding apparatus 101 may encode the audio signal and the rendering information simultaneously. Here, the audio coding apparatus 101 may include a block for converting the rendering information into the bit string.
  • the audio coding apparatus 101 may convert the encoded audio signal into the bitstream.
  • the audio coding apparatus 101 may include a block capable of converting the rendering information into the bit string.
  • the audio coding apparatus 101 may convert the rendering information and the encoded audio signal into the bitstream.
  • the bitstream may include the rendering information and the encoded audio signal.
  • the audio coding apparatus 101 may transmit the bitstream to an audio decoding apparatus 102 .
  • the audio decoding apparatus 102 may receive the bitstream from the audio coding apparatus 101 .
  • the audio decoding apparatus 102 may extract the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal from the bitstream by decoding the audio signal included in the received bitstream. Additionally, the audio decoding apparatus 102 may render the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal, based on the rendering information included in the bitstream.
  • the audio decoding apparatus 102 may output a rendered multichannel audio signal.
  • FIG. 2 is a diagram illustrating an audio coding apparatus 201 according to an embodiment.
  • the audio coding apparatus 201 may include an audio signal encoding unit 202 and a bitstream transmission unit 203 .
  • the audio signal encoding unit 202 may encode the audio signal.
  • the audio signal may include a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
  • the channel audio signal may be a generally used channel audio signal that is allocated to a channel of an arbitrary reproduction device when reproduced.
  • the channel audio signal may be a signal not varied by the rendering information.
  • the channel audio signal may be expressed by a vector stream with respect to an N-number of channel audio signals using Equation 1.
  • $X^{ch} = [x_1^{ch}, x_2^{ch}, \ldots, x_{N-1}^{ch}]^T$ [Equation 1]
  • a particular audio signal among a plurality of audio signals may be determined as the object audio signal and used as the target of rendering.
  • the object audio signal may be a signal that may be defined in a predetermined spot through geometry analysis of the reproduction device.
  • the object audio signal may be expressed by a matrix constituted by vector streams with respect to an M-number of object audio signals using Equation 2.
  • $X^{obj} = [x_1^{obj}, x_2^{obj}, \ldots, x_{M-1}^{obj}]^T$ [Equation 2]
  • Equation 2 may be used when rendering is performed independently from location information and delay information related to the object audio signal.
  • the object audio signal may be expressed by the matrix because each object audio signal may include a plurality of channel audio signals.
  • the object audio signal may be expressed by Equation 3.
  • $x_1^{obj} = [x_l^{obj,1}, x_r^{obj,1}]$ [Equation 3]
  • a reverberation signal of the object audio signal is a reverberation signal applied to the object audio signal, which expresses the sense of the sound field of the object audio signal.
  • the reverberation signal of the object audio signal may include reverberation signals of the M-number of object audio signals, corresponding to the object audio signal.
  • the reverberation signal of the object audio signal may be expressed by Equation 4.
  • $X^{rev} = [x_1^{rev}, x_2^{rev}, \ldots, x_{M-1}^{rev}]^T$ [Equation 4]
  • the reverberation signal of the object audio signal may include a plurality of channel audio signals.
  • the reverberation signal of the object audio signal that includes five channels of a 5.1-channel layout may be expressed by Equation 5.
  • $x_1^{rev} = [x_l^{rev,1}, x_r^{rev,1}, x_c^{rev,1}, x_{ls}^{rev,1}, x_{rs}^{rev,1}]^T$ [Equation 5]
  • the audio signal encoding unit 202 may encode the audio signal by including a reverberation signal having various layouts with respect to the object audio signal.
  • the bitstream transmission unit 203 may convert the encoded audio signal into a bitstream.
  • the bitstream transmission unit 203 may generate the bitstream from the encoded audio signal and the rendering information for outputting the audio signal.
  • the rendering information may be additional data with respect to the audio signal. That is, the rendering information may be information applied to the audio signal to reproduce scene information related to a sound.
  • the rendering information may include location information of an audio object, sound pressure information of the audio object, and delay information of the audio object.
  • the rendering information may be expressed by Equation 6.
  • $R(t) = P(t)\,G_p(t) + D(t)\,G_d(t)$ [Equation 6]
  • $P(t)$ may refer to the location information of the object audio signal.
  • $G_i(t)$ may refer to the sound pressure of the object audio signal.
  • $D(t)$ may refer to the delay of the object audio signal.
  • $G_p(t)$ and $G_d(t)$ may be scale matrices for controlling the sound pressure with respect to the object audio signal.
  • t may refer to an index related to time.
  • When rendering is performed with respect to the location information and the delay information simultaneously, the rendering may be expressed by Equation 7.
  • $R(t) = PD(t)\,G_{pd}(t)$ [Equation 7]
  • the bitstream transmission unit 203 may transmit the bitstream to the audio decoding apparatus.
  • FIG. 3 is a diagram illustrating an audio decoding apparatus 301 according to an embodiment.
  • the audio decoding apparatus 301 may include a bitstream receiving unit 302, an audio signal decoding unit 303, and an audio rendering unit 304.
  • the bitstream receiving unit 302 may receive a bitstream including an encoded audio signal from an audio coding apparatus.
  • the audio signal decoding unit 303 may decode the audio signal included in the bitstream.
  • the audio signal decoding unit 303 may extract a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal from the bitstream.
  • the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal may be expressed by Equation 8, Equation 9, and Equation 10, respectively.
  • $x^{ch} = [x_1^{ch}, x_2^{ch}, \ldots, x_{N-1}^{ch}]^T$ [Equation 8]
  • $x^{obj} = [x_1^{obj}, x_2^{obj}, \ldots, x_{M-1}^{obj}]^T$ [Equation 9]
  • $x^{rev} = [x_1^{rev}, x_2^{rev}, \ldots, x_{M-1}^{rev}]^T$ [Equation 10]
  • the audio rendering unit 304 may render the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal, based on the rendering information included in the bitstream.
  • the audio rendering unit 304 may construct a sound scene based on scene information related to the sound of the rendering information.
  • the principle by which the audio rendering unit 304 renders the audio signal may be expressed by Equation 11.
  • A process of applying a first term of Equation 11 will be described.
  • the sound pressure of the object audio signal may be controlled.
  • the process of controlling the object audio signal may be expressed by Equation 12.
  • $x'^{obj}$, with the sound pressure controlled, may be allocated by a sound image localization matrix $P(t)$ to the speaker positions of a reproduction device at which output is actually performed.
  • Elements of the sound image localization matrix P(t) may be expressed by gain values of the sound pressure.
  • the gain value may be a real number between 0 and 1.
  • $x'^{obj}$ may be applied to the sound image localization matrix as in Equation 13.
  • In Equation 13, when the object audio signal $x_j^{obj}$ includes a J-number of layouts, the object audio signal $x_j^{obj}$ may be expressed by Equation 14.
  • $x_j^{obj} = [x_0^{obj}, \ldots, x_{J-1}^{obj}]^T$ [Equation 14]
  • The calculation process of each element of the sound image localization matrix may be described through Equation 15.
  • A signal output by the sound image localization matrix $P(t)$ may be expressed by Equation 16.
  • A second term of Equation 11 may perform matrix calculation of a same dimension.
  • the matrix calculation of the same dimension may be expressed by Equation 17.
  • the object audio signal $x_j^{obj}$ of Equation 17 including the J-number of layouts may be expressed by Equation 18.
  • Because the delay calculation process of the object audio signal cannot be expressed through matrix multiplication, unlike the application of the sound image localization matrix, the delay calculation process may be expressed using an operator $\otimes$.
  • a signal output through the delay calculation matrix D(t) may be expressed by Equation 19.
  • the audio rendering unit 304 may apply the sound image localization matrix and the delay calculation matrix independently.
  • a matrix PD(t) may be expressed using Equation 20.
  • the audio rendering unit 304 may extract a result as shown in Equation 21.
  • the audio rendering unit 304 may allocate the object audio signal to a channel signal which may be output, using the foregoing equation. In addition, the audio rendering unit 304 may combine the allocated object audio signal with the decoded channel audio signal. Additionally, the audio rendering unit 304 may generate an output signal to be finally output.
  • the audio rendering unit 304 may render the reverberation signal of the object audio signal as shown in Equation 22 or Equation 23.
  • $R(t) \otimes X^{rev} = [P(t)\,G_p(t) + D(t)\,G_d(t)] \otimes X^{rev} = P(t)\,[G_p(t) \otimes X^{rev}] + D(t) \otimes [G_d(t) \otimes X^{rev}]$ [Equation 22]
  • $R(t) \otimes X^{rev} = PD(t)\,G_{pd}(t) \otimes X^{rev}$ [Equation 23]
  • Rendering of the reverberation signal of the object audio signal using Equation 22 and Equation 23 may follow the same process as rendering of the object audio signal.
  • the sound scene with higher reality may be implemented.
  • the output signal to be finally output may be an integrated signal of the rendered object audio signal, the reverberation signal of the rendered object audio signal, and the decoded channel audio signal.
  • the output signal may be expressed by Equation 24.
  • $y^{ch} = x'^{ch} + R_{obj}(t) \otimes X^{obj} + R_{rev}(t) \otimes X^{rev}$ [Equation 24]
  • In Equation 24, the rendering information is separated into $R_{obj}(t)$ and $R_{rev}(t)$. That is, the information on rendering the object audio signal and the information on rendering the reverberation signal of the object audio signal may be transmitted separately. Therefore, Equation 24 shows that $R_{obj}(t)$ and $R_{rev}(t)$ are to be transmitted as the rendering information.
  • the decoded channel audio signal is denoted by $x'^{ch}$ because it is expressed in the form of a downmixed signal when the number of channels for final output does not match the number of decoded channel audio signals.
  • $x^{ch}$ may be converted into $x'^{ch}$ through a downmix matrix. That is, the row dimension of $R_{obj}(t)$ and $R_{rev}(t)$ may also be K.
  • When the number of the decoded channel audio signals is N and the number of the output signals is K, the downmixing process may be expressed by Equation 26.
  • $\mathrm{DMX}(t)\,[x^{ch} + R_{obj}(t) \otimes X^{obj} + R_{rev}(t) \otimes X^{rev}]$ [Equation 27]
  • the output signal may be downmixed by using DMX(t).
  • the time index t may vary according to the time information of DMX(t).
  • the audio coding apparatus 101 and the audio decoding apparatus 102 may fully reflect a content production intention of an original sound engineer, using the reverberation signal of the object audio signal corresponding to the object audio signal.
  • the audio coding apparatus 101 and the audio decoding apparatus 102 may control the reverberation signal of the object audio signal. Therefore, the audio coding apparatus 101 and the audio decoding apparatus 102 may include rendering information corresponding to the reverberation signal of the object audio signal, for additional control of the reverberation signal.
  • FIG. 4 is a diagram illustrating the audio coding apparatus of FIG. 2 in detail.
  • the audio coding apparatus may include an audio signal encoding unit 401 and a bitstream transmission unit 402 .
  • the audio signal encoding unit 401 may receive a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
  • the audio signal encoding unit 401 may implement a sound scene of a higher quality by receiving the reverberation signal of the object audio signal.
  • the audio signal encoding unit 401 may encode the received channel audio signal, object audio signal, and reverberation signal of the object audio signal into an audio signal.
  • the audio coding apparatus may receive rendering information 403 .
  • the audio coding apparatus may include a block for converting the rendering information 403 into a binary form.
  • the audio signal encoding unit 401 may encode the audio signal including the channel audio signal, the object audio signal, the reverberation signal of the object audio signal, and the rendering information 403.
  • the bitstream transmission unit 402 may convert the audio signal into a bitstream, and transmit the bitstream to the audio decoding apparatus.
  • the bitstream may include the audio signal including the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal, and the rendering information 403 .
  • the bitstream transmission unit 402 may transmit the bitstream to generate multichannel scene information.
  • the multichannel scene information may be generated based on the rendering information 403 .
  • the rendering information 403 may be used as additional data with respect to the reverberation signal of the object audio signal.
  • FIG. 5 is a diagram illustrating the audio decoding apparatus of FIG. 3 in detail.
  • the audio decoding apparatus may include a bitstream receiving unit 501 , an audio signal decoding unit 502 , and an audio rendering unit 503 .
  • the bitstream receiving unit 501 may receive a bitstream from an audio coding apparatus.
  • the received bitstream may include the audio signal and the rendering information.
  • the audio signal decoding unit 502 may decode the audio signal. That is, the audio signal decoding unit 502 may extract the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal included in the audio signal.
  • the audio rendering unit 503 may perform rendering with respect to the decoded channel audio signal, object audio signal, and reverberation signal of the object audio signal.
  • the object audio signal may be rendered based on the rendering process of FIG. 3 .
  • the reverberation signal of the object audio signal may be rendered according to an index of the corresponding object audio signal.
  • the reverberation signal of the object audio signal may be controlled in the same manner as the object audio signal is controlled, thereby providing a more realistic sound image.
  • the audio rendering unit 503 may generate the output signal by rendering the decoded channel audio signal, object audio signal, and reverberation signal of the object audio signal.
  • the output signal may include the rendered object audio signal, the reverberation signal of the rendered object audio signal, and the decoded channel audio signal.
  • the output signal may be output to channels of the multichannel audio signal.
  • FIG. 6 is a diagram illustrating a configuration of rendering information 600 .
  • the rendering information 600 may be expressed in a matrix form.
  • Each matrix of the rendering information 600 may be expressed by a substitute value to express the rendering information.
  • location information of the object may be expressed by angles of a horizontal plane and a vertical plane.
  • a matrix value and a gain value related to delay information may be substituted by a value indicating a distance.
  • Because the rendering information 600 may be input in various forms and is used as additional data of the reverberation signal of the object audio signal, it needs to be converted into matrix values that can be applied to the rendered object audio signal and the reverberation signal of the rendered object audio signal.
  • FIG. 7 is a diagram illustrating an audio coding method according to an embodiment.
  • an audio coding apparatus may receive a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
  • the channel audio signal may be a generally used channel audio signal allocated to a channel of a predetermined reproduction device during reproduction.
  • the object audio signal may be a particular audio signal defined among a plurality of audio signals and used as the target of rendering.
  • the reverberation signal of the object audio signal may be applied to the object audio signal and express the sense of the sound field of the object audio signal.
  • the audio coding apparatus may encode the received channel audio signal, the object audio signal, and the reverberation signal of the object audio signal into an audio signal.
  • the audio coding apparatus may convert the audio signal into a bitstream.
  • the bitstream may include the audio signal including the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal, and rendering information 403 .
  • the audio coding apparatus may transmit the bitstream to generate multichannel scene information.
  • FIG. 8 is a diagram illustrating an audio decoding method according to an embodiment.
  • an audio decoding apparatus may receive a bitstream from an audio coding apparatus.
  • the received bitstream may include an audio signal and rendering information.
  • the audio decoding apparatus may decode the audio signal, thereby extracting a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal included in the audio signal.
  • the audio decoding apparatus may render the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal based on the rendering information included in the bitstream.
  • the reverberation signal of the object audio signal may be rendered according to an index of the corresponding object audio signal.
  • the reverberation signal of the object audio signal may be controlled in the same manner as the object audio signal is controlled, thereby providing a more realistic sound image.
  • the audio decoding apparatus may generate an output signal by rendering the decoded channel audio signal, object audio signal, and reverberation signal of the object audio signal.
  • the above-described embodiments of the present invention may be recorded in non-transitory computer-readable media including program instructions to implement various operations embodied by a computer.
  • the media may also include, alone or in combination with the program instructions, data files, data structures, and the like.
  • the program instructions recorded on the media may be those specially designed and constructed for the purposes of the embodiments, or they may be of the kind well-known and available to those having skill in the computer software arts.
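
The rendering relation of Equation 6 and Equation 22, a localization (gain) term plus a delay term, can be illustrated with a minimal Python sketch. It is not the patented implementation: Equations 12 to 21, which define the exact matrices and the delay operator, are not reproduced in this text. The sketch therefore assumes a mono object (or reverberation) signal, one panning gain and one integer sample delay per output channel, and scalar scale factors. The names `delay`, `render_object`, `pan_gains`, `g_p`, `delays`, and `g_d` are hypothetical.

```python
def delay(signal, n):
    """Shift `signal` right by n samples, zero-padding and keeping its length."""
    if n <= 0:
        return list(signal)
    return ([0.0] * n + list(signal))[:len(signal)]


def render_object(x, pan_gains, g_p, delays, g_d):
    """Sketch of R(t) (x) X = P[G_p (x) X] + D (x) [G_d (x) X] for one mono signal.

    x         -- list of samples of one object (or reverberation) signal
    pan_gains -- assumed stand-in for P(t): one gain in [0, 1] per output channel
    g_p, g_d  -- assumed scalar stand-ins for the scale matrices G_p(t), G_d(t)
    delays    -- assumed stand-in for D(t): one delay in samples per output channel
    Returns one list of samples per output channel.
    """
    outputs = []
    for p, d in zip(pan_gains, delays):
        # Localization term: scale the signal, then apply the panning gain.
        direct = [p * g_p * s for s in x]
        # Delay term: scale the signal, then shift it in time.
        delayed = delay([g_d * s for s in x], d)
        outputs.append([a + b for a, b in zip(direct, delayed)])
    return outputs
```

For example, `render_object([1.0, 0.0, 0.0, 0.0], [1.0, 0.5], 1.0, [0, 2], 0.5)` returns `[[1.5, 0.0, 0.0, 0.0], [0.5, 0.0, 0.5, 0.0]]`. The delay term moves samples along the time axis, which is why the text treats it with a separate operator rather than an ordinary matrix product.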

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Stereophonic System (AREA)

Abstract

An audio coding and decoding apparatus is disclosed. The audio coding apparatus may include an audio signal encoding unit to encode an audio signal; and a bitstream transmission unit to convert the audio signal into a bitstream and transmit the bitstream, wherein the audio signal comprises a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.

Description

TECHNICAL FIELD
The present invention relates to an audio coding and decoding apparatus using a reverberation signal of an object audio signal, and more particularly, to an audio coding and decoding apparatus which encodes and decodes audio using an audio signal including a reverberation signal of an object audio signal.
BACKGROUND ART
According to conventional methods, moving picture expert group (MPEG) spatial audio object coding (SAOC) and Dolby Atmos construct a sound scene using an input signal or an object, respectively.
MPEG SAOC considers an input audio signal as an object and receives the input audio signal. In addition, MPEG SAOC constructs the sound scene only with respect to input rendering information. In particular, MPEG SAOC is capable of transmission at a low bit rate and uses a spatial audio coding method as a high compression method.
Dolby Atmos refers to a multichannel audio format for theatres. Dolby Atmos transmits or stores a channel signal called ‘Beds’ and an object signal called ‘object’ and constructs the sound scene using metadata.
However, since the foregoing conventional methods construct the sound scene using the input audio signal or the object signal, in some cases, a sound scene not corresponding to an intention of content according to the input audio signal or the object signal may be included. This is because only base signals for constructing the sound scene are included.
Accordingly, there is a need for a method of constructing a more accurate sound scene corresponding to the intention of content according to the input audio signal or the object signal.
DISCLOSURE OF INVENTION
Technical Goals
An aspect of the present invention provides an audio coding and decoding apparatus capable of reproducing an audio signal more efficiently and realistically, using a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
Another aspect of the present invention provides an audio coding and decoding apparatus capable of reconstructing a realistic sound scene according to a reverberation signal of an object audio signal, by rendering the object audio signal and the reverberation signal of the object audio signal.
Technical Solutions
According to an aspect of the present invention, there is provided an audio coding apparatus including an audio signal encoding unit to encode an audio signal, and a bitstream transmission unit to convert the audio signal into a bitstream and transmit the bitstream, wherein the audio signal comprises a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
According to an aspect of the present invention, there is provided an audio decoding apparatus including a bitstream receiving unit to receive a bitstream including an encoded audio signal, and an audio signal decoding unit to extract a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal from the bitstream by decoding the audio signal included in the bitstream.
The audio decoding apparatus may further include an audio rendering unit to render the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal based on the rendering information included in the bitstream.
According to an aspect of the present invention, there is provided an audio coding method including encoding an audio signal, and converting the audio signal into a bitstream and transmitting the bitstream, wherein the audio signal comprises a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
According to an aspect of the present invention, there is provided an audio decoding method including receiving a bitstream including an encoded audio signal, extracting a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal from the bitstream by decoding the audio signal included in the bitstream, and rendering the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal based on rendering information included in the bitstream.
Effects of Invention
According to an embodiment, an audio coding and decoding apparatus may be capable of reproducing an audio signal more efficiently and realistically, by using a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal, in reproducing a multichannel audio signal.
According to an embodiment, an audio coding and decoding apparatus may be capable of reconstructing a realistic sound scene according to a reverberation signal of an object audio signal, by rendering the object audio signal and the reverberation signal of the object audio signal.
BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 is a diagram illustrating an audio coding and decoding apparatus according to an embodiment.
FIG. 2 is a diagram illustrating an audio coding apparatus according to an embodiment.
FIG. 3 is a diagram illustrating an audio decoding apparatus according to an embodiment.
FIG. 4 is a diagram illustrating the audio coding apparatus of FIG. 2 in detail.
FIG. 5 is a diagram illustrating the audio decoding apparatus of FIG. 3 in detail.
FIG. 6 is a diagram illustrating a configuration of rendering information.
FIG. 7 is a diagram illustrating an audio coding method according to an embodiment.
FIG. 8 is a diagram illustrating an audio decoding method according to an embodiment.
BEST MODE FOR CARRYING OUT THE INVENTION
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
FIG. 1 is a diagram illustrating an audio coding and decoding apparatus according to an embodiment.
Referring to FIG. 1, an audio coding apparatus 101 may receive an audio signal which includes a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal. Here, the audio coding apparatus 101 may receive the audio signal by considering the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal as an object. The audio coding apparatus 101 needs to receive the audio signal including the foregoing three types of audio signal.
In addition, the audio coding apparatus 101 may receive rendering information. The rendering information, as additional data, may include rendering information based on a gain value and rendering information related to a time delay. When the audio signal is output, the rendering information may define a sound scene corresponding to the audio signal.
The audio coding apparatus 101 may encode the received audio signal, and convert the rendering information into a bit string. For example, the audio coding apparatus 101 may perform binary conversion to convert the rendering information into the bit string. In addition, the audio coding apparatus 101 may encode the audio signal and the rendering information simultaneously. Here, the audio coding apparatus 101 may include a block for converting the rendering information into the bit string.
The audio coding apparatus 101 may convert the encoded audio signal into the bitstream. The audio coding apparatus 101 may include a block capable of converting the rendering information into the bit string. The audio coding apparatus 101 may convert the rendering information and the encoded audio signal into the bitstream. The bitstream may include the rendering information and the encoded audio signal. In addition, the audio coding apparatus 101 may transmit the bitstream to an audio decoding apparatus 102.
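The binary conversion of the rendering information and its combination with the encoded audio signal can be sketched as follows. This is a minimal sketch under stated assumptions: the byte layout (a one-byte object count followed by little-endian 32-bit floats for one gain and one delay per object, and a four-byte length prefix) and the function names are hypothetical and are not specified in this document.

```python
import struct

def pack_rendering_info(gains, delays):
    """Serialize per-object gain and delay values into bytes (hypothetical layout)."""
    if len(gains) != len(delays):
        raise ValueError("gains and delays must have the same length")
    header = struct.pack("<B", len(gains))  # one byte: number of objects
    body = b"".join(struct.pack("<ff", g, d) for g, d in zip(gains, delays))
    return header + body

def unpack_rendering_info(data):
    """Inverse of pack_rendering_info."""
    (count,) = struct.unpack_from("<B", data, 0)
    gains, delays = [], []
    for k in range(count):
        g, d = struct.unpack_from("<ff", data, 1 + 8 * k)
        gains.append(g)
        delays.append(d)
    return gains, delays

def build_bitstream(encoded_audio, rendering_bytes):
    """Prefix the rendering bytes with their length, then append the encoded audio."""
    return struct.pack("<I", len(rendering_bytes)) + rendering_bytes + encoded_audio

# Placeholder bytes stand in for the output of a real audio encoder.
payload = pack_rendering_info([0.5, 1.0], [0.0, 0.25])
bitstream = build_bitstream(b"\x01\x02\x03", payload)
```

A decoder following the same assumed layout would read the length prefix, pass the rendering bytes to `unpack_rendering_info`, and hand the remaining bytes to the audio decoder.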
The audio decoding apparatus 102 may receive the bitstream from the audio coding apparatus 101. The audio decoding apparatus 102 may extract the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal from the bitstream by decoding the audio signal included in the received bitstream. Additionally, the audio decoding apparatus 102 may render the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal, based on the rendering information included in the bitstream. The audio decoding apparatus 102 may output a rendered multichannel audio signal.
FIG. 2 is a diagram illustrating an audio coding apparatus 201 according to an embodiment.
Referring to FIG. 2, the audio coding apparatus 201 may include an audio signal encoding unit 202 and a bitstream transmission unit 203.
The audio signal encoding unit 202 may encode the audio signal. The audio signal may include a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal.
The channel audio signal may be a generally used channel audio signal that is allocated to a channel of an arbitrary reproduction device when reproduced. Here, the channel audio signal may be a signal not varied by the rendering information. The channel audio signal may be expressed by a vector stream with respect to an N-number of channel audio signals using Equation 1.
$$X_{ch} = \left[x_1^{ch}, x_2^{ch}, \ldots, x_{N-1}^{ch}\right]^T$$ [Equation 1]
The object audio signal may be a particular audio signal, selected from among a plurality of audio signals, that is used as a subject of rendering. Here, the object audio signal may be a signal that may be defined in a predetermined spot through geometry analysis of the reproduction device. The object audio signal may be expressed by a matrix constituted by vector streams with respect to an M-number of object audio signals using Equation 2.
$$X_{obj} = \left[x_1^{obj}, x_2^{obj}, \ldots, x_{M-1}^{obj}\right]^T$$ [Equation 2]
Here, Equation 2 may be used when rendering is performed independently from location information and delay information related to the object audio signal.
Here, the object audio signal may be expressed by the matrix because each object audio signal may include a plurality of channel audio signals. For example, when a first object audio signal $x_1^{obj}$ of the object audio signal is a stereo signal, the object audio signal may be expressed by Equation 3.
$$x_1^{obj} = \left[x_l^{obj,1}, x_r^{obj,1}\right]$$ [Equation 3]
A reverberation signal of the object audio signal is a reverberation signal applied to the object audio signal, which expresses a sound field feeling of the object audio signal. The reverberation signal of the object audio signal may include reverberation signals of the M-number of object audio signals, corresponding to the object audio signal. The reverberation signal of the object audio signal may be expressed by Equation 4.
X rev =[x 1 rev ,x 2 rev , . . . ,x M-1 rev]T[Equation 4]
In addition, in the same manner as the object audio signal, the reverberation signal of the object audio signal may include a plurality of channel audio signals. For example, the reverberation signal of the object audio signal including the five channels of a 5.1-channel layout may be expressed by Equation 5.
$$x_1^{rev} = \left[x_l^{rev,1}, x_r^{rev,1}, x_c^{rev,1}, x_{ls}^{rev,1}, x_{rs}^{rev,1}\right]^T$$ [Equation 5]
Here, the audio signal encoding unit 202 may encode the audio signal by including a reverberation signal having various layouts with respect to the object audio signal.
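The relationship between the three signal types in Equations 1 to 5 can be sketched as a small data structure. This is only an illustrative sketch: the class names, field names, and channel labels (`l`, `r`, `c`, `ls`, `rs`) are assumptions chosen to mirror the subscripts in Equations 3 and 5, and the sample values are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Samples = List[float]

@dataclass
class ObjectSignal:
    """One object audio signal and its reverberation, keyed by layout channel label."""
    dry: Dict[str, Samples]      # e.g. stereo object: {"l": [...], "r": [...]}
    reverb: Dict[str, Samples]   # the reverberation may use a different layout

@dataclass
class AudioScene:
    """Container for the channel signals and the indexed object/reverb pairs."""
    channels: List[Samples] = field(default_factory=list)
    objects: List[ObjectSignal] = field(default_factory=list)

# A stereo object whose reverberation uses five channels of a 5.1 layout.
scene = AudioScene(
    channels=[[0.0, 0.1], [0.0, -0.1]],
    objects=[
        ObjectSignal(
            dry={"l": [0.2, 0.3], "r": [0.2, 0.1]},
            reverb={k: [0.0, 0.0] for k in ("l", "r", "c", "ls", "rs")},
        )
    ],
)
```

Keeping each reverberation signal in the same record as its object preserves the index correspondence that the rendering stage relies on.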
The bitstream transmission unit 203 may convert the encoded audio signal into a bitstream. The bitstream transmission unit 203 may generate the bitstream from the encoded audio signal and the rendering information for outputting the audio signal. The rendering information may be additional data with respect to the audio signal. That is, the rendering information may be information applied to the audio signal to reproduce scene information related to a sound. The rendering information may include location information of an audio object, sound pressure information of the audio object, and delay information of the audio object. The rendering information may be expressed by Equation 6.
$$R(t) = P(t)G_p(t) + D(t)G_d(t)$$ [Equation 6]
P(t) may refer to the location information of the object audio signal. D(t) may refer to the delay of the object audio signal. $G_p(t)$ and $G_d(t)$ may refer to the sound pressure of the object audio signal; that is, $G_p(t)$ and $G_d(t)$ may be scale matrices for controlling the sound pressure with respect to the object audio signal. In addition, t may refer to an index related to time.
When rendering is performed with respect to the location information and the delay information simultaneously, the rendering may be expressed by Equation 7.
$$R(t) = PD(t)G_{pd}(t)$$ [Equation 7]
The bitstream transmission unit 203 may transmit the bitstream to the audio decoding apparatus.
FIG. 3 is a diagram illustrating an audio decoding apparatus 301 according to an embodiment.
Referring to FIG. 3, the audio decoding apparatus 301 may include a bitstream receiving unit 302, an audio signal decoding unit 303, and an audio rendering unit 304.
The bitstream receiving unit 302 may receive a bitstream including an encoded audio signal from an audio coding apparatus.
The audio signal decoding unit 303 may decode the audio signal included in the bitstream. In detail, the audio signal decoding unit 303 may extract a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal from the bitstream. For example, the signals extracted by the audio signal decoding unit 303 may be expressed by Equation 8, Equation 9, and Equation 10, corresponding to the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal, respectively.
$$x_{ch} = \left[x_1^{ch}, x_2^{ch}, \ldots, x_{N-1}^{ch}\right]^T$$ [Equation 8]
$$x_{obj} = \left[x_1^{obj}, x_2^{obj}, \ldots, x_{M-1}^{obj}\right]^T$$ [Equation 9]
$$x_{rev} = \left[x_1^{rev}, x_2^{rev}, \ldots, x_{M-1}^{rev}\right]^T$$ [Equation 10]
The audio rendering unit 304 may render the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal, based on the rendering information included in the bitstream. The audio rendering unit 304 may construct a sound scene based on scene information related to the sound of the rendering information.
In detail, the audio rendering unit 304 may express a principle of rendering of the audio signal by Equation 11.
$$R(t)\cdot X_{obj} = \left[P(t)G_p(t) + D(t)G_d(t)\right]\cdot X_{obj} = \underbrace{P(t)\left[G_p(t)\cdot X_{obj}\right]}_{1} + \underbrace{D(t)\circ\left[G_d(t)\cdot X_{obj}\right]}_{2}$$ [Equation 11]
A process of applying a first term of Equation 11 will be described. The sound pressure of the object audio signal may be controlled. The process of controlling the object audio signal may be expressed by Equation 12.
$$G_p(t)\cdot X_{obj} = \begin{bmatrix} g_{p,0} & & 0 \\ & \ddots & \\ 0 & & g_{p,M-1} \end{bmatrix} \begin{bmatrix} x_0^{obj} \\ \vdots \\ x_{M-1}^{obj} \end{bmatrix} = \begin{bmatrix} g_{p,0}\cdot x_0^{obj} \\ \vdots \\ g_{p,M-1}\cdot x_{M-1}^{obj} \end{bmatrix} = X'_{obj}$$ [Equation 12]
$X'_{obj}$, with the sound pressure controlled, may be allocated by a sound image localization matrix P(t) to the speaker positions of the reproduction device at which output is actually performed. Elements of the sound image localization matrix P(t) may be expressed by gain values of the sound pressure. Here, each gain value may be a real number between 0 and 1. In addition, when the number of channels capable of outputting is N, $X'_{obj}$ may be applied to the sound image localization matrix as in Equation 13.
$$P(t)G_p(t)\cdot X_{obj} = \begin{bmatrix} p_{0,0} & p_{0,1} & \cdots & p_{0,M-1} \\ p_{1,0} & \ddots & & \vdots \\ \vdots & & p_{i,j} & \vdots \\ p_{N-1,0} & \cdots & \cdots & p_{N-1,M-1} \end{bmatrix} \begin{bmatrix} g_{p,0}\cdot x_0^{obj} \\ \vdots \\ g_{p,j}\, x_j^{obj} \\ \vdots \\ g_{p,M-1}\cdot x_{M-1}^{obj} \end{bmatrix}$$ [Equation 13]
In Equation 13, when the object audio signal $x_j^{obj}$ includes a J-number of layouts, the object audio signal $x_j^{obj}$ may be expressed by Equation 14.
$$x_j^{obj} = \left[x_0^{obj}, \ldots, x_{J-1}^{obj}\right]^T$$ [Equation 14]
The calculation process for each element of the sound image localization matrix may be described through Equation 15.
$$p_{i,j}\cdot x_j^{obj} = \begin{bmatrix} p_0^{i,j} & \cdots & p_{L-1}^{i,j} \end{bmatrix} \begin{bmatrix} x_0^{obj,j} \\ \vdots \\ x_{L-1}^{obj,j} \end{bmatrix} = \sum_{l=0}^{L-1} p_l^{i,j}\cdot x_l^{obj,j}$$ [Equation 15]
Therefore, a signal output by the sound image localization matrix P(t) may be expressed by Equation 16.
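$$P(t)G_p(t)\cdot X_{obj} = \begin{bmatrix} p_{0,0} & p_{0,1} & \cdots & p_{0,M-1} \\ p_{1,0} & \ddots & & \vdots \\ \vdots & & p_{i,j} & \vdots \\ p_{N-1,0} & \cdots & \cdots & p_{N-1,M-1} \end{bmatrix} \begin{bmatrix} g_{p,0}\cdot x_0^{obj} \\ \vdots \\ g_{p,j}\, x_j^{obj} \\ \vdots \\ g_{p,M-1}\cdot x_{M-1}^{obj} \end{bmatrix} = \begin{bmatrix} g_{p,0}\left(\sum_{l=0}^{L-1} p_l^{0,0}\cdot x_l^{obj,0} + \cdots + \sum_{l=0}^{L-1} p_l^{0,M-1}\cdot x_l^{obj,M-1}\right) \\ \vdots \\ g_{p,j}\left(\sum_{l=0}^{L-1} p_l^{i,0}\cdot x_l^{obj,0} + \cdots + \sum_{l=0}^{L-1} p_l^{i,M-1}\cdot x_l^{obj,M-1}\right) \\ \vdots \\ g_{p,M-1}\left(\sum_{l=0}^{L-1} p_l^{N-1,0}\cdot x_l^{obj,0} + \cdots + \sum_{l=0}^{L-1} p_l^{N-1,M-1}\cdot x_l^{obj,M-1}\right) \end{bmatrix}$$ [Equation 16]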
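The gain control of Equation 12 followed by the sound image localization of Equations 13 and 15 can be sketched in a few lines. This is a sketch under assumptions: signals are plain lists of samples, `objects[j][l]` is layout channel l of object j, `P[i][j][l]` is the localization gain from that channel to output channel i, and each gain is applied to its own object before mixing. The function names and numerical values are hypothetical.

```python
def apply_gains(gains, objects):
    """Equation 12 pattern: scale every layout channel of object j by gains[j]."""
    return [[[g * s for s in chan] for chan in obj] for g, obj in zip(gains, objects)]

def localize(P, objects):
    """Equations 13 and 15 pattern: out[i][t] = sum over j, l of P[i][j][l] * objects[j][l][t]."""
    n_samples = len(objects[0][0])
    out = []
    for row in P:                                   # one row per output channel i
        mixed = [0.0] * n_samples
        for weights, obj in zip(row, objects):      # object j
            for w, chan in zip(weights, obj):       # layout channel l
                for t, s in enumerate(chan):
                    mixed[t] += w * s
        out.append(mixed)
    return out

# Two mono objects (M = 2, L = 1) rendered to two output channels (N = 2).
objects = [[[1.0, 2.0]], [[4.0, 0.0]]]
gains = [0.5, 1.0]
P = [[[1.0], [0.5]],
     [[0.0], [0.5]]]
rendered = localize(P, apply_gains(gains, objects))
```

Here the first object is sent only to output channel 0, while the second object is split equally between both output channels.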
The second term of Equation 11 may be computed by a matrix calculation of the same dimensions. The matrix calculation may be expressed by Equation 17.
$$D(t)G_d(t)\cdot X_{obj} = \begin{bmatrix} d_{0,0} & d_{0,1} & \cdots & d_{0,M-1} \\ d_{1,0} & \ddots & & \vdots \\ \vdots & & d_{i,j} & \vdots \\ d_{N-1,0} & \cdots & \cdots & d_{N-1,M-1} \end{bmatrix} \begin{bmatrix} g_{d,0}\cdot x_0^{obj} \\ \vdots \\ g_{d,j}\, x_j^{obj} \\ \vdots \\ g_{d,M-1}\cdot x_{M-1}^{obj} \end{bmatrix}$$ [Equation 17]
In addition, the object audio signal xj obj of Equation 17 including the J-number of layouts may be expressed by Equation 18.
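$$d_{i,j}\circ x_j^{obj} = \begin{bmatrix} d_0^{i,j} & \cdots & d_{L-1}^{i,j} \end{bmatrix} \circ \begin{bmatrix} x_0^{obj,j} \\ \vdots \\ x_{L-1}^{obj,j} \end{bmatrix} = \sum_{l=0}^{L-1} d_l^{i,j}\circ x_l^{obj,j} = \sum_{l=0}^{L-1} x_l^{obj,j}\left(t - d_l^{i,j}\right)$$ [Equation 18]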
Here, since the delay calculation process of the object audio signal cannot be expressed through matrix multiplication, different from the sound image localization matrix application calculation, the delay calculation process may be expressed using an operator ∘. In addition, a signal output through the delay calculation matrix D(t) may be expressed by Equation 19.
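$$D(t)G_d(t)\cdot X_{obj} = \begin{bmatrix} d_{0,0} & d_{0,1} & \cdots & d_{0,M-1} \\ d_{1,0} & \ddots & & \vdots \\ \vdots & & d_{i,j} & \vdots \\ d_{N-1,0} & \cdots & \cdots & d_{N-1,M-1} \end{bmatrix} \circ \begin{bmatrix} g_{d,0}\cdot x_0^{obj} \\ \vdots \\ g_{d,j}\, x_j^{obj} \\ \vdots \\ g_{d,M-1}\cdot x_{M-1}^{obj} \end{bmatrix} = \begin{bmatrix} g_{d,0}\left(\sum_{l=0}^{L-1} x_l^{obj,0}\left(t - d_l^{0,0}\right) + \cdots + \sum_{l=0}^{L-1} x_l^{obj,M-1}\left(t - d_l^{0,M-1}\right)\right) \\ \vdots \\ g_{d,j}\left(\sum_{l=0}^{L-1} x_l^{obj,0}\left(t - d_l^{i,0}\right) + \cdots + \sum_{l=0}^{L-1} x_l^{obj,M-1}\left(t - d_l^{i,M-1}\right)\right) \\ \vdots \\ g_{d,M-1}\left(\sum_{l=0}^{L-1} x_l^{obj,0}\left(t - d_l^{N-1,0}\right) + \cdots + \sum_{l=0}^{L-1} x_l^{obj,M-1}\left(t - d_l^{N-1,M-1}\right)\right) \end{bmatrix}$$ [Equation 19]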
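Because the delay operation shifts a signal in time rather than scaling it, it may be easier to follow as code than as a matrix product. The sketch below assumes whole-sample, non-negative delays and zero padding at the start of the signal, and omits the gain step for brevity; none of these choices are prescribed by this document, and the function names are hypothetical.

```python
def delay(signal, d):
    """Return signal(t - d) for an integer delay d >= 0, zero-padded at the start."""
    if d < 0:
        raise ValueError("delay must be non-negative")
    return ([0.0] * d + list(signal))[:len(signal)]

def apply_delays(D, objects):
    """Delay pattern of Equation 18: out[i][t] = sum over j, l of objects[j][l][t - D[i][j][l]]."""
    n_samples = len(objects[0][0])
    out = []
    for row in D:                                   # one row per output channel i
        mixed = [0.0] * n_samples
        for delays, obj in zip(row, objects):       # object j
            for d, chan in zip(delays, obj):        # layout channel l
                for t, s in enumerate(delay(chan, d)):
                    mixed[t] += s
        out.append(mixed)
    return out

# One mono object sent to two output channels with different delays.
objects = [[[1.0, 2.0, 3.0]]]
D = [[[0]], [[1]]]          # channel 0: no delay; channel 1: one-sample delay
delayed = apply_delays(D, objects)
```

A practical renderer would likely need fractional delays and block-wise processing, which this sketch does not attempt.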
The audio rendering unit 304 may apply the sound image localization matrix and the delay calculation matrix independently. When the audio rendering unit 304 applies the sound image localization matrix and the delay calculation matrix simultaneously, a matrix PD(t) may be expressed using Equation 20.
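$$R(t)\cdot X_{obj} = PD(t)G_{pd}(t)\cdot X_{obj} = \begin{bmatrix} p_{0,0}d_{0,0} & p_{0,1}d_{0,1} & \cdots & p_{0,M-1}d_{0,M-1} \\ p_{1,0}d_{1,0} & \ddots & & \vdots \\ \vdots & & p_{i,j}d_{i,j} & \vdots \\ p_{N-1,0}d_{N-1,0} & \cdots & \cdots & p_{N-1,M-1}d_{N-1,M-1} \end{bmatrix} \begin{bmatrix} g_{pd,0}\cdot x_0^{obj} \\ \vdots \\ g_{pd,j}\, x_j^{obj} \\ \vdots \\ g_{pd,M-1}\cdot x_{M-1}^{obj} \end{bmatrix}$$ [Equation 20]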
Through the calculation of Equation 20, the audio rendering unit 304 may extract a result as shown in Equation 21.
$$PD(t)G_{pd}(t)\cdot X_{obj} = \begin{bmatrix} g_{pd,0}\cdot\left(\sum_{l=0}^{L-1} p_l^{0,0}\, x_l^{obj,0}\left(t - d_l^{0,0}\right) + \cdots + \sum_{l=0}^{L-1} p_l^{0,M-1}\, x_l^{obj,M-1}\left(t - d_l^{0,M-1}\right)\right) \\ \vdots \\ g_{pd,j}\cdot\left(\sum_{l=0}^{L-1} p_l^{i,0}\, x_l^{obj,0}\left(t - d_l^{i,0}\right) + \cdots + \sum_{l=0}^{L-1} p_l^{i,M-1}\, x_l^{obj,M-1}\left(t - d_l^{i,M-1}\right)\right) \\ \vdots \\ g_{pd,M-1}\cdot\left(\sum_{l=0}^{L-1} p_l^{N-1,0}\, x_l^{obj,0}\left(t - d_l^{N-1,0}\right) + \cdots + \sum_{l=0}^{L-1} p_l^{N-1,M-1}\, x_l^{obj,M-1}\left(t - d_l^{N-1,M-1}\right)\right) \end{bmatrix}$$ [Equation 21]
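As a worked instance of Equation 21, consider a hypothetical case with a single mono object (M = 1, L = 1) rendered to two output channels (N = 2). The numerical values below are illustrative assumptions only and do not come from the specification.

```latex
\begin{aligned}
g_{pd,0} &= 0.5,\quad p_0^{0,0} = 0.8,\quad p_0^{1,0} = 0.6,\quad d_0^{0,0} = 0,\quad d_0^{1,0} = 2 \\[4pt]
PD(t)G_{pd}(t)\cdot X_{obj}
 &= \begin{bmatrix}
      0.5 \cdot 0.8\, x_0^{obj,0}(t - 0) \\
      0.5 \cdot 0.6\, x_0^{obj,0}(t - 2)
    \end{bmatrix}
  = \begin{bmatrix}
      0.4\, x_0^{obj,0}(t) \\
      0.3\, x_0^{obj,0}(t - 2)
    \end{bmatrix}
\end{aligned}
```

In this instance, the first output channel carries the object at 0.4 times its amplitude with no delay, and the second output channel carries it at 0.3 times its amplitude delayed by two time indices.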
The audio rendering unit 304 may allocate the object audio signal to a channel signal which may be output, using the foregoing equation. In addition, the audio rendering unit 304 may combine the allocated object audio signal with the decoded channel audio signal. Additionally, the audio rendering unit 304 may generate an output signal to be finally output.
The audio rendering unit 304 may render the reverberation signal of the object audio signal as shown in Equation 22 or Equation 23.
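$$R(t)\cdot X_{rev} = \left[P(t)G_p(t) + D(t)G_d(t)\right]\cdot X_{rev} = P(t)\left[G_p(t)\cdot X_{rev}\right] + D(t)\circ\left[G_d(t)\cdot X_{rev}\right]$$ [Equation 22]
$$R(t)\cdot X_{rev} = PD(t)G_{pd}(t)\cdot X_{rev}$$ [Equation 23]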
Rendering of the reverberation signal of the object audio signal using Equation 22 and Equation 23 may be performed in the same manner as rendering of the object audio signal. By rendering the reverberation signal of the object audio signal corresponding to the object audio signal, a more realistic sound scene may be implemented.
In addition, when controlling the object audio signal, the audio rendering unit 304 may control the reverberation signal of the object audio signal corresponding to the object audio signal. For example, to control the object audio signal $x_j^{obj}$ during rendering, the audio rendering unit 304 may set the gain values of Equation 11 as $g_{p,j} = g_{d,j} = 0$, or set $g_{pd,j} = 0$ in Equation 20. In addition, the audio rendering unit 304 may control the reverberation signal corresponding to the index of the object audio signal in the same manner. That is, the audio rendering unit 304 may set the gain values of Equation 22 as $g_{p,j} = g_{d,j} = 0$, or set $g_{pd,j} = 0$ in Equation 23.
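The index-based control described above, where setting the gain of object j to zero is mirrored on the reverberation signal with the same index, can be sketched as follows. The helper names and sample values are hypothetical, and mono signals are assumed for brevity.

```python
def scale_by_gain(gains, signals):
    """Scale signal j by gains[j], as a diagonal gain matrix would."""
    return [[g * s for s in sig] for g, sig in zip(gains, signals)]

def mute_index(gains, j):
    """Return a copy of gains with entry j set to zero."""
    muted = list(gains)
    muted[j] = 0.0
    return muted

x_obj = [[1.0, 2.0], [3.0, 4.0]]     # two mono object signals
x_rev = [[0.1, 0.2], [0.3, 0.4]]     # their reverberation signals, same indices
g_obj = [1.0, 1.0]
g_rev = [1.0, 1.0]

j = 1                                 # index of the object to suppress
obj_out = scale_by_gain(mute_index(g_obj, j), x_obj)
rev_out = scale_by_gain(mute_index(g_rev, j), x_rev)
```

Applying the same index to both gain lists keeps a suppressed object from leaving its reverberation audible in the scene.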
The output signal to be finally output may be an integrated signal of the rendered object audio signal, the reverberation signal of the rendered object audio signal, and the decoded channel audio signal. The output signal may be expressed by Equation 24.
$$y_{ch} = x'_{ch} + R_{obj}(t)\cdot X_{obj} + R_{rev}(t)\cdot X_{rev}$$ [Equation 24]
In Equation 24, the rendering is separated into $R_{obj}(t)$ and $R_{rev}(t)$. That is, the information on rendering the object audio signal and the information on rendering the reverberation signal of the object audio signal may be transmitted separately. Therefore, Equation 24 shows that $R_{obj}(t)$ and $R_{rev}(t)$ are to be transmitted as the rendering information.
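The integration of the three contributions into one output signal can be sketched as a per-channel, per-sample sum. This sketch assumes the three contributions are added together, as in Equation 27, and that all three have already been brought to the same number of output channels and samples. The function name and values are hypothetical.

```python
def mix_output(x_ch, rendered_obj, rendered_rev):
    """Sum the channel signal, rendered objects, and rendered reverberation per output channel."""
    if not (len(x_ch) == len(rendered_obj) == len(rendered_rev)):
        raise ValueError("all inputs must have the same number of output channels")
    return [
        [c + o + r for c, o, r in zip(ch, ob, rv)]
        for ch, ob, rv in zip(x_ch, rendered_obj, rendered_rev)
    ]

# One output channel, two samples.
y = mix_output([[1.0, 0.0]], [[0.5, 0.5]], [[0.25, 0.0]])
```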
In Equation 24, the decoded channel audio signal is denoted by $x'_{ch}$ because, when the number of channels for final output does not correspond to the number of decoded channel audio signals, the decoded channel audio signal is expressed in the form of a downmixed signal. For example, when the number of the decoded channel audio signals is N and the number of output channels, including those output through $R_{obj}(t)$ and $R_{rev}(t)$, is K, $x_{ch}$ may be converted into $x'_{ch}$ through a downmix matrix. That is, the number of rows of each of $R_{obj}(t)$ and $R_{rev}(t)$ may also be K.
Here, the downmix matrix may be expressed by Equation 25.
$$x'_{ch} = DMX(t)\cdot x_{ch}$$ [Equation 25]
Based on Equation 25, when the number of the decoded channel audio signals is N and the number of the output signals is K, the downmixing process may be expressed by Equation 26.
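$$x'_{ch} = DMX(t)\cdot x_{ch} = \begin{bmatrix} c_{0,0} & \cdots & c_{0,N-1} \\ \vdots & \ddots & \vdots \\ c_{K-1,0} & \cdots & c_{K-1,N-1} \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{bmatrix}$$ [Equation 26]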
Here, when the number of rows of each of $R_{obj}(t)$ and $R_{rev}(t)$ is also N, the output signal may be expressed by Equation 27, by applying the downmix of Equation 25 to Equation 24.
$$y_{ch} = DMX(t)\left[x_{ch} + R_{obj}(t)\cdot X_{obj} + R_{rev}(t)\cdot X_{rev}\right]$$ [Equation 27]
That is, after rendering with respect to the N-number of channel audio signals is performed, the output signal may be downmixed by using DMX(t). In addition, the time index t may be varied according to time of information of DMX(t).
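The downmix of Equation 26 is an ordinary K x N matrix applied to N channel signals, which can be sketched as follows. The coefficients are hypothetical (a centre channel folded into left and right at half amplitude) and are not taken from this document.

```python
def downmix(dmx, signals):
    """Equation 26 pattern: multiply a K x N coefficient matrix by N channel signals."""
    n_samples = len(signals[0])
    return [
        [sum(c * sig[t] for c, sig in zip(row, signals)) for t in range(n_samples)]
        for row in dmx
    ]

# N = 3 channels (left, right, centre) downmixed to K = 2 outputs.
x_ch = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
DMX = [[1.0, 0.0, 0.5],
       [0.0, 1.0, 0.5]]
x_ch_prime = downmix(DMX, x_ch)
```

In the Equation 27 form, the same `downmix` call would be applied to the N-channel sum of the channel signal and the rendered object and reverberation signals instead of to `x_ch` alone.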
The audio coding apparatus 101 and the audio decoding apparatus 102 may fully reflect a content production intention of an original sound engineer, using the reverberation signal of the object audio signal corresponding to the object audio signal. The audio coding apparatus 101 and the audio decoding apparatus 102 may control the reverberation signal of the object audio signal. Therefore, the audio coding apparatus 101 and the audio decoding apparatus 102 may include rendering information corresponding to the reverberation signal of the object audio signal, for additional control of the reverberation signal.
FIG. 4 is a diagram illustrating the audio coding apparatus of FIG. 2 in detail.
Referring to FIG. 4, the audio coding apparatus may include an audio signal encoding unit 401 and a bitstream transmission unit 402.
The audio signal encoding unit 401 may receive a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal. Here, the audio signal encoding unit 401 may implement a sound scene of a higher quality by receiving the reverberation signal of the object audio signal. Additionally, the audio signal encoding unit 401 may encode the received channel audio signal, object audio signal, and reverberation signal of the object audio signal into an audio signal.
In addition, the audio coding apparatus may receive rendering information 403. The audio coding apparatus may include a block for converting the rendering information 403 into a binary form.
Here, when the audio signal encoding unit 401 includes the block for converting the rendering information 403, the audio signal encoding unit 401 may encode the audio signal, including the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal, together with the rendering information 403.
The bitstream transmission unit 402 may convert the audio signal into a bitstream, and transmit the bitstream to the audio decoding apparatus. The bitstream may include the audio signal including the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal, and the rendering information 403. The bitstream transmission unit 402 may transmit the bitstream to generate multichannel scene information. The multichannel scene information may be generated based on the rendering information 403. The rendering information 403 may be used as additional data with respect to the reverberation signal of the object audio signal.
FIG. 5 is a diagram illustrating the audio decoding apparatus of FIG. 3 in detail.
The audio decoding apparatus may include a bitstream receiving unit 501, an audio signal decoding unit 502, and an audio rendering unit 503.
The bitstream receiving unit 501 may receive a bitstream from an audio coding apparatus. The received bitstream may include the audio signal and the rendering information.
The audio signal decoding unit 502 may decode the audio signal. That is, the audio signal decoding unit 502 may extract the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal included in the audio signal.
The audio rendering unit 503 may perform rendering with respect to the decoded channel audio signal, object audio signal, and reverberation signal of the object audio signal. The object audio signal may be rendered based on the rendering process of FIG. 3. When the object audio signal is rendered, the reverberation signal of the object audio signal may be rendered according to an index of the corresponding object audio signal. The reverberation signal of the object audio signal may be controlled in the same manner as the object audio signal is controlled, thereby providing a more realistic sound image.
The audio rendering unit 503 may generate the output signal by rendering the decoded channel audio signal, object audio signal, and reverberation signal of the object audio signal. Here, the output signal may include the rendered object audio signal, the reverberation signal of the rendered object audio signal, and the decoded channel audio signal. The output signal may be output to channels of the multichannel audio signal.
FIG. 6 is a diagram illustrating a configuration of rendering information 600.
Referring to FIG. 6, the rendering information 600 may be expressed in a matrix form. Each matrix of the rendering information 600 may be expressed by a substitute value to express the rendering information. For example, location information of the object may be expressed by angles of a horizontal plane and a vertical plane. A matrix value and a gain value related to delay information may be substituted by a value indicating a distance. In addition, because the rendering information 600 may be input in various forms, the rendering information 600 needs to be converted into matrix values so that it can be applied to the rendered object audio signal and the corresponding reverberation signal of the rendered object audio signal, and so that it can be used as additional data of the reverberation signal of the object audio signal.
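The conversion from substitute values to matrix values can be illustrated with two small helpers. Both are assumptions introduced for illustration rather than methods described in this document: the distance is turned into a whole-sample delay using an assumed speed of sound, and the horizontal angle is turned into two localization gains with a constant-power stereo panning law. The vertical angle is not handled, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees Celsius

def distance_to_delay_samples(distance_m, sample_rate_hz):
    """Convert a source distance into a whole-sample propagation delay."""
    return round(distance_m / SPEED_OF_SOUND_M_S * sample_rate_hz)

def azimuth_to_stereo_gains(azimuth_deg):
    """Constant-power pan: -45 degrees is fully left, +45 degrees is fully right."""
    clamped = max(-45.0, min(45.0, azimuth_deg))
    theta = math.radians(clamped + 45.0)        # maps to 0 .. pi/2
    return math.cos(theta), math.sin(theta)     # (left gain, right gain)

delay_samples = distance_to_delay_samples(3.43, 48000)
left_gain, right_gain = azimuth_to_stereo_gains(0.0)
```

Both gains stay between 0 and 1, in line with the gain range stated for the sound image localization matrix, and their squares sum to 1 so that perceived loudness stays roughly constant across angles.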
FIG. 7 is a diagram illustrating an audio coding method according to an embodiment.
In operation 701, an audio coding apparatus may receive a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal. The channel audio signal may be a generally used channel audio signal allocated to a channel of a predetermined reproduction device during reproduction. The object audio signal may be a particular audio signal, selected from among a plurality of audio signals, that is used as a subject of rendering. The reverberation signal of the object audio signal may be applied to the object audio signal and express a sound field feeling of the object audio signal.
The audio coding apparatus may encode the received channel audio signal, the object audio signal, and the reverberation signal of the object audio signal into an audio signal.
In operation 702, the audio coding apparatus may convert the audio signal into a bitstream. The bitstream may include the audio signal including the channel audio signal, the object audio signal, and the reverberation signal of the object audio signal, and rendering information 403. The audio coding apparatus may transmit the bitstream to generate multichannel scene information.
FIG. 8 is a diagram illustrating an audio decoding method according to an embodiment.
In operation 801, an audio decoding apparatus may receive a bitstream from an audio coding apparatus. The received bitstream may include an audio signal and rendering information.
In operation 802, the audio decoding apparatus may decode the audio signal, thereby extracting a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal included in the audio signal.
In operation 803, the audio decoding apparatus may render the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal based on the rendering information included in the bitstream. When the object audio signal is rendered, the reverberation signal of the object audio signal may be rendered according to an index of the corresponding object audio signal. In addition, the reverberation signal of the object audio signal may be controlled in the same manner as the object audio signal being controlled, thereby providing a more realistic sound image. Furthermore, the audio decoding apparatus may generate an output signal by rendering the decoded channel audio signal, object audio signal, and reverberation signal of the object audio signal.
The above-described embodiments of the present invention may be recorded in non-transitory computer-readable media including program instructions to implement various operations embodied by a computer. The media may also include, alone or in combination with the program instructions, data files, data structures, and the like. The program instructions recorded on the media may be those specially designed and constructed for the purposes of the embodiments, or they may be of the kind well-known and available to those having skill in the computer software arts.
A number of examples have been described above. Nevertheless, it will be understood that various modifications may be made. For example, suitable results may be achieved if the described techniques are performed in a different order and/or if components in a described system, architecture, device, or circuit are combined in a different manner and/or replaced or supplemented by other components or their equivalents.
Accordingly, other implementations are within the scope of the following claims.

Claims (16)

The invention claimed is:
1. An audio coding apparatus comprising:
an audio signal encoding unit to encode an audio signal and a rendering information; and
a bitstream transmission unit to convert the audio signal and the rendering information into a bitstream and transmit the bitstream,
wherein the audio signal comprises a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal,
wherein the rendering information indicates sound scene information with respect to the object audio signal.
2. The audio coding apparatus of claim 1, wherein the reverberation signal of the object audio signal expresses a sound field feeling of the object audio signal.
3. The audio coding apparatus of claim 1, wherein the reverberation signal of the object audio signal comprises a plurality of channel signals.
4. The audio coding apparatus of claim 1, wherein the reverberation signal of the object audio signal provides various layouts with respect to the object audio signal.
5. The audio coding apparatus of claim 1, wherein the bitstream transmission unit generates the bitstream from the encoded audio signal and the rendering information for generation of the audio signal.
6. The audio coding apparatus of claim 1, wherein the rendering information comprises at least one of location information of an audio object, sound pressure information of the audio object, and delay information of the audio object.
7. An audio decoding apparatus comprising:
a bitstream receiving unit to receive a bitstream including an encoded audio signal and a rendering information; and
an audio signal decoding unit to extract a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal from the bitstream by decoding the audio signal included in the bitstream,
wherein the rendering information indicates sound scene information with respect to the object audio signal.
8. The audio decoding apparatus of claim 7, wherein the reverberation signal of the object audio signal expresses a sound field feeling of the object audio signal.
9. The audio decoding apparatus of claim 7, wherein the reverberation signal of the object audio signal comprises a plurality of channel signals.
10. The audio decoding apparatus of claim 8, wherein the reverberation signal of the object audio signal provides various layouts with respect to the object audio signal.
11. The audio decoding apparatus of claim 7, further comprising:
an audio rendering unit to render the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal based on the rendering information included in the bitstream.
12. The audio decoding apparatus of claim 11, wherein the rendering information comprises at least one of location information of an audio object, sound pressure information of the audio object, and delay information of the audio object.
13. The audio decoding apparatus of claim 11, wherein the audio rendering unit controls the reverberation signal of the object audio signal corresponding to the object audio signal, when controlling the object audio signal.
14. The audio decoding apparatus of claim 11, wherein the audio rendering unit controls the reverberation signal of the object audio signal in consideration of an index of the object audio signal corresponding to the reverberation signal of the object audio signal.
15. An audio decoding method comprising:
receiving a bitstream comprising an encoded audio signal and a rendering information;
extracting a channel audio signal, an object audio signal, and a reverberation signal of the object audio signal from the bitstream by decoding the audio signal included in the bitstream; and
rendering the extracted channel audio signal, object audio signal, and reverberation signal of the object audio signal based on the rendering information included in the bitstream, wherein the rendering information comprises sound scene information with respect to the object audio signal.
16. The audio decoding method of claim 15, wherein the reverberation signal of the object audio signal comprises a plurality of channel signals, expresses a sound field feeling of the object audio signal, and provides various layouts with respect to the object audio signal.
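The signal flow recited in claims 1, 7, and 11 to 15 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the container layout (a JSON header followed by raw 32-bit float samples, with no actual audio compression), the names `RenderingInfo`, `encode`, `decode`, and `render_stereo`, the use of a linear gain to stand in for sound pressure information, and the constant-power stereo panning. The sketch packs a channel audio signal, object audio signals, their reverberation signals, and rendering information into one bitstream, extracts them again, and renders each object together with the reverberation signal that shares its index.

```python
import json
import math
import struct
from dataclasses import dataclass, asdict


@dataclass
class RenderingInfo:
    """Hypothetical per-object sound scene data (cf. claims 6 and 12)."""
    object_index: int    # ties the object to its reverberation signal
    azimuth_deg: float   # location information, -90 (left) to +90 (right)
    gain: float          # stands in for sound pressure information
    delay_samples: int   # delay information


def encode(channel, objects, reverbs, infos):
    """Pack all signals and the rendering information into one byte string.

    reverbs[i] is assumed to be the reverberation signal of objects[i].
    """
    signals = [channel] + objects + reverbs
    header = json.dumps({
        "lengths": [len(s) for s in signals],
        "num_objects": len(objects),
        "rendering": [asdict(i) for i in infos],
    }).encode("utf-8")
    flat = [x for s in signals for x in s]
    payload = struct.pack(f"<{len(flat)}f", *flat)
    # Layout: 4-byte header length, JSON header, raw float32 samples.
    return struct.pack("<I", len(header)) + header + payload


def decode(bitstream):
    """Extract channel, object, and reverberation signals plus rendering info."""
    (hlen,) = struct.unpack_from("<I", bitstream, 0)
    header = json.loads(bitstream[4:4 + hlen].decode("utf-8"))
    total = sum(header["lengths"])
    flat = struct.unpack_from(f"<{total}f", bitstream, 4 + hlen)
    signals, pos = [], 0
    for n in header["lengths"]:
        signals.append(list(flat[pos:pos + n]))
        pos += n
    k = header["num_objects"]
    infos = [RenderingInfo(**d) for d in header["rendering"]]
    return signals[0], signals[1:1 + k], signals[1 + k:1 + 2 * k], infos


def render_stereo(channel, objects, reverbs, infos):
    """Mix to stereo; an object and its reverberation are controlled together."""
    left, right = list(channel), list(channel)
    for info in infos:
        # Constant-power pan derived from the object's location.
        theta = (info.azimuth_deg + 90.0) / 180.0 * (math.pi / 2.0)
        g_left, g_right = math.cos(theta), math.sin(theta)
        i = info.object_index
        # The same gain, delay, and pan apply to the object signal and to
        # the reverberation signal with the matching index.
        for signal in (objects[i], reverbs[i]):
            for n, x in enumerate(signal):
                t = n + info.delay_samples
                if t < len(left):
                    left[t] += info.gain * x * g_left
                    right[t] += info.gain * x * g_right
    return left, right
```

In this sketch the reverberation signal is looked up through `object_index`, so any change to an object's gain, delay, or position carries over to its reverberation without separate bookkeeping. A real codec would compress the signals and define its own bitstream syntax; the header here only makes the round trip easy to follow.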

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
KR10-2012-0113604 2012-10-12
KR20120113604 2012-10-12
KR1020130069101A KR20140047509A (en) 2012-10-12 2013-06-17 Audio coding/decoding apparatus using reverberation signal of object audio signal
KR10-2013-0069101 2013-06-17
PCT/KR2013/006471 WO2014058138A1 (en) 2012-10-12 2013-07-19 Audio encoding/decoding device using reverberation signal of object audio signal

Publications (2)

Publication Number Publication Date
US20150279376A1 (en) 2015-10-01
US9595266B2 (en) 2017-03-14

Family

ID=50654074

Family Applications (1)

Application Number Title Priority Date Filing Date
US14/435,372 Active US9595266B2 (en) 2012-10-12 2013-07-19 Audio encoding/decoding device using reverberation signal of object audio signal

Country Status (2)

Country Link
US (1) US9595266B2 (en)
KR (4) KR20140047509A (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102243395B1 (en) * 2013-09-05 2021-04-22 한국전자통신연구원 Apparatus for encoding audio signal, apparatus for decoding audio signal, and apparatus for replaying audio signal
KR102465286B1 (en) * 2015-06-17 2022-11-10 소니그룹주식회사 Transmission device, transmission method, reception device and reception method
US10325610B2 (en) * 2016-03-30 2019-06-18 Microsoft Technology Licensing, Llc Adaptive audio rendering
CN115334444A (en) 2018-04-11 2022-11-11 杜比国际公司 Method, apparatus and system for pre-rendering signals for audio rendering

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080071549A1 (en) 2004-07-02 2008-03-20 Chong Kok S Audio Signal Decoding Device and Audio Signal Encoding Device
US20090210238A1 (en) 2007-02-14 2009-08-20 Lg Electronics Inc. Methods and Apparatuses for Encoding and Decoding Object-Based Audio Signals
KR20090110242A (en) 2008-04-17 2009-10-21 삼성전자주식회사 Method and apparatus for processing audio signal
US20100145487A1 (en) 2007-06-08 2010-06-10 Hyen-O Oh Method and an apparatus for processing an audio signal
US20110040396A1 (en) 2009-08-14 2011-02-17 Srs Labs, Inc. System for adaptively streaming audio objects
US20110166681A1 (en) 2005-11-01 2011-07-07 Electronics And Telecommunications Research Institute System and method for transmitting/receiving object-based audio
US20120093321A1 (en) 2010-10-13 2012-04-19 Samsung Electronics Co., Ltd. Apparatus and method for encoding and decoding spatial parameter
US20140133683A1 (en) * 2011-07-01 2014-05-15 Doly Laboratories Licensing Corporation System and Method for Adaptive Audio Signal Generation, Coding and Rendering
US20140350944A1 (en) * 2011-03-16 2014-11-27 Dts, Inc. Encoding and reproduction of three dimensional audio soundtracks
US20150235645A1 (en) * 2012-08-07 2015-08-20 Dolby Laboratories Licensing Corporation Encoding and Rendering of Object Based Audio Indicative of Game Audio Content
US20150271620A1 (en) * 2012-08-31 2015-09-24 Dolby Laboratories Licensing Corporation Reflected and direct rendering of upmixed content to individually addressable drivers
US20150350804A1 (en) * 2012-08-31 2015-12-03 Dolby Laboratories Licensing Corporation Reflected Sound Rendering for Object-Based Audio

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2461321B1 (en) * 2009-07-31 2018-05-16 Panasonic Intellectual Property Management Co., Ltd. Coding device and decoding device

Also Published As

Publication number Publication date
KR102478163B1 (en) 2022-12-16
KR20240144870A (en) 2024-10-04
KR102710843B1 (en) 2024-09-27
KR20230007971A (en) 2023-01-13
KR20140047509A (en) 2014-04-22
KR20210151741A (en) 2021-12-14
US20150279376A1 (en) 2015-10-01

Similar Documents

Publication Publication Date Title
KR102131748B1 (en) Method and apparatus for encoding and decoding successive frames of an ambisonics representation of a 2- or 3-dimensional sound field
TWI744341B (en) Distance panning using near / far-field rendering
JP6288100B2 (en) Audio encoding apparatus and audio decoding apparatus
EP2273492B1 (en) Method and apparatus for generating additional information bit stream of multi-object audio signal
JP2018174590A (en) Processing of spatially spread or large audio object
KR102710843B1 (en) Audio coding/decoding apparatus using reverberation signal of object audio signal
JP7272269B2 (en) SIGNAL PROCESSING APPARATUS AND METHOD, AND PROGRAM
US10575111B2 (en) Audio encoding apparatus and method, audio decoding apparatus and method, and audio reproducing apparatus
KR102335911B1 (en) Audio coding/decoding apparatus using reverberation signal of object audio signal
KR20140017344A (en) Apparatus and method for audio signal processing
TW202123220A (en) Multichannel audio encode and decode using directional metadata

Legal Events

Date Code Title Description
AS Assignment

Owner name: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTIT

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BEACK, SEUNG KWON;SEO, JEONG IL;LEE, TAE JIN;AND OTHERS;SIGNING DATES FROM 20150429 TO 20150504;REEL/FRAME:035729/0489

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2551); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2552); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

Year of fee payment: 8