KR20230011054A

KR20230011054A - Method for counting cough by analyzing acoustic signal, server and non-transitory computer-readable recording medium performing the same

Info

Publication number: KR20230011054A
Application number: KR1020210091624A
Authority: KR
Inventors: 송지영; 정지영; 강상연; 두경연; 김온섭; 이아라
Original assignee: 다인기술 주식회사
Priority date: 2021-07-13
Filing date: 2021-07-13
Publication date: 2023-01-20

Abstract

A method for accurately counting coughs by analyzing acoustic signals according to an embodiment of the present application comprises the steps of: extracting one or more onset signals from the acoustic signals, wherein the onset signals include a signal corresponding to an attack of a sound, and have a predetermined length in a time domain; obtaining a spectrogram corresponding to the extracted onset signals; by using a cough discrimination model, determining whether the obtained spectrogram is a cough section; and calculating the total number of coughs in the acoustic signals based on a result determined for the spectrogram of the onset signals. The obtaining step and the determining step are performed for each onset signal of the extracted one or more onset signals.

Description

Method for counting cough by analyzing acoustic signal, server and non-transitory computer-readable recording medium performing the same

Embodiments relate to a method for counting coughs based on an acoustic signal, a server performing the method, and a non-transitory computer-readable recording medium.

The number of coughs is the most basic and important indicator for objectively evaluating the severity of coughing, yet at present the only available method is for trained personnel to listen to the cough sounds and count them by hand.

Specifically, because respiratory diseases require continuous management, a quantitative monitoring platform is needed in addition to cough analysis based on 24-hour recordings made in hospitals. At present, however, 1) when a patient visits a medical institution, severity is diagnosed and treatment proceeds on the basis of questionnaires or interviews about the frequency of coughing, its intensity, and the disruption it causes to daily life, all of which depend on the patient's subjective impression and memory; or 2) in hospital care and in clinical trials run by global pharmaceutical companies, the degree of coughing is evaluated by attaching a recorder to the patient, recording sound for anywhere from tens of minutes to 24 hours, and having trained personnel listen to the recording and count the coughs. As a result, both patients and medical staff lack medical information whose objectivity and efficiency are assured.

A method of counting coughs by visualizing the waveform of the recording has been proposed, but it likewise relies on a person looking at the waveform and counting manually, so the fundamental problems of objectivity and efficiency remain unsolved ("How to count coughs? Counting by ear, the effect of visual data and the evaluation of an automated cough monitor," R. D. Turner and G. H. Bothamley, Respiratory Medicine, Dec 2014 https://www.sciencedirect.com/science/article/pii/S0954611114003357).

Accordingly, there is a need to develop a method for accurately counting coughs based on an acoustic signal.

An object of the embodiments is to provide a method for accurately counting coughs based on an acoustic signal.

According to an embodiment of the present application, there may be provided a method for counting coughs by analyzing an acoustic signal, the method comprising: extracting one or more onset signals from the acoustic signal, wherein each onset signal includes a signal corresponding to the attack of a sound and has a predetermined length in the time domain; obtaining a spectrogram corresponding to the extracted onset signal; determining, by using a cough discrimination model, whether the obtained spectrogram is a cough section; and calculating the total number of coughs in the acoustic signal based on the results determined for the spectrograms of the onset signals, wherein the obtaining and the determining are performed for each of the extracted one or more onset signals, and wherein, in calculating the total number of coughs, a first onset signal and a second onset signal, each determined to be a cough, are counted as two coughs if their start times are separated by more than a reference time, and as one cough if their start times are within the reference time of each other.

According to an embodiment of the present application, in a system for counting coughs by analyzing an acoustic signal, there may be provided a server comprising: a communication unit that acquires an acoustic signal including sound recorded by an external device; a memory unit that stores instructions for loading a cough discrimination model; and a control unit configured to extract one or more onset signals from the acoustic signal, wherein each onset signal includes a signal corresponding to the attack of a sound and has a predetermined length in the time domain, to obtain a spectrogram corresponding to the extracted onset signal, to determine, by using the cough discrimination model, whether the obtained spectrogram is a cough section, and to calculate the total number of coughs in the acoustic signal based on the results determined for the spectrograms of the onset signals.

According to the embodiments, a method is provided that counts coughs accurately from an acoustic signal while reducing the amount of computation that must be performed.

The effects of the present application are not limited to those described above, and effects not mentioned here will be clearly understood by those of ordinary skill in the art to which the present application pertains from this specification and the accompanying drawings.

FIG. 1 is a diagram schematically showing the configuration of the overall system 10 for counting coughs according to an embodiment of the present application.
FIG. 2 is a diagram for explaining the components of the server 2000 according to an embodiment of the present application.
FIG. 3 is a flowchart for explaining a method of counting coughs according to an embodiment of the present application.
FIG. 4 is a diagram for explaining a method of detecting an onset signal according to an embodiment of the present application.
FIG. 5 is a diagram for explaining the module for detecting an onset signal included in the control unit 2300 according to an embodiment of the present application.
FIG. 6 is a diagram for explaining a cough determination operation according to an embodiment of the present application.
FIG. 7 is a diagram for explaining a cough counting method according to an embodiment of the present application.

The above objects, features, and advantages of the present application will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. Because the present application may be modified in various ways and may have various embodiments, specific embodiments are illustrated in the drawings and described in detail below.

In the drawings, the thicknesses of layers and regions are exaggerated for clarity. When an element or layer is referred to as being "on" another element or layer, this covers both the case where it is directly on the other element or layer and the case where a further layer or element is interposed between them. Throughout the specification, the same reference numerals denote, in principle, the same elements. In addition, elements that appear in the drawings of the respective embodiments and have the same function within the scope of the same idea are described using the same reference numerals.

Where a detailed description of a known function or configuration related to the present application is judged likely to obscure the subject matter of the present application unnecessarily, that description is omitted. In addition, the ordinals used in this specification (for example, first, second, and so on) are merely identifiers for distinguishing one element from another.

In addition, the suffixes "module" and "unit" attached to elements in the following description are assigned or used interchangeably only for ease of drafting the specification, and do not in themselves carry distinct meanings or roles.

Here, a method may be provided in which the length of the onset signal in the time domain is longer than the reference time.

Here, a method may be provided in which extracting the onset signal comprises: detecting an onset point in the acoustic signal; and extracting a signal corresponding to a time interval of the predetermined length that starts at the detected onset point.
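
The two sub-steps just described (detect an onset point, then cut a fixed-length window that starts there) can be sketched as follows. This is only an illustration under stated assumptions: the threshold-crossing detector `detect_onsets` is a hypothetical stand-in, since this passage does not specify how onset points are found, and zero-padding a window that runs past the end of the recording is likewise an assumption.

```python
from typing import List, Sequence


def detect_onsets(signal: Sequence[float], threshold: float) -> List[int]:
    """Hypothetical detector: sample indices where |x| rises to or above threshold."""
    onsets = []
    was_loud = False
    for i, x in enumerate(signal):
        is_loud = abs(x) >= threshold
        if is_loud and not was_loud:  # quiet -> loud transition marks an onset point
            onsets.append(i)
        was_loud = is_loud
    return onsets


def extract_onset_signals(
    signal: Sequence[float], onsets: Sequence[int], length: int
) -> List[List[float]]:
    """Cut one window of `length` samples starting at each onset point."""
    windows = []
    for start in onsets:
        chunk = list(signal[start:start + length])
        chunk += [0.0] * (length - len(chunk))  # assumption: zero-pad at the end
        windows.append(chunk)
    return windows
```

Every window returned has the same length, which matches the requirement that an onset signal has a predetermined length in the time domain.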

Here, a method may be provided in which obtaining the spectrogram is a step of obtaining the spectrogram by transforming the extracted onset signal into the frequency domain, and includes a Fourier transform of the extracted onset signal.
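
One common way to turn a short time-domain window into a spectrogram is a short-time Fourier transform. The sketch below assumes that approach; the frame length `frame_len`, the hop size `hop`, and the Hann window are illustrative choices that the text does not specify, and the direct DFT loop is written for readability rather than speed.

```python
import cmath
import math
from typing import List, Sequence


def stft_magnitude(
    signal: Sequence[float], frame_len: int, hop: int
) -> List[List[float]]:
    """Magnitude spectrogram: one list of frequency-bin magnitudes per time frame."""
    # Periodic Hann window (an assumption; any analysis window could be used).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        column = []
        for k in range(frame_len // 2 + 1):  # keep non-negative frequencies only
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            column.append(abs(acc))
        spectrogram.append(column)
    return spectrogram
```

Because every onset signal has the same length, every spectrogram produced this way has the same number of frames, which suits a model that expects fixed-size input.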

Here, a method may be provided in which obtaining the spectrogram includes extracting the spectrogram corresponding to the extracted onset signal from a full spectrogram obtained by transforming the acoustic signal into the frequency domain.
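
In this variant the transform is computed once for the whole recording and each onset signal is mapped to a range of frames. A minimal sketch, assuming frame `i` of the full spectrogram starts at sample `i * hop` (the names `hop`, `onset_sample`, and `window_samples` are hypothetical):

```python
from typing import List, Sequence


def slice_spectrogram(
    full_spec: Sequence[Sequence[float]],
    onset_sample: int,
    window_samples: int,
    hop: int,
) -> List[List[float]]:
    """Return the frames whose start sample lies in [onset, onset + window)."""
    first = -(-onset_sample // hop)                    # ceiling division
    last = -(-(onset_sample + window_samples) // hop)  # exclusive end frame
    return [list(frame) for frame in full_spec[first:last]]
```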

Here, a method may be provided in which the cough discrimination model is a classification model trained to receive spectrogram data and classify it as cough or non-cough, and the spectrogram data is a spectrogram image having a predetermined length in the time domain.

Here, a method may be provided in which the cough discrimination model is trained using a training data set that includes the spectrogram data and tagging information labeled on the spectrogram data, and the tagging information includes information on whether the spectrogram data contains a sound corresponding to a cough.

Here, a method may be provided in which the determining comprises: performing at least one preprocessing operation among resizing, scaling, and RGB conversion on the obtained spectrogram; and applying the preprocessed spectrogram to the cough discrimination model to determine whether the spectrogram is a cough section.
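
The three preprocessing operations named above could look like the following. The specific choices here are assumptions made for illustration: nearest-neighbour interpolation for resizing, min-max scaling to the 0 to 255 range, and replicating the single channel three times as the RGB conversion (a colormap would be an equally plausible reading).

```python
from typing import List, Sequence, Tuple

Gray = List[List[float]]


def resize_nearest(img: Sequence[Sequence[float]], out_h: int, out_w: int) -> Gray:
    """Resize a 2-D array by nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def scale_to_uint8(img: Sequence[Sequence[float]]) -> List[List[int]]:
    """Min-max scale all values into the integer range 0..255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant image
    return [[int(round(255 * (v - lo) / span)) for v in row] for row in img]


def gray_to_rgb(img: Sequence[Sequence[int]]) -> List[List[Tuple[int, int, int]]]:
    """Assumption: RGB conversion by copying the gray value into three channels."""
    return [[(v, v, v) for v in row] for row in img]


def preprocess(spec: Sequence[Sequence[float]], out_h: int, out_w: int):
    return gray_to_rgb(scale_to_uint8(resize_nearest(spec, out_h, out_w)))
```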

Here, a method may be provided in which the calculating includes determining, for the onset signals determined to be coughs, whether the interval between the start times of two onset signals adjacent in the time domain is within the reference time.
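
The counting rule can be sketched as a single pass over the start times of the onset signals that were classified as coughs. The function below compares each start time with the one immediately before it, following the "two adjacent onset signals" wording; how a chain of three or more closely spaced onsets should be treated is not spelled out in the text, so merging the whole chain into one cough is an assumption of this sketch.

```python
from typing import Sequence


def count_coughs(cough_onset_times: Sequence[float], reference_time: float) -> int:
    """Count coughs from start times (seconds) of onsets classified as cough."""
    total = 0
    previous = None
    for t in sorted(cough_onset_times):
        # A new cough is counted only when the gap to the previous cough onset
        # exceeds the reference time; otherwise both belong to the same cough.
        if previous is None or t - previous > reference_time:
            total += 1
        previous = t
    return total
```

For example, with a reference time of 0.5 s, start times of 0.0, 0.2, 1.5, 1.6, and 3.0 s yield three coughs, because the pairs (0.0, 0.2) and (1.5, 1.6) are each merged into one.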

Here, a non-transitory computer-readable recording medium on which a computer program for executing the method is recorded may be provided.

Here, a server may be provided in which the control unit is configured to determine, for the onset signals determined to be coughs, whether the interval between the start times of two onset signals adjacent in the time domain is within the reference time.

Here, a server may be provided in which the control unit is configured to perform at least one preprocessing operation among resizing, scaling, and RGB conversion on the obtained spectrogram, and to determine, by applying the preprocessed spectrogram to the cough discrimination model, whether the obtained spectrogram is a cough section.

Here, a server may be provided in which the cough discrimination model is a classification model trained to receive spectrogram data and classify it as cough or non-cough, and the spectrogram data is a spectrogram image having a predetermined length in the time domain.

Here, a server may be provided in which the cough discrimination model is trained using a training data set that includes the spectrogram data and tagging information labeled on the spectrogram data, and the tagging information includes information on whether the spectrogram data contains a sound corresponding to a cough.

FIG. 1 is a diagram schematically showing the configuration of the overall system 10 for counting coughs according to an embodiment of the present application.

According to an embodiment of the present application, the system 10 may include a device 1000 and a server 2000. However, the components shown in FIG. 1 are not essential, and the system 10 may have more or fewer components.

The device 1000 may be connected to the server 2000 through a network to transmit and receive the necessary data. Here, the network may include, but is not limited to, a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), Wi-Fi, Wi-Fi Direct, LTE Direct, and/or Bluetooth.

The device 1000 may be fixed to a part of a person's body and record sound for a set period of time. External sound may be captured and recorded for a set length of time (for example, 24 hours) through an input device of the device 1000 such as a microphone, and the acoustic signal containing the recorded sound may be transmitted to the server 2000 through the network. The device 1000 may be fixed to the person's abdomen or arm, positioned so that the microphone faces the person's face. A commonly used belt, armband, or the like may be used to fix the device 1000 in place.

The device 1000 is a digital device capable of connecting to the server 2000 and transmitting data, and may be any digital device that has memory means and a microprocessor and therefore has computing capability. For example, the device 1000 may be a wearable device such as smart glasses, a smart watch, a smart band, a smart ring, or a smart necklace, or a more traditional device such as a smartphone, a smart pad, a desktop computer, a notebook computer, a workstation, a PDA, a web pad, or a mobile phone.

The server 2000 may analyze the acoustic signal and count coughs. The server 2000 may calculate the total number of coughs in the acoustic signal by extracting one or more onset signals from the acoustic signal and applying input data obtained from the extracted onset signals to the cough discrimination model.

According to an embodiment of the present application, the server 2000 extracts one or more onset signals and determines cough or non-cough for those signals only, so it can skip applying the entire acoustic signal to the cough discrimination model. This has the advantage of sharply reducing the amount of server computation needed for cough discrimination. In addition, according to an embodiment of the present application, when calculating the total number of coughs, the server 2000 uses a pre-stored algorithm to count two adjacent onset signals determined to be coughs as one or as two, depending on the time separating their start times. This has the advantage of yielding an accurate total count without double-counting a single cough whose sound was recorded over a long duration.
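
The computational saving comes from invoking the model once per onset signal rather than over the whole recording. The sketch below illustrates that loop; `to_spectrogram` and `is_cough` are hypothetical callables standing in for the spectrogram step and the cough discrimination model, and the onset indices are assumed to have been detected already.

```python
from typing import Callable, List, Sequence


def cough_onset_times(
    signal: Sequence[float],
    sample_rate: float,
    onsets: Sequence[int],
    window_len: int,
    to_spectrogram: Callable[[Sequence[float]], object],
    is_cough: Callable[[object], bool],
) -> List[float]:
    """Return start times (seconds) of the onset signals classified as cough."""
    times = []
    for start in onsets:
        window = signal[start:start + window_len]
        # The model runs once per onset signal, so the number of model calls
        # equals len(onsets) regardless of how long the recording is.
        if is_cough(to_spectrogram(window)):
            times.append(start / sample_rate)
    return times
```

The returned start times are what the counting step then compares against the reference time.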

The functions of the server 2000 are described in more detail below.

In this specification, the specific method of counting coughs is described as being performed by the server 2000. This description is illustrative, however, and it will be apparent to those skilled in the art that all or part of the functions described herein as operations of the server 2000 may be performed by the device 1000.

In addition, when the cough counting method disclosed in this specification is performed by the device 1000, the device 1000 may further include an application program for performing the method. Such an application may exist in the device 1000 in the form of program modules. The nature of these program modules may be broadly similar to that of the communication unit 2100, the memory unit 2200, and the control unit 2300 described below. Here, at least part of the application may be replaced, as needed, with a hardware device or firmware device capable of performing substantially the same or an equivalent function.

FIG. 2 is a diagram for explaining the components of the server 2000 according to an embodiment of the present application. Referring to FIG. 2, the server 2000 may include a communication unit 2100, a memory unit 2200, and a control unit 2300. However, the components shown in FIG. 2 are not essential, and the server 2000 may have more or fewer components.

According to an embodiment of the present application, the components of the server 2000 may be physically included in a single server, or may be distributed across servers by function, without being limited thereto.

According to an embodiment of the present application, at least some of the communication unit 2100, the memory unit 2200, and the control unit 2300 of the server 2000 may be program modules that communicate with an external system (not shown). These program modules may be included in the server 2000 in the form of an operating system, application program modules, and other program modules, and may be physically stored on various known storage devices. They may also be stored in a remote storage device capable of communicating with the server 2000. These program modules encompass, but are not limited to, routines, subroutines, programs, objects, components, and data structures that perform the particular tasks described below or implement particular abstract data types according to the present application.

The communication unit 2100 may enable the server 2000 to transmit data to and receive data from an external device (for example, the device 1000).

The communication unit 2100 may include one or more modules that enable communication. It may include a module for communicating with an external device over a wired connection, a module for communicating with an external device over a wireless connection, or both.

As a specific example, the communication unit 2100 may consist of a wired communication module that connects to the Internet or the like through a LAN (Local Area Network); a mobile communication module, such as LTE (Long Term Evolution), that connects to a mobile communication network via a base station to transmit and receive data; a short-range communication module that uses a WLAN (Wireless Local Area Network) scheme such as Wi-Fi or a WPAN (Wireless Personal Area Network) scheme such as Bluetooth or Zigbee; a satellite communication module that uses a GNSS (Global Navigation Satellite System) such as GPS (Global Positioning System); or a combination of these.

According to an embodiment of the present application, the communication unit 2100 may receive the acoustic signal from the device 1000. According to an embodiment of the present application, the communication unit 2100 may transmit to the device 1000 the total number of coughs obtained by analyzing the acoustic signal received from the device 1000.

The memory unit 2200 may store the various data and programs needed for the server 2000 to operate. For example, the memory unit 2200 may store an operating system (OS) for running the server 2000 that counts coughs based on acoustic signals, the various programs that must run on or are used by the server 2000 in order to perform the cough counting method, and various data on the media referenced by those programs.

The memory unit 2200 may store data temporarily or semi-permanently. Examples of the memory unit 2200 include a hard disk drive (HDD), a solid state drive (SSD), flash memory, read-only memory (ROM), random access memory (RAM), and cloud storage. The memory unit 2200 may also build and maintain a database for storing data, and, without being limited thereto, may be implemented as various modules for storing data.

According to an embodiment of the present application, the memory unit 2200 may store the cough discrimination model. According to another embodiment of the present application, the memory unit 2200 may store instructions for loading the cough discrimination model.

The cough discrimination model may be built from a machine learning algorithm, such as a supervised learning algorithm (for example, logistic regression, a support vector machine (SVM), or a random forest), an unsupervised learning algorithm, or an artificial neural network (ANN), or from a deep learning algorithm such as a fully connected network or a convolutional neural network (CNN). The cough discrimination model according to an embodiment of the present application is not necessarily limited to those listed above, and may be varied within the range in which the object of the present invention can be achieved.

According to an embodiment of the present application, the cough discrimination model may be a classification model trained to receive spectrogram data and output a label indicating whether it is a cough section. For example, the cough discrimination model may be a classification model trained to receive spectrogram data and classify it as cough or non-cough. Here, the spectrogram data may be a spectrogram image having a predetermined length in the time domain.

According to an embodiment of the present application, the cough discrimination model may be trained using a training data set that includes the spectrogram data and tagging information labeled on the spectrogram data. The tagging information may include information on whether the spectrogram data contains a sound corresponding to a cough, and may be a value selected on the basis of annotation performed in advance by a medical expert. For example, the label indicating whether a sound corresponds to a cough may be determined from data annotated in advance by a medical expert.
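
To make the shape of such a training data set concrete, the sketch below pairs each spectrogram with a 0/1 cough label and fits a logistic regression, one of the supervised algorithms the specification lists as a possible model. Everything else is an assumption made for illustration: flattening the spectrogram into a feature vector, plain batch gradient descent, and the hyperparameters `lr` and `epochs`. A practical system would more likely use an image classifier.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (flattened spectrogram, 1 = cough / 0 = non-cough)


def flatten(spec: Sequence[Sequence[float]]) -> List[float]:
    return [v for row in spec for v in row]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples: Sequence[Sample], lr: float = 0.5, epochs: int = 500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, label in samples:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - label
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(samples)
    return w, b


def is_cough(model, x: Sequence[float]) -> bool:
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5
```

In use, each training pair would hold a spectrogram of the predetermined length together with the expert-assigned label, for example `(flatten(spec), 1)` for a window tagged as containing a cough.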

The process of training the cough discrimination model according to an embodiment of the present application may be performed by the control unit 2300 included in the server 2000, or by an entity separate from the server 2000 (for example, a training server distinct from the server 2000).

In addition, the memory unit 2200 may store an onset detection module, an onset signal extraction module, a spectrogram acquisition module (not shown), a data preprocessing module, and/or a cough discrimination module (including the cough discrimination model). The modules may exist in the server 2000 as separate, functionally distinct modules, or the roles of a plurality of functional modules may be performed by a single entity so that they exist in the server 2000 as a physically integrated module. The function and operation of each module will be clearly understood from the cough counting method described below, and a detailed description thereof is therefore omitted.

The control unit 2300 may oversee and control the overall operation of the server 2000. The control unit 2300 may perform calculation and processing of various types of information and may control the operation of the components of the server 2000.

The control unit 2300 may be implemented as a computer or a similar device by hardware, software, or a combination thereof. In hardware, the control unit 2300 may be provided in the form of an electronic circuit, such as a CPU chip, that processes electrical signals to perform a control function; in software, it may be provided in the form of a program that drives the hardware control unit.

According to an embodiment of the present application, the control unit 2300 may extract one or more onset signals from the acoustic signal received through the communication unit 2100. The control unit 2300 may extract the one or more onset signals from the acoustic signal by using the onset detection module and the onset signal extraction module. An onset signal includes a signal corresponding to the attack of a sound and may have a predetermined length in the time domain.

According to an embodiment of the present application, the control unit 2300 may obtain a spectrogram corresponding to the extracted onset signal. The control unit 2300 may obtain the spectrogram corresponding to the extracted onset signal by using a spectrogram conversion module. Alternatively, the control unit 2300 may use the spectrogram conversion module to obtain a full spectrogram of the acoustic signal and crop only the part of the full spectrogram that corresponds to the extracted onset signal, thereby obtaining the spectrogram corresponding to the extracted onset signal.

According to an embodiment of the present application, the control unit 2300 may determine whether the obtained spectrogram is a cough section. The control unit 2300 may apply input data based on the spectrogram to the cough discrimination model to determine whether it is a cough section.

According to an embodiment of the present application, the control unit 2300 may calculate the total number of coughs in the acoustic signal based on the results determined for the spectrograms of the onset signals. The control unit 2300 may perform the operations of obtaining a spectrogram corresponding to an onset signal and determining, based on the obtained spectrogram, whether it is a cough section as many times as the number of extracted onset signals. In the step of calculating the total number of coughs, the control unit 2300 may compare the start times of the onset signals determined to be coughs. If the start time of a first onset signal and the start time of a second onset signal are separated by more than a reference time, the control unit 2300 may count them as two; if the two start times are within the reference time, it may count them as one. Here, the first onset signal and the second onset signal are both signals determined to be coughs.

Hereinafter, the operation of counting coughs based on an acoustic signal will be described in detail. Unless otherwise stated, the operations of the server 2000 described below may be interpreted as being performed under the control of the control unit 2300.

FIG. 3 is a flowchart illustrating a method of counting coughs according to an embodiment of the present application.

The server 2000 may extract an onset signal (S1000). The server 2000 may extract the onset signal based on the acoustic signal received through the communication unit 2100. The server 2000 may extract the onset signal directly from the received acoustic signal, or may extract it after performing noise filtering on the received acoustic signal.

FIG. 4 is a diagram illustrating a method of detecting an onset signal according to an embodiment of the present application.

According to an embodiment of the present application, the server 2000 may extract an onset point (OP) from the acoustic signal (sound signal, SS).

The server 2000 may determine at every sampling time (for example, every 0.01 s) whether the current point is an onset point (OP). When a point is determined to be an onset point (OP), the server 2000 may crop from the acoustic signal (SS) the signal in the time interval that starts at that point and has a predetermined length (onset length, OL), thereby obtaining an onset signal (OS). The onset signal (OS) may include a signal corresponding to the attack of a sound. The onset signal (OS) may have the predetermined length (OL) in the time domain. The number of obtained onset signals (OS) may be equal to the number of determined onset points (OP).

Whether a point is an onset point (OP) may be determined by selecting, as onset points (OP), the points at which the onset strength obtained at each sampling time is greater than a reference strength. As a specific example, the onset strength may be obtained based on a Mel spectrogram that has passed through a Mel filter bank. For example, the control unit 2300 may divide the obtained acoustic signal (SS) into two or more frequency bands, pass each of the divided signals through the Mel filter bank, and add the resulting values to obtain the onset strength. The control unit 2300 may determine that a point is an onset point (OP) when the obtained onset strength is greater than the reference strength. Alternatively, the control unit 2300 may determine that a point is an onset point (OP) when the obtained onset strength is greater than the reference strength and the immediately preceding onset strength is less than the reference strength. Here, the reference strength may be a predetermined constant. Alternatively, the reference strength may be defined variably with reference to the onset strength of the entire acoustic signal (SS) or of the neighborhood of the point being evaluated.
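As an illustration only, the threshold rule described above can be sketched in Python. The function name `detect_onset_indices`, the list-based input, and the handling of the very first sample are assumptions made for this sketch and are not part of the disclosure. The onset strengths are assumed to have been computed already (for example, from a Mel spectrogram), and a constant reference strength is used.

```python
def detect_onset_indices(strengths, reference_strength):
    """Return indices of sampling points judged to be onset points.

    A point is taken as an onset point when its onset strength exceeds the
    reference strength and the immediately preceding strength is below it.
    Assumption: the first sample has no predecessor, so it qualifies
    whenever it exceeds the reference strength.
    """
    onset_indices = []
    for i, strength in enumerate(strengths):
        previous_below = i == 0 or strengths[i - 1] < reference_strength
        if strength > reference_strength and previous_below:
            onset_indices.append(i)
    return onset_indices


def index_to_time(index, sampling_time_s=0.01):
    """Convert a sampling index to seconds (0.01 s is the example interval)."""
    return index * sampling_time_s


# Example: two rising crossings of the reference strength 0.5.
example_strengths = [0.1, 0.2, 0.9, 0.8, 0.1, 0.7]
example_onsets = detect_onset_indices(example_strengths, 0.5)
```

Requiring the preceding strength to be below the reference keeps a sustained loud passage from producing an onset point at every sampling time.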

FIG. 5 is a diagram illustrating the modules, included in the control unit 2300, that detect an onset signal according to an embodiment of the present application.

According to an embodiment of the present application, the onset detection module 2310 may detect onset points based on the received acoustic signal. The onset detection module 2310 may detect onset points while scanning the entire acoustic signal at predetermined intervals. A plurality of onset points may be detected in the acoustic signal.

The onset signal extraction module 2320 may extract an onset signal at each onset point detected by the onset detection module 2310. The onset signal extraction module 2320 may extract the onset signal by cropping the signal for a predetermined length of time, taking the detected onset point as the starting point.
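The cropping step can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `crop_onset_signals`, the plain list of samples, and the rounding of times to sample indices are illustrative choices, and 0.5 s is used as the onset length because it is the value given in the embodiments.

```python
def crop_onset_signals(samples, sample_rate, onset_times_s, onset_length_s=0.5):
    """Crop one fixed-length segment per onset point.

    samples: sequence of amplitude values of the whole acoustic signal.
    sample_rate: samples per second.
    onset_times_s: start times (seconds) of the detected onset points.
    A segment near the end of the recording may be shorter than the
    nominal length; padding is left to the caller in this sketch.
    """
    segment_len = int(round(onset_length_s * sample_rate))
    segments = []
    for onset_time in onset_times_s:
        start = int(round(onset_time * sample_rate))
        segments.append(samples[start:start + segment_len])
    return segments


# Toy signal: 10 samples per second, values equal to their own index.
toy_signal = list(range(100))
toy_segments = crop_onset_signals(toy_signal, 10, [1.0, 1.2])
```

In the toy example the two onset points are 0.2 s apart while each segment is 0.5 s long, so the two segments share samples. This is the overlapping case that arises when the scanning interval is shorter than the onset length.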

According to an embodiment of the present application, a first onset signal, a second onset signal, ..., and an N-th onset signal may be extracted from the acoustic signal by the operation of the onset detection module 2310 and the onset signal extraction module 2320. Here, the predetermined interval at which the onset detection module 2310 scans may be shorter than the length of the onset signal, in which case the first onset signal and the second onset signal may overlap.

Referring again to FIG. 3, the server 2000 may obtain a spectrogram (S2000).

As one example, the server 2000 may obtain the spectrogram from the extracted onset signal. As another example, the server 2000 may extract the spectrogram corresponding to the extracted onset signal from a full spectrogram obtained from the acoustic signal.

Converting a signal having amplitude values in the time domain into a spectrogram having amplitude values in the frequency domain may be performed by a commonly known method. As one example, the signal to be converted may be divided into predetermined time intervals, each segment may be decomposed into individual sinusoids by a fast Fourier transform, and the resulting magnitude at each frequency may be plotted as frequency over time, with amplitude indicated by color. The present application is not limited thereto, and the term 'spectrogram' as used herein may include any spectrogram in the ordinary sense. As a specific example, the spectrogram in the present application may be a Mel spectrogram whose frequency axis has been converted to the Mel scale.
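The framing and transform steps can be illustrated with a short sketch. To stay self-contained it uses a direct discrete Fourier transform in place of a fast Fourier transform (both compute the same values; the FFT is only faster). The function name `stft_magnitude`, the Hann window, and the frame and hop sizes are assumptions for illustration and are not specified in the disclosure. The Mel-scale conversion is omitted.

```python
import cmath
import math


def stft_magnitude(samples, frame_len, hop):
    """Return a magnitude spectrogram as a list of frames.

    Each frame is a list of magnitudes for frequency bins 0..frame_len // 2.
    A Hann window is applied to each frame before the transform (assumption).
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    spectrogram = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            magnitudes.append(abs(acc))
        spectrogram.append(magnitudes)
    return spectrogram


# Toy input: an 8 Hz tone sampled at 64 Hz for one second.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(64)]
tone_spec = stft_magnitude(tone, frame_len=16, hop=8)
```

With a 16-sample frame at 64 Hz each bin spans 4 Hz, so the 8 Hz tone peaks in bin 2 of every frame.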

According to an embodiment of the present application, a first spectrogram, a second spectrogram, ..., and an N-th spectrogram may be obtained that correspond respectively to the first onset signal, the second onset signal, ..., and the N-th onset signal extracted from the acoustic signal. The spectrogram acquisition module may obtain as many spectrograms as there are onset signals obtained by the onset signal extraction module 2320. As one example, the spectrogram acquisition module may convert an onset signal into a spectrogram once for each onset signal obtained by the onset signal extraction module 2320. As another example, the spectrogram acquisition module may convert the acoustic signal into a full spectrogram and then, once for each onset signal obtained by the onset signal extraction module 2320, extract from the full spectrogram the spectrogram corresponding to that onset signal.
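The second alternative, cropping from a full spectrogram, amounts to converting the onset time and the onset length into frame indices. The sketch below assumes the full spectrogram is a list of frames spaced `hop_s` seconds apart. The function name and parameters are hypothetical.

```python
def crop_spectrogram(full_spectrogram, onset_time_s, onset_length_s, hop_s):
    """Return the frames of the full spectrogram that cover one onset signal.

    full_spectrogram: list of frames, where frame i starts at i * hop_s seconds.
    """
    first_frame = int(round(onset_time_s / hop_s))
    frame_count = int(round(onset_length_s / hop_s))
    return full_spectrogram[first_frame:first_frame + frame_count]


# Toy full spectrogram: 100 frames, each holding only its own index.
toy_full = [[i] for i in range(100)]
toy_crop = crop_spectrogram(toy_full, onset_time_s=0.2,
                            onset_length_s=0.5, hop_s=0.01)
```

Computing the full spectrogram once and slicing it avoids transforming overlapping onset signals repeatedly, at the cost of transforming parts of the recording that contain no onset.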

The server 2000 may determine whether a cough section is present (S3000). The server 2000 may make this determination based on the spectrogram obtained in step S2000.

FIG. 6 is a diagram illustrating a cough discrimination operation according to an embodiment of the present application.

The cough discrimination module 2330 may determine whether a cough section is present based on the spectrogram corresponding to the onset signal. The cough discrimination module 2330 may include a cough discrimination model trained to receive spectrogram data and classify it as cough or non-cough.

The spectrogram data may be a spectrogram having a predetermined length in the time domain. The spectrogram data may be the spectrogram corresponding to the onset signal extracted in step S1000.

If necessary, preprocessing may be performed by the data preprocessing module 2340 before the spectrogram data is input to the cough discrimination module 2330. The spectrogram data may be data obtained by applying, through the data preprocessing module 2340, at least one of resizing, scaling, and gray-to-RGB conversion to the spectrogram corresponding to the onset signal extracted in step S1000. As a specific example, the spectrogram may be resized through the data preprocessing module 2340.
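The three preprocessing operations can be sketched on a spectrogram held as a nested list. The function name `preprocess_spectrogram` and the nearest-neighbour interpolation are assumptions, since the disclosure does not state how the resizing is carried out. The 300 x 300 target size, division by the maximum value, and gray-to-RGB conversion follow the values reported for the experimental examples.

```python
def preprocess_spectrogram(spectrogram, size=300):
    """Resize, scale, and convert a single-channel spectrogram to RGB.

    spectrogram: non-empty list of rows of non-negative magnitudes.
    Returns a size x size grid whose cells are [r, g, b] lists in 0..1.
    """
    rows, cols = len(spectrogram), len(spectrogram[0])
    # Resize by nearest-neighbour sampling (assumption).
    resized = [[spectrogram[r * rows // size][c * cols // size]
                for c in range(size)]
               for r in range(size)]
    # Scale by dividing by the maximum value; guard against an all-zero input.
    peak = max(max(row) for row in resized)
    divisor = peak if peak > 0 else 1.0
    # Gray-to-RGB: replicate the single channel three times.
    return [[[value / divisor] * 3 for value in row] for row in resized]


toy_image = preprocess_spectrogram([[0, 2], [4, 8]], size=4)
```

Replicating the gray channel lets a single-channel spectrogram be fed to an image model that expects three-channel input.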

According to an embodiment of the present application, the cough discrimination model may be trained using a training data set that includes the spectrogram data and tagging information labeled on the spectrogram data. The tagging information may include information about whether the spectrogram data includes a sound corresponding to a cough.

In specific experimental examples, the cough discrimination model was built using EfficientNet.

In Experimental Example 1, EfficientNet-B5 was used. Spectrograms with a time length of 0.5 s were used as input data, and cough/non-cough labels assigned by a person listening to the recorded sound were used as output data. For the input, each spectrogram was divided by frequency band and distributed over a total of 512 input nodes. Before being input to the model, each spectrogram was resized to 300 x 300, scaled by dividing by its maximum value, and converted from gray to RGB. The cough discrimination model was trained on a total of 350,000 training samples and achieved a precision of 0.84 and a recall of 0.93.

In Experimental Example 2, EfficientNet-B3 was used. Spectrograms with a time length of 0.5 s were used as input data, and cough/non-cough labels assigned by a person listening to the recorded sound were used as output data. For the input, each spectrogram was divided by frequency band and distributed over a total of 512 input nodes. Before being input to the model, each spectrogram was resized to 300 x 300, scaled by dividing by its maximum value, and converted from gray to RGB. The cough discrimination model was trained on a total of 350,000 training samples and achieved a precision of 0.92 and a recall of 0.9.

EfficientNet-B5 is generally known to be the more accurate model, but in the experiments EfficientNet-B3 achieved the higher precision (0.92 versus 0.84), with a slightly lower recall (0.9 versus 0.93). This confirms that, in implementing a cough counting model, the choice of model size relative to the data size can affect accuracy.
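For a single-number comparison of the two experimental examples, the F1 score (the harmonic mean of precision and recall) can be derived from the reported figures. The F1 score is not reported in the source. It is computed here only from the stated precision and recall values, and the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for Experimental Examples 1 and 2.
f1_b5 = f1_score(0.84, 0.93)   # about 0.883
f1_b3 = f1_score(0.92, 0.90)   # about 0.910
```

On this derived metric the smaller model also comes out ahead, which is consistent with the observation about model size above.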

Referring again to FIG. 3, the server 2000 may calculate the total number of coughs (S4000). The server 2000 may calculate the total number of coughs in the acoustic signal based on the results, obtained in step S3000, determined for the spectrogram of at least one onset signal.

According to an embodiment of the present application, steps S2000 and S3000 may be performed for each of the one or more extracted onset signals. Steps S2000 and S3000 may be performed as many times as the number of onset signals obtained in step S1000.
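The per-onset repetition of steps S2000 and S3000 can be sketched as a simple loop. The callables `to_spectrogram` and `is_cough` are placeholders standing in for the spectrogram acquisition module and the trained cough discrimination model. Their names and signatures are assumptions, and the stubs in the example are not a real classifier.

```python
def classify_onset_signals(onset_signals, to_spectrogram, is_cough):
    """Run S2000 and S3000 once per onset signal.

    Returns one boolean per onset signal: True if it was judged a cough.
    """
    results = []
    for onset_signal in onset_signals:
        spectrogram = to_spectrogram(onset_signal)   # S2000
        results.append(bool(is_cough(spectrogram)))  # S3000
    return results


# Stub stages for illustration only: identity "spectrogram" and a
# peak-amplitude rule in place of the trained model.
stub_results = classify_onset_signals(
    [[0.1, 0.9], [0.2, 0.1], [0.7, 0.3]],
    to_spectrogram=lambda signal: signal,
    is_cough=lambda spec: max(spec) > 0.5,
)
```

The list of per-onset results, together with the onset start times, is the input that step S4000 needs in order to count coughs.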

The server 2000 may determine the total number of coughs based on the onset signals determined to be coughs in step S3000. The server 2000 may determine the total number of coughs by counting the onset signals determined to be coughs.

Because a cough is accompanied by an abrupt change in sound pressure, cough sections can be captured from the whole acoustic signal by determining onset points and extracting onset signals. Accordingly, this operation may be performed so that only the captured sections are applied to the cough discrimination model, thereby reducing the amount of computation.

In this case, however, post-processing is needed for accurate cough counting, because 1) in typical algorithms used for onset point detection, two consecutive sampling intervals overlap in the time domain, and 2) owing to the nature of cough sounds, two or more onset points may be detected within a single cough.

According to an embodiment of the present application, the server 2000 may be implemented to count two adjacent onset signals determined to be coughs as one cough when the separation between their start times is within a reference time (for example, 0.5 s).

FIG. 7 is a diagram illustrating a cough counting method according to an embodiment of the present application.

Referring to FIG. 7, the 'start time' in the left column indicates the time at which an onset signal starts within the acoustic signal. In other words, if an onset point is detected 15.5 seconds after the start of the acoustic signal, the start time of the onset signal may be shown as 00:00:15.5.

The 'tag' in the right column indicates the result determined by the cough discrimination model for the onset signal. In other words, if the onset signal is input to the cough discrimination model and classified (or predicted) as a cough, the tag of the onset signal may be shown as 'cough'.

The server 2000 may count two adjacent onset signals as one cough when the separation between their start times is within the reference time. Consider a first onset signal (00:00:15.5), a second onset signal (00:00:15.7), a third onset signal (00:17:20), a fourth onset signal (00:17:20.5), and a fifth onset signal (00:22:00), each determined to be a cough. The start times of the first and second onset signals are separated by 0.2 sec, which is less than the reference time (0.25 sec), so the server 2000 may count them as one cough. The start times of the third and fourth onset signals are separated by 0.5 sec, which is greater than the reference time (0.25 sec), so the server 2000 may count them as two coughs.

According to an embodiment of the present application, the server 2000 may not obtain the separation between the start time of the second onset signal and the start time of the third onset signal. In other words, when the separation between the start times of an earlier onset signal and a later onset signal is within the reference time and the two are counted as one, the later onset signal has in effect been 'deleted', so the server 2000 need not compare the start time of the later onset signal with that of the onset signal that follows it. Accordingly, the server 2000 may continue the cough counting procedure by comparing the start times of the third and fourth onset signals, without obtaining the separation between the start times of the second and third onset signals.
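One possible reading of this counting rule is sketched below. Each cough-tagged onset is compared with the most recently counted onset, so an onset that was merged is never used for a later comparison. The function name `count_coughs`, the use of seconds as the time unit, and this choice of comparison anchor are assumptions. The disclosure describes the behaviour only through the example of FIG. 7.

```python
def count_coughs(cough_onset_times_s, reference_time_s=0.25):
    """Count coughs from the start times of onset signals tagged as cough.

    Onsets within reference_time_s of the last counted onset are merged
    into it; onsets separated by more than reference_time_s are counted
    as a new cough.
    """
    count = 0
    last_counted = None
    for start_time in sorted(cough_onset_times_s):
        if last_counted is not None and start_time - last_counted <= reference_time_s:
            continue  # merged: effectively deleted, never compared again
        count += 1
        last_counted = start_time
    return count


# Start times from the FIG. 7 example, converted to seconds:
# 00:00:15.5, 00:00:15.7, 00:17:20, 00:17:20.5, 00:22:00
fig7_times = [15.5, 15.7, 1040.0, 1040.5, 1320.0]
fig7_count = count_coughs(fig7_times)  # → 4
```

In the example the first two onsets merge into one cough, and the third, fourth, and fifth are each counted separately, giving four coughs. The separation between the second and third onsets is never computed.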

According to an embodiment of the present application, the reference time may be shorter than 0.5 sec, because a person rarely completes two coughs within 0.5 seconds. Accordingly, the reference time may be set to 0.25 sec; it is not limited to this value, and an adjusted value may be used where necessary (for example, when the characteristics of consecutive coughs are taken into account). According to an embodiment of the present application, the onset length may be longer than 0.5 sec, because most coughs are completed within 0.5 seconds and 0.5 seconds is therefore long enough to determine whether a cough is present. Accordingly, the onset length may be set to 0.5 sec; it is not limited to this value, and an adjusted value may be used where necessary (for example, when the type of cough is to be determined). The length of the onset signal in the time domain (that is, the onset length) may be longer than the reference time.

The embodiments of the present invention described above may be implemented in the form of program instructions executable by various computer components and recorded on a non-transitory computer-readable recording medium. The non-transitory computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the non-transitory computer-readable recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the field of computer software. Examples of non-transitory computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware device may be configured to operate as one or more software modules in order to perform the processing according to the present invention, and vice versa.

Although the present invention has been described above with reference to specific details, such as particular components, and to limited embodiments and drawings, these are provided only to aid a more general understanding of the present invention. The present invention is not limited to the above embodiments, and those of ordinary skill in the art to which the present invention pertains may make various modifications and variations based on this description.

Accordingly, the spirit of the present invention should not be limited to the embodiments described above, and not only the claims set out below but also everything that is equal or equivalent to those claims falls within the scope of the spirit of the present invention.

Claims

A method of counting coughs by analyzing an acoustic signal, the method comprising:
extracting one or more onset signals from the acoustic signal, wherein each onset signal includes a signal corresponding to an attack of a sound and has a predetermined length in a time domain;
obtaining a spectrogram corresponding to the extracted onset signal;
determining, by using a cough discrimination model, whether the obtained spectrogram is a cough section; and
calculating the total number of coughs in the acoustic signal based on a result determined for the spectrogram of the onset signal,
wherein the obtaining step and the determining step are performed for each onset signal of the extracted one or more onset signals,
wherein, in the step of calculating the total number of coughs,
if a time point of a first onset signal and a time point of a second onset signal are separated by more than a reference time, the first and second onset signals are counted as two coughs, and
if the time point of the first onset signal and the time point of the second onset signal are within the reference time, they are counted as one cough, and
wherein the first onset signal and the second onset signal are onset signals determined to be coughs.
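As an illustrative sketch only, and not part of the claimed subject matter, the counting rule recited above may be expressed as follows. The function name `count_coughs`, the use of seconds as the time unit, and the choice to compare each cough onset with the immediately preceding cough onset are assumptions made for illustration.

```python
from typing import Optional, Sequence


def count_coughs(cough_onset_times: Sequence[float], reference_time: float) -> int:
    """Count coughs from the time points of onset signals already classified as cough.

    Two cough onsets separated by more than `reference_time` count as two coughs;
    two cough onsets within `reference_time` of each other count as one.
    """
    count = 0
    previous: Optional[float] = None
    for time_point in sorted(cough_onset_times):
        # A new cough is counted only when the gap to the previous cough
        # onset exceeds the reference time (assumed adjacent comparison).
        if previous is None or time_point - previous > reference_time:
            count += 1
        previous = time_point
    return count
```

For example, with a hypothetical reference time of 0.3 seconds, cough onsets at 0.0, 0.1, 1.0, 1.05, and 3.0 seconds would be merged into three coughs under this sketch.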

The method of claim 1,
wherein the length of the onset signal in the time domain is longer than the reference time.

The method of claim 1,
wherein the step of extracting the onset signal comprises:
detecting an onset point in the acoustic signal; and
extracting a signal corresponding to a time interval of the predetermined length, with the detected onset point as a starting point.
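The two extraction steps may be sketched as follows, purely for illustration. The claims do not specify how an onset point is detected; the short-time energy threshold used here, the function names, the parameters `frame_size` and `threshold`, and the zero-padding at the end of the recording are all assumptions.

```python
from typing import List, Sequence


def detect_onsets(signal: Sequence[float], frame_size: int, threshold: float) -> List[int]:
    """Return sample indices where short-time energy first rises above `threshold`.

    This energy-based detector is a hypothetical stand-in for onset detection.
    """
    onsets: List[int] = []
    was_active = False
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        frame = signal[start:start + frame_size]
        energy = sum(x * x for x in frame) / frame_size
        is_active = energy > threshold
        if is_active and not was_active:
            onsets.append(start)  # quiet-to-loud transition marks an onset
        was_active = is_active
    return onsets


def extract_onset_signals(
    signal: Sequence[float], onsets: Sequence[int], length: int
) -> List[List[float]]:
    """Cut a segment of `length` samples starting at each detected onset point."""
    segments: List[List[float]] = []
    for onset in onsets:
        segment = list(signal[onset:onset + length])
        # Zero-pad so every onset signal has the same predetermined length.
        segment += [0.0] * (length - len(segment))
        segments.append(segment)
    return segments
```

Every returned segment has the same length, which matches the requirement that each onset signal have a predetermined length in the time domain.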

The method of claim 3,
wherein the step of obtaining the spectrogram comprises
obtaining the spectrogram by converting the extracted onset signal into a frequency domain, including performing a Fourier transform on the extracted onset signal.
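A minimal sketch of obtaining a magnitude spectrogram from one onset signal by a short-time Fourier transform is given below. The Hann window, the parameters `frame_size` and `hop_size`, and the naive discrete Fourier transform (written out for clarity; a fast Fourier transform routine would normally be used) are assumptions and are not taken from the specification.

```python
import cmath
import math
from typing import List, Sequence


def spectrogram(signal: Sequence[float], frame_size: int, hop_size: int) -> List[List[float]]:
    """Magnitude spectrogram: one row per time frame, one column per frequency bin."""
    # Periodic Hann window to reduce spectral leakage (assumed choice).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size) for n in range(frame_size)]
    rows: List[List[float]] = []
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        row = []
        # Keep only the non-negative frequency bins 0..frame_size // 2.
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n in range(frame_size)
            )
            row.append(abs(coeff))
        rows.append(row)
    return rows


# Usage: a pure tone with 4 cycles per 32-sample frame peaks in frequency bin 4.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(64)]
spec = spectrogram(tone, frame_size=32, hop_size=16)
```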

The method of claim 3,
wherein the step of obtaining the spectrogram comprises
extracting the spectrogram corresponding to the extracted onset signal from an entire spectrogram obtained by transforming the acoustic signal into a frequency domain.
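As an alternative to transforming each onset signal separately, the spectrogram of the whole recording may be computed once and sliced. The sketch below assumes a row-per-frame layout in which frame i begins at sample i × `hop_size`; the function name and this indexing convention are hypothetical.

```python
from typing import List


def slice_spectrogram(
    full_spec: List[List[float]], onset_sample: int, length_samples: int, hop_size: int
) -> List[List[float]]:
    """Return the rows of `full_spec` that cover one onset signal.

    `full_spec` is assumed to hold one row per time frame, where frame i
    starts at sample i * hop_size of the original acoustic signal.
    """
    first_frame = onset_sample // hop_size
    n_frames = max(1, length_samples // hop_size)
    return full_spec[first_frame:first_frame + n_frames]
```

Under these assumptions, an onset at sample 32 with a predetermined length of 48 samples and a hop of 16 samples would map to frames 2 through 4 of the entire spectrogram.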

The method of claim 1,
wherein the cough discrimination model is a classification model trained to receive spectrogram data and classify it as cough or non-cough, and
wherein the spectrogram data is a spectrogram image having a predetermined length in the time domain.

The method of claim 6,
wherein the cough discrimination model is trained using a training data set including the spectrogram data and tagging information labeled on the spectrogram data, and
wherein the tagging information includes information on whether the spectrogram data includes a sound corresponding to a cough.

The method of claim 1,
wherein the determining step comprises:
performing at least one preprocessing operation among resizing, scaling, and RGB conversion on the obtained spectrogram; and
determining whether the spectrogram is a cough section by applying the preprocessed spectrogram to the cough discrimination model.
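The three preprocessing operations may be sketched as follows. The claims do not fix how each operation is carried out; nearest-neighbor resizing, min-max scaling to the range 0 to 1, and RGB conversion by replicating a grayscale value across three channels are assumptions, as are all function names.

```python
from typing import List, Tuple

Image = List[List[float]]


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Resize a 2-D array to out_h x out_w by nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def scale_to_unit(image: Image) -> Image:
    """Min-max scale values into [0, 1]; a constant image maps to all zeros."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / span for v in row] for row in image]


def to_rgb(image: Image) -> List[List[Tuple[int, int, int]]]:
    """Convert unit-range values to 8-bit RGB by repeating the gray level."""
    return [[(round(v * 255),) * 3 for v in row] for row in image]


def preprocess(spec: Image, out_h: int, out_w: int) -> List[List[Tuple[int, int, int]]]:
    """Apply resizing, scaling, and RGB conversion in sequence."""
    return to_rgb(scale_to_unit(resize_nearest(spec, out_h, out_w)))
```

The output size passed to `preprocess` would be chosen to match the input size expected by the cough discrimination model, which the claims leave unspecified.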

The method of claim 1,
wherein the calculating step comprises
determining, for the onset signals determined to be coughs, whether an interval between the time points of two adjacent onset signals in the time domain is within the reference time.

A non-transitory computer readable recording medium storing a computer program for executing the method of claim 1.

A server in a system for counting coughs by analyzing an acoustic signal, the server comprising:
a communication unit that obtains an acoustic signal including a sound recorded by an external device;
a memory unit that stores instructions for loading a cough discrimination model; and
a control unit configured to:
extract one or more onset signals from the acoustic signal, wherein each onset signal includes a signal corresponding to an attack of a sound and has a predetermined length in a time domain;
obtain a spectrogram corresponding to the extracted onset signal;
determine, by using the cough discrimination model, whether the obtained spectrogram is a cough section; and
calculate the total number of coughs in the acoustic signal based on a result determined for the spectrogram of the onset signal.
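The sequence of operations attributed to the control unit may be sketched as a single function, for illustration only. The spectrogram conversion and the cough discrimination model are passed in as callables because their internals are not specified here; the function name, the `(time_point, samples)` pair representation of an onset signal, and the `reference_time` parameter are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Spectrogram = List[List[float]]
OnsetSignal = Tuple[float, Sequence[float]]  # (time point in seconds, samples)


def count_coughs_in_recording(
    onset_signals: Sequence[OnsetSignal],
    to_spectrogram: Callable[[Sequence[float]], Spectrogram],
    is_cough: Callable[[Spectrogram], bool],
    reference_time: float,
) -> int:
    """Classify each onset signal, then count coughs from the cough onsets."""
    cough_times: List[float] = []
    # The spectrogram and classification steps run once per onset signal.
    for time_point, samples in onset_signals:
        if is_cough(to_spectrogram(samples)):
            cough_times.append(time_point)
    cough_times.sort()
    count = 0
    for i, time_point in enumerate(cough_times):
        # Adjacent cough onsets within the reference time are merged into one.
        if i == 0 or time_point - cough_times[i - 1] > reference_time:
            count += 1
    return count
```

Injecting `to_spectrogram` and `is_cough` keeps the sketch independent of any particular transform or trained model.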

The server of claim 11,
wherein the control unit is configured to determine, for the onset signals determined to be coughs, whether an interval between the time points of two adjacent onset signals in the time domain is within the reference time.

The server of claim 11,
wherein the control unit is configured to perform at least one preprocessing operation among resizing, scaling, and RGB conversion on the obtained spectrogram, and
to determine whether the obtained spectrogram is a cough section by applying the preprocessed spectrogram to the cough discrimination model.

The server of claim 11,
wherein the cough discrimination model is a classification model trained to receive spectrogram data and classify it as cough or non-cough, and
wherein the spectrogram data is a spectrogram image having a predetermined length in the time domain.

The server of claim 14,
wherein the cough discrimination model is trained using a training data set including the spectrogram data and tagging information labeled on the spectrogram data, and
wherein the tagging information includes information on whether the spectrogram data includes a sound corresponding to a cough.