US20220384040A1 - Machine Learning Model Based Condition and Property Detection - Google Patents
- Publication number
- US20220384040A1 (application US17/752,741)
- Authority
- US
- United States
- Prior art keywords
- dataset
- model
- predetermined data
- stage
- data attribute
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H40/00—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
- G16H40/60—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices
- G16H40/67—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices for remote operation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
Definitions
- Condition and property detection, such as various types of diagnostics, may be performed in a variety of ways that differ considerably depending on the condition or property under analysis, but often relies on manual processes.
- medical diagnostics often require extraction and testing of a blood or tissue sample, or expert review and interpretation of images or test results generated by sophisticated testing equipment such as computerized tomography (CT) or magnetic resonance imaging (MRI) scanners, electrocardiogram (ECG) machines, and the like.
- diagnostics performed on industrial equipment or other machines typically require human inspection, or at the very least review of sensor data by a trained human technician.
- a common element among many is the need for a human having some level of expertise to participate in the process.
- FIG. 1 shows a diagram of an exemplary system for performing machine learning (ML) model based condition and property detection, according to one implementation
- FIG. 2 shows a diagram of another exemplary implementation of a system for performing ML model based condition and property detection
- FIG. 3 illustrates a process for generating a dataset for use in training a first stage of a ML model for performing condition and property detection, according to one implementation
- FIG. 4 illustrates a process for training a first stage of a ML model for performing condition and property detection, according to one implementation
- FIG. 5 illustrates a process for generating a dataset for use in training a second stage of a ML model for performing condition and property detection, according to one implementation
- FIG. 6 illustrates a process for training a second stage of a ML model for performing condition and property detection, according to one implementation
- FIG. 7 illustrates an exemplary ML model trained to perform condition and property detection, according to one implementation
- FIG. 8 illustrates an exemplary ML model for performing condition and property detection that includes multiple first and second stages, according to one implementation
- FIG. 9 illustrates an exemplary ML architecture for performing condition and property detection in which multiple ML model pipelines having different first and second stages are utilized in parallel, according to one implementation.
- FIG. 10 shows a flowchart outlining an exemplary method for performing ML model based condition and property detection, according to one implementation.
- the present application discloses systems and methods for performing machine learning model based condition and property detection. It is noted that although the present condition and property detection solution is described below in detail by reference to FIGS. 3 , 4 , 5 , 6 , 7 , 8 , and 9 illustrating an exemplary use case in which a dataset including time-based audio of a human voice is used to predict the presence of an infectious disease marker, the present novel and inventive principles may be advantageously applied to various types of data to predict a wide variety of properties of interest. For instance, the present concepts can be readily adapted for use with substantially any type of dataset or data stream that can be granulized into tagged segment types, such as time-based media in the form of audio, video, or audio-video (AV) content, or time-based diagnostic test data such as electrocardiogram (ECG) data or electroencephalogram (EEG) data, to name a few examples.
- Specific use cases for the present novel and inventive concepts may include the operating performance of machinery such as home appliances, industrial equipment, ventilation systems, and vehicles, for instance.
- Examples of additional medical applications may include prediction of non-infectious disease states, prediction of chronic medical conditions such as dementia and schizophrenia, prediction of the presence of or recovery from musculoskeletal injury, immune system status, and stroke status, again to name merely a few examples.
- Specific example use cases for the present novel and inventive concepts may include using video to predict early onset Alzheimer's disease or Parkinson's disease, or to predict a leg injury in a subject, for instance, based on walking or other movement by the subject.
- video may be used to predict that a subject has had a stroke based on upper body movements or facial movements or expressions by the subject.
- AV content or audio content may be used to diagnose malfunction of an appliance, such as a washing machine, or the need to replace a timing belt or other drive component of a car.
- the systems and methods disclosed by the present application may be substantially or fully automated.
- the terms “automation,” “automated,” and “automating” refer to systems and processes that do not require the participation of a human user, such as a human system administrator.
- Although an engineer or medical professional may review the performance of the automated systems operating according to the automated processes described herein, that human involvement is optional.
- the processes described in the present application may be performed under the control of hardware processing components of the disclosed systems.
- a “machine learning model,” or “ML model,” refers to a mathematical model for making future predictions based on patterns learned from samples of data obtained from a set of trusted known matches and known mismatches, known as training data. Various learning algorithms can be used to map correlations between input data and output data. These correlations form the mathematical model that can be used to make future predictions on new input data.
- a predictive model may include one or more logistic regression models, Bayesian models, or artificial neural networks (NNs), for example.
- machine learning models may be designed to progressively improve their performance of a specific task.
- a NN is a type of machine learning model in which patterns or learned representations of observed data are processed using highly connected computational layers that map the relationship between inputs and outputs.
- a “deep neural network” in the context of deep learning, may refer to a NN that utilizes multiple hidden layers between input and output layers, which may allow for learning based on features not explicitly-defined in raw data.
- a feature labeled or described as a NN refers to a deep neural network.
- NNs may be utilized to perform image processing, audio processing, or natural-language processing, for example.
- FIG. 1 shows exemplary system 100 for performing ML model based condition and property detection, according to one implementation.
- system 100 includes computing platform 102 having processing hardware 104 and system memory 106 implemented as a non-transitory storage medium.
- system memory 106 stores software code 108 which may include one or more ML models.
- system 100 is implemented within a use environment including user systems 140 a, 140 b, 140 c, and 140 d (hereinafter “user systems 140 a - 140 d ”) providing respective datasets 120 a, 120 b, 120 c, and 120 d (hereinafter “datasets 120 a - 120 d ”), which may include time-based media for example, to system 100 via communication network 130 .
- Also shown in FIG. 1 are display 148 of user system 140 a, and network communication links 132 of communication network 130 interactively connecting system 100 with user systems 140 a - 140 d.
- system memory 106 may take the form of any computer-readable non-transitory storage medium.
- a computer-readable non-transitory storage medium may correspond to various types of media, such as volatile media and non-volatile media, for example.
- Volatile media may include dynamic memory, such as dynamic random access memory (dynamic RAM), while non-volatile memory may include optical, magnetic, or electrostatic storage devices.
- Common forms of computer-readable non-transitory storage media include, for example, optical discs such as DVDs, RAM, programmable read-only memory (PROM), erasable PROM (EPROM), and FLASH memory.
- Processing hardware 104 of system 100 may include multiple hardware processing units, such as one or more central processing units, one or more graphics processing units, one or more tensor processing units, one or more field-programmable gate arrays (FPGAs), custom hardware for machine-learning training or inferencing, and an application programming interface (API) server, for example.
- CPU central processing unit
- GPU graphics processing unit
- TPU tensor processing unit
- a CPU includes an Arithmetic Logic Unit (ALU) for carrying out the arithmetic and logical operations of computing platform 102 , as well as a Control Unit (CU) for retrieving programs, such as software code 108 , from system memory 106 , while a GPU may be implemented to reduce the processing overhead of the CPU by performing computationally intensive graphics or other processing tasks.
- a TPU is an application-specific integrated circuit (ASIC) configured specifically for artificial intelligence (AI) processes such as machine learning.
- system 100 may include one or more computing platforms corresponding to computing platform 102 , such as computer servers for example, which may be co-located, or may form an interactively linked but distributed system, such as a cloud-based system, for instance.
- processing hardware 104 and system memory 106 may correspond to distributed processor and memory resources within system 100 .
- computing platform 102 may correspond to one or more web servers accessible over a packet-switched network such as the Internet, for example.
- computing platform 102 may correspond to one or more computer servers supporting a wide area network (WAN), a local area network (LAN), or included in another type of private or limited distribution network.
- system 100 may be implemented virtually, such as in a data center.
- system 100 may be implemented in software, or as virtual machines.
- Although user systems 140 a - 140 d are shown variously as smartphone computer 140 a, video camera 140 b , microphone 140 c, and machine or diagnostic device 140 d in FIG. 1 , those representations are provided merely by way of example.
- user systems 140 a - 140 d may take the form of any mobile or stationary devices capable of obtaining datasets 120 a - 120 d.
- user systems 140 a - 140 d may be any suitable mobile or stationary computing devices or systems that implement data processing capabilities sufficient to provide a user interface, support connections to communication network 130 , and implement the functionality ascribed to user systems 140 a - 140 d herein.
- one or more of user systems 140 a - 140 d may take the form of a desktop computer, laptop computer, tablet computer, or an implanted medical device, such as a pacemaker or pump, to name a few examples.
- one or more of user systems 140 a - 140 d may take the form of a smart wearable device, such as a smartwatch, for example.
- display 148 may take the form of a liquid crystal display (LCD), light-emitting diode (LED) display, organic light-emitting diode (OLED) display, quantum dot (QD) display, or any other suitable display screen that performs a physical transformation of signals to light.
- FIG. 2 shows another exemplary system, i.e., user system 240 , for performing ML model based condition and property detection, according to another implementation.
- user system 240 includes computing platform 242 having processing hardware 244 , memory 246 implemented as a non-transitory storage medium storing user software application 250 , and display 248 .
- display 248 may be physically integrated with user system 240 or may be communicatively coupled to but physically separate from user system 240 .
- Where user system 240 is implemented as a smartphone, laptop computer, or tablet computer, display 248 will typically be integrated with user system 240 .
- display 248 may take the form of a monitor separate from computing platform 242 in the form of a computer tower.
- User system 240 corresponds in general to any or all of user systems 140 a - 140 d in FIG. 1
- display 248 corresponds in general to display 148
- user systems 140 a - 140 d and display 148 may share any of the characteristics attributed to user system 240 and display 248 by the present disclosure, and vice versa. That is to say, like display 148 , display 248 may take the form of an LCD, LED display, OLED display, or QD display, for example.
- one or more of user system 140 a - 140 d may include features corresponding respectively to computing platform 242 , processing hardware 244 , and memory 246 storing user software application 250 .
- User system processing hardware 244 may include multiple hardware processing units, such as one or more CPUs, one or more GPUs, one or more TPUs, and one or more FPGAs, for example, as those features are defined above.
- user software application 250 may be a thin client application of software code 108 , in FIG. 1 .
- user software application 250 may enable user system 240 to obtain any of datasets 120 a - 120 d, and to provide that dataset to system 100 for processing.
- user software application 250 may include substantially all of the features and functionality of software code 108 . That is to say, in some implementations, user system 240 may perform any or all of the operations attributed to system 100 by the present disclosure.
- user software application 250 is located in memory 246 of user system 240 , subsequent to transfer of user software application 250 to user system 240 via an external flash drive or dongle, or over a packet-switched network, such as the Internet, for example.
- user software application 250 may be persistently stored in memory 246 and may be executed locally on user system 240 by user system processing hardware 244 .
- FIGS. 3 , 4 , 5 , 6 , 7 , 8 , and 9 illustrate a specific example use case wherein audio properties of a dataset including time-based media in the form of recorded human vocalizations are used to predict the presence of an infectious disease, in one particular use case the presence of a respiratory illness, such as influenza or Coronavirus Disease 2019 (COVID-19).
- the prediction solution disclosed in the present application overcomes the aforementioned deficiencies in the conventional art by implementing a multi-step ML model based process that can predict disease presence by automatically segmenting unstructured vocal sample data into normalized datasets, each data element being of specific audio segment types that are determined to be optimal for prediction of COVID-19 infection, and using those datasets to train disease predictors to be more flexible and precise than conventional approaches allow.
- the present prediction solution can take a dataset of unstructured voice samples that are tagged with COVID-19 status (collected from hospitals, for example), extract the “mmmmm” utterance segments into a normalized dataset that retains the corresponding COVID-19 status tags, and use this dataset to train and deploy a composite ML model that can predict COVID-19 status, based upon input of an unstructured vocal sample.
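The extraction step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `TaggedSample` type, the `segmenter` callable standing in for the first-stage model, and the reading of "normalized" as "cut to a fixed length" are all assumptions introduced here.

```python
from dataclasses import dataclass

@dataclass
class TaggedSample:
    samples: list   # raw audio samples of one recording or one extracted segment
    status: str     # status tag assigned at collection, e.g. "positive" / "negative"

def extract_normalized_dataset(recordings, segmenter, segment_len):
    """Cut every detected span out of each recording, keeping its status tag.

    segmenter(samples) -> list of (start_index, end_index) is a hypothetical
    stand-in for the segment detector. Each span is split into pieces of
    exactly segment_len samples; shorter leftovers are dropped.
    """
    dataset = []
    for rec in recordings:
        for start, end in segmenter(rec.samples):
            for i in range(start, end - segment_len + 1, segment_len):
                piece = rec.samples[i:i + segment_len]
                dataset.append(TaggedSample(piece, rec.status))
    return dataset

# Toy usage: integers stand in for audio samples.
recordings = [
    TaggedSample(list(range(10)), "positive"),
    TaggedSample(list(range(6)), "negative"),
]
toy_segmenter = lambda samples: [(2, len(samples))]
normalized = extract_normalized_dataset(recordings, toy_segmenter, segment_len=3)
```

Every element of `normalized` has the same length and carries the tag of the recording it came from, which is the property the downstream predictor relies on.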
- the vocal utterance identified in the present application as “mmmmm” refers to a sustained consonantal sound known formally as the “voiced bilabial nasal,” identified by the symbol (m) in the International Phonetic Alphabet. That is to say, the vocal utterance “mmmmm” is produced by sustaining the sound of the English letter “m” at the end of the English word “them.”
- this example relies on prior knowledge that a segment type based on the “mmmmm” utterance would be useful for ML model based prediction of COVID-19
- the present prediction process can be performed with various different segment types in parallel, such as isolating utterances of each vocal phoneme into separate datasets, for example, in order to determine which segment types are most effective for prediction of COVID-19. This may be advantageous in cases where no a priori or existing a posteriori knowledge exists, as well as for identifying segments which can expand data collection opportunities.
- a segment based on a phoneme sound such as “o ⁇ ” would be collectible from normal speech (and thus amenable to ambient, passive data collection approaches), but a segment based on a coughing sound would only be collectible from people who are symptomatic or from those who are instructed to cough via a coached data collection process.
- DS1 a dataset of various vocal utterances, including examples of the sound “mmmmm” as well as many other sounds, where each data element is tagged with a label that specifies whether the element Is or Is Not “mmmmm.”
- This initial dataset (hereinafter “DS1” as identified in FIG. 3 ) can either be acquired or created.
- DS1 may then be used to train a ML model (hereinafter “ML1” as also identified in FIG. 3 ) which takes as input audio segment elements in the format of DS1 and outputs a likelihood value for the input element being “mmmmm,” based on techniques such as spectrogram analysis, Hidden Markov models, and the like. This is illustrated by FIG.
- ML1 may be a feature of software code 108 of system 100 , or of user software application 250 of user system 240 .
- DS1 may be acquired or created, and may be used to train ML1, by software code 108 , executed by processing hardware 104 of system 100 , or by user software application 250 , executed by user system processing hardware 244 .
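A binary classifier of the kind ML1 represents can be sketched as below. The sketch swaps in plain logistic regression (one of the model families named earlier in this description) in place of spectrogram analysis or Hidden Markov models. The two-element feature vectors are synthetic placeholders rather than real acoustic features, and every function name is hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_ml1(features, labels, epochs=500, lr=0.5):
    """Fit logistic-regression weights by batch gradient descent.

    features: equal-length feature vectors, one per DS1 element (assumed format)
    labels:   1 for Is "mmmmm", 0 for Is Not "mmmmm"
    """
    n, dim = len(features), len(features[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def ml1_likelihood(model, x):
    """Return the modelled likelihood that feature vector x is the target sound."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Synthetic stand-in for DS1: two well-separated clusters, not real audio.
random.seed(0)
ds1_x = [[random.gauss(1.0, 0.3), random.gauss(-1.0, 0.3)] for _ in range(50)]
ds1_x += [[random.gauss(-1.0, 0.3), random.gauss(1.0, 0.3)] for _ in range(50)]
ds1_y = [1] * 50 + [0] * 50
ml1 = train_ml1(ds1_x, ds1_y)
```

The output of `ml1_likelihood` is a value between 0 and 1, matching the "likelihood value" role that ML1 plays in the later stages.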
- a new ML model (hereinafter “ML2” as identified in FIG. 4 ) may be created, which can predict the bounding timestamps of segments within an input audio file that are Yes for VC, in this case the “mmmmm” sound, with sufficiently high likelihood.
- ML2 may be created based upon ML1 and machine learning techniques such as sliding window convolutions, where stride length may be a tunable hyperparameter of the model.
- ML2 can be trained using another dataset (hereinafter “DS2” as also identified in FIG. 4 ) of unstructured audio speech files having temporal segments tagged with VC status. As with DS1, DS2 can either be acquired or created.
- ML2 may also be a feature of software code 108 of system 100 , or of user software application 250 of user system 240 .
- ML2 may be created, and may be trained using DS2, by software code 108 , executed by processing hardware 104 of system 100 , or by user software application 250 , executed by user system processing hardware 244 .
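One way to picture how a segment-level scorer becomes a timestamp predictor is a sliding window with a tunable stride, sketched below. It is an assumption-laden illustration: `score_fn` is a placeholder for ML1, the threshold and window sizes are arbitrary, and merging of overlapping hits is one simple choice among many.

```python
def find_segments(signal, sample_rate, score_fn, window, stride, threshold):
    """Return merged (start_seconds, end_seconds) spans scoring >= threshold.

    score_fn(chunk) -> likelihood in [0, 1] is a hypothetical stand-in for ML1.
    window and stride are in samples; stride is the tunable hyperparameter.
    """
    spans = []
    for start in range(0, len(signal) - window + 1, stride):
        end = start + window
        if score_fn(signal[start:end]) >= threshold:
            if spans and start <= spans[-1][1]:
                spans[-1][1] = end          # overlaps the previous hit: extend it
            else:
                spans.append([start, end])  # begin a new span
    return [(s / sample_rate, e / sample_rate) for s, e in spans]

def toy_score(chunk):
    # Placeholder scorer: mean absolute amplitude of the window.
    return sum(abs(v) for v in chunk) / len(chunk)

# One second of silence, one second of "signal", one second of silence at 8 Hz.
signal = [0.0] * 8 + [1.0] * 8 + [0.0] * 8
spans = find_segments(signal, sample_rate=8, score_fn=toy_score,
                      window=4, stride=2, threshold=0.99)
```

A smaller stride gives finer boundary resolution at the cost of more scorer calls, which is why it is natural to treat it as a hyperparameter.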
- dataset DS4 may be used to train another ML model (hereinafter “ML3” as identified in FIG. 6 ) that outputs a prediction for the likelihood of COVID-19 given an input in the format of the elements of DS4, “mmmmm” in this exemplary use case.
- ML3 can be created and trained without prior knowledge. However, efficacy may be improved by creating ML3 based on existing knowledge of acoustic biomarkers likely to be relevant for unique and specific prediction of COVID-19, even amongst asymptomatic carriers, and audio processing techniques such as Fast Fourier Transforms and Mel-frequency Cepstrum analysis.
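As background for the Mel-frequency analysis mentioned above, the sketch below shows one widely used Hz-to-mel conversion and how it yields perceptually spaced band centers. The 2595/700 constants are one common variant of the mel scale, not something specified by this disclosure, and the helper names are hypothetical.

```python
import math

def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels (common 2595 * log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_centers(n_bands, f_min, f_max):
    """Center frequencies (Hz) of n_bands bands equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_bands)]

centers = mel_band_centers(n_bands=10, f_min=0.0, f_max=8000.0)
```

The centers are densely packed at low frequencies and spread out at high frequencies, which is the property that makes mel-based features a frequent choice for voice analysis.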
- acoustic biomarkers may be manifestations of one or more of COVID-19 related neuromuscular vocal cord impairment, respiratory degradation, or changes in intonation, to name a few.
- ML3 may also be a feature of software code 108 of system 100 , or of user software application 250 of user system 240 .
- procedure 4 described above may be performed by software code 108 , executed by processing hardware 104 of system 100 , or by user software application 250 , executed by user system processing hardware 244 .
- the final ML model based COVID-19 predictor (hereinafter “ML+” as identified in FIG. 7 ) may be created based on a combination of the ML2 and ML3 models.
- ML+ may take the form of an integrated ML model having ML2 as a first, i.e., input stage, and having ML3 as a second, i.e., output stage.
- ML+ can take as input unstructured audio recording data (such as a person speaking normally into a mobile phone) and output a prediction of COVID-19 status, such as a likelihood score.
- When ML+, which could be deployed via user software application 250 , for example, is given input, it will first extract one or more regions that are predicted by ML2 to be of the “mmmmm” segment type. That region or those regions may then be provided as input to ML3.
- the resulting ML3 COVID-19 status predictions can then be utilized to determine the output of ML+ with respect to COVID-19 status, such as “Positive” or “Negative,” or as a likelihood score.
- the likelihood predictions could be averaged to determine the ML+ output prediction, but a more complicated process could be utilized as well.
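The two-stage flow with simple averaging can be sketched as follows. The callables `ml2_segmenter` and `ml3_predictor`, the 0.5 decision threshold, and the choice to return `None` when no segment is found are assumptions made for illustration only.

```python
from statistics import mean

def ml_plus(recording, ml2_segmenter, ml3_predictor, decision_threshold=0.5):
    """Segment with the first stage, score each segment with the second, average.

    ml2_segmenter(recording) -> list of (start_index, end_index)   (hypothetical)
    ml3_predictor(segment)   -> likelihood in [0, 1]               (hypothetical)
    Returns (likelihood, label), or None if no segment was detected.
    """
    spans = ml2_segmenter(recording)
    if not spans:
        return None  # nothing to score; the caller decides how to respond
    scores = [ml3_predictor(recording[start:end]) for start, end in spans]
    likelihood = mean(scores)
    label = "Positive" if likelihood >= decision_threshold else "Negative"
    return likelihood, label

# Toy stand-ins: two fixed spans, and a predictor that averages the samples.
recording = [0.2, 0.4, 0.8, 1.0]
result = ml_plus(recording,
                 ml2_segmenter=lambda r: [(0, 2), (2, 4)],
                 ml3_predictor=lambda seg: sum(seg) / len(seg))
```

Replacing `mean` with a median, a maximum, or a weighted vote would be examples of the "more complicated process" that the averaging step could give way to.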
- ML+ can be improved over time by direct training via datasets in the form of DS3, or through independent improvements in ML1, ML2, or ML3.
- any ground truth data from additional COVID-19 test results such as polymerase chain reaction (PCR) tests for example, can be used to augment dataset DS3, and consequently refine ML3 and parameters, such as acceptable prediction likelihood thresholds, averaging processes used for multiple ML3 predictions in ML+, and the like.
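As one illustration of refining a likelihood threshold from ground truth, the sketch below picks the cut-off that maximizes accuracy on confirmed results. Accuracy as the selection criterion, the candidate set, and the toy numbers are all assumptions; a deployed system might instead weigh sensitivity and specificity differently.

```python
def best_threshold(scores, truths):
    """Return (threshold, accuracy) maximizing accuracy on confirmed cases.

    scores: predicted likelihoods, one per subject
    truths: 1 if the confirmatory test was positive, else 0
    Candidate thresholds are the observed scores; ties keep the lowest one.
    """
    best_t, best_acc = None, -1.0
    for t in sorted(set(scores)):
        correct = sum((s >= t) == bool(y) for s, y in zip(scores, truths))
        acc = correct / len(scores)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Made-up illustrative values, not measured data.
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.9]
toy_truths = [0, 0, 1, 1, 1]
threshold, accuracy = best_threshold(toy_scores, toy_truths)
```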
- ML+ may be a feature of software code 108 of system 100 , or of user software application 250 of user system 240 .
- procedure 5 described above may be performed by software code 108 , executed by processing hardware 104 of system 100 , or by user software application 250 , executed by user system processing hardware 244 .
- procedures 1 through 5 may be performed on system 100 , while in other implementations procedures 1 through 5 may be performed on user system 240 .
- ML+ may be deployed to user software application 250 on user system 240 after its creation on system 100 .
- ML+ may be deployed to system 100 after its creation elsewhere, and software code 108 , when executed by processing hardware 104 , may utilize that pre-existing ML+ to predict the respiratory infection (e.g., COVID-19) status using unstructured voice samples.
- procedures 1 through 4 can be used to generate multiple different instances of ML3 in parallel.
- procedures 1 through 4 could be performed for each vocal phoneme instead of just the sound “mmmmm,” for example, and the most efficacious vocal segment types for prediction of COVID-19 could be determined. This could be accomplished by creating DS1 datasets tagged with Yes/No for each of the different segment types, or all at once in a single non-binary segmenter via the procedures depicted in FIG. 3 and FIG. 4 . This would lead to extracting datasets relating to each audio segment type in procedure 3, and so forth.
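Comparing segment types for efficacy can be sketched as a ranking of per-type predictors on held-out examples. The use of accuracy at a 0.5 cut-off as the efficacy measure, the dictionary layout, and the toy predictors are assumptions for illustration.

```python
def rank_segment_types(predictors, validation, cutoff=0.5):
    """Rank segment types by held-out accuracy of their predictor instance.

    predictors: {segment_type: callable(segment) -> likelihood}   (hypothetical)
    validation: {segment_type: [(segment, true_label), ...]}      (hypothetical)
    Returns [(segment_type, accuracy), ...] sorted best first.
    """
    results = {}
    for seg_type, predict in predictors.items():
        examples = validation[seg_type]
        hits = sum((predict(seg) >= cutoff) == bool(label)
                   for seg, label in examples)
        results[seg_type] = hits / len(examples)
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)

# Toy stand-ins: each "segment" is a single number and the predictors are trivial.
toy_examples = [(0.9, 1), (0.2, 0), (0.7, 1), (0.4, 0)]
ranking = rank_segment_types(
    predictors={"cough": lambda seg: 1.0 - seg, "mmmmm": lambda seg: seg},
    validation={"cough": toy_examples, "mmmmm": toy_examples},
)
```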
- An example of generating multiple different instances of ML3 is shown in FIG. 8 , in which each of machine learning pipelines 800 a, 800 b , and 800 c produces a different instantiation of ML3 independently of the others and in parallel. It is noted that although FIG. 8 shows three parallel machine learning pipelines 800 a, 800 b , and 800 c, that representation is merely exemplary. In other use cases in which multiple instances of ML3 are generated, as few as two parallel machine learning pipelines, or more than three machine learning pipelines, may be employed.
- ML+ can be enhanced as the result of the generation of multiple instances of ML3 using different instances of ML2 as shown in FIG. 8 .
- a version of ML+ finalized for deployment could make use of those parallel ML2 and ML3 portions of machine learning pipelines 800 a, 800 b , and 800 c in any desired combination, as further shown in FIG. 9 .
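One possible way to combine parallel pipelines is a weighted average of their per-pipeline scores, sketched below. The weighting scheme, the tuple layout, and the rule of skipping pipelines that detect no segment are assumptions; the disclosure leaves the combination open.

```python
def combine_pipelines(recording, pipelines):
    """Weighted average of per-pipeline likelihoods.

    pipelines: list of (weight, segmenter, predictor) tuples, all hypothetical:
      segmenter(recording) -> list of (start_index, end_index)
      predictor(segment)   -> likelihood in [0, 1]
    Pipelines that find no segment are skipped. Returns None if all are skipped.
    """
    total, weight_sum = 0.0, 0.0
    for weight, segmenter, predictor in pipelines:
        spans = segmenter(recording)
        if not spans:
            continue
        score = sum(predictor(recording[s:e]) for s, e in spans) / len(spans)
        total += weight * score
        weight_sum += weight
    return total / weight_sum if weight_sum else None

# Toy pipelines with constant predictors; the second one detects nothing.
toy_recording = [0.0, 0.0, 0.0, 0.0]
combined = combine_pipelines(toy_recording, [
    (2.0, lambda r: [(0, 2)], lambda seg: 0.9),
    (1.0, lambda r: [], lambda seg: 0.1),
    (1.0, lambda r: [(0, 1), (1, 2)], lambda seg: 0.3),
])
```

Weights could, for example, be set from each pipeline's validation performance, so that more reliable segment types contribute more to the final score.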
- Because the process depicted in FIG. 8 and FIG. 9 does not require prior knowledge or presumption, it can be useful for the prediction of diseases and physical or mental conditions that may have different, novel, and unknown vocal or non-vocal signatures, as well as the determination of the vocal or other data segment types that are optimal for the prediction or diagnosis of such diseases and conditions. These may include non-infectious diseases and chronic conditions such as dementia, schizophrenia, and Parkinson's disease, for example.
- non-disease characteristics can be predicted through this process as well, such as biological sex, age, and stress level. Because non-disease factors such as increased stress level are correlated with diminished immune system function, the prediction process described above can result not only in the creation of effective predictors of disease, but also predictors of disease susceptibility.
- Because the present ML model based diagnostic solution is configured to detect human manifestations of the disease state, it is agnostic, and therefore remains effective as a diagnostic tool, even when infectious vectors mutate.
- the present ML model based diagnostic solution can be expected to be, and remain, robustly reliable against viral sub-variants.
- any PII acquired by user software application 250 may be sequestered on user system 240 and be unavailable to system 100 or other external agents.
- FIG. 10 shows flowchart 1060 presenting an exemplary method for performing ML model based condition and property detection, according to one implementation. With respect to the method outlined in FIG. 10 , it is noted that certain details and features have been left out of flowchart 1060 in order not to obscure the discussion of the inventive features in the present application.
- datasets 120 a - 120 d may include a variety of different data types, including time-based media in the form of audio, video, AV content, sensor data, and test result data, to name a few examples.
- the dataset received in action 1062 may be generated or obtained by one of respective user systems 140 a / 240 , 140 b / 240 , 140 c / 240 , or 140 d / 240 , and may be received by system 100 from the one of respective user systems 140 a / 240 , 140 b / 240 , 140 c / 240 , or 140 d / 240 via communication network 130 and network communication links 132 .
- the dataset may be received in action 1062 by software code 108 , executed by processing hardware 104 of computing platform 102 .
- the diagnostic processing of the one of datasets 120 a - 120 d may be performed locally on one of respective user systems 140 a / 240 , 140 b / 240 , 140 c / 240 , or 140 d / 240
- the dataset received in action 1062 may be received by user software application 250 , executed by user system processing hardware 244 .
- Flowchart 1060 further includes performing an analysis of the dataset received in action 1062 , using a first stage, i.e., ML2 of trained ML model ML+, to detect the presence of a predetermined data attribute (action 1064 ).
- processing hardware 104 may execute software code 108 , or user system processing hardware 244 may execute user software application 250 , to determine whether the dataset received in action 1062 includes the sound “mmmmm” and the bounding timestamps of regions that include that characteristic of interest.
- the predetermined data attribute having its presence analyzed in action 1064 may be an audio attribute of the dataset received in action 1062 .
- that audio attribute may be derived from one or more of speech, a non-verbal utterance, or a pulmonary expulsion, such as a cough for example.
- the predetermined data attribute having its presence analyzed in action 1064 may be a visual attribute in the form of a human tremor or tic.
- processing hardware 104 may execute software code 108
- user system processing hardware 244 may execute user software application 250 to utilize a visual analyzer included as a feature of software code 108 or user software application 250 , an audio analyzer included as a feature of software code 108 or user software application 250 , or such a visual analyzer and audio analyzer, to perform the analysis of the received dataset in action 1064 .
- a visual analyzer included as a feature of software code 108 or user software application 250 may be configured to apply computer vision or other AI techniques to the dataset received in action 1062 , or may be implemented as a NN or other type of ML model. Such a visual analyzer may be configured or trained to recognize physical movements and their frequency, for example.
- An audio analyzer included as a feature of software code 108 or user software application 250 may also be implemented as a NN or other ML model.
- a visual analyzer and an audio analyzer may be used in combination to analyze the received dataset.
- the received dataset will typically include multiple video frames, multiple audio frames, or multiple video frames and multiple audio frames.
- processing hardware 104 may execute software code 108
- user system processing hardware 244 may execute user software application 250 to perform the visual analysis of the received dataset, the audio analysis of the received dataset, or both the visual analysis and the audio analysis, on a frame-by-frame basis. That is to say, in various implementations, the analysis of the received dataset in action 1064 may be performed by software code 108 , executed by processing hardware 104 of system 100 , or by user software application 250 , executed by user system processing hardware 244 .
- performing the analysis of the dataset in action 1064 may include detecting, using first stage ML2 of trained ML model ML+, one or more temporal segments of the received dataset that include the predetermined data attribute.
- first stage ML2 of trained ML model ML+ is configured to detect one or more temporal segments of the received dataset that include the predetermined data attribute
- first stage ML2 may be trained using a dataset, DS2, that has been annotated to identify the predetermined data attribute, to detect temporal segments of a test dataset that include the predetermined data attribute.
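As a rough illustration of how a first stage such as ML2 could turn per-window scores into temporal segments, the sketch below slides a fixed window over a signal, scores each window, and merges consecutive high-scoring windows into bounding timestamps. The names `find_segments`, `score_window`, and `energy_score`, the energy-based stand-in scorer, and the default window, stride, and threshold values are all assumptions made for illustration; none of them is taken from the present application.

```python
from typing import Callable, List, Sequence, Tuple


def find_segments(
    samples: Sequence[float],
    sample_rate: int,
    score_window: Callable[[Sequence[float]], float],
    window_s: float = 0.5,
    stride_s: float = 0.25,  # stride length is treated as a tunable hyperparameter
    threshold: float = 0.8,
) -> List[Tuple[float, float]]:
    """Return (start, end) timestamps in seconds of regions scored >= threshold."""
    window = int(window_s * sample_rate)
    stride = int(stride_s * sample_rate)
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must span at least one sample")
    segments: List[Tuple[float, float]] = []
    for start in range(0, max(len(samples) - window, 0) + 1, stride):
        if score_window(samples[start:start + window]) < threshold:
            continue
        t0 = start / sample_rate
        t1 = (start + window) / sample_rate
        if segments and t0 <= segments[-1][1]:
            # Overlaps or touches the previous hit: extend that segment.
            segments[-1] = (segments[-1][0], t1)
        else:
            segments.append((t0, t1))
    return segments


def energy_score(chunk: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained scorer: mean absolute amplitude in [0, 1]."""
    if not chunk:
        return 0.0
    return min(1.0, sum(abs(x) for x in chunk) / len(chunk))
```

In practice `score_window` would be a trained classifier rather than an energy measure; the merging step is what yields one pair of bounding timestamps per contiguous region.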
- DS2 may be created or obtained by system 100 or user system(s) 140 a - 140 d / 240 .
- DS2 may be generated by training another ML model to detect the presence of the predetermined data attribute in other test data, and using an output of that ML model to train yet another ML model to predict bounding timestamps for a temporal segment of that other test data that include the predetermined data attribute.
- the training of first stage ML2 using DS2, the generation of DS2, or the generation of DS2 and the training of first stage ML2 using DS2, may be performed by software code 108 , executed by processing hardware 104 of system 100 , or by user software application 250 , executed by user system processing hardware 244 .
- Flowchart 1060 further includes predicting, using second stage ML3 of trained ML model ML+ when the analysis of the dataset performed in action 1064 detects the presence of the predetermined data attribute, a probability that the predetermined data attribute is indicative of a condition or a property (action 1066 ).
- the condition may be one of a physical condition, a disease state, or a chronic medical condition, for example, as noted above.
- condition may be the operating performance of a machine, such as its output, energy consumption, heat generation, or overall efficiency, for example.
- first stage ML2 of trained ML model ML+ is configured to detect one or more temporal segments of the received dataset that include the predetermined data attribute
- predicting the probability that the predetermined data attribute is indicative of the condition or the property using ML3 in action 1066 may include predicting whether at least one of those one or more temporal segments including the predetermined data attribute is indicative of the condition or the property.
- second stage ML3 may be trained using a dataset, DS4, which has been annotated to correlate the predetermined data attribute with one of the condition or the property, to predict whether a temporal segment including the predetermined data attribute is indicative of the condition or the property.
- Trained ML model ML+ may then be validated using validation data having a known ground truth, by delivering the validation data as an input to first stage ML2 and obtaining a prediction for the condition or the property as an output from second stage ML3.
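A minimal sketch of such a validation pass is shown below, assuming the composite model is exposed as a callable that returns a probability and that the ground truth is a boolean label per sample. The function name `validate`, the 0.5 decision threshold, and the choice of accuracy, sensitivity, and specificity as reported metrics are illustrative assumptions rather than details stated in the present application.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def validate(
    model: Callable[[object], float],
    validation_data: Iterable[Tuple[object, bool]],
    threshold: float = 0.5,
) -> Dict[str, Optional[float]]:
    """Compare thresholded model outputs with known ground truth labels."""
    tp = fp = tn = fn = 0
    for sample, truth in validation_data:
        predicted = model(sample) >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> Optional[float]:
        # A metric is undefined (None) when its denominator is zero.
        return num / den if den else None

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }
```

Reporting sensitivity and specificity separately is a common choice for screening tasks, where false negatives and false positives carry different costs.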
- training of ML3, as well as validation of trained ML model ML+ may be performed by software code 108 , executed by processing hardware 104 of system 100 , or by user software application 250 , executed by user system processing hardware 244 .
- first stage ML2 and second stage ML3 of trained ML model ML+ may be trained using a federated learning process, as known in the art.
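The present application does not specify which federated learning process is used. One widely known variant is federated averaging, in which each device trains locally and only model parameters are combined centrally, weighted by the number of local training samples. The sketch below shows that aggregation step only; the function name `federated_average` and the flat list-of-floats parameter representation are assumptions for illustration.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average per-client parameter vectors, weighted by local sample counts."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]
```

Because only parameters leave each device, raw audio or video never has to be transmitted, which is consistent with the privacy-preserving deployment the application describes.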
- actions 1062 , 1064 , and 1066 may be performed in an automated process from which human participation may be omitted.
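The flow of actions 1062, 1064, and 1066 can be sketched as a two-stage function: the first stage returns the temporal segments that contain the attribute, and the second stage is consulted only when at least one segment is found. The names `detect_condition`, `first_stage`, and `second_stage` are hypothetical, and reporting the maximum per-segment probability is one assumed way of expressing "at least one segment is indicative"; the application does not prescribe an aggregation rule.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Segment = Tuple[float, float]  # (start, end) bounding timestamps in seconds


def detect_condition(
    dataset: Sequence[float],
    first_stage: Callable[[Sequence[float]], List[Segment]],
    second_stage: Callable[[Sequence[float], Segment], float],
) -> Optional[float]:
    """Run both stages on a received dataset; return None if the attribute is absent."""
    segments = first_stage(dataset)  # corresponds to action 1064
    if not segments:
        return None  # attribute not detected, so the second stage is skipped
    # Corresponds to action 1066: score each segment, keep the strongest evidence.
    return max(second_stage(dataset, seg) for seg in segments)
```

Returning `None` rather than a probability of zero keeps "attribute not present" distinct from "attribute present but not indicative".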
- the present application discloses systems and methods for performing ML model based condition and property detection.
- the present ML model based diagnostic solution can render real-time disease state predictions for asymptomatic as well as symptomatic disease carriers in a manner that does not require special equipment or specially trained personnel, and can be deployed rapidly, ubiquitously, and in a privacy-preserving way.
- the present application discloses a ML model based condition and property detection solution that can be deployed on any computer or smartphone either within its own application or embedded within another application. Consequently, the present ML model based condition and property detection solution can advantageously be deployed in an active manner, such as part of a multi-step screening process at a public or private event, or in any venue, such as an airport or cruise ship, for example, designed to host large groups.
- the present ML model based condition and property detection solution may be deployed in an ambient manner (working in the background of a mobile phone software application for example) and thereby create a system that can not only provide notice to the individual user, but may also, when the user opts in or otherwise gives informed consent, contribute to national or global real-time status/outbreak warning systems.
- this use case can be implemented in a privacy preserving way, because, as noted above, this ML model based condition and property detection solution can be deployed locally on each device, not requiring the sending of audio data or PII to an external server in order to render a disease state or other prediction. Additionally, because the present ML model based condition and property detection solution can employ a multi-step automated segmentation process, as described above, which allows for unstructured input data to be usable for both training and prediction purposes, it advantageously produces normalized datasets that are ideally suited for machine learning.
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Biomedical Technology (AREA)
- Public Health (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Epidemiology (AREA)
- Software Systems (AREA)
- Primary Health Care (AREA)
- Pathology (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Business, Economics & Management (AREA)
- Business, Economics & Management (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
Description
- The present application claims the benefit of and priority to a pending Provisional Patent Application Ser. No. 63/194,018 filed on May 27, 2021, and titled “Condition and Media Property Prediction via Machine Learning Model Based Temporal Segmentation of Media,” which is hereby incorporated fully by reference into the present application.
- Condition and property detection, such as various types of diagnostics for instance, may be performed in a variety of ways that can differ considerably depending on the condition or property serving as the subject of analysis, but often rely on manual processes. As one example, medical diagnostics often require extraction and testing of a blood or tissue sample, or expert review and interpretation of images or test results generated by sophisticated testing equipment such as computerized tomography (CT) or magnetic resonance imaging (MRI) scanners, electrocardiogram (ECG) machines, and the like. As another example, diagnostics performed on industrial equipment or other machines typically require human inspection, or at the very least review of sensor data by a trained human technician. Despite the diversity of the diagnostic techniques in use, a common element among many is the need for a human having some level of expertise to participate in the process. However, given the costliness of such human involvement, there exists a need in the art for automated solutions capable of inferentially interpreting diagnostic data.
- FIG. 1 shows a diagram of an exemplary system for performing machine learning (ML) model based condition and property detection, according to one implementation;
- FIG. 2 shows a diagram of another exemplary implementation of a system for performing ML model based condition and property detection;
- FIG. 3 illustrates a process for generating a dataset for use in training a first stage of a ML model for performing condition and property detection, according to one implementation;
- FIG. 4 illustrates a process for training a first stage of a ML model for performing condition and property detection, according to one implementation;
- FIG. 5 illustrates a process for generating a dataset for use in training a second stage of a ML model for performing condition and property detection, according to one implementation;
- FIG. 6 illustrates a process for training a second stage of a ML model for performing condition and property detection, according to one implementation;
- FIG. 7 illustrates an exemplary ML model trained to perform condition and property detection, according to one implementation;
- FIG. 8 illustrates an exemplary ML model for performing condition and property detection that includes multiple first and second stages, according to one implementation;
- FIG. 9 illustrates an exemplary ML architecture for performing condition and property detection in which multiple ML model pipelines having different first and second stages are utilized in parallel, according to one implementation; and
- FIG. 10 shows a flowchart outlining an exemplary method for performing ML model based condition and property detection, according to one implementation.
- The following description contains specific information pertaining to implementations in the present disclosure. One skilled in the art will recognize that the present disclosure may be implemented in a manner different from that specifically discussed herein. The drawings in the present application and their accompanying detailed description are directed to merely exemplary implementations. Unless noted otherwise, like or corresponding elements among the figures may be indicated by like or corresponding reference numerals. Moreover, the drawings and illustrations in the present application are generally not to scale, and are not intended to correspond to actual relative dimensions.
- The present application discloses systems and methods for performing machine learning model based condition and property detection. It is noted that although the present condition and property detection solution is described below in detail by reference to FIGS. 3, 4, 5, 6, 7, 8, and 9 illustrating an exemplary use case in which a dataset including time-based audio of a human voice is used to predict the presence of an infectious disease marker, the present novel and inventive principles may be advantageously applied to various types of data to predict a wide variety of properties of interest. For instance, the present concepts can be readily adapted for use with substantially any type of dataset or data stream that can be granulized into tagged segment types, such as time-based media in the form of audio, video, audio-video (AV) content, or time-based diagnostic test data such as electrocardiogram (ECG) data or electroencephalogram (EEG) data, to name a few examples. Specific use cases for the present novel and inventive concepts may include the operating performance of machinery such as home appliances, industrial equipment, ventilation systems, and vehicles, for instance. Examples of additional medical applications may include prediction of non-infectious disease states, prediction of chronic medical conditions such as dementia and schizophrenia, prediction of the presence of or recovery from musculoskeletal injury, immune system status, and stroke status, again to name merely a few examples.
- Specific example use cases for the present novel and inventive concepts may include using video to predict early onset Alzheimer's disease or Parkinson's disease, or to predict a leg injury in a subject, for instance, based on walking or other movement by the subject. Alternatively, or in addition, video may be used to predict that a subject has had a stroke based on upper body movements or facial movements or expressions by the subject. As yet another alternative, or additionally, AV content or audio content may be used to diagnose malfunction of an appliance, such as a washing machine, or the need to replace a timing belt or other drive component of a car. Nevertheless, it is emphasized that any particular use case described or alluded to in the present application is not to be interpreted as limiting.
- In some implementations, the systems and methods disclosed by the present application may be substantially or fully automated. As used in the present application, the terms “automation,” “automated,” and “automating” refer to systems and processes that do not require the participation of a human user, such as a human system administrator. Although, in some implementations, an engineer or medical professional may review the performance of the automated systems operating according to the automated processes described herein, that human involvement is optional. Thus the processes described in the present application may be performed under the control of hardware processing components of the disclosed systems.
- It is noted that the present media property prediction solution is machine learning model based. As defined in the present application, a “machine learning model,” or “ML model,” refers to a mathematical model for making future predictions based on patterns learned from samples of data obtained from a set of trusted known matches and known mismatches, known as training data. Various learning algorithms can be used to map correlations between input data and output data. These correlations form the mathematical model that can be used to make future predictions on new input data. Such a predictive model may include one or more logistic regression models, Bayesian models, or artificial neural networks (NNs), for example. In addition, machine learning models may be designed to progressively improve their performance of a specific task.
- A NN is a type of machine learning model in which patterns or learned representations of observed data are processed using highly connected computational layers that map the relationship between inputs and outputs. A “deep neural network” (deep NN), in the context of deep learning, may refer to a NN that utilizes multiple hidden layers between input and output layers, which may allow for learning based on features not explicitly-defined in raw data. As used in the present application, a feature labeled or described as a NN refers to a deep neural network. In various implementations, NNs may be utilized to perform image processing, audio processing, or natural-language processing, for example.
- FIG. 1 shows exemplary system 100 for performing ML model based condition and property detection, according to one implementation. As shown in FIG. 1, system 100 includes computing platform 102 having processing hardware 104 and system memory 106 implemented as a non-transitory storage medium. According to the present exemplary implementation, system memory 106 stores software code 108 which may include one or more ML models.
- As further shown in FIG. 1, system 100 is implemented within a use environment including user systems 140 a, 140 b, 140 c, and 140 d (hereinafter “user systems 140 a-140 d”) providing respective datasets 120 a, 120 b, 120 c, and 120 d (hereinafter “datasets 120 a-120 d”), which may include time-based media for example, to system 100 via communication network 130. Also shown in FIG. 1 are display 148 of user system 140 a, and network communication links 132 of communication network 130 interactively connecting system 100 with user systems 140 a-140 d.
- Although the present application refers to software code 108 as being stored in system memory 106 for conceptual clarity, more generally, system memory 106 may take the form of any computer-readable non-transitory storage medium. The expression “computer-readable non-transitory storage medium,” as used in the present application, refers to any medium excluding a carrier wave or other transitory signal that provides instructions to processing hardware 104 of computing platform 102 or to respective processing hardware of user systems 140 a-140 d. Thus, a computer-readable non-transitory storage medium may correspond to various types of media, such as volatile media and non-volatile media, for example. Volatile media may include dynamic memory, such as dynamic random access memory (dynamic RAM), while non-volatile memory may include optical, magnetic, or electrostatic storage devices. Common forms of computer-readable non-transitory storage media include, for example, optical discs such as DVDs, RAM, programmable read-only memory (PROM), erasable PROM (EPROM), and FLASH memory.
- Processing hardware 104 of system 100 may include multiple hardware processing units, such as one or more central processing units, one or more graphics processing units, one or more tensor processing units, one or more field-programmable gate arrays (FPGAs), custom hardware for machine-learning training or inferencing, and an application programming interface (API) server, for example. By way of definition, as used in the present application, the terms “central processing unit” (CPU), “graphics processing unit” (GPU), and “tensor processing unit” (TPU) have their customary meaning in the art. That is to say, a CPU includes an Arithmetic Logic Unit (ALU) for carrying out the arithmetic and logical operations of computing platform 102, as well as a Control Unit (CU) for retrieving programs, such as software code 108, from system memory 106, while a GPU may be implemented to reduce the processing overhead of the CPU by performing computationally intensive graphics or other processing tasks. A TPU is an application-specific integrated circuit (ASIC) configured specifically for artificial intelligence (AI) processes such as machine learning.
- Although FIG. 1 depicts single computing platform 102, system 100 may include one or more computing platforms corresponding to computing platform 102, such as computer servers for example, which may be co-located, or may form an interactively linked but distributed system, such as a cloud-based system, for instance. As a result, processing hardware 104 and system memory 106 may correspond to distributed processor and memory resources within system 100. In one such implementation, computing platform 102 may correspond to one or more web servers accessible over a packet-switched network such as the Internet, for example. Alternatively, computing platform 102 may correspond to one or more computer servers supporting a wide area network (WAN), a local area network (LAN), or included in another type of private or limited distribution network. Furthermore, in some implementations, system 100 may be implemented virtually, such as in a data center. For example, in some implementations, system 100 may be implemented in software, or as virtual machines.
- It is further noted that, although user systems 140 a-140 d are shown variously as smartphone computer 140 a, video camera 140 b, microphone 140 c, and machine or diagnostic device 140 d, in FIG. 1, those representations are provided merely by way of example. In other implementations, user systems 140 a-140 d may take the form of any mobile or stationary devices capable of obtaining datasets 120 a-120 d. When implemented as smart devices, for example, user systems 140 a-140 d may be any suitable mobile or stationary computing devices or systems that implement data processing capabilities sufficient to provide a user interface, support connections to communication network 130, and implement the functionality ascribed to user systems 140 a-140 d herein. That is to say, in other implementations, one or more of user systems 140 a-140 d may take the form of a desktop computer, laptop computer, tablet computer, or an implanted medical device, such as a pacemaker or pump, to name a few examples. In addition, or alternatively, in some implementations one or more of user systems 140 a-140 d may take the form of a smart wearable device, such as a smartwatch, for example. It is also noted that display 148 may take the form of a liquid crystal display (LCD), light-emitting diode (LED) display, organic light-emitting diode (OLED) display, quantum dot (QD) display, or any other suitable display screen that performs a physical transformation of signals to light.
- FIG. 2 shows another exemplary system, i.e., user system 240, for performing ML model based condition and property detection, according to another implementation. As shown in FIG. 2, user system 240 includes computing platform 242 having processing hardware 244, memory 246 implemented as a non-transitory storage medium storing user software application 250, and display 248. It is noted that, in various implementations, display 248 may be physically integrated with user system 240 or may be communicatively coupled to but physically separate from user system 240. For example, where user system 240 is implemented as a smartphone, laptop computer, or tablet computer, display 248 will typically be integrated with user system 240. By contrast, where user system 240 is implemented as a desktop computer, display 248 may take the form of a monitor separate from computing platform 242 in the form of a computer tower.
- User system 240 corresponds in general to any or all of user systems 140 a-140 d in FIG. 1, while display 248 corresponds in general to display 148. Thus, user systems 140 a-140 d and display 148 may share any of the characteristics attributed to user system 240 and display 248 by the present disclosure, and vice versa. That is to say, like display 148, display 248 may take the form of an LCD, LED display, OLED display, or QD display, for example. Moreover, although not shown in FIG. 1, one or more of user systems 140 a-140 d may include features corresponding respectively to computing platform 242, processing hardware 244, and memory 246 storing user software application 250.
- User system processing hardware 244 may include multiple hardware processing units, such as one or more CPUs, one or more GPUs, one or more TPUs, and one or more FPGAs, for example, as those features are defined above.
- With respect to user software application 250, it is noted that in some implementations, user software application 250 may be a thin client application of software code 108, in FIG. 1. In those implementations, user software application 250 may enable user system 240 to obtain any of datasets 120 a-120 d, and to provide that dataset to system 100 for processing. However, in other implementations, user software application 250 may include substantially all of the features and functionality of software code 108. That is to say, in some implementations, user system 240 may perform any or all of the operations attributed to system 100 by the present disclosure.
- According to the exemplary implementation shown in FIG. 2, user software application 250 is located in memory 246 of user system 240, subsequent to transfer of user software application 250 to user system 240 via an external flash drive or dongle, or over a packet-switched network, such as the Internet, for example. Once present on user system 240, user software application 250 may be persistently stored in memory 246 and may be executed locally on user system 240 by user system processing hardware 244.
- System 100 and user system 240 are further described below by reference to FIGS. 3, 4, 5, 6, 7, 8, and 9, which illustrate a specific example use case wherein audio properties of a dataset including time-based media in the form of recorded human vocalizations are used to predict the presence of an infectious disease, in one particular use case the presence of a respiratory illness, such as influenza or Coronavirus Disease 2019 (COVID-19).
- By way of background, existing voice-based methods for detecting COVID-19 require analysis of pre-identified utterances of interest, such as coughs, manual segmentation of existing ground truth audio data sets by human researchers in order to isolate such utterances, and test subjects who are required to perform these specific forced utterances, e.g., forced coughing. These conventional approaches hamper the creation of an effective voice-based COVID-19 detector for several reasons. For example, by limiting analysis to pre-identified utterances of interest the possible solutions obtainable are restricted to only those that can arise from preconceived hypotheses, thereby hindering serendipity. In addition, manual segmentation of data prevents end-to-end processes from being automated, which impedes the rapid iterations and convergence typically made possible by machine-learning approaches. Moreover, requiring collection of uncommon utterances limits data collection opportunities to laboratory scenarios or coached data collection initiatives, while requiring collection of symptom-based utterances restricts opportunities to collect data from asymptomatic disease carriers.
- In the exemplary use case of a novel infectious disease, such as COVID-19, for which an extensive knowledge base is under development, the prediction solution disclosed in the present application overcomes the aforementioned deficiencies in the conventional art by implementing a multi-step ML model based process that can predict disease presence by automatically segmenting unstructured vocal sample data into normalized datasets, each data element being of specific audio segment types that are determined to be optimal for prediction of COVID-19 infection, and using those datasets to train disease predictors to be more flexible and precise than conventional approaches allow.
- For example: if it is known that analysis of the audio properties of a certain type of vocal utterance, such as “mmmmm,” for instance, is most effective for prediction of COVID-19, the present prediction solution can take a dataset of unstructured voice samples that are tagged with COVID-19 status (collected from hospitals, for example), extract the “mmmmm” utterance segments into a normalized dataset that retains the corresponding COVID-19 status tags, and use this dataset to train and deploy a composite ML model that can predict COVID-19 status, based upon input of an unstructured vocal sample. It is noted that the vocal utterance identified in the present application as “mmmmm” refers to a sustained consonantal sound known formally as the “voiced bilabial nasal,” identified by the symbol (m) in the International Phonetic Alphabet. That is to say, the vocal utterance “mmmmm” is produced by sustaining the sound of the English letter “m” at the end of English word “them.”
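The extraction step described above, in which utterance segments are cut out of tagged recordings and each segment keeps the status tag of its source recording, can be sketched as follows. The `Recording` and `TaggedSegment` types, the `extract_tagged_segments` function, and the `detect` callable (standing in for a trained segment detector that returns bounding timestamps in seconds) are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple


@dataclass
class Recording:
    """One unstructured voice sample with a ground-truth status tag."""
    samples: Sequence[float]
    sample_rate: int
    status: bool


@dataclass
class TaggedSegment:
    """One normalized element: an extracted segment plus its inherited tag."""
    samples: Sequence[float]
    status: bool


def extract_tagged_segments(
    recordings: Iterable[Recording],
    detect: Callable[[Recording], List[Tuple[float, float]]],
) -> List[TaggedSegment]:
    """Cut every detected segment out of every recording, keeping the status tag."""
    dataset: List[TaggedSegment] = []
    for rec in recordings:
        for start_s, end_s in detect(rec):
            lo = round(start_s * rec.sample_rate)
            hi = round(end_s * rec.sample_rate)
            dataset.append(TaggedSegment(rec.samples[lo:hi], rec.status))
    return dataset
```

A recording in which no segment is detected simply contributes nothing, so the resulting dataset contains only elements of the chosen segment type.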
- Although this example relies on prior knowledge that a segment type based on the “mmmmm” utterance would be useful for ML model based prediction of COVID-19, the present prediction process can be performed with various different segment types in parallel, such as isolating utterances of each vocal phoneme into separate datasets, for example, in order to determine which segment types are most effective for prediction of COVID-19. This may be advantageous in cases where no a priori or existing a posteriori knowledge exists, as well as for identifying segments which can expand data collection opportunities. For instance, a segment based on a phoneme sound such as “oʊ” would be collectible from normal speech (and thus amenable to ambient, passive data collection approaches), but a segment based on a coughing sound would only be collectible from people who are symptomatic or from those who are instructed to cough via a coached data collection process.
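One way to compare candidate segment types in parallel is sketched below. It assumes a caller-supplied `evaluate` function that builds and scores a pipeline for a single segment type and returns a validation score; that function, the name `rank_segment_types`, and the use of a thread pool are illustrative assumptions, not details from the present application.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple


def rank_segment_types(
    segment_types: Iterable[str],
    evaluate: Callable[[str], float],
    max_workers: int = 4,
) -> List[Tuple[str, float]]:
    """Score each candidate segment type concurrently and sort best-first."""
    candidates = list(segment_types)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so scores line up with candidates.
        scores = list(pool.map(evaluate, candidates))
    return sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
```

The ranking can then be weighed against how easily each segment type can be collected, since a slightly less predictive phoneme that occurs in normal speech may still be preferable to a cough.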
- To illustrate the process outlined above in detail, consider the exemplary use case in which the objective is to create a voice-based predictor for COVID-19 based on input of normal speech, and that there is reason to believe that analysis of the vocal resonances in the sound “mmmmm” will be particularly useful for prediction of COVID-19. Under those circumstances the present prediction solution may proceed as follows:
- 1: Referring to
FIG. 3 , beginning with a dataset of various vocal utterances, including examples of the sound “mmmmm” as well as many other sounds, where each data. element is tagged with a label that specifies whether the element Is or Is Not “mmmmm.” This initial dataset (hereinafter “DS1” as identified inFIG. 3 ) can either be acquired or created. DS1 may then be used to train a ML model (hereinafter “ML1” as also identified inFIG. 3 ) which takes as input audio segment elements in the format of DS1 and outputs a likelihood value for the input element being “mmmmm,” based on techniques such as spectrogram analysis, Hidden Markov models, and the like. This is illustrated byFIG. 3 with “mmmmm” being the characteristic of interest (hereinafter denoted by “VC”). It is noted that, in some implementations, ML1 may be a feature ofsoftware code 108 ofsystem 100, or ofuser software application 250 ofuser system 240. DS1 may be acquired or created, and may be used to train ML1, bysoftware code 108, executed by processinghardware 104 ofsystem 100, or byuser software application 250, executed by usersystem processing hardware 244. - 2: Referring to
FIG. 4 , a new ML model (hereinafter “ML2” as identified inFIG. 4 ) may be created, which can predict the bounding timestamps of segments within an input audio file that are Yes for VC, in this case the “mmmmm” sound, with sufficiently high likelihood. ML2 may be created based upon ML1 and machine learning techniques such as sliding window convolutions, where stride length may be a tunable hyperparameter of the model. ML2 can be trained using another dataset (hereinafter “DS2” as also identified inFIG. 4 ) of unstructured audio speech files having temporal segments tagged with VC status. As with DS1, DS2 can either be acquired or created. It is noted that ML2 may also be a feature ofsoftware code 108 ofsystem 100, or ofuser software application 250 ofuser system 240. ML2 may be created, and may be trained using DS2, bysoftware code 108, executed by processinghardware 104 ofsystem 100, or byuser software application 250, executed by usersystem processing hardware 244. -
- 3: Referring to
FIG. 5, another dataset (hereinafter “DS3” as identified in FIG. 5) may be acquired, which contains elements that are unstructured vocal recordings annotated with ground truth data of the recorded person's COVID-19 status. Datasets such as DS3 already exist from hospitals, universities and the like. The elements of DS3 can be used as inputs into ML2, which identifies audio segment regions that are “mmmmm,” which can then be extracted into a new dataset (hereinafter “DS4” as also identified in FIG. 5), which includes elements that not only are “mmmmm” but each of which inherits the COVID-19 status tag from the recording that it was extracted from. This is illustrated by FIG. 5, with COVID-19 status denoted as VC-P. Like procedures 1 and 2 described above, the present procedure may be performed by software code 108, executed by processing hardware 104 of system 100, or by user software application 250, executed by user system processing hardware 244.
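- As a minimal sketch of procedure 3, and under the assumption that ML2 is available as a callable returning bounding timestamps, the extraction of DS4 from DS3 with label inheritance might look as follows. The class names `Recording` and `LabeledSegment`, the function `build_ds4`, and the stub segmenter are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Recording:
    """One DS3 element: an unstructured recording with a ground truth status tag."""
    samples: Sequence[float]
    sample_rate: int
    status: str  # e.g. "positive" or "negative" (VC-P)


@dataclass
class LabeledSegment:
    """One DS4 element: an extracted segment carrying the status of its source recording."""
    samples: List[float]
    status: str


def build_ds4(
    ds3: Sequence[Recording],
    segmenter: Callable[[Recording], List[Tuple[float, float]]],  # stand-in for ML2
) -> List[LabeledSegment]:
    ds4: List[LabeledSegment] = []
    for rec in ds3:
        for t0, t1 in segmenter(rec):
            lo, hi = int(t0 * rec.sample_rate), int(t1 * rec.sample_rate)
            clip = list(rec.samples[lo:hi])
            if clip:
                # The segment inherits the status tag of the recording it came from.
                ds4.append(LabeledSegment(clip, rec.status))
    return ds4


def stub_segmenter(rec: Recording) -> List[Tuple[float, float]]:
    """Hypothetical segmenter that always reports the first half second."""
    return [(0.0, 0.5)]


ds3 = [
    Recording([0.1, 0.2, 0.3, 0.4], sample_rate=4, status="positive"),
    Recording([0.5, 0.6, 0.7, 0.8], sample_rate=4, status="negative"),
]
ds4 = build_ds4(ds3, stub_segmenter)
print([(seg.samples, seg.status) for seg in ds4])
```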
- 4: Referring to
FIG. 6, dataset DS4 may be used to train another ML model (hereinafter “ML3” as identified in FIG. 6) that outputs a prediction for the likelihood of COVID-19 given an input in the format of the elements of DS4, “mmmmm” in this exemplary use case. ML3 can be created and trained without prior knowledge. However, efficacy may be improved by creating ML3 based on existing knowledge of acoustic biomarkers likely to be relevant for unique and specific prediction of COVID-19, even amongst asymptomatic carriers, and audio processing techniques such as Fast Fourier Transforms and Mel-frequency Cepstrum analysis. Examples of such acoustic biomarkers may be manifestations of one or more of COVID-19 related neuromuscular vocal cord impairment, respiratory degradation, or changes in intonation, to name a few. It is noted that ML3 may also be a feature of software code 108 of system 100, or of user software application 250 of user system 240. Moreover, procedure 4 described above may be performed by software code 108, executed by processing hardware 104 of system 100, or by user software application 250, executed by user system processing hardware 244.
- 5: Referring to
FIG. 7, the final ML model based COVID-19 predictor (hereinafter “ML+” as identified in FIG. 7) may be created based on a combination of the ML2 and ML3 models. It is noted that, in some implementations, ML+ may take the form of an integrated ML model having ML2 as a first, i.e., input stage, and having ML3 as a second, i.e., output stage. ML+ can take as input unstructured audio recording data (such as a person speaking normally into a mobile phone) and output a prediction of COVID-19 status, such as a likelihood score. ML+, which could be deployed via user software application 250, for example, is given input from which it will first extract one or more region(s) that are predicted by ML2 to be of the “mmmmm” segment type. That or those region(s) may then be provided as input into ML3. The resulting ML3 COVID-19 status predictions can then be utilized to determine the output of ML+ with respect to COVID-19 status, such as “Positive” or “Negative,” or as a likelihood score. As a simple example, the likelihood predictions could be averaged to determine the ML+ output prediction, but a more complicated process could be utilized as well.
- The performance of ML+ can be improved over time by direct training via datasets in the form of DS3, or through independent improvements in ML1, ML2, or ML3. In addition, any ground truth data from additional COVID-19 test results, such as polymerase chain reaction (PCR) tests for example, can be used to augment dataset DS3, and consequently refine ML3 and parameters, such as acceptable prediction likelihood thresholds, averaging processes used for multiple ML3 predictions in ML+, and the like. ML+ may be a feature of
software code 108 of system 100, or of user software application 250 of user system 240. Moreover, procedure 5 described above may be performed by software code 108, executed by processing hardware 104 of system 100, or by user software application 250, executed by user system processing hardware 244.
- As noted above, in some implementations,
procedures 1 through 5 may be performed on system 100, while in other implementations procedures 1 through 5 may be performed on user system 240. However, in other implementations, ML+ may be deployed to user software application 250 on user system 240 after its creation on system 100. In still other implementations, ML+ may be deployed to system 100 after its creation elsewhere, and software code 108, when executed by processing hardware 104, may utilize that pre-existing ML+ to predict the respiratory infection (e.g., COVID-19) status using unstructured voice samples.
- In the above-described case, prior knowledge that the vocal resonances in the sound “mmmmm” would be particularly useful for prediction of COVID-19 was presumed. In the absence of such knowledge, or as a supplement to it, it is noted that
procedures 1 through 4 can be used to generate multiple different instances of ML3 in parallel. In such an implementation, procedures 1 through 4 could be performed for each vocal phoneme instead of just the sound “mmmmm,” for example, and the most efficacious vocal segment types for prediction of COVID-19 could be determined. This could be accomplished by creating DS1 datasets tagged with Yes/No for each of the different segment types, or all at once in a single non-binary segmenter via the procedures depicted in FIG. 3 and FIG. 4. This would lead to extracting datasets relating to each audio segment type in procedure 3, and so forth.
- An example of generating multiple different instances of ML3 is shown in
FIG. 8, in which each of machine learning pipelines 800a, 800b, and 800c produces different instantiations of ML3 independently of one another and in parallel. It is noted that although FIG. 8 shows three parallel machine learning pipelines 800a, 800b, and 800c, that representation is merely exemplary. In other use cases in which multiple instances of ML3 are generated, as few as two parallel machine learning pipelines, or more than three machine learning pipelines, may be employed.
- As an example, in some implementations, ML+ can be enhanced as the result of the generation of multiple instances of ML3 using different instances of ML2 as shown in
FIG. 8. For example, a version of ML+ finalized for deployment could make use of those parallel ML2 and ML3 portions of machine learning pipelines 800a, 800b, and 800c in any desired combination, as further shown in FIG. 9.
- Because the process depicted in
FIG. 8 and FIG. 9 does not require prior knowledge or presumption, it can be useful for the prediction of diseases and physical or mental conditions that may have different, novel, and unknown vocal or non-vocal signatures, as well as the determination of the vocal or other data segment types that are optimal for the prediction or diagnosis of such diseases and conditions. These may include non-infectious diseases and chronic conditions such as dementia, schizophrenia, and Parkinson's disease, for example. When informed consent is obtained, non-disease characteristics can be predicted through this process as well, such as biological sex, age, and stress level. Because non-disease factors such as increased stress level are correlated with diminished immune system function, the prediction process described above can result not only in the creation of effective predictors of disease, but also predictors of disease susceptibility.
- Moreover, for the specific use case of diagnosing COVID-19 and other infectious diseases, because the present ML model based diagnostic solution is configured to detect human manifestations of the disease state, it is agnostic, and therefore remains effective as a diagnostic tool, even when infectious vectors mutate. Thus, in contrast to rapid antigen tests for COVID-19, which are to some extent variant specific, and tend to fail when the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causing COVID-19 mutates, the present ML model based diagnostic solution can be expected to be, and remain, robustly reliable against viral sub-variants.
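- As one non-limiting illustration of how the most efficacious segment type might be determined from the parallel pipelines of FIG. 8, each pipeline's ML3 instance could be scored against validation data having known ground truth, and the segment types ranked by that score. The use of simple accuracy at a fixed threshold, the function names, and the sample values below are assumptions made for this sketch; any other suitable metric could be substituted.

```python
from typing import Dict, List, Sequence, Tuple

# Each entry pairs an ML3 likelihood prediction with the known ground truth.
Results = Sequence[Tuple[float, bool]]


def accuracy(results: Results, threshold: float = 0.5) -> float:
    """Fraction of predictions that agree with ground truth at the given threshold."""
    correct = sum((likelihood >= threshold) == truth for likelihood, truth in results)
    return correct / len(results)


def rank_segment_types(validation: Dict[str, Results]) -> List[Tuple[str, float]]:
    """Rank segment types (one per parallel pipeline) from most to least accurate."""
    scored = [(name, accuracy(results)) for name, results in validation.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical validation outputs from two parallel pipelines.
validation = {
    "aaaaa": [(0.6, True), (0.7, False), (0.3, True), (0.2, False)],
    "mmmmm": [(0.9, True), (0.2, False), (0.7, True), (0.4, False)],
}
ranking = rank_segment_types(validation)
print(ranking)
```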
- As an additional advantage with respect to acquisition and management of personally identifiable information (PII) or other sensitive personal information, in implementations in which ML+ is deployed to
user software application 250, any PII acquired by user software application 250 may be sequestered on user system 240 and be unavailable to system 100 or other external agents.
- The functionality of
system 100, user system(s) 140a-140d/240, software code 108, and user software application 250 shown variously in FIGS. 1 and 2 will be further described by reference to FIG. 10. FIG. 10 shows flowchart 1060 presenting an exemplary method for performing ML model based condition and property detection, according to one implementation. With respect to the method outlined in FIG. 10, it is noted that certain details and features have been left out of flowchart 1060 in order not to obscure the discussion of the inventive features in the present application.
- Referring to
FIG. 10 in combination with FIGS. 1 and 2, flowchart 1060 begins with receiving one of datasets 120a-120d (action 1062). As noted above, datasets 120a-120d may include a variety of different data types, including e-based media in the form of audio, video, AV content, sensor data, and test result data, to name a few examples. In some implementations, as shown in FIG. 1, the dataset received in action 1062 may be generated or obtained by one of respective user systems 140a/240, 140b/240, 140c/240, or 140d/240, and may be received by system 100 from the one of respective user systems 140a/240, 140b/240, 140c/240, or 140d/240 via communication network 130 and network communication links 132. In those implementations, the dataset may be received in action 1062 by software code 108, executed by processing hardware 104 of computing platform 102.
- Alternatively, and as noted above, in some implementations, the diagnostic processing of the one of datasets 120a-120d may be performed locally on one of
respective user systems 140a/240, 140b/240, 140c/240, or 140d/240. In these implementations, the dataset received in action 1062 may be received by user software application 250, executed by user system processing hardware 244.
-
Flowchart 1060 further includes performing an analysis of the dataset received in action 1062, using a first stage, i.e., ML2, of trained ML model ML+, to detect the presence of a predetermined data attribute (action 1064). For example, in the case of the COVID-19 diagnostic procedure described above, processing hardware 104 may execute software code 108, or user system processing hardware 244 may execute user software application 250, to determine whether the dataset received in action 1062 includes the sound “mmmmm” and the bounding timestamps of regions that include that characteristic of interest.
- Thus, in some implementations, the predetermined data attribute having its presence analyzed in
action 1064 may be an audio attribute of the dataset received in action 1062. In implementations in which the predetermined data attribute is an audio attribute, that audio attribute may be derived from one or more of speech, a non-verbal utterance, or a pulmonary expulsion, such as a cough for example. Alternatively, or in addition, the predetermined data attribute having its presence analyzed in action 1064 may be a visual attribute in the form of a human tremor or tic.
- Thus, in some implementations,
processing hardware 104 may execute software code 108, or user system processing hardware 244 may execute user software application 250, to utilize a visual analyzer included as a feature of software code 108 or user software application 250, an audio analyzer included as a feature of software code 108 or user software application 250, or such a visual analyzer and audio analyzer, to perform the analysis of the received dataset in action 1064.
- In various implementations, a visual analyzer included as a feature of
software code 108 or user software application 250 may be configured to apply computer vision or other AI techniques to the dataset received in action 1062, or may be implemented as a NN or other type of ML model. Such a visual analyzer may be configured or trained to recognize physical movements and their frequency, for example.
- An audio analyzer included as a feature of
software code 108 or user software application 250 may also be implemented as a NN or other ML model. As noted above, in some implementations, a visual analyzer and an audio analyzer may be used in combination to analyze the received dataset. It is noted that the received dataset will typically include multiple video frames, multiple audio frames, or multiple video frames and multiple audio frames. In some of those use cases, processing hardware 104 may execute software code 108, or user system processing hardware 244 may execute user software application 250, to perform the visual analysis of the received dataset, the audio analysis of the received dataset, or both the visual analysis and the audio analysis, on a frame-by-frame basis. That is to say, in various implementations, the analysis of the received dataset in action 1064 may be performed by software code 108, executed by processing hardware 104 of system 102, or by user software application 250, executed by user system processing hardware 244.
- In some implementations, performing the analysis of the dataset in
action 1064 may include detecting, using first stage ML2 of trained ML model ML+, one or more temporal segments of the received dataset that include the predetermined data attribute. In some implementations in which first stage ML2 of trained ML model ML+ is configured to detect one or more temporal segments of the received dataset that include the predetermined data attribute, first stage ML2 may be trained using a dataset, DS2, that has been annotated to identify the predetermined data attribute, to detect temporal segments of a test dataset that include the predetermined data attribute. As noted above, DS2 may be created or obtained by system 100 or user system(s) 140a-140d/240. In implementations in which DS2 is created by system 100 or user system(s) 140a-140d/240, DS2 may be generated by training another ML model to detect the presence of the predetermined data attribute in other test data, and using an output of that ML model to train yet another ML model to predict bounding timestamps for a temporal segment of that other test data that includes the predetermined data attribute.
- In some implementations, the training of first stage ML2 using DS2, the generation of DS2, or the generation of DS2 and the training of first stage ML2 using DS2, may be performed by
software code 108, executed by processing hardware 104 of system 102, or by user software application 250, executed by user system processing hardware 244.
-
Flowchart 1060 further includes predicting, using second stage ML3 of trained ML model ML+ when the analysis of the dataset performed in action 1064 detects the presence of the predetermined data attribute, a probability that the predetermined data attribute is indicative of a condition or a property (action 1066). In some implementations in which second stage ML3 of trained ML model ML+ is used to predict the probability that the predetermined data attribute is indicative of a condition, that condition may be one of a physical condition, a disease state, or a chronic medical condition, for example, as noted above. Alternatively, and as further noted above, in other implementations in which second stage ML3 of trained ML model ML+ is used to predict the probability that the predetermined data attribute is indicative of a condition, that condition may be the operating performance of a machine, such as its output, energy consumption, heat generation, or overall efficiency, for example.
- In implementations in which first stage ML2 of trained ML model ML+ is configured to detect one or more temporal segments of the received dataset that include the predetermined data attribute, predicting the probability that the predetermined data attribute is indicative of the condition or the property using ML3 in
action 1066 may include predicting whether at least one of those one or more temporal segments including the predetermined data attribute is indicative of the condition or the property. In some of those implementations, second stage ML3 may be trained using a dataset, DS4, which has been annotated to correlate the predetermined data attribute with one of the condition or the property, to predict whether a temporal segment including the predetermined data attribute is indicative of the condition or the property.
- Trained ML model ML+ may then be validated using validation data having a known ground truth, by delivering the validation data as an input to first stage ML2 and obtaining a prediction for the condition or the property as an output from second stage ML3. In some implementations, training of ML3, as well as validation of trained ML model ML+, may be performed by
software code 108, executed by processing hardware 104 of system 102, or by user software application 250, executed by user system processing hardware 244. It is noted that in various implementations, one or both of first stage ML2 and second stage ML3 of trained ML model ML+ may be trained using a federated learning process, as known in the art. It is further noted that with respect to the method outlined by flowchart 1060, in some implementations actions 1062, 1064, and 1066 may be performed in an automated process from which human participation may be omitted.
- Thus, the present application discloses systems and methods for performing ML model based condition and property detection. In the exemplary use case of infectious disease prediction, the present ML model based diagnostic solution can render real-time disease state predictions for asymptomatic as well as symptomatic disease carriers in a manner that does not require special equipment or specially trained personnel, and can be deployed rapidly, ubiquitously, and in a privacy-preserving way.
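- Purely as a hedged sketch of the method outlined by flowchart 1060, the two-stage operation of ML+ might be expressed as follows, with the first stage detecting segments (action 1064) and the second stage predicting a likelihood for each detected segment (action 1066). The function `ml_plus`, the callables passed to it, the decision threshold, and the stub stages are hypothetical, and averaging is used only because it is the simple example of combining multiple ML3 predictions mentioned above.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional, Sequence, Union

Segment = Sequence[float]


def ml_plus(
    dataset: Sequence[float],                                   # received in action 1062
    detect_segments: Callable[[Sequence[float]], List[Segment]],  # first stage, role of ML2
    predict_segment: Callable[[Segment], float],                # second stage, role of ML3
    decision_threshold: float = 0.5,                            # assumed, tunable parameter
) -> Optional[Dict[str, Union[float, str]]]:
    """Return a likelihood and label, or None when the attribute is not detected."""
    segments = detect_segments(dataset)          # action 1064
    if not segments:
        return None                              # nothing for the second stage to assess
    likelihood = mean(predict_segment(s) for s in segments)  # action 1066, averaged
    label = "Positive" if likelihood >= decision_threshold else "Negative"
    return {"likelihood": likelihood, "label": label}


# Hypothetical stub stages used only to exercise the sketch.
def stub_detect(data: Sequence[float]) -> List[Segment]:
    return [data[:2], data[2:4]] if len(data) >= 4 else []


def stub_predict(segment: Segment) -> float:
    return segment[0]


result = ml_plus([0.25, 0.0, 0.75, 0.0], stub_detect, stub_predict)
print(result)
```

- Validation against data having a known ground truth could reuse the same function by comparing the returned label for each validation input with its known status.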
- Moreover, the present application discloses a ML model based condition and property detection solution that can be deployed on any computer or smartphone either within its own application or embedded within another application. Consequently, the present ML model based condition and property detection solution can advantageously be deployed in an active manner, such as part of a multi-step screening process at a public or private event, or in any venue, such as an airport or cruise ship, for example, designed to host large groups. Alternatively the present ML model based condition and property detection solution may be deployed in an ambient manner (working in the background of a mobile phone software application for example) and thereby create a system that can not only provide notice to the individual user, but may also, when the user opts in or otherwise gives informed consent, contribute to national or global real-time status/outbreak warning systems. It is emphasized that even this use case can be implemented in a privacy preserving way, because, as noted above, this ML model based condition and property detection solution can be deployed locally on each device, not requiring the sending of audio data or PII to an external server in order to render a disease state or other prediction. Additionally, because the present ML model based condition and property detection solution can employ a multi-step automated segmentation process, as described above, which allows for unstructured input data to be usable for both training and prediction purposes, it advantageously produces normalized datasets that are ideally suited for machine learning.
- From the above description it is manifest that various techniques can be used for implementing the concepts described in the present application without departing from the scope of those concepts. Moreover, while the concepts have been described with specific reference to certain implementations, a person of ordinary skill in the art would recognize that changes can be made in form and detail without departing from the scope of those concepts. As such, the described implementations are to be considered in all respects as illustrative and not restrictive. It should also be understood that the present application is not limited to the particular implementations described herein, but many rearrangements, modifications, and substitutions are possible without departing from the scope of the present disclosure.
Claims (20)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/752,741 US20220384040A1 (en) | 2021-05-27 | 2022-05-24 | Machine Learning Model Based Condition and Property Detection |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202163194018P | 2021-05-27 | 2021-05-27 | |
| US17/752,741 US20220384040A1 (en) | 2021-05-27 | 2022-05-24 | Machine Learning Model Based Condition and Property Detection |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20220384040A1 true US20220384040A1 (en) | 2022-12-01 |
Family
ID=84193282
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/752,741 Pending US20220384040A1 (en) | 2021-05-27 | 2022-05-24 | Machine Learning Model Based Condition and Property Detection |
Country Status (1)
| Country | Link |
|---|---|
| US (1) | US20220384040A1 (en) |
Citations (18)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20150073306A1 (en) * | 2012-03-29 | 2015-03-12 | The University Of Queensland | Method and apparatus for processing patient sounds |
| US20150351663A1 (en) * | 2013-01-24 | 2015-12-10 | B.G. Negev Technologies And Applications Ltd. | Determining apnea-hypopnia index ahi from speech |
| US20170304732A1 (en) * | 2014-11-10 | 2017-10-26 | Lego A/S | System and method for toy recognition |
| US20180260612A1 (en) * | 2016-08-08 | 2018-09-13 | Indaflow LLC | Object Recognition for Bottom of Basket Detection Using Neural Network |
| US20190042911A1 (en) * | 2017-12-22 | 2019-02-07 | Intel Corporation | System and method for learning the structure of deep convolutional neural networks |
| US20190088367A1 (en) * | 2012-06-18 | 2019-03-21 | Breathresearch Inc. | Method and apparatus for training and evaluating artificial neural networks used to determine lung pathology |
| US20190385711A1 (en) * | 2018-06-19 | 2019-12-19 | Ellipsis Health, Inc. | Systems and methods for mental health assessment |
| US20200293887A1 (en) * | 2019-03-11 | 2020-09-17 | doc.ai, Inc. | System and Method with Federated Learning Model for Medical Research Applications |
| US20210076977A1 (en) * | 2017-12-21 | 2021-03-18 | The University Of Queensland | A method for analysis of cough sounds using disease signatures to diagnose respiratory diseases |
| US20210125732A1 (en) * | 2019-10-25 | 2021-04-29 | XY.Health Inc. | System and method with federated learning model for geotemporal data associated medical prediction applications |
| US20210361227A1 (en) * | 2018-04-05 | 2021-11-25 | Google Llc | System and Method for Generating Diagnostic Health Information Using Deep Learning and Sound Understanding |
| US20220061694A1 (en) * | 2020-09-02 | 2022-03-03 | Hill-Rom Services Pte. Ltd. | Lung health sensing through voice analysis |
| US20220110542A1 (en) * | 2020-10-08 | 2022-04-14 | International Business Machines Corporation | Multi-modal lung capacity measurement for respiratory illness prediction |
| US20220180890A1 (en) * | 2020-12-07 | 2022-06-09 | Transportation Ip Holdings, Llc | Systems and methods for diagnosing equipment |
| US20230015028A1 (en) * | 2019-12-16 | 2023-01-19 | ResApp Health Limited | Diagnosing respiratory maladies from subject sounds |
| US20230045078A1 (en) * | 2020-01-22 | 2023-02-09 | Aural Analytics, Inc. | Systems and methods for audio processing and analysis of multi-dimensional statistical signature using machine learing algorithms |
| US20230190140A1 (en) * | 2020-05-19 | 2023-06-22 | Resmed Sensor Technologies Limited | Methods and apparatus for detection and monitoring of health parameters |
| US20230329630A1 (en) * | 2020-08-28 | 2023-10-19 | Pfizer Inc. | Computerized decision support tool and medical device for respiratory condition monitoring and care |
-
2022
- 2022-05-24 US US17/752,741 patent/US20220384040A1/en active Pending
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: DISNEY ENTERPRISES, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: COMITO, KEITH; HALE, GREGORY BROOKS; KUMAR, KOMATH NAVEEN; SIGNING DATES FROM 20220421 TO 20220424; REEL/FRAME: 060006/0940 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: ADVISORY ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: ADVISORY ACTION COUNTED, NOT YET MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: ADVISORY ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION COUNTED, NOT YET MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION COUNTED, NOT YET MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |