CN109558832A - Human body attitude detection method, device, equipment and storage medium - Google Patents
Human body attitude detection method, device, equipment and storage medium
- Publication number
- CN109558832A (application number CN201811427578.XA)
- Authority
- CN
- China
- Prior art keywords
- human body
- body attitude
- image data
- frame image
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/251—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a human body attitude (pose) detection method, device, equipment and storage medium. The method comprises: acquiring multiple frames of image data; inputting the current frame image data into a pre-trained human body attitude detection model, with the human body attitude confidence map of the previous frame image data as a reference, and outputting multiple human body attitude reference maps, where the human body attitude detection model is generated by training a convolutional neural network applicable to an embedded platform; identifying human body attitude key points in the human body attitude reference maps; generating a human body attitude confidence map according to the credibility of the human body attitude key points; judging whether the current frame image data is the last frame image data; if not, inputting the human body attitude confidence map into the human body attitude detection model, so that it participates in generating the human body attitude confidence map of the next frame image data; if so, ending the operation of generating human body attitude confidence maps for the multiple frames of image data. Embodiments of the invention enable human body attitude detection on an embedded platform.
Description
Technical field
Embodiments of the present invention relate to human body attitude detection technology, and in particular to a human body attitude detection method, device, equipment and storage medium.
Background technique
Human body attitude detection is the most challenging research direction in the field of computer vision and is widely used in human-computer interaction, intelligent surveillance, virtual reality, human behavior analysis and other fields. However, the local image features around each key point that makes up a human body attitude undergo multi-scale affine transformations, and images are easily affected by factors such as the target person's clothing, camera angle, distance, illumination changes and partial occlusion, so progress in human body attitude detection has been slow.
In the prior art, human body attitude detection is generally performed with convolutional neural networks. To reach high recognition precision, it is usually necessary to collect a large number of training samples and to train the human body attitude detection model with supervised learning for a long time.
In the course of making the present invention, the inventors found at least the following problem in the prior art: an embedded platform has no GPU (Graphics Processing Unit) to optimize the convolution operations, which account for the largest share of the computation in a convolutional neural network. As a result, most human body attitude detection methods based on convolutional neural networks cannot be applied to embedded platforms.
Summary of the invention
Embodiments of the present invention provide a human body attitude detection method, device, equipment and storage medium, so as to enable human body attitude detection on an embedded platform.
In a first aspect, an embodiment of the present invention provides a human body attitude detection method, the method comprising:
Acquiring multiple frames of image data;
Inputting the current frame image data into a pre-trained human body attitude detection model, with the human body attitude confidence map of the previous frame image data as a reference, and outputting multiple human body attitude reference maps, where the human body attitude detection model is generated by training a convolutional neural network applicable to an embedded platform;
Identifying human body attitude key points in the human body attitude reference maps;
Generating a human body attitude confidence map according to the credibility of the human body attitude key points;
Judging whether the current frame image data is the last frame image data;
If not, inputting the human body attitude confidence map into the human body attitude detection model, so that it participates in generating the human body attitude confidence map of the next frame image data;
If so, ending the operation of generating human body attitude confidence maps for the multiple frames of image data.
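The control flow of the steps above can be sketched as follows. This is only an illustrative outline under stated assumptions: `process_video`, `model`, `to_confidence` and `preset` are hypothetical names that do not appear in the patent, the model is treated as an opaque callable taking the current frame and a prior confidence map, and the preset image is assumed to stand in for the missing prior on the first frame.

```python
import numpy as np


def process_video(frames, model, to_confidence, preset):
    """Run the frame loop: each frame's confidence map feeds the next frame.

    frames        -- sequence of frames in chronological order
    model         -- hypothetical callable (frame, prior) -> reference maps
    to_confidence -- hypothetical callable (reference maps) -> confidence map
    preset        -- assumed stand-in prior for the first frame
    """
    prior = preset
    confidence_maps = []
    for index, frame in enumerate(frames):
        reference_maps = model(frame, prior)
        confidence = to_confidence(reference_maps)
        confidence_maps.append(confidence)
        # Last frame: stop, there is no next frame to feed.
        if index == len(frames) - 1:
            break
        # Otherwise the confidence map becomes the prior for the next frame.
        prior = confidence
    return confidence_maps
```

With a toy model that simply adds the prior to the frame, the feedback is visible as an accumulation across frames, which is a quick way to check that the previous output really reaches the next call.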
Further, inputting the current frame image data into the pre-trained human body attitude detection model, with the human body attitude confidence map of the previous frame image data as a reference, and outputting multiple human body attitude reference maps, comprises:
Judging whether the human body attitude confidence map of the previous frame image data is credible;
If so, inputting the current frame image data and the human body attitude confidence map of the previous frame image data into the pre-trained human body attitude detection model, and outputting multiple human body attitude reference maps;
If not, inputting the current frame image data and preset image data into the pre-trained human body attitude detection model, and outputting multiple human body attitude reference maps.
Further, identifying human body attitude key points in the human body attitude reference maps comprises:
Determining the coordinate position of the maximum probability value in a human body attitude reference map, and taking that coordinate position as a human body attitude key point.
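Taking the coordinate of the maximum probability value can be illustrated with a short NumPy sketch. It assumes that a reference map is a 2-D array of per-pixel probabilities for one key point; the function name `locate_keypoint` and the (x, y, probability) return convention are assumptions made for this example.

```python
import numpy as np


def locate_keypoint(reference_map):
    """Return (x, y, probability) of the maximum value in a 2-D reference map."""
    # argmax works on the flattened array; unravel_index maps it back to (row, col).
    row, col = np.unravel_index(np.argmax(reference_map), reference_map.shape)
    return int(col), int(row), float(reference_map[row, col])
```

If several pixels share the maximum value, `np.argmax` returns the first one in row-major order, so ties are resolved toward the top-left of the map.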
Further, generating a human body attitude confidence map according to the credibility of the human body attitude key point comprises:
Judging whether the human body attitude key point is credible;
If so, generating a mask map centered on the human body attitude key point, and using it as the human body attitude confidence map;
If not, using the preset image data as the human body attitude confidence map.
Further, judging whether the human body attitude key point is credible comprises:
Judging whether the probability value corresponding to the human body attitude key point is greater than a preset threshold;
If so, determining that the human body attitude key point is credible;
If not, determining that the human body attitude key point is not credible.
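A minimal sketch of the threshold test and mask generation described above follows. The text only states that a mask map is generated centered on the key point; the binary disk shape, the `radius` parameter, the default threshold of 0.5 and the all-zero preset image are assumptions chosen for illustration, and `make_confidence_map` is a hypothetical name.

```python
import numpy as np


def make_confidence_map(shape, keypoint_xy, probability,
                        threshold=0.5, radius=1, preset=None):
    """Build a confidence map for one key point.

    Credible (probability > threshold): binary disk mask centered on the key point.
    Not credible: a copy of the preset image (assumed all zeros if not given).
    """
    if preset is None:
        preset = np.zeros(shape, dtype=np.float32)
    if probability <= threshold:
        return preset.copy()
    x, y = keypoint_xy
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    # Pixels within `radius` of the key point are set to 1, all others to 0.
    mask = (rows - y) ** 2 + (cols - x) ** 2 <= radius ** 2
    return mask.astype(np.float32)
```

A soft mask such as a Gaussian bump would fit the same description; the hard disk is used here only because it is the simplest shape to verify by hand.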
Further, the human body attitude detection model comprises a main path, a first branch and a second branch; the main path comprises a residual module and an up-sampling module, the first branch comprises a refinement network module, and the second branch comprises a feedback module;
Inputting the current frame image data into the pre-trained human body attitude detection model, with the human body attitude confidence map of the previous frame image data as a reference, and outputting multiple human body attitude reference maps, comprises:
Inputting the current frame image data into the residual module for processing, and inputting the human body attitude confidence map of the previous frame image data, used as a reference, into the feedback module for processing, to obtain a first convolution result;
Inputting the first convolution result output by the residual module into the up-sampling module and the refinement network module respectively for processing, to obtain a second convolution result and a third convolution result respectively;
Adding the second convolution result to the third convolution result, and outputting multiple human body attitude reference maps.
Further, the residual module comprises a first residual unit, a second residual unit and a third residual unit;
Inputting the current frame image data into the residual module for processing, and inputting the human body attitude confidence map of the previous frame image data, used as a reference, into the feedback module for processing, to obtain the first convolution result, comprises:
Inputting the current frame image data into the first residual unit for processing, to obtain a first intermediate result;
Inputting the first intermediate result into the second residual unit for processing, and adding the result to the result obtained by inputting the human body attitude confidence map of the previous frame image data into the feedback module for processing, to obtain a second intermediate result;
Inputting the second intermediate result into the third residual unit for processing, to obtain a third intermediate result, which is used as the first convolution result;
Wherein the numbers of channels of the first intermediate result, the second intermediate result and the third intermediate result increase in turn.
Further, the human body attitude detection model also comprises a third branch;
Inputting the first convolution result output by the residual module into the up-sampling module and the refinement network module respectively for processing, to obtain the second convolution result and the third convolution result respectively, comprises:
Inputting the first intermediate result into the third branch for processing, to obtain a fourth intermediate result;
Inputting the second intermediate result into the third branch for processing, to obtain a fifth intermediate result;
Inputting the third intermediate result and the fifth intermediate result into the up-sampling module for processing, to obtain a sixth intermediate result;
Inputting the fourth intermediate result and the sixth intermediate result into the up-sampling module for processing, to obtain a seventh intermediate result, which is used as the second convolution result;
Inputting the first convolution result output by the residual module into the refinement network module for processing, to obtain the third convolution result;
Wherein the numbers of channels of the sixth intermediate result and the seventh intermediate result decrease in turn.
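The wiring of the seven intermediate results can be traced with the sketch below. It shows only the order in which results are combined, not real layers: every entry in `layers` is a hypothetical callable supplied by the caller, and using a single callable for the third branch and for the up-sampling module at both stages is a simplifying assumption (a real network would probably use separate weights per stage). Shapes and channel counts are not modeled.

```python
def forward_wiring(frame, prior_confidence, layers):
    """Trace the data flow; `layers` maps hypothetical names to callables."""
    # Main path, residual module: three residual units in sequence.
    r1 = layers["residual_1"](frame)                      # first intermediate result
    # Second branch: the feedback module output is added after residual unit 2.
    r2 = layers["residual_2"](r1) + layers["feedback"](prior_confidence)
    r3 = layers["residual_3"](r2)                         # first convolution result
    # Third branch processes the first and second intermediate results.
    r4 = layers["branch_3"](r1)                           # fourth intermediate result
    r5 = layers["branch_3"](r2)                           # fifth intermediate result
    # Up-sampling module merges deeper results with the branch outputs.
    r6 = layers["upsample"](r3, r5)                       # sixth intermediate result
    r7 = layers["upsample"](r6, r4)                       # second convolution result
    # First branch: refinement network on the first convolution result.
    third = layers["refine"](r3)                          # third convolution result
    return {"first": r3, "second": r7, "third": third,
            "reference_maps": r7 + third}
```

Passing simple arithmetic functions as the callables makes it easy to confirm by hand that each intermediate result reaches the places the text says it should.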
Further, inputting the current frame image data into the pre-trained human body attitude detection model, with the human body attitude confidence map of the previous frame image data as a reference, and outputting multiple human body attitude reference maps, further comprises:
Adding the first convolution result to the second convolution result, to obtain a second target result;
Adding the multiple human body attitude reference maps to the second target result, and outputting new multiple human body attitude reference maps;
Wherein the second target result is used, when training the human body attitude detection model, to improve the precision of the human body attitude detection model.
In a second aspect, an embodiment of the present invention further provides a human body attitude detection device, the device comprising:
An image data acquisition module, configured to acquire multiple frames of image data;
A human body attitude reference map output module, configured to input the current frame image data into a pre-trained human body attitude detection model, with the human body attitude confidence map of the previous frame image data as a reference, and output multiple human body attitude reference maps, where the human body attitude detection model is generated by training a convolutional neural network applicable to an embedded platform;
A human body attitude key point identification module, configured to identify human body attitude key points in the human body attitude reference maps;
A human body attitude confidence map generation module, configured to generate a human body attitude confidence map according to the credibility of the human body attitude key points;
A judgment module, configured to judge whether the current frame image data is the last frame image data;
A first execution module, configured to, if not, input the human body attitude confidence map into the human body attitude detection model, so that it participates in generating the human body attitude confidence map of the next frame image data;
A second execution module, configured to, if so, end the operation of generating human body attitude confidence maps for the multiple frames of image data.
Further, the human body attitude reference map output module comprises:
A confidence map credibility judging unit, configured to judge whether the human body attitude confidence map of the previous frame image data is credible;
A first human body attitude reference map output unit, configured to, if so, input the current frame image data and the human body attitude confidence map of the previous frame image data into the pre-trained human body attitude detection model, and output multiple human body attitude reference maps;
A second human body attitude reference map output unit, configured to, if not, input the current frame image data and preset image data into the pre-trained human body attitude detection model, and output multiple human body attitude reference maps.
Further, the human body attitude key point identification module comprises:
A human body attitude key point recognition unit, configured to determine the coordinate position of the maximum probability value in a human body attitude reference map, and take that coordinate position as a human body attitude key point.
Further, the human body attitude confidence map generation module comprises:
A human body attitude key point credibility judging unit, configured to judge whether the human body attitude key point is credible;
A first human body attitude confidence map generation unit, configured to, if so, generate a mask map centered on the human body attitude key point, and use it as the human body attitude confidence map;
A second human body attitude confidence map generation unit, configured to, if not, use the preset image data as the human body attitude confidence map.
Further, the human body attitude key point credibility judging unit is specifically configured to:
Judge whether the probability value corresponding to the human body attitude key point is greater than a preset threshold;
If so, determine that the human body attitude key point is credible;
If not, determine that the human body attitude key point is not credible.
Further, the human body attitude detection model comprises a main path, a first branch and a second branch; the main path comprises a residual module and an up-sampling module, the first branch comprises a refinement network module, and the second branch comprises a feedback module;
Inputting the current frame image data into the pre-trained human body attitude detection model, with the human body attitude confidence map of the previous frame image data as a reference, and outputting multiple human body attitude reference maps, comprises:
Inputting the current frame image data into the residual module for processing, and inputting the human body attitude confidence map of the previous frame image data, used as a reference, into the feedback module for processing, to obtain a first convolution result;
Inputting the first convolution result output by the residual module into the up-sampling module and the refinement network module respectively for processing, to obtain a second convolution result and a third convolution result respectively;
Adding the second convolution result to the third convolution result, and outputting multiple human body attitude reference maps.
Further, the residual module comprises a first residual unit, a second residual unit and a third residual unit;
Inputting the current frame image data into the residual module for processing, and inputting the human body attitude confidence map of the previous frame image data, used as a reference, into the feedback module for processing, to obtain the first convolution result, comprises:
Inputting the current frame image data into the first residual unit for processing, to obtain a first intermediate result;
Inputting the first intermediate result into the second residual unit for processing, and adding the result to the result obtained by inputting the human body attitude confidence map of the previous frame image data into the feedback module for processing, to obtain a second intermediate result;
Inputting the second intermediate result into the third residual unit for processing, to obtain a third intermediate result, which is used as the first convolution result;
Wherein the numbers of channels of the first intermediate result, the second intermediate result and the third intermediate result increase in turn.
Further, the human body attitude detection model also comprises a third branch;
Inputting the first convolution result output by the residual module into the up-sampling module and the refinement network module respectively for processing, to obtain the second convolution result and the third convolution result respectively, comprises:
Inputting the first intermediate result into the third branch for processing, to obtain a fourth intermediate result;
Inputting the second intermediate result into the third branch for processing, to obtain a fifth intermediate result;
Inputting the third intermediate result and the fifth intermediate result into the up-sampling module for processing, to obtain a sixth intermediate result;
Inputting the fourth intermediate result and the sixth intermediate result into the up-sampling module for processing, to obtain a seventh intermediate result, which is used as the second convolution result;
Inputting the first convolution result output by the residual module into the refinement network module for processing, to obtain the third convolution result;
Wherein the numbers of channels of the sixth intermediate result and the seventh intermediate result decrease in turn.
Further, inputting the current frame image data into the pre-trained human body attitude detection model, with the human body attitude confidence map of the previous frame image data as a reference, and outputting multiple human body attitude reference maps, further comprises:
Adding the first convolution result to the second convolution result, to obtain a second target result;
Adding the multiple human body attitude reference maps to the second target result, and outputting new multiple human body attitude reference maps;
Wherein the second target result is used, when training the human body attitude detection model, to improve the precision of the human body attitude detection model.
In a third aspect, an embodiment of the present invention further provides equipment, the equipment comprising:
One or more processors;
A memory, configured to store one or more programs;
When the one or more programs are executed by the one or more processors, the one or more processors implement the method described in the first aspect of the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the method described in the first aspect of the embodiments of the present invention.
In embodiments of the present invention, multiple frames of image data are acquired; the current frame image data is input into a pre-trained human body attitude detection model, with the human body attitude confidence map of the previous frame image data as a reference, and multiple human body attitude reference maps are output, where the human body attitude detection model is generated by training a convolutional neural network applicable to an embedded platform; human body attitude key points are identified in the human body attitude reference maps; a human body attitude confidence map is generated according to the credibility of the human body attitude key points; and it is judged whether the current frame image data is the last frame image data. If not, the human body attitude confidence map is input into the human body attitude detection model so that it participates in generating the human body attitude confidence map of the next frame image data; if so, the operation of generating human body attitude confidence maps for the multiple frames of image data ends. This enables human body attitude detection on an embedded platform. In addition, introducing the output result of the previous frame image data into the prediction of the output result of the current frame image data further improves prediction precision.
Brief description of the drawings
Fig. 1 is a flowchart of a human body attitude detection method in an embodiment of the present invention;
Fig. 2 is an application schematic diagram of a convolutional neural network in an embodiment of the present invention;
Fig. 3 is a flowchart of another human body attitude detection method in an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of a human body attitude detection device in an embodiment of the present invention;
Fig. 5 is a structural schematic diagram of equipment in an embodiment of the present invention.
Detailed description of the embodiments
In the following embodiments, optional features and examples are provided together in each embodiment. The features recorded in the embodiments can be combined to form multiple alternative solutions, and each numbered embodiment should not be regarded as only a single technical solution. The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment
Computer vision lets a computer simulate human visual functions so that, like a person, it can understand the objective world through observation. Its main research content is how to use computer vision techniques to solve human-centered problems, including object recognition, face recognition, human detection and tracking, human body attitude detection and human motion analysis. Human body attitude detection is an important component of human behavior recognition and an important research topic for human behavior recognition systems. Its ultimate goal is to output structural parameters of all or part of a person's limbs, such as the human body contour, the position and orientation of the head, and the positions or part categories of human body key points. It has important applications in many areas, for example athlete motion recognition, animated character production, and content-based image and video retrieval.
For human body attitude detection, the human body can be regarded as composed of different parts connected at key points, and a human body attitude can be determined by obtaining the position information of each key point, where the position information of a key point can be represented by a two-dimensional coordinate in the image plane. Human body attitude detection usually needs to obtain 14 key points of the human body in total: head, neck, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle and right ankle.
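The representation of an attitude as 14 named key points, each with a two-dimensional coordinate, can be written down as a small data structure. This is a sketch only: the identifier spellings in `KEYPOINT_NAMES` and the helper `pose_from_coordinates` are assumptions for illustration, and the ordering simply follows the list in the text.

```python
# Key point names in the order listed in the text (identifier spellings are assumed).
KEYPOINT_NAMES = (
    "head", "neck",
    "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
    "left_hip", "right_hip",
    "left_knee", "right_knee",
    "left_ankle", "right_ankle",
)


def pose_from_coordinates(coordinates):
    """Pair 14 (x, y) coordinates with their key point names."""
    if len(coordinates) != len(KEYPOINT_NAMES):
        raise ValueError(f"expected {len(KEYPOINT_NAMES)} coordinates")
    return dict(zip(KEYPOINT_NAMES, coordinates))
```

A fixed ordering like this is what lets a model emit one reference map per key point and have channel k always mean the same body part.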
In conventional techniques, human body attitude detection can be performed with methods based on convolutional neural networks. The core problem that a convolutional neural network solves is how to automatically extract and abstract features and then map those features to the task objective in order to solve a practical problem. A convolutional neural network generally consists of three parts: the first part is the input layer; the second part is a combination of convolutional layers, activation layers and pooling (or down-sampling) layers; the third part is a fully connected multilayer perceptron classifier. Convolutional neural networks have the property of weight sharing. Weight sharing means that the convolution operation of a single convolution kernel extracts the same feature at different positions of the whole image; in other words, identical targets at different positions in a frame of image data have essentially the same local features. It can be understood that a single convolution kernel yields only one kind of feature, so multiple convolution kernels are configured, each learning a different feature, to extract the features of the image data. It can also be understood that, in image processing, the role of the convolutional layers is to extract low-level features and aggregate them into high-level features. Low-level features are basic, local features such as texture and edges; high-level features, such as faces and object shapes, better reflect the global properties of a sample. This process is how a convolutional neural network generalizes a target object level by level.
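Weight sharing can be seen in a few lines of NumPy: one kernel slid over the image gives the same response wherever the same local pattern appears. The sketch below implements a plain "valid" cross-correlation, which is the operation convolutional layers conventionally compute; the function name `convolve2d_valid` and the edge-detecting kernel in the usage note are choices made for this illustration.

```python
import numpy as np


def convolve2d_valid(image, kernel):
    """Slide one kernel over the image (no padding, stride 1).

    The same kernel weights are reused at every position, which is what
    weight sharing means for a convolutional layer.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out
```

For example, with a 3x6 image that has bright vertical lines in columns 1 and 4 and the kernel `[[-1, 1]]`, the output is identical at the left edge of both lines, even though they sit at different positions. Detecting a second kind of feature would require a second kernel.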
It can be understood that, for a human body attitude detection method based on a convolutional neural network to run on an embedded platform, the network must have a small amount of computation, run fast, and reach a prediction precision that meets practical requirements.
To solve the problem that human body attitude detection methods based on convolutional neural networks cannot run on embedded platforms, the convolutional neural network can be improved; specifically, a lightweight convolutional neural network can be used. The convolutional neural network provided by embodiments of the present invention is a lightweight convolutional neural network, meaning a convolutional neural network that can be applied to an embedded platform.
The human body attitude detection method is further described below with reference to specific embodiments.
Fig. 1 is a flowchart of a human body attitude detection method provided by an embodiment of the present invention. This embodiment is applicable to detecting human body attitudes. The method can be executed by a human body attitude detection device, which can be implemented in software and/or hardware and can be configured in equipment, typically a computer or a mobile terminal. As shown in Fig. 1, the method specifically comprises the following steps:
Step 110: acquire multiple frames of image data.
In an embodiment of the present invention, a video can be understood as consisting of at least one frame of image data. Therefore, to identify human body attitudes in a video, the video can be divided into individual frames of image data and each frame analyzed separately. Here the multiple frames of image data belong to the same video; in other words, the video comprises multiple frames of image data. The frames can be named in chronological order. For example, if the video contains N frames of image data, N >= 1, the N frames can be called, in chronological order, the first frame image data, the second frame image data, ..., the (N-1)-th frame image data and the N-th frame image data.
It is understood that, once a video is divided into multiple frames of image data, each frame can be processed in turn in chronological order. The frame image data currently being processed can be called the current frame image data, the frame preceding it the previous frame image data, and the frame following it the next frame image data. It is also understood that if the current frame image data is the first frame image data, it has only next frame image data and no previous frame image data; if the current frame image data is the last frame image data, it has only previous frame image data and no next frame image data; and if the current frame image data is neither the first nor the last frame image data, it has both previous frame image data and next frame image data.
The reason for processing each frame of image data in chronological order is as follows: for human body attitude detection, there may be a certain correlation between two adjacent frames of image data. That is, if a key point is recognized at some position in the previous frame image data, that key point is likely to appear near the same position in the current frame image data. In other words, if the detection result of the previous frame image data meets a preset condition, the current frame image data can be processed with reference to the detection result of the previous frame image data.
Step 120: input the current frame image data into a pre-trained human body attitude detection model, which refers to the human body attitude confidence maps of the previous frame image data and outputs multiple human body attitude reference maps. The human body attitude detection model is generated by training a convolutional neural network applicable to an embedded platform.
In an embodiment of the present invention, a human body attitude confidence map can refer to an image that includes a human body attitude key point; alternatively, a human body attitude confidence map can be understood as an image generated centered on a human body attitude key point. The human body attitude key points here can refer to the 14 key points described previously: head, neck, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle and right ankle.
A human body attitude reference map may include two kinds of content: the location information of each point that may be a human body attitude key point, and the probability value corresponding to that location information. A point that may be a human body attitude key point can be called a candidate point. Correspondingly, a human body attitude reference map may include the location information of each candidate point and the probability value corresponding to that location information, that is, the probability value of each candidate point. Location information can be expressed as coordinates. Which candidate point to take as the human body attitude key point can be determined from the probability values of the candidate points. Illustratively, the candidate point with the largest probability value is selected as the human body attitude key point. Suppose a human body attitude reference map includes the location information (x_A, y_A) of candidate point A and its probability value P_A, the location information (x_B, y_B) of candidate point B and its probability value P_B, and the location information (x_C, y_C) of candidate point C and its probability value P_C, where P_A < P_B < P_C. Candidate point C is then determined to be the human body attitude key point.
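The selection rule in the example above can be sketched as follows. This is a minimal illustration, assuming each candidate point is stored as an (x, y, probability) tuple; the names `Candidate` and `pick_keypoint` and the numeric values are hypothetical and do not come from the embodiment.

```python
from typing import List, Tuple

# Hypothetical representation of a candidate point: (x, y, probability).
Candidate = Tuple[float, float, float]


def pick_keypoint(candidates: List[Candidate]) -> Candidate:
    """Return the candidate point with the largest probability value."""
    if not candidates:
        raise ValueError("the reference map has no candidate points")
    return max(candidates, key=lambda c: c[2])


# Candidates A, B and C with P_A < P_B < P_C (illustrative values only).
candidates = [(10.0, 20.0, 0.2), (12.0, 21.0, 0.5), (40.0, 8.0, 0.9)]
best = pick_keypoint(candidates)  # candidate C is selected
```

If several candidates share the largest probability value, `max` returns the first of them; the embodiment resolves such ties differently, as described later for step 130.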
It should be noted that each human body attitude confidence map corresponds to one human body attitude key point, and each human body attitude reference map includes multiple candidate points for one particular key point. For example, one human body attitude reference map includes multiple candidate points for the left elbow, and another includes multiple candidate points for the left knee. It follows that if N key points need to be determined from a frame of image data, that frame correspondingly has N human body attitude reference maps and N human body attitude confidence maps.
The pre-trained human body attitude detection model can be generated by training a convolutional neural network applicable to an embedded platform on a set number of groups of training samples. The convolutional neural network applicable to an embedded platform is a lightweight convolutional neural network. The human body attitude detection model may include a main path, a first branch, a second branch and a third branch. The main path may include a residual module and an up-sampling module, the first branch may include a refinement network module, and the second branch may include a feedback module. The residual module may include a first residual unit, a second residual unit and a third residual unit. A detailed description of the components of the human body attitude detection model is given below.
Inputting the current frame image data into the pre-trained human body attitude detection model, referring to the human body attitude confidence maps of the previous frame image data, and outputting multiple human body attitude reference maps can be divided into the following two cases:
Case one: the current frame image data is input as the input variable into the pre-trained human body attitude detection model to obtain multiple first human body attitude reference maps, and the multiple human body attitude reference maps are output according to the multiple human body attitude confidence maps obtained from the previous frame image data. Each first human body attitude reference map is checked against the corresponding one of the human body attitude confidence maps of the previous frame image data to output the human body attitude reference map of the current frame image data, where the correspondence is determined by whether the key points are the same. Illustratively, if the key point of a first human body attitude reference map of the current frame image data is the left elbow, the map it refers to is the human body attitude confidence map of the previous frame image data whose key point is the left elbow.
It is understood that in case one, the human body attitude confidence maps of the previous frame image data are not input into the pre-trained human body attitude detection model together with the current frame image data as input variables. Instead, after the current frame image data is input into the pre-trained model and multiple first human body attitude reference maps are obtained, whether each first human body attitude reference map is credible is determined in turn according to the human body attitude confidence maps of the previous frame image data. If it is credible, the first human body attitude reference map is taken as the human body attitude reference map of the current frame; if not, the corresponding human body attitude confidence map of the previous frame image data is taken as the human body attitude reference map of the current frame.
Case two: the current frame image data and the human body attitude confidence maps of the previous frame image data are input as input variables into the pre-trained human body attitude detection model, which outputs multiple human body attitude reference maps.
It is understood that in case two, the human body attitude confidence maps of the previous frame image data are also input variables, input into the pre-trained human body attitude detection model together with the current frame image data. The benefit of this arrangement is as follows: in a video, there is a certain correlation between two adjacent frames of image data, so feeding the result of the previous frame image data back into the pre-trained model, where it takes part in predicting the output for the current frame image data, can further improve the prediction accuracy of the human body attitude detection model.
It should be noted that in case two, to further improve the prediction accuracy of the human body attitude detection model, the following approach can be used: judge whether the human body attitude confidence maps of the previous frame image data are credible. If they are credible, the current frame image data and the human body attitude confidence maps of the previous frame image data are input into the pre-trained human body attitude detection model, which outputs multiple human body attitude reference maps. If they are not credible, the current frame image data and preset image data are input into the pre-trained model, which outputs multiple human body attitude reference maps; alternatively, only the current frame image data is input into the pre-trained model, which outputs multiple human body attitude reference maps. Here, preset image data refers to image data that contains no prior knowledge, such as an all-black image, which in matrix form is an all-zero matrix. With respect to the output for the current frame image data, the human body attitude confidence maps of the previous frame image data are image data containing prior knowledge; with respect to the output for the next frame image data, the human body attitude confidence maps of the current frame image data are image data containing prior knowledge.
The reason this approach can further improve the prediction accuracy of the human body attitude detection model is as follows: if a human body attitude confidence map of the previous frame image data is not credible, it is unreliable, and inputting it into the pre-trained model as an input variable would not improve the prediction accuracy of the model and might even reduce it. It is therefore necessary to ensure that any human body attitude confidence map of the previous frame image data that is input into the pre-trained model is credible. For this reason, before deciding whether to refer to the human body attitude confidence map of the previous frame image data, its credibility is judged: if it is credible, it is input into the pre-trained human body attitude detection model as an input variable; if not, it is not used as an input variable. Whether a human body attitude confidence map of the previous frame image data is credible can be judged as follows: a human body attitude key point is identified in the human body attitude reference map of the previous frame. If the probability value corresponding to the key point is greater than a preset threshold, a mask map is generated centered on the key point and used as the human body attitude confidence map of the previous frame, and this confidence map is determined to be credible. If the probability value corresponding to the key point is less than or equal to the preset threshold, the preset image data is used as the human body attitude confidence map, and the human body attitude confidence map of the previous frame is determined to be not credible.
It should also be noted that the multiple human body attitude reference maps described above are the output for the current frame image data; that is, the current frame image data corresponds to multiple human body attitude reference maps. More specifically, if N key points need to be determined from the current frame image data, N human body attitude reference maps are output. Likewise, there are N human body attitude confidence maps of the previous frame image data used as reference.
It should further be noted that judging whether the human body attitude confidence maps of the previous frame image data are credible means judging each human body attitude confidence map of the previous frame image data separately. Because a human body attitude confidence map can refer to an image that includes a key point, and different key points correspond to different human body attitude confidence maps, the conditions for judging credibility may be the same or different for different key points. They can be set according to the actual situation and are not specifically limited here.
In addition, if the current frame image data is the first frame image data, that is, it has no previous frame image data, then either the current frame image data alone, or the current frame image data together with preset image data, can be input into the pre-trained human body attitude detection model.
Optionally, on the basis of the above technical solution, inputting the current frame image data into the pre-trained human body attitude detection model, referring to the human body attitude confidence maps of the previous frame image data, and outputting multiple human body attitude reference maps may specifically include: judging whether the human body attitude confidence map of the previous frame image data is credible. If so, the current frame image data and the human body attitude confidence map of the previous frame image data are input into the pre-trained human body attitude detection model, which outputs multiple human body attitude reference maps. If not, the current frame image data and preset image data are input into the pre-trained human body attitude detection model, which outputs multiple human body attitude reference maps.
In an embodiment of the present invention, to further improve the prediction accuracy of the human body attitude detection model, the following approach can be considered: judge whether the human body attitude confidence map of the previous frame image data is credible. If it is credible, the current frame image data and the human body attitude confidence map of the previous frame image data can be input into the pre-trained human body attitude detection model, which outputs multiple human body attitude reference maps. If it is not credible, the current frame image data and preset image data can be input into the pre-trained human body attitude detection model, which outputs multiple human body attitude reference maps.
These operations ensure that any human body attitude confidence map of the previous frame image data that is input into the pre-trained human body attitude detection model as an input variable is credible. The prior knowledge provided by the human body attitude confidence map of the previous frame image data then improves the accuracy with which the human body attitude detection model predicts the output for the current frame image data.
Illustratively, suppose the previous frame image data has N human body attitude confidence maps. Each of the N maps is judged separately. If x maps are judged credible and (N-x) are not, then the x credible human body attitude confidence maps, (N-x) pieces of preset image data and the current frame image data can be input into the human body attitude detection model, which outputs multiple human body attitude reference maps.
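The substitution in this example can be sketched as follows. This is a minimal illustration, assuming each confidence map is a two-dimensional list of floats and that the credibility of each map has already been judged; `zero_map` and `select_prior_maps` are hypothetical helper names, not part of the embodiment.

```python
from typing import List

Map = List[List[float]]


def zero_map(height: int, width: int) -> Map:
    """Preset image data: an all-black image, i.e. an all-zero matrix."""
    return [[0.0] * width for _ in range(height)]


def select_prior_maps(prev_maps: List[Map], credible: List[bool]) -> List[Map]:
    """Keep credible confidence maps; replace the others with preset data."""
    if len(prev_maps) != len(credible):
        raise ValueError("one credibility flag is required per confidence map")
    priors = []
    for conf_map, ok in zip(prev_maps, credible):
        if ok:
            priors.append(conf_map)
        else:
            priors.append(zero_map(len(conf_map), len(conf_map[0])))
    return priors


# N = 3 confidence maps of size 2 x 2, of which x = 2 are credible.
prev_maps = [
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.3, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [1.0, 0.0]],
]
priors = select_prior_maps(prev_maps, [True, False, True])
```

The resulting list always has N entries, so the input to the model keeps the same shape regardless of how many maps were judged credible.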
Optionally, on the basis of the above technical solution, before inputting the current frame image data into the pre-trained human body attitude detection model, referring to the human body attitude confidence maps of the previous frame image data, and outputting multiple human body attitude reference maps, the method may further include: preprocessing each frame of image data separately to obtain processed image data.
In an embodiment of the present invention, preprocessing may include normalization and whitening. Normalization refers to a series of transformations that use the invariant moments of an image to find a set of parameters capable of eliminating the effect of other transformation functions on the image, converting the original image to be processed into a corresponding unique canonical form. The canonical-form image is invariant to affine transformations such as translation, rotation and scaling. Normalization usually includes the following steps: coordinate centering, x-shearing normalization, scaling normalization and rotation normalization. Because the human body attitude detection model is generated by neural network training, normalizing the image data before the current frame image data is input into the pre-trained human body attitude detection model serves to unify the statistical distribution of the samples, which speeds up network learning and ensures that small values in the output data are not swallowed.
Because adjacent pixels in image data are strongly correlated, the input is redundant when the image data is used as an input variable. The purpose of whitening is to reduce this redundancy. More precisely, after whitening the input variables have the following properties: the correlation between features is low, and all features have the same variance, which in image processing is usually set to unit variance.
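As a rough illustration of the unit-variance property only, the following sketch standardizes a single-channel image to zero mean and unit variance. It is a simplification under stated assumptions: it does not decorrelate neighboring pixels as full whitening does, and it does not perform the moment-based geometric normalization described above. The function name `standardize` and the `eps` parameter are hypothetical.

```python
import math
from typing import List

Image = List[List[float]]


def standardize(image: Image, eps: float = 1e-8) -> Image:
    """Shift an image to zero mean and scale it to (approximately) unit variance."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    std = math.sqrt(variance)
    # eps guards against division by zero for a constant image.
    return [[(p - mean) / (std + eps) for p in row] for row in image]


result = standardize([[0.0, 2.0], [4.0, 6.0]])
```

Applying the same preprocessing to every frame keeps the input statistics consistent between training and inference.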
It is understood that after the image data is preprocessed, the current frame image data input into the pre-trained human body attitude detection model as an input variable is the processed image data. Likewise, the previous frame image data is also processed image data.
Step 130: identify the human body attitude key points in the human body attitude reference maps.
In an embodiment of the present invention, as described above, a human body attitude reference map may include two kinds of content: the location information of each point that may be a human body attitude key point, and the probability value corresponding to that location information. A human body attitude key point refers to a point that has been determined to be a key point, while a point that may be a human body attitude key point is called a candidate point.
It follows that a human body attitude reference map includes the location information of multiple candidate points and the corresponding probability values, and that which candidate point to take as the human body attitude key point can be determined from the probability values of the candidate points. Illustratively, the candidate point with the largest probability value is selected as the human body attitude key point.
Optionally, on the basis of the above technical solution, identifying the human body attitude key point in the human body attitude reference map may specifically include: determining the coordinate position of the largest probability value in the human body attitude reference map and taking that coordinate position as the human body attitude key point.
In an embodiment of the present invention, because a human body attitude reference map includes the location information of each point that may be a human body attitude key point and the corresponding probability value, which point to take as the human body attitude key point can be determined from these probability values. Specifically, the coordinate position of the largest probability value in the human body attitude reference map is determined and taken as the human body attitude key point.
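Locating the largest probability value can be sketched as follows. This is a minimal illustration, assuming a reference map is stored as a two-dimensional list indexed as `reference_map[y][x]`; the function name `find_keypoint` and the sample values are hypothetical.

```python
from typing import List, Tuple

Map = List[List[float]]


def find_keypoint(reference_map: Map) -> Tuple[int, int, float]:
    """Return (x, y, probability) of the largest value in a reference map."""
    best_x, best_y = 0, 0
    best_p = reference_map[0][0]
    for y, row in enumerate(reference_map):
        for x, p in enumerate(row):
            if p > best_p:  # strict comparison keeps the first maximum found
                best_x, best_y, best_p = x, y, p
    return best_x, best_y, best_p


reference_map = [
    [0.1, 0.2, 0.1],
    [0.0, 0.8, 0.3],
]
keypoint = find_keypoint(reference_map)
```

When two positions share the largest value, this sketch simply keeps the first one in row-major order; it does not implement any further tie-breaking.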
It should be noted that each human body attitude reference map has only one human body attitude key point. Determining the key point by probability value may encounter the following problem: at least two probability values in the human body attitude reference map are equal and both greater than all the others. In that case, which coordinate position to take as the human body attitude key point can be further determined according to the actual situation, for example whether the resulting joint connection is reasonable. Illustratively, suppose two probability values in a human body attitude reference map are equal and greater than all the others, and their coordinate positions are A and B. Taking A and B in turn as the human body attitude key point and judging whether the joint connection is reasonable gives the following result: with A as the key point, the joint connection is unreasonable; with B as the key point, the joint connection is reasonable. B is therefore determined to be the human body attitude key point.
Step 140: generate a human body attitude confidence map according to the credibility of the human body attitude key point.
In an embodiment of the present invention, credibility may be either credible or not credible. The criterion can be whether the probability value corresponding to the human body attitude key point is greater than a preset threshold: if the probability value is greater than the preset threshold, the human body attitude key point is credible; if the probability value is less than or equal to the preset threshold, the human body attitude key point is not credible.
On this basis, if the human body attitude key point is credible, a mask map is generated centered on the key point and used as the human body attitude confidence map; if the key point is not credible, the preset image data can be used as the human body attitude confidence map. The preset image data here is the same as the preset image data described previously: it can be an all-black image, which in matrix form is an all-zero matrix. As stated above, whether the human body attitude key point is credible can be judged by whether its probability value is greater than the preset threshold: the key point is credible if the probability value is greater than the preset threshold, and not credible if the probability value is less than or equal to the preset threshold.
It should be noted that if the human body attitude key point is determined to be not credible, the corresponding human body attitude key point in the previous frame image data is used as the human body attitude key point of the current frame. However, for a key point that is not credible, the human body attitude confidence map is not generated from the corresponding key point in the previous frame image data; the preset image data is used as the human body attitude confidence map instead.
Optionally, on the basis of the above technical solution, generating the human body attitude confidence map according to the credibility of the human body attitude key point may specifically include: judging whether the human body attitude key point is credible. If so, a mask map is generated centered on the human body attitude key point and used as the human body attitude confidence map. If not, the preset image data is used as the human body attitude confidence map.
In an embodiment of the present invention, a mask map refers to the image obtained after image masking is applied to an image. Image masking refers to using a selected image, figure or object to occlude all or part of the image to be processed, in order to control the region or the process of image processing. The specific image or object used for covering is called a mask or template. In digital image processing, a mask can be a two-dimensional matrix array or a multi-valued image. Image masks are mainly used for the following purposes. First, extracting a region of interest: a pre-made region-of-interest mask is multiplied with the image to be processed to obtain the region-of-interest image, in which image values inside the region of interest remain unchanged and image values outside it are zero. Second, shielding: a mask is used to shield certain regions of the image to be processed so that they do not take part in processing or in the calculation of processing parameters, or so that only the shielded region is processed or counted. Third, structural feature extraction: a similarity template or image matching method is used to detect and extract structural features in the image that are similar to the mask. Fourth, producing images of special shapes.
Generating the human body attitude confidence map from the human body attitude reference map according to the credibility of the human body attitude key point may specifically include: if the human body attitude key point is credible, generating a mask map centered on the key point as the human body attitude confidence map. More specifically, if the human body attitude key point is credible, the mask map may be generated with a Gaussian kernel centered on the key point and used as the human body attitude confidence map. It should be noted that the region affected by the mask map can be determined by setting the parameters of the Gaussian kernel, which include the width and height of the filter window; the Gaussian kernel can be a two-dimensional Gaussian kernel. Illustratively, if a two-dimensional Gaussian kernel has a filter window of width 7 and height 7, the region affected by the mask map is a 7 × 7 square region.
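The generation of a confidence map from a key point can be sketched as follows. This is a minimal illustration under stated assumptions: the embodiment specifies only the 7 × 7 filter window, so the standard deviation `sigma`, the peak value of 1.0, and the function names `gaussian_mask` and `make_confidence_map` are all hypothetical choices.

```python
import math
from typing import List, Tuple

Map = List[List[float]]


def gaussian_mask(height: int, width: int, cx: int, cy: int,
                  window: int = 7, sigma: float = 1.5) -> Map:
    """Write a 2D Gaussian into a window centered on (cx, cy); zero elsewhere."""
    half = window // 2
    mask = [[0.0] * width for _ in range(height)]
    # Clip the window at the image border.
    for y in range(max(0, cy - half), min(height, cy + half + 1)):
        for x in range(max(0, cx - half), min(width, cx + half + 1)):
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            mask[y][x] = math.exp(-d2 / (2.0 * sigma ** 2))
    return mask


def make_confidence_map(height: int, width: int, keypoint: Tuple[int, int],
                        probability: float, threshold: float) -> Map:
    """Credible key point: Gaussian mask. Otherwise: preset all-zero image."""
    if probability > threshold:
        cx, cy = keypoint
        return gaussian_mask(height, width, cx, cy)
    return [[0.0] * width for _ in range(height)]


credible_map = make_confidence_map(16, 16, (8, 8), probability=0.8, threshold=0.5)
preset_map = make_confidence_map(16, 16, (8, 8), probability=0.5, threshold=0.5)
```

With the default window, only the 7 × 7 region around the key point is nonzero, which matches the affected region described in the example above.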
It should be noted that if the human body attitude key point is not credible, the preset image data can be used as the human body attitude confidence map, and the preset image data can also be regarded as a kind of mask map. The preset image data here is the same as the preset image data described previously: it can be an all-black image, which in matrix form is an all-zero matrix.
Optionally, on the basis of the above technical solution, judging whether the human body attitude key point is credible may specifically include: judging whether the probability value corresponding to the human body attitude key point is greater than a preset threshold. If so, the human body attitude key point is determined to be credible. If not, the human body attitude key point is determined to be not credible.
In an embodiment of the present invention, it should be noted that the threshold can be set according to the actual situation and is not specifically limited here. In addition, the thresholds corresponding to different human body attitude key points may be the same or different; they can also be set according to the actual situation and are not specifically limited here. For example, a larger threshold can be set for an important human body attitude key point and a smaller threshold for an unimportant one. Illustratively, when the human body attitude key point is the top of the head, the corresponding threshold is 0.9, and when the human body attitude key point is the left knee, the corresponding threshold is 0.5.
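Per-key-point thresholds can be sketched as a simple lookup table. This is a minimal illustration using the two thresholds from the example above; the key names, the fallback value `DEFAULT_THRESHOLD` and the function name `is_credible` are assumptions for illustration only.

```python
from typing import Dict

# Thresholds taken from the example above; other key points are not specified.
THRESHOLDS: Dict[str, float] = {"top_of_head": 0.9, "left_knee": 0.5}

# Assumed fallback for key points without an explicit threshold.
DEFAULT_THRESHOLD = 0.5


def is_credible(name: str, probability: float) -> bool:
    """A key point is credible only if its probability exceeds its threshold."""
    return probability > THRESHOLDS.get(name, DEFAULT_THRESHOLD)


head_ok = is_credible("top_of_head", 0.8)
knee_ok = is_credible("left_knee", 0.8)
```

The comparison is strict, so a probability value exactly equal to the threshold is treated as not credible, consistent with the rule stated for step 140.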
Step 150: judge whether the current frame image data is the last frame image data; if not, execute step 160; if so, execute step 170.
Step 160: input the human body attitude confidence map into the human body attitude detection model, where it participates in generating the human body attitude confidence map of the next frame image data.
Step 170: end the operation of generating the human body attitude confidence maps of the multiple frames of image data.
In an embodiment of the present invention, whether the current frame image data is the last frame image data is judged. If it is not, the human body attitude confidence map of the current frame image data can be input into the human body attitude detection model as a reference for the output of the next frame image data, to improve the accuracy of that output. That is, the next frame image data is input into the pre-trained human body attitude detection model, which refers to the human body attitude confidence maps of the current frame image data and outputs multiple human body attitude reference maps of the next frame image data; the human body attitude key points are identified in the human body attitude reference maps; and the human body attitude confidence maps are generated according to the credibility of the human body attitude key points.
It should be noted that if the current frame image data is the last frame image data, the operation of generating the human body attitude confidence maps of the multiple frames of image data ends, and the resulting human body attitude confidence map need not be input into the human body attitude detection model again. On this basis, it is understood that if the current frame image data is the last frame image data, it is sufficient to execute step 120 and step 130 and to judge whether the human body attitude key point is credible, using the corresponding human body attitude key point in the previous frame image data as the human body attitude key point if it is not credible. It is also understood that, for every frame, executing step 120 and step 130 and making this credibility judgement yields the human body attitude key points corresponding to the current frame image data.
It should be noted that steps 120 to 150 are the processing procedure for the current frame image data. Accordingly, the human body attitude reference maps in step 120 and step 130 are those corresponding to the current frame image data, the human body attitude key points in step 130 and step 140 are those corresponding to the current frame image data, and the human body attitude confidence maps in step 140 and step 150 are those corresponding to the current frame image data.
Based on the above, the current frame image data denotes whichever frame of image data is being processed at the moment. If the first frame image data is being processed, the first frame image data is the current frame image data; if the second frame image data is being processed, the second frame image data is the current frame image data, and so on. In other words, the current frame image data may be the first frame image data, the second frame image data, the third frame image data, ..., the (N-1)-th frame image data or the N-th frame image data.
Assume that the video includes N frames of image data, N ≥ 1. If the current frame image data is determined not to be the N-th frame image data, steps 120 to 140 are repeated, which completes the processing of the first to the (N-1)-th frame image data. If the current frame image data is determined to be the N-th frame image data, step 120 and step 130 are executed and, where a human body attitude key point is not credible, the corresponding human body attitude key point in the previous frame image data is used as the human body attitude key point.
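The per-frame procedure described above can be illustrated with a minimal sketch. It is not the claimed implementation: the names `process_video`, `argmax_2d` and `model`, the threshold value 0.5, and the use of a plain list of key points as a stand-in for the human body attitude confidence map are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple

Keypoint = Tuple[int, int]      # (row, column) of one key point
Heatmap = List[List[float]]     # one reference map holding probability values


def argmax_2d(heatmap: Heatmap) -> Tuple[Keypoint, float]:
    """Return the position and value of the largest entry of a 2-D map."""
    best_pos, best_val = (0, 0), heatmap[0][0]
    for r, row in enumerate(heatmap):
        for c, val in enumerate(row):
            if val > best_val:
                best_pos, best_val = (r, c), val
    return best_pos, best_val


def process_video(frames: list, model: Callable, threshold: float = 0.5) -> List[List[Keypoint]]:
    """Process frames in order, feeding each frame's result to the next frame."""
    feedback: Optional[list] = None   # nothing to refer to before the first frame
    previous: Optional[List[Keypoint]] = None
    results: List[List[Keypoint]] = []
    for index, frame in enumerate(frames):
        maps = model(frame, feedback)             # cf. step 120: output reference maps
        keypoints, credible = [], []
        for k, heatmap in enumerate(maps):        # cf. step 130: identify key points
            pos, prob = argmax_2d(heatmap)
            ok = prob > threshold
            if not ok and previous is not None:
                pos = previous[k]                 # fall back to the previous frame
            keypoints.append(pos)
            credible.append(ok)
        results.append(keypoints)
        previous = keypoints
        if index < len(frames) - 1:               # not the last frame: feed back
            # Stand-in for the confidence map: credible key points only.
            feedback = [p if ok else None for p, ok in zip(keypoints, credible)]
    return results
```

For the last frame the sketch stops after the credibility check, so nothing is fed back to the model.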
In the technical solution of this embodiment, multi-frame image data is acquired; the current frame image data is input to a pre-trained human body attitude detection model, which outputs multiple human body attitude reference maps with reference to the human body attitude confidence map of the previous frame image data, the human body attitude detection model being generated by training a convolutional neural network suited to embedded platforms; human body attitude key points are identified in the human body attitude reference maps; a human body attitude confidence map is generated according to the credibility of the human body attitude key points; and it is judged whether the current frame image data is the last frame image data. If not, the human body attitude confidence map is input to the human body attitude detection model to take part in generating the human body attitude confidence map of the next frame image data; if so, the operation of generating human body attitude confidence maps for the multi-frame image data ends. Human body attitude detection is thereby realized on an embedded platform, and because the output result of the previous frame image data is introduced into the prediction of the output result of the current frame image data, the prediction precision is further improved.
Optionally, on the basis of the above technical solution, the human body attitude detection model includes a main path, a first branch and a second branch. The main path includes a residual module and an up-sampling module, the first branch includes a refinement network module, and the second branch includes a feedback module.
Inputting the current frame image data to the pre-trained human body attitude detection model, so as to output multiple human body attitude reference maps with reference to the human body attitude confidence map of the previous frame image data, may specifically include: inputting the current frame image data to the residual module for processing, with the human body attitude confidence map of the previous frame image data input to the feedback module for processing as a reference, to obtain a first convolution result; inputting the first convolution result output by the residual module to the up-sampling module and to the refinement network module for processing, to obtain a second convolution result and a third convolution result respectively; and adding the second convolution result and the third convolution result to output the multiple human body attitude reference maps.
In the embodiment of the present invention, the residual module may be used to extract features such as the edges and contours of the image data, and the up-sampling module may be used to extract contextual information of the image data. The refinement network module processes the first convolution result output by the residual module. The first convolution result can be regarded as intermediate-layer information of the network; that is, the refinement network module makes use of this intermediate-layer information to strengthen the back-propagated gradient, which improves the prediction precision of the convolutional neural network. The feedback module introduces the human body attitude confidence map of the previous frame image data into the convolutional neural network, which improves the precision of the output result for the current frame image data.
Inputting the current frame image data to the residual module for processing, with the human body attitude confidence map of the previous frame image data input to the feedback module for processing as a reference, to obtain the first convolution result, can be understood as follows: the result of processing the current frame image data in the residual module is added to the result of processing the human body attitude confidence map of the previous frame image data in the feedback module, and the sum is the first convolution result.
The first convolution result output by the residual module is input to the up-sampling module and to the refinement network module for processing, which yields the second convolution result and the third convolution result; the second convolution result is then added to the third convolution result to output the multiple human body attitude reference maps. The up-sampling module may use nearest-neighbor interpolation, or another up-sampling method chosen according to the actual situation; no specific limitation is imposed here.
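Nearest-neighbor interpolation, mentioned above as one option for the up-sampling module, can be sketched as follows for a single feature map. The function name `upsample_nearest` and the integer factor of 2 are assumptions for illustration; the text does not fix the up-sampling factor.

```python
from typing import List


def upsample_nearest(feature_map: List[List[float]], factor: int = 2) -> List[List[float]]:
    """Enlarge a 2-D map by an integer factor by repeating each source value."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out: List[List[float]] = []
    for row in feature_map:
        # Repeat every value `factor` times along the width ...
        wide = [value for value in row for _ in range(factor)]
        # ... and repeat the widened row `factor` times along the height.
        out.extend(list(wide) for _ in range(factor))
    return out
```

With a factor of 2, a 16 × 8 map becomes a 32 × 16 map, which is the change of size that occurs between successive decoding stages in the example given later in the text.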
The refinement network module makes use of intermediate-layer information of the network to strengthen the back-propagated gradient, which improves the prediction precision of the convolutional neural network. The feedback module introduces the human body attitude confidence map of the previous frame image data into the convolutional neural network, so that it takes part in the prediction of the human body attitude detection model for the current frame image data, which likewise improves the prediction precision of the convolutional neural network.
Optionally, on the basis of the above technical solution, the residual module includes a first residual unit, a second residual unit and a third residual unit.
Inputting the current frame image data to the residual module for processing, with the human body attitude confidence map of the previous frame image data input to the feedback module for processing as a reference, to obtain the first convolution result, may specifically include: inputting the current frame image data to the first residual unit for processing, to obtain a first intermediate result; adding the result of processing the first intermediate result in the second residual unit to the result of processing the human body attitude confidence map of the previous frame image data in the feedback module, to obtain a second intermediate result; and inputting the second intermediate result to the third residual unit for processing, to obtain a third intermediate result, which serves as the first convolution result. The channel numbers of the first intermediate result, the second intermediate result and the third intermediate result increase in turn.
In the embodiment of the present invention, the residual module may specifically include a first residual unit, a second residual unit and a third residual unit, each of which consists of a ShuffleNet subunit and a ShuffleNet down-sampling subunit. The ShuffleNet subunit can operate on image data of arbitrary size and is controlled by two parameters, the input depth and the output depth. The input depth is the number of layers of the intermediate feature layer fed into the subunit, and the output depth is the number of layers of the intermediate feature layer that the subunit outputs; the number of layers corresponds to the channel number. The ShuffleNet subunit extracts higher-level features while retaining the information of the original level; it does not change the size of the image data and changes only the depth of the intermediate feature layer, so it can be regarded as an advanced "convolutional layer" that keeps the size unchanged. In a convolutional neural network, the channel number refers to the number of convolution kernels in each convolutional layer. It should further be noted that each residual unit may include only one ShuffleNet subunit; compared with the original design, in which each residual unit includes three ShuffleNet subunits, this simplifies the network structure, which in turn reduces the amount of computation and improves processing efficiency.
The ShuffleNet down-sampling subunits in the first residual unit, the second residual unit and the third residual unit process the data in turn, so that the sizes of the first intermediate result, the second intermediate result and the third intermediate result decrease in turn. Meanwhile, to keep the scale of the network constant, the channel numbers of the first intermediate result, the second intermediate result and the third intermediate result increase in turn. In addition, each channel corresponds to one feature map.
It should be noted that an intermediate result can be expressed as W × H × K, where W denotes the width of the intermediate result, H denotes its length, K denotes the channel number, and W × H denotes the size of the intermediate result. Input image data can be expressed as W × H × D, where W and H have the same meanings as above and D denotes the depth. For example, D = 3 if the input image data is an RGB image, and D = 1 if the input image data is a grayscale image.
For example, let the first intermediate result, the second intermediate result and the third intermediate result be expressed as M × N × K, where M, N and K have the same meanings as W, H and K above. The first intermediate result is 64 × 32 × 32, the second intermediate result is 32 × 16 × 64, and the third intermediate result is 16 × 8 × 128. It follows that the sizes of the first, second and third intermediate results are 64 × 32, 32 × 16 and 16 × 8 respectively, that is, the sizes decrease in turn. Meanwhile, the channel numbers of the first, second and third intermediate results are 32, 64 and 128 respectively, that is, the channel numbers increase in turn.
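The figures in this example follow a simple pattern: from one intermediate result to the next, the width and length are halved and the channel number is doubled. The sketch below reproduces the three shapes under that assumption; the function name `encoder_shapes` and the factor of 2 are inferred from the numbers above rather than stated in the text.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]    # (width, length, channel number)


def encoder_shapes(first: Shape = (64, 32, 32), stages: int = 3) -> List[Shape]:
    """List the shapes of successive intermediate results in the encoding part."""
    shapes = [first]
    for _ in range(stages - 1):
        w, h, k = shapes[-1]
        # Down-sampling halves the size; the channel number doubles.
        shapes.append((w // 2, h // 2, k * 2))
    return shapes
```

Calling `encoder_shapes()` returns the three shapes of the first, second and third intermediate results given above.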
Optionally, on the basis of the above technical solution, the human body attitude detection model may further include a third branch.
Inputting the first convolution result output by the residual module to the up-sampling module and to the refinement network module for processing, to obtain the second convolution result and the third convolution result respectively, may specifically include: inputting the first intermediate result to the third branch for processing, to obtain a fourth intermediate result; inputting the second intermediate result to the third branch for processing, to obtain a fifth intermediate result; inputting the third intermediate result and the fifth intermediate result to the up-sampling module for processing, to obtain a sixth intermediate result; inputting the fourth intermediate result and the sixth intermediate result to the up-sampling module for processing, to obtain a seventh intermediate result, which serves as the second convolution result; and inputting the first convolution result output by the residual module to the refinement network module for processing, to obtain the third convolution result. The channel numbers of the sixth intermediate result and the seventh intermediate result decrease in turn.
In the embodiment of the present invention, the human body attitude detection model may further include a third branch. The third branch serves to move the convolution operation of the skip connections onto the main path, which further improves the prediction precision of the human body attitude detection model. The third branch may specifically include a 1 × 1 convolution kernel module, a batch normalization module and a linear activation function module. The 1 × 1 convolution kernel can serve the following purposes:
Case 1: with a single channel and a single convolution kernel, the 1 × 1 convolution kernel scales the input image data. A 1 × 1 convolution kernel has only one parameter, so sliding it over the input image data is equivalent to multiplying the input image data by a coefficient.
Case 2: with multiple channels and multiple convolution kernels, the 1 × 1 convolution kernel has the following effects: first, it realizes cross-channel interaction and information integration; second, it reduces or raises the dimension and reduces the number of network parameters, where reducing the dimension means reducing the channel number and raising the dimension means increasing the channel number; third, it substantially increases the nonlinearity without any loss of resolution.
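Both cases can be seen in a minimal sketch of a 1 × 1 convolution, which is a per-pixel weighted sum over the input channels. The function name `conv1x1` and the nested-list data layout (length, then width, then channels) are assumptions for illustration; bias, batch normalization and activation are omitted.

```python
from typing import List

Image = List[List[List[float]]]     # indexed as [row][column][channel]


def conv1x1(image: Image, weights: List[List[float]]) -> Image:
    """Apply a 1 x 1 convolution; `weights` holds one row per output channel."""
    return [
        [
            # Each output channel mixes all input channels of the same pixel.
            [sum(w * x for w, x in zip(kernel, pixel)) for kernel in weights]
            for pixel in row
        ]
        for row in image
    ]
```

With one input channel and one kernel, `weights` is a single number and every pixel is simply scaled (case 1). With several input channels, the number of rows of `weights` sets the output channel number, so the same operation reduces or raises the dimension (case 2).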
The batch normalization module performs batch normalization. Batch normalization addresses the slower convergence and the vanishing or exploding gradients that arise as a neural network becomes deeper: the inputs of some or all layers are normalized so that the mean and variance of each layer's input signal are fixed and each layer's input has a stable distribution. More specifically, it is usually applied before the activation function and normalizes x = Wu + b so that the output has mean 0 and variance 1, where W denotes the weight matrix, b denotes the bias and u denotes the layer input. It can be understood that in a convolutional neural network the weight matrix is the convolution kernel, that is, W denotes the convolution kernel.
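The normalization to mean 0 and variance 1 can be sketched for one channel over a batch of values as follows. The function name `batch_normalize` and the small constant `eps` (added for numerical stability) are assumptions for illustration, and the learnable scale and shift that batch normalization layers usually carry are left out.

```python
import math
from typing import List


def batch_normalize(values: List[float], eps: float = 1e-5) -> List[float]:
    """Normalize a batch of pre-activation values to mean 0 and variance close to 1."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    # eps keeps the division well defined when the variance is zero.
    return [(v - mean) / math.sqrt(variance + eps) for v in values]
```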
Since the seventh intermediate result is obtained by inputting the sixth intermediate result and the fourth intermediate result to the up-sampling module, the size of the seventh intermediate result is larger than that of the sixth intermediate result. Meanwhile, to keep the scale of the network constant, the channel numbers of the sixth intermediate result and the seventh intermediate result decrease in turn.
Moving the convolution operation of the skip connections onto the main path by means of the third branch further improves the prediction precision of the human body attitude detection model. In addition, the first, second and third intermediate results can be understood as the encoding part, and the sixth and seventh intermediate results as the decoding part. To keep the scale of the network constant, the encoding part increases the channel number of the intermediate results in turn as their size decreases, and the decoding part decreases the channel number of the intermediate results in turn as their size increases. It can further be understood that the convolutional neural network provided by the embodiment of the present invention is an asymmetric encoding-decoding structure.
Optionally, on the basis of the above technical solution, inputting the current frame image data to the pre-trained human body attitude detection model, so as to output multiple human body attitude reference maps with reference to the human body attitude confidence map of the previous frame image data, may further include: adding the first convolution result and the second convolution result to obtain a second target result; and adding the multiple human body attitude reference maps and the second target result to output multiple new human body attitude reference maps. The second target result is used to improve the precision of the human body attitude detection model when the model is trained.
In the embodiment of the present invention, intermediate supervision may be added to improve the precision of the human body attitude detection model in the training stage. Intermediate supervision means computing the loss at the output of each stage, which ensures that the parameters of the lower layers are updated properly.
The first convolution result and the second convolution result are added to obtain the second target result, and the second target result is then added to the multiple human body attitude reference maps to obtain multiple new human body attitude reference maps. The second target result thus acts as intermediate supervision, that is, it also takes part in the computation of the loss.
It should be noted that in the prediction stage the addition of the first convolution result and the second convolution result need not be executed; that is, in the prediction stage the output includes only the multiple human body attitude reference maps.
It should also be noted that, after the multi-frame image data is collected, the technical solution of the embodiment of the present invention does not need to detect whether the image data contains a face, locate the face in the image data if one is present, and then extract it. These operations are omitted because they are time-consuming and their detection results have large errors. It can be understood that omitting them greatly improves data processing efficiency.
It should further be noted that, since the second residual unit and the third residual unit consist of a ShuffleNet subunit and a ShuffleNet down-sampling subunit, the information at the original scale is retained before each down-sampling on the main path: before the ShuffleNet down-sampling subunit of the second residual unit performs down-sampling, the first intermediate result is input to the second residual unit, and before the ShuffleNet down-sampling subunit of the third residual unit performs down-sampling, the second intermediate result is input to the third residual unit. Between two down-sampling operations, one ShuffleNet subunit is used to extract features: between the first residual unit and the second residual unit, features are extracted by the ShuffleNet subunit of the first residual unit, and between the second residual unit and the third residual unit, features are extracted by the ShuffleNet subunit of the second residual unit.
The convolutional neural network provided by the embodiment of the present invention introduces the refinement network module and the feedback module and moves the convolution operation of the skip connections onto the main path, all of which improve its prediction precision. In addition, the asymmetric encoding-decoding structure keeps the scale of the network essentially unchanged, and because each residual unit includes only one ShuffleNet subunit rather than the three of the original design, the network structure is simplified, the amount of computation is reduced and processing efficiency is improved. As a result, the human body attitude detection method based on the convolutional neural network can be applied to embedded platforms, such as the embedded platform of a smartphone, where it runs in real time with a prediction precision that meets the requirements.
To aid understanding of the convolutional neural network provided by the embodiment of the present invention, a specific example is described below:
Fig. 2 is a schematic diagram of an application of the convolutional neural network. The convolutional neural network may specifically include a main path, a first branch, a second branch and a third branch. The main path includes a first convolution module 21, a first residual unit 22, a second residual unit 23, a third residual unit 24, a second convolution module 25, an up-sampling module 26, an element-wise addition module 27 and a third convolution module 28, where the first residual unit 22, the second residual unit 23 and the third residual unit 24 each include a ShuffleNet down-sampling subunit 221 and a ShuffleNet subunit 222. The first branch includes a refinement network module 29, which includes a ShuffleNet subunit 222 and an up-sampling module 26. The second branch includes a feedback module 30, and the third branch includes a second convolution module 25.
It should be noted that the W × H × K marked in a module, unit or subunit denotes the result obtained after processing by that module, unit or subunit, where W denotes the width of the result, H denotes its length and K denotes the channel number.
It should also be noted that the first convolution module 21 includes the following processing operations: first, a convolution operation with a convolution kernel of size 3 × 3; second, batch normalization; third, a linear activation function. The second convolution module 25 includes the following processing operations: first, a convolution operation with a convolution kernel of size 1 × 1; second, batch normalization; third, a linear activation function. The third convolution module 28 includes the following processing operations: first, a convolution operation with a convolution kernel of size 1 × 1; second, batch normalization; third, a linear activation function; fourth, a convolution operation with a convolution kernel of size 3 × 3.
Assume that the current frame image data is an RGB image of 256 × 128 × 3. It is input to the convolutional neural network as the input variable and passes through the first convolution module 21 and the first residual unit 22 in turn, which yields the first intermediate result of 64 × 32 × 32. The result of processing the first intermediate result in the second residual unit 23 is added to the result of processing the human body attitude confidence map of the previous frame image data in the feedback module 30, which yields the second intermediate result of 32 × 16 × 64. The second intermediate result is input to the third residual unit 24 for processing, which yields the third intermediate result; the third intermediate result serves as the first convolution result, which is 16 × 8 × 128. It should be noted that the feedback module 30 may include a 1 × 1 convolution kernel for raising the dimension: the human body attitude confidence map of the previous frame image data is 64 × 32 × 14 whereas the first intermediate result is 64 × 32 × 32, so the dimension must be raised to make the output channel numbers of the two consistent.
The first intermediate result is input to the second convolution module 25 of the third branch for processing, which yields the fourth intermediate result of 64 × 32 × 32.
The second intermediate result is input to the second convolution module 25 of the third branch for processing, which yields the fifth intermediate result of 32 × 16 × 32.
The third intermediate result is processed by the second convolution module 25 and the up-sampling module 26 on the main path; the result obtained and the fifth intermediate result are input together to the element-wise addition module 27 on the main path for processing, which yields the sixth intermediate result. The sixth intermediate result is processed by the up-sampling module 26 on the main path; the result obtained and the fourth intermediate result are input together to the element-wise addition module 27 on the main path for processing, which yields the seventh intermediate result. The seventh intermediate result serves as the second convolution result, which is 64 × 32 × 32.
The result of processing the third intermediate result in the second convolution module 25 on the main path is input to a ShuffleNet subunit 222 on the first branch for processing, which yields an eighth intermediate result. The eighth intermediate result is input to an up-sampling module 26 on the first branch for processing, which yields a ninth intermediate result. The ninth intermediate result is input to a ShuffleNet subunit 222 on the first branch for processing, which yields a tenth intermediate result, and the tenth intermediate result is input to an up-sampling module 26 on the first branch for processing, which yields an eleventh intermediate result. The sixth intermediate result is input to a ShuffleNet subunit 222 on the first branch for processing, which yields a twelfth intermediate result, and the twelfth intermediate result is input to an up-sampling module 26 on the first branch for processing, which yields a thirteenth intermediate result. The eleventh intermediate result and the thirteenth intermediate result are added to obtain the third convolution result, which is 64 × 32 × 32.
The second convolution result and the third convolution result are input to the element-wise addition module 27 on the main path, which yields a fourteenth intermediate result. The fourteenth intermediate result is input to the ShuffleNet subunit 222 on the main path, which yields a fifteenth intermediate result of 64 × 32 × 32. The fifteenth intermediate result is input to the third convolution module 28 on the main path, which outputs the multiple human body attitude reference maps.
The first convolution result and the second convolution result are added to obtain the second target result, which is 64 × 32 × 14. The multiple human body attitude reference maps and the second target result are added to output multiple new human body attitude reference maps. The second target result is used to improve the precision of the human body attitude detection model when the model is trained.
It should be noted that the human body attitude confidence map of the previous frame image data is not input to the convolutional neural network together with the current frame image data at the very beginning; instead, it is input at an intermediate layer of the network, together with the first intermediate result. This reduces the amount of data to be processed.
Fig. 3 is a flowchart of another human body attitude detection method provided by an embodiment of the present invention. This embodiment is applicable to detecting human body attitude. The method may be executed by a human body attitude detection device, which may be implemented in software and/or hardware and configured in equipment, typically a computer or a mobile terminal. As shown in Fig. 3, the method specifically includes the following steps:
Step 301, acquire multi-frame image data.
Step 302, judge whether the human body attitude confidence map of the previous frame image data is credible; if so, execute step 303; if not, execute step 304.
Step 303, input the current frame image data and the human body attitude confidence map of the previous frame image data to the pre-trained human body attitude detection model, output multiple human body attitude reference maps, and proceed to step 305.
Step 304, input the current frame image data and preset image data to the pre-trained human body attitude detection model, output multiple human body attitude reference maps, and proceed to step 305.
Step 305, determine the coordinate position of the maximum probability value in each human body attitude reference map, and take the coordinate position as a human body attitude key point.
Step 306, judge whether the probability value corresponding to the human body attitude key point is greater than a preset threshold; if so, execute step 307; if not, execute step 308.
Step 307, generate a mask map centered on the human body attitude key point as the human body attitude confidence map, and proceed to step 309.
Step 308, take the preset image data as the human body attitude confidence map, and proceed to step 309.
Step 309, judge whether the current frame image data is the last frame image data; if not, execute step 310; if so, execute step 311.
Step 310, input the human body attitude confidence map to the human body attitude detection model, to take part in generating the human body attitude confidence map of the next frame image data.
Step 311, end the operation of generating human body attitude confidence maps for the multi-frame image data.
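Steps 305 to 308 can be illustrated for a single human body attitude reference map with the following minimal sketch. Several details are assumptions made for illustration and are not fixed by the text: the function names `find_keypoint` and `confidence_map`, the threshold value 0.5, a square mask of the given `radius`, and an all-zero map standing in for the preset image data.

```python
from typing import List, Tuple

Heatmap = List[List[float]]


def find_keypoint(reference_map: Heatmap) -> Tuple[Tuple[int, int], float]:
    """Step 305: locate the maximum probability value and its coordinates."""
    best_pos, best_prob = (0, 0), reference_map[0][0]
    for r, row in enumerate(reference_map):
        for c, prob in enumerate(row):
            if prob > best_prob:
                best_pos, best_prob = (r, c), prob
    return best_pos, best_prob


def confidence_map(reference_map: Heatmap, threshold: float = 0.5, radius: int = 1) -> Heatmap:
    """Steps 306 to 308: build the confidence map for one reference map."""
    height, width = len(reference_map), len(reference_map[0])
    (r0, c0), prob = find_keypoint(reference_map)
    if prob <= threshold:
        # Step 308: key point not credible, use the preset data (assumed all zero).
        return [[0.0] * width for _ in range(height)]
    # Step 307: mask centered on the key point (assumed square with side 2 * radius + 1).
    return [
        [1.0 if abs(r - r0) <= radius and abs(c - c0) <= radius else 0.0
         for c in range(width)]
        for r in range(height)
    ]
```

In this sketch the confidence map has the same size as the reference map, so it can be passed back to the model for the next frame as described in step 310.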
In the embodiment of the present invention, it should be noted that the human body attitude detection model provided by the embodiment of the present invention is generated by training a convolutional neural network suited to embedded platforms.
In the technical solution of this embodiment, multi-frame image data is acquired; the current frame image data is input to a pre-trained human body attitude detection model, which outputs multiple human body attitude reference maps with reference to the human body attitude confidence map of the previous frame image data, the human body attitude detection model being generated by training a convolutional neural network suited to embedded platforms; human body attitude key points are identified in the human body attitude reference maps; a human body attitude confidence map is generated according to the credibility of the human body attitude key points; and it is judged whether the current frame image data is the last frame image data. If not, the human body attitude confidence map is input to the human body attitude detection model to take part in generating the human body attitude confidence map of the next frame image data; if so, the operation of generating human body attitude confidence maps for the multi-frame image data ends. Human body attitude detection is thereby realized on an embedded platform, and because the output result of the previous frame image data is introduced into the prediction of the output result of the current frame image data, the prediction precision is further improved.
Fig. 4 is a schematic structural diagram of a human body attitude detection device provided by an embodiment of the present invention. This embodiment is applicable to detecting human body attitude. The device may be implemented in software and/or hardware and configured in equipment, typically a computer or a mobile terminal. As shown in Fig. 4, the device specifically includes:
Image data acquiring module 410, for acquiring multiple image data.
Human body attitude is with reference to figure output module 420, for current frame image data to be input to human body appearance trained in advance
In state detection model, with the human body attitude confidence map with reference to previous frame image data, multiple human body attitudes are exported with reference to figure, human body
Attitude detection model is the convolutional neural networks training generation for being applied to embedded platform.
Human body attitude key point identification module 430 is used in human body attitude with reference to identification human body attitude key point in figure.
Human body attitude confidence map generation module 440 generates human body attitude for the credibility according to human body attitude key point
Confidence map.
Judgment module 450, for judging whether current frame image data are last frame image data.
First execution module 460, for if it is not, then human body attitude confidence map is input in human body attitude detection model,
For participating in generating the human body attitude confidence map of next frame image data.
Second execution module 470, for if so, terminating to execute the human body attitude confidence map for generating multiple image data
Operation.
In the technical solution of this embodiment, multi-frame image data are acquired; the current frame image data are input into a pre-trained human body attitude detection model so that, with reference to the human body attitude confidence map of the previous frame image data, the model outputs multiple human body attitude reference maps, the human body attitude detection model being generated by training a convolutional neural network applied to an embedded platform; human body attitude key points are identified in the human body attitude reference maps; a human body attitude confidence map is generated according to the credibility of the human body attitude key points; and it is judged whether the current frame image data are the last frame image data. If not, the human body attitude confidence map is input into the human body attitude detection model to participate in generating the human body attitude confidence map of the next frame image data; if so, the operation of generating human body attitude confidence maps for the multi-frame image data ends. Human body attitude detection is thereby realized on an embedded platform and, because the output result of the previous frame image data is introduced into the prediction of the output result of the current frame image data, prediction accuracy is further improved.
Optionally, on the basis of the above technical solution, the human body attitude reference map output module 420 may specifically include:
A confidence map credibility judging unit, configured to judge whether the human body attitude confidence map of the previous frame image data is credible.
A first human body attitude reference map output unit, configured to, if so, input the current frame image data and the human body attitude confidence map of the previous frame image data into the pre-trained human body attitude detection model and output multiple human body attitude reference maps.
A second human body attitude reference map output unit, configured to, if not, input the current frame image data and preset image data into the pre-trained human body attitude detection model and output multiple human body attitude reference maps.
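The choice between the two output units can be illustrated with a short sketch. The function name `select_feedback_input` is hypothetical, and the credibility decision is passed in as a ready-made flag because this passage does not state how the credibility of the previous confidence map is computed.

```python
from typing import List, Optional

Map = List[List[float]]  # hypothetical stand-in for a confidence map or preset image


def select_feedback_input(
    previous_confidence: Optional[Map],
    credible: bool,
    preset: Map,
) -> Map:
    """Pick the second model input that accompanies the current frame."""
    # Credible previous map: reuse it (first output unit).
    if credible and previous_confidence is not None:
        return previous_confidence
    # Otherwise fall back to the preset image data (second output unit).
    return preset
```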
Optionally, on the basis of the above technical solution, the human body attitude key point identification module 430 may specifically include:
A human body attitude key point recognition unit, configured to determine the coordinate position of the maximum probability value in a human body attitude reference map and use that coordinate position as the human body attitude key point.
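Locating the maximum probability value amounts to an argmax over a two-dimensional map. The sketch below assumes a reference map stored as a non-empty nested list of probabilities; `find_keypoint` is a hypothetical name, and returning the first maximum on ties is a choice of this sketch rather than something the text specifies.

```python
from typing import List, Tuple


def find_keypoint(reference_map: List[List[float]]) -> Tuple[int, int, float]:
    """Return (row, column, probability) of the largest value in a 2-D map."""
    if not reference_map or not reference_map[0]:
        raise ValueError("reference_map must be a non-empty 2-D grid")
    best_row, best_col = 0, 0
    best = reference_map[0][0]
    for r, row in enumerate(reference_map):
        for c, value in enumerate(row):
            if value > best:  # strict comparison keeps the first maximum on ties
                best_row, best_col, best = r, c, value
    return best_row, best_col, best
```

The returned probability is the value that the credibility check on the key point would later compare against a threshold.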
Optionally, on the basis of the above technical solution, the human body attitude confidence map generation module 440 may specifically include:
A human body attitude key point credibility judging unit, configured to judge whether the human body attitude key point is credible.
A first human body attitude confidence map generation unit, configured to, if so, generate a mask map centered on the human body attitude key point as the human body attitude confidence map.
A second human body attitude confidence map generation unit, configured to, if not, use preset image data as the human body attitude confidence map.
Optionally, on the basis of the above technical solution, the human body attitude key point credibility judging unit may specifically be configured to:
Judge whether the probability value corresponding to the human body attitude key point is greater than a preset threshold.
If so, determine that the human body attitude key point is credible.
If not, determine that the human body attitude key point is not credible.
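The threshold test and the resulting confidence map can be sketched together. The shape of the mask is an assumption: this passage says only that a mask map is generated centered on the key point, so the sketch uses a square binary mask with a hypothetical `radius` parameter. The names `is_credible` and `make_confidence_map` are likewise illustrative.

```python
from typing import List, Tuple

Map = List[List[float]]


def is_credible(probability: float, threshold: float) -> bool:
    """A key point is credible only if its probability exceeds the threshold."""
    return probability > threshold


def make_confidence_map(
    keypoint: Tuple[int, int],
    probability: float,
    shape: Tuple[int, int],
    threshold: float,
    preset: Map,
    radius: int = 1,
) -> Map:
    """Build a mask centered on a credible key point, else return the preset data."""
    if not is_credible(probability, threshold):
        return preset
    rows, cols = shape
    key_row, key_col = keypoint
    # Assumed mask shape: 1.0 inside a square window around the key point, 0.0 elsewhere.
    return [
        [
            1.0 if abs(r - key_row) <= radius and abs(c - key_col) <= radius else 0.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

A probability exactly equal to the threshold is treated as not credible, matching the "greater than" wording of the judgment step.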
Optionally, on the basis of the above technical solution, the human body attitude detection model includes a main path, a first branch and a second branch. The main path includes a residual module and an upsampling module, the first branch includes a refinement network module, and the second branch includes a feedback module.
Inputting the current frame image data into the pre-trained human body attitude detection model so that, with reference to the human body attitude confidence map of the previous frame image data, the model outputs multiple human body attitude reference maps may specifically include:
Inputting the current frame image data into the residual module for processing, with reference to the result of inputting the human body attitude confidence map of the previous frame image data into the feedback module for processing, to obtain a first convolution result.
Inputting the first convolution result output by the residual module into the upsampling module and the refinement network module respectively for processing, to obtain a second convolution result and a third convolution result respectively.
Adding the second convolution result and the third convolution result, and outputting multiple human body attitude reference maps.
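The data flow through the main path and the two branches can be sketched as follows. This shows only how the modules are wired together, under stated assumptions: each module is passed in as a plain callable, feature maps are nested lists, and the names `forward` and `add` are hypothetical. The internals of the modules are not modeled.

```python
from typing import Any, Callable


def add(a: Any, b: Any) -> Any:
    """Element-wise addition of two equally shaped nested lists (or two numbers)."""
    if isinstance(a, list):
        return [add(x, y) for x, y in zip(a, b)]
    return a + b


def forward(
    frame: Any,
    previous_confidence: Any,
    residual_module: Callable[[Any, Any], Any],
    feedback_module: Callable[[Any], Any],
    upsampling_module: Callable[[Any], Any],
    refinement_module: Callable[[Any], Any],
) -> Any:
    """Wire the modules together as described; returns the reference maps."""
    # Second branch: the feedback module processes the previous confidence map.
    feedback = feedback_module(previous_confidence)
    # Main path: the residual module combines the frame with the feedback.
    first = residual_module(frame, feedback)
    # Main path continues through upsampling; the first branch refines in parallel.
    second = upsampling_module(first)
    third = refinement_module(first)
    # The two results are added to form the reference maps.
    return add(second, third)
```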
Optionally, on the basis of the above technical solution, the residual module includes a first residual unit, a second residual unit and a third residual unit.
Inputting the current frame image data into the residual module for processing, with reference to the result of inputting the human body attitude confidence map of the previous frame image data into the feedback module for processing, to obtain a first convolution result may specifically include:
Inputting the current frame image data into the first residual unit for processing to obtain a first intermediate result.
Adding the result of inputting the first intermediate result into the second residual unit for processing to the result of inputting the human body attitude confidence map of the previous frame image data into the feedback module for processing, to obtain a second intermediate result.
Inputting the second intermediate result into the third residual unit for processing to obtain a third intermediate result, which serves as the first convolution result.
The numbers of channels of the first intermediate result, the second intermediate result and the third intermediate result increase successively.
Optionally, on the basis of the above technical solution, the human body attitude detection model may further include a third branch.
Inputting the first convolution result output by the residual module into the upsampling module and the refinement network module respectively for processing, to obtain a second convolution result and a third convolution result respectively, may specifically include:
Inputting the first intermediate result into the third branch for processing to obtain a fourth intermediate result.
Inputting the second intermediate result into the third branch for processing to obtain a fifth intermediate result.
Inputting the third intermediate result and the fifth intermediate result into the upsampling module for processing to obtain a sixth intermediate result.
Inputting the fourth intermediate result and the sixth intermediate result into the upsampling module for processing to obtain a seventh intermediate result, which serves as the second convolution result.
Inputting the first convolution result output by the residual module into the refinement network module for processing to obtain the third convolution result.
The numbers of channels of the sixth intermediate result and the seventh intermediate result decrease successively.
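The seven intermediate results are easier to follow when written out as a sequence of assignments. The sketch below is a wiring diagram in code form, not the patented network: every module is a caller-supplied callable, a single `third_branch` callable is assumed to serve both of its inputs, and the names `main_path` and `add` are hypothetical. Channel counts and spatial sizes are not modeled.

```python
from typing import Any, Callable, Dict, Tuple


def add(a: Any, b: Any) -> Any:
    """Element-wise addition of two equally shaped nested lists (or two numbers)."""
    if isinstance(a, list):
        return [add(x, y) for x, y in zip(a, b)]
    return a + b


def main_path(
    frame: Any,
    previous_confidence: Any,
    residual_units: Tuple[Callable, Callable, Callable],
    feedback: Callable[[Any], Any],
    third_branch: Callable[[Any], Any],
    upsample: Callable[[Any, Any], Any],
    refine: Callable[[Any], Any],
) -> Dict[str, Any]:
    """Compute the intermediate results in the order the text describes."""
    unit1, unit2, unit3 = residual_units
    r1 = unit1(frame)                                   # first intermediate result
    r2 = add(unit2(r1), feedback(previous_confidence))  # second intermediate result
    r3 = unit3(r2)                                      # third = first convolution result
    r4 = third_branch(r1)                               # fourth intermediate result
    r5 = third_branch(r2)                               # fifth intermediate result
    r6 = upsample(r3, r5)                               # sixth intermediate result
    r7 = upsample(r4, r6)                               # seventh = second convolution result
    third_convolution = refine(r3)                      # third convolution result
    return {
        "first_convolution": r3,
        "second_convolution": r7,
        "third_convolution": third_convolution,
        "reference_maps": add(r7, third_convolution),
    }
```

Written this way, the third branch reads as a pair of skip connections that carry the earlier residual outputs into the two upsampling steps.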
Optionally, on the basis of the above technical solution, inputting the current frame image data into the pre-trained human body attitude detection model so that, with reference to the human body attitude confidence map of the previous frame image data, the model outputs multiple human body attitude reference maps may further include:
Adding the first convolution result and the second convolution result to obtain a second target result.
Adding the multiple human body attitude reference maps to the second target result, and outputting multiple new human body attitude reference maps.
The second target result is used to improve the precision of the human body attitude detection model when the model is trained.
The human body attitude detection device provided by the embodiment of the present invention can execute the human body attitude detection method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
Fig. 5 is a schematic structural diagram of equipment provided by an embodiment of the present invention. Fig. 5 shows a block diagram of exemplary equipment 512 suitable for implementing embodiments of the present invention. The equipment 512 shown in Fig. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 5, the equipment 512 takes the form of a general-purpose computing device. The components of the equipment 512 may include, but are not limited to: one or more processors 516, a system memory 528, and a bus 518 connecting the different system components (including the system memory 528 and the processors 516).
The bus 518 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus and the Peripheral Component Interconnect (PCI) bus.
The equipment 512 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the equipment 512, including volatile and non-volatile media, and removable and non-removable media.
The system memory 528 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 530 and/or cache memory 532. The equipment 512 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 534 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 5, commonly called a "hard disk drive"). Although not shown in Fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a CD-ROM, DVD-ROM or other optical medium) may be provided. In these cases, each drive may be connected to the bus 518 through one or more data media interfaces. The memory 528 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 540 having a set of (at least one) program modules 542 may be stored, for example, in the memory 528. Such program modules 542 include, but are not limited to, an operating system, one or more application programs, other program modules and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 542 generally perform the functions and/or methods of the embodiments described in the present invention.
The equipment 512 may also communicate with one or more external devices 514 (such as a keyboard, a pointing device, a display 524, etc.), with one or more devices that enable a user to interact with the equipment 512, and/or with any device (such as a network card, a modem, etc.) that enables the equipment 512 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 522. The equipment 512 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network such as the Internet) through a network adapter 520. As shown in the figure, the network adapter 520 communicates with the other modules of the equipment 512 through the bus 518. It should be understood that, although not shown in Fig. 5, other hardware and/or software modules may be used in conjunction with the equipment 512, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, etc.
The processor 516 executes various functional applications and data processing by running programs stored in the system memory 528, for example implementing a human body attitude detection method provided by an embodiment of the present invention, the method including:
Acquiring multi-frame image data.
Inputting the current frame image data into a pre-trained human body attitude detection model so that, with reference to the human body attitude confidence map of the previous frame image data, the model outputs multiple human body attitude reference maps, the human body attitude detection model being generated by training a convolutional neural network applied to an embedded platform.
Identifying human body attitude key points in the human body attitude reference maps.
Generating a human body attitude confidence map according to the credibility of the human body attitude key points.
Judging whether the current frame image data are the last frame image data.
If not, inputting the human body attitude confidence map into the human body attitude detection model to participate in generating the human body attitude confidence map of the next frame image data.
If so, ending the operation of generating human body attitude confidence maps for the multi-frame image data.
Of course, those skilled in the art will understand that the processor can also implement the technical solution of the human body attitude detection method applied to equipment provided by any embodiment of the present invention. For the hardware structure and functions of the equipment, refer to the description in the embodiments.
An embodiment of the present invention also provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, it implements a human body attitude detection method as provided by the embodiments of the present invention, the method including:
Acquiring multi-frame image data.
Inputting the current frame image data into a pre-trained human body attitude detection model so that, with reference to the human body attitude confidence map of the previous frame image data, the model outputs multiple human body attitude reference maps, the human body attitude detection model being generated by training a convolutional neural network applied to an embedded platform.
Identifying human body attitude key points in the human body attitude reference maps.
Generating a human body attitude confidence map according to the credibility of the human body attitude key points.
Judging whether the current frame image data are the last frame image data.
If not, inputting the human body attitude confidence map into the human body attitude detection model to participate in generating the human body attitude confidence map of the next frame image data.
If so, ending the operation of generating human body attitude confidence maps for the multi-frame image data.
The computer storage medium of the embodiment of the present invention may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and that medium can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device.
Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Of course, for the computer-readable storage medium provided by the embodiment of the present invention, the computer-executable instructions are not limited to the method operations described above, and can also perform related operations in the human body attitude detection method of the equipment provided by any embodiment of the present invention. For a description of the storage medium, refer to the explanation in the embodiments.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments and substitutions can be made by those skilled in the art without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them and may include other equivalent embodiments without departing from the inventive concept; the scope of the present invention is determined by the scope of the appended claims.
Claims (12)
1. A human body attitude detection method, characterized by comprising:
acquiring multi-frame image data;
inputting current frame image data into a pre-trained human body attitude detection model so that, with reference to the human body attitude confidence map of previous frame image data, the model outputs multiple human body attitude reference maps, the human body attitude detection model being generated by training a convolutional neural network applied to an embedded platform;
identifying human body attitude key points in the human body attitude reference maps;
generating a human body attitude confidence map according to the credibility of the human body attitude key points;
judging whether the current frame image data are the last frame image data;
if not, inputting the human body attitude confidence map into the human body attitude detection model to participate in generating the human body attitude confidence map of next frame image data;
if so, ending the operation of generating human body attitude confidence maps for the multi-frame image data.
2. The method according to claim 1, characterized in that inputting the current frame image data into the pre-trained human body attitude detection model so that, with reference to the human body attitude confidence map of the previous frame image data, the model outputs multiple human body attitude reference maps comprises:
judging whether the human body attitude confidence map of the previous frame image data is credible;
if so, inputting the current frame image data and the human body attitude confidence map of the previous frame image data into the pre-trained human body attitude detection model, and outputting multiple human body attitude reference maps;
if not, inputting the current frame image data and preset image data into the pre-trained human body attitude detection model, and outputting multiple human body attitude reference maps.
3. The method according to claim 1, characterized in that identifying human body attitude key points in the human body attitude reference maps comprises:
determining the coordinate position of the maximum probability value in a human body attitude reference map, and using the coordinate position as the human body attitude key point.
4. The method according to claim 2, characterized in that generating a human body attitude confidence map according to the credibility of the human body attitude key points comprises:
judging whether the human body attitude key point is credible;
if so, generating a mask map centered on the human body attitude key point as the human body attitude confidence map;
if not, using the preset image data as the human body attitude confidence map.
5. The method according to claim 4, characterized in that judging whether the human body attitude key point is credible comprises:
judging whether the probability value corresponding to the human body attitude key point is greater than a preset threshold;
if so, determining that the human body attitude key point is credible;
if not, determining that the human body attitude key point is not credible.
6. The method according to any one of claims 1-5, characterized in that the human body attitude detection model includes a main path, a first branch and a second branch, the main path includes a residual module and an upsampling module, the first branch includes a refinement network module, and the second branch includes a feedback module;
inputting the current frame image data into the pre-trained human body attitude detection model so that, with reference to the human body attitude confidence map of the previous frame image data, the model outputs multiple human body attitude reference maps comprises:
inputting the current frame image data into the residual module for processing, with reference to the result of inputting the human body attitude confidence map of the previous frame image data into the feedback module for processing, to obtain a first convolution result;
inputting the first convolution result output by the residual module into the upsampling module and the refinement network module respectively for processing, to obtain a second convolution result and a third convolution result respectively;
adding the second convolution result and the third convolution result, and outputting multiple human body attitude reference maps.
7. The method according to claim 6, characterized in that the residual module includes a first residual unit, a second residual unit and a third residual unit;
inputting the current frame image data into the residual module for processing, with reference to the result of inputting the human body attitude confidence map of the previous frame image data into the feedback module for processing, to obtain a first convolution result comprises:
inputting the current frame image data into the first residual unit for processing to obtain a first intermediate result;
adding the result of inputting the first intermediate result into the second residual unit for processing to the result of inputting the human body attitude confidence map of the previous frame image data into the feedback module for processing, to obtain a second intermediate result;
inputting the second intermediate result into the third residual unit for processing to obtain a third intermediate result, which serves as the first convolution result;
wherein the numbers of channels of the first intermediate result, the second intermediate result and the third intermediate result increase successively.
8. The method according to claim 7, characterized in that the human body attitude detection model further includes a third branch;
inputting the first convolution result output by the residual module into the upsampling module and the refinement network module respectively for processing, to obtain a second convolution result and a third convolution result respectively, comprises:
inputting the first intermediate result into the third branch for processing to obtain a fourth intermediate result;
inputting the second intermediate result into the third branch for processing to obtain a fifth intermediate result;
inputting the third intermediate result and the fifth intermediate result into the upsampling module for processing to obtain a sixth intermediate result;
inputting the fourth intermediate result and the sixth intermediate result into the upsampling module for processing to obtain a seventh intermediate result, which serves as the second convolution result;
inputting the first convolution result output by the residual module into the refinement network module for processing to obtain the third convolution result;
wherein the numbers of channels of the sixth intermediate result and the seventh intermediate result decrease successively.
9. The method according to claim 6, characterized in that inputting the current frame image data into the pre-trained human body attitude detection model so that, with reference to the human body attitude confidence map of the previous frame image data, the model outputs multiple human body attitude reference maps further comprises:
adding the first convolution result and the second convolution result to obtain a second target result;
adding the multiple human body attitude reference maps to the second target result, and outputting multiple new human body attitude reference maps;
wherein the second target result is used to improve the precision of the human body attitude detection model when the human body attitude detection model is trained.
10. A human body attitude detection device, characterized by comprising:
an image data acquisition module, configured to acquire multi-frame image data;
a human body attitude reference map output module, configured to input current frame image data into a pre-trained human body attitude detection model so that, with reference to the human body attitude confidence map of previous frame image data, the model outputs multiple human body attitude reference maps, the human body attitude detection model being generated by training a convolutional neural network applied to an embedded platform;
a human body attitude key point identification module, configured to identify human body attitude key points in the human body attitude reference maps;
a human body attitude confidence map generation module, configured to generate a human body attitude confidence map according to the credibility of the human body attitude key points;
a judgment module, configured to judge whether the current frame image data are the last frame image data;
a first execution module, configured to, if not, input the human body attitude confidence map into the human body attitude detection model to participate in generating the human body attitude confidence map of next frame image data;
a second execution module, configured to, if so, end the operation of generating human body attitude confidence maps for the multi-frame image data.
11. Equipment, characterized by comprising:
one or more processors;
a memory, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-9.
12. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, it implements the method according to any one of claims 1-9.
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811427578.XA CN109558832B (en) | 2018-11-27 | 2018-11-27 | Human body posture detection method, device, equipment and storage medium |
| US17/297,882 US11908244B2 (en) | 2018-11-27 | 2019-11-20 | Human posture detection utilizing posture reference maps |
| PCT/CN2019/119633 WO2020108362A1 (en) | 2018-11-27 | 2019-11-20 | Body posture detection method, apparatus and device, and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811427578.XA CN109558832B (en) | 2018-11-27 | 2018-11-27 | Human body posture detection method, device, equipment and storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN109558832A true CN109558832A (en) | 2019-04-02 |
| CN109558832B CN109558832B (en) | 2021-03-26 |
Family
ID=65867635
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201811427578.XA Active CN109558832B (en) | 2018-11-27 | 2018-11-27 | Human body posture detection method, device, equipment and storage medium |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US11908244B2 (en) |
| CN (1) | CN109558832B (en) |
| WO (1) | WO2020108362A1 (en) |
Cited By (27)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110163878A (en) * | 2019-05-28 | 2019-08-23 | 四川智盈科技有限公司 | Image semantic segmentation method based on dual multi-scale attention mechanism |
| CN110197117A (en) * | 2019-04-18 | 2019-09-03 | 北京奇艺世纪科技有限公司 | Human body contour point extraction method, device, terminal device and computer readable storage medium |
| CN110688888A (en) * | 2019-08-02 | 2020-01-14 | 浙江省北大信息技术高等研究院 | A method and system for pedestrian attribute recognition based on deep learning |
| CN110991235A (en) * | 2019-10-29 | 2020-04-10 | 北京海益同展信息科技有限公司 | A state monitoring method, device, electronic device and storage medium |
| CN111008573A (en) * | 2019-11-15 | 2020-04-14 | 广东智媒云图科技股份有限公司 | Limb structure generation method and device, terminal equipment and readable storage medium |
| CN111096835A (en) * | 2019-07-02 | 2020-05-05 | 武汉联影医疗科技有限公司 | An orthosis design method and system |
| WO2020108362A1 (en) * | 2018-11-27 | 2020-06-04 | 广州市百果园信息技术有限公司 | Body posture detection method, apparatus and device, and storage medium |
| CN111311714A (en) * | 2020-03-31 | 2020-06-19 | 北京慧夜科技有限公司 | Attitude prediction method and system for three-dimensional animation |
| CN111950321A (en) * | 2019-05-14 | 2020-11-17 | 杭州海康威视数字技术股份有限公司 | Gait recognition method, device, computer equipment and storage medium |
| CN112149477A (en) * | 2019-06-28 | 2020-12-29 | 北京地平线机器人技术研发有限公司 | Attitude estimation method, apparatus, medium, and device |
| CN112417927A (en) * | 2019-08-22 | 2021-02-26 | 北京奇虎科技有限公司 | Method for establishing human body gesture recognition model, human body gesture recognition method and device |
| CN112488071A (en) * | 2020-12-21 | 2021-03-12 | 重庆紫光华山智安科技有限公司 | Method, device, electronic equipment and storage medium for extracting pedestrian features |
| CN112528842A (en) * | 2020-12-07 | 2021-03-19 | 北京嘀嘀无限科技发展有限公司 | Method, apparatus, device and storage medium for gesture detection |
| CN112861777A (en) * | 2021-03-05 | 2021-05-28 | 上海有个机器人有限公司 | Human body posture estimation method, electronic device and storage medium |
| CN113034580A (en) * | 2021-03-05 | 2021-06-25 | 北京字跳网络技术有限公司 | Image information detection method and device and electronic equipment |
| CN113033526A (en) * | 2021-05-27 | 2021-06-25 | 北京欧应信息技术有限公司 | Computer-implemented method, electronic device and computer program product |
| CN113077383A (en) * | 2021-06-07 | 2021-07-06 | 深圳追一科技有限公司 | Model training method and model training device |
| CN113515143A (en) * | 2021-06-30 | 2021-10-19 | 深圳市优必选科技股份有限公司 | Robot navigation method, robot and computer readable storage medium |
| WO2022002032A1 (en) * | 2020-06-29 | 2022-01-06 | 北京灵汐科技有限公司 | Image-driven model training and image generation |
| CN114782981A (en) * | 2022-03-07 | 2022-07-22 | 奥比中光科技集团股份有限公司 | Human body posture estimation model, model training method and human body posture estimation method |
| CN115376203A (en) * | 2022-07-20 | 2022-11-22 | 华为技术有限公司 | Data processing method and device |
| CN115497171A (en) * | 2022-10-31 | 2022-12-20 | 华南农业大学 | Human behavior recognition method and system based on deep learning |
| KR20230038029A (en) * | 2021-09-10 | 2023-03-17 | 중앙대학교 산학협력단 | 3D medical image analysis system and method using convolutional neural network model |
| CN116843003A (en) * | 2023-06-29 | 2023-10-03 | 京东方科技集团股份有限公司 | Management methods, devices, equipment and storage media for embedded devices |
| CN116912951A (en) * | 2023-09-13 | 2023-10-20 | 华南理工大学 | Human body posture assessment method and device |
| CN116934848A (en) * | 2022-03-31 | 2023-10-24 | 腾讯科技(深圳)有限公司 | Data processing methods, devices, equipment and media |
| CN117078895A (en) * | 2022-05-09 | 2023-11-17 | 宏达国际电子股份有限公司 | Virtual reality system and control method |
Families Citing this family (22)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11576794B2 (en) | 2019-07-02 | 2023-02-14 | Wuhan United Imaging Healthcare Co., Ltd. | Systems and methods for orthosis design |
| US11847823B2 (en) * | 2020-06-18 | 2023-12-19 | Apple Inc. | Object and keypoint detection system with low spatial jitter, low latency and low power usage |
| CN113971742B (en) * | 2020-07-07 | 2025-10-17 | 广州虎牙科技有限公司 | Key point detection, model training and live broadcasting methods, devices, equipment and media |
| WO2022098488A1 (en) * | 2020-11-06 | 2022-05-12 | Innopeak Technology, Inc. | Real-time scene text area detection |
| CN112580543B (en) * | 2020-12-24 | 2024-04-16 | 四川云从天府人工智能科技有限公司 | Behavior recognition method, system and device |
| CN114693588B (en) * | 2020-12-28 | 2025-10-14 | 虹软科技股份有限公司 | Method and device for detecting cargo box status |
| CN112784739B (en) * | 2021-01-21 | 2024-05-24 | 北京百度网讯科技有限公司 | Model training method, key point positioning method, device, equipment and medium |
| US12567173B2 (en) * | 2021-05-07 | 2026-03-03 | Northeastern University | Infant 2D pose estimation and posture detection system |
| CN113361388B (en) * | 2021-06-03 | 2023-11-24 | 北京百度网讯科技有限公司 | Image data correction method, device, electronic equipment and autonomous vehicle |
| CN113516762B (en) * | 2021-08-10 | 2025-06-24 | 腾讯音乐娱乐科技(深圳)有限公司 | Image processing method and device |
| CN114038009A (en) * | 2021-10-26 | 2022-02-11 | 深圳市华安泰智能科技有限公司 | An Image Data Acquisition and Analysis System Based on Human Skeleton Key Points |
| CN114639033A (en) * | 2021-12-06 | 2022-06-17 | 南京谦萃智能科技服务有限公司 | Personnel identification method, device, equipment and computer readable storage medium |
| CN114155556B (en) * | 2021-12-07 | 2024-05-07 | 中国石油大学(华东) | A human posture estimation method and system based on a stacked hourglass network with a channel shuffling module |
| CN114638744B (en) * | 2022-03-03 | 2024-07-26 | 厦门大学 | Human body posture transfer method and device |
| CN115062657B (en) * | 2022-06-09 | 2025-02-18 | 浙江众信安医疗科技有限公司 | Pressure sensor data identification method, mattress control method and related device |
| CN115100745B (en) * | 2022-07-05 | 2023-06-20 | 北京甲板智慧科技有限公司 | Swin Transformer model-based real-time motion counting method and system |
| CN115690832B (en) * | 2022-09-05 | 2026-01-23 | 北京航空航天大学杭州创新研究院 | Human kneeling gesture detection method, device, detection equipment and storage medium |
| CN115497151B (en) * | 2022-10-25 | 2025-05-16 | 四川虹微技术有限公司 | Method for detecting bad postures for specific age groups applied to TV scenes |
| JP7740316B2 (en) * | 2023-01-18 | 2025-09-17 | 日本電気株式会社 | Method, device and system for identifying a person's posture state |
| CN116206157A (en) * | 2023-03-08 | 2023-06-02 | 中国工商银行股份有限公司 | Target detection method, device, computer equipment and storage medium |
| CN116912176B (en) * | 2023-06-26 | 2026-01-02 | 平安科技(深圳)有限公司 | Image quality verification methods, apparatus, computer equipment and storage media |
| CN117392761B (en) * | 2023-12-13 | 2024-04-16 | 深圳须弥云图空间科技有限公司 | Human body pose recognition method and device, electronic equipment and storage medium |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3203412A1 (en) * | 2016-02-05 | 2017-08-09 | Delphi Technologies, Inc. | System and method for detecting hand gestures in a 3d space |
| CN108399367A (en) * | 2018-01-31 | 2018-08-14 | 深圳市阿西莫夫科技有限公司 | Hand motion recognition method, apparatus, computer device and readable storage medium |
| CN108710868A (en) * | 2018-06-05 | 2018-10-26 | 中国石油大学(华东) | A human body key point detection system and method based on complex scenes |
| CN108846365A (en) * | 2018-06-24 | 2018-11-20 | 深圳市中悦科技有限公司 | Method, device, storage medium and processor for detecting fighting behavior in video |
| CN108875523A (en) * | 2017-12-28 | 2018-11-23 | 北京旷视科技有限公司 | Human joint point detection method, device, system and storage medium |
| CN108875492A (en) * | 2017-10-11 | 2018-11-23 | 北京旷视科技有限公司 | Face detection and key point positioning method, device, system and storage medium |
Family Cites Families (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102385695A (en) | 2010-09-01 | 2012-03-21 | 索尼公司 | Human body three-dimensional posture identifying method and device |
| US8437506B2 (en) * | 2010-09-07 | 2013-05-07 | Microsoft Corporation | System for fast, probabilistic skeletal tracking |
| US20150294143A1 (en) * | 2014-04-10 | 2015-10-15 | GM Global Technology Operations LLC | Vision based monitoring system for activity sequency validation |
| US10372228B2 (en) * | 2016-07-20 | 2019-08-06 | Usens, Inc. | Method and system for 3D hand skeleton tracking |
| CN107832708A (en) * | 2017-11-09 | 2018-03-23 | 云丁网络技术(北京)有限公司 | Human motion recognition method and device |
| CN107798313A (en) | 2017-11-22 | 2018-03-13 | 杨晓艳 | Human posture recognition method, device, terminal and storage medium |
| CN109344755B (en) | 2018-09-21 | 2024-02-13 | 广州市百果园信息技术有限公司 | Video action recognition method, device, equipment and storage medium |
| CN109558832B (en) * | 2018-11-27 | 2021-03-26 | 广州市百果园信息技术有限公司 | Human body posture detection method, device, equipment and storage medium |
- 2018-11-27: CN application CN201811427578.XA, published as CN109558832B; status: Active
- 2019-11-20: US application US17/297,882, published as US11908244B2; status: Active
- 2019-11-20: WO application PCT/CN2019/119633, published as WO2020108362A1; status: Ceased
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3203412A1 (en) * | 2016-02-05 | 2017-08-09 | Delphi Technologies, Inc. | System and method for detecting hand gestures in a 3d space |
| CN108875492A (en) * | 2017-10-11 | 2018-11-23 | 北京旷视科技有限公司 | Face detection and key point positioning method, device, system and storage medium |
| CN108875523A (en) * | 2017-12-28 | 2018-11-23 | 北京旷视科技有限公司 | Human joint point detection method, device, system and storage medium |
| CN108399367A (en) * | 2018-01-31 | 2018-08-14 | 深圳市阿西莫夫科技有限公司 | Hand motion recognition method, apparatus, computer device and readable storage medium |
| CN108710868A (en) * | 2018-06-05 | 2018-10-26 | 中国石油大学(华东) | A human body key point detection system and method based on complex scenes |
| CN108846365A (en) * | 2018-06-24 | 2018-11-20 | 深圳市中悦科技有限公司 | Method, device, storage medium and processor for detecting fighting behavior in video |
Cited By (40)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11908244B2 (en) | 2018-11-27 | 2024-02-20 | Bigo Technology Pte. Ltd. | Human posture detection utilizing posture reference maps |
| WO2020108362A1 (en) * | 2018-11-27 | 2020-06-04 | 广州市百果园信息技术有限公司 | Body posture detection method, apparatus and device, and storage medium |
| CN110197117A (en) * | 2019-04-18 | 2019-09-03 | 北京奇艺世纪科技有限公司 | Human body contour point extraction method, device, terminal device and computer readable storage medium |
| CN110197117B (en) * | 2019-04-18 | 2021-07-06 | 北京奇艺世纪科技有限公司 | Human body contour point extraction method and device, terminal equipment and computer readable storage medium |
| CN111950321B (en) * | 2019-05-14 | 2023-12-05 | 杭州海康威视数字技术股份有限公司 | Gait recognition method, device, computer equipment and storage medium |
| CN111950321A (en) * | 2019-05-14 | 2020-11-17 | 杭州海康威视数字技术股份有限公司 | Gait recognition method, device, computer equipment and storage medium |
| CN110163878A (en) * | 2019-05-28 | 2019-08-23 | 四川智盈科技有限公司 | Image semantic segmentation method based on dual multi-scale attention mechanism |
| CN112149477A (en) * | 2019-06-28 | 2020-12-29 | 北京地平线机器人技术研发有限公司 | Attitude estimation method, apparatus, medium, and device |
| CN111096835A (en) * | 2019-07-02 | 2020-05-05 | 武汉联影医疗科技有限公司 | An orthosis design method and system |
| CN110688888B (en) * | 2019-08-02 | 2022-08-05 | 杭州未名信科科技有限公司 | A method and system for pedestrian attribute recognition based on deep learning |
| CN110688888A (en) * | 2019-08-02 | 2020-01-14 | 浙江省北大信息技术高等研究院 | A method and system for pedestrian attribute recognition based on deep learning |
| CN112417927A (en) * | 2019-08-22 | 2021-02-26 | 北京奇虎科技有限公司 | Method for establishing human body gesture recognition model, human body gesture recognition method and device |
| CN110991235B (en) * | 2019-10-29 | 2023-09-01 | 京东科技信息技术有限公司 | A state monitoring method, device, electronic equipment and storage medium |
| CN110991235A (en) * | 2019-10-29 | 2020-04-10 | 北京海益同展信息科技有限公司 | A state monitoring method, device, electronic device and storage medium |
| CN111008573B (en) * | 2019-11-15 | 2024-04-26 | 广东智媒云图科技股份有限公司 | Limb structure generation method and device, terminal equipment and readable storage medium |
| CN111008573A (en) * | 2019-11-15 | 2020-04-14 | 广东智媒云图科技股份有限公司 | Limb structure generation method and device, terminal equipment and readable storage medium |
| CN111311714A (en) * | 2020-03-31 | 2020-06-19 | 北京慧夜科技有限公司 | Attitude prediction method and system for three-dimensional animation |
| WO2022002032A1 (en) * | 2020-06-29 | 2022-01-06 | 北京灵汐科技有限公司 | Image-driven model training and image generation |
| CN112528842A (en) * | 2020-12-07 | 2021-03-19 | 北京嘀嘀无限科技发展有限公司 | Method, apparatus, device and storage medium for gesture detection |
| CN112488071B (en) * | 2020-12-21 | 2021-10-26 | 重庆紫光华山智安科技有限公司 | Method, device, electronic equipment and storage medium for extracting pedestrian features |
| CN112488071A (en) * | 2020-12-21 | 2021-03-12 | 重庆紫光华山智安科技有限公司 | Method, device, electronic equipment and storage medium for extracting pedestrian features |
| CN112861777B (en) * | 2021-03-05 | 2024-08-27 | 上海有个机器人有限公司 | Human body posture estimation method, electronic equipment and storage medium |
| CN113034580A (en) * | 2021-03-05 | 2021-06-25 | 北京字跳网络技术有限公司 | Image information detection method and device and electronic equipment |
| CN113034580B (en) * | 2021-03-05 | 2023-01-17 | 北京字跳网络技术有限公司 | Image information detection method, device and electronic equipment |
| CN112861777A (en) * | 2021-03-05 | 2021-05-28 | 上海有个机器人有限公司 | Human body posture estimation method, electronic device and storage medium |
| CN113033526A (en) * | 2021-05-27 | 2021-06-25 | 北京欧应信息技术有限公司 | Computer-implemented method, electronic device and computer program product |
| CN113077383B (en) * | 2021-06-07 | 2021-11-02 | 深圳追一科技有限公司 | Model training method and model training device |
| CN113077383A (en) * | 2021-06-07 | 2021-07-06 | 深圳追一科技有限公司 | Model training method and model training device |
| CN113515143A (en) * | 2021-06-30 | 2021-10-19 | 深圳市优必选科技股份有限公司 | Robot navigation method, robot and computer readable storage medium |
| KR102623109B1 (en) | 2021-09-10 | 2024-01-10 | 중앙대학교 산학협력단 | 3D medical image analysis system and method using convolutional neural network model |
| KR20230038029A (en) * | 2021-09-10 | 2023-03-17 | 중앙대학교 산학협력단 | 3D medical image analysis system and method using convolutional neural network model |
| CN114782981B (en) * | 2022-03-07 | 2025-04-29 | 奥比中光科技集团股份有限公司 | Human body posture estimation model, model training method and human body posture estimation method |
| CN114782981A (en) * | 2022-03-07 | 2022-07-22 | 奥比中光科技集团股份有限公司 | Human body posture estimation model, model training method and human body posture estimation method |
| CN116934848A (en) * | 2022-03-31 | 2023-10-24 | 腾讯科技(深圳)有限公司 | Data processing methods, devices, equipment and media |
| CN117078895A (en) * | 2022-05-09 | 2023-11-17 | 宏达国际电子股份有限公司 | Virtual reality system and control method |
| CN115376203A (en) * | 2022-07-20 | 2022-11-22 | 华为技术有限公司 | Data processing method and device |
| CN115497171A (en) * | 2022-10-31 | 2022-12-20 | 华南农业大学 | Human behavior recognition method and system based on deep learning |
| CN116843003A (en) * | 2023-06-29 | 2023-10-03 | 京东方科技集团股份有限公司 | Management methods, devices, equipment and storage media for embedded devices |
| CN116912951B (en) * | 2023-09-13 | 2023-12-22 | 华南理工大学 | Human body posture assessment method and device |
| CN116912951A (en) * | 2023-09-13 | 2023-10-20 | 华南理工大学 | Human body posture assessment method and device |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2020108362A1 (en) | 2020-06-04 |
| US20220004744A1 (en) | 2022-01-06 |
| CN109558832B (en) | 2021-03-26 |
| US11908244B2 (en) | 2024-02-20 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN109558832A (en) | | Human body attitude detection method, device, equipment and storage medium |
| US12488624B2 (en) | | Method and apparatus for extracting biological features, device, medium, and program product |
| CN111931764B (en) | | A target detection method, target detection framework and related equipment |
| CN111160269A (en) | | A method and device for detecting facial key points |
| CN112966574A (en) | | Human body three-dimensional key point prediction method and device and electronic equipment |
| CN115546461B (en) | | A face attribute editing method based on mask denoising and feature selection |
| CN110032925A (en) | | Gesture image segmentation and recognition method based on improved capsule network and algorithm |
| CN114550169B (en) | | Training method, device, equipment and medium for cell classification model |
| CN113538254B (en) | | Image restoration method, device, electronic equipment and computer readable storage medium |
| CN118093840B (en) | | Visual question-answering method, device, equipment and storage medium |
| CN110399888A (en) | | A Go Referee System Based on MLP Neural Network and Computer Vision |
| JP2023131117A (en) | | Training of coupled sensing models, coupled sensing methods, devices, equipment and media |
| CN109255382A (en) | | Neural network system, method and device for image matching and positioning |
| CN116343287A (en) | | Facial expression recognition, model training method, device, equipment and storage medium |
| CN119723044B (en) | | A camouflaged target detection method, system, device and medium |
| Chen et al. | | SRCBTFusion-Net: An efficient fusion architecture via stacked residual convolution blocks and transformer for remote sensing image semantic segmentation |
| CN114862716B (en) | | Image enhancement method, device, equipment and storage medium for face image |
| CN114283152A (en) | | Image processing method, image processing model training method, image processing device, image processing equipment and image processing medium |
| CN120451145A (en) | | Skin lesion region segmentation method based on Mamba |
| CN116959125A (en) | | A data processing method and related devices |
| CN111539434A (en) | | Similarity-based infrared weak and small target detection method |
| CN118762394B (en) | | Line of sight estimation method |
| CN113936292A (en) | | Skin detection method, device, equipment and medium |
| CN113822903A (en) | | Segmentation model training method, image processing method, device, equipment and medium |
| CN116542292B (en) | | Training methods, devices, equipment and storage media for image generation models |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | TR01 | Transfer of patent right | Effective date of registration: 20220610; Address after: 31a, 15 / F, building 30, maple mall, bangrang Road, Brazil, Singapore; Patentee after: Baiguoyuan Technology (Singapore) Co.,Ltd.; Address before: 511442 23-39 / F, building B-1, Wanda Plaza North, Wanbo business district, 79 Wanbo 2nd Road, Nancun Town, Panyu District, Guangzhou City, Guangdong Province; Patentee before: GUANGZHOU BAIGUOYUAN INFORMATION TECHNOLOGY Co.,Ltd. |