CN121032506A

CN121032506A - A method, device, storage medium, and program product for identifying transaction risks.

Info

Publication number: CN121032506A
Application number: CN202511146500.0A
Authority: CN
Inventors: 袁国韬
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2025-08-15
Filing date: 2025-08-15
Publication date: 2025-11-28

Abstract

This application discloses a transaction risk identification method, device, storage medium, and program product, relating to the fields of information security and artificial intelligence. The method includes: collecting a user's facial image sequence, a first voice sequence, and a behavioral data sequence in a target application within a preset time period; performing expression recognition on each frame of the facial image sequence to obtain a target expression type sequence; performing emotion recognition on the first voice sequence to obtain a target emotion type; and performing behavior recognition on the behavioral data sequence to obtain a target behavior type; performing risk analysis on the target expression type sequence, target emotion type, and target behavior type to obtain a transaction risk value; and determining the transaction risk level of the user's transactions through the target application within the preset time period based on the transaction risk value. This method can identify risky transactions from multiple dimensions, especially risky transactions by special groups in special risk scenarios, thereby improving the accuracy of transaction risk identification.

Description

Transaction risk identification method, device, storage medium and program product

Technical Field

The embodiment of the application relates to the fields of information security and artificial intelligence, in particular to a transaction risk identification method, equipment, a storage medium and a program product.

Background

At present, the security of financial transactions that users conduct through channels such as mobile banking or online banking is ensured by means such as SMS verification codes or face recognition.

However, when special groups (such as elderly and visually impaired users) conduct financial transactions through channels such as mobile banking or online banking, they face risk scenarios such as identity impersonation (for example, remote control of the user's device) and irrational decisions (for example, being induced by promotional rhetoric). Existing single-dimension identification modes, such as SMS verification codes or face recognition, cannot accurately identify the risky transactions of special groups in these special risk scenarios.

Disclosure of Invention

The embodiment of the application provides a transaction risk identification method, a device, a storage medium and a program product, which realize a transaction risk identification function and solve the problem that the risk transaction of a special crowd in special risk scenes such as fraudulent use of identity information and irrational decision cannot be accurately identified in the prior art.

In a first aspect, an embodiment of the present application provides a transaction risk identification method, where the method includes:

acquiring a facial image sequence, a first voice sequence and a behavior data sequence of a user in a target application within a preset duration;

performing expression recognition on each frame of facial image in the facial image sequence to obtain a target expression type sequence;

performing emotion recognition on the first voice sequence to obtain a target emotion type, and performing behavior recognition on the behavior data sequence to obtain a target behavior type;

performing risk analysis on the target expression type sequence, the target emotion type and the target behavior type to obtain a transaction risk value, and determining, based on the transaction risk value, the transaction risk level of the transaction conducted by the user through the target application within the preset duration.

In the embodiment of the application, a facial image sequence, a first voice sequence and a behavior data sequence of a user in a target application within a preset duration are acquired; expression recognition is performed on each frame of facial image in the facial image sequence to obtain a target expression type sequence; emotion recognition is performed on the first voice sequence to obtain a target emotion type; behavior recognition is performed on the behavior data sequence to obtain a target behavior type; risk analysis is performed on the target expression type sequence, the target emotion type and the target behavior type to obtain a transaction risk value; and the transaction risk level of the transaction conducted by the user through the target application within the preset duration is determined based on the transaction risk value. Because risk analysis is based on the recognized target expression type sequence, target emotion type and target behavior type, risk identification can be carried out on the user's transactions in the target application from multiple dimensions (namely expression, emotion and behavior). Compared with a risk identification mode that relies on a single dimension (such as an SMS verification code or face recognition), multi-dimensional risk identification is more accurate, so risky transactions of the user in the target application can be accurately identified, in particular risky transactions of special groups in special risk scenarios such as identity impersonation, irrational decisions and operations against the user's own will. This improves the accuracy of transaction risk identification for the user and improves the user experience.

In a second aspect, an embodiment of the present application provides a transaction risk identification device, including:

an acquisition module, used for acquiring a facial image sequence, a first voice sequence and a behavior data sequence of a user in a target application within a preset duration;

a first recognition module, used for performing expression recognition on each frame of facial image in the facial image sequence to obtain a target expression type sequence;

a second recognition module, used for performing emotion recognition on the first voice sequence to obtain a target emotion type, and performing behavior recognition on the behavior data sequence to obtain a target behavior type;

a determining module, used for performing risk analysis on the target expression type sequence, the target emotion type and the target behavior type to obtain a transaction risk value, and determining, based on the transaction risk value, the transaction risk level of the transaction conducted by the user through the target application within the preset duration.

In a third aspect, an embodiment of the present application provides an electronic device, including at least one processor, and a memory communicatively coupled to the at least one processor, where the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the transaction risk identification method of any of the embodiments of the present application.

In a fourth aspect, embodiments of the present application provide a computer readable storage medium having stored thereon a computer program which when executed by a processor implements a transaction risk identification method according to any of the embodiments of the present application.

In a fifth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements a transaction risk identification method as in any of the embodiments of the present application.

The descriptions of the second aspect, the third aspect, the fourth aspect and the fifth aspect of the present application may refer to the detailed descriptions of the first aspect, and the advantages described in the second aspect, the third aspect, the fourth aspect and the fifth aspect may refer to the analysis of the advantages described in the first aspect, which is not repeated herein.

In the present application, the names of the transaction risk recognition devices described above do not constitute limitations on the devices or function modules themselves, and in actual implementations, these devices or function modules may appear under other names. Insofar as the function of each device or function module is similar to that of the present application, it falls within the scope of the claims of the present application and the equivalents thereof.

These and other aspects of the application will be more readily apparent from the following description.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a transaction risk identification method according to an embodiment of the present application;

FIG. 2 is another schematic flow chart of a transaction risk identification method according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a transaction risk identification device according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.

It should be noted that the terms "first," "second," "target," and "original," etc. in the description and claims of the present application and the above-described drawings are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be capable of executing sequences other than those illustrated or otherwise described. Furthermore, the terms "comprises," "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.

Fig. 1 is a schematic flow chart of a transaction risk identification method according to an embodiment of the present application, where the embodiment may be applied to a scenario in which risk identification is required for a transaction of a user in a target application. The transaction risk identification method provided by the embodiment of the application can be executed by the transaction risk identification device provided by the embodiment of the application, and the device can be realized in a software and/or hardware mode. In a specific embodiment, the transaction risk recognition device may be integrated in an electronic device, for example, a smart phone or a personal computer. The execution body for executing the method can be an electronic device. Referring to fig. 1, the transaction risk identification method of the present embodiment includes, but is not limited to, the following steps:

S110, acquiring a facial image sequence, a first voice sequence and a behavior data sequence of a user in a target application within a preset duration.

The preset duration is a preset time length for collecting data, such as 1 minute.

The facial image sequence is a sequence formed by recording, in chronological order, the facial images of the user acquired by a device such as a camera within the preset duration. The first voice sequence is a sequence formed by recording, in chronological order, the voice signals of the user acquired by a device such as a microphone within the preset duration. The target application is an application through which the user conducts financial transactions; by way of example, it may be a mobile banking application. The behavior data sequence is a sequence formed by recording, in chronological order, the operation behaviors performed by the user in the target application within the preset duration; by way of example, the behavior data may include touch area, touch pressure, movement speed and the like.
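The three sequences defined above can be pictured as simple time-ordered containers. The following is a minimal sketch only: the class names `BehaviorSample` and `CollectedData`, the field names and the units are assumptions made for illustration and do not appear in the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BehaviorSample:
    """One touch reading; the fields mirror the examples named in the text."""
    timestamp_s: float     # seconds since collection started (assumed convention)
    touch_area: float      # unit is an assumption, e.g. mm^2
    touch_pressure: float  # unit is an assumption, e.g. normalized 0..1
    move_speed: float      # unit is an assumption, e.g. px/s


@dataclass
class CollectedData:
    """Data gathered for one user within the preset duration."""
    preset_duration_s: float
    facial_frames: List[bytes] = field(default_factory=list)  # facial image sequence
    voice_chunks: List[bytes] = field(default_factory=list)   # first voice sequence
    behavior: List[BehaviorSample] = field(default_factory=list)

    def add_behavior(self, sample: BehaviorSample) -> bool:
        """Store a sample if it falls inside the preset duration."""
        if not 0.0 <= sample.timestamp_s <= self.preset_duration_s:
            return False
        self.behavior.append(sample)
        # Keep the sequence in chronological order, as the text requires.
        self.behavior.sort(key=lambda s: s.timestamp_s)
        return True
```

With a preset duration of 60 seconds (the 1 minute example above), a sample stamped at 61 seconds would be rejected, and samples that arrive out of order are stored chronologically.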

Specifically, when special groups (such as elderly and visually impaired users) use the target application to conduct financial transactions, a number of special risk scenarios may arise, such as identity impersonation, irrational decisions and operations against the user's own will. In these special risk scenarios, the user's expression, emotion and behavior may show anomalies while transacting through the target application. For example, the operator may be tense when operating the target application in an identity impersonation scenario, and may hesitate when operating the target application in an irrational decision scenario or in a scenario of operation against the user's own will.

When it is detected that the user is operating the target application, and on condition that the user has authorized data acquisition by the camera, the microphone and the touch sensor, the camera, the microphone and the touch sensor of the electronic device may be activated to acquire user data. Starting from the moment at which the user's operation of the target application is detected, the camera acquires facial images of the user within the preset duration at a first preset acquisition frequency to obtain the facial image sequence; the microphone acquires voice signals of the user within the preset duration at a second preset acquisition frequency to obtain the first voice sequence; and the touch sensor acquires behavior data of the user within the preset duration, such as touch area, touch pressure and movement speed, at a third preset acquisition frequency to obtain the behavior data sequence. The first preset acquisition frequency is the acquisition frequency preset for the camera, the second preset acquisition frequency is the acquisition frequency preset for the microphone, and the third preset acquisition frequency is the acquisition frequency preset for the touch sensor.
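As an illustration of collecting at three separate preset acquisition frequencies only after authorization, the sketch below computes the sampling timestamps for each sensor over the preset duration. The function names and the example frequencies are hypothetical; the application does not state concrete values or an API.

```python
from typing import Dict, List, Optional


def sample_times(duration_s: float, frequency_hz: float) -> List[float]:
    """Timestamps, in seconds from the start moment, for one sensor."""
    if duration_s <= 0 or frequency_hz <= 0:
        raise ValueError("duration and frequency must be positive")
    # Small epsilon guards against float products such as 2.9999999999.
    count = int(duration_s * frequency_hz + 1e-9)
    return [i / frequency_hz for i in range(count)]


def plan_collection(
    authorized: bool,
    duration_s: float,
    camera_hz: float,  # first preset acquisition frequency
    mic_hz: float,     # second preset acquisition frequency
    touch_hz: float,   # third preset acquisition frequency
) -> Optional[Dict[str, List[float]]]:
    """Return per-sensor sampling times, or None when the user has not authorized."""
    if not authorized:
        return None  # sensors stay inactive without authorization
    return {
        "camera": sample_times(duration_s, camera_hz),
        "microphone": sample_times(duration_s, mic_hz),
        "touch_sensor": sample_times(duration_s, touch_hz),
    }
```

For example, `plan_collection(True, 60.0, 2.0, 10.0, 20.0)` would schedule 120 camera frames, 600 voice samples and 1200 touch readings; these frequencies are illustrative, not taken from the application.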

S120, carrying out expression recognition on each frame of facial image in the facial image sequence to obtain a target expression type sequence.

The target expression type sequence is an expression type sequence of a user within a preset time length, namely a sequence formed by expression types corresponding to each frame of facial images in the facial image sequence.

Specifically, each frame of facial image in the facial image sequence may first be preprocessed, for example by denoising and contrast enhancement. Expression recognition is then performed on the current frame of the preprocessed facial image sequence: the current frame may be input into a pre-trained expression recognition model, which uses its learned model parameters to extract features from the current frame to obtain facial key point features, and performs expression recognition on the facial key point features to obtain a first recognition result. The first recognition result is determined as the expression type corresponding to the current frame, for example a combined expression type of raised inner brow and tensed eyelids, or a combined expression type of drooping mouth corners and flared nostrils. Each frame in the preprocessed facial image sequence is traversed and the above expression recognition process is repeated to obtain the expression type corresponding to each frame, and the expression types are combined in chronological order to obtain the target expression type sequence. The expression recognition model is a pre-trained model used for performing expression recognition on facial images.

Optionally, the training process of the expression recognition model comprises: on condition that user authorization is obtained, collecting a plurality of facial images of each user and denoting them as sample facial images; annotating each sample facial image with an expression type to obtain the expression type label of the corresponding sample facial image, and constructing a training set accordingly; and inputting each sample facial image into an initial expression recognition model and using the corresponding expression type label to supervise the training output of the initial expression recognition model, that is, training the initial expression recognition model with the goal of minimizing the loss value between the training output and the corresponding expression type label, to obtain an optimal expression recognition model. The initial expression recognition model is a model framework that has not yet been trained and is used to learn the relation between facial images and expression types so as to realize the expression recognition function. Training the initial expression recognition model on a large number of sample facial images and corresponding expression type labels can improve the recognition accuracy of the expression recognition model.
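The per-frame flow described above (preprocess each frame, classify it, then join the results in chronological order) might be sketched as follows. This is an assumption-laden illustration: `preprocess` uses a simple contrast stretch as a stand-in for denoising and contrast enhancement, `toy_model` is a placeholder for the pre-trained expression recognition model, and the labels "tense" and "neutral" are hypothetical.

```python
from typing import Callable, List, Sequence

Frame = List[List[float]]  # grayscale image as rows of pixel intensities


def preprocess(frame: Frame) -> Frame:
    """Stretch intensities to [0, 1]; a stand-in for the preprocessing step."""
    flat = [p for row in frame for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in frame]
    return [[(p - lo) / (hi - lo) for p in row] for row in frame]


def recognize_sequence(
    frames: Sequence[Frame], model: Callable[[Frame], str]
) -> List[str]:
    """Classify every frame in order; the result is the expression type sequence."""
    return [model(preprocess(frame)) for frame in frames]


def toy_model(frame: Frame) -> str:
    """Placeholder classifier, NOT the model described in the application."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return "tense" if mean > 0.5 else "neutral"
```

Because the input frames are already time-ordered, the list comprehension preserves chronological order without a separate combining step.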
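The supervised training described above (feed each labelled sample to an untrained model and minimize the loss between its output and the label) can be illustrated with a very small stand-in. The sketch below trains a softmax classifier by gradient descent on cross-entropy loss over plain feature vectors; the model family, the loss function, the hyperparameters and the names `train` and `predict` are all assumptions, since the application does not specify them.

```python
import math
import random
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def _scores(w: List[List[float]], b: List[float], x: List[float]) -> List[float]:
    return [sum(wi * xi for wi, xi in zip(w[k], x)) + b[k] for k in range(len(b))]


def train(
    samples: List[List[float]],
    labels: List[int],
    num_classes: int,
    epochs: int = 200,
    lr: float = 0.5,
    seed: int = 0,
) -> Tuple[List[List[float]], List[float], List[float]]:
    """Return weights, biases and the mean cross-entropy loss per epoch."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # The "initial model": small random weights, nothing learned yet.
    w = [[rng.uniform(-0.01, 0.01) for _ in range(dim)] for _ in range(num_classes)]
    b = [0.0] * num_classes
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in zip(samples, labels):
            p = softmax(_scores(w, b, x))
            total += -math.log(max(p[y], 1e-12))  # loss against the label
            for k in range(num_classes):
                grad = p[k] - (1.0 if k == y else 0.0)
                for j in range(dim):
                    w[k][j] -= lr * grad * x[j]
                b[k] -= lr * grad
        losses.append(total / len(samples))
    return w, b, losses


def predict(w: List[List[float]], b: List[float], x: List[float]) -> int:
    scores = _scores(w, b, x)
    return max(range(len(scores)), key=scores.__getitem__)
```

In practice the features would come from facial images and the labels from annotated expression types; here any numeric feature vectors with integer class labels will do.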

S130, carrying out emotion recognition on the first voice sequence to obtain a target emotion type, and carrying out behavior recognition on the behavior data sequence to obtain the target behavior type.

The target emotion type is the emotion type of the user in a preset time period. The target behavior type is the behavior type of a user operating the target application within a preset time period.

Specifically, the first voice sequence may be preprocessed, for example by denoising, and emotion recognition is then performed on the preprocessed first voice sequence. For example, the first voice sequence may be input into a pre-trained emotion recognition model, which uses its learned model parameters to perform emotion analysis on the voice signal at each moment in the first voice sequence to obtain a second recognition result, and the second recognition result is determined as the target emotion type. The emotion recognition model is a pre-trained neural network model used for performing emotion recognition on voice signals.

Then, the behavior data sequence may be preprocessed, for example by denoising, and behavior recognition is performed on the preprocessed behavior data sequence. For example, the behavior data sequence may be input into a pre-trained behavior recognition model, which uses its learned model parameters to perform behavior analysis on the behavior data at each moment in the behavior data sequence to obtain a third recognition result, and the third recognition result is determined as the target behavior type. The behavior recognition model is a pre-trained neural network model used for performing behavior recognition on behavior data.

Optionally, the training process of the emotion recognition model comprises: on condition that user authorization is obtained, collecting a plurality of voice sequences of each user and denoting them as sample voice sequences; annotating each sample voice sequence with an emotion type to obtain the emotion type label of the corresponding sample voice sequence, and constructing a training set accordingly; and inputting each sample voice sequence into an initial emotion recognition model and using the corresponding emotion type label to supervise the training output of the initial emotion recognition model, that is, training the initial emotion recognition model with the goal of minimizing the loss value between the training output and the corresponding emotion type label, to obtain an optimal emotion recognition model, which improves the recognition accuracy of the emotion recognition model. The initial emotion recognition model is a model framework that has not yet been trained and is used to learn the relation between voice sequences and emotion types so as to realize the emotion recognition function.

Optionally, the training process of the behavior recognition model comprises: on condition that user authorization is obtained, collecting a plurality of behavior data sequences of each user in the target application and denoting them as sample behavior data sequences; annotating each sample behavior data sequence with a behavior type to obtain the behavior type label of the corresponding sample behavior data sequence, and constructing a training set accordingly; and inputting each sample behavior data sequence into an initial behavior recognition model and using the corresponding behavior type label to supervise the training output of the initial behavior recognition model, that is, training the initial behavior recognition model with the goal of minimizing the loss value between the training output and the corresponding behavior type label, to obtain an optimal behavior recognition model, which improves the recognition accuracy of the behavior recognition model. The initial behavior recognition model is a model framework that has not yet been trained and is used to learn the relation between behavior data sequences and behavior types so as to realize the behavior recognition function.

S140, performing risk analysis on the target expression type sequence, the target emotion type and the target behavior type to obtain a transaction risk value, and determining, based on the transaction risk value, the transaction risk level of the transaction conducted by the user through the target application within the preset duration.

The transaction risk value is a value obtained after risk analysis is carried out on the expression, emotion and behavior of the user in the target application and is used for representing the risk degree of the user in transaction through the target application within a preset duration. The transaction risk level is the risk level corresponding to the transaction risk value.

Specifically, risk analysis may be performed on the target expression type sequence, the target emotion type and the target behavior type. In one implementation, a preset expression risk relation is queried based on the expression type corresponding to each frame of facial image in the target expression type sequence to obtain the risk value of the corresponding frame, and the average of the risk values of all frames is calculated to obtain the expression risk value corresponding to the target expression type sequence. The preset expression risk relation comprises the correspondence between expression types and risk values, and the expression risk value characterizes the degree of transaction risk corresponding to the target expression type sequence. A preset emotion risk relation is queried based on the target emotion type to obtain the emotion risk value corresponding to the target emotion type; the preset emotion risk relation comprises the correspondence between emotion types and risk values. A preset behavior risk relation is queried based on the target behavior type to obtain the behavior risk value corresponding to the target behavior type; the preset behavior risk relation comprises the correspondence between behavior types and risk values. A weighted calculation is then performed on the expression risk value, the emotion risk value and the behavior risk value to obtain the transaction risk value.
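The lookup-and-weight implementation described above can be sketched as follows, under stated assumptions: the three lookup tables, their labels and risk values, and the default weights are invented for illustration, and the weighted calculation is taken here to be a weighted sum, which the text does not confirm.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical preset risk relations (type -> risk value); illustrative only.
EXPRESSION_RISK: Dict[str, float] = {"neutral": 0.1, "tense": 0.8, "hesitant": 0.6}
EMOTION_RISK: Dict[str, float] = {"calm": 0.1, "anxious": 0.7}
BEHAVIOR_RISK: Dict[str, float] = {"normal": 0.1, "erratic": 0.9}


def transaction_risk_value(
    expression_seq: Sequence[str],
    emotion: str,
    behavior: str,
    weights: Tuple[float, float, float] = (0.4, 0.3, 0.3),  # assumed weights
) -> float:
    """Combine the three per-dimension risk values into one transaction risk value."""
    if not expression_seq:
        raise ValueError("expression sequence must not be empty")
    # Per-frame lookup, then the average over all frames.
    expression_risk = sum(EXPRESSION_RISK[e] for e in expression_seq) / len(expression_seq)
    emotion_risk = EMOTION_RISK[emotion]
    behavior_risk = BEHAVIOR_RISK[behavior]
    w_expr, w_emo, w_beh = weights
    # Weighted sum is an assumption about the "weighted calculation".
    return w_expr * expression_risk + w_emo * emotion_risk + w_beh * behavior_risk
```

With the sample tables, the sequence `["neutral", "tense", "tense", "hesitant"]` averages to 0.575; combined with "anxious" (0.7) and "erratic" (0.9) under the assumed weights this gives 0.4 × 0.575 + 0.3 × 0.7 + 0.3 × 0.9 = 0.71.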

In another implementation manner, the target expression type sequence, the target emotion type and the target behavior type can be input into a pre-trained risk prediction model, and at the moment, the risk prediction model can utilize the learned model parameters to perform risk analysis on the expression type, the target emotion type and the target behavior type corresponding to each frame in the target expression type sequence to obtain an output result, and the output result is determined to be a transaction risk value, wherein the risk prediction model is a pre-trained decision tree model and is used for performing risk analysis on the expression type, the emotion type and the behavior type.

Then, a preset risk level relation is queried based on the transaction risk value to obtain the risk level corresponding to the transaction risk value, and this risk level is determined as the transaction risk level, thereby determining the transaction risk level of the transaction conducted by the user through the target application within the preset duration. The preset risk level relation comprises the correspondence between risk values and risk levels.

Optionally, the training process of the risk prediction model comprises: on condition that user authorization is obtained, collecting the sample expression types, sample emotion types and sample behavior types observed when users conduct transactions through the target application, and recording the sample expression type, sample emotion type and sample behavior type of the same time as one piece of sample data; annotating each piece of sample data with a transaction risk value to obtain the transaction risk value label of the corresponding sample data, and constructing a training set accordingly; and inputting each piece of sample data into an initial risk prediction model and using the corresponding transaction risk value label to supervise the training output of the initial risk prediction model, that is, training the initial risk prediction model with the goal of minimizing the loss value between the training output and the corresponding transaction risk value label, to obtain an optimal risk prediction model, which improves the recognition accuracy of the risk prediction model. The initial risk prediction model is a model framework that has not yet been trained and is used to learn the relation among expression types, emotion types, behavior types and transaction risk values so as to realize the risk prediction function.
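One plausible form of the preset risk level relation is a set of value ranges, each mapped to a level. The sketch below assumes three levels with boundaries at 0.3 and 0.6; the boundaries, the level names and the choice to treat a boundary value as belonging to the higher level are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical preset risk level relation: values below 0.3 are "low",
# values from 0.3 up to (but excluding) 0.6 are "medium", the rest are "high".
LEVEL_BOUNDS = [0.3, 0.6]
LEVELS = ["low", "medium", "high"]


def risk_level(risk_value: float) -> str:
    """Map a transaction risk value to its transaction risk level."""
    # bisect_right counts how many boundaries are <= risk_value,
    # which is exactly the index of the matching level.
    return LEVELS[bisect_right(LEVEL_BOUNDS, risk_value)]
```

Under these assumed boundaries, a transaction risk value of 0.71 would map to "high" and 0.1 to "low".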

According to the technical scheme of this embodiment, a facial image sequence, a first voice sequence and a behavior data sequence of a user in a target application within a preset duration are acquired; expression recognition is performed on each frame of facial image in the facial image sequence to obtain a target expression type sequence; emotion recognition is performed on the first voice sequence to obtain a target emotion type; behavior recognition is performed on the behavior data sequence to obtain a target behavior type; risk analysis is performed on the target expression type sequence, the target emotion type and the target behavior type to obtain a transaction risk value; and the transaction risk level of the transaction conducted by the user through the target application within the preset duration is determined based on the transaction risk value. Because risk analysis is based on the recognized target expression type sequence, target emotion type and target behavior type, risk identification can be carried out on the user's transactions in the target application from multiple dimensions (namely expression, emotion and behavior). Compared with a risk identification mode that relies on a single dimension (such as an SMS verification code or face recognition), multi-dimensional risk identification is more accurate, so risky transactions of the user in the target application can be accurately identified, in particular risky transactions of special groups in special risk scenarios such as identity impersonation, irrational decisions and operations against the user's own will. This improves the accuracy of transaction risk identification for the user and improves the user experience.

The following further describes a transaction risk identification method provided by the embodiment of the present application, and fig. 2 is another flow chart of the transaction risk identification method provided by the embodiment of the present application. The embodiments of the present application are optimized based on the above embodiments. Referring to fig. 2, the method of the present embodiment includes, but is not limited to, the following steps:

S201, when a risk triggering operation of a user in a target application is detected, collecting a facial image sequence, a first voice sequence and a behavior data sequence of the user in the target application within a preset duration.

Wherein the risk triggering operation is an operation behavior that may cause a transaction risk event or cause an exacerbation of a transaction risk condition.

Optionally, the risk triggering operation may include any of a transaction amount greater than a preset value threshold, a product risk level of the transaction product greater than a preset level threshold, and an operating frequency in the target application greater than a preset frequency threshold, where the preset value threshold is a preset value for characterizing a critical value of the transaction amount for which a transaction risk may exist, the preset level threshold is a preset product risk level for characterizing a critical value of the product risk level for which a transaction risk may exist, and the preset frequency threshold is a preset value for characterizing a critical value of the operating frequency for which a transaction risk may exist. By limiting the triggering conditions corresponding to the risk triggering operation, the risk operation of the user in the target application can be accurately and comprehensively detected, an accurate triggering basis is provided for subsequent data acquisition, and the comprehensiveness and reliability of transaction risk identification are further improved.

Specifically, while the user operates the target application, the transaction amount, the transaction product and the operation frequency of the user in the target application can be detected in real time, and the product risk level corresponding to the transaction product is obtained by querying a preset transaction product level relation based on the transaction product, where the preset transaction product level relation contains the correspondence between transaction products and product risk levels. If the transaction amount is greater than the preset numerical threshold, the product risk level of the transaction product is greater than the preset level threshold, or the operation frequency in the target application is greater than the preset frequency threshold, the operation of the user in the target application carries a transaction risk and is determined to be a risk triggering operation; otherwise, the operation is determined not to be a risk triggering operation.
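The trigger check described above can be sketched as follows. This is a minimal illustration only: the names `TriggerThresholds`, `PRODUCT_RISK_LEVELS` and `is_risk_trigger`, the product names, and every numeric value are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the "preset transaction product level relation":
# it maps a transaction product to its product risk level.
PRODUCT_RISK_LEVELS = {"deposit": 1, "bond_fund": 2, "equity_fund": 4}


@dataclass(frozen=True)
class TriggerThresholds:
    amount: float = 50_000.0       # preset numerical threshold (assumed value)
    product_level: int = 3         # preset level threshold (assumed value)
    ops_per_minute: float = 30.0   # preset frequency threshold (assumed value)


def is_risk_trigger(amount: float, product: str, ops_per_minute: float,
                    t: TriggerThresholds = TriggerThresholds()) -> bool:
    """Return True if any one of the three trigger conditions holds."""
    # Treating an unknown product as level 0 is an assumption of this sketch.
    level = PRODUCT_RISK_LEVELS.get(product, 0)
    return (amount > t.amount
            or level > t.product_level
            or ops_per_minute > t.ops_per_minute)
```

The comparisons are strict ("greater than"), matching the wording of the conditions, so a value exactly equal to a threshold does not trigger data acquisition in this sketch.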

When a risk triggering operation of the user in the target application is detected, and provided that the user has authorized data acquisition by the camera, the microphone and the touch sensor, the camera, the microphone and the touch sensor are used to collect the facial image sequence, the first voice sequence and the behavior data sequence of the user in the target application, starting from the moment at which the risk triggering operation is detected.

S202, carrying out expression recognition on each frame of facial image in the facial image sequence to obtain a target expression type sequence.

S203, carrying out emotion recognition on the first voice sequence to obtain a candidate emotion type.

The candidate emotion type is the emotion type corresponding to the first voice sequence.

Specifically, in one implementation, feature extraction may be performed on the first voice sequence to obtain a speech rate feature and a fundamental frequency fluctuation feature. The speech rate feature describes how fast the speech is in the time dimension, and the fundamental frequency fluctuation feature reflects the regularity and stability of the fundamental frequency of the voice signal as it changes over time. When the speech rate feature is greater than a preset speech rate threshold or the fundamental frequency fluctuation feature is greater than a preset fluctuation threshold, the candidate emotion type is determined to belong to a negative emotion type; when the speech rate feature is not greater than the preset speech rate threshold and the fundamental frequency fluctuation feature is not greater than the preset fluctuation threshold, the candidate emotion type is determined to belong to a non-negative emotion type. The preset speech rate threshold is a preset standard value for measuring the speech rate, and the preset fluctuation threshold is a preset standard value for measuring the magnitude of the fundamental frequency fluctuation feature; both are used to determine whether the user is in a negative emotion type. Negative emotion types may include tension, panic, anxiety, sadness, and the like. Using only the speech rate feature and the fundamental frequency fluctuation feature improves computational efficiency and reduces implementation complexity, which in turn improves the efficiency of determining the candidate emotion type.
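The threshold rule of this implementation can be sketched as follows. The application does not state how the two features are computed, so this sketch assumes syllables per second for the speech rate feature and the coefficient of variation of the fundamental frequency over voiced frames for the fluctuation feature; the function names and both threshold values are likewise hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence

SPEECH_RATE_THRESHOLD = 5.5       # syllables per second (assumed value)
F0_FLUCTUATION_THRESHOLD = 0.25   # coefficient of variation (assumed value)


def speech_rate(syllable_count: int, duration_s: float) -> float:
    """Speech rate feature, assumed here to be syllables per second."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return syllable_count / duration_s


def f0_fluctuation(f0_hz: Sequence[float]) -> float:
    """Fluctuation feature, assumed here to be std/mean of F0 over voiced frames."""
    voiced = [f for f in f0_hz if f > 0]   # F0 == 0 marks an unvoiced frame
    if len(voiced) < 2:
        return 0.0
    return pstdev(voiced) / mean(voiced)


def candidate_emotion(rate: float, fluctuation: float) -> str:
    """Negative if either feature exceeds its threshold, otherwise non-negative."""
    if rate > SPEECH_RATE_THRESHOLD or fluctuation > F0_FLUCTUATION_THRESHOLD:
        return "negative"
    return "non_negative"
```

A call such as `candidate_emotion(speech_rate(12, 2.0), f0_fluctuation(track))` would then yield the coarse candidate emotion type for one voice sequence, where `track` is a per-frame F0 estimate supplied by whatever pitch tracker is in use.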

In another implementation manner, a pre-trained emotion recognition model may be utilized to recognize the emotion of the user in the preset duration based on the first voice sequence, so as to obtain the candidate emotion type. The emotion recognition model can accurately recognize the candidate emotion types, so that the determination accuracy of the candidate emotion types is improved.

S204, determining whether the candidate emotion type belongs to a negative emotion type.

Specifically, when the candidate emotion type belongs to a negative emotion type, the transaction of the user in the target application is more likely to be risky, and the emotion of the user needs to be further identified dynamically; S205 is then executed. When the candidate emotion type belongs to a non-negative emotion type, the transaction is less likely to be risky, and no further dynamic identification is needed; S207 is then executed.

S205, when the candidate emotion type belongs to the negative emotion type, generating a verification text, and collecting a voice sequence of the user aiming at the verification text to obtain a second voice sequence.

The verification text is a randomly generated text and is used for dynamically verifying whether the emotion of the user belongs to a negative emotion type, and the second voice sequence is a voice sequence formed by voice signals when the user reads the verification text.

Specifically, when the candidate emotion type belongs to the negative emotion type, a verification text may be randomly generated, for example "confirm transfer to X, verification code 8899", and displayed on a display screen of the electronic device. At the same time, a prompt message is generated and displayed to remind the user to read the verification text aloud. A microphone is then used to collect the voice sequence of the user for the verification text at a second preset collection frequency to obtain the second voice sequence, after which S206 is executed.
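One way to produce such a randomly generated verification text is sketched below. The sentence template follows the example given above; the function name `make_verification_text`, the four-digit default and the use of Python's `secrets` module are assumptions of this sketch rather than details of the application.

```python
import secrets


def make_verification_text(payee: str, digits: int = 4) -> str:
    """Build a verification sentence containing a fresh random numeric code."""
    # secrets.choice draws from a cryptographically strong random source,
    # so the code cannot be predicted from earlier prompts.
    code = "".join(secrets.choice("0123456789") for _ in range(digits))
    return f"confirm transfer to {payee}, verification code {code}"
```

Because the code changes on every call, a pre-recorded reading of an earlier prompt would not match the text currently displayed.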

S206, carrying out emotion recognition on the second voice sequence to obtain a target emotion type.

Specifically, the emotion of the user is recognized based on the second voice sequence by using the pre-trained emotion recognition model to obtain the target emotion type, and then S208 is executed. The target emotion type can be accurately identified through the emotion identification model, and further the accuracy of determining the target emotion type is improved.

S207, determining the candidate emotion type as a target emotion type when the candidate emotion type belongs to a non-negative emotion type.

Specifically, after the candidate emotion type is determined as the target emotion type, S208 is performed.

It should be noted that, since the transaction risk degree represented by the non-negative emotion type is low, when the candidate emotion type belongs to the non-negative emotion type, it is only necessary to directly determine that the target emotion type belongs to the non-negative emotion type, and it is not necessary to determine which non-negative emotion type the target emotion type belongs to in a fine granularity.
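The branching of S203 to S207 can be sketched as follows. The recognizer and the step that displays the verification text and records the user's reading are passed in as callables, because their internals are outside this sketch; the names `resolve_target_emotion`, `recognize`, `collect_reading` and `NEGATIVE_LABELS` are hypothetical.

```python
from typing import Callable, Sequence

Voice = Sequence[float]

# Labels treated as negative. The four named types are the examples given in
# the text; "negative" covers the coarse label of the threshold-based variant.
NEGATIVE_LABELS = {"negative", "tension", "panic", "anxiety", "sadness"}


def resolve_target_emotion(
    first_voice: Voice,
    recognize: Callable[[Voice], str],
    collect_reading: Callable[[], Voice],
) -> str:
    """Return the target emotion type for one detection window."""
    candidate = recognize(first_voice)      # S203: candidate emotion type
    if candidate in NEGATIVE_LABELS:        # S204: negative or not?
        second_voice = collect_reading()    # S205: user reads verification text
        return recognize(second_voice)      # S206: re-recognize on second voice
    return candidate                        # S207: candidate becomes target
```

The second recognition runs only on the negative branch, so users whose first voice sequence is non-negative are never prompted to read a verification text.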

S208, performing behavior recognition on the behavior data sequence to obtain a target behavior type.

S209, risk analysis is carried out on the target expression type sequence, the target emotion type and the target behavior type to obtain a transaction risk value, and the transaction risk level of the user for carrying out transaction through the target application in the preset duration is determined based on the transaction risk value.

S210, inquiring a preset mapping relation based on the transaction risk level to obtain a transaction protection operation corresponding to the transaction risk level.

The preset mapping relation can comprise a corresponding relation between the risk level and the protection operation, wherein the protection operation is used for intervening the transaction of the user in the target application so as to ensure the security of the transaction.

Specifically, the preset mapping relationship can be queried based on the transaction risk level to obtain the guard operation corresponding to the transaction risk level, and the guard operation is determined to be the transaction guard operation.
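The lookup in S210 amounts to a table query, which can be sketched as follows. The level names, the assumption that a higher number means higher risk, and the four protection operations are placeholders invented for this sketch; the application only requires that each risk level correspond to a protection operation.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    # Hypothetical levels; higher number = higher risk is an assumption.
    FIRST = 1
    SECOND = 2
    THIRD = 3
    FOURTH = 4


# Hypothetical "preset mapping relation" between risk level and protection operation.
GUARD_OPERATIONS = {
    RiskLevel.FIRST: "allow",
    RiskLevel.SECOND: "show_risk_warning",
    RiskLevel.THIRD: "require_extra_verification",
    RiskLevel.FOURTH: "block_and_notify_staff",
}


def transaction_guard_operation(level: RiskLevel) -> str:
    """Query the preset mapping relation for the given transaction risk level."""
    try:
        return GUARD_OPERATIONS[level]
    except KeyError:
        raise ValueError(f"no guard operation configured for {level!r}") from None
```

Keeping the mapping in a table rather than in branching code means the protection policy can be changed without touching the risk analysis itself.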

The method comprises the steps of providing a transaction risk level, wherein the transaction risk level is a first level, the transaction risk level is lower than a second level, the transaction risk level is lower than a third level, the transaction risk level is lower than the third level, and the transaction risk level is higher than the fourth level.

S211, executing transaction protection operation to intervene in the transaction of the user in the target application.

In the technical scheme of this embodiment, first, when a risk triggering operation of the user in the target application is detected, the facial image sequence, the first voice sequence and the behavior data sequence of the user in the target application within the preset duration are collected. Because data acquisition is performed only when the operation of the user is a risk triggering operation, it is not performed while the transaction is safe, which saves computing resources and improves the reliability of transaction risk identification. Secondly, expression recognition is performed on each frame of facial image in the facial image sequence to obtain the target expression type sequence, and emotion recognition is performed on the first voice sequence to obtain the candidate emotion type. Then, when the candidate emotion type belongs to a negative emotion type, a verification text is generated, the voice sequence of the user for the verification text is collected to obtain the second voice sequence, and emotion recognition is performed on the second voice sequence to obtain the target emotion type. In this way, the emotion type of the user at the current moment is further identified dynamically when the candidate emotion type is negative, which improves the accuracy of the target emotion type and provides an accurate basis for the subsequent determination of the transaction risk value. Finally, behavior recognition is performed on the behavior data sequence to obtain the target behavior type, risk analysis is performed on the target expression type sequence, the target emotion type and the target behavior type to obtain the transaction risk value, and the transaction risk level of the user conducting a transaction through the target application within the preset duration is determined based on the transaction risk value. Compared with a risk identification mode that relies on only a single dimension (such as a short message verification code or face recognition), multi-dimensional risk identification is more accurate. Risky transactions of users in the target application, in particular risky transactions of special groups in special risk scenarios such as fraudulent use of identity information, irrational decisions and operations not made of the user's own free will, can therefore be identified accurately, which improves the accuracy of transaction risk identification.

Fig. 3 is a schematic structural diagram of a transaction risk identification device according to an embodiment of the present application, and referring to fig. 3, the transaction risk identification device may include:

the acquisition module 310 is configured to acquire a facial image sequence, a first voice sequence and a behavior data sequence of a user in a target application within a preset duration;

The first recognition module 320 is configured to perform expression recognition on each frame of facial image in the facial image sequence to obtain a target expression type sequence;

the second recognition module 330 is configured to perform emotion recognition on the first voice sequence to obtain a target emotion type, and perform behavior recognition on the behavior data sequence to obtain a target behavior type;

The determining module 340 is configured to perform risk analysis on the target expression type sequence, the target emotion type and the target behavior type to obtain a transaction risk value, and determine a transaction risk level of a transaction performed by the user through the target application within a preset duration based on the transaction risk value.

In an embodiment, the transaction risk identification device further includes a detection module, where the detection module is specifically configured to:

Before acquiring a face image sequence, a first voice sequence and a behavior data sequence of a user in a preset duration, detecting a risk triggering operation of the user in a target application, and triggering and executing the acquisition of the face image sequence, the first voice sequence and the behavior data sequence of the user in the preset duration.

In an embodiment, the risk triggering operation in the detection module includes any one of transaction amount greater than a preset value threshold, product risk level of the transaction product greater than a preset level threshold, and frequency of operation in the target application greater than a preset frequency threshold.

In one embodiment, the second recognition module 330 performs emotion recognition on the first voice sequence to obtain a target emotion type, including:

carrying out emotion recognition on the first voice sequence to obtain a candidate emotion type;

When the candidate emotion type belongs to the negative emotion type, generating a verification text, and collecting a voice sequence of a user aiming at the verification text to obtain a second voice sequence;

and carrying out emotion recognition on the second voice sequence to obtain the target emotion type.

In one embodiment, the second recognition module 330 performs emotion recognition on the first speech sequence to obtain a candidate emotion type, including:

Extracting features of the first voice sequence to obtain speech speed features and fundamental frequency fluctuation features;

And when the speech speed characteristic is larger than a preset speech speed threshold value or the fundamental frequency fluctuation characteristic is larger than a preset fluctuation threshold value, determining that the candidate emotion type belongs to a negative emotion type.

In one embodiment, the second recognition module 330 performs emotion recognition on the first voice sequence to obtain a candidate emotion type, including recognizing emotion of the user in a preset duration based on the first voice sequence by using an emotion recognition model to obtain the candidate emotion type;

Accordingly, the second recognition module 330 performs emotion recognition on the second voice sequence to obtain a target emotion type, including recognizing the emotion of the user based on the second voice sequence by using an emotion recognition model to obtain the target emotion type.

In an embodiment, the transaction risk identification device further includes a risk protection module, where the risk protection module is specifically configured to:

After determining the transaction risk level of the user for transaction through the target application within the preset duration based on the transaction risk value, inquiring a preset mapping relation based on the transaction risk level to obtain a transaction protection operation corresponding to the transaction risk level;

A transaction guard operation is performed to intervene in the user's transaction in the target application.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional modules is illustrated, and in practical application, the above-described functional allocation may be performed by different functional modules according to needs, i.e. the internal structure of the apparatus is divided into different functional modules to perform all or part of the functions described above. The specific working process of the functional module described above may refer to the corresponding process in the foregoing method embodiment, and will not be described herein.

The transaction risk identification device provided by the embodiment is applicable to the transaction risk identification method provided by any embodiment, and has corresponding functions and beneficial effects.

Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. Fig. 4 shows a block diagram of an exemplary electronic device 11 suitable for use in implementing embodiments of the application. The electronic device 11 shown in fig. 4 is only an example, and should not impose any limitation on the function and the range of use of the present embodiment.

As shown in fig. 4, the electronic device 11 is in the form of a general purpose computing electronic device. The components of the electronic device 11 may include, but are not limited to, one or more processors or processing units 16, a system memory 28, and a bus 18 that connects the various system components, including the system memory 28 and the processing units 16.

Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.

The electronic device 11 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by electronic device 11 and includes both volatile and non-volatile media, removable and non-removable media.

The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The electronic device 11 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard disk drive"). Although not shown in fig. 4, a disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a portable compact disk read-only memory, digital versatile disk read-only memory, or other optical media) may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. The system memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of the application.

A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, system memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.

The electronic device 11 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with the electronic device 11, and/or with any devices (e.g., network card, modem, etc.) that enable the electronic device 11 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Also, the electronic device 11 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through a network adapter 20.

As shown in fig. 4, the network adapter 20 communicates with other modules of the electronic device 11 via the bus 18. It should be appreciated that although not shown in FIG. 4, other hardware and/or software modules may be used in connection with electronic device 11, including, but not limited to, microcode, device drivers, redundant processing units, external disk drive arrays, redundant array of independent disks (RAID) systems, tape drives, and data backup storage systems, among others.

The processing unit 16 executes various functional applications and page display by running a program stored in the system memory 28, for example, to implement a transaction risk recognition method provided in an embodiment of the present application, where the method includes collecting a facial image sequence, a first voice sequence, and a behavior data sequence of a user in a target application within a preset duration, performing expression recognition on each frame of facial image in the facial image sequence to obtain a target expression type sequence, performing emotion recognition on the first voice sequence to obtain a target emotion type, performing behavior recognition on the behavior data sequence to obtain a target behavior type, performing risk analysis on the target expression type sequence, the target emotion type, and the target behavior type to obtain a transaction risk value, and determining a transaction risk level of the user performing a transaction through the target application within the preset duration based on the transaction risk value.

Of course, those skilled in the art will appreciate that the processor may also implement the technical solution of the transaction risk identification method provided in any embodiment of the present application.

The embodiment of the application provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, realizes a transaction risk recognition method provided by the embodiment of the application, and the method comprises the steps of collecting a facial image sequence, a first voice sequence and a behavior data sequence of a user in a preset duration, carrying out expression recognition on each frame of facial image in the facial image sequence to obtain a target expression type sequence, carrying out emotion recognition on the first voice sequence to obtain a target emotion type, carrying out behavior recognition on the behavior data sequence to obtain a target behavior type, carrying out risk analysis on the target expression type sequence, the target emotion type and the target behavior type to obtain a transaction risk value, and determining a transaction risk level of the user for carrying out transaction through the target application in the preset duration based on the transaction risk value.

The computer storage media of the present embodiments may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The embodiment of the application also provides a computer program product, which comprises a computer program, wherein the computer program, when being executed by a processor, realizes the transaction risk recognition method provided by the embodiment of the application, and comprises the steps of collecting a facial image sequence, a first voice sequence and a behavior data sequence of a user in a preset duration, carrying out expression recognition on each frame of facial image in the facial image sequence to obtain a target expression type sequence, carrying out emotion recognition on the first voice sequence to obtain a target emotion type, carrying out behavior recognition on the behavior data sequence to obtain a target behavior type, carrying out risk analysis on the target expression type sequence, the target emotion type and the target behavior type to obtain a transaction risk value, and determining the transaction risk level of the user for carrying out transaction through the target application in the preset duration based on the transaction risk value.

In implementing the computer program product, the computer program code for carrying out operations of the present application may be written in one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

It will be appreciated by those of ordinary skill in the art that the modules or steps of the application described above may be implemented in a general purpose computing device, they may be centralized on a single computing device, or distributed over a network of computing devices, and may alternatively be implemented in program code executable by a computer device, such that they are stored in a memory device and executed by the computing device, or they may be separately fabricated as individual integrated circuit modules, or multiple modules or steps within them may be fabricated as a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.

It should be noted that, in the technical solution of the embodiment of the present application, the collected information is information and data authorized by the user or fully authorized by all parties. The collection, storage, use, processing, transmission, provision, disclosure and application of the related data comply with the relevant laws, regulations and standards of the relevant countries and regions, the necessary security measures are taken, and public order and good customs are not violated. A corresponding operation entry is provided for the user to grant or refuse authorization. In addition, a corresponding operation entry is provided for the user to accept or reject the result of the automated decision; if the user rejects it, an expert decision process is entered.

Note that the above is only a preferred embodiment of the present application and the technical principle applied. It will be understood by those skilled in the art that the present application is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the application. Therefore, while the application has been described in connection with the above embodiments, it is to be understood that the application is not limited to the specific embodiments disclosed and that many other equivalent embodiments may be made without departing from the spirit and scope of the application as defined by the appended claims.

Claims

1. A method for identifying transaction risks, characterized in that the method comprises:

Collect user facial image sequences, first voice sequences, and behavioral data sequences in the target application within a preset time period;

Perform expression recognition on each frame of the facial image sequence to obtain a target expression type sequence;

Emotion recognition is performed on the first speech sequence to obtain the target emotion type, and behavior recognition is performed on the behavior data sequence to obtain the target behavior type;

Risk analysis is performed on the target expression type sequence, the target emotion type, and the target behavior type to obtain a transaction risk value. Based on the transaction risk value, the transaction risk level of the user conducting a transaction through the target application within the preset time period is determined.

2. The transaction risk identification method according to claim 1, characterized in that, before collecting the user's facial image sequence, first voice sequence, and behavioral data sequence in the target application within a preset time period, it further includes:

If a risk-triggered operation by the user in the target application is detected, the system will trigger the collection of the user's facial image sequence, first voice sequence, and behavioral data sequence in the target application within a preset time period.

3. The transaction risk identification method according to claim 2, wherein the risk triggering operation includes any one of the following: the transaction amount is greater than a preset numerical threshold, the product risk level of the transaction product is greater than a preset level threshold, or the operation frequency in the target application is greater than a preset frequency threshold.

4. The transaction risk identification method according to claim 1, characterized in that the step of performing emotion recognition on the first voice sequence to obtain the target emotion type includes:

performing emotion recognition on the first voice sequence to obtain a candidate emotion type;

when the candidate emotion type is a negative emotion type, generating a verification text, and collecting the user's voice sequence in response to the verification text to obtain a second voice sequence;

performing emotion recognition on the second voice sequence to obtain the target emotion type.
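The two-stage flow of claim 4 can be sketched as below. The recognizer, the verification text generator, and the response collector are passed in as callables because the claim does not define them; their names and signatures are hypothetical. Returning the candidate directly when it is not negative is also an assumption, since the claim only describes the negative branch.

```python
from typing import Callable, FrozenSet, Sequence

Voice = Sequence[float]  # assumed representation of a voice sequence


def target_emotion_type(
    first_voice: Voice,
    recognize: Callable[[Voice], str],
    make_verification_text: Callable[[], str],
    collect_response: Callable[[str], Voice],
    negative_types: FrozenSet[str] = frozenset({"negative"}),
) -> str:
    """Return the target emotion type, re-checking negative candidates."""
    candidate = recognize(first_voice)
    if candidate not in negative_types:
        # Assumption: a non-negative candidate is used as the target directly.
        return candidate
    # Negative candidate: prompt the user and recognize the spoken response.
    text = make_verification_text()
    second_voice = collect_response(text)
    return recognize(second_voice)
```

The second pass acts as a confirmation step: a negative result on the first voice sequence alone does not determine the target emotion type.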

5. The transaction risk identification method according to claim 4, characterized in that the step of performing emotion recognition on the first voice sequence to obtain the candidate emotion type includes:

performing feature extraction on the first voice sequence to obtain a speech rate feature and a fundamental frequency fluctuation feature;

when the speech rate feature is greater than a preset speech rate threshold, or the fundamental frequency fluctuation feature is greater than a preset fluctuation threshold, determining that the candidate emotion type is a negative emotion type.
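The threshold rule of claim 5 can be sketched as follows, assuming the syllable count, the duration, and a per-frame fundamental frequency contour have already been extracted upstream. Measuring the speech rate in syllables per second, measuring fluctuation as the population standard deviation of the voiced frames, and both default thresholds are hypothetical choices; the claim only requires two features and two preset thresholds.

```python
from statistics import pstdev
from typing import Sequence


def candidate_emotion_type(
    syllable_count: int,
    duration_s: float,
    f0_hz: Sequence[float],               # per-frame F0; 0 marks unvoiced frames
    rate_threshold: float = 6.0,          # assumed, syllables per second
    fluctuation_threshold: float = 40.0,  # assumed, Hz
) -> str:
    """Classify the first voice sequence as negative or non-negative."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    speech_rate = syllable_count / duration_s
    # Only voiced frames carry a meaningful fundamental frequency.
    voiced = [f for f in f0_hz if f > 0]
    fluctuation = pstdev(voiced) if len(voiced) > 1 else 0.0
    if speech_rate > rate_threshold or fluctuation > fluctuation_threshold:
        return "negative"
    return "non_negative"
```

Either feature exceeding its threshold is sufficient on its own, which matches the "or" in the claim.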

6. The transaction risk identification method according to claim 4, characterized in that the step of performing emotion recognition on the first voice sequence to obtain the candidate emotion type includes:

using an emotion recognition model to identify the user's emotion within the preset time period based on the first voice sequence, to obtain the candidate emotion type;

correspondingly, the step of performing emotion recognition on the second voice sequence to obtain the target emotion type includes:

using the emotion recognition model to identify the user's emotion based on the second voice sequence, to obtain the target emotion type.

7. The transaction risk identification method according to claim 1, characterized in that, after determining the transaction risk level of the user's transactions through the target application within the preset time period based on the transaction risk value, the method further comprises:

querying a preset mapping relationship based on the transaction risk level to obtain the transaction protection operation corresponding to the transaction risk level, wherein the preset mapping relationship includes correspondences between risk levels and protection operations; and

performing the transaction protection operation to intervene in the user's transactions in the target application.
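The mapping lookup of claim 7 can be sketched as a simple table query. The level names, the protection operation names, and the dictionary-based transaction record are hypothetical; the claim only requires a preset correspondence between risk levels and protection operations.

```python
from typing import Dict

# Hypothetical preset mapping from risk level to protection operation.
PROTECTION_BY_LEVEL: Dict[str, str] = {
    "low": "allow",
    "medium": "require_secondary_confirmation",
    "high": "suspend_and_notify_staff",
}


def protection_operation(level: str) -> str:
    """Query the preset mapping for the operation matching a risk level."""
    try:
        return PROTECTION_BY_LEVEL[level]
    except KeyError:
        raise ValueError(f"unknown transaction risk level: {level!r}") from None


def intervene(transaction: dict, level: str) -> dict:
    """Return a copy of the transaction annotated with the chosen operation."""
    operation = protection_operation(level)
    updated = dict(transaction)
    updated["protection_operation"] = operation
    # In this sketch, anything other than "allow" holds the transaction.
    updated["blocked"] = operation != "allow"
    return updated
```

Keeping the mapping in a table rather than in branching logic lets the correspondence be changed without touching the intervention code.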

8. An electronic device, characterized in that the electronic device comprises:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the transaction risk identification method according to any one of claims 1 to 7.

9. A computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the program implements the transaction risk identification method as described in any one of claims 1 to 7.

10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the transaction risk identification method as described in any one of claims 1 to 7.