CN113763405A - An image detection method and device

Info

Publication number: CN113763405A
Application number: CN202110142944.2A
Authority: CN
Inventors: 周安涛; 赵鑫; 李源
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Wodong Tianjun Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Wodong Tianjun Information Technology Co Ltd
Priority date: 2021-02-02
Filing date: 2021-02-02
Publication date: 2021-12-07
Anticipated expiration: 2041-02-02
Also published as: CN113763405B

Abstract

The invention discloses an image detection method and device, and relates to the field of computer technology. A specific implementation of the method includes: acquiring a training sample, where the training sample includes a training image, a region label and a boundary label; inputting the training image into a detection model to obtain a region detection result and a boundary detection result; training the detection model according to the region label, the boundary label, the region detection result and the boundary detection result; and determining, based on the trained detection model, whether an image to be detected has been tampered with. This implementation can improve detection accuracy.

Description

An image detection method and device

Technical field

The present invention relates to the field of computer technology, and in particular to an image detection method and device.

Background

In real-world applications, malicious actors composite content from multiple images into a single image, which changes the original meaning of the image and misleads users. For example, on e-commerce platforms, merchants tamper with original images to attract consumers. How to detect whether an image has been tampered with has therefore become an urgent problem.

The prior art uses edge detection to identify whether an image has been tampered with.

However, that approach attends only to local features of the image, and its detection accuracy is low.

Summary of the invention

In view of this, embodiments of the present invention provide an image detection method and device that can improve detection accuracy.

In a first aspect, an embodiment of the present invention provides an image detection method, including:

acquiring a training sample, where the training sample includes a training image, a region label and a boundary label;

inputting the training image into a detection model to obtain a region detection result and a boundary detection result;

training the detection model according to the region label, the boundary label, the region detection result and the boundary detection result;

determining, based on the trained detection model, whether an image to be detected has been tampered with.

Optionally,

the detection model includes a feature extraction layer, a region detection layer and a boundary detection layer;

inputting the training image into the detection model to obtain the region detection result and the boundary detection result includes:

inputting the training image into the feature extraction layer to extract a high-order feature map and a low-order feature map from the training image;

inputting the high-order feature map and the low-order feature map into the region detection layer to obtain the region detection result;

inputting the high-order feature map and the low-order feature map into the boundary detection layer to obtain the boundary detection result.

Optionally,

inputting the training image into the feature extraction layer to extract the high-order feature map and the low-order feature map from the training image includes:

inputting the training image into a backbone network to obtain the low-order feature map and a first feature map;

extracting multi-scale features from the first feature map with a multi-scale network to obtain a plurality of second feature maps;

concatenating the plurality of second feature maps and inputting the result into a first convolutional layer to obtain the high-order feature map;

where the backbone network includes a first multi-channel convolutional layer and a depthwise separable convolutional layer, and the first convolutional layer is a 1×1 convolutional layer.

Optionally,

the multi-scale network includes an atrous convolutional layer, a second convolutional layer and a pooling layer;

where the second convolutional layer is a 1×1 convolutional layer.

Optionally,

the region detection layer includes a first feature fusion layer, a region anomaly analysis layer and a first result output layer;

inputting the high-order feature map and the low-order feature map into the region detection layer to obtain the region detection result includes:

inputting the high-order feature map and the low-order feature map into the first feature fusion layer to obtain a third feature map;

determining a fourth feature map according to the third feature map and the region anomaly analysis layer, where the fourth feature map represents the pixel value difference between the tampered region and the background region in the third feature map;

inputting the fourth feature map into the first result output layer to obtain the region detection result.

Optionally,

inputting the high-order feature map and the low-order feature map into the first feature fusion layer to obtain the third feature map includes:

inputting the low-order feature map into a third convolutional layer to obtain a fifth feature map;

upsampling the high-order feature map to obtain a sixth feature map;

concatenating the fifth feature map and the sixth feature map and inputting the result into a second multi-channel convolutional layer to obtain the third feature map;

where the third convolutional layer is a 1×1 convolutional layer.

Optionally,

determining the fourth feature map according to the third feature map and the region anomaly analysis layer includes:

calculating the average pixel value of the third feature map from the pixel value at each pixel coordinate in the third feature map;

determining the difference between the pixel value at each pixel coordinate and the average pixel value;

calculating the pixel value standard deviation of the third feature map from the difference between the pixel value at each pixel coordinate and the average pixel value;

calculating a normalized pixel value for each pixel coordinate from the pixel value standard deviation and the difference between the pixel value at that coordinate and the average pixel value;

determining the fourth feature map from the normalized pixel values of the pixel coordinates.

Optionally,

inputting the fourth feature map into the first result output layer to obtain the region detection result includes:

inputting the fourth feature map into a fourth convolutional layer to obtain a seventh feature map;

upsampling the seventh feature map to obtain an eighth feature map;

inputting the eighth feature map into an activation function to obtain the region detection result;

where the fourth convolutional layer is a 1×1 convolutional layer.

Optionally,

the boundary detection layer includes a second feature fusion layer, a boundary anomaly analysis layer and a second result output layer;

inputting the high-order feature map and the low-order feature map into the boundary detection layer to obtain the boundary detection result includes:

inputting the high-order feature map and the low-order feature map into the second feature fusion layer to obtain a ninth feature map;

determining a tenth feature map according to the ninth feature map and the boundary anomaly analysis layer, where the tenth feature map represents the pixel value difference between the tampered region and the background region within a detection window;

inputting the tenth feature map into the second result output layer to obtain the boundary detection result.

Optionally,

inputting the high-order feature map and the low-order feature map into the second feature fusion layer to obtain the ninth feature map includes:

inputting the low-order feature map into a fifth convolutional layer to obtain an eleventh feature map;

upsampling the high-order feature map to obtain a twelfth feature map;

concatenating the eleventh feature map and the twelfth feature map and inputting the result into a third multi-channel convolutional layer to obtain the ninth feature map;

where the fifth convolutional layer is a 1×1 convolutional layer.

Optionally,

determining the tenth feature map according to the ninth feature map and the boundary anomaly analysis layer includes:

calculating the average pixel value of the detection window from the pixel value at each pixel coordinate of the ninth feature map within the detection window;

determining the difference between the pixel value at each pixel coordinate and the average pixel value of the detection window in which that pixel coordinate lies;

calculating the pixel value standard deviation of the ninth feature map;

calculating a normalized pixel value for each pixel coordinate within the detection window from the pixel value standard deviation and the difference between the pixel value at that coordinate and the average pixel value of the detection window in which it lies;

determining the tenth feature map from the normalized pixel values of the pixel coordinates within the detection window.
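The boundary anomaly analysis steps above can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions: the feature map is a single-channel 2-D list, the detection window is taken to be a k×k neighborhood centered on each pixel and clipped at the image border, and the standard deviation is the population standard deviation. The function name `window_normalize` and the parameters `k` and `eps` are hypothetical and do not appear in the source.

```python
import math

def window_normalize(fmap, k=3, eps=1e-8):
    """Normalize each pixel by its local window mean and the global standard deviation.

    fmap: 2-D list of floats (assumed single-channel feature map).
    k:    assumed side length of the square detection window.
    eps:  small constant that avoids division by zero on constant maps.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2

    # Standard deviation over the whole feature map, as in the step above.
    flat = [v for row in fmap for v in row]
    global_mean = sum(flat) / len(flat)
    std = math.sqrt(sum((v - global_mean) ** 2 for v in flat) / len(flat))

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Pixels of the detection window around (y, x), clipped at the border.
            window = [fmap[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            local_mean = sum(window) / len(window)
            out[y][x] = (fmap[y][x] - local_mean) / (std + eps)
    return out
```

With this sketch, pixels inside a uniform area get a value of zero, while pixels next to a step in the feature values get nonzero values whose sign depends on which side of the step they lie on.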

Optionally,

inputting the tenth feature map into the second result output layer to obtain the boundary detection result includes:

inputting the tenth feature map into a sixth convolutional layer to obtain a thirteenth feature map;

upsampling the thirteenth feature map to obtain a fourteenth feature map;

inputting the fourteenth feature map into an activation function to obtain the boundary detection result;

where the sixth convolutional layer is a 1×1 convolutional layer.

Optionally,

acquiring the training sample includes:

acquiring the training image and the region label;

performing a dilation operation on the region label to obtain a dilated image;

performing an erosion operation on the region label to obtain an eroded image;

determining the boundary label from the dilated image and the eroded image.
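One way to derive a boundary label from a region label is sketched below. The text above says only that the boundary label is determined from the dilated and eroded images; this sketch assumes it is their pixel-wise difference (pixels present after dilation but absent after erosion). The square structuring element, its size `k`, the border clipping and all function names are assumptions made for illustration.

```python
def _morph(mask, k, combine):
    """Apply `combine` (max or min) over a k x k window clipped at the border."""
    h, w = len(mask), len(mask[0])
    r = k // 2
    return [[combine(mask[j][i]
                     for j in range(max(0, y - r), min(h, y + r + 1))
                     for i in range(max(0, x - r), min(w, x + r + 1)))
             for x in range(w)]
            for y in range(h)]

def dilate(mask, k=3):
    # A pixel becomes 1 if any pixel in its window is 1.
    return _morph(mask, k, max)

def erode(mask, k=3):
    # A pixel stays 1 only if every pixel in its window is 1.
    return _morph(mask, k, min)

def boundary_label(region_label, k=3):
    """Assumed rule: boundary = dilated image minus eroded image."""
    d, e = dilate(region_label, k), erode(region_label, k)
    return [[dv - ev for dv, ev in zip(d_row, e_row)]
            for d_row, e_row in zip(d, e)]

# A 7 x 7 binary region label with a 3 x 3 tampered block in the middle.
region = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)]
          for y in range(7)]
boundary = boundary_label(region)
```

In this example the boundary label is a ring around the block: the block's center pixel and the far background are 0, and the band in between is 1. A larger `k` gives a thicker band.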

Optionally,

the method further includes:

acquiring pre-training samples;

pre-training the detection model on the pre-training samples;

inputting the training image into the detection model to obtain the region detection result and the boundary detection result includes:

inputting the training image into the pre-trained detection model to obtain the region detection result and the boundary detection result.

In a second aspect, an embodiment of the present invention provides an image detection device, including:

an acquisition module, configured to acquire a training sample, where the training sample includes a training image, a region label and a boundary label;

a training module, configured to input the training image into a detection model to obtain a region detection result and a boundary detection result, and to train the detection model according to the region label, the boundary label, the region detection result and the boundary detection result;

a detection module, configured to determine, based on the trained detection model, whether an image to be detected has been tampered with.

In a third aspect, an embodiment of the present invention provides an electronic device, including:

one or more processors;

a storage device for storing one or more programs,

where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any of the above embodiments.

In a fourth aspect, an embodiment of the present invention provides a computer-readable medium storing a computer program that, when executed by a processor, implements the method described in any of the above embodiments.

One embodiment of the above invention has the following advantages or beneficial effects. The detection model performs both boundary detection and region detection on the image. Region detection identifies the tampered region from the feature difference between the tampered region and the background region across the whole image, so it attends to the overall features of the image. Boundary detection identifies the tampered boundary from the feature difference on the two sides of that boundary, so it attends to local features. Boundary detection assists region detection, so that the tampered region is determined more precisely and the accuracy of image detection improves.

Further effects of the above non-conventional optional implementations are described below in conjunction with specific embodiments.

Description of the drawings

The accompanying drawings are provided for a better understanding of the present invention and do not constitute an improper limitation of it. In the drawings:

FIG. 1 is a flowchart of an image detection method provided by an embodiment of the present invention;

FIG. 2 is a flowchart of an image detection method provided by an embodiment of the present invention;

FIG. 3 is an architecture diagram of a detection model provided by an embodiment of the present invention;

FIG. 4(a) is a schematic diagram of a region label provided by an embodiment of the present invention;

FIG. 4(b) is a schematic diagram of a dilated image provided by an embodiment of the present invention;

FIG. 4(c) is a schematic diagram of an eroded image provided by an embodiment of the present invention;

FIG. 4(d) is a schematic diagram of a boundary label provided by an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a backbone network provided by an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a multi-scale network provided by an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an image detection device provided by an embodiment of the present invention;

FIG. 8 is an exemplary system architecture diagram to which an embodiment of the present invention may be applied;

FIG. 9 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.

Detailed description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. The description includes various details of the embodiments to aid understanding, and these details should be regarded as merely exemplary. Those of ordinary skill in the art will therefore recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the invention. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.

Edge detection attends to local features of the image within the detection box and does not consider the feature difference between the tampered region and the background region across the whole image. The accuracy of its detection results therefore needs further improvement.

In view of this, as shown in FIG. 1, an embodiment of the present invention provides an image detection method, including:

Step 101: Acquire a training sample, where the training sample includes a training image, a region label and a boundary label.

To improve the training effect, the training samples in this embodiment come from the tampered-image dataset CASIA 2.0. CASIA 2.0 provides more than five thousand tampered images covering a variety of tampering methods and image formats, which meets the training needs of this embodiment. In practice, a dataset with fewer samples, such as CASIA 1.0, can also be chosen according to the actual situation.

The dataset includes the training images and the region labels; the boundary labels are determined from the region labels.

Step 102: Input the training image into the detection model to obtain the region detection result and the boundary detection result.

The region detection result is the predicted tampered region, and the boundary detection result is the predicted tampered boundary.

Step 103: Train the detection model according to the region label, the boundary label, the region detection result and the boundary detection result.

A loss value is determined from the region label, the boundary label, the region detection result, the boundary detection result and a preset loss function; the parameters of the detection model are then adjusted according to the loss value.
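The source does not say which preset loss function is used. As a hedged illustration only, the sketch below assumes a binary cross-entropy term for each output and combines the two terms as a weighted sum. The names `bce` and `detection_loss` and the parameter `boundary_weight` are hypothetical.

```python
import math

def bce(pred, label, eps=1e-7):
    """Mean binary cross-entropy between two equally shaped 2-D maps."""
    total, n = 0.0, 0
    for pred_row, label_row in zip(pred, label):
        for p, t in zip(pred_row, label_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            n += 1
    return total / n

def detection_loss(region_pred, region_label,
                   boundary_pred, boundary_label,
                   boundary_weight=1.0):
    """Assumed combined loss: region BCE plus weighted boundary BCE."""
    return (bce(region_pred, region_label)
            + boundary_weight * bce(boundary_pred, boundary_label))
```

Because a single scalar loss depends on both outputs, adjusting the model parameters against it trains the region branch and the boundary branch together.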

In practice, to ensure the prediction quality of the detection model, the model is tested with test samples during training to determine how well it predicts. Specifically, the ratio of training samples to test samples may be 9:1. Test samples may come from the CASIA 1.0 and Columbia datasets.

Step 104: Based on the trained detection model, determine whether the image to be detected has been tampered with.

This embodiment uses the detection model to perform both boundary detection and region detection on the image. Region detection identifies the tampered region from the feature difference between the tampered region and the background region across the whole image, so it attends to the overall features of the image. Boundary detection identifies the tampered boundary from the feature difference on the two sides of that boundary, so it attends to local features. Boundary detection assists region detection, so that the tampered region is determined more precisely and the accuracy of image detection improves.

In an embodiment of the present invention, the detection model includes a feature extraction layer, a region detection layer and a boundary detection layer;

inputting the training image into the detection model to obtain the region detection result and the boundary detection result includes:

inputting the training image into the feature extraction layer to extract a high-order feature map and a low-order feature map from the training image;

inputting the high-order feature map and the low-order feature map into the region detection layer to obtain the region detection result;

inputting the high-order feature map and the low-order feature map into the boundary detection layer to obtain the boundary detection result.

This embodiment determines the region detection result and the boundary detection result with the detection model. By function, the detection model can be divided into a feature extraction layer, a region detection layer and a boundary detection layer. The feature extraction layer extracts high-order and low-order features from the training image; the extracted high-order features form the high-order feature map, and the extracted low-order features form the low-order feature map. Low-order features have higher resolution and carry position and detail information; high-order features carry more semantic information but have lower resolution. The region detection layer detects the tampered region, and the boundary detection layer detects the tampered boundary.

This embodiment uses both high-order and low-order features for boundary detection and region detection. Taking these multi-dimensional features of the training image into account improves the precision of region detection and boundary detection and, in turn, the accuracy of the tampering identification result.

In an embodiment of the present invention, inputting the training image into the feature extraction layer to extract the high-order feature map and the low-order feature map from the training image includes:

inputting the training image into a backbone network to obtain the low-order feature map and a first feature map;

extracting multi-scale features from the first feature map with a multi-scale network to obtain a plurality of second feature maps;

concatenating the plurality of second feature maps and inputting the result into a first convolutional layer to obtain the high-order feature map;

where the backbone network includes a first multi-channel convolutional layer and a depthwise separable convolutional layer, and the first convolutional layer is a 1×1 convolutional layer.

In this embodiment, the feature extraction layer includes the backbone network, the multi-scale network and the first convolutional layer. The backbone network extracts features from the training image and can be implemented with multi-channel convolutional layers and depthwise separable convolutions. Combining the first multi-channel convolutional layer with the depthwise separable convolutional layer improves the efficiency and quality of feature extraction. The backbone network may include multiple first multi-channel convolutional layers and multiple depthwise separable convolutional layers, and the depthwise separable convolutional layers may also be replaced with first multi-channel convolutional layers or with other types of convolutional layer.
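One reason depthwise separable convolutions are efficient is that they need far fewer weights than a standard convolution with the same kernel size and channel counts. The short calculation below illustrates this general property. The kernel size and channel counts are hypothetical example values, not figures from the source, and bias terms are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: one k x k x c_in filter per output channel."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)          # 73728
separable = depthwise_separable_params(3, 64, 128)   # 8768
```

For this example layer the separable form uses roughly one eighth of the weights of the standard form.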

This embodiment improves the precision of region detection by extracting multi-scale features, which in turn improves the accuracy and reliability of the image detection result.

The first convolutional layer fuses the second feature maps to obtain the high-order feature map.

In an embodiment of the present invention, the multi-scale network includes an atrous convolutional layer, a second convolutional layer and a pooling layer;

where the second convolutional layer is a 1×1 convolutional layer.

In this embodiment, the atrous convolutional layer enlarges the receptive field, so that the multi-scale network outputs richer information and the model trains better. Embodiments may use multiple atrous convolutional layers, for example three or four.
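The effect of atrous (dilated) convolution on the receptive field can be seen in a one-dimensional sketch. The code below is an illustration only: it works on a 1-D list rather than a feature map, uses no padding, and follows the deep-learning convention of not flipping the kernel. The names `atrous_conv1d`, `receptive_field` and `rate` are hypothetical.

```python
def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one output of an atrous convolution."""
    return (kernel_size - 1) * rate + 1

def atrous_conv1d(signal, kernel, rate=1):
    """1-D atrous convolution without padding; kernel taps are `rate` samples apart."""
    span = receptive_field(len(kernel), rate)
    return [sum(w * signal[start + i * rate] for i, w in enumerate(kernel))
            for start in range(len(signal) - span + 1)]

signal = list(range(10))
dense = atrous_conv1d(signal, [1, 1, 1], rate=1)   # each output sees 3 neighbors
sparse = atrous_conv1d(signal, [1, 1, 1], rate=2)  # each output spans 5 samples
```

The kernel has three weights in both calls, but with `rate=2` each output covers a span of five input samples instead of three. Running several such layers with different rates in parallel is one way to obtain features at multiple scales.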

In an embodiment of the present invention, the region detection layer includes a first feature fusion layer, a region anomaly analysis layer and a first result output layer;

inputting the high-order feature map and the low-order feature map into the region detection layer to obtain the region detection result includes:

inputting the high-order feature map and the low-order feature map into the first feature fusion layer to obtain a third feature map;

determining a fourth feature map according to the third feature map and the region anomaly analysis layer, where the fourth feature map represents the pixel value difference between the tampered region and the background region in the third feature map;

inputting the fourth feature map into the first result output layer to obtain the region detection result.

This embodiment determines the extent of the tampered region from the pixel value difference between the tampered region and the background region. Because the pixel value at every pixel coordinate of the third feature map enters the calculation of that difference, the tampered region is identified from a global perspective.

In an embodiment of the present invention, inputting the high-order feature map and the low-order feature map into the first feature fusion layer to obtain the third feature map includes:

inputting the low-order feature map into a third convolutional layer to obtain a fifth feature map;

upsampling the high-order feature map to obtain a sixth feature map;

concatenating the fifth feature map and the sixth feature map and inputting the result into a second multi-channel convolutional layer to obtain the third feature map;

where the third convolutional layer is a 1×1 convolutional layer.

In this embodiment, the 1×1 convolutional layer fuses and compresses the features of the low-order feature map, which removes redundant features and improves the model training effect. The high-order feature map can be upsampled by bilinear interpolation or transposed convolution to enlarge its size. The fifth feature map and the sixth feature map can be concatenated along the Z axis, and feature fusion is then performed through a 1×1 convolutional layer.
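The data flow of the fusion step can be sketched in plain Python. Several simplifications are assumed and should not be read as the patented implementation: feature maps are nested lists laid out as [channel][row][column], nearest-neighbor upsampling stands in for the bilinear interpolation or transposed convolution named above, the Z axis is taken to be the channel axis, and the final convolution applied after concatenation is omitted. All function names and the weights are hypothetical.

```python
def conv1x1(fmap, weights):
    """1 x 1 convolution: mix channels independently at every pixel.

    fmap:    [C_in][H][W] nested lists.
    weights: [C_out][C_in] nested lists (bias omitted).
    """
    height, width = len(fmap[0]), len(fmap[0][0])
    return [[[sum(wc * fmap[c][y][x] for c, wc in enumerate(w_out))
              for x in range(width)]
             for y in range(height)]
            for w_out in weights]

def upsample_nearest(fmap, factor):
    """Nearest-neighbor upsampling by an integer factor (a simplification)."""
    return [[[row[x // factor] for x in range(len(row) * factor)]
             for row in channel for _ in range(factor)]
            for channel in fmap]

def fuse(low, high, reduce_weights, factor):
    fifth = conv1x1(low, reduce_weights)    # compress the low-order channels
    sixth = upsample_nearest(high, factor)  # match the low-order spatial size
    return fifth + sixth                    # concatenate along the channel axis

# Hypothetical shapes: low-order map 2 x 4 x 4, high-order map 1 x 2 x 2.
low = [[[1.0] * 4 for _ in range(4)], [[3.0] * 4 for _ in range(4)]]
high = [[[1.0, 2.0], [3.0, 4.0]]]
fused = fuse(low, high, reduce_weights=[[0.5, 0.5]], factor=2)
```

The result has two channels of size 4 × 4: one compressed from the low-order map and one enlarged from the high-order map. Both inputs must have the same spatial size before they can be concatenated.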

In an embodiment of the present invention, determining the fourth feature map according to the third feature map and the region anomaly analysis layer includes:

calculating the average pixel value of the third feature map from the pixel value at each pixel coordinate in the third feature map;

determining the difference between the pixel value at each pixel coordinate and the average pixel value;

calculating the pixel value standard deviation of the third feature map from the difference between the pixel value at each pixel coordinate and the average pixel value;

calculating a normalized pixel value for each pixel coordinate from the pixel value standard deviation and the difference between the pixel value at that coordinate and the average pixel value;

determining the fourth feature map from the normalized pixel values of the pixel coordinates.

In this embodiment, the normalized pixel value represents how far the pixel value at a pixel coordinate deviates from the average pixel value of the third feature map. The larger the deviation, the more likely the pixel coordinate lies in the tampered region. Identifying whether a pixel coordinate lies in the tampered region from this pixel value difference improves the precision of region identification.
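The region anomaly analysis steps amount to a z-score computed over the whole feature map. A minimal sketch follows, assuming a single-channel 2-D list and the population standard deviation; the function name `region_normalize` and the `eps` term are assumptions added for illustration.

```python
import math

def region_normalize(fmap, eps=1e-8):
    """Return (value - global mean) / global standard deviation for every pixel."""
    flat = [v for row in fmap for v in row]
    mean = sum(flat) / len(flat)

    # Difference between each pixel value and the average pixel value.
    diffs = [[v - mean for v in row] for row in fmap]

    # Standard deviation computed from those differences.
    std = math.sqrt(sum(d * d for row in diffs for d in row) / len(flat))

    # eps avoids division by zero when the map is constant.
    return [[d / (std + eps) for d in row] for row in diffs]

# One outlying pixel in an otherwise uniform 3 x 3 map.
scores = region_normalize([[0.0, 0.0, 0.0],
                           [0.0, 8.0, 0.0],
                           [0.0, 0.0, 0.0]])
```

In this example the outlying center pixel receives a large positive score and the uniform background pixels receive small negative scores.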

在本发明的一个实施例中，将第四特征图输入第一结果输出层，得到区域检测结果，包括：In an embodiment of the present invention, the fourth feature map is input into the first result output layer to obtain a region detection result, including:

将第四特征图输入第四卷积层，得到第七特征图；Input the fourth feature map into the fourth convolution layer to obtain the seventh feature map;

对第七特征图进行上采样，得到第八特征图；Upsampling the seventh feature map to obtain the eighth feature map;

将第八特征图输入激活函数，得到区域检测结果；Input the eighth feature map into the activation function to obtain the region detection result;

其中，第四卷积层为1×1卷积层。Among them, the fourth convolutional layer is a 1×1 convolutional layer.

本发明实施例通过上采样放大图像，并通过激活函数将像素值映射到0～1之间，第八特征图各个像素坐标的映射结果构成区域检测结果。本发明实施例在计算标准化像素值之后再进行降维和放大，能够保证计算过程所采用像素值的准确性，提高区域检测结果的准确性和可靠性。采用的激活函数可以为sigmoid函数、softmax函数等。In the embodiment of the present invention, the image is enlarged by up-sampling, and the pixel value is mapped between 0 and 1 through the activation function, and the mapping result of each pixel coordinate of the eighth feature map constitutes the region detection result. In the embodiment of the present invention, dimension reduction and amplification are performed after the normalized pixel value is calculated, which can ensure the accuracy of the pixel value used in the calculation process and improve the accuracy and reliability of the area detection result. The activation function used can be a sigmoid function, a softmax function, or the like.

在本发明的一个实施例中，边界检测层，包括：第二特征融合层、边界异常分析层和第二结果输出层；In an embodiment of the present invention, the boundary detection layer includes: a second feature fusion layer, a boundary anomaly analysis layer, and a second result output layer;

将高阶特征图和低阶特征图输入边界检测层，得到边界检测结果，包括：Input the high-order feature map and low-order feature map into the boundary detection layer to obtain boundary detection results, including:

将高阶特征图和低阶特征图输入第二特征融合层，得到第九特征图；Input the high-order feature map and the low-order feature map into the second feature fusion layer to obtain the ninth feature map;

Determine the tenth feature map according to the ninth feature map and the boundary anomaly analysis layer, wherein the tenth feature map represents the pixel value difference between the tampered area and the background area within the detection window;

将第十特征图输入第二结果输出层，得到边界检测结果。The tenth feature map is input into the second result output layer to obtain the boundary detection result.

本发明实施例基于检测窗口内篡改区域与背景区域的像素值差异，确定篡改区域的范围。与区域检测相区别，本发明实施例考虑的是检测窗口中各个像素坐标的像素值，能够从局部角度辅助区域检测过程确定篡改区域。The embodiment of the present invention determines the range of the tampered area based on the pixel value difference between the tampered area and the background area in the detection window. Different from area detection, the embodiment of the present invention considers the pixel value of each pixel coordinate in the detection window, and can assist the area detection process to determine the tampered area from a local perspective.

在本发明的一个实施例中，将高阶特征图和低阶特征图输入第二特征融合层，得到第九特征图，包括：In an embodiment of the present invention, the high-order feature map and the low-order feature map are input into the second feature fusion layer to obtain a ninth feature map, including:

将低阶特征图输入第五卷积层，得到第十一特征图；Input the low-level feature map into the fifth convolutional layer to obtain the eleventh feature map;

对高阶特征图进行上采样，得到第十二特征图；Upsampling the high-order feature map to obtain the twelfth feature map;

将第十一特征图和第十二特征图拼接后输入第三多通道卷积层，得到第九特征图；The eleventh feature map and the twelfth feature map are spliced and input into the third multi-channel convolutional layer to obtain the ninth feature map;

其中，第五卷积层为1×1卷积层。Among them, the fifth convolutional layer is a 1×1 convolutional layer.

Similar to the region detection part, in this embodiment of the present invention a 1×1 convolutional layer performs feature fusion and compression on the low-order feature map, which removes redundant features and improves model training. In addition, the high-order feature map can be upsampled by bilinear interpolation or transposed convolution to enlarge it. The eleventh feature map and the twelfth feature map can be concatenated along the Z axis, and the features are then fused by a 1×1 convolutional layer.

在本发明的一个实施例中，根据第九特征图和边界异常分析层，确定第十特征图，包括：In an embodiment of the present invention, the tenth feature map is determined according to the ninth feature map and the boundary anomaly analysis layer, including:

根据检测窗口内第九特征图中各个像素坐标的像素值，计算检测窗口的平均像素值；Calculate the average pixel value of the detection window according to the pixel value of each pixel coordinate in the ninth feature map in the detection window;

确定各个像素坐标的像素值与像素坐标所处检测窗口的平均像素值的差；Determine the difference between the pixel value of each pixel coordinate and the average pixel value of the detection window where the pixel coordinate is located;

计算第九特征图的像素值标准差；Calculate the pixel value standard deviation of the ninth feature map;

根据像素值标准差、各个像素坐标的像素值与像素坐标所处检测窗口的平均像素值的差，计算检测窗口内像素坐标的标准化像素值；According to the standard deviation of the pixel value, the difference between the pixel value of each pixel coordinate and the average pixel value of the detection window where the pixel coordinate is located, calculate the normalized pixel value of the pixel coordinate in the detection window;

根据检测窗口内像素坐标的标准化像素值，确定第十特征图。The tenth feature map is determined according to the normalized pixel values of the pixel coordinates within the detection window.

本发明实施例关注的是检测窗口这一局部区域内的像素值差异。在篡改边界两侧，像素值存在差异，这种差异可以通过计算检测窗口内像素坐标的标准化像素值来确定。在实际应用场景中，检测窗口的尺寸可以根据需要进行调整。The embodiment of the present invention focuses on the pixel value difference in the local area of the detection window. There is a difference in pixel values on both sides of the tampering boundary, which can be determined by computing the normalized pixel values of the pixel coordinates within the detection window. In practical application scenarios, the size of the detection window can be adjusted as required.

在本发明的一个实施例中，将第十特征图输入第二结果输出层，得到边界检测结果，包括：In an embodiment of the present invention, the tenth feature map is input into the second result output layer to obtain a boundary detection result, including:

将第十特征图输入第六卷积层，得到第十三特征图；Input the tenth feature map into the sixth convolutional layer to obtain the thirteenth feature map;

对第十三特征图进行上采样，得到第十四特征图；Upsampling the thirteenth feature map to obtain the fourteenth feature map;

Input the fourteenth feature map into the activation function to obtain the boundary detection result;

其中，第六卷积层为1×1卷积层。Among them, the sixth convolutional layer is a 1×1 convolutional layer.

The second result output layer is similar to the first result output layer. In this embodiment of the present invention, the image is enlarged by upsampling, and the pixel values are mapped to between 0 and 1 by the activation function. The mapping results of the pixel coordinates of the fourteenth feature map constitute the boundary detection result. The activation function can be a sigmoid function, a softmax function, or the like.

在本发明的一个实施例中，获取训练样本，包括：In one embodiment of the present invention, acquiring training samples includes:

获取训练图像和区域标签；Get training images and region labels;

Perform a dilation operation on the region label to obtain a dilated image;

Perform an erosion operation on the region label to obtain an eroded image;

根据膨胀图像和腐蚀图像，确定边界标签。Based on the dilated and eroded images, the boundary labels are determined.

在本发明实施例中，鉴于CASIA 2.0中不存在边界标签，因此，本发明实施例基于区域标签生成边界标签。膨胀图像与腐蚀图像的差值为边界标签。膨胀操作和腐蚀操作可以采用7x7的窗口实现。通过本发明实施例能够更加便捷的获取边界标签，提高模型训练效率。In the embodiment of the present invention, since there is no boundary label in CASIA 2.0, the embodiment of the present invention generates a boundary label based on the area label. The difference between the dilated image and the eroded image is the boundary label. Dilation and erosion operations can be implemented with a 7x7 window. Through the embodiment of the present invention, the boundary label can be obtained more conveniently, and the model training efficiency can be improved.
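As a minimal sketch of this label generation, the following pure-Python code dilates and erodes a binary region label with a square window and subtracts the two results. The function names `morph` and `boundary_label`, the clipping of the window at image borders, and the 15×15 toy mask are illustrative assumptions rather than details from the source; a real pipeline would more likely rely on an image-processing library.

```python
def morph(mask, size, op):
    """Square-window filter on a 0/1 mask: op=max dilates, op=min erodes."""
    h, w = len(mask), len(mask[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Assumption: the window is clipped at the image border.
            window = [mask[p][q]
                      for p in range(max(0, i - r), min(h, i + r + 1))
                      for q in range(max(0, j - r), min(w, j + r + 1))]
            out[i][j] = op(window)
    return out


def boundary_label(region_label, size=7):
    """Boundary label = dilated image minus eroded image."""
    dilated = morph(region_label, size, max)
    eroded = morph(region_label, size, min)
    return [[d - e for d, e in zip(d_row, e_row)]
            for d_row, e_row in zip(dilated, eroded)]


# Toy region label: a 7x7 tampered square inside a 15x15 image.
label = [[1 if 4 <= i <= 10 and 4 <= j <= 10 else 0 for j in range(15)]
         for i in range(15)]
edge = boundary_label(label)
```

With a 7×7 window the resulting band is several pixels wide on both sides of the region outline, which may make the boundary label easier to learn than a one-pixel contour.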

在本发明的一个实施例中，该方法还包括：获取预训练样本；基于预训练样本对检测模型进行预训练；In an embodiment of the present invention, the method further includes: acquiring pre-training samples; pre-training the detection model based on the pre-training samples;

将训练图像输入经过预训练的检测模型，得到区域检测结果和边界检测结果。Input the training image into the pre-trained detection model to obtain the region detection results and boundary detection results.

In this embodiment of the present invention, the pre-training samples may be constructed from original images in the COCO dataset. For example, one image in the COCO dataset is selected as the original image, an object is cropped from another image, and the object is pasted into the original image after operations such as rotation and enlargement. Pre-training improves the training effect of the detection model and thereby the accuracy of tampered image detection.

如图2所示，本发明实施例提供了一种图像检测方法，包括：As shown in FIG. 2, an embodiment of the present invention provides an image detection method, including:

步骤201：获取预训练样本。Step 201: Obtain pre-training samples.

Select an original image from the COCO dataset, crop an object image from another image, and paste the object image into the original image after rotation and enlargement to obtain a pre-training sample.
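The pasting step can be sketched as follows. This is only an illustration under stated assumptions: images are single-channel nested lists, the rotation and enlargement of the object are omitted, and the helper name `paste_object` is hypothetical. Deriving the region label from the pasted area is also an assumption, since the source does not say how labels for pre-training samples are produced.

```python
def paste_object(background, patch, top, left):
    """Paste `patch` into a copy of `background`; return (image, region_label)."""
    h, w = len(background), len(background[0])
    image = [row[:] for row in background]   # leave the original image untouched
    label = [[0] * w for _ in range(h)]      # 1 marks pasted (tampered) pixels
    for di, patch_row in enumerate(patch):
        for dj, value in enumerate(patch_row):
            i, j = top + di, left + dj
            if 0 <= i < h and 0 <= j < w:    # drop pixels that fall outside the image
                image[i][j] = value
                label[i][j] = 1
    return image, label


background = [[0] * 6 for _ in range(5)]     # stand-in for the original image
patch = [[9, 9], [9, 9]]                     # stand-in for the cropped object
image, label = paste_object(background, patch, top=1, left=2)
```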

步骤202：基于预训练样本对检测模型进行预训练。Step 202: Pre-train the detection model based on the pre-training samples.

检测模型的架构如图3所示，下述实施例将对其架构进行详细说明。The architecture of the detection model is shown in FIG. 3 , and the following embodiments will describe the architecture in detail.

步骤203：获取训练图像和区域标签。Step 203: Obtain training images and region labels.

从CASIA 2.0中获取训练图像和区域标签。Get training images and region labels from CASIA 2.0.

Step 204: Perform a dilation operation on the region label to obtain a dilated image.

Step 205: Perform an erosion operation on the region label to obtain an eroded image.

膨胀操作和腐蚀操作采用的窗口尺寸为7×7。The window size used for dilation and erosion operations is 7 × 7.

如图4所示，从左到右依次是区域标签，膨胀图像、腐蚀图像，边界标签。As shown in Figure 4, from left to right are region labels, dilated images, eroded images, and boundary labels.

步骤206：根据膨胀图像和腐蚀图像，确定边界标签。Step 206: Determine boundary labels according to the dilated image and the eroded image.

训练图像、区域标签和边界标签构成训练样本。Training images, region labels, and boundary labels constitute training samples.

步骤207：将训练图像输入特征提取层，以从训练图像中提取出高阶特征图和低阶特征图。Step 207: Input the training image into the feature extraction layer to extract high-order feature maps and low-order feature maps from the training images.

Specifically, the training image is input into the backbone network, whose structure is shown in FIG. 5. As the figure shows, the backbone network includes an entry layer, a middle layer, and an exit layer. The entry layer includes five multi-channel convolutional layers and nine depthwise separable convolutional layers. Taking "Conv 32, 3x3, stride2" as an example, it denotes a multi-channel convolutional layer with 32 output channels, a 3x3 convolution kernel, and a stride of 2. The middle layer includes 16 identical depthwise separable convolutional layers. The exit layer includes one multi-channel convolutional layer and six depthwise separable convolutional layers. The low-order feature map is output by the third depthwise separable convolutional layer of the middle layer, and the first feature map is output by the exit layer.

Referring to FIG. 3, the first feature map is input in turn into three atrous convolutional layers with a 3x3 convolution kernel and dilation rates of 6, 12, and 18, a 1x1 convolutional layer, and a pooling layer, yielding multiple second feature maps. The second feature maps are concatenated along the Z axis and input into a 1x1 convolutional layer to obtain a high-order feature map that fuses features of different scales. In this embodiment of the present invention, the size of the low-order feature map is 1/4 of the training image, and the size of the high-order feature map is 1/16 of the training image. The multi-scale network can also have the structure shown in FIG. 6, which includes a 1x1 convolutional layer and dilated convolutions with dilation rates of 1, 2, and 5. In FIG. 6, the convolutional layers in each horizontal row share convolution kernel parameters, so that the same target has the same feature expression ability at different scales.
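To make the multi-scale effect concrete, the span covered by an atrous kernel can be computed with the standard relation k_eff = k + (k - 1)(d - 1), where k is the kernel size and d the dilation rate. The short sketch below applies it to the dilation rates named above; the function name is illustrative.

```python
def effective_kernel_size(kernel_size, dilation):
    """Side length of the area spanned by a dilated (atrous) convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# 3x3 kernels with the dilation rates used by the multi-scale network.
sizes = [effective_kernel_size(3, d) for d in (6, 12, 18)]
print(sizes)  # [13, 25, 37]
```

Each branch therefore looks at a differently sized neighborhood of the first feature map while keeping the same nine weights per kernel.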

步骤208：将高阶特征图和低阶特征图输入区域检测层，得到区域检测结果。Step 208: Input the high-order feature map and the low-order feature map into the region detection layer to obtain a region detection result.

Specifically, the low-order feature map is input into a 1x1 convolutional layer to obtain the fifth feature map. Bilinear interpolation is applied to the high-order feature map to enlarge it by a factor of four, yielding the sixth feature map. The fifth feature map and the sixth feature map are concatenated along the Z axis and then input into a 3x3 convolutional layer that fuses the features, yielding the third feature map.
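The factor of four follows from the feature map sizes. The sketch below tracks only spatial sizes, reading the 1/4 and 1/16 ratios as per-side scale factors and assuming that the convolutional layers preserve height and width; both readings are assumptions, and `branch_shapes` is a hypothetical helper.

```python
def branch_shapes(height, width):
    """Track spatial sizes through feature fusion and the result output layer."""
    if height % 16 or width % 16:
        raise ValueError("this sketch assumes sizes divisible by 16")
    low = (height // 4, width // 4)         # low-order feature map: 1/4 per side
    high = (height // 16, width // 16)      # high-order feature map: 1/16 per side
    high_up = (high[0] * 4, high[1] * 4)    # after 4x bilinear upsampling
    fused = low                             # Z-axis concatenation keeps height and width
    output = (fused[0] * 4, fused[1] * 4)   # after the final 4x upsampling
    return {"low": low, "high": high, "high_up": high_up,
            "fused": fused, "output": output}


shapes = branch_shapes(256, 384)
```

Under these assumptions the upsampled high-order map matches the low-order map, which is what makes the Z-axis concatenation possible, and the second 4× upsampling restores the input resolution.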

According to the pixel value at each pixel coordinate in the third feature map, the average pixel value of the third feature map is calculated, as shown in formula (1).

$$\mu_f = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} F[i,j] \tag{1}$$

where F[i,j] denotes the pixel value at pixel coordinate (i, j) in the third feature map, H denotes the height of the third feature map, W denotes the width of the third feature map, and μ_f denotes the average pixel value of the third feature map.

确定各个像素坐标的像素值与平均像素值的差，如式(2)所示。Determine the difference between the pixel value of each pixel coordinate and the average pixel value, as shown in equation (2).

$$D_f[i,j] = F[i,j] - \mu_f \tag{2}$$

where D_f[i,j] denotes the difference between the pixel value at pixel coordinate (i, j) and the average pixel value.

根据各个像素坐标的像素值与平均像素值的差，计算第三特征图的像素值标准差。According to the difference between the pixel value of each pixel coordinate and the average pixel value, the pixel value standard deviation of the third feature map is calculated.

根据像素值标准差、各个像素坐标的像素值与平均像素值的差，计算各个像素坐标的标准化像素值，如式(3)所示。According to the standard deviation of the pixel value and the difference between the pixel value of each pixel coordinate and the average pixel value, the normalized pixel value of each pixel coordinate is calculated, as shown in formula (3).

$$Z_f[i,j] = \frac{D_f[i,j]}{\max(\sigma_f,\ \varepsilon + \omega_{\sigma 1})} \tag{3}$$

where σ_f denotes the pixel value standard deviation of the third feature map, ε is 10^-5, and ω_σ1 is a first vector that is continuously adjusted during training.
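The region anomaly analysis described above can be sketched for a single channel as follows. The sketch makes several assumptions: the feature map is a nested list, the standard deviation is the population form (divided by H×W), and the learnable term is passed in as a plain scalar `omega` rather than a trained vector. The function name `standardize` is illustrative.

```python
import math


def standardize(feature, omega=0.0, eps=1e-5):
    """Normalized pixel values of one channel, following formulas (1) to (3)."""
    h, w = len(feature), len(feature[0])
    mean = sum(map(sum, feature)) / (h * w)                # formula (1)
    diff = [[v - mean for v in row] for row in feature]    # formula (2)
    # Assumption: population standard deviation over all H*W pixels.
    sigma = math.sqrt(sum(d * d for row in diff for d in row) / (h * w))
    denom = max(sigma, eps + omega)                        # denominator of formula (3)
    return [[d / denom for d in row] for row in diff]


z = standardize([[1.0, 1.0], [1.0, 5.0]])
flat = standardize([[2.0, 2.0], [2.0, 2.0]])
```

The `max` in the denominator keeps the division well defined when a feature map is constant, as in the second call, where every normalized value is zero.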

将第四特征图输入1×1卷积层，得到第七特征图。通过双线性插值将第七特征图放大四倍，得到第八特征图。将第八特征图输入sigmoid函数，得到区域检测结果。Input the fourth feature map into a 1×1 convolutional layer to get the seventh feature map. The seventh feature map is enlarged four times by bilinear interpolation to obtain the eighth feature map. Input the eighth feature map into the sigmoid function to obtain the region detection result.

步骤209：将高阶特征图和低阶特征图输入边界检测层，得到边界检测结果。Step 209: Input the high-order feature map and the low-order feature map into the boundary detection layer to obtain a boundary detection result.

Specifically, the low-order feature map is input into a 1x1 convolutional layer to obtain the eleventh feature map. Bilinear interpolation is applied to the high-order feature map to enlarge it by a factor of four, yielding the twelfth feature map. The eleventh feature map and the twelfth feature map are concatenated along the Z axis and then input into a 3x3 convolutional layer that fuses the features, yielding the ninth feature map.

According to the pixel value at each pixel coordinate of the ninth feature map within the detection window, the average pixel value of the detection window is calculated, as shown in formula (4).

$$\mu_f^{7\times 7}[i,j] = \frac{1}{7 \times 7} \sum_{(p,q) \in \Omega[i,j]} F[p,q] \tag{4}$$

where μ_f^{7×7}[i,j] denotes the average pixel value of the detection window with a height of 7 and a width of 7, Ω[i,j] denotes the set of pixel coordinates in the detection window where pixel coordinate (i, j) is located, and F[p,q] denotes the pixel value at pixel coordinate (p, q) in the ninth feature map.

Determine the difference between the pixel value at each pixel coordinate and the average pixel value of the detection window where the pixel coordinate is located, as shown in formula (5).

$$D_f^{7\times 7}[i,j] = F[i,j] - \mu_f^{7\times 7}[i,j] \tag{5}$$

where D_f^{7×7}[i,j] denotes the difference between the pixel value at pixel coordinate (i, j) and the average pixel value of the detection window where the pixel coordinate is located.

Calculate the pixel value standard deviation of the ninth feature map.

According to the pixel value standard deviation and the difference between the pixel value at each pixel coordinate and the average pixel value of the detection window where the pixel coordinate is located, the normalized pixel value of each pixel coordinate within the detection window is calculated, as shown in formula (6).

$$Z_f^{7\times 7}[i,j] = \frac{D_f^{7\times 7}[i,j]}{\max(\sigma_f,\ \varepsilon + \omega_{\sigma 2})} \tag{6}$$

In this embodiment of the present invention, the ninth feature map is the same as the third feature map, so the pixel value standard deviations of the two are the same. ω_σ2 is a second vector that is continuously adjusted during training.
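The boundary anomaly analysis differs from the region case only in the mean it subtracts. A minimal single-channel sketch follows, assuming the detection window is centered on each pixel and clipped at the image border, and again treating the learnable term as a scalar `omega`; none of these choices are stated in the source, and `local_standardize` is an illustrative name.

```python
import math


def local_standardize(feature, window=7, omega=0.0, eps=1e-5):
    """Deviation of each pixel from its window mean, scaled by the global std."""
    h, w = len(feature), len(feature[0])
    r = window // 2
    mean = sum(map(sum, feature)) / (h * w)
    # Global (whole-map) standard deviation, population form by assumption.
    sigma = math.sqrt(sum((v - mean) ** 2 for row in feature for v in row) / (h * w))
    denom = max(sigma, eps + omega)
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Assumption: window centered on (i, j) and clipped at the border.
            vals = [feature[p][q]
                    for p in range(max(0, i - r), min(h, i + r + 1))
                    for q in range(max(0, j - r), min(w, j + r + 1))]
            local_mean = sum(vals) / len(vals)
            out[i][j] = (feature[i][j] - local_mean) / denom
    return out


# Vertical step edge: columns 0-3 are 0, columns 4-8 are 1.
step = [[0 if j < 4 else 1 for j in range(9)] for _ in range(9)]
z_local = local_standardize(step, window=3)
```

On this step edge the response is zero inside both flat areas and changes sign across the edge, which is the behavior the text attributes to pixel coordinates on either side of a tampering boundary.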

将第十特征图输入1×1卷积层，得到第十三特征图。通过双线性插值将第十三特征图放大四倍，得到第十四特征图。将第十四特征图输入sigmoid函数，得到边界检测结果。Input the tenth feature map into a 1×1 convolutional layer to get the thirteenth feature map. The thirteenth feature map is enlarged by a factor of four by bilinear interpolation to obtain the fourteenth feature map. Input the fourteenth feature map into the sigmoid function to get the boundary detection result.

步骤210：根据区域标签、边界标签、区域检测结果和边界检测结果，训练检测模型。Step 210: Train the detection model according to the region label, the boundary label, the region detection result and the boundary detection result.

根据区域标签和区域检测结果的差异，可以确定预测篡改区域与实际篡改区域的差异；根据边界标签和边界检测结果的差异，可以确定预测篡改边界与实际篡改边界的差异。本发明实施例采用交叉熵损失函数，包括区域检测和边界检测两部分，如式(7)-(9)所示。According to the difference between the area label and the area detection result, the difference between the predicted tampering area and the actual tampering area can be determined; according to the difference between the boundary label and the boundary detection result, the difference between the predicted tampering boundary and the actual tampering boundary can be determined. The embodiment of the present invention adopts a cross-entropy loss function, which includes two parts: region detection and boundary detection, as shown in equations (7)-(9).

Here, m denotes the number of training samples. The remaining quantities in formulas (7)-(9) denote, for training sample k: the region detection result; the boundary detection result; the value of the region label at pixel coordinate (i, j); the region detection result at pixel coordinate (i, j); the value of the boundary label at pixel coordinate (i, j); and the boundary detection result at pixel coordinate (i, j).

通过式(7)-(9)可以计算损失值，根据损失值调整检测模型的参数。The loss value can be calculated by formulas (7)-(9), and the parameters of the detection model can be adjusted according to the loss value.
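A two-part cross-entropy loss of this kind might be sketched as below. The exact form of formulas (7)-(9) is not recoverable from the text here, so the equal weighting of the two terms, the averaging over pixels and over the m samples, and the clamping constant are all assumptions, and the function names are illustrative.

```python
import math


def binary_cross_entropy(pred, label, eps=1e-7):
    """Mean per-pixel binary cross-entropy between a probability map and a 0/1 label."""
    total, count = 0.0, 0
    for pred_row, label_row in zip(pred, label):
        for p, y in zip(pred_row, label_row):
            p = min(max(p, eps), 1.0 - eps)   # avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count


def detection_loss(samples):
    """samples: list of (region_pred, region_label, boundary_pred, boundary_label)."""
    m = len(samples)
    region = sum(binary_cross_entropy(rp, rl) for rp, rl, _, _ in samples) / m
    boundary = sum(binary_cross_entropy(bp, bl) for _, _, bp, bl in samples) / m
    return region + boundary                  # assumption: unweighted sum of both parts


uniform = [[0.5, 0.5], [0.5, 0.5]]            # an uninformative prediction
labels = [[1, 0], [0, 1]]
loss = detection_loss([(uniform, labels, uniform, labels)])
```

For the uninformative prediction each part equals ln 2, so the sketch returns 2 ln 2; predictions closer to the labels give a smaller value.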

步骤211：基于训练好的检测模型，确定检测图像是否被篡改。Step 211: Based on the trained detection model, determine whether the detection image has been tampered with.

The detection model maps each input pixel value to between 0 and 1. If the value output by the sigmoid function is greater than a set value (0.5 in this embodiment of the present invention), the pixel coordinate is determined to be located in the tampered area; otherwise, it is located in the background area.
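The per-pixel decision can be sketched as follows, assuming the values entering the sigmoid are available as a nested list; `tamper_mask` is an illustrative name, and the sigmoid is written in a form that avoids overflow for large negative inputs.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tamper_mask(logits, threshold=0.5):
    """1 where the sigmoid output exceeds the set value, else 0 (background)."""
    return [[1 if sigmoid(v) > threshold else 0 for v in row] for row in logits]


mask = tamper_mask([[-2.0, 0.0], [0.3, 4.0]])
print(mask)  # [[0, 0], [1, 1]]
```

An input of exactly 0 maps to 0.5, which is not greater than the set value, so that pixel is treated as background.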

In this embodiment of the present invention, Columbia and CASIA 1.0 are used as test sample sets, and the performance of the trained detection model is evaluated by the F1 score. The test results are shown in Table 1. As Table 1 shows, the detection model trained in this embodiment has the highest F1 score among the compared models, indicating that it outperforms them. RGB-N is a tampered image detection method based on a two-stream Faster R-CNN. NOI1 detects tampered images based on noise inconsistency and uses high-pass wavelet coefficients to model local noise. CFA is a CFA pattern estimation method that uses nearby pixels to approximate the camera filter array pattern and then produces a tampering probability for each pixel. DCT is a JPEG image tampering detection method based on differences in DCT coefficient histograms.

Table 1 F1 scores of different models

| Model | Columbia | CASIA 1.0 |
| --- | --- | --- |
| Detection model | 0.747 | 0.435 |
| RGB-N | 0.697 | 0.408 |
| NOI1 | 0.574 | 0.263 |
| DCT | 0.520 | 0.301 |
| CFA | 0.503 | 0.212 |
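For reference, an F1 score between a predicted mask and a ground-truth mask can be computed as in the sketch below. Evaluating at the pixel level is an assumption, since the source does not state how the scores in Table 1 were aggregated, and `f1_score` is an illustrative helper.

```python
def f1_score(pred_mask, true_mask):
    """Pixel-level F1 = 2*TP / (2*TP + FP + FN) for two 0/1 masks."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1        # tampered pixel correctly detected
            elif p:
                fp += 1        # background pixel flagged as tampered
            elif t:
                fn += 1        # tampered pixel missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0   # convention: 0 when nothing is positive


score = f1_score([[1, 1], [0, 0]], [[1, 0], [1, 0]])
print(score)  # 0.5
```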

如图7所示，本发明实施例提供了一种图像检测装置，包括：As shown in FIG. 7 , an embodiment of the present invention provides an image detection apparatus, including:

获取模块701，配置为获取训练样本；其中，训练样本，包括：训练图像、区域标签和边界标签；The obtaining module 701 is configured to obtain training samples; wherein, the training samples include: training images, region labels and boundary labels;

训练模块702，配置为将训练图像输入检测模型，得到区域检测结果和边界检测结果；根据区域标签、边界标签、区域检测结果和边界检测结果，训练检测模型；The training module 702 is configured to input the training image into the detection model to obtain the region detection result and the boundary detection result; train the detection model according to the region label, the boundary label, the region detection result and the boundary detection result;

检测模块703，配置为基于训练好的检测模型，确定检测图像是否被篡改。The detection module 703 is configured to determine whether the detected image has been tampered with based on the trained detection model.

训练模块702，配置为将训练图像输入特征提取层，以从训练图像中提取出高阶特征图和低阶特征图；将高阶特征图和低阶特征图输入区域检测层，得到区域检测结果；将高阶特征图和低阶特征图输入边界检测层，得到边界检测结果。The training module 702 is configured to input the training image into the feature extraction layer, so as to extract the high-order feature map and the low-order feature map from the training image; input the high-order feature map and the low-order feature map into the region detection layer to obtain the region detection result ; Input the high-order feature map and low-order feature map into the boundary detection layer to obtain the boundary detection result.

In an embodiment of the present invention, the training module 702 is configured to input the training image into the backbone network to obtain a low-order feature map and a first feature map; extract multi-scale features from the first feature map based on a multi-scale network to obtain multiple second feature maps; and concatenate the multiple second feature maps and input them into the first convolutional layer to obtain a high-order feature map; wherein the backbone network includes a first multi-channel convolutional layer and a depthwise separable convolutional layer, and the first convolutional layer is a 1×1 convolutional layer.

在本发明的一个实施例中，多尺度网络包括：空洞卷积层、第二卷积层和池化层；其中，第二卷积层为1×1卷积层。In an embodiment of the present invention, the multi-scale network includes: an atrous convolutional layer, a second convolutional layer, and a pooling layer; wherein the second convolutional layer is a 1×1 convolutional layer.

In an embodiment of the present invention, the region detection layer includes a first feature fusion layer, a regional anomaly analysis layer, and a first result output layer; the training module 702 is configured to input the high-order feature map and the low-order feature map into the first feature fusion layer to obtain a third feature map; determine a fourth feature map according to the third feature map and the regional anomaly analysis layer, wherein the fourth feature map represents the pixel value difference between the tampered area and the background area in the third feature map; and input the fourth feature map into the first result output layer to obtain the region detection result.

在本发明的一个实施例中，训练模块702，配置为将低阶特征图输入第三卷积层，得到第五特征图；对高阶特征图进行上采样，得到第六特征图；将第五特征图和第六特征图拼接后输入第二多通道卷积层，得到第三特征图；其中，第三卷积层为1×1卷积层。In an embodiment of the present invention, the training module 702 is configured to input the low-order feature map into the third convolutional layer to obtain the fifth feature map; upsample the high-order feature map to obtain the sixth feature map; The fifth feature map and the sixth feature map are spliced and input into the second multi-channel convolutional layer to obtain the third feature map; wherein, the third convolutional layer is a 1×1 convolutional layer.

In an embodiment of the present invention, the training module 702 is configured to calculate the average pixel value of the third feature map according to the pixel value at each pixel coordinate in the third feature map; determine the difference between the pixel value at each pixel coordinate and the average pixel value; calculate the pixel value standard deviation of the third feature map according to these differences; calculate the normalized pixel value of each pixel coordinate according to the pixel value standard deviation and the difference between the pixel value at each pixel coordinate and the average pixel value; and determine the fourth feature map according to the normalized pixel values of the pixel coordinates.

在本发明的一个实施例中，训练模块702，配置为将第四特征图输入第四卷积层，得到第七特征图；对第七特征图进行上采样，得到第八特征图；将第八特征图输入激活函数，得到区域检测结果；其中，第四卷积层为1×1卷积层。In an embodiment of the present invention, the training module 702 is configured to input the fourth feature map into the fourth convolution layer to obtain the seventh feature map; upsample the seventh feature map to obtain the eighth feature map; The eight feature maps are input to the activation function to obtain the region detection results; among them, the fourth convolutional layer is a 1×1 convolutional layer.

In an embodiment of the present invention, the boundary detection layer includes a second feature fusion layer, a boundary anomaly analysis layer, and a second result output layer; the training module 702 is configured to input the high-order feature map and the low-order feature map into the second feature fusion layer to obtain a ninth feature map; determine a tenth feature map according to the ninth feature map and the boundary anomaly analysis layer, wherein the tenth feature map represents the pixel value difference between the tampered area and the background area within the detection window; and input the tenth feature map into the second result output layer to obtain the boundary detection result.

在本发明的一个实施例中，训练模块702，配置为将低阶特征图输入第五卷积层，得到第十一特征图；对高阶特征图进行上采样，得到第十二特征图；将第十一特征图和第十二特征图拼接后输入第三多通道卷积层，得到第九特征图；其中，第五卷积层为1×1卷积层。In an embodiment of the present invention, the training module 702 is configured to input the low-order feature map into the fifth convolutional layer to obtain an eleventh feature map; perform upsampling on the high-order feature map to obtain the twelfth feature map; The eleventh feature map and the twelfth feature map are spliced and input into the third multi-channel convolutional layer to obtain the ninth feature map; among them, the fifth convolutional layer is a 1×1 convolutional layer.

In an embodiment of the present invention, the training module 702 is configured to calculate the average pixel value of the detection window according to the pixel value at each pixel coordinate of the ninth feature map within the detection window; determine the difference between the pixel value at each pixel coordinate and the average pixel value of the detection window where the pixel coordinate is located; calculate the pixel value standard deviation of the ninth feature map; calculate the normalized pixel value of each pixel coordinate within the detection window according to the pixel value standard deviation and that difference; and determine the tenth feature map according to the normalized pixel values of the pixel coordinates within the detection window.

In an embodiment of the present invention, the training module 702 is configured to input the tenth feature map into the sixth convolutional layer to obtain the thirteenth feature map; upsample the thirteenth feature map to obtain the fourteenth feature map; and input the fourteenth feature map into the activation function to obtain the boundary detection result; wherein the sixth convolutional layer is a 1×1 convolutional layer.

In an embodiment of the present invention, the acquisition module 701 is configured to acquire a training image and a region label; perform a dilation operation on the region label to obtain a dilated image; perform an erosion operation on the region label to obtain an eroded image; and determine the boundary label according to the dilated image and the eroded image.

In an embodiment of the present invention, the acquisition module 701 is configured to: acquire pre-training samples; pre-train the detection model based on the pre-training samples; and input the training image into the pre-trained detection model to obtain the region detection result and the boundary detection result.

An embodiment of the present invention provides an electronic device, including:

one or more processors;

a storage device for storing one or more programs;

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any of the above embodiments.

An embodiment of the present invention provides a computer-readable medium on which a computer program is stored; when the program is executed by a processor, the method described in any of the above embodiments is implemented.

FIG. 8 shows an exemplary system architecture 800 to which the image detection method or the image detection apparatus according to an embodiment of the present invention may be applied.

As shown in FIG. 8, the system architecture 800 may include terminal devices 801, 802 and 803, a network 804 and a server 805. The network 804 is the medium that provides communication links between the terminal devices 801, 802, 803 and the server 805. The network 804 may include various connection types, such as wired or wireless communication links or fiber optic cables.

A user can use the terminal devices 801, 802, 803 to interact with the server 805 through the network 804, for example to receive or send messages. Various communication client applications may be installed on the terminal devices 801, 802 and 803, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients and social platform software (examples only).

The terminal devices 801, 802, 803 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers and desktop computers.

The server 805 may be a server that provides various services, such as a background management server that supports shopping websites browsed by users on the terminal devices 801, 802 and 803 (example only). The background management server can analyze and otherwise process received data such as product information query requests, and feed the processing results (for example, targeted push information or product information; examples only) back to the terminal device.

It should be noted that the image detection method provided in the embodiments of the present invention is generally executed by the server 805; accordingly, the image detection apparatus is generally arranged in the server 805.

It should be understood that the numbers of terminal devices, networks and servers in FIG. 8 are merely illustrative. There may be any number of terminal devices, networks and servers, depending on implementation needs.

Reference is now made to FIG. 9, which shows a schematic structural diagram of a computer system 900 suitable for implementing a terminal device according to an embodiment of the present invention. The terminal device shown in FIG. 9 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.

As shown in FIG. 9, the computer system 900 includes a central processing unit (CPU) 901, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage section 908 into a random access memory (RAM) 903. The RAM 903 also stores various programs and data required for the operation of the system 900. The CPU 901, the ROM 902 and the RAM 903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.

The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse and the like; an output section 907 including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker and the like; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card or a modem. The communication section 909 performs communication processing via a network such as the Internet. A drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 910 as needed, so that a computer program read from it can be installed into the storage section 908 as needed.

In particular, according to the disclosed embodiments of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the disclosed embodiments include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 909, and/or installed from the removable medium 911. When the computer program is executed by the central processing unit (CPU) 901, the above-described functions defined in the system of the present invention are performed.

It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transport a program for use by or in conjunction with an instruction execution system, apparatus or device. Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the foregoing.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The modules described in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, a processor may be described as including a sending module, an obtaining module, a determining module and a first processing module. The names of these modules do not, in some cases, limit the modules themselves; for example, the sending module may also be described as "a module that sends a picture acquisition request to the connected server".

As another aspect, the present invention also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs, and when the one or more programs are executed by such a device, the device is caused to include:

According to the technical solutions of the embodiments of the present invention, boundary detection and region detection are performed on an image based on the detection model. Region detection identifies the tampered region from the feature difference between the tampered region and the background region across the entire image, so it focuses on the overall characteristics of the image. Boundary detection identifies the tampered boundary from the feature difference between the two sides of that boundary. Boundary detection can therefore assist region detection, locating the tampered region more precisely and improving the accuracy of image detection.
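Training against both kinds of label can be illustrated with a combined loss. The sketch below is an assumption for illustration only: this passage does not specify a loss function, so binary cross-entropy, the `boundary_weight` parameter and the names `bce` and `joint_loss` are hypothetical choices.

```python
import math
from typing import List

Grid = List[List[float]]


def bce(pred: Grid, label: Grid, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between a probability map and a 0/1 label map."""
    total, count = 0.0, 0
    for pred_row, label_row in zip(pred, label):
        for p, t in zip(pred_row, label_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count


def joint_loss(region_pred: Grid, region_label: Grid,
               boundary_pred: Grid, boundary_label: Grid,
               boundary_weight: float = 1.0) -> float:
    """Region loss plus a weighted boundary loss (assumed combination)."""
    return (bce(region_pred, region_label)
            + boundary_weight * bce(boundary_pred, boundary_label))


label = [[1, 0]]
good = [[0.9, 0.1]]  # predictions that agree with the label
bad = [[0.1, 0.9]]   # predictions that contradict the label
loss_good = joint_loss(good, label, good, label)
loss_bad = joint_loss(bad, label, bad, label)
```

Because both terms are driven by the same shared features, errors on the boundary branch also influence the features used by the region branch, which is one way the boundary task could assist region detection.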

The specific embodiments described above do not limit the protection scope of the present invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations and substitutions may occur depending on design requirements and other factors. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. An image detection method, comprising:

obtaining a training sample, wherein the training sample comprises: a training image, a region label and a boundary label;

inputting the training image into a detection model to obtain a region detection result and a boundary detection result;

training the detection model according to the region label, the boundary label, the region detection result and the boundary detection result;

and determining, based on the trained detection model, whether a detection image has been tampered with.

2. The method of claim 1, wherein

the detection model comprises: a feature extraction layer, a region detection layer and a boundary detection layer;

the inputting the training image into the detection model to obtain a region detection result and a boundary detection result comprises:

inputting the training image into the feature extraction layer to extract a high-order feature map and a low-order feature map from the training image;

inputting the high-order feature map and the low-order feature map into the region detection layer to obtain the region detection result;

and inputting the high-order feature map and the low-order feature map into the boundary detection layer to obtain the boundary detection result.

3. The method of claim 2, wherein

the inputting the training image into the feature extraction layer to extract a high-order feature map and a low-order feature map from the training image comprises:

inputting the training image into a backbone network to obtain the low-order feature map and a first feature map;

extracting multi-scale features from the first feature map based on a multi-scale network to obtain a plurality of second feature maps;

concatenating the plurality of second feature maps and inputting the concatenated result into a first convolutional layer to obtain the high-order feature map;

wherein the backbone network comprises: a first multi-channel convolutional layer and a depthwise separable convolutional layer; and the first convolutional layer is a 1 × 1 convolutional layer.

4. The method of claim 3, wherein

the multi-scale network comprises: a dilated (atrous) convolutional layer, a second convolutional layer and a pooling layer;

wherein the second convolutional layer is a 1 × 1 convolutional layer.

5. The method of claim 2, wherein

the region detection layer comprises: a first feature fusion layer, a region anomaly analysis layer and a first result output layer;

the inputting the high-order feature map and the low-order feature map into the region detection layer to obtain the region detection result comprises:

inputting the high-order feature map and the low-order feature map into the first feature fusion layer to obtain a third feature map;

determining a fourth feature map according to the third feature map and the region anomaly analysis layer, wherein the fourth feature map characterizes the difference in pixel values between the tampered region and the background region in the third feature map;

and inputting the fourth feature map into the first result output layer to obtain the region detection result.

6. The method of claim 5, wherein

the inputting the high-order feature map and the low-order feature map into the first feature fusion layer to obtain a third feature map comprises:

inputting the low-order feature map into a third convolutional layer to obtain a fifth feature map;

upsampling the high-order feature map to obtain a sixth feature map;

concatenating the fifth feature map and the sixth feature map, and inputting the concatenated result into a second multi-channel convolutional layer to obtain the third feature map;

wherein the third convolutional layer is a 1 × 1 convolutional layer.

7. The method of claim 5, wherein

the determining a fourth feature map according to the third feature map and the region anomaly analysis layer comprises:

calculating an average pixel value of the third feature map according to the pixel value of each pixel coordinate in the third feature map;

determining the difference between the pixel value of each pixel coordinate and the average pixel value;

calculating a pixel value standard deviation of the third feature map according to the difference between the pixel value of each pixel coordinate and the average pixel value;

calculating a normalized pixel value of each pixel coordinate according to the pixel value standard deviation and the difference between the pixel value of each pixel coordinate and the average pixel value;

and determining the fourth feature map according to the normalized pixel value of each pixel coordinate.

8. The method according to any one of claims 5 to 7, wherein

the inputting the fourth feature map into the first result output layer to obtain the region detection result comprises:

inputting the fourth feature map into a fourth convolutional layer to obtain a seventh feature map;

upsampling the seventh feature map to obtain an eighth feature map;

inputting the eighth feature map into an activation function to obtain the region detection result;

wherein the fourth convolutional layer is a 1 × 1 convolutional layer.

9. The method of claim 2, wherein

the boundary detection layer comprises: a second feature fusion layer, a boundary anomaly analysis layer and a second result output layer;

the inputting the high-order feature map and the low-order feature map into the boundary detection layer to obtain the boundary detection result comprises:

inputting the high-order feature map and the low-order feature map into the second feature fusion layer to obtain a ninth feature map;

determining a tenth feature map according to the ninth feature map and the boundary anomaly analysis layer, wherein the tenth feature map characterizes the difference in pixel values between the tampered region and the background region within a detection window;

and inputting the tenth feature map into the second result output layer to obtain the boundary detection result.

10. The method of claim 9, wherein

the inputting the high-order feature map and the low-order feature map into the second feature fusion layer to obtain a ninth feature map comprises:

inputting the low-order feature map into a fifth convolutional layer to obtain an eleventh feature map;

upsampling the high-order feature map to obtain a twelfth feature map;

concatenating the eleventh feature map and the twelfth feature map, and inputting the concatenated result into a third multi-channel convolutional layer to obtain the ninth feature map;

wherein the fifth convolutional layer is a 1 × 1 convolutional layer.

11. The method of claim 9, wherein

the determining a tenth feature map according to the ninth feature map and the boundary anomaly analysis layer comprises:

calculating the average pixel value of the detection window according to the pixel value of each pixel coordinate of the ninth feature map within the detection window;

determining the difference between the pixel value of each pixel coordinate and the average pixel value of the detection window in which the pixel coordinate is located;

calculating a pixel value standard deviation of the ninth feature map;

calculating a normalized pixel value of each pixel coordinate within the detection window according to the pixel value standard deviation and the difference between the pixel value of each pixel coordinate and the average pixel value of the detection window in which the pixel coordinate is located;

and determining the tenth feature map according to the normalized pixel values of the pixel coordinates within the detection window.

12. The method according to any one of claims 9 to 11, wherein

the inputting the tenth feature map into the second result output layer to obtain the boundary detection result comprises:

inputting the tenth feature map into a sixth convolutional layer to obtain a thirteenth feature map;

upsampling the thirteenth feature map to obtain a fourteenth feature map;

inputting the fourteenth feature map into an activation function to obtain the region detection result;

wherein the sixth convolutional layer is a 1 × 1 convolutional layer.

13. The method of claim 1, wherein

the obtaining a training sample comprises:

acquiring the training image and the region label;

performing a dilation operation on the region label to obtain a dilated image;

performing an erosion operation on the region label to obtain an eroded image;

and determining the boundary label according to the dilated image and the eroded image.

14. The method of claim 1, further comprising:

obtaining pre-training samples;

pre-training the detection model based on the pre-training samples;

wherein the inputting the training image into a detection model to obtain a region detection result and a boundary detection result comprises:

inputting the training image into the pre-trained detection model to obtain the region detection result and the boundary detection result.

15. An image detection apparatus, characterized by comprising:

an acquisition module configured to acquire a training sample, wherein the training sample comprises: a training image, a region label and a boundary label;

a training module configured to input the training image into a detection model to obtain a region detection result and a boundary detection result, and to train the detection model according to the region label, the boundary label, the region detection result and the boundary detection result;

and a detection module configured to determine, based on the trained detection model, whether a detection image has been tampered with.

16. An electronic device, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-14.

17. A computer-readable medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-14.