CN113780523B - Image processing method, device, terminal equipment and storage medium - Google Patents

Image processing method, device, terminal equipment and storage medium

Info

Publication number
CN113780523B
Authority
CN
China
Prior art keywords
parameter set
fixed point
dimension
input data
scaling factor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110994479.5A
Other languages
Chinese (zh)
Other versions
CN113780523A (en
Inventor
杨海辉
蔡万伟
尹长生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd
Priority to CN202110994479.5A
Publication of CN113780523A
Application granted
Publication of CN113780523B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/499 Denomination or exception handling, e.g. rounding or overflow
    • G06F7/49905 Exception handling
    • G06F7/4991 Overflow or underflow
    • G06F7/49915 Mantissa overflow or underflow in handling floating-point numbers
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Neurology (AREA)
  • Complex Calculations (AREA)
  • Image Processing (AREA)

Abstract

The application provides an image processing method, an image processing device, a terminal device and a storage medium. The image processing method comprises the following steps: acquiring an image to be processed; inputting the image to be processed into a neural network to determine input data of any convolution layer in the neural network; if the input data is represented by floating point numbers, obtaining a scaling factor of the input data, and converting the input data from the floating point number representation to a fixed point number representation according to the scaling factor; obtaining filter parameters represented by fixed point numbers and bias parameters represented by fixed point numbers of the convolution layer; and performing a convolution operation on the input data, the filter parameters and the bias parameters, all represented by fixed point numbers, to obtain output data of the convolution layer represented by fixed point numbers, wherein the output data is a convolution operation result of the image to be processed. The method and the device address the problem in the prior art that convolution operations on images consume a large amount of storage resources and computing resources.

Description

Image processing method, device, terminal equipment and storage medium
Technical Field
The application belongs to the technical field of image processing, and particularly relates to an image processing method, an image processing device, terminal equipment and a storage medium.
Background
With the major breakthroughs of deep learning in the field of image processing, research on deep learning based on neural networks has surged. A neural network is a floating point model obtained through training; it consists of a series of operators and involves a large number of intensive operations. Convolution is a common operator in neural networks. When an image is processed by a neural network, the convolution operations on the image consume a large amount of storage resources and computing resources, because the network has many parameters and the data volume is large.
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, terminal equipment and a storage medium, so as to solve the problem that a large amount of storage resources and calculation resources are required to be consumed in convolution operation of images in the prior art.
In a first aspect, an embodiment of the present application provides an image processing method, including:
acquiring an image to be processed;
inputting the image to be processed into a neural network to determine input data of any convolution layer in the neural network;
if the input data is represented by floating point numbers, obtaining a scaling factor of the input data, and converting the input data from the floating point number representation to a fixed point number representation according to the scaling factor of the input data;
obtaining filter parameters represented by fixed point numbers and bias parameters represented by fixed point numbers of the convolution layer;
and performing a convolution operation on the input data represented by fixed point numbers, the filter parameters represented by fixed point numbers and the bias parameters represented by fixed point numbers to obtain output data of the convolution layer represented by fixed point numbers, wherein the output data is a convolution operation result of the image to be processed.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
the image acquisition module is used for acquiring an image to be processed;
the image input module is used for inputting the image to be processed into a neural network so as to determine the input data of any convolution layer in the neural network;
the data processing module is used for acquiring the scaling factor of the input data if the input data is represented by floating point numbers, and converting the floating point number representation of the input data into fixed point number representation according to the scaling factor of the input data;
the parameter acquisition module is used for acquiring the filter parameters represented by fixed point numbers and the bias parameters represented by fixed point numbers of the convolution layer;
and the convolution operation module is used for performing a convolution operation on the input data represented by fixed point numbers, the filter parameters represented by fixed point numbers and the bias parameters represented by fixed point numbers to obtain output data of the convolution layer represented by fixed point numbers, wherein the output data is a convolution operation result of the image to be processed.
In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the image processing method according to the first aspect described above when the processor executes the computer program.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the image processing method according to the first aspect described above.
In a fifth aspect, embodiments of the present application provide a computer program product which, when run on a terminal device, causes the terminal device to perform the steps of the image processing method according to the first aspect described above.
From the above, the present application acquires an image to be processed and inputs it into a neural network, so that the input data of any convolution layer in the neural network can be determined. The filter parameters and the bias parameters of the convolution layer, both represented by fixed point numbers, are obtained. When the input data is represented by floating point numbers, the scaling factor of the input data is acquired and the input data is converted from the floating point number representation to a fixed point number representation according to that scaling factor. A convolution operation is then performed on the input data, the filter parameters and the bias parameters, all represented by fixed point numbers, to obtain the output data of the convolution layer represented by fixed point numbers (that is, the convolution operation result of the image to be processed). Because all data involved in the convolution operation are represented by fixed point numbers, the amount of computation for the convolution operation on the image to be processed is reduced, the consumption of storage resources and computing resources is lowered, and the efficiency of the convolution operation is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required for the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flowchart of an implementation of an image processing method according to a first embodiment of the present application;
Fig. 2 is a schematic flowchart of an implementation of an image processing method according to a second embodiment of the present application;
Fig. 3 is a schematic structural diagram of an image processing apparatus according to a third embodiment of the present application;
Fig. 4 is a schematic structural diagram of a terminal device according to a fourth embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
The order in which the steps are described should not be construed as implying that the steps are necessarily order dependent.
In order to illustrate the technical solutions described in the present application, the following description is made by specific examples.
Referring to Fig. 1, which shows a flowchart of an implementation of an image processing method according to a first embodiment of the present application, the image processing method is applied to a terminal device. As shown in Fig. 1, the image processing method may include the following steps:
Step 101, acquiring an image to be processed.
The image to be processed may refer to an image that needs to be processed (e.g., recognized, tracked, etc.) by a neural network. For example, the image to be processed is a photo, and the neural network identifies the category of the photo as a portrait.
Step 102, inputting the image to be processed into the neural network to determine the input data of any convolution layer in the neural network.
The neural network may refer to any network including a convolutional layer, for example, a convolutional neural network. When an image to be processed is processed through a neural network, it is generally required to perform convolution operations on the image to be processed, where the number of convolution operations is related to the number of convolution layers in the neural network, for example, the neural network includes two convolution layers, and then it is required to perform convolution operations on the image to be processed twice.
In one embodiment, the neural network may be installed in a terminal device, and the terminal device may acquire the data (i.e., the input data) input to any convolution layer when it detects that data is being input to that convolution layer. To accelerate practical deployment of the neural network, a dedicated neural network chip can be integrated in the terminal device, and the neural network is run on that chip.
In one embodiment, after the image to be processed is acquired, it may be input to the neural network, so that the input data of any convolution layer in the neural network is determined based on the input image. The input data may be described taking the i-th convolution layer of the neural network as an example, where the i-th convolution layer is any convolution layer in the neural network and i is an integer greater than zero and less than or equal to the total number of convolution layers in the neural network. When i is equal to 1, the i-th convolution layer is the first convolution layer of the neural network, and its input data is the original input data of the neural network (i.e., the image to be processed). When i is greater than 1, the input data of the i-th convolution layer is the output data of the (i-1)-th convolution layer; because that output data is obtained through the method of the present application, it is already represented by fixed point numbers and needs no data conversion.
Since the input data of each convolution layer in the neural network is determined based on the image to be processed, the convolution operations of all the convolution layers in the neural network can be referred to as convolution operations on the image to be processed. For example, suppose the neural network includes two convolution layers. The input data of the first convolution layer is the image to be processed, and performing the convolution operation of the present application in the first convolution layer yields a first convolution operation result of the image to be processed (i.e., the output data of the first convolution layer). This first convolution operation result is the input data of the second convolution layer, and performing the convolution operation of the present application on it in the second convolution layer yields a second convolution operation result of the image to be processed (i.e., the output data of the second convolution layer). Because the input data of the second convolution layer is the first convolution operation result of the image to be processed, the input data of the second convolution layer is also determined based on the image to be processed.
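The layer-to-layer data flow described above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the function names `is_floating` and `run_conv_layers`, the use of flat Python lists, and the pluggable `quantize` callable are all assumptions introduced here for illustration.

```python
from typing import Callable, List, Sequence, Union

Number = Union[int, float]


def is_floating(data: Sequence[Number]) -> bool:
    """Return True if any element is a Python float (stand-in for a dtype check)."""
    return any(isinstance(v, float) for v in data)


def run_conv_layers(
    image: Sequence[Number],
    layers: List[Callable[[List[int]], List[int]]],
    quantize: Callable[[Sequence[Number]], List[int]],
) -> List[int]:
    """Feed each layer's fixed-point output to the next layer.

    Only data that arrives as floating point is quantized. For i > 1 the
    input is the previous layer's fixed-point output and passes through
    without conversion.
    """
    data: Sequence[Number] = image
    for layer in layers:
        if is_floating(data):
            data = quantize(data)  # hypothetical float-to-fixed conversion
        data = layer(list(data))   # each layer is assumed to return integers
    return list(data)
```

With two layers and a floating point image, `quantize` is called once, before the first layer; the second layer consumes the first layer's integer output directly.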
Step 103, if the input data is represented by a floating point number, obtaining a scaling factor of the input data, and converting the input data from the floating point number representation to the fixed point number representation according to the scaling factor of the input data.
When the convolution layer in step 102 is the first layer convolution layer, the input data of the first layer convolution layer is the original input data of the neural network, and the original input data is usually represented by floating point numbers, so in order to reduce the calculation amount of the first layer convolution layer, the input data may be quantized to convert the input data from the floating point number representation to the fixed point number representation.
When the terminal equipment quantizes the input data, the terminal equipment can firstly acquire the scaling factor of the input data, and quantize the input data according to the scaling factor of the input data. The scaling factor of the input data may be a factor that converts the input data from a floating point number representation to a fixed point number representation, or may be a factor that maps the input data from a floating point number range to a fixed point number range.
In one embodiment, the scaling factor of the input data may be stored in the terminal device in advance, and when the terminal device detects that the input data is represented by a floating point number, the terminal device may quickly acquire the scaling factor of the input data stored in the terminal device, so as to improve the speed of acquiring the scaling factor of the input data. When the neural network chip is integrated in the terminal device, the scaling factor of the input data may be stored in the neural network chip, so as to reduce the occupation of the memory of the terminal device. Alternatively, the scaling factor of the input data may be stored in other devices, and the terminal device may obtain the scaling factor of the input data from the other devices when detecting that the input data is represented by a floating point number, which is not limited herein.
Step 104, obtaining filter parameters represented by fixed point numbers and bias parameters represented by fixed point numbers of the convolution layer.
In one embodiment, the filter parameters and the bias parameters of the convolution layer, both represented by fixed point numbers, can be stored in the terminal device in advance. After acquiring the input data of the convolution layer, the terminal device can quickly read these stored parameters, which speeds up their acquisition. When the neural network chip is integrated in the terminal device, the fixed-point filter parameters and bias parameters may be stored in the neural network chip, so as to reduce the occupation of the memory of the terminal device. Optionally, the fixed-point filter parameters and bias parameters may also be stored in other devices, and the terminal device may obtain them from those devices when it acquires the input data, which is not limited herein.
Step 105, performing a convolution operation on the input data represented by fixed point numbers, the filter parameters represented by fixed point numbers and the bias parameters represented by fixed point numbers to obtain output data of the convolution layer represented by fixed point numbers, wherein the output data is a convolution operation result of the image to be processed.
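As a rough illustration of step 105, the sketch below performs a convolution entirely on integers. It is a simplified assumption, not the method of the application: it handles a single channel, stride 1 and no padding, and the function name `conv2d_fixed` is hypothetical. How the fixed-point bias is scaled relative to the input and the filter is not covered here.

```python
from typing import List

Matrix = List[List[int]]


def conv2d_fixed(x: Matrix, w: Matrix, bias: int) -> Matrix:
    """Valid, stride-1, single-channel 2-D convolution on integers.

    As in most neural network frameworks, the kernel is not flipped
    (strictly speaking this is cross-correlation).
    """
    kh, kw = len(w), len(w[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    out: Matrix = []
    for r in range(out_h):
        row: List[int] = []
        for c in range(out_w):
            acc = bias  # integer accumulator, seeded with the fixed-point bias
            for i in range(kh):
                for j in range(kw):
                    acc += x[r + i][c + j] * w[i][j]
            row.append(acc)
        out.append(row)
    return out
```

Because every operand is an integer, the accumulator stays an integer throughout, which is the property the fixed-point scheme relies on.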
The terminal device may quantize the convolution operators of all the convolution layers in the neural network through the steps 102 to 105 and the steps 202 to 208 in the second embodiment to obtain a fixed-point model (i.e., the neural network quantized by the convolution operators of all the convolution layers), so as to improve the data processing speed of the neural network.
In one embodiment, after obtaining the output data of the convolution layer represented by the fixed point number, the output data of the convolution layer represented by the fixed point number may be input to a specified network of the neural network to obtain a processing result (e.g., a recognition result, a tracking result, etc.) of the image to be processed.
The specified network refers to a network located after a convolution layer in the neural network, such as a pooling layer, a full connection layer and the like. By inputting the data output in step 105 to the designated network, the processing of the image to be processed can be completed, and a processing result can be obtained.
According to this embodiment of the application, the image to be processed is acquired and input into the neural network, so that the input data of any convolution layer in the neural network can be determined. The filter parameters and the bias parameters of the convolution layer, both represented by fixed point numbers, are obtained. When the input data is represented by floating point numbers, the scaling factor of the input data is acquired and the input data is converted from the floating point number representation to a fixed point number representation according to that scaling factor. A convolution operation is then performed on the input data, the filter parameters and the bias parameters, all represented by fixed point numbers, to obtain the output data of the convolution layer represented by fixed point numbers (namely, the convolution operation result of the image to be processed). Because all data involved in the convolution operation are represented by fixed point numbers, the amount of computation for the convolution operation on the image to be processed is reduced, the consumption of storage resources and computing resources is lowered, and the efficiency of the convolution operation is improved.
Referring to fig. 2, a flowchart of an implementation of an image processing method according to a second embodiment of the present application is shown, where the image processing method is applied to a terminal device. As shown in fig. 2, the image processing method may include the steps of:
Step 201, acquiring an image to be processed.
Step 201 is the same as step 101; for details, refer to the description of step 101, which is not repeated here.
Step 202, inputting the image to be processed into a neural network to determine input data of any convolution layer in the neural network.
Step 202 is the same as step 102; for details, refer to the description of step 102, which is not repeated here.
Step 203, if the input data is represented by floating point numbers, acquiring the fixed-point bit number of the input data and N sample input data represented by floating point numbers, where N is an integer greater than 1.
The fixed-point bit number of the input data may refer to the bit number occupied by the input data when the input data is represented by a fixed-point number.
Step 204, taking the absolute values of the N sample input data represented by floating point numbers, and using the maximum of these absolute values as the threshold of the input data.
Step 205, determining a scaling factor of the input data according to the threshold of the input data and the fixed-point bit number of the input data.
The threshold of the input data may refer to the maximum value of the input data. In actual deployment of the neural network, different data input to the convolution layer may differ in magnitude. Therefore, to obtain the threshold of the input data, N sample input data represented by floating point numbers may be acquired. These samples are themselves input data of the convolution layer, and the maximum of their absolute values gives the maximum value of the input data of the convolution layer.
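The threshold computation of step 204 can be sketched in a few lines. The function name `input_threshold` and the representation of each sample as a flat sequence of floats are assumptions made for illustration.

```python
from typing import Iterable, Sequence


def input_threshold(samples: Iterable[Sequence[float]]) -> float:
    """Largest absolute value found across all N sample inputs."""
    return max(abs(v) for sample in samples for v in sample)
```

For example, `input_threshold([[0.5, -3.2], [1.0, 2.5]])` returns `3.2`, because the element with the largest magnitude is `-3.2`.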
The terminal device may obtain the scaling factor of the input data according to formula (1).
Here, scale_x denotes the scaling factor of the input data, threshold_x denotes the threshold of the input data, and n_1 denotes the fixed-point bit number of the input data. For example, if the fixed-point bit number of the input data is 12 bits, then n_1 = 12.
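The exact form of formula (1) is not reproduced in this text. The sketch below therefore uses one common symmetric-quantization choice, scale = (2^(n - 1) - 1) / threshold, purely as an assumption; the formula in the application may differ. The function name `scaling_factor` is hypothetical.

```python
def scaling_factor(threshold: float, fixed_bits: int) -> float:
    """Map the range [-threshold, threshold] onto a signed fixed-point range.

    ASSUMED form (not taken from the source): (2**(fixed_bits - 1) - 1) / threshold.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if fixed_bits < 2:
        raise ValueError("fixed_bits must be at least 2 for a signed range")
    return (2 ** (fixed_bits - 1) - 1) / threshold
```

Under this assumption, with n_1 = 12 and a threshold of 4.0, the scaling factor is 2047 / 4.0 = 511.75, so the largest sample maps to the largest positive 12-bit signed value.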
Step 206, converting the input data from a floating point representation to a fixed point representation according to the scaling factor of the input data.
Since the number of bits of a fixed point number is limited, the terminal device may take into account the minimum value and the maximum value of the input data when represented by fixed point numbers, so as to improve quantization accuracy and avoid overflow after the input data is quantized. The terminal device may quantize the input data by calculating the input data represented by fixed point numbers (i.e., the fixed point number representation of the input data) from the input data represented by floating point numbers, the scaling factor of the input data, the minimum value of the input data when represented by fixed point numbers, and the maximum value of the input data when represented by fixed point numbers. Specifically, the input data can be quantized by formula (2).
Here, Q_int_x denotes the input data represented by fixed point numbers, Q_x denotes the input data represented by floating point numbers, -A_x denotes the minimum value of the input data when represented by fixed point numbers, A_x - 1 denotes the maximum value of the input data when represented by fixed point numbers, round denotes rounding the data to an integer, and clip denotes limiting the data to the range between the minimum value and the maximum value.
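Formula (2) itself is not reproduced in this text. A minimal sketch consistent with the symbols defined above is a round-then-clip operation. It assumes that the scaling factor multiplies the floating point value and that A_x = 2^(n_1 - 1); both are assumptions, as is the function name `quantize`.

```python
from typing import List, Sequence


def quantize(values: Sequence[float], scale: float, fixed_bits: int) -> List[int]:
    """Round scaled values to integers and clip them to [-A, A - 1].

    ASSUMPTIONS (not from the source): the value is multiplied by the
    scaling factor, and A = 2**(fixed_bits - 1).
    Python's round() rounds halves to the nearest even integer.
    """
    a = 2 ** (fixed_bits - 1)
    lo, hi = -a, a - 1
    return [max(lo, min(hi, round(v * scale))) for v in values]
```

With 8 bits the representable range is [-128, 127], so a value that scales to 1000 is clipped to 127 rather than overflowing.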
Step 207, obtaining filter parameters in fixed point number and bias parameters in fixed point number of the convolution layer.
Step 207 is the same as step 104; for details, refer to the description of step 104, which is not repeated here.
In one embodiment, the convolution layer includes filter parameter sets of M dimensions, the filter parameter set of each dimension including at least one filter parameter represented by a floating point number, M being an integer greater than zero, and obtaining the filter parameters of the convolution layer represented by fixed point numbers includes:
acquiring the fixed-point bit number of the filter parameters;
taking the absolute values of the filter parameters in a j-th dimension filter parameter set to obtain the maximum of the absolute values corresponding to the j-th dimension filter parameter set, and using this maximum as the threshold of the j-th dimension filter parameter set, wherein the j-th dimension filter parameter set is any one of the filter parameter sets of the M dimensions;
determining a scaling factor of the j-th dimension filter parameter set according to the fixed-point bit number of the filter parameters and the threshold of the j-th dimension filter parameter set;
and converting the filter parameters in the j-th dimension filter parameter set from the floating point number representation to the fixed point number representation according to the scaling factor of the j-th dimension filter parameter set.
Here, M refers to the number of channels of the output data of the convolution layer in step 202, so the filter parameter sets of the M dimensions can be understood as the filter parameters of M channels, with at least one filter parameter per channel.
The fixed-point bit number of the filter parameters may refer to the number of bits a filter parameter occupies when represented by a fixed point number.
To represent the filter parameters in the filter parameter set of each dimension by fixed point numbers, the j-th dimension filter parameter set is taken as an example, where j is an integer greater than zero and less than or equal to M. The threshold of the j-th dimension filter parameter set may refer to the maximum value of the filter parameters in that set. To obtain it, the absolute values of all filter parameters in the j-th dimension filter parameter set are taken first, and the maximum of these absolute values is the threshold of the j-th dimension filter parameter set.
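The per-channel threshold described above can be sketched as follows. Representing the filter parameters as a list of M flat lists (one per output channel) and the function name `filter_thresholds` are assumptions made for illustration.

```python
from typing import List, Sequence


def filter_thresholds(filter_sets: Sequence[Sequence[float]]) -> List[float]:
    """Return one threshold per dimension: the largest absolute filter value."""
    return [max(abs(p) for p in params) for params in filter_sets]
```

For three channels `[[0.1, -0.7], [2.0], [-0.3, 0.2, 0.25]]` the thresholds are `[0.7, 2.0, 0.3]`, one per channel.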
The terminal device may obtain the scaling factor of the j-th dimensional filter parameter set according to equation (3).
Where scale_w represents the scaling factor of the j-th dimension filter parameter set, threshold_w represents the threshold of the j-th dimension filter parameter set, and $n_2$ represents the fixed point bit number of the filter parameters. For example, if the fixed point bit number of the filter parameters is 8 bits, then $n_2 = 8$. It should be noted that, since the convolution layer of the neural network includes M-dimensional filter parameter sets, each filter parameter set has its own scaling factor.
Since the number of bits of a fixed point number is limited, when quantizing the filter parameters in the j-th dimension filter parameter set the terminal device may take into account the minimum and maximum values that these filter parameters can take when expressed as fixed point numbers, so that quantization remains accurate and the quantized filter parameters do not overflow. The terminal device may calculate the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers (i.e., the fixed point representation of the filter parameters in the j-th dimension filter parameter set) according to the filter parameters in the j-th dimension filter parameter set expressed as floating point numbers, the scaling factor of the j-th dimension filter parameter set, and the minimum and maximum values of the filter parameters in the j-th dimension filter parameter set when expressed as fixed point numbers. Specifically, the filter parameters of the j-th dimension filter parameter set may be quantized by formula (4).
Where $Q_{int\_w}$ represents the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers, $Q_w$ represents the filter parameters in the j-th dimension filter parameter set expressed as floating point numbers, $-A_w$ represents the minimum value of the filter parameters in the j-th dimension filter parameter set when expressed as fixed point numbers, and $A_w - 1$ represents the maximum value of the filter parameters in the j-th dimension filter parameter set when expressed as fixed point numbers.
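The per-channel filter quantization described above can be sketched in Python as follows. Formulas (3) and (4) are not reproduced in this text, so the forms used here are assumptions: the scaling factor is taken to be the smallest power of two scale_w for which threshold_w / scale_w does not exceed 2^(n_2 - 1), a filter parameter is quantized as round(Q_w / scale_w) clamped to [-A_w, A_w - 1], and A_w is taken to be 2^(n_2 - 1). The function names are hypothetical.

```python
import math


def power_of_two_scale(threshold: float, n_bits: int) -> float:
    """Assumed form of the scaling factor: the smallest power of two `scale`
    such that threshold / scale <= 2 ** (n_bits - 1)."""
    if threshold <= 0.0:
        return 1.0  # degenerate set (all zeros): any scale works
    # frexp gives ratio = mantissa * 2 ** exponent with 0.5 <= mantissa < 1,
    # which yields ceil(log2(ratio)) without floating point log error.
    mantissa, exponent = math.frexp(threshold / 2 ** (n_bits - 1))
    if mantissa == 0.5:
        exponent -= 1  # ratio is already an exact power of two
    return math.ldexp(1.0, exponent)


def quantize(values, scale: float, n_bits: int):
    """Assumed form of the quantization: round(value / scale), clamped to
    [-A, A - 1] with A = 2 ** (n_bits - 1). Python's round() rounds halves
    to the nearest even integer."""
    a = 2 ** (n_bits - 1)
    return [max(-a, min(a - 1, round(v / scale))) for v in values]


def quantize_filter_sets(filter_sets, n_bits: int = 8):
    """Quantize each of the M filter parameter sets with its own scale."""
    result = []
    for params in filter_sets:
        threshold = max(abs(p) for p in params)  # max of absolute values
        scale = power_of_two_scale(threshold, n_bits)
        result.append((scale, quantize(params, scale, n_bits)))
    return result


# Two channels (M = 2), three floating point filter parameters each.
filter_sets = [[0.5, -0.9, 0.1], [2.0, -3.5, 1.25]]
quantized = quantize_filter_sets(filter_sets, n_bits=8)
```

Each channel receives its own scaling factor because its threshold is computed only from its own parameters, which matches the statement that each filter parameter set has a scaling factor.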
In one embodiment, the convolutional layer includes M-dimensional bias parameter sets, each bias parameter set including at least one bias parameter expressed as a floating point number, and obtaining the bias parameters of the convolutional layer expressed as fixed point numbers includes:
acquiring the fixed point bit number of the bias parameter;
taking absolute values of bias parameters in a j-th dimension bias parameter set corresponding to the j-th dimension filter parameter set to obtain a maximum value of the absolute values corresponding to the j-th dimension bias parameter set, and taking the maximum value as a threshold value of the j-th dimension bias parameter set;
determining a scaling factor of the j-th dimension bias parameter set according to the fixed point bit number of the bias parameter and the threshold value of the j-th dimension bias parameter set;
and converting the bias parameters in the j-th dimension bias parameter set from the floating point number representation to the fixed point number representation according to the scaling factors of the j-th dimension bias parameter set.
The M-dimensional bias parameter set may be understood as bias parameters of M channels, where the number of bias parameters of each channel is at least one.
The fixed point number of bits of the bias parameter may refer to the maximum number of bits the bias parameter occupies when expressed in fixed point numbers.
The conversion of the bias parameters in each dimension's bias parameter set to a fixed point representation is illustrated with the j-th dimension bias parameter set that corresponds to the j-th dimension filter parameter set of the M-dimensional filter parameter sets. The threshold of the j-th dimension bias parameter set refers to the maximum absolute value of the bias parameters in that set. To obtain it, the absolute values of all bias parameters in the j-th dimension bias parameter set are taken first, and the maximum of these absolute values is the threshold of the j-th dimension bias parameter set.
The terminal device may obtain the scaling factor of the j-th dimension bias parameter set according to formula (5).
Where scale_b represents the scaling factor of the j-th dimension bias parameter set, threshold_b represents the threshold of the j-th dimension bias parameter set, and $n_3$ represents the fixed point bit number of the bias parameters. For example, if the fixed point bit number of the bias parameters is 32 bits, then $n_3 = 32$. It should be noted that, since the convolutional layer of the neural network includes M-dimensional bias parameter sets, each bias parameter set has its own scaling factor.
Since the number of bits of a fixed point number is limited, when quantizing the bias parameters in the j-th dimension bias parameter set the terminal device may take into account the minimum and maximum values that these bias parameters can take when expressed as fixed point numbers, so that quantization remains accurate and the quantized bias parameters do not overflow. The terminal device may calculate the bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers (i.e., the fixed point representation of the bias parameters in the j-th dimension bias parameter set) according to the bias parameters in the j-th dimension bias parameter set expressed as floating point numbers, the scaling factor of the j-th dimension bias parameter set, and the minimum and maximum values of the bias parameters in the j-th dimension bias parameter set when expressed as fixed point numbers. Specifically, the bias parameters in the j-th dimension bias parameter set may be quantized by formula (6).
Where $Q_{int\_b}$ represents the bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers, $Q_b$ represents the bias parameters in the j-th dimension bias parameter set expressed as floating point numbers, $-A_b$ represents the minimum value of the bias parameters in the j-th dimension bias parameter set when expressed as fixed point numbers, and $A_b - 1$ represents the maximum value of the bias parameters in the j-th dimension bias parameter set when expressed as fixed point numbers.
In one embodiment, the terminal device may further determine, according to the scaling factor of the input data and the scaling factor of the j-th dimension filter parameter set, the scaling factor of the convolution result of the input data expressed as fixed point numbers and the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers;
and, if the scaling factor of the convolution result is not equal to the scaling factor of the j-th dimension bias parameter set, adjust the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers and the bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers so that the scaling factor of the convolution result is equal to the scaling factor of the j-th dimension bias parameter set.
The process of obtaining the output data is illustrated with the j-th dimension filter parameter set and the corresponding j-th dimension bias parameter set. The terminal device first convolves the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers with the input data expressed as fixed point numbers to obtain a convolution result, which is itself expressed as fixed point numbers. It then adds the convolution result to the bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers to obtain the output data expressed as fixed point numbers.
For the convolution result expressed as fixed point numbers to be added to the bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers, the scaling factor of the convolution result must be equal to the scaling factor of the j-th dimension bias parameter set. The scaling factor of the convolution result is the product of the scaling factor of the j-th dimension filter parameter set and the scaling factor of the input data.
When the terminal device detects that the scaling factor of the convolution result is not equal to the scaling factor of the j-th dimension bias parameter set, it can adjust the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers and the bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers so that the two scaling factors become equal. The output data expressed as fixed point numbers is then obtained from the input data expressed as fixed point numbers, the adjusted filter parameters in the j-th dimension filter parameter set, and the adjusted bias parameters in the j-th dimension bias parameter set.
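The requirement that the two scaling factors match before the addition can be sketched in Python as follows. This is an illustration under assumptions, not the patented implementation: a fixed point value q is assumed to stand for the real value q * scale, the convolution is reduced to a single output position (a dot product), and the function name `fixed_point_output` is hypothetical.

```python
def fixed_point_output(x_q, w_q, b_q, scale_x, scale_w, scale_b):
    """Integer multiply-accumulate plus bias for one output position.

    Assumes each fixed point value q represents the real value q * scale.
    The scaling factor of the convolution result is the product of the
    input scaling factor and the filter scaling factor; the integer bias
    may only be added when it shares that scaling factor.
    """
    scale_xw = scale_x * scale_w
    if scale_xw != scale_b:
        raise ValueError("scaling factors differ; adjust the parameters first")
    acc = sum(x * w for x, w in zip(x_q, w_q))  # integer convolution result
    return acc + b_q, scale_xw


# Real inputs 1.0 and 0.5 with scale 2**-4; real filters 0.5 and 1.0 with
# scale 2**-3; real bias 0.25 with scale 2**-7 (equal to 2**-4 * 2**-3).
out_q, out_scale = fixed_point_output(
    x_q=[16, 8], w_q=[4, 8], b_q=32,
    scale_x=2.0 ** -4, scale_w=2.0 ** -3, scale_b=2.0 ** -7,
)
real_output = out_q * out_scale  # 1.0 * 0.5 + 0.5 * 1.0 + 0.25
```

Because all scaling factors are powers of two, the product and the equality test are exact in floating point arithmetic for values in the normal range.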
In one embodiment, adjusting the filter parameters in the j-th-dimensional filter parameter set expressed in fixed-point numbers and the bias parameters in the j-th-dimensional bias parameter set expressed in fixed-point numbers includes:
determining the maximum value of the scaling factors, that is, the larger of the scaling factor of the convolution result and the scaling factor of the j-th dimension bias parameter set;
determining an adjustment factor of the convolution result according to the scaling factor of the convolution result and the maximum value of the scaling factors;
adjusting, according to the adjustment factor of the convolution result, the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers;
determining an adjustment factor of the j-th dimension bias parameter set according to the scaling factor of the j-th dimension bias parameter set and the maximum value of the scaling factors;
and adjusting the bias parameters in the j-th-dimension bias parameter set expressed in the fixed point number according to the adjustment factors of the j-th-dimension bias parameter set.
The maximum value of the scaling factors refers to the larger of the scaling factor of the convolution result and the scaling factor of the j-th dimension bias parameter set. For example, if the scaling factor of the convolution result is 0.25 and the scaling factor of the j-th dimension bias parameter set is 0.5, then the maximum value of the scaling factors is 0.5.
The adjustment factor of the convolution result can be calculated by equation (7).
Where adjust_xw represents the adjustment factor of the convolution result, scale_max represents the maximum value of the scaling factor, scale_xw represents the scaling factor of the convolution result. It should be noted that, since the convolution layer of the neural network includes M-dimensional filter parameter sets, each of which corresponds to one convolution result, the convolution layer of the neural network has M convolution results, and each of the M convolution results has an adjustment factor.
The terminal device may calculate the adjusted filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers according to the filter parameters in the j-th dimension filter parameter set expressed as floating point numbers, the adjustment factor of the convolution result, the scaling factor of the j-th dimension filter parameter set, and the minimum and maximum values of the filter parameters in the j-th dimension filter parameter set when expressed as fixed point numbers. Specifically, the adjusted filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers can be obtained by formula (8).
The adjustment factor for the j-th dimensional bias parameter set can be calculated by equation (9).
Where adjust_b represents the adjustment factor of the j-th dimension bias parameter set.
The terminal device may calculate the adjusted bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers according to the bias parameters in the j-th dimension bias parameter set expressed as floating point numbers, the adjustment factor of the j-th dimension bias parameter set, the scaling factor of the j-th dimension bias parameter set, and the minimum and maximum values of the bias parameters in the j-th dimension bias parameter set when expressed as fixed point numbers. Specifically, the adjusted bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers can be obtained by formula (10).
The terminal device may store the adjusted filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers and the adjusted bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers in the neural network chip in advance, so that no readjustment is needed while the neural network chip is running, which improves the data processing speed of the neural network chip.
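The scale alignment described above can be sketched in Python as follows. Formulas (7) to (10) are not reproduced in this text, so the orientation of the adjustment factors is an assumption: a fixed point value q is assumed to stand for the real value q * scale, adjust_xw is taken as scale_xw / scale_max, adjust_b as scale_b / scale_max, and the adjusted parameters are obtained by requantizing the floating point values with the extra factor applied. The function names and default bit widths are hypothetical.

```python
def clamp(value: int, n_bits: int) -> int:
    """Clamp to the signed range [-A, A - 1] with A = 2 ** (n_bits - 1)."""
    a = 2 ** (n_bits - 1)
    return max(-a, min(a - 1, value))


def align_scales(w_float, b_float, scale_x, scale_w, scale_b,
                 n_w: int = 8, n_b: int = 32):
    """Requantize filters and biases of one channel onto a common scale.

    Assumed orientation: adjust = own scale / scale_max, so both factors
    are powers of two no greater than 1.
    """
    scale_xw = scale_x * scale_w           # scale of the convolution result
    scale_max = max(scale_xw, scale_b)     # maximum value of the scaling factors
    adjust_xw = scale_xw / scale_max
    adjust_b = scale_b / scale_max
    w_q = [clamp(round(w / scale_w * adjust_xw), n_w) for w in w_float]
    b_q = [clamp(round(b / scale_b * adjust_b), n_b) for b in b_float]
    return w_q, b_q, scale_max


# Scales from the example in the text: convolution result 0.25, bias 0.5.
w_q, b_q, scale_max = align_scales(
    w_float=[1.0, -2.0, 3.0], b_float=[4.0],
    scale_x=0.5, scale_w=0.5, scale_b=0.5,
)

# Opposite case: the bias scale (0.125) is finer than the result scale (0.25).
w_q2, b_q2, scale_max2 = align_scales(
    w_float=[1.0, -2.0, 3.0], b_float=[4.0],
    scale_x=0.5, scale_w=0.5, scale_b=0.125,
)
```

After alignment, the product of the integer input and integer filter and the integer bias both represent real values in units of scale_max, so they can be added directly. Doing this once offline is what allows the adjusted parameters to be stored in the chip in advance.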
Step 208, performing a convolution operation on the input data expressed as fixed point numbers, the filter parameters expressed as fixed point numbers and the bias parameters expressed as fixed point numbers to obtain output data of the convolution layer expressed as fixed point numbers, wherein the output data is the convolution operation result of the image to be processed.
Step 208 is the same as step 105; refer to the description of step 105, which is not repeated here.
In one embodiment, the terminal device may also obtain a fixed point number of bits of the output data; if the current bit number of the output data expressed by the fixed point number is not equal to the fixed point bit number of the output data, obtaining a scaling factor of the output data; and adjusting the output data expressed in the fixed point number from the current bit number to the corresponding fixed point bit number according to the maximum value of the scaling factor and the scaling factor of the output data.
The fixed point number of bits of the output data may refer to the number of bits occupied by the output data when expressed in a fixed point number.
The terminal device may obtain L sample output data represented by floating point numbers, where L is an integer greater than 1, take absolute values of the L sample output data represented by floating point numbers, obtain a maximum value of the absolute values of the L sample output data represented by floating point numbers, use the maximum value as a threshold of the output data, and obtain a scaling factor of the output data according to the threshold of the output data and the fixed point bit number of the output data. The calculation formula of the scaling factor of the output data may refer to the calculation formula of the scaling factor of the input data, which is not described herein.
After the output data expressed as fixed point numbers is obtained, it is generally stored in the accumulator, so the current bit number of the output data expressed as fixed point numbers can be understood as the fixed point bit number of the accumulator. After the fixed point bit number of the output data is obtained, it can be compared with the current bit number of the output data expressed as fixed point numbers. When the current bit number is not equal to the fixed point bit number, the output data expressed as fixed point numbers can be adjusted from the current bit number to the fixed point bit number by means of the maximum value of the scaling factors and the scaling factor of the output data. For example, if the fixed point bit number of the accumulator is 32 bits and the fixed point bit number of the output data is 12 bits, then the output data expressed as fixed point numbers needs to be adjusted from 32 bits to 12 bits.
Because the convolution layer of the neural network includes M-dimensional filter parameter sets and each dimension's filter parameter set has its own maximum value of the scaling factors, there are M such maximum values in total. The M maximum values may first be traversed to obtain the largest of them, which is then compared with the scaling factor of the output data, and the largest of the M maximum values is used as the scaling factor of the output data. Taking the j-th dimension filter parameter set as an example, the output data can then be adjusted from the current bit number to the fixed point bit number according to the scaling factor of the output data and the maximum value of the scaling factors of that set.
The terminal device may calculate the adjusted output data according to the input data expressed as fixed point numbers, the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers, the bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers, the maximum value of the scaling factors (the larger of the scaling factor of the convolution result and the scaling factor of the j-th dimension bias parameter set), and the scaling factor of the output data. Specifically, the current bit number of the output data can be adjusted by formula (11) to obtain the adjusted output data.
Where $Q_{int\_y}$ represents the adjusted output data, the operator between the input data and the filter parameters denotes convolution, and scale_y represents the scaling factor of the output data.
From the descriptions of formulas (1), (3), and (5), it can be determined that all scaling factors are powers of 2, so the ratio of scaling factors in formula (11) is a power of 2 that is less than or equal to 1. This ratio can be denoted adjust_y, and taking the base-2 logarithm of adjust_y gives the number of bits r_n by which the output data is shifted right. Specifically, r_n can be obtained by formula (12).
$$ r_n = -\log_2(\text{adjust\_y}) \qquad (12) $$
Since the convolutional layer of the neural network has M maximum values of the scaling factors, it also has M values of r_n. The fixed point data in the accumulator has the format $[1, M, H_3, W_3]$ and is stored as M contiguous blocks, each of size $H_3 \times W_3$. Shifting the output data in each of the M blocks to the right by the corresponding r_n in turn yields the adjusted output data.
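The block-wise right shift can be sketched in Python as follows. Formula (11) is not reproduced in this text, so adjust_y is assumed here to be scale_max / scale_y for each channel, with a fixed point value q standing for the real value q * scale. The accumulator is modeled as a flat list of M contiguous blocks, and the function names are hypothetical.

```python
import math


def shift_bits(scale_max: float, scale_y: float) -> int:
    """r_n = -log2(adjust_y), with adjust_y assumed to be scale_max / scale_y.

    adjust_y must be a power of two no greater than 1. frexp is used so the
    logarithm is computed exactly: adjust_y = 0.5 * 2 ** exponent.
    """
    adjust_y = scale_max / scale_y
    mantissa, exponent = math.frexp(adjust_y)
    if mantissa != 0.5 or adjust_y > 1.0:
        raise ValueError("adjust_y must be a power of two no greater than 1")
    return 1 - exponent


def requantize_accumulator(acc, m: int, h: int, w: int, scale_maxes, scale_y):
    """Shift each of the M contiguous H*W blocks right by its own r_n.

    Python's >> on negative integers is an arithmetic shift that rounds
    toward negative infinity.
    """
    block = h * w
    out = []
    for j in range(m):
        r_n = shift_bits(scale_maxes[j], scale_y)
        out.extend(v >> r_n for v in acc[j * block:(j + 1) * block])
    return out


# M = 2 channels, each block of size H3 * W3 = 1 * 2.
acc = [160, -37, 8, 1024]
shifted = requantize_accumulator(
    acc, m=2, h=1, w=2,
    scale_maxes=[2.0 ** -7, 2.0 ** -6], scale_y=2.0 ** -5,
)
```

A shift replaces the division by a scaling factor ratio, which is only possible because every scaling factor is a power of two.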
It should be noted that, in this embodiment, the fixed point bit numbers of the input data, the filter parameters, the bias parameters, and the output data are set with both the quantization accuracy and the execution efficiency of the fixed point model on the terminal device in mind. Because these bit numbers are chosen after investigating a large number of neural networks, they are set more accurately, which improves the accuracy of quantizing the input data, the filter parameters, the bias parameters, and the output data, and in turn improves the quantization accuracy and execution efficiency of the neural network on the terminal device.
In this embodiment, when the convolution operation is performed on the image to be processed by the neural network, the scaling factor of each kind of data can be determined by acquiring the fixed point bit number of each kind of data involved in the convolution calculation, and each kind of data is then quantized according to its scaling factor. This reduces the amount of computation of the convolution operator, reduces the consumption of storage and computing resources, and improves the efficiency of the convolution operation on the image to be processed.
Referring to fig. 3, which is a schematic structural diagram of an image processing apparatus provided in the third embodiment of the present application, only a portion related to the embodiment of the present application is shown for convenience of explanation.
The image processing apparatus includes:
an image acquisition module 31 for acquiring an image to be processed;
an image input module 32 for inputting an image to be processed into the neural network to determine input data of any convolution layer in the neural network;
a data processing module 33, configured to obtain a scaling factor of the input data if the input data is represented by a floating point number, and convert the input data from the floating point number representation to a fixed point number representation according to the scaling factor of the input data;
a parameter obtaining module 34, configured to obtain a filter parameter represented by a fixed point number and a bias parameter represented by a fixed point number of the convolution layer;
a convolution operation module 35, configured to perform a convolution operation on the input data expressed as fixed point numbers, the filter parameters expressed as fixed point numbers, and the bias parameters expressed as fixed point numbers to obtain output data of the convolution layer expressed as fixed point numbers, where the output data is the convolution operation result of the image to be processed.
In one embodiment, the data processing module 33 is specifically configured to:
acquiring the fixed-point bit number of input data;
acquiring N sample input data represented by floating point numbers, wherein N is an integer greater than 1;
taking absolute values of N sample input data represented by floating point numbers, obtaining maximum values of the absolute values of the N sample input data represented by the floating point numbers, and taking the maximum values as threshold values of the input data;
and determining a scaling factor of the input data according to the threshold value of the input data and the fixed-point bit number of the input data.
In one embodiment, the convolution layer includes M-dimensional filter parameter sets, each including at least one filter parameter represented by a floating point number, M being an integer greater than zero, and the parameter acquisition module 34 is specifically configured to:
acquiring the fixed-point bit number of the filter parameters;
taking absolute values of filter parameters in a j-th dimension filter parameter set to obtain a maximum value of the absolute values corresponding to the j-th dimension filter parameter set, and taking the maximum value as a threshold value of the j-th dimension filter parameter set, wherein the j-th dimension filter parameter set is any one dimension filter parameter set in an M-dimension filter parameter set;
determining a scaling factor of the j-th dimension filter parameter set according to the fixed point bit number of the filter parameter and the threshold value of the j-th dimension filter parameter set;
and converting the filter parameters in the j-th dimension filter parameter set from the floating point representation to the fixed point representation according to the scaling factors of the j-th dimension filter parameter set.
In one embodiment, the convolution layer includes M-dimensional bias parameter sets, each of which includes at least one bias parameter represented in floating point numbers, and the parameter acquisition module 34 is specifically configured to:
acquiring the fixed point bit number of the bias parameter;
taking absolute values of bias parameters in a j-th dimension bias parameter set corresponding to the j-th dimension filter parameter set to obtain a maximum value of the absolute values corresponding to the j-th dimension bias parameter set, and taking the maximum value as a threshold value of the j-th dimension bias parameter set;
determining a scaling factor of the j-th dimension bias parameter set according to the fixed point bit number of the bias parameter and the threshold value of the j-th dimension bias parameter set;
and converting the bias parameters in the j-th dimension bias parameter set from the floating point number representation to the fixed point number representation according to the scaling factors of the j-th dimension bias parameter set.
In one embodiment, the image processing apparatus further includes:
a factor determining module, configured to determine, according to the scaling factor of the input data and the scaling factor of the j-th dimension filter parameter set, the scaling factor of the convolution result of the input data expressed as fixed point numbers and the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers;
a parameter adjustment module, configured to adjust the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers and the bias parameters in the j-th dimension bias parameter set expressed as fixed point numbers if the scaling factor of the convolution result is not equal to the scaling factor of the j-th dimension bias parameter set, so that the scaling factor of the convolution result is equal to the scaling factor of the j-th dimension bias parameter set.
The parameter adjustment module is specifically used for:
determining the maximum value of the scaling factors, that is, the larger of the scaling factor of the convolution result and the scaling factor of the j-th dimension bias parameter set;
determining an adjustment factor of the convolution result according to the scaling factor of the convolution result and the maximum value of the scaling factors;
adjusting, according to the adjustment factor of the convolution result, the filter parameters in the j-th dimension filter parameter set expressed as fixed point numbers;
determining an adjustment factor of the j-th dimension bias parameter set according to the scaling factor of the j-th dimension bias parameter set and the maximum value of the scaling factors;
and adjusting the bias parameters in the j-th-dimension bias parameter set expressed in the fixed point number according to the adjustment factors of the j-th-dimension bias parameter set.
In one embodiment, the image processing apparatus further includes:
an output acquisition module, configured to acquire the fixed-point bit number of the output data;
a scaling acquisition module, configured to acquire the scaling factor of the output data if the current bit number of the output data expressed as fixed point numbers is not equal to the fixed-point bit number of the output data;
an output adjustment module, configured to adjust the output data expressed as fixed point numbers from the current bit number to the corresponding fixed-point bit number according to the maximum value of the scaling factors and the scaling factor of the output data.
The image processing apparatus provided in the embodiment of the present application may be applied to the first and second embodiments of the foregoing method, and details refer to the description of the first and second embodiments of the foregoing method, which are not repeated herein.
Fig. 4 is a schematic structural diagram of a terminal device according to a fourth embodiment of the present application. As shown in fig. 4, the terminal device 4 of this embodiment includes: one or more processors 40 (only one shown), a memory 41, and a computer program 42 stored in the memory 41 and executable on the at least one processor 40. The steps of the various image processing method embodiments described above are implemented when the processor 40 executes the computer program 42.
The terminal device 4 may be a computing device such as a desktop computer, a notebook computer, a palm computer, a cloud server, etc. The terminal device may include, but is not limited to, a processor 40, a memory 41. It will be appreciated by those skilled in the art that fig. 4 is merely an example of the terminal device 4 and does not constitute a limitation of the terminal device 4, and may include more or less components than illustrated, or may combine certain components, or different components, e.g., the terminal device may further include an input-output device, a network access device, a bus, etc.
The processor 40 may be a neural network chip, or may be a central processing unit (Central Processing Unit, CPU), another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 41 may be an internal storage unit of the terminal device 4, such as a hard disk or a memory of the terminal device 4. The memory 41 may be an external storage device of the terminal device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the terminal device 4. Further, the memory 41 may also include both an internal storage unit and an external storage device of the terminal device 4. The memory 41 is used for storing the computer program as well as other programs and data required by the terminal device. The memory 41 may also be used for temporarily storing data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above device may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and in part, not described or illustrated in any particular embodiment, reference is made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
If the integrated modules/units are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a computer-readable storage medium. Based on this understanding, the present application may implement all or part of the flow of the method embodiments above by means of a computer program that instructs the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each method embodiment described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in certain jurisdictions, in accordance with legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
All or part of the flow of the methods in the foregoing embodiments may also be implemented by a computer program product which, when run on a terminal device, causes the terminal device to perform the steps of the foregoing image processing method embodiments.
The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the protection scope of the present application.

Claims (9)

1. An image processing method, characterized in that the image processing method comprises:
acquiring an image to be processed;
inputting the image to be processed into a neural network to determine input data of any convolution layer in the neural network;
if the input data is represented by floating point numbers, obtaining a scaling factor of the input data, and converting the input data from floating point representation to fixed point representation according to the scaling factor of the input data;
obtaining filter parameters of the convolution layer expressed in fixed point numbers and bias parameters of the convolution layer expressed in fixed point numbers;
performing a convolution operation on the input data expressed in fixed point numbers, the filter parameters expressed in fixed point numbers, and the bias parameters expressed in fixed point numbers to obtain output data of the convolution layer expressed in fixed point numbers, wherein the output data is the convolution operation result for the image to be processed;
the image processing method further includes:
acquiring the fixed-point bit number of the output data;
if the current bit number of the output data expressed by the fixed point number is not equal to the fixed point bit number of the output data, obtaining a scaling factor of the output data; the current bit number of the output data refers to the fixed-point bit number of the accumulator, and the fixed-point bit number of the output data refers to the bit number occupied by the output data when the output data is represented by fixed-point numbers;
adjusting, according to the maximum value of the scaling factor and the scaling factor of the output data, the output data expressed in fixed point numbers from the current bit number to the corresponding fixed-point bit number; wherein the filter parameter set of each dimension has one such scaling factor maximum.
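As an illustration of the last steps of claim 1, the following Python sketch accumulates a fixed-point convolution at a single output position and then narrows the accumulator value to the output bit width. It assumes, beyond what the claim states, that every scaling factor is a power-of-two exponent (a count of fractional bits), so that rescaling reduces to a bit shift. The function names `accumulate_conv` and `requantize_output` and all numeric values are hypothetical.

```python
def accumulate_conv(fixed_inputs, fixed_filters, fixed_bias):
    """Multiply-accumulate at one output position using integers only.

    The bias is assumed to already share the scale of the products
    (input fractional bits + filter fractional bits).
    """
    acc = fixed_bias
    for x, w in zip(fixed_inputs, fixed_filters):
        acc += x * w
    return acc


def requantize_output(acc, max_frac_bits, out_frac_bits, out_bits):
    """Move an accumulator value to the output fixed-point format.

    max_frac_bits: scale of the accumulator (assumed "scaling factor maximum").
    out_frac_bits: assumed scaling factor of the output data.
    out_bits:      fixed-point bit number of the output data.
    """
    shift = max_frac_bits - out_frac_bits
    if shift > 0:
        # Right shift with round-to-nearest.
        value = (acc + (1 << (shift - 1))) >> shift
    else:
        value = acc << -shift
    # Saturate to the signed range of the output bit width.
    lo = -(1 << (out_bits - 1))
    hi = (1 << (out_bits - 1)) - 1
    return min(hi, max(lo, value))


# Inputs 0.5, -6.0, 1.25 with 4 fractional bits; filters 3.0, -1.5, 0.75
# with 5 fractional bits; bias 0.25 with 9 fractional bits.
acc = accumulate_conv([8, -96, 20], [96, -48, 24], 128)
out = requantize_output(acc, max_frac_bits=9, out_frac_bits=3, out_bits=8)
```

Here `acc` is 5984, which is 11.6875 at 9 fractional bits, and `out` is 94, which is 11.75 at 3 fractional bits; the difference is the rounding error introduced by the narrower output format.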
2. The image processing method of claim 1, wherein obtaining a scaling factor of the input data comprises:
acquiring the fixed-point bit number of the input data;
acquiring N sample input data represented by floating point numbers, wherein N is an integer greater than 1;
taking absolute values of N sample input data represented by floating point numbers, obtaining maximum values of the absolute values of the N sample input data represented by the floating point numbers, and taking the maximum values of the absolute values of the sample input data as threshold values of the input data;
and determining a scaling factor of the input data according to the threshold value of the input data and the fixed-point bit number of the input data.
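The steps of claim 2 can be sketched in Python as follows. The claim does not give the formula that maps the threshold and the fixed-point bit number to a scaling factor, so this sketch assumes a signed power-of-two format in which the scaling factor is the number of fractional bits left after reserving enough integer bits for the threshold. The names `input_frac_bits` and `to_fixed` are hypothetical.

```python
import math


def input_frac_bits(samples, fixed_bits):
    """Derive a scaling factor (fractional bit count) from N sample inputs."""
    # Threshold: largest absolute value over all sample inputs.
    threshold = max(abs(v) for sample in samples for v in sample)
    if threshold == 0:
        return fixed_bits - 1
    # One bit is reserved for the sign; the rest is split between the
    # integer part needed for the threshold and the fractional part.
    return (fixed_bits - 1) - math.ceil(math.log2(threshold))


def to_fixed(value, frac_bits, fixed_bits):
    """Convert one floating-point value to a saturated fixed-point integer."""
    lo = -(1 << (fixed_bits - 1))
    hi = (1 << (fixed_bits - 1)) - 1
    return min(hi, max(lo, round(value * 2.0 ** frac_bits)))


samples = [[0.5, -6.0, 1.25], [3.0, 2.0, -0.75]]  # N = 2 sample inputs
frac_bits = input_frac_bits(samples, fixed_bits=8)
fixed = [to_fixed(v, frac_bits, 8) for v in samples[0]]
```

With these hypothetical samples the threshold is 6.0, so an 8-bit format keeps 4 fractional bits and the first sample becomes `[8, -96, 20]`.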
3. The image processing method according to claim 1 or 2, wherein the convolution layer includes filter parameter sets of M dimensions, each filter parameter set including at least one filter parameter expressed in floating point numbers, M being an integer greater than zero, and obtaining the filter parameters of the convolution layer expressed in fixed point numbers comprises:
acquiring the fixed-point bit number of the filter parameter;
taking absolute values of filter parameters in a j-th dimension filter parameter set to obtain a maximum value of the absolute values corresponding to the j-th dimension filter parameter set, and taking the maximum value of the absolute values corresponding to the j-th dimension filter parameter set as a threshold value of the j-th dimension filter parameter set, wherein the j-th dimension filter parameter set is any one dimension filter parameter set in the M-dimension filter parameter set;
determining a scaling factor of the j-th dimension filter parameter set according to the fixed-point bit number of the filter parameter and the threshold value of the j-th dimension filter parameter set;
and converting the filter parameters in the j-th dimension filter parameter set from floating point representation to fixed point representation according to the scaling factors of the j-th dimension filter parameter set.
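A minimal Python sketch of the per-dimension quantization in claim 3 follows. Each of the M filter parameter sets gets its own threshold and its own scaling factor. As an assumption not taken from the claim, the scaling factor is modeled as a fractional bit count in a signed power-of-two format; `quantize_filter_sets` is a hypothetical name.

```python
import math


def quantize_filter_sets(filter_sets, fixed_bits):
    """Quantize each filter parameter set with its own scaling factor.

    Returns one (frac_bits, fixed_params) pair per dimension j.
    """
    lo = -(1 << (fixed_bits - 1))
    hi = (1 << (fixed_bits - 1)) - 1
    results = []
    for params in filter_sets:
        # Threshold of the j-th set: its largest absolute value.
        threshold = max(abs(w) for w in params)
        if threshold == 0:
            frac_bits = fixed_bits - 1
        else:
            frac_bits = (fixed_bits - 1) - math.ceil(math.log2(threshold))
        # Scale, round, and saturate every parameter in the set.
        fixed = [min(hi, max(lo, round(w * 2.0 ** frac_bits))) for w in params]
        results.append((frac_bits, fixed))
    return results


# M = 2 hypothetical filter parameter sets with different value ranges.
quantized = quantize_filter_sets([[0.4, -0.25, 0.1], [3.0, -1.5, 0.75]], 8)
```

The small-valued first set ends up with 8 fractional bits and the second with 5, which is the point of a per-dimension scaling factor: a set with a small range is not forced to share the coarser resolution of a set with a large range.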
4. The image processing method of claim 3, wherein the convolution layer comprises bias parameter sets of M dimensions, each bias parameter set comprising at least one bias parameter expressed in floating point numbers, and wherein obtaining the bias parameters of the convolution layer expressed in fixed point numbers comprises:
acquiring the fixed-point bit number of the bias parameter;
taking absolute values of bias parameters in a j-th-dimension bias parameter set corresponding to the j-th-dimension filter parameter set, obtaining the maximum value of the absolute values corresponding to the j-th-dimension bias parameter set, and taking the maximum value of the absolute values corresponding to the j-th-dimension bias parameter set as a threshold value of the j-th-dimension bias parameter set;
determining a scaling factor of the j-th dimension bias parameter set according to the fixed-point bit number of the bias parameter and the threshold value of the j-th dimension bias parameter set;
and converting the bias parameters in the j-th dimension bias parameter set from floating point representation to fixed point representation according to the scaling factors of the j-th dimension bias parameter set.
5. The image processing method according to claim 4, wherein the image processing method further comprises:
determining a scaling factor of a convolution result of the input data expressed in fixed point numbers and filter parameters in the j-th dimensional filter parameter set expressed in fixed point numbers according to the scaling factor of the input data and the scaling factor of the j-th dimensional filter parameter set;
and if the scaling factor of the convolution result is not equal to the scaling factor of the j-th dimension bias parameter set, adjusting the filter parameters in the j-th dimension filter parameter set expressed in fixed point numbers and the bias parameters in the j-th dimension bias parameter set expressed in fixed point numbers so that the scaling factor of the convolution result is equal to the scaling factor of the j-th dimension bias parameter set.
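To see why the convolution result has a scaling factor of its own, consider the Python sketch below. It assumes power-of-two scaling, which the claim does not state: if the input carries a fractional bits and the filter carries b, each product carries a + b, and a bias with a different fractional bit count cannot be added until the two scales match. The function names are hypothetical.

```python
def conv_result_frac_bits(input_frac_bits, filter_frac_bits):
    """Scale of an input-times-filter product under power-of-two scaling."""
    return input_frac_bits + filter_frac_bits


def needs_alignment(input_frac_bits, filter_frac_bits, bias_frac_bits):
    """True when the bias scale differs from the convolution result scale."""
    return conv_result_frac_bits(input_frac_bits, filter_frac_bits) != bias_frac_bits


# 1.25 with 4 fractional bits is 20; 0.75 with 5 fractional bits is 24.
product = 20 * 24
product_frac_bits = conv_result_frac_bits(4, 5)
# product / 2**product_frac_bits recovers 1.25 * 0.75 exactly.
```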
6. The image processing method of claim 5, wherein said adjusting filter parameters in said j-th-dimensional filter parameter set expressed in fixed-point numbers and bias parameters in said j-th-dimensional bias parameter set expressed in fixed-point numbers comprises:
determining a maximum value of the scaling factor of the convolution result and the scaling factor of the j-th dimension bias parameter set;
determining an adjustment factor of the convolution result according to the scaling factor of the convolution result and the maximum value of the scaling factor;
according to the adjustment factor of the convolution result, adjusting the filter parameters in the j-th dimension filter parameter set expressed by fixed point numbers;
determining an adjustment factor of the j-th dimension bias parameter set according to the scaling factor of the j-th dimension bias parameter set and the maximum value of the scaling factor;
and adjusting the bias parameters in the j-th dimension bias parameter set expressed in fixed point numbers according to the adjustment factors of the j-th dimension bias parameter set.
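The adjustment in claim 6 can be sketched in Python as follows, again under the assumption (not stated in the claim) that scaling factors are fractional bit counts. Under that assumption each adjustment factor is the difference between the maximum and the item's own scaling factor, and applying it is a left shift. `align_filter_and_bias` is a hypothetical name.

```python
def align_filter_and_bias(fixed_filters, fixed_biases,
                          conv_frac_bits, bias_frac_bits):
    """Bring the convolution result and the bias to a common scale.

    Both are moved up to the larger of the two scaling factors, so no
    precision is lost by the alignment itself.
    """
    max_frac_bits = max(conv_frac_bits, bias_frac_bits)
    # Adjustment factors: how far each side is below the maximum.
    conv_adjust = max_frac_bits - conv_frac_bits
    bias_adjust = max_frac_bits - bias_frac_bits
    # Scaling the filters scales every input-times-filter product equally.
    adjusted_filters = [w << conv_adjust for w in fixed_filters]
    adjusted_biases = [b << bias_adjust for b in fixed_biases]
    return adjusted_filters, adjusted_biases, max_frac_bits


# Convolution result at 9 fractional bits, bias at 6: only the bias moves.
filters, biases, max_bits = align_filter_and_bias([96, -48, 24], [13], 9, 6)
```

In this hypothetical case the filters stay `[96, -48, 24]`, the bias becomes `[104]`, and both now sit at 9 fractional bits. If the bias had the larger scaling factor, the filters would be shifted instead.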
7. An image processing apparatus, characterized in that the image processing apparatus comprises:
the image acquisition module is used for acquiring an image to be processed;
the image input module is used for inputting the image to be processed into a neural network so as to determine the input data of any convolution layer in the neural network;
the data processing module is used for acquiring the scaling factor of the input data if the input data is represented by floating point numbers, and converting the floating point number representation of the input data into fixed point number representation according to the scaling factor of the input data;
the parameter acquisition module is used for acquiring the filter parameters of the convolution layer expressed in fixed point numbers and the bias parameters of the convolution layer expressed in fixed point numbers;
the convolution operation module is used for performing a convolution operation on the input data expressed in fixed point numbers, the filter parameters expressed in fixed point numbers, and the bias parameters expressed in fixed point numbers to obtain output data of the convolution layer expressed in fixed point numbers, wherein the output data is the convolution operation result for the image to be processed;
the image processing apparatus further includes:
the output acquisition module is used for acquiring the fixed-point bit number of the output data;
the scaling acquisition module is used for acquiring a scaling factor of the output data if the current bit number of the output data expressed by the fixed point number is not equal to the fixed point bit number of the output data; the current bit number of the output data refers to the fixed-point bit number of the accumulator, and the fixed-point bit number of the output data refers to the bit number occupied by the output data when the output data is represented by fixed-point numbers;
the output adjustment module is used for adjusting, according to the maximum value of the scaling factor and the scaling factor of the output data, the output data expressed in fixed point numbers from the current bit number to the corresponding fixed-point bit number; wherein the filter parameter set of each dimension has one such scaling factor maximum.
8. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the image processing method according to any one of claims 1 to 6 when the computer program is executed.
9. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the image processing method according to any one of claims 1 to 6.
CN202110994479.5A 2021-08-27 2021-08-27 Image processing method, device, terminal equipment and storage medium Active CN113780523B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110994479.5A CN113780523B (en) 2021-08-27 2021-08-27 Image processing method, device, terminal equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110994479.5A CN113780523B (en) 2021-08-27 2021-08-27 Image processing method, device, terminal equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113780523A CN113780523A (en) 2021-12-10
CN113780523B true CN113780523B (en) 2024-03-29

Family

ID=78839469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110994479.5A Active CN113780523B (en) 2021-08-27 2021-08-27 Image processing method, device, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113780523B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114327637B (en) * 2021-12-24 2025-03-21 安谋科技(中国)有限公司 Data conversion method, device, electronic device, medium and computer program product
CN114936634A (en) * 2022-04-12 2022-08-23 瑞泰生医科技(香港)有限公司 Neural network model training method and system
CN116720563B (en) * 2022-09-19 2024-03-29 荣耀终端有限公司 A method, device and electronic equipment for improving the accuracy of fixed-point neural network models
CN115328438B (en) * 2022-10-13 2023-01-10 华控清交信息科技(北京)有限公司 Data processing method and device and electronic equipment

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018140294A1 (en) * 2017-01-25 2018-08-02 Microsoft Technology Licensing, Llc Neural network based on fixed-point operations
CN109002881A (en) * 2018-06-28 2018-12-14 郑州云海信息技术有限公司 The fixed point calculation method and device of deep neural network based on FPGA
CN109063825A (en) * 2018-08-01 2018-12-21 清华大学 Convolutional neural networks accelerator
CN109740740A (en) * 2019-01-03 2019-05-10 厦门美图之家科技有限公司 The fixed point accelerating method and device of convolutional calculation
WO2019143026A1 (en) * 2018-01-16 2019-07-25 한국과학기술원 Image processing method and device using feature map compression
CN110062246A (en) * 2018-01-19 2019-07-26 杭州海康威视数字技术股份有限公司 The method and apparatus that video requency frame data is handled
CN111461302A (en) * 2020-03-30 2020-07-28 杭州嘉楠耘智信息科技有限公司 Data processing method, device and storage medium based on convolutional neural network
CN112232477A (en) * 2019-07-15 2021-01-15 阿里巴巴集团控股有限公司 Image data processing method, device, equipment and medium
CN112287968A (en) * 2020-09-23 2021-01-29 深圳云天励飞技术股份有限公司 Image model training method, image processing method, chip, device and medium
CN112381205A (en) * 2020-09-29 2021-02-19 北京清微智能科技有限公司 Neural network low bit quantization method
WO2021083154A1 (en) * 2019-10-30 2021-05-06 Huawei Technologies Co., Ltd. Method and apparatus for quantization of neural networks post training
CN112990438A (en) * 2021-03-24 2021-06-18 中国科学院自动化研究所 Full-fixed-point convolution calculation method, system and equipment based on shift quantization operation

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102808577B1 (en) * 2018-04-27 2025-05-15 삼성전자주식회사 Method and apparatus for quantizing parameters of neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
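Design and Implementation of an FPGA-Based Convolutional Neural Network Accelerator; 张榜; 来金梅; Journal of Fudan University (Natural Science) (No. 02); full text *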

Also Published As

Publication number Publication date
CN113780523A (en) 2021-12-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant