Disclosure of Invention
The application provides a 3D laparoscopic surgery data processing method, device, equipment and storage medium, which aim to solve, at least to some extent, one of the technical problems in the prior art.
To solve the above problems, the application provides the following technical solution:
A 3D laparoscopic surgical data processing method, comprising:
framing an original surgical data video stream using a video processing technique to obtain laparoscopic image sequences corresponding to the left and right cameras of a binocular laparoscope system;
calibrating the laparoscopic image sequences based on camera parameters to obtain laparoscopic image sequences in which image distortion is eliminated;
and performing smoke removal and surgical instrument segmentation on the calibrated laparoscopic image sequences, using a smoke removal technique based on the dark channel prior algorithm and a surgical instrument segmentation algorithm based on the DeepLabv3+ architecture, respectively, to generate laparoscopic image sequences free of smoke interference and instrument occlusion.
In the technical solution adopted by the embodiment of the application, framing the original surgical data video stream using the video processing technique to obtain the laparoscopic image sequences corresponding to the left and right cameras of the binocular laparoscope system specifically comprises:
acquiring each frame of image in the original surgical data video stream and dividing each frame of image along the horizontal direction into a left part and a right part, which correspond to the laparoscopic images of the left camera and the right camera of the binocular laparoscope system, respectively.
In the technical solution adopted by the embodiment of the application, calibrating the laparoscopic image sequences based on camera parameters to obtain the laparoscopic image sequences in which image distortion is eliminated specifically comprises:
capturing N checkerboard images with a camera;
detecting the corner points of the checkerboard images, and calculating an intrinsic matrix and distortion coefficients of the camera from the 3D world coordinates and the 2D image coordinates of the checkerboard corner points by using a calibration function of OpenCV, wherein the intrinsic matrix of the camera comprises a focal length or a principal point position, and the distortion coefficients describe barrel or pincushion distortion;
and performing de-distortion processing on the framed laparoscopic image sequences by using the calculated intrinsic matrix and distortion coefficients, thereby recovering the true geometry of the laparoscopic image sequences.
In the technical solution adopted by the embodiment of the application, performing smoke removal and surgical instrument segmentation on the calibrated laparoscopic image sequences, using the smoke removal technique based on the dark channel prior algorithm and the surgical instrument segmentation algorithm based on the DeepLabv3+ architecture, respectively, comprises:
calculating, for each pixel, the minimum value over the RGB channels of each frame of laparoscopic image to form a dark channel image, and, on the assumption that the dark channel image represents the minimum light intensity under fog-free conditions, back-calculating a transmittance map caused by atmospheric scattering from the relationship between the dark channel image and the calibrated laparoscopic image, wherein the transmittance calculation formula is as follows:
wherein t represents the transmittance map, I represents the calibrated laparoscopic image, A represents the atmospheric light intensity, and ω is an adjustment parameter for controlling the defogging intensity;
globally analyzing the dark channel image, selecting a set proportion of the brightest pixels in the dark channel image as the atmospheric light intensity, and combining the transmittance map and the atmospheric light intensity to calculate a defogged laparoscopic image by using the following image restoration formula:
wherein J represents the defogged laparoscopic image.
In the technical solution adopted by the embodiment of the application, performing smoke removal and surgical instrument segmentation on the calibrated laparoscopic image sequences, using the smoke removal technique based on the dark channel prior algorithm and the surgical instrument segmentation algorithm based on the DeepLabv3+ architecture, respectively, further comprises:
inputting the calibrated laparoscopic image into a DeepLabv3+ model, wherein the DeepLabv3+ model comprises an encoder and a decoder; the encoder is based on a modified ResNet architecture and extracts high-level semantic feature maps of the laparoscopic image through atrous convolution and multi-scale feature fusion; the decoder performs a pixel-by-pixel upsampling operation on the feature maps output by the encoder, restores the low-resolution feature maps to the same size as the input image, and outputs a segmentation probability map; the segmentation probability map is a multi-channel image in which each channel corresponds to the probability distribution of one class; and the segmentation probability map is analyzed pixel by pixel, the class probability of each pixel being mapped to a specific class label, to generate a final surgical instrument segmentation mask.
In the technical solution adopted by the embodiment of the application, analyzing the segmentation probability map pixel by pixel, mapping the class probability of each pixel to a specific class label, and generating the final surgical instrument segmentation mask is specifically as follows:
the surgical instrument segmentation mask is represented as a binary image, in which the surgical instrument region is labeled as foreground and the remaining region is labeled as background.
Another technical solution adopted by the embodiment of the application is a 3D laparoscopic surgery data processing device, comprising:
an image framing module, configured to frame an original surgical data video stream using a video processing technique to obtain laparoscopic image sequences corresponding to the left and right cameras of a binocular laparoscope system;
an image calibration module, configured to calibrate the laparoscopic image sequences based on camera parameters to obtain laparoscopic image sequences in which image distortion is eliminated;
an image preprocessing module, configured to perform smoke removal and surgical instrument segmentation on the calibrated laparoscopic image sequences, using a smoke removal technique based on the dark channel prior algorithm and a surgical instrument segmentation algorithm based on the DeepLabv3+ architecture, respectively, to generate laparoscopic image sequences free of smoke interference and instrument occlusion.
In the technical solution adopted by the embodiment of the application, the image preprocessing module performing smoke removal and surgical instrument segmentation on the calibrated laparoscopic image sequences, using the smoke removal technique based on the dark channel prior algorithm and the surgical instrument segmentation algorithm based on the DeepLabv3+ architecture, respectively, specifically comprises:
calculating, for each pixel, the minimum value over the RGB channels of each frame of laparoscopic image to form a dark channel image, and, on the assumption that the dark channel image represents the minimum light intensity under fog-free conditions, back-calculating a transmittance map caused by atmospheric scattering from the relationship between the dark channel image and the calibrated laparoscopic image, wherein the transmittance calculation formula is as follows:
wherein t represents the transmittance map, I represents the calibrated laparoscopic image, A represents the atmospheric light intensity, and ω is an adjustment parameter for controlling the defogging intensity;
globally analyzing the dark channel image, selecting a set proportion of the brightest pixels in the dark channel image as the atmospheric light intensity, and combining the transmittance map and the atmospheric light intensity to calculate a defogged laparoscopic image by using the following image restoration formula:
wherein J represents the defogged laparoscopic image;
inputting the calibrated laparoscopic image into a DeepLabv3+ model, wherein the DeepLabv3+ model comprises an encoder and a decoder; the encoder is based on a modified ResNet architecture and extracts high-level semantic feature maps of the laparoscopic image through atrous convolution and multi-scale feature fusion; the decoder performs a pixel-by-pixel upsampling operation on the feature maps output by the encoder, restores the low-resolution feature maps to the same size as the input image, and outputs a segmentation probability map; the segmentation probability map is a multi-channel image in which each channel corresponds to the probability distribution of one class; and the segmentation probability map is analyzed pixel by pixel, the class probability of each pixel being mapped to a specific class label, to generate a final surgical instrument segmentation mask.
A further technical solution adopted by the embodiment of the application is equipment comprising a processor and a memory coupled to the processor, wherein
the memory stores program instructions for implementing the 3D laparoscopic surgical data processing method;
the processor is configured to execute the program instructions stored in the memory to carry out the 3D laparoscopic surgical data processing method.
Yet another technical solution adopted by the embodiment of the application is a storage medium storing program instructions executable by a processor, the program instructions being used to execute the 3D laparoscopic surgical data processing method.
Compared with the prior art, the 3D laparoscopic surgery data processing method, device, equipment and storage medium of the embodiment of the application have the following beneficial effects. Laparoscopic surgery images are preprocessed by combining the dark channel prior algorithm with the DeepLabv3+ architecture: the dark channel prior algorithm removes smoke quickly, and the DeepLabv3+ architecture ensures high-precision segmentation of surgical instrument masks, providing reliable support for real-time decision-making during surgery. The problems of smoke interference and instrument occlusion in laparoscopic images are addressed at the same time, a balance between real-time performance and high precision is achieved, and the image quality and the performance of subsequent tasks are significantly improved. The preprocessed image data can be applied directly to surgical navigation and instrument detection tasks without additional adaptation, improving the accuracy and safety of surgery, and the method has good scalability and broad application prospects.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of protection of the application.
The terms "first," "second," "third," and the like in this disclosure are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first", "a second", and "a third" may explicitly or implicitly include at least one such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise. All directional indications (such as up, down, left, right, front, rear) in embodiments of the present application are merely used to explain the relative positional relationship, movement, etc. between the components in a particular pose (as shown in the drawings), and if the particular pose changes, the directional indication changes accordingly. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
Specifically, please refer to Fig. 1, which is a flowchart of a 3D laparoscopic surgery data processing method according to an embodiment of the present application. The method comprises the following steps:
S100, acquiring an original surgical data video stream from a binocular laparoscope system;
In this step, the binocular laparoscope system includes a left camera and a right camera, which capture laparoscopic images of the surgical area during the operation and generate the original surgical data video stream of the 3D laparoscopic surgery.
S110, framing the original surgical data video stream using a video processing technique, and separating the laparoscopic image sequences corresponding to the left camera and the right camera;
In this step, since the original surgical data video stream is acquired by both the left and right cameras, it must first be framed during 3D laparoscopic surgical data processing in order to separate the laparoscopic image sequences corresponding to the left and right cameras. Specifically, each frame of image in the original surgical data video stream is acquired and divided along the horizontal direction into a left part and a right part, which correspond to the laparoscopic images of the left camera and the right camera, respectively.
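The left/right split described in step S110 can be sketched as follows. This is a minimal illustration that assumes a side-by-side frame layout with the left camera in the left half; the function name `split_stereo_frame` and the synthetic frame are hypothetical and are not part of the application.

```python
import numpy as np

def split_stereo_frame(frame):
    """Split one side-by-side stereo frame (H, W, 3) into (left, right) halves."""
    width = frame.shape[1]
    half = width // 2
    left = frame[:, :half]
    right = frame[:, half:2 * half]  # drops a trailing column if the width is odd
    return left, right

# Synthetic 4x8 RGB frame: the left half is black, the right half is white.
frame = np.zeros((4, 8, 3), dtype=np.uint8)
frame[:, 4:] = 255
left, right = split_stereo_frame(frame)
```

Applying the same function to every decoded frame of the video stream would yield the two per-camera image sequences.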
S120, calibrating the framed laparoscopic image sequences based on camera parameters to obtain laparoscopic image sequences in which image distortion is eliminated;
In this step, the calibration is based on camera parameters and uses a checkerboard as the calibration target to calibrate the framed laparoscopic image sequences. The specific calibration process includes:
S121, capturing N checkerboard images with the camera, where the number N of checkerboard images can be set according to the actual application scenario;
S122, detecting the corner points of the checkerboard images, calculating the intrinsic matrix and distortion coefficients of the camera from the 3D world coordinates of the checkerboard and the detected 2D image coordinates by using a calibration function of OpenCV, and optimizing the camera parameters to minimize the re-projection error, wherein the intrinsic matrix of the camera includes the focal length, the principal point position and the like, and the distortion coefficients describe barrel distortion, pincushion distortion and the like;
S123, performing de-distortion processing on the framed laparoscopic image sequences by using the calculated intrinsic matrix and distortion coefficients, thereby recovering the true geometry of the laparoscopic image sequences.
It can be appreciated that the above framing and calibration process enables the laparoscopic image sequence to have higher quality and consistency, providing a reliable basis for surgical navigation and instrument detection tasks.
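To make the roles of the intrinsic matrix, the distortion coefficients and the re-projection error in steps S121 to S123 concrete, the following NumPy sketch projects synthetic checkerboard corners through a pinhole model with two radial distortion terms. It is not the OpenCV calibration routine itself; the function names, the intrinsic values, the grid geometry and the coefficient k1 = -0.2 are all assumptions chosen for illustration.

```python
import numpy as np

def project(points_cam, K, k1=0.0, k2=0.0):
    """Project 3D camera-frame points to pixels with radial distortion k1, k2."""
    x = points_cam[:, 0] / points_cam[:, 2]
    y = points_cam[:, 1] / points_cam[:, 2]
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2   # scale < 1: barrel, scale > 1: pincushion
    u = K[0, 0] * x * scale + K[0, 2]      # focal length fx and principal point cx
    v = K[1, 1] * y * scale + K[1, 2]      # focal length fy and principal point cy
    return np.stack([u, v], axis=1)

def rms_reprojection_error(observed, predicted):
    """Root-mean-square pixel distance between observed and predicted corners."""
    return float(np.sqrt(np.mean(np.sum((observed - predicted) ** 2, axis=1))))

# Assumed intrinsic matrix: fx = fy = 800 px, principal point (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Assumed 7x5 grid of checkerboard corners, 10 mm pitch, 200 mm from the camera.
gx, gy = np.meshgrid(np.arange(-3, 4) * 10.0, np.arange(-2, 3) * 10.0)
corners = np.stack([gx.ravel(), gy.ravel(), np.full(gx.size, 200.0)], axis=1)

observed = project(corners, K, k1=-0.2)   # corners as seen with barrel distortion
ideal = project(corners, K)               # distortion-free prediction
error_without_model = rms_reprojection_error(observed, ideal)
error_with_model = rms_reprojection_error(observed, project(corners, K, k1=-0.2))
```

A calibration routine searches for the parameters that drive this error towards zero; de-distortion then inverts the radial scaling so that straight edges in the scene appear straight in the image.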
S130, performing smoke removal and surgical instrument segmentation on the calibrated laparoscopic image sequences, using a smoke removal technique based on the dark channel prior algorithm and a surgical instrument segmentation algorithm based on the DeepLabv3+ architecture, respectively, to generate laparoscopic image sequences free of smoke interference and instrument occlusion;
In this step, the dark channel prior algorithm is combined with the DeepLabv3+ architecture. First, the smoke removal technique based on the dark channel prior algorithm analyzes the dark channel characteristics of the laparoscopic image and quickly removes the interference of smoke, recovering a clear surgical field of view. Then, the DeepLabv3+ architecture is used to segment surgical instrument masks in the laparoscopic image. In this way the problems of smoke interference and instrument occlusion in the laparoscopic image are addressed at the same time, a balance between real-time performance and high precision is achieved, the image quality and the performance of subsequent tasks are significantly improved, and reliable support is provided for real-time decision-making during surgery. Fig. 2 is a schematic view of the smoke removal effect according to an embodiment of the present application, and Fig. 3 is a schematic view of the surgical instrument segmentation effect according to an embodiment of the present application.
Further, for image data containing smoke, the application uses the smoke removal technique based on the dark channel prior algorithm. In the specific smoke removal process, the minimum value over the RGB channels is first calculated for each pixel of each frame of laparoscopic image to form a dark channel image. The dark channel prior holds that, in a local region of a haze-free image, the pixel values of at least one color channel approach zero. The details of the dark channel are further enhanced by a morphological erosion operation, providing a basis for the subsequent transmittance estimation. Assuming that the dark channel image represents the minimum light intensity under haze-free conditions, the transmittance map caused by atmospheric scattering is back-calculated from the relationship between the dark channel image and the original laparoscopic image (namely, the calibrated laparoscopic image), where the transmittance calculation formula is as follows:
where t represents the transmittance map, I represents the original laparoscopic image, A represents the atmospheric light intensity, and ω is an adjustment parameter for controlling the defogging intensity.
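The dark channel and transmittance computation can be sketched as below. The sketch assumes the commonly published form of the dark channel prior, t(x) = 1 - ω · min over a local patch and over the colour channels of I(y)/A, with A denoting the atmospheric light intensity; this form, the 3x3 patch, the value ω = 0.95 and the function names are assumptions for illustration and are not taken from the application.

```python
import numpy as np

def dark_channel(image, patch=3):
    """Per-pixel minimum over RGB, then a patch-wise minimum (a grey-scale erosion)."""
    min_rgb = image.min(axis=2)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    h, w = min_rgb.shape
    # Collect every shifted view of the padded image and take the minimum across them.
    windows = [padded[dy:dy + h, dx:dx + w]
               for dy in range(patch) for dx in range(patch)]
    return np.min(windows, axis=0)

def transmission(image, atmospheric_light, omega=0.95, patch=3):
    """Assumed standard form: t = 1 - omega * dark_channel(I / A)."""
    normalized = image / atmospheric_light   # broadcasts A over the colour axis
    return 1.0 - omega * dark_channel(normalized, patch)

A = np.array([1.0, 1.0, 1.0])                # assumed white atmospheric light

smoky = np.full((6, 6, 3), 0.8)              # uniformly bright, smoke-like region
t_smoky = transmission(smoky, A)

clear = np.full((6, 6, 3), 0.8)
clear[..., 2] = 0.0                          # one colour channel is dark everywhere
t_clear = transmission(clear, A)
```

In the smoke-like region the estimated transmittance is low, while the region with a dark colour channel is treated as fully transmissive, which is the behaviour the prior relies on.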
By globally analyzing the dark channel image, a set proportion of the brightest pixels in the dark channel image is selected as the atmospheric light intensity. The atmospheric light is the illumination component that is uniformly attenuated during fogging, and its value is usually close to the RGB value of white light. In the embodiment of the application the set proportion of brightest pixels is 0.1%, and it can be adjusted according to the actual application scenario.
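One way to realise this selection is sketched below. The text fixes only the proportion (0.1% of the brightest dark-channel pixels); averaging the image colours at those positions is one common choice and is an assumption here, as are the function name and the synthetic data.

```python
import numpy as np

def estimate_atmospheric_light(image, dark, fraction=0.001):
    """Average the image colours at the brightest `fraction` of dark-channel pixels."""
    flat_dark = dark.ravel()
    count = max(1, int(round(flat_dark.size * fraction)))   # at least one pixel
    brightest = np.argsort(flat_dark)[-count:]              # indices of the top values
    return image.reshape(-1, 3)[brightest].mean(axis=0)

# Synthetic 10x100 image (1000 pixels, so 0.1% corresponds to exactly one pixel).
image = np.full((10, 100, 3), 0.2)
image[3, 7] = [0.9, 0.95, 1.0]        # a single near-white, smoke-like pixel
dark = image.min(axis=2)              # per-pixel dark channel without a patch filter
atmospheric_light = estimate_atmospheric_light(image, dark)
```

With this data the estimate equals the colour of the single bright pixel, which is close to white as the text expects.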
Combining the transmittance map and the atmospheric light intensity, the defogged laparoscopic image is calculated by using the following image restoration formula:
where J represents the defogged laparoscopic image.
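The restoration step can be sketched under the standard atmospheric scattering model I = J·t + A·(1 - t), which gives J = (I - A) / max(t, t_min) + A, with A denoting the atmospheric light intensity. This form, the lower bound `t_min` that prevents division by very small transmittance values, and the final clipping are assumptions drawn from the commonly published method, not details stated in the application.

```python
import numpy as np

def recover_scene(image, transmission_map, atmospheric_light, t_min=0.1):
    """Invert I = J*t + A*(1 - t) for J, with t bounded below by t_min."""
    t = np.maximum(transmission_map, t_min)[..., np.newaxis]  # (H, W, 1) for broadcasting
    restored = (image - atmospheric_light) / t + atmospheric_light
    return np.clip(restored, 0.0, 1.0)

# Synthetic check: build a hazy image from a known scene, then recover the scene.
A = np.array([1.0, 1.0, 1.0])
scene = np.full((4, 4, 3), 0.3)
t_true = np.full((4, 4), 0.5)
hazy = scene * t_true[..., np.newaxis] + A * (1.0 - t_true[..., np.newaxis])
defogged = recover_scene(hazy, t_true, A)
```

In practice the transmittance map and atmospheric light would come from the estimation steps described above rather than from known ground truth.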
Through the operation, the smoke interference in the laparoscopic image can be effectively removed, the clear operation visual field is recovered, and high-quality image support is provided for subsequent operation navigation and instrument detection tasks.
Further, for laparoscopic images with instrument occlusion, the application uses the surgical instrument segmentation algorithm based on the DeepLabv3+ architecture to segment the surgical instruments. In the specific segmentation process, the calibrated laparoscopic image is first input into a DeepLabv3+ model. The DeepLabv3+ model includes an encoder and a decoder, where the encoder is based on a modified ResNet architecture and extracts high-level semantic feature maps of the laparoscopic image through atrous convolution and multi-scale feature fusion, thereby capturing key information about the surgical instruments while preserving sufficient context information to support accurate segmentation. Subsequently, the decoder performs a pixel-by-pixel upsampling operation on the feature maps output by the encoder, restores the low-resolution feature maps to the same size as the input image, and outputs a segmentation probability map. The segmentation probability map is a multi-channel image in which each channel corresponds to the probability distribution of one class. In the post-processing stage, the segmentation probability map is analyzed pixel by pixel, and the class probability of each pixel is mapped to a specific class label by a thresholding or maximum-index operation, ensuring that each pixel is assigned to its most likely class and thereby generating the final surgical instrument segmentation mask.
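The post-processing stage can be illustrated with a small NumPy sketch of the two mapping rules mentioned above. The channel-first layout (classes, height, width), the channel order (0 = background, 1 = instrument), the 0.5 threshold and the function names are assumptions for illustration; the probabilities are synthetic rather than model output.

```python
import numpy as np

def labels_from_argmax(prob_map):
    """Maximum-index rule: pick the most probable class for each pixel."""
    return np.argmax(prob_map, axis=0)

def labels_from_threshold(instrument_prob, threshold=0.5):
    """Thresholding rule for the two-class case: 1 where the instrument is likely."""
    return (instrument_prob >= threshold).astype(np.uint8)

# Synthetic 2-class probability map for a 2x3 image.
instrument_prob = np.array([[0.1, 0.7, 0.9],
                            [0.4, 0.6, 0.2]])
prob_map = np.stack([1.0 - instrument_prob, instrument_prob], axis=0)

labels_argmax = labels_from_argmax(prob_map)
labels_threshold = labels_from_threshold(instrument_prob)
```

For two classes whose probabilities sum to one, both rules give the same labels except at exact ties; with more than two classes the maximum-index rule is the natural choice.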
Specifically, the finally generated surgical instrument segmentation mask is represented as a binary image, in which the surgical instrument region is marked as foreground (for example, with value 1) and the remaining region is marked as background (for example, with value 0). The generated surgical instrument segmentation mask can be applied directly to surgical navigation or instrument detection tasks, which helps improve the accuracy and safety of surgery.
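A binary mask of this kind can be derived from a class-label map and applied to an image as sketched below. The text does not state how instrument pixels are handled downstream, so filling them with a constant value is purely an illustrative assumption, as are the function names and the class indices.

```python
import numpy as np

def instrument_mask(label_map, instrument_classes=(1,)):
    """Binary mask: 1 where the label is an instrument class, 0 elsewhere."""
    return np.isin(label_map, instrument_classes).astype(np.uint8)

def suppress_instruments(image, mask, fill_value=0):
    """Return a copy of the image with instrument pixels set to fill_value."""
    out = image.copy()
    out[mask.astype(bool)] = fill_value
    return out

# Synthetic label map with two hypothetical instrument classes (1 and 2).
label_map = np.array([[0, 1],
                      [2, 0]])
mask = instrument_mask(label_map, instrument_classes=(1, 2))

image = np.full((2, 2, 3), 200, dtype=np.uint8)
cleaned = suppress_instruments(image, mask)
```

A downstream task could equally use the mask to ignore those pixels rather than overwrite them.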
Based on the above, the 3D laparoscopic surgery data processing method of the embodiment of the present application preprocesses laparoscopic surgery images by combining the dark channel prior algorithm with the DeepLabv3+ architecture. The dark channel prior algorithm removes smoke quickly, and the DeepLabv3+ architecture ensures high-precision segmentation of surgical instrument masks, providing reliable support for real-time decision-making during surgery. The problems of smoke interference and instrument occlusion in laparoscopic images are addressed at the same time, a balance between real-time performance and high precision is achieved, and the image quality and the performance of subsequent tasks are significantly improved. The preprocessed image data can be applied directly to surgical navigation and instrument detection tasks without additional adaptation, improving the accuracy and safety of surgery, and the method has good scalability and broad application prospects.
Referring to Fig. 4, a schematic structural diagram of a 3D laparoscopic surgery data processing device according to an embodiment of the application is shown. The 3D laparoscopic surgery data processing device 40 of the embodiment of the present application includes:
The image framing module 41 is used for framing the original operation data video stream by adopting a video processing technology to obtain a laparoscope image sequence corresponding to the left camera and the right camera of the binocular laparoscope system;
an image calibration module 42, configured to calibrate the laparoscopic image sequence based on camera parameters, so as to obtain a laparoscopic image sequence with image distortion removed;
The image preprocessing module 43 is configured to perform smoke removal and surgical instrument segmentation on the calibrated laparoscopic image sequence by using a smoke removal technology based on a dark channel prior algorithm and a surgical instrument segmentation algorithm based on DeepLabv3+ architecture, so as to generate a laparoscopic image sequence without smoke interference and instrument shielding.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.
The device provided by the embodiment of the present application may be applied to the foregoing method embodiment, and details refer to the description of the foregoing method embodiment, which is not repeated herein.
Fig. 5 is a schematic diagram of an apparatus structure according to an embodiment of the application. The apparatus 50 comprises:
a memory 51 storing executable program instructions;
A processor 52 connected to the memory 51;
The processor 52 is configured to invoke executable program instructions stored in the memory 51 and perform the steps of framing an original surgical data video stream by using a video processing technique to obtain a laparoscopic image sequence corresponding to left and right cameras of the binocular laparoscopic system, calibrating the laparoscopic image sequence based on camera parameters to obtain a laparoscopic image sequence for eliminating image distortion, and performing smoke removal and surgical instrument segmentation on the calibrated laparoscopic image sequence by using a smoke removal technique based on a dark channel prior algorithm and a surgical instrument segmentation algorithm based on DeepLabv3+ architecture, respectively, so as to generate a laparoscopic image sequence without smoke interference and instrument occlusion.
The processor 52 may also be referred to as a CPU (Central Processing Unit). The processor 52 may be an integrated circuit chip having signal processing capabilities. The processor 52 may also be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Fig. 6 is a schematic structural diagram of a storage medium according to an embodiment of the application. The storage medium of the embodiment of the application stores program instructions 61 capable of implementing the following steps: framing an original surgical data video stream using a video processing technique to obtain laparoscopic image sequences corresponding to the left and right cameras of a binocular laparoscope system; calibrating the laparoscopic image sequences based on camera parameters to obtain laparoscopic image sequences in which image distortion is eliminated; and performing smoke removal and surgical instrument segmentation on the calibrated laparoscopic image sequences, using a smoke removal technique based on the dark channel prior algorithm and a surgical instrument segmentation algorithm based on the DeepLabv3+ architecture, respectively, to generate laparoscopic image sequences free of smoke interference and instrument occlusion. The program instructions 61 may be stored in the storage medium in the form of a software product, and include instructions for causing a device (which may be a personal computer, a server, a network device, etc.) or a processor to perform all or part of the steps of the methods according to the embodiments of the present application. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program instructions, or a terminal device such as a computer, a server, a mobile phone, or a tablet.
The server may be an independent server, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the system embodiments described above are merely illustrative, e.g., the partitioning of elements is merely a logical functional partitioning, and there may be additional partitioning in actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not implemented. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units. The foregoing is only the embodiments of the present application, and therefore, the patent scope of the application is not limited thereto, and all equivalent structures or equivalent processes using the descriptions of the present application and the accompanying drawings, or direct or indirect application in other related technical fields, are included in the scope of the application.