Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a block diagram according to an embodiment of the present invention.
The invention fuses 3D point cloud data (also called first data) obtained by a laser radar (lidar, also called the first sensor) with 360-degree panoramic image data (also called second data) obtained by look-around cameras (also called the second sensor). The travelable region is segmented on the image, the point cloud is projected onto the segmentation result through the calibration relation among the sensors, and a ground point cloud (also called the first region) and a travelable region (also called the second region) are finally obtained. Alternatively, the ground point cloud may be designated the second region and the travelable region the first region. Furthermore, the point cloud obtained for the travelable region can be used for point cloud filtering. Fusing the lidar point cloud data with the panoramic image data of the look-around cameras therefore solves two tasks together: travelable region segmentation and ground filtering of the point cloud. Here, one lidar and a plurality of look-around cameras are employed. However, the numbers of lidars and look-around cameras are not limited thereto; for example, a plurality of lidars and a plurality of look-around cameras may be employed. In addition, the present invention may employ various types of lidar, such as 16-line, 32-line, 40-line, or 64-line lidar, and the lidar employed is not limited as long as the present invention can be carried out. Likewise, a color camera, an infrared camera, or the like may be employed as the look-around camera, and the look-around camera employed is not limited as long as the present invention can be carried out.
Sensor calibration is a basic requirement of automatic driving. It determines, by means of an algorithm, the coordinate relationships among the plurality of sensors carried by the vehicle body. The work is divided into intrinsic (internal reference) calibration, which determines the mapping relationship inside a sensor, and extrinsic (external reference) calibration, which determines the transformation between the sensor and a specific external coordinate system.
The specific method for calibrating the sensor is as follows.
Lidar is short for Light Detection and Ranging, a laser detection and ranging system; the data obtained by a lidar scan is lidar point cloud data. One frame of lidar data may be represented as an N x 4 matrix, where N is the number of scan points in the frame and each scan point is represented by a four-dimensional feature: the x-coordinate, y-coordinate, and z-coordinate of the object in the lidar coordinate system, and its reflectivity (intensity). A lidar point P may therefore be expressed as (x, y, z, intensity).
The image data is a grayscale image, which can be regarded as a W x H matrix whose elements take values between 0 and 255, where W represents the width of the image and H represents the height of the image.
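The N x 4 representation can be illustrated with a small array. This is only a sketch: the three points and the helper name `split_point_cloud` are made up for illustration and are not part of the described method.

```python
import numpy as np

# Hypothetical frame with N = 3 scan points; each row is (x, y, z, intensity)
# in the lidar coordinate system. The values are invented for illustration.
points = np.array([
    [ 5.0,  0.5, -1.7, 0.30],
    [10.0, -2.0, -1.6, 0.12],
    [ 3.0,  1.0,  0.8, 0.75],
], dtype=np.float32)

def split_point_cloud(points):
    """Split an N x 4 lidar frame into xyz coordinates and intensity."""
    if points.ndim != 2 or points.shape[1] != 4:
        raise ValueError("expected an N x 4 array of (x, y, z, intensity)")
    return points[:, :3], points[:, 3]

xyz, intensity = split_point_cloud(points)
```

Only the first three columns take part in the coordinate transformations described below; the intensity column is carried along unchanged.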
Extrinsic (external reference) matrix E of the look-around camera: the world coordinate system is converted to the look-around camera coordinate system through the extrinsic matrix of the look-around camera. Let the coordinates of a point in the world coordinate system be $(x_W, y_W, z_W)$; multiplying the point by the extrinsic matrix converts it from the world coordinate system to the camera coordinate system. In homogeneous form, the extrinsic matrix may be represented as $E = \begin{bmatrix} R & T \\ 0 & 1 \end{bmatrix}$,
where R represents the 3 x 3 rotation matrix and T represents the 3 x 1 translation matrix.
Intrinsic (internal reference) matrix I of the look-around camera: the look-around camera coordinate system is converted to the image pixel coordinate system through the intrinsic matrix. Let the coordinates of a point in the look-around camera coordinate system be $(x_C, y_C, z_C)$; multiplying the point by the intrinsic matrix gives its coordinates in the image pixel coordinate system.
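As a minimal sketch of this step, the following assumes an ideal pinhole camera without lens distortion, which the text does not state; a wide-angle look-around camera would in practice also need a distortion model. Under that assumption the product of the intrinsic matrix and the camera-frame point is a homogeneous pixel coordinate, which is divided by the depth. The focal lengths and principal point below are hypothetical values.

```python
import numpy as np

def camera_to_pixel(p_c, intrinsic):
    """Project a point (x_C, y_C, z_C) in camera coordinates to pixel (u, v).

    Assumes an ideal pinhole model: intrinsic @ p_c gives homogeneous pixel
    coordinates, which are then divided by the depth z_C.
    """
    p_c = np.asarray(p_c, dtype=float)
    if p_c[2] <= 0:
        raise ValueError("point is behind the camera (z_C <= 0)")
    uvw = intrinsic @ p_c
    return uvw[:2] / uvw[2]

# Hypothetical intrinsic matrix: focal lengths fx = fy = 500, principal point (320, 240).
I = np.array([
    [500.0,   0.0, 320.0],
    [  0.0, 500.0, 240.0],
    [  0.0,   0.0,   1.0],
])

u, v = camera_to_pixel([1.0, 0.5, 5.0], I)
```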
Extrinsic matrix L of the lidar: the world coordinate system is converted to the lidar coordinate system through the extrinsic matrix of the lidar. Let the coordinates of a point in the world coordinate system be $(x_W, y_W, z_W)$; multiplying the point by the extrinsic matrix converts it from the world coordinate system to the lidar coordinate system.
The transformation matrix $T_l^C$ from the lidar coordinate system to the look-around camera coordinate system can be obtained from the look-around camera extrinsic matrix E (i.e. the transformation matrix from the world coordinate system to the camera coordinate system) and the lidar extrinsic matrix L (i.e. the transformation matrix from the world coordinate system to the lidar coordinate system). Assuming that the coordinates of a certain point in the world coordinate system are P, the coordinates of P in the look-around camera coordinate system are $P_C = E P$, and the coordinates of P in the lidar coordinate system are $P_l = L P$. Hence $E^{-1} P_C = L^{-1} P_l$, i.e. $P_C = E L^{-1} P_l$. Letting $T_l^C = E L^{-1}$ gives $P_C = T_l^C P_l$, so $T_l^C$ is the transformation matrix from the lidar coordinate system to the look-around camera coordinate system. Therefore, the coordinates of a lidar point projected into the image pixel coordinate system can be obtained by calibrating the intrinsic and extrinsic parameters of the look-around camera and the extrinsic parameters of the lidar. Traditional look-around camera calibration techniques are divided into linear calibration methods, nonlinear optimization calibration methods, and two-step calibration methods. Here, camera calibration is carried out with Zhang Zhengyou's calibration method, which is a two-step calibration method. The extrinsic matrix relating the lidar to the world coordinate system is obtained by matching the point cloud against a high-precision map.
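The derivation can be checked numerically. The sketch below assumes 4 x 4 homogeneous extrinsic matrices and uses made-up rotations and translations; the helper names `make_extrinsic`, `rot_z`, and `lidar_to_camera` are hypothetical.

```python
import numpy as np

def make_extrinsic(rotation, translation):
    """Build a 4 x 4 homogeneous extrinsic matrix from R (3 x 3) and T (3,)."""
    m = np.eye(4)
    m[:3, :3] = rotation
    m[:3, 3] = translation
    return m

def rot_z(theta):
    """Rotation by theta radians about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def lidar_to_camera(E, L):
    """Return T_l^C = E @ inv(L), mapping lidar coordinates to camera coordinates."""
    return E @ np.linalg.inv(L)

# Hypothetical extrinsics: E maps world -> camera, L maps world -> lidar.
E = make_extrinsic(rot_z(np.pi / 2), [0.1, 0.0, -0.2])
L = make_extrinsic(rot_z(np.pi / 6), [0.0, 0.5, -1.5])
T_l_C = lidar_to_camera(E, L)

# Consistency check: for a world point P, T_l^C @ (L @ P) should equal E @ P.
P = np.array([2.0, -1.0, 0.5, 1.0])
P_C = E @ P
P_l = L @ P
```

Computing `T_l_C` once per camera and reusing it for every frame avoids repeating the matrix inversion for each point.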
As shown in fig. 1, the input of the system is the four images (front, rear, left, and right) obtained by the look-around cameras, on which travelable region segmentation is performed in parallel. The 3D point cloud data is projected onto the four images through the calibration relation. If a point of the point cloud falls within the travelable region segmented on the image, the point belongs to the ground point cloud. The image-based travelable region segmentation module adopts an Encoder-Decoder main framework: a deep residual network (ResNet) and a DenseASPP (Densely connected Atrous Spatial Pyramid Pooling) module serve as the Encoder, and a compact Decoder module is added to improve the segmentation of object edges. As the number of layers increases, a deep network faces the problem of vanishing gradients during back propagation, as well as the degradation problem, in which the training error rate and the test error rate increase rather than decrease. The residual network borrows the cross-layer connection idea of the Highway Network and improves on it, so that the network can be made deeper and the final classification effect is improved.
The basic principle of multi-sensor information fusion resembles the way the human brain comprehensively processes information: the information from various sensors (such as the lidar serving as the first sensor and the look-around camera serving as the second sensor) is complemented and optimally combined at multiple levels and in multiple spaces, finally producing a consistent interpretation of the observed environment.
In order to solve the above strongly correlated key problems, the application provides a new method for filtering point cloud data and detecting a travelable region. The method fuses pixel information obtained by a camera (such as the pixel information of the panoramic image data) with spatial point cloud information obtained by a lidar (such as the spatial point cloud information of the 3D point cloud data). The fusion first segments the picture pixel by pixel to find the pixels belonging to the ground, and then projects the lidar points into the camera pixel coordinate system through the calibration relation among the sensors, so that the spatial point cloud information belonging to the ground is obtained. The devices used are look-around cameras for obtaining pixel information and a lidar for obtaining point cloud data. That is, the lidar point cloud data and the 360-degree panoramic image data are fused by projecting the lidar point cloud onto the panoramic image using the calibration between the lidar and the look-around camera. Calibration yields the intrinsic matrix I of the look-around camera and the transformation matrix $T_l^C$ from the lidar to the look-around camera. Let the coordinates of a point P in the lidar coordinate system be $P_l = (x, y, z)$; then its coordinates in the look-around camera coordinate system are $P_C = T_l^C P_l$, and, from the intrinsic matrix, its coordinates in the image coordinate system are $P_i = I P_C = I T_l^C P_l$. The lidar point P is projected onto the picture; if its projected coordinates fall within the segmented travelable region, P is a ground point, and otherwise P is a non-ground point.
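The projection and ground test described in this paragraph can be sketched as follows. The sketch assumes a pinhole camera, a 4 x 4 homogeneous lidar-to-camera matrix, and a boolean segmentation mask; points behind the camera or outside the image are treated as non-ground, which is a choice of this sketch. The function name `label_ground_points` and all numeric values are hypothetical.

```python
import numpy as np

def label_ground_points(points_lidar, T_l_C, intrinsic, drivable_mask):
    """Return a boolean array: True where a lidar point projects into the mask.

    points_lidar : (N, 3) array of (x, y, z) in the lidar frame
    T_l_C        : (4, 4) lidar -> camera transform
    intrinsic    : (3, 3) camera intrinsic matrix I (pinhole assumption)
    drivable_mask: (H, W) boolean segmentation result, True = travelable
    """
    n = points_lidar.shape[0]
    homo = np.hstack([points_lidar, np.ones((n, 1))])   # homogeneous (N, 4)
    p_c = (T_l_C @ homo.T).T[:, :3]                     # camera coordinates P_C
    is_ground = np.zeros(n, dtype=bool)

    in_front = p_c[:, 2] > 0                            # drop points behind the camera
    uvw = (intrinsic @ p_c[in_front].T).T               # homogeneous pixel coordinates
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)     # column index
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)     # row index

    h, w = drivable_mask.shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)    # keep projections inside the image
    idx = np.flatnonzero(in_front)[inside]
    is_ground[idx] = drivable_mask[v[inside], u[inside]]
    return is_ground

# Hypothetical setup: the lidar frame coincides with the camera frame, and the
# lower half (rows 50..99) of a 100 x 100 image was segmented as travelable.
T_l_C = np.eye(4)
I = np.array([[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]])
mask = np.zeros((100, 100), dtype=bool)
mask[50:, :] = True
pts = np.array([
    [0.0,  1.0,  5.0],   # projects to (u, v) = (50, 70): inside the travelable region
    [0.0, -1.0,  5.0],   # projects to (50, 30): outside the travelable region
    [0.0,  1.0, -5.0],   # behind the camera
    [10.0, 1.0,  5.0],   # projects outside the image
])
flags = label_ground_points(pts, T_l_C, I, mask)
```

With four cameras, the same function would be called once per image with that camera's own matrices, and a point could be marked as ground if any of the four calls returns True for it; that combination rule is an assumption, not something the text specifies.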
Specifically, a travelable region is obtained through an image segmentation algorithm, and the points from the lidar are then mapped onto the segmentation result of the image to obtain the ground point cloud. Image segmentation predicts the category or object to which each pixel in the image belongs; image segmentation algorithms include traditional image segmentation methods and deep learning methods. In the deep learning method, the model input is a picture; a dimension-reduced feature map is obtained through a plurality of convolution layers, and a classification result for each pixel of the feature map is then obtained through a further plurality of convolution layers.
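The last stage of such a model, turning per-pixel class scores into a travelable region mask, can be sketched without the network itself. The example assumes the network outputs a score map of shape (classes, h, w) at reduced resolution and brings it back to the image size with nearest-neighbour upsampling; the upsampling method, the class numbering, and the function name `scores_to_mask` are assumptions made for illustration.

```python
import numpy as np

def scores_to_mask(scores, out_h, out_w, drivable_class=1):
    """Turn per-pixel class scores (C, h, w) into an (out_h, out_w) boolean mask."""
    labels = np.argmax(scores, axis=0)            # (h, w) class index per pixel
    h, w = labels.shape
    rows = np.arange(out_h) * h // out_h          # nearest-neighbour row lookup
    cols = np.arange(out_w) * w // out_w          # nearest-neighbour column lookup
    upsampled = labels[rows[:, None], cols[None, :]]
    return upsampled == drivable_class

# Hypothetical 2-class scores on a 2 x 2 feature map
# (class 0 = non-travelable, class 1 = travelable).
scores = np.array([
    [[0.9, 0.8], [0.2, 0.1]],   # scores for class 0
    [[0.1, 0.2], [0.8, 0.9]],   # scores for class 1
])
mask = scores_to_mask(scores, out_h=4, out_w=4)
```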
As shown in fig. 1, travelable region segmentation and ground point cloud recognition are performed by fusing the point cloud data and the image data; the detailed program flow is shown in fig. 1. Specifically, in fig. 1, the front image (Front Image), rear image (Rear Image), left image (Left Image), and right image (Right Image) acquired by the front, rear, left, and right cameras are placed in an image queue (ImageQueue); travelable region segmentation is performed on each image in parallel, and the segmentation results are put into SEGMENTLIST (the segmentation result list). The segmentation network adopts an Encoder-Decoder main framework: the Encoder extracts features with the deep residual network, DenseASPP enlarges the receptive field of the features without sacrificing their spatial resolution, and the Decoder fuses low-level and high-level features and up-samples the feature map to the size of the original image, yielding the travelable region (Drivable_Area); all regions other than the travelable region are non-travelable regions (Non-Drivable_Area). Projecting the point cloud onto the image comprises reading the calibration file, calculating the projection matrix P_velodyne_to_imgage, and loading the lidar point cloud data X to obtain the coordinates Y = P_velodyne_to_imgage * X of X projected onto the two-dimensional image. Let X be a point in the point cloud and Y its projection onto the two-dimensional image; if Y lies within Drivable_Area, X is a ground point, and otherwise X is a non-ground point.
Here, the two-dimensional image has already been divided into a travelable region and a non-travelable region, so every pixel carries a class. Since point X of the point cloud projects to a pixel Y of the two-dimensional image, the segmentation result shows whether Y lies within the travelable region.
As shown in fig. 1, the detailed flow is as follows:
(1) Define and initialize the global variables: the image queue ImageQueue, which stores the images obtained by the front, rear, left, and right cameras; the segmentation result list SEGMENTLIST, which stores the travelable region segmentation results of the images; the projection result list ProjectList, which stores the results of projecting the point cloud onto the pictures; and the ground point cloud GroundPoints, which stores the resulting ground points;
(2) Access the four cameras in turn and append the latest image read from each to the tail of ImageQueue;
(3) Check SegNum (the number of segmentation threads). When SegNum < 4, read and remove the image at the head of ImageQueue, pass it to the travelable region segmentation module for segmentation, and add the result to SEGMENTLIST. At the same time, pass the read picture to the projection module, which projects the point cloud onto the 2D image, and add the result to ProjectList. Increment SegNum by 1;
(4) When SegNum >= 4, SEGMENTLIST and ProjectList correspond one to one. According to the travelable region segmentation result, assign an attribute to each point projected onto the image: a point falling in the travelable region is given the ground attribute, and any other point is given the non-ground attribute. This yields GroundPoints; SegNum is then reset to 0;
(5) Determine whether the program has ended; if so, end the program, and otherwise return to step (2).
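Steps (1) to (4) can be sketched as a single cycle. For readability the sketch is single-threaded, whereas the text runs the segmentations in parallel, and the segmentation and projection modules are passed in as callables. The function name `run_cycle` and the callable signatures are hypothetical; the variable names mirror ImageQueue, SEGMENTLIST, ProjectList, GroundPoints, and SegNum.

```python
from collections import deque

def run_cycle(cameras, segment, project, point_cloud):
    """One pass through steps (1)-(4) for the four camera images.

    cameras : list of four callables, each returning the latest image
    segment : callable(image) -> 2D boolean mask, True = travelable
    project : callable(index, image, point_cloud) -> list of (point, u, v),
              containing only points that land inside that image
    """
    # Step (1): initialize the containers.
    image_queue = deque()      # ImageQueue
    segment_list = []          # SEGMENTLIST
    project_list = []          # ProjectList
    ground_points = []         # GroundPoints
    seg_num = 0                # SegNum

    # Step (2): read each camera in turn and append to the queue tail.
    for read in cameras:
        image_queue.append(read())

    # Step (3): while fewer than four images are processed, pop the queue head,
    # segment it, and project the point cloud onto it.
    while seg_num < 4 and image_queue:
        image = image_queue.popleft()
        segment_list.append(segment(image))
        project_list.append(project(seg_num, image, point_cloud))
        seg_num += 1

    # Step (4): pair segmentation and projection results one to one and keep
    # the points whose projected pixel lies in the travelable region.
    for mask, projected in zip(segment_list, project_list):
        for point, u, v in projected:
            if mask[v][u]:
                ground_points.append(point)
    return ground_points
```

Step (5) corresponds to calling `run_cycle` repeatedly until the program is asked to stop. A threaded variant would need to protect the shared lists and the counter with a lock.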
Fig. 2 is a schematic diagram of steps of a travelable region detection method according to an embodiment of the present invention.
Specifically, as shown in fig. 2, the travelable region detection method of the present invention includes the following steps.
S201, obtaining first data by using a first sensor;
S202, obtaining second data by using a second sensor;
S203, performing travelable region segmentation on the second data;
S204, calibrating the first sensor and the second sensor;
S205, after the calibration, projecting the first data onto the second data for fusion;
S206, judging whether the fused first data is located in the travelable region of the second data, and dividing the first data into first area data and second area data according to the judgment result.
The first sensor is a lidar, the second sensor is a look-around camera, the first data is the 3D point cloud data obtained by the lidar, the second data is the panoramic image data obtained by the look-around camera, the first area data is ground point cloud data, and the second area data is non-ground point cloud data. Of course, the present invention is not limited to the above specific examples, and no limitation applies as long as the essential function of the present invention can be achieved. For example, the first sensor and the second sensor are not limited to the above lidar and look-around camera, and may be other sensors.
Further, the method may include a step of performing point cloud filtering to remove the ground point cloud data, that is, the first area data, from the point cloud. In this way, travelable region segmentation and ground filtering of the point cloud are solved at the same time, i.e. a plurality of tasks are completed simultaneously.
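As a minimal sketch of this filtering step, assuming each point already carries a ground or non-ground flag from the judging step, the ground points can be removed with a boolean mask. The function name `remove_ground` and the sample values are hypothetical.

```python
import numpy as np

def remove_ground(points, is_ground):
    """Return the points whose ground flag is False (N x 4 in, M x 4 out)."""
    is_ground = np.asarray(is_ground, dtype=bool)
    if is_ground.shape[0] != points.shape[0]:
        raise ValueError("one ground flag is required per point")
    return points[~is_ground]

# Hypothetical frame of (x, y, z, intensity) rows with per-point ground flags.
frame = np.array([
    [ 5.0,  0.5, -1.7, 0.30],
    [10.0, -2.0, -1.6, 0.12],
    [ 3.0,  1.0,  0.8, 0.75],
])
ground_flags = [True, True, False]
obstacles = remove_ground(frame, ground_flags)
```

The remaining points are the non-ground point cloud that a downstream 3D detection stage would consume.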
Example 1
The invention can be used in the field of automatic driving of vehicles, where the regions in which the vehicle can and cannot travel must be known in real time so that the planning control module can decide accordingly whether to go straight, stop, detour, and so on.
Example 2
When lidar point cloud data is used for 3D object detection, the ground point cloud can degrade the detection accuracy, so accurately removing the ground point cloud can improve the detection result.
Example 3
The invention can also be used in the field of automatic driving of ships, where the regions in which the ship can travel (the water surface) and cannot travel (land or other obstacles) must be known in real time so that the planning control module can decide whether to go straight, stop, or detour.
Fig. 3 is a schematic diagram of main modules of a travelable region detection apparatus according to an embodiment of the present invention.
As shown in fig. 3, the travelable region detection apparatus 400 includes a laser radar 401 as a first acquisition module, a looking-around camera 402 as a second acquisition module, a processing unit 403, a calibration unit 404, a fusion unit 405, and a judgment unit 406.
The first acquisition module 401 obtains first data. The second acquisition module 402 obtains second data. The processing unit 403 performs travelable region segmentation on the second data. The calibration unit 404 calibrates the first acquisition module 401 and the second acquisition module 402. The fusion unit 405 fuses the first data and the second data. The judgment unit 406 judges whether the first data is located in the travelable region of the second data and divides the first data into first area data and second area data accordingly.
Fig. 4 illustrates an exemplary system architecture 500 of a travelable region detection method or travelable region detection apparatus to which embodiments of the present invention can be applied.
As shown in fig. 4, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 is used as a medium to provide communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may use the terminal devices 501, 502, 503 to interact with the server 505 via the network 504 to receive or send messages and the like. Various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software, may be installed on the terminal devices 501, 502, 503.
The terminal devices 501, 502, 503 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 505 may be a server providing various services, such as a background management server providing support for shopping-type websites browsed by the user using the terminal devices 501, 502, 503. The background management server may analyze the received data such as the request for detecting the drivable area, and may feed back the processing result (for example, the detection result) to the terminal device.
It should be noted that, the method for detecting a travelable region according to the embodiment of the present invention is generally executed by the server 505, and accordingly, the travelable region detecting device is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks and servers in fig. 4 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 5, there is illustrated a schematic diagram of a computer system 600 suitable for use in implementing an electronic device of an embodiment of the present application. The electronic device shown in fig. 5 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present application.
As shown in fig. 5, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM603, various programs and data required for the operation of the system 600 are also stored. The CPU 601, ROM602, and RAM603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Connected to the I/O interface 605 are an input section 606 including a keyboard, a mouse, and the like, an output section 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like, a storage section 608 including a hard disk, and the like, and a communication section 609 including a network interface card such as a LAN card, a modem, and the like. The communication section 609 performs communication processing via a network such as the internet. The drive 610 is also connected to the I/O interface 605 as needed. Removable media 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on drive 610 so that a computer program read therefrom is installed as needed into storage section 608.
In particular, according to the disclosed embodiments of the application, the processes described above with reference to the main step schematic diagrams may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the main step schematic. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. The above-described functions defined in the system of the present application are performed when the computer program is executed by a Central Processing Unit (CPU) 601.
The computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of a computer-readable storage medium may include, but are not limited to, an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
The primary step diagrams and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the main step diagrams or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or main step diagrams, and combinations of blocks in the block diagrams or main step diagrams, can be implemented by special purpose hardware-based systems that perform the specified functions or operations, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor, which may be described, for example, as a travelable region detection device comprising a first acquisition module, a second acquisition module, a processing unit, a calibration unit, a fusion unit, and a judgment unit. In some cases, the names of these modules do not constitute a limitation on the modules themselves.
As a further aspect, the invention also provides a computer readable medium, which may be included in the device described in the above embodiments or may exist alone without being fitted into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform: a step of obtaining first data with a first sensor, a step of obtaining second data with a second sensor, a step of performing travelable region segmentation on the second data, a step of calibrating the first sensor and the second sensor, a step of projecting the first data onto the second data for fusion, and a step of judging whether the first data is located in the travelable region of the second data and dividing the first data into first area data and second area data.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.