gsoc-gpu.md (59 additions, 3 deletions)
@@ -26,7 +26,7 @@ Due to these discoveries, the scope of the GPU modernization effort was expande
26
26
27
27
### Fixes to Octree search methods and modernizing CUDA functions
28
28
29
-
Related PRs: [[#4146]](https://github.com/PointCloudLibrary/pcl/pull/4146)[[#4306]](https://github.com/PointCloudLibrary/pcl/pull/4306)[[#4313]](https://github.com/PointCloudLibrary/pcl/pull/4313)
29
+
Related PRs: [[4146]](https://github.com/PointCloudLibrary/pcl/pull/4146)[[4306]](https://github.com/PointCloudLibrary/pcl/pull/4306)[[4313]](https://github.com/PointCloudLibrary/pcl/pull/4313)
30
30
31
31
After comprehensively going through the GPU search methods to investigate their functionality and the causes of the above issues, we identified two separate bugs as the underlying cause:
32
32
1. In approximate nearest search and K nearest search, an outdated method was being used to synchronize data between threads in order to sort distances across warp threads. This was fixed by replacing that functionality with the warp-level primitives introduced in CUDA 9.0, as detailed in https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/ .
@@ -36,15 +36,15 @@ Since much of the code inside the above functions utilized an outdated concept o
36
36
37
37
### Implementation of new traversal mechanism of approximate nearest search
38
38
39
-
Related PRs: [[#4294]](https://github.com/PointCloudLibrary/pcl/pull/4294)
39
+
Related PRs: [[4294]](https://github.com/PointCloudLibrary/pcl/pull/4294)
40
40
41
41
The existing implementation of approximate nearest search used a simple traversal mechanism that descends through octree nodes until an empty node is found. Once an empty node is discovered, all points within the parent are searched exhaustively for the closest point. However, when the CPU counterpart of the approximate nearest search algorithm encounters an empty node, it uses a heuristic (the distance from the query point to the voxel center) to determine the most appropriate voxel to traverse instead. As a result, the CPU algorithm always traverses to the lowest level of the octree. The same traversal method was adapted to the Morton code based octree traversal mechanism and implemented for the two GPU approximate nearest search methods.
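The voxel-center heuristic described above can be illustrated with a minimal CPU-side sketch. It is not the PCL or GPU implementation: the `Voxel` struct, the function names, and the child encoding (bit 0 selects the upper half in x, bit 1 in y, bit 2 in z) are assumptions made for illustration only.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical axis-aligned voxel; not a PCL type.
struct Voxel {
  float min_x, min_y, min_z;
  float max_x, max_y, max_z;
};

// Returns the sub-voxel covered by child `child` (0..7) of `parent`.
// Assumed encoding: bit 0 -> upper half in x, bit 1 -> y, bit 2 -> z.
inline Voxel childVoxel(const Voxel& parent, int child) {
  const float cx = 0.5f * (parent.min_x + parent.max_x);
  const float cy = 0.5f * (parent.min_y + parent.max_y);
  const float cz = 0.5f * (parent.min_z + parent.max_z);
  Voxel v = parent;
  if (child & 1) v.min_x = cx; else v.max_x = cx;
  if (child & 2) v.min_y = cy; else v.max_y = cy;
  if (child & 4) v.min_z = cz; else v.max_z = cz;
  return v;
}

// Among the children marked non-empty in `occupancy_mask`, picks the one
// whose center is closest to the query point. Returns -1 if all are empty.
inline int closestOccupiedChild(const Voxel& parent, std::uint8_t occupancy_mask,
                                float qx, float qy, float qz) {
  int best = -1;
  float best_d2 = std::numeric_limits<float>::max();
  for (int child = 0; child < 8; ++child) {
    if (!(occupancy_mask & (1u << child))) continue;  // skip empty children
    const Voxel v = childVoxel(parent, child);
    const float dx = 0.5f * (v.min_x + v.max_x) - qx;
    const float dy = 0.5f * (v.min_y + v.max_y) - qy;
    const float dz = 0.5f * (v.min_z + v.max_z) - qz;
    const float d2 = dx * dx + dy * dy + dz * dz;
    if (d2 < best_d2) {
      best_d2 = d2;
      best = child;
    }
  }
  return best;
}
```

Applying `closestOccupiedChild` repeatedly from the root always ends at a leaf, because an empty child is never selected while an occupied sibling exists.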
42
42
43
43
In addition, a new test was designed to assess the functionality of the new traversal mechanism and to ensure that its results agree with those of the CPU approximate nearest search.
44
44
45
45
### Modifying search functions to return squared distances
46
46
47
-
Related PRs: [[#4338]](https://github.com/PointCloudLibrary/pcl/pull/4338)[[4340]](https://github.com/PointCloudLibrary/pcl/pull/4340)
47
+
Related PRs: [[4338]](https://github.com/PointCloudLibrary/pcl/pull/4338)[[4340]](https://github.com/PointCloudLibrary/pcl/pull/4340)
48
48
49
49
One noticeable flaw in the existing GPU search implementations was their inability to return the squared distances to the identified result points. To address this, the search methods were modified to keep track of and return the squared distances to the identified results. For approximate nearest search and K nearest search this was relatively easy, and did not incur a time penalty.
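One plausible reason why tracking the distance is cheap for nearest-neighbour style searches is that the squared distance is already computed to compare candidates. The brute-force CPU sketch below illustrates this idea only; `Point3`, `NearestResult`, and `nearestWithSqrDistance` are hypothetical names and do not reflect the GPU code.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical point and result types, for illustration only.
struct Point3 {
  float x, y, z;
};

struct NearestResult {
  std::ptrdiff_t index;  // -1 if the input is empty
  float sqr_distance;    // squared Euclidean distance to the query
};

// Brute-force nearest neighbour that returns the squared distance it
// already needs for the comparison, so no extra pass is required.
inline NearestResult nearestWithSqrDistance(const std::vector<Point3>& points,
                                            const Point3& query) {
  NearestResult result{-1, std::numeric_limits<float>::max()};
  for (std::size_t i = 0; i < points.size(); ++i) {
    const float dx = points[i].x - query.x;
    const float dy = points[i].y - query.y;
    const float dz = points[i].z - query.z;
    const float d2 = dx * dx + dy * dy + dz * dz;
    if (d2 < result.sqr_distance) {
      result.index = static_cast<std::ptrdiff_t>(i);
      result.sqr_distance = d2;
    }
  }
  return result;
}
```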
50
50
@@ -83,6 +83,62 @@ One additional drawback with the current GPU K nearest neighbour search algorith
83
83
84
84
## Introducing flexible types for indices
85
85
86
+
As laser scanning and LIDAR become more popular, the need arises to handle clouds with a very large number of points. However, many algorithms within the Point Cloud Library cannot handle point clouds containing more than about 2 billion points, because their indices are limited to 32 bits. Furthermore, there is currently no single standard type used for indices; instead, a variety of types such as `int`, `long`, `unsigned int`, and others are used. Therefore, there is a pressing need to switch to a standard type for indices.
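The 2 billion figure follows from the largest value a signed 32-bit integer can hold, 2^31 - 1 = 2,147,483,647. The sketch below checks whether a cloud of a given size can be addressed by a given index type; `canIndex` is a hypothetical helper written for this illustration, not a PCL function.

```cpp
#include <cstdint>
#include <limits>

// Returns true if every point of a cloud with `num_points` points can be
// addressed by an index of type IndexT (largest index is num_points - 1).
template <typename IndexT>
constexpr bool canIndex(std::uint64_t num_points) {
  return num_points == 0 ||
         (num_points - 1) <=
             static_cast<std::uint64_t>(std::numeric_limits<IndexT>::max());
}

static_assert(canIndex<std::int32_t>(2147483648ULL),
              "2^31 points still fit: the largest index is 2^31 - 1");
static_assert(!canIndex<std::int32_t>(3000000000ULL),
              "3 billion points overflow a signed 32-bit index");
static_assert(canIndex<std::uint32_t>(3000000000ULL),
              "an unsigned 32-bit index reaches about 4.29 billion");
static_assert(canIndex<std::int64_t>(3000000000ULL),
              "a 64-bit index removes the limit in practice");
```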
87
+
88
+
On the flip side, types with larger capacity use more memory, so the memory efficiency of the library may be significantly reduced and further complications (with caching, for example) may arise. This can be a serious concern considering the large variety of platforms that PCL is used on. Thus, the ideal solution would be to allow users to choose which index type to use at compile time, based on their intended use case and platform.
89
+
90
+
This flexibility can be offered to the user by transitioning PCL's various modules to the `pcl::index_t` type.
91
+
92
+
### Providing compile-time options to select index types
93
+
94
+
Related PRs: [[4166]](https://github.com/PointCloudLibrary/pcl/pull/4166)
95
+
96
+
CMake options were added to allow users to select:
97
+
- Signedness of index (signed / unsigned, signed by default);
98
+
- Size of index in bits (8 / 16 / 32 / 64, 32 by default);
99
+
at compile-time, from PCL 1.12 onwards.
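As a rough sketch of how a build-time size and signedness choice can be turned into a single index type, consider the following. The macro names `SKETCH_INDEX_SIZE` and `SKETCH_INDEX_SIGNED`, the `sketch` namespace, and the `int_type` helper are assumptions for illustration; they stand in for whatever definitions the build system passes to the compiler and are not the actual PCL definitions.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical macros that a build system could define, for example via
// -DSKETCH_INDEX_SIZE=64 -DSKETCH_INDEX_SIGNED=0. Defaults: 32-bit, signed.
#ifndef SKETCH_INDEX_SIZE
#define SKETCH_INDEX_SIZE 32
#endif
#ifndef SKETCH_INDEX_SIGNED
#define SKETCH_INDEX_SIGNED 1
#endif

namespace sketch {

// Maps (bit width, signedness) to a fixed-width integer type.
// Unsupported widths fail to compile because no specialization exists.
template <int Bits, bool Signed> struct int_type;
template <> struct int_type<8, true>   { using type = std::int8_t; };
template <> struct int_type<8, false>  { using type = std::uint8_t; };
template <> struct int_type<16, true>  { using type = std::int16_t; };
template <> struct int_type<16, false> { using type = std::uint16_t; };
template <> struct int_type<32, true>  { using type = std::int32_t; };
template <> struct int_type<32, false> { using type = std::uint32_t; };
template <> struct int_type<64, true>  { using type = std::int64_t; };
template <> struct int_type<64, false> { using type = std::uint64_t; };

// The single index type used throughout the (hypothetical) library.
using index_t = int_type<SKETCH_INDEX_SIZE, (SKETCH_INDEX_SIGNED != 0)>::type;

// Unsigned counterpart with the same width.
using uindex_t = std::make_unsigned<index_t>::type;

}  // namespace sketch
```

With this layout, library code refers only to `index_t` and `uindex_t`, so changing the two build options is enough to switch every index in the code base.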
100
+
101
+
### Adding a CI job for testing the 64-bit unsigned index type
102
+
103
+
Related PRs: [[4184]](https://github.com/PointCloudLibrary/pcl/pull/4184)
104
+
105
+
An additional job was added to the CI pipeline to check for any failures that may arise when compiling and running tests with 64-bit unsigned indices, as opposed to the default 32-bit signed indices. The CI job was initially configured to build only the modules that had already been transitioned to the new index type, so that further modules could be added to the build configuration as they were transitioned.
106
+
107
+
### Transitioning fundamental classes to the `index_t` type
108
+
109
+
Related PRs: [[4173]](https://github.com/PointCloudLibrary/pcl/pull/4173)[[4199]](https://github.com/PointCloudLibrary/pcl/pull/4199)[[4198]](https://github.com/PointCloudLibrary/pcl/pull/4198)[[4205]](https://github.com/PointCloudLibrary/pcl/pull/4205)[[4211]](https://github.com/PointCloudLibrary/pcl/pull/4211)[[4224]](https://github.com/PointCloudLibrary/pcl/pull/4224)[[4228]](https://github.com/PointCloudLibrary/pcl/pull/4228)[[4231]](https://github.com/PointCloudLibrary/pcl/pull/4231)[[4256]](https://github.com/PointCloudLibrary/pcl/pull/4256)[[4257]](https://github.com/PointCloudLibrary/pcl/pull/4257)
110
+
111
+
A set of fundamental classes such as `pcl::PointCloud` lies at the core of PCL. These classes contain various data representations which did not share a common index type. To resolve this, all such indices in these classes were switched to the `index_t` type.
112
+
113
+
For situations where unsigned indices were required, a new type called `uindex_t` was also introduced, which acts as an unsigned version of `index_t`.
114
+
115
+
This transition was carried out for the following classes:
116
+
- PointCloud
117
+
- PCLPointCloud2
118
+
- PCLBase
119
+
- PCLPointField
120
+
- Correspondences
121
+
- Vertices
122
+
- PCLImage
123
+
124
+
During the above transition, it turned out that significant additional work was required to address the numerous sign comparison warnings and other errors that the change introduced in some of the above classes, which took up considerable time.
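A typical source of such sign comparison warnings is a loop that compares a signed index against an unsigned container size. The sketch below shows one common way to avoid the warning; the local `index_t` and `Indices` aliases are stand-ins assumed to be signed 32-bit, and `positiveIndices` is a hypothetical function, not code from the PRs above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins for the library's index types (assumed signed 32-bit here).
using index_t = std::int32_t;
using Indices = std::vector<index_t>;

// Collects the indices of all strictly positive values.
inline Indices positiveIndices(const std::vector<float>& values) {
  Indices out;
  // Writing `i < values.size()` with a signed `i` mixes signed and unsigned
  // operands and triggers -Wsign-compare. Converting the size once keeps the
  // loop condition same-signed. This assumes the size fits in index_t.
  const auto n = static_cast<index_t>(values.size());
  for (index_t i = 0; i < n; ++i) {
    if (values[static_cast<std::size_t>(i)] > 0.0f) {
      out.push_back(i);
    }
  }
  return out;
}
```

The explicit casts make each signed/unsigned conversion visible, which matters once the signedness of the index type becomes a build option.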
125
+
126
+
Furthermore, any changes beyond transitioning the above fundamental classes would have required additional workarounds if they were carried out before the changes to the fundamental classes had been merged (these were planned to be merged in PCL 1.12). Thus, work was shifted to the GPU module at this point.
127
+
128
+
In addition, while the common module had already been modified to make it compatible with `index_t`, its tests had not. They were updated with a very straightforward replacement of integer vectors with `index_t` vectors.
129
+
130
+
### Transitioning the octree module
131
+
132
+
Related PRs: [[4179]](https://github.com/PointCloudLibrary/pcl/pull/4179)
133
+
134
+
All indices within the octree module were converted to `index_t` and its derivatives. This was also a fairly straightforward process of replacing types, once the fundamental types had been transitioned. The tests for the octree module were also modified accordingly.
135
+
136
+
### Conclusion and future work
137
+
138
+
The work carried out primarily focuses on setting the stage for an easy transition towards flexible index types. To this end, the `index_t` type has been adopted in the fundamental classes and the resulting complications have been addressed. The CI pipeline has also been modified to verify the success of this transition.
139
+
140
+
However, since the above changes only partially cover the transition to flexible index types, additional work must be carried out to complete it. Specifically, the rest of the modules must be converted to `index_t`. The work done on the octree module suggests a fairly straightforward path towards converting these modules and their tests.
141
+
86
142
## Summary
87
143
88
144
The internship period was focused on tackling “Modernization of the GPU octree module” and “Introducing flexible types for indices”. The scope of these tasks was under-estimated in the original proposal, and additional requirements were discovered along the way, which resulted in skipping some of the other goals in favour of prioritizing the above tasks.