Commit ea4d840

Add section on flexible indices
1 parent 343d485 commit ea4d840

1 file changed

+59
-3
lines changed

gsoc-gpu.md

Lines changed: 59 additions & 3 deletions
@@ -26,7 +26,7 @@ Due to these discoveries, the scope of the GPU modernization effort was expande
2626

2727
### Fixes to Octree search methods and modernizing CUDA functions
2828

29-
Related PRs: [[#4146]](https://github.com/PointCloudLibrary/pcl/pull/4146) [[#4306]](https://github.com/PointCloudLibrary/pcl/pull/4306) [[#4313]](https://github.com/PointCloudLibrary/pcl/pull/4313)
29+
Related PRs: [[4146]](https://github.com/PointCloudLibrary/pcl/pull/4146) [[4306]](https://github.com/PointCloudLibrary/pcl/pull/4306) [[4313]](https://github.com/PointCloudLibrary/pcl/pull/4313)
3030

3131
After comprehensively going through the GPU search methods to investigate their functionality and the causes of the above issues, we identified two separate bugs as the underlying cause:
3232
1. In approximate nearest search and K nearest search, an outdated method was being used to synchronize data between threads in order to sort distances across warp threads. This was fixed by replacing the functionality with the warp-level primitives introduced in CUDA 9.0, as detailed in https://developer.nvidia.com/blog/using-cuda-warp-level-primitives/.
@@ -36,15 +36,15 @@ Since much of the code inside the above functions utilized an outdated concept o
3636

3737
### Implementation of new traversal mechanism of approximate nearest search
3838

39-
Related PRs: [[#4294]](https://github.com/PointCloudLibrary/pcl/pull/4294)
39+
Related PRs: [[4294]](https://github.com/PointCloudLibrary/pcl/pull/4294)
4040

4141
The existing implementation of approximate nearest search utilized a simple traversal mechanism which traverses down octree nodes until an empty node is found. Once an empty node is discovered, all points within the parent are searched exhaustively for the closest point. However, the CPU counterpart of the approximate nearest search algorithm uses a heuristic (distance from the query point to the voxel center) to determine the most appropriate voxel to traverse in case an empty node is discovered. Thus, the CPU algorithm will always traverse to the lowest level of an octree. The same traversal method was adapted to the Morton-code-based octree traversal mechanism and implemented for the two GPU approximate nearest search methods.
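
The heuristic described above can be illustrated with a minimal C++ sketch. It is not the PCL implementation: the `sketch` namespace, the `closest_child` function, and the child layout (bit 0 selects the x half, bit 1 the y half, bit 2 the z half) are assumptions made purely for illustration, and the actual GPU code works on Morton codes instead.

```cpp
#include <cstdint>
#include <limits>

namespace sketch {

struct Point { float x, y, z; };

// Hypothetical helper: among the occupied children of a cubic node, return
// the index (0..7) of the child whose centre is closest to the query point,
// or -1 if every child is empty.
// `occupancy` holds one bit per child; the bit layout is an assumption.
inline int closest_child(const Point& query, const Point& center,
                         float half_size, std::uint8_t occupancy)
{
  int best = -1;
  float best_dist = std::numeric_limits<float>::max();
  const float offset = 0.5f * half_size;  // child centre offset per axis

  for (int child = 0; child < 8; ++child) {
    if (!(occupancy & (1u << child)))
      continue;  // skip empty children instead of stopping the descent

    const float cx = center.x + ((child & 1) ? offset : -offset);
    const float cy = center.y + ((child & 2) ? offset : -offset);
    const float cz = center.z + ((child & 4) ? offset : -offset);

    const float dx = query.x - cx;
    const float dy = query.y - cy;
    const float dz = query.z - cz;
    const float dist = dx * dx + dy * dy + dz * dz;  // squared distance

    if (dist < best_dist) {
      best_dist = dist;
      best = child;
    }
  }
  return best;
}

}  // namespace sketch
```

Repeating this choice at every level lets the descent continue past an empty octant, which is why such a traversal reaches the lowest level of the tree whenever the node has at least one occupied child.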
4242

4343
In addition a new test was designed to assess the functionality of the new traversal mechanism and to ensure that it tallies with that of the CPU approximate nearest search.
4444

4545
### Modifying search functions to return square distances
4646

47-
Related PRs: [[#4338]](https://github.com/PointCloudLibrary/pcl/pull/4338) [[4340]](https://github.com/PointCloudLibrary/pcl/pull/4340)
47+
Related PRs: [[4338]](https://github.com/PointCloudLibrary/pcl/pull/4338) [[4340]](https://github.com/PointCloudLibrary/pcl/pull/4340)
4848

4949
One noticeable flaw in the current GPU search implementations was the inability to return square distances to the identified result points. In order to counter this, the search methods were modified to keep track of and return the distances to the identified results. For Approximate nearest search and K nearest search this was relatively easy, and did not incur a time penalty.
5050

@@ -83,6 +83,62 @@ One additional drawback with the current GPU K nearest neighbour search algorith
8383

8484
## Introducing flexible types for indices
8585

86+
As laser scanning and LIDAR become more popular, the need to handle clouds with a very large number of points grows. However, many algorithms within the Point Cloud Library cannot handle point clouds containing more than about 2 billion points, because their indices are limited to 32 bits (a signed 32-bit index can address at most 2^31 - 1 = 2,147,483,647 points). Furthermore, there is currently no single standard type for indices; instead, a variety of types such as `int`, `long`, `unsigned int`, and others are used. Therefore, there is a pressing need to switch to a standard type for indices.
87+
88+
On the flip side, types with larger capacity use more memory, which may significantly reduce the memory efficiency of the library and introduce further complications such as poorer cache utilization. This can be a serious concern considering the large variety of platforms that PCL is used on. Thus, the ideal solution is to allow users to choose which index type to use at compile time, based on their intended use case and platform.
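
To make this trade-off concrete, the following minimal C++ sketch computes the capacity of a signed 32-bit index and the memory needed to store one index per point at different widths. The names `index_buffer_bytes` and `max_points_int32` are hypothetical and not part of PCL.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not a PCL API): bytes needed to store one index of
// type IndexT for each of `num_points` points.
template <typename IndexT>
constexpr std::uint64_t index_buffer_bytes(std::uint64_t num_points)
{
  return num_points * sizeof(IndexT);
}

// Largest value a signed 32-bit index can hold: 2^31 - 1.
constexpr std::int64_t max_points_int32 =
    std::numeric_limits<std::int32_t>::max();

static_assert(max_points_int32 == 2147483647,
              "signed 32-bit indices top out at roughly 2.1 billion");

// Doubling the index width doubles the size of every index buffer.
static_assert(index_buffer_bytes<std::int64_t>(1000) ==
                  2 * index_buffer_bytes<std::int32_t>(1000),
              "64-bit indices need twice the memory of 32-bit indices");
```

For a cloud of one billion points, a single index buffer would therefore grow from about 4 GB to about 8 GB when moving from 32-bit to 64-bit indices.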
89+
90+
This flexibility can be offered to the user by transitioning PCL's various modules to the `pcl::index_t` type.
91+
92+
### Providing compile time options to select index types
93+
94+
Related PRs: [[4166]](https://github.com/PointCloudLibrary/pcl/pull/4166)
95+
96+
CMake options were added to allow users to select:
97+
- Sign of index (signed / unsigned, signed by default);
98+
- Size of index in bits (8 / 16 / 32 / 64, 32 by default);
99+
at compile-time, from PCL 1.12 onwards.
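
As a rough illustration of how such build options can map to a type, the sketch below selects a fixed-width integer from two preprocessor values. The macro names `SKETCH_INDEX_SIZE` and `SKETCH_INDEX_SIGNED`, the `sketch` namespace, and the `int_type` template are placeholders invented for this example; the actual option names and implementation are those in the PR above.

```cpp
#include <cstdint>
#include <type_traits>

// Placeholder macros standing in for values a build system would inject,
// e.g. -DSKETCH_INDEX_SIZE=64. The defaults mirror the text: 32-bit, signed.
#ifndef SKETCH_INDEX_SIZE
#define SKETCH_INDEX_SIZE 32
#endif
#ifndef SKETCH_INDEX_SIGNED
#define SKETCH_INDEX_SIGNED 1
#endif

namespace sketch {

// Map (bit width, signedness) to a fixed-width integer type.
// The primary template is left undefined so that an unsupported width
// fails at compile time.
template <int Bits, bool Signed> struct int_type;

template <> struct int_type<8, true>   { using type = std::int8_t; };
template <> struct int_type<8, false>  { using type = std::uint8_t; };
template <> struct int_type<16, true>  { using type = std::int16_t; };
template <> struct int_type<16, false> { using type = std::uint16_t; };
template <> struct int_type<32, true>  { using type = std::int32_t; };
template <> struct int_type<32, false> { using type = std::uint32_t; };
template <> struct int_type<64, true>  { using type = std::int64_t; };
template <> struct int_type<64, false> { using type = std::uint64_t; };

// The index type selected for this build.
using index_t = int_type<SKETCH_INDEX_SIZE, (SKETCH_INDEX_SIGNED != 0)>::type;

static_assert(sizeof(index_t) * 8 == SKETCH_INDEX_SIZE,
              "selected type must match the requested width");

}  // namespace sketch
```

Resolving the type at compile time keeps the choice free of run-time cost, at the price of requiring a rebuild to change it.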
100+
101+
### Adding a CI job for testing the 64-bit unsigned index type
102+
103+
Related PRs: [[4184]](https://github.com/PointCloudLibrary/pcl/pull/4184)
104+
105+
An additional job was added to the CI pipeline to check for any failures that may arise when compiling and running tests with 64-bit unsigned indices, as opposed to the default 32-bit signed indices. The CI job was initially configured to build only the modules that have been transitioned to the new index type, so that further modules can be added to the build configuration as they are transitioned.
106+
107+
### Transitioning fundamental classes to the `index_t` type
108+
109+
Related PRs: [[4173]](https://github.com/PointCloudLibrary/pcl/pull/4173) [[4199]](https://github.com/PointCloudLibrary/pcl/pull/4199) [[4198]](https://github.com/PointCloudLibrary/pcl/pull/4198) [[4205]](https://github.com/PointCloudLibrary/pcl/pull/4205) [[4211]](https://github.com/PointCloudLibrary/pcl/pull/4211) [[4224]](https://github.com/PointCloudLibrary/pcl/pull/4224) [[4228]](https://github.com/PointCloudLibrary/pcl/pull/4228) [[4231]](https://github.com/PointCloudLibrary/pcl/pull/4231) [[4256]](https://github.com/PointCloudLibrary/pcl/pull/4256) [[4257]](https://github.com/PointCloudLibrary/pcl/pull/4257)
110+
111+
A set of fundamental classes, such as `pcl::PointCloud`, lies at the core of PCL. These classes contain various data representations that did not share a common index type. To address this, all such indices in these classes were switched to the `index_t` type.
112+
113+
For situations where unsigned indices were required, a new type called `uindex_t` was also introduced, which acts as an unsigned version of `index_t`.
114+
115+
This transition was carried out for the following classes:
116+
- PointCloud
117+
- PCLPointCloud2
118+
- PCLBase
119+
- PCLPointField
120+
- Correspondences
121+
- Vertices
122+
- PCLImage
123+
124+
During this transition, the numerous sign-comparison warnings and other errors that arose in some of the above classes required significant additional work, which took up considerable time.
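
The kind of sign-comparison issue mentioned above typically appears when a signed index is compared with an unsigned container size. The following sketch shows one common way to resolve it; the `sketch` namespace, the `index_t` stand-in, and the `count_in_range` function are hypothetical and do not reproduce any specific PCL change.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

using index_t = std::int32_t;  // stand-in for the configured index type

// Count how many indices refer to a valid position in a cloud of
// `cloud_size` points.
inline std::size_t count_in_range(const std::vector<index_t>& indices,
                                  std::size_t cloud_size)
{
  std::size_t valid = 0;
  for (const index_t idx : indices) {
    // Writing `idx < cloud_size` directly mixes signed and unsigned
    // operands: the compiler warns (-Wsign-compare) and a negative idx
    // would be converted to a huge unsigned value. Checking the sign
    // first and casting explicitly avoids both problems.
    if (idx >= 0 && static_cast<std::size_t>(idx) < cloud_size)
      ++valid;
  }
  return valid;
}

}  // namespace sketch
```

When the index type can be configured as either signed or unsigned, every such comparison has to be correct in both configurations, which helps explain why this part of the work was time-consuming.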
125+
126+
Furthermore, any changes beyond the above fundamental classes would have required additional workarounds if they were carried out before the changes to the fundamental classes had been merged (these changes were planned to be merged in PCL 1.12). Thus, work was shifted to the GPU module at this point.
127+
128+
In addition, while the common module had already been made compatible with `index_t`, its tests had not been updated. They were brought in line through a straightforward replacement of integer vectors with `index_t` vectors.
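
A replacement of this kind might look like the sketch below, where a test helper that previously built a `std::vector<int>` builds a vector of the index type instead. The `sketch` namespace, the `Indices` alias, and `make_sequential_indices` are local stand-ins for illustration and are not taken from the PCL test suite.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

namespace sketch {

using index_t = std::int32_t;            // stand-in for the configured index type
using Indices = std::vector<index_t>;    // local alias, assumed for this sketch

// Before the transition a helper like this would have returned
// std::vector<int>. Only the element type changes; the logic is untouched.
inline Indices make_sequential_indices(std::size_t n)
{
  Indices indices(n);
  std::iota(indices.begin(), indices.end(), index_t{0});  // 0, 1, ..., n-1
  return indices;
}

}  // namespace sketch
```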
129+
130+
### Transitioning the octree module
131+
132+
Related PRs: [[4179]](https://github.com/PointCloudLibrary/pcl/pull/4179)
133+
134+
All indices within the octree module were converted to `index_t` and its derivatives. This was also a fairly straightforward process of replacing types, since the fundamental types had already been transitioned. The tests for the octree module were modified in the same way.
135+
136+
### Conclusion and future work
137+
138+
The work carried out primarily focuses on setting the stage for an easy transition to flexible index types. To this end, the `index_t` type has been adopted in the fundamental classes, and the resulting complications have been addressed. The CI pipeline has also been extended to verify the success of this transition.
139+
140+
However, since the above changes only partially cover the transition to flexible index types, additional work must be carried out to complete it. Specifically, the remaining modules must be converted to `index_t`. Findings from the transition of the octree module suggest a fairly straightforward path for converting these modules and their tests.
141+
86142
## Summary
87143

88144
The internship period was focused on tackling “modernization of the GPU octree module” and “Introducing flexible types for indices”. The scope of these tasks was initially underestimated in the original proposal, and additional requirements were discovered, which resulted in some of the other goals being skipped in favour of these tasks.
