kdtree_cuda by koide3 · Pull Request #86 · koide3/gtsam_points

koide3 · 2025-12-06T23:29:28Z

No description provided.

Copilot

Pull request overview

This pull request adds GPU-accelerated GICP (Generalized Iterative Closest Point) matching using KdTree on CUDA. The implementation provides a GPU-based alternative to the existing CPU-based GICP factor, allowing for faster point cloud registration on systems with CUDA support.

Changes:

Introduces KdTreeGPU class for GPU-accelerated nearest neighbor search using KdTree data structure
Adds IntegratedGICPFactorGPU factor class for GPU-accelerated GICP matching cost computation
Implements CUDA kernels for KdTree search, correspondence computation, and GICP derivatives linearization
Adds comprehensive test coverage for both KdTree GPU functionality and GICP GPU factor integration
Updates demo and CI configuration to support the new GPU GICP factor

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
`include/gtsam_points/ann/kdtree_gpu.hpp`	Header defining KdTreeGPU class with GPU nearest neighbor search interface
`include/gtsam_points/factors/integrated_gicp_factor_gpu.hpp`	Header defining IntegratedGICPFactorGPU class for GPU-accelerated GICP matching
`include/gtsam_points/factors/integrated_gicp_derivatives.cuh`	CUDA header for GICP derivatives computation utilities
`include/gtsam_points/cuda/kernels/kdtree.cuh`	CUDA kernel for KdTree nearest neighbor search on GPU
`include/gtsam_points/cuda/kernels/gicp_derivatives.cuh`	CUDA kernels for computing GICP derivatives and error
`include/gtsam_points/cuda/kernels/correspondence.hpp`	Header defining Correspondence structure for point matching
`src/gtsam_points/ann/kdtree_gpu.cpp`	CPU-side implementation of KdTreeGPU (tree construction, memory management)
`src/gtsam_points/ann/kdtree_gpu.cu`	GPU-side implementation of KdTree nearest neighbor search kernel
`src/gtsam_points/factors/integrated_gicp_factor_gpu.cpp`	Implementation of IntegratedGICPFactorGPU factor methods
`src/gtsam_points/factors/integrated_gicp_derivatives.cu`	Base implementation of GICP derivatives computation
`src/gtsam_points/factors/integrated_gicp_derivatives_inliers.cu`	Inlier correspondence management for GICP
`src/gtsam_points/factors/integrated_gicp_derivatives_linearize.cu`	GICP factor linearization on GPU
`src/gtsam_points/factors/integrated_gicp_derivatives_compute.cu`	GICP error computation on GPU
`src/test/test_kdtree.cpp`	Adds KdTreeGPU unit tests for nearest neighbor search accuracy
`src/test/test_matching_cost_factors.cpp`	Adds GICP_CUDA to integration tests for factor optimization
`src/demo/demo_matching_cost_factors.cpp`	Adds GICP_GPU option to matching cost factors demo
`CMakeLists.txt`	Adds new CUDA source files to build configuration
`.github/workflows/build.yml`	Removes CUDA 12.2 build, keeping 12.5 and 13.1

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-16T15:06:26Z

+  check_error << cudaMemcpyAsync(nodes, h_nodes.data(), sizeof(KdTreeNodeGPU) * num_nodes, cudaMemcpyHostToDevice, stream);
+}
+
+KdTreeGPU::~KdTreeGPU() {


Inconsistent stream parameter in cudaFreeAsync calls. In the destructor, cudaFreeAsync is called with nullptr for the stream parameter, but these allocations were done with a specific stream in the constructor (line 52-55). The stream should match the allocation stream or be synchronized before freeing. Using nullptr may cause undefined behavior if the allocation stream is still using the memory.

Suggested change

KdTreeGPU::~KdTreeGPU() {

KdTreeGPU::~KdTreeGPU() {

// Ensure all GPU work that might use 'indices' or 'nodes' has completed

check_error << cudaDeviceSynchronize();

Fixed in the latest commit. Added cudaDeviceSynchronize() before the cudaFreeAsync calls to ensure all GPU work using those allocations has completed before freeing.

Copilot · 2026-02-16T15:06:26Z

+      if (stack_size > MAX_STACK_SIZE - 2) {
+        printf("kdtree stack overflow!!");
+      } else if (cut_sq_dist < result.second) {
+        search_stack[stack_size].first = other_child;
+        search_stack[stack_size++].second = cut_sq_dist;
+      }
+
+      search_stack[stack_size].first = best_child;
+      search_stack[stack_size++].second = 0.0f;


Stack overflow handling is incomplete. When stack_size exceeds MAX_STACK_SIZE - 2, a warning is printed but execution continues, potentially writing beyond array bounds on lines 59-60. After printing the warning, the function should either return early with an error result or skip adding to the stack to prevent buffer overflow.

Fixed in the latest commit. Added continue after the overflow warning so both the other_child and best_child pushes are skipped when the stack is near-full, preventing any writes beyond array bounds.

Copilot · 2026-02-16T15:06:26Z

+
+  /**
+   * @brief Create a unary GICP_GPU factor between a fixed target pose and an active source pose.
+   * @param targfixed_target_pose  Fixed target pose


Typo in parameter name documentation. "targfixed_target_pose" should be "fixed_target_pose" to match the actual parameter name on line 60.

Suggested change

* @param targfixed_target_pose Fixed target pose

* @param fixed_target_pose Fixed target pose

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

- kdtree.cuh: Add `continue` after stack overflow warning to prevent unconditional best_child push from writing beyond array bounds - kdtree_gpu.cpp: Add cudaDeviceSynchronize() before cudaFreeAsync calls in destructor to ensure GPU work on allocated memory has completed

Copilot

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

+void IntegratedGICPDerivatives::update_inliers(int new_num_inliers) {
+  if (inlier_evaluation_point_gpu == nullptr) {
+    return;
+  }
+
+  // For now, we keep all correspondences without filtering
+  // The correspondence filtering can be added later with validity checks
+  this->num_inliers = new_num_inliers;
+}


Copilot · 2026-06-12T00:33:37Z

+namespace {
+
+/// @brief Kernel to compute source-target correspondence by KdTree NN search
+struct kdtree_correspondence_kernel {
+  kdtree_correspondence_kernel(


Fixed in the latest commit. Removed the unused kdtree_correspondence_kernel anonymous-namespace struct and trimmed the includes down to only those actually required by issue_compute_error: thrust/iterator/transform_iterator.h, cub/device/device_reduce.cuh, gicp_derivatives.cuh, and stream_temp_buffer_roundrobin.hpp.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

…rated_gicp_derivatives_compute.cu

koide3 added 11 commits December 7, 2025 08:29

kdtree_cuda

e2826fc

gicp_gpu (wip)

56925ca

Merge branch 'master' of github.com:koide3/gtsam_points into kdtree_cuda

eef66d8

gicp_gpu

6e251c8

set_max_correspondence_distance

9c50910

cache point correspondences

87c594f

unify buffers

2cb203c

make config.hpp visible to gtsam_points_cuda

95407ca

Merge branch 'master' of github.com:koide3/gtsam_points into kdtree_cuda

8e45c43

drop cuda12.2

d74f3df

avoid using thrust in header

4cfd41f

koide3 requested a review from Copilot February 16, 2026 15:00

Copilot started reviewing on behalf of koide3 February 16, 2026 15:01 View session

Copilot AI reviewed Feb 16, 2026

View reviewed changes

koide3 and others added 2 commits February 21, 2026 11:20

Update include/gtsam_points/cuda/kernels/kdtree.cuh

649f5cb

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update include/gtsam_points/factors/integrated_gicp_factor_gpu.hpp

8c232ff

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot started work on behalf of koide3 June 12, 2026 00:02 View session

Copilot finished work on behalf of koide3 June 12, 2026 00:06

koide3 requested a review from Copilot June 12, 2026 00:09

Copilot started reviewing on behalf of koide3 June 12, 2026 00:09 View session

Copilot AI reviewed Jun 12, 2026

View reviewed changes

koide3 and others added 2 commits June 12, 2026 09:26

Potential fix for pull request finding

75d2b12

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Potential fix for pull request finding

173cad2

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot started work on behalf of koide3 June 12, 2026 00:32 View session

Remove unused kdtree_correspondence_kernel and trim includes in integ…

f9e1f15

…rated_gicp_derivatives_compute.cu

Copilot finished work on behalf of koide3 June 12, 2026 00:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kdtree_cuda#86

kdtree_cuda#86
koide3 wants to merge 17 commits into
masterfrom
kdtree_cuda

koide3 commented Dec 6, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI Feb 16, 2026

Uh oh!

Copilot AI Jun 12, 2026

Uh oh!

Copilot AI Feb 16, 2026

Uh oh!

Copilot AI Jun 12, 2026

Uh oh!

Uh oh!

Copilot AI Feb 16, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	* @param targfixed_target_pose Fixed target pose
	* @param fixed_target_pose Fixed target pose

Conversation

koide3 commented Dec 6, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Copilot AI Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants