
Finding the distance to the nearest point in a point cloud on a uniform grid

algorithm

erik · 15 years ago

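I have a 3D grid of size AxBxC with equal distance, d, between the points in the grid. Given a number of points, what is the best way of finding the distance to the closest point for each grid point (every grid point should contain the distance to the closest point in the point cloud), given the assumptions below?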

Assume that A, B and C are quite big in relation to d, giving a grid of maybe 500x500x500, and that there will be around 1 million points.

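Also assume that if the distance to the nearest point exceeds a distance of D, we do not care about the nearest-point distance, and it can safely be set to some large number (D is maybe 2 to 10 times d).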

Since there will be a great number of grid points and points to search from, a simple exhaustive search:

for each grid point:
   for each point:
     if distance between points < minDistance:
       minDistance = distance between points

is not a good alternative.

I was thinking of doing something along the lines of:

create a container of size A*B*C where each element holds a container of points
for each point:
  define indexX = round((point position x - grid min position x)/d)
  // same for y and z
  add the point to the correct index of the container

for each grid point:
  search the container of that grid point and find the closest point
  if no points in container and D > 0.5d:
    search the 26 container indices nearest to the grid point for a closest point
  .. continue with next layer until a point is found or the distance to that layer 
        is greater than D

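Basically: put the points in buckets and do a radial search outwards until a point is found for each grid point. Is this a good way of solving the problem, or are there better/faster ways? A solution that lends itself to parallelization is preferred.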
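The bucket-and-search idea above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names `bucket_points` and `nearest_distance`, the `origin` parameter, and the use of a dictionary keyed by grid index are all hypothetical choices, not part of the question. One deliberate difference from the pseudocode: instead of stopping as soon as any point is found, the sketch keeps searching while the next layer could still hold something closer, because a point bucketed at index offset L can be as close as (L - 0.5)*d to the grid point.

```python
import math
from collections import defaultdict

def bucket_points(points, d, origin=(0.0, 0.0, 0.0)):
    """Map each point to the index of its nearest grid point (hypothetical helper)."""
    buckets = defaultdict(list)
    for p in points:
        idx = tuple(round((p[a] - origin[a]) / d) for a in range(3))
        buckets[idx].append(p)
    return buckets

def nearest_distance(g, buckets, d, D, origin=(0.0, 0.0, 0.0), far=math.inf):
    """Distance from grid index g = (i, j, k) to the nearest point, or `far` if none within D."""
    gp = tuple(origin[a] + g[a] * d for a in range(3))
    best = math.inf
    layer = 0
    # Any point bucketed in layer L is at least (L - 0.5) * d away from gp,
    # so stop once that lower bound reaches the best distance so far (or D).
    while (layer - 0.5) * d < min(best, D):
        for di in range(-layer, layer + 1):
            for dj in range(-layer, layer + 1):
                for dk in range(-layer, layer + 1):
                    if max(abs(di), abs(dj), abs(dk)) != layer:
                        continue  # visit only the outer shell of this layer
                    key = (g[0] + di, g[1] + dj, g[2] + dk)
                    for p in buckets.get(key, ()):
                        r = math.sqrt(sum((gp[a] - p[a]) ** 2 for a in range(3)))
                        best = min(best, r)
        layer += 1
    return best if best <= D else far
```

Each grid point is processed independently of the others, so the outer loop over grid points can be split across workers without any shared writes.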

5 answers

Keith Randall 15 years ago

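Actually, I think I have a better way to go, as the number of grid points is much larger than the number of sample points. Let |Grid| = N and |Samples| = M; then the nearest neighbor search algorithms will be something like O(N lg M), as you need to look up all N grid points, and each lookup is (best case) O(lg M).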

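Instead, loop over the sample points. Store for each grid point the closest sample point found so far. For each sample point, just check all grid points within distance D of the sample to see if the current sample is closer than any previously processed samples.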

The running time is then O(N + (D/d)^3 M), which should be better than O(N lg M) when D/d is small.

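Even when D/d is larger, you might still be OK if you can work out a cutoff strategy. For example, if we're checking a grid point at distance 5 from our sample, and that grid point is already marked as being at distance 1 from a previous sample, then all grid points "beyond" that grid point don't need to be checked, because the previous sample is guaranteed to be closer than the current sample we're processing. All you have to do is (and I don't think it is easy, but it should be doable) define what "beyond" means and figure out how to iterate through the grid to avoid doing any work for areas "beyond" such grid points.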
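As a rough illustration of the basic loop-over-samples idea (without any cutoff strategy), here is a minimal pure-Python sketch. The function name `splat_distances`, the `origin` and `far` parameters, and the nested-list grid are assumptions made for the example; a real 500x500x500 grid would want a compact array type instead.

```python
import math

def splat_distances(samples, shape, d, D, origin=(0.0, 0.0, 0.0), far=math.inf):
    """For each grid point, the distance to the nearest sample, or `far` if none is within D."""
    A, B, C = shape
    dist = [[[far] * C for _ in range(B)] for _ in range(A)]
    for (x, y, z) in samples:
        # Per-axis index range of grid points that can lie within D of this sample,
        # clamped to the grid; an empty range means the sample is too far away.
        ranges = []
        for coord, o, n in ((x, origin[0], A), (y, origin[1], B), (z, origin[2], C)):
            lo = max(0, math.ceil((coord - D - o) / d))
            hi = min(n - 1, math.floor((coord + D - o) / d))
            ranges.append((lo, hi))
        (i0, i1), (j0, j1), (k0, k1) = ranges
        for i in range(i0, i1 + 1):
            dx = origin[0] + i * d - x
            for j in range(j0, j1 + 1):
                dy = origin[1] + j * d - y
                for k in range(k0, k1 + 1):
                    dz = origin[2] + k * d - z
                    r = math.sqrt(dx * dx + dy * dy + dz * dz)
                    # Keep the sample only if it is within D and closer than anything seen so far.
                    if r <= D and r < dist[i][j][k]:
                        dist[i][j][k] = r
    return dist
```

Each sample touches at most about (2D/d + 1)^3 grid points, which is where the (D/d)^3 M term in the running time comes from. If samples are processed in parallel, two samples may update the same grid point, so the update would need an atomic minimum or a per-worker grid merged at the end.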

Amber 15 years ago

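Take a look at octrees. They are a data structure often used to partition 3D space efficiently, in a way that speeds up lookups of objects that are near each other spatially.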

Keith Randall 15 years ago

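You can build a nearest neighbor search structure (Wikipedia) on your sample points, then query it for each of your grid points. There are a number of algorithms mentioned on the Wikipedia page. Perhaps octrees, k-d trees, or R-trees would be appropriate.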

High Performance Mark 15 years ago

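One approach, which may or may not suit your application, is to recast your thinking and define each grid "point" to be the centre of a cube which divides your space into cells. You then have a 3D array of such cells and store the points in the cells, choosing the most appropriate data structure. To use your own words, put the points in buckets in the first place.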

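I guess that you may be running some sort of large-scale simulation, and the approach I suggest is not unusual in such applications. At each time step (if I have guessed correctly) you have to recalculate the distance from each cell to its nearest point, and move points from cell to cell. This parallelizes very easily.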

Edit: searching for particle-particle and particle-particle particle-mesh methods may give you some ideas.

denis 15 years ago

A note on Keith Randall's method, expanding shells or cubes around the start points:
one can expand in various orders. Here is some Python-style pseudocode:

S = set of 1m startpoints
near = grid 500x500x500 -> nearest s in S
    initially s for s in S, else 0
for r in 1 .. D:
    for s in S:
        nnew = 0 
        for p in shell of radius r around s:
            if near[p] == 0:
                near[p] = s
                nnew += 1
        if nnew == 0:
            remove s from S  # bonk, stop expanding from s

"Stop expanding from s" is fine in 1D (bonk left, bonk right), but 2D/3D shells are irregular.
It is easier and faster to do whole cubes in one pass:

near = grid 500x500x500 -> { dist, nearest s in S }
    initially { 0, s } for s in S, else { infinity, 0 }
for s in S:
    for p in approximatecube of radius D around s:
        if |p - s| < near[p].dist:  # is s nearer ?
            near[p] = { |p - s|, s }

Here "approximatecube" may be a full DxDxD cube, or you could lop off the corners, as one might for a square in 2D.

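Also, FWIW: with erik's numbers there are on average 500^3/1M ~ 2^7 ~ 5^3 empty grid points per sample point. So at first I thought that 5x5x5 cubes around the 1M sample points would cover most of the whole grid. Not so: ~1/e of the grid points stay empty (Poisson distribution).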
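The ~1/e figure can be checked with a few lines of arithmetic. This sketch assumes the sample points are spread uniformly and independently over the grid and ignores boundary effects; the variable names are chosen for the example only.

```python
import math

N = 500 ** 3     # number of grid points
M = 10 ** 6      # number of sample points
cube = 5 ** 3    # grid points covered by one 5x5x5 cube

# Expected number of cubes covering any given grid point.
mean_cover = M * cube / N

# Probability that none of the M cubes covers a given grid point.
p_empty_exact = (1 - cube / N) ** M

# Poisson approximation with the same mean.
p_empty_poisson = math.exp(-mean_cover)

print(mean_cover, p_empty_exact, p_empty_poisson)
```

With these numbers the mean coverage is exactly 1, so the fraction of grid points left uncovered comes out very close to e^-1, roughly 37%.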