我仍然不确定你的通知有什么问题,尽管确实有一些问题,正如评论中已经提到的。
-
仅考虑最近的四个(或就此而言,任何固定数量的)邻居可能会导致死胡同或图形的某些部分被完全切断,例如,不在其任何邻居的“最近x”范围内的孤立城市
-
你在表格上的支票
x in dataframe.values
将检查是否
x
是
任何
返回的numpy数组中的值
values
,不一定是id字段
-
使用DelaFrAMs代替开放列表的适当堆,并且关闭列表的哈希集使得搜索不会很慢,因为您必须一直搜索和排序整个列表(不确定大熊猫是否可以加快索引查找,但排序确实需要时间)。
不管怎样,我发现这是一个有趣的问题并尝试了一下事实证明,使用DATAFAFRAMS作为某种伪堆确实很慢,而且我发现DATAFRARM索引非常混乱(并且可能出错)。,所以我更改了代码以使用
namedtuple
为了数据和适当的
heapq
堆为
openlist
以及
dict
将节点映射到其父节点
closedlist
是的。此外,检查的次数比代码中的要少(例如,节点是否已经在openlist中),而这些并不重要。
import csv, geopy.distance, collections, heapq
Location = collections.namedtuple("Location", "ID name latitude longitude country".split())
data = {}
with open("stations.csv") as f:
r = csv.DictReader(f)
for d in r:
i, n, x, y, c = int(d["id"]), d["name"], d["latitude"], d["longitude"], d["country"]
if c == "GB":
data[i] = Location(i,n,x,y,c)
def calcH(start, end):
coords_1 = (data[start].latitude, data[start].longitude)
coords_2 = (data[end].latitude, data[end].longitude)
distance = (geopy.distance.vincenty(coords_1, coords_2)).km
return distance
def getneighbors(startlocation, n=10):
return sorted(data.values(), key=lambda x: calcH(startlocation, x.ID))[1:n+1]
def getParent(closedlist, index):
path = []
while index is not None:
path.append(index)
index = closedlist.get(index, None)
return [data[i] for i in path[::-1]]
startIndex = 25479
endIndex = 8262
Node = collections.namedtuple("Node", "ID F G H parentID".split())
h = calcH(startIndex, endIndex)
openlist = [(h, Node(startIndex, h, 0, h, None))]
closedlist = {}
while len(openlist) >= 1:
_, currentLocation = heapq.heappop(openlist)
print(currentLocation)
if currentLocation.ID in closedlist:
continue
closedlist[currentLocation.ID] = currentLocation.parentID
if currentLocation.ID == endIndex:
print("Complete")
for p in getParent(closedlist, currentLocation.ID):
print(p)
break
for other in getneighbors(currentLocation.ID):
g = currentLocation.G + calcH(currentLocation.ID, other.ID)
h = calcH(other.ID, endIndex)
f = g + h
heapq.heappush(openlist, (f, Node(other.ID, f, g, h, currentLocation.ID)))
这给了我一条从赫斯勒到利兹的路,这似乎更合理:
Location(ID=25479, name='Hessle', latitude='53.717567', longitude='-0.442169', country='GB')
Location(ID=8166, name='Brough', latitude='53.726452', longitude='-0.578255', country='GB')
Location(ID=25208, name='Eastrington', latitude='53.75481', longitude='-0.786612', country='GB')
Location(ID=25525, name='Howden', latitude='53.764526', longitude='-0.86068', country='GB')
Location(ID=7780, name='Selby', latitude='53.78336', longitude='-1.06355', country='GB')
Location(ID=26157, name='Sherburn-In-Elmet', latitude='53.797142', longitude='-1.23176', country='GB')
Location(ID=25308, name='Garforth Station', latitude='53.796211', longitude='-1.382083', country='GB')
Location(ID=8262, name='Leeds', latitude='53.795158', longitude='-1.549089', country='GB')
即使你不能用这个因为你
不得不
使用熊猫(?),也许这有助于你最终发现自己的实际错误。