A*

November 14, 2021About 13 min

A*

寻路在游戏里的等同于A*算法。跟Dijkstra不一样，A*是用于点对点的路径查询而不是解决图中的最短路径问题。

The Problem

给一个图（一个有向非负权重图）和两个点（开始点和目标点），找到俩个点直接连通的一条最短花费的路径。

The Algorithm

这个算法的工作原理跟Dijkstra差不多。也是使用迭代器来遍历计算。不同的是A*关注的是最有可能导致总体路径最短的节点，而不是到当前最短花销值的节点。这就会产生如果这节点并不是最有可能产生最短路径的节点，那么A*的效率并不如Dijkstra。

Processing the Current Node

我们在当前节点中还是会存有：每个节点的连接，每个连接的花销，以及连接的尾结点。

额外的我们还需要记录一个值：从起始节点到此节点再到目标的路径的总花销的估计值。这个估计值是两个值得总和：到目前为止当前节点的总花销加上我们丛当前节点到目标的距离的估计。

上图显示了一些节点在图中的计算值。其中heuristic就是预计当前点到终点需要的花销。这个值不属于当前算法，是由其他程序生成的，暂时不管。

The Node List

同样的这里还是有两个列表：

open ：记录已经被发现的点，但是还未被处理的点。在处理点时把该点连接尾的点加入其中。
closed ：记录已经被处理过的点，处理完该点就把该点加入该列表。

不像Dijkstra，在每次遍历开始就选最少花销的节点，而是选择预计花销最短的点。

Calculating Cost-So-Far for Open and Closed Nodes

我们计算当前节点总花销跟Dijkstra是一样的，比较当前节点加上连接的花销与之前尾结点的总花销做比较就可以。

不同的是，A*可以发现更好的路径在closed列表中。因为每次遍历并不是找的最小花销的。所以在发现更好的路径在closed里时，可以把这个节点的总花销更新了，然后把它再次放入open列表中。

我们可以看到上图是已经遍历过几次了。当前节点C,发现起始点到E的总共花销小于之前算出的。我们就可以把E的总花销数值更新了，然后把它加入到open列表中再次计算。这样就可以到达终点G。

Terminating the Algorithm

在很多实现中，跟Dijkstra一样，当目标节点在open列表里是最小的节点时，就结束算法。

但是在A*中，我们每次取出的节点并不是最少花销的节点。所以我们并不能保证取出来的节点生成的路径是最短的路径。

但是如果我们为了找到最短路径时，反复去查找，那么这样所带来的的性能消耗，跟Dijkstra是差不多的。所以我们需要做的是对预测方法的调整。并且在第一次访问目标节点时终止。

Retrieving the Path

得到路径后再反序该路径即可。

Pseudo-Code

# This structure is used to keep track of the
# information we need for each node.
class NodeRecord:
    node: Node
    connection: Connection
    costSoFar: float
    estimatedTotalCost: float

function pathfindAStar (graph: Graph, 
                        start: Node, 
                        end: Node, 
                        heuristic: Heuristic 
                        ) -> Connection[]:

    # Initialize the record for the start node.
    startRecord = new NodeRecord()
    startRecord.node = start
    startRecord.connection = null
    startRecord.costSoFar = 0
    startRecord.estimatedTotalCost = heuristic.estimate(start)

    # Initialize the open and closed lists.
    open = new PathfindingList()
    open += startRecord

    closed = new PathfindingList()

    # Iterate through processing each node.
    while length(open) > 0:
        # Find the smallest element in the open list (using the
        # estimatedTotalCost).
        current = open.smallestElement()

        # If it is the goal node, then terminate.
        if current.node == goal:
            break

        # Otherwise get its outgoing connections.
        connections = graph.getConnections(current)

        # Loop through each connection in turn.
        for connection in connections:
            # Get the cost estimate for the end node.
            endNode = connection.getToNode()
            endNodeCost = current.costSoFar + connection.getCost()

            # If the node is closed we may have to skip, or remove it
            # from the closed list.
            if closed.contains(endNode):
                # Here we find the record in the closed list
                # corresponding to the endNode.
                endNodeRecord = closed.find(endNode)

                # If we didn’t find a shorter route, skip.
                if endNodeRecord.costSoFar <= endNodeCost:
                    continue

                # Otherwise remove it from the closed list.
                closed -= endNodeRecord

                # We can use the node’s old cost values to calculate
                # its heuristic without calling the possibly expensive
                # heuristic function.
                endNodeHeuristic = endNodeRecord.estimatedTotalCost - 
                endNodeRecord.costSoFar

            # Skip if the node is open and we’ve not found a better
            # route.
            else if open.contains(endNode):
                # Here we find the record in the open list
                # corresponding to the endNode.
                endNodeRecord = open.find(endNode)

                # If our route is no better, then skip.
                if endNodeRecord.costSoFar <= endNodeCost:
                    continue
                
                # Again, we can calculate its heuristic.
                endNodeHeuristic = endNodeRecord.cost -
                endNodeRecord.costSoFar

            # Otherwise we know we’ve got an unvisited node, so make a
            # record for it.
            else:
                endNodeRecord = new NodeRecord()
                endNodeRecord.node = endNode

                # We’ll need to calculate the heuristic value using
                # the function, since we don’t have an existing record
                # to use.
                endNodeHeuristic = heuristic.estimate(endNode)

                # We’re here if we need to update the node. Update the
                # cost, estimate and connection.
                endNodeRecord.cost = endNodeCost
                endNodeRecord.connection = connection
                endNodeRecord.estimatedTotalCost = endNodeCost + endNodeHeuristic

            # And add it to the open list.
            if not open.contains(endNode):
                open += endNodeRecord

            # We’ve finished looking at the connections for the current
            # node, so add it to the closed list and remove it from the
            # open list.
            open -= current
            closed += current

    # We’re here if we’ve either found the goal, or if we’ve no more
    # nodes to search, find which.
    if current.node != goal:
        # We’ve run out of nodes without finding the goal, so there’s
        # no solution.
        return null

    else:
        # Compile the list of connections in the path.
        path = []

        # Work back along the path, accumulating connections.
        while current.node != start:
            path += current.connection
            current = current.connection.getFromNode()
    
        # Reverse the path, and return it.
        return reverse(path)

Changes from Dijkstra

这个算法基本和Dijkstra一样。它多了一个检测在closed里的节点是否需要更新并从新计算该节点。同时也额外的计算了预估花销。

我们可以把每个节点的预估花销缓存起来，这样就不需要每次都重新计算一次。

Data Structure and Interface

其他数据基本都和Dijkstra一样，但是这里多了一个预估到终点的方法。

Pathfinding List

用来存储open和close列表，这两个列表对该算法非常重要，这直接关系到算法效率的好坏。这个列表主要有4个操作：

添加节点到列表
从列表中移除移除节点
找到最小的花销的节点
找到列表中中特殊的节点

其中3，4是最具有优化空间的。有很多种方式去优化，我们这里使用优先级队列（priority queue）来做优化。

Priority Queue

最简单的实现是open列表一直都是排序好了的。也就是说列表中的第一个就是我们需要取的数值。

我们只需要在添加节点时，保证把这个节点放入正确的位置即可。也就是说在添加的时候就按照排序的规则来添加。

这里我们可以使用链表来做基础的存储数据结构，但是我们在添加时还是需要遍历已有数据来获取正确的索引。

如果我们这里使用数组来作为基础数据，我们可以使用二分搜索（binary search）来找到插入点。这比使用链表快很多。如果数据量更大的话，这里的提升会更高些。

Priority Heaps

优先堆是一个以数组为基础的树形数据结构。每一个节点都有两个子节点并且子节点的值都比它大。

可以看上图，树是一个平衡树，因此没有一个分支比其他任何层次都深超过一个层次。此外，它还从左到右填充了每一层。

同时这个数也可以很好的映射到内存上。比如：当前节点是i,那么他的子节点的位置就是2i和2i+1。同时也可以使用堆排序，让树保持节点的有序性。取出和添加时间复杂度都是O(logn)，n是列表中元素的个数。

Bucketed Priority Queues

分桶优先队列是更复杂的数据结构，它可以对部分数据排序。这个数据结构提供了跨不同操作的混合性能，它对删除增加都可以做到很快。

要添加到这类优先级队列中，请在桶中进行搜索，以找到节点所在的桶。然后将其添加到桶列表的开头中。

简单说就是把数据分块，先找需要插入数据属于那个桶，再在桶中排序。这样就减少了循环。

Implementations

简单地说：

大图，百万级别的节点数，使用分桶优先队列更好
小图，几千，几万节点的，使用优先堆更好

Heuristic Function

我们可以简单设置预估花销接口如下：

class Heuristic:
    # An estimated cost to reach the goal from the given node.
    function estimate(node: Node) -> float

A Heuristic for Any Goal

由于为世界上每个可能的目标生成不同的启发式函数不方便，因此启发式通常由目标节点参数化。通过这种方式，可以编写一个通用的启发式实现来估计图中任意两个节点之间的距离。改接口差不多是这样：

class Heuristic:
    # Stores the goal node that this heuristic is estimating for.
    goalNode: Node

    # Estimated cost to reach the stored goal from the given node.
    function estimate(fromNode: Node) -> float:
        return estimate(fromNode, goalNode)
    
    # Estimated cost to move between any two nodes.
    function estimate(fromNode: Node, toNode: Node) -> float

然后可以用来在代码中调用路径查找器，例如：

pathfindAStar(graph, start, end, new Heuristic(end))

Heuristic Speed

这个计算是在很底层调用的，所以我们需要时刻注意这个算法的效率。

Algorithm Performance

决定A*算法性能好坏主要因数是这几个的数据结构：寻路列表，图和启发式计算。

A*执行的迭代次数由总估计路径成本小于目标的节点数给出。我们叫这个数为l，这个数比Dijkstra中的n要小。通用m代表每个节点平均的连接数。他的时间复杂度就为O(lm)，空间复杂度也为O(lm)。

除了Dijkstra对寻路列表和图的性能关注外，我们还添加了启发式函数。在上面的伪代码中，启发式值为每个节点计算一次，然后再重用。尽管如此，这在循环中发生得非常低，以O(l)时间的顺序排列。如果启发式值没有被重用，它将被称为O(lm)次数。

Node Array A*

在该算法中有一个关键步骤，在列表中搜索与特定节点对应的节点记录

Keeping a Node Array

我们可以使用增加内存的方式来提高运行效率，我需要申请一个拥有所有图中节点的数组。

如果节点使用整数进行顺序编号，那么我们根本不需要在这两个列表中搜索一个节点。我们可以简单的使用索引号来查找其在数组中的记录。

Checking if a Node Is in Open or Closed

当我们检测一个节点是否在open或close,那么就需要遍历列表。我们可以直接在节点信息中加一个类型来存储该节点是什么类型的。

# The structure used to track the information we need for each node.
class NodeRecord:
    node: Node
    connection: Connection
    costSoFar: float
    estimatedTotalCost: float
    category: {CLOSED, OPEN, UNVISITED}

如果我们使用类型来区分节点，那么我们是可以不需要close这个列表的。

The Open List Implementation

我们不能直接去掉open列表，因为还是需要求得最小的值。但是我们可以把记录改造成链表式，让每个记录点都记录着它后面的一个记录点。这样我们就需要修改节点的数据结构，如下：

# The structure used to track the information we need for each node.
class NodeRecord:
    node: Node
    connection: Connection
    costSoFar: float
    estimatedTotalCost: float
    category: {CLOSED, OPEN, UNVISITED}
    nextRecordInList: NodeRecord