In that spirit I will point out in many ways why YAPF sucks and can use some improvements, if not replacement. But I am also aware that all existing alternatives suck even more. So if I sound like an ungrateful b******, I really don't mean to.
The (hopefully) attached savegame illustrates the problem I personally like to tinker with. How to move the most ridiculous amount of bulk goods over the longest possible distance. Sadly, the game always disintegrates after some time due to YAPF.
If you let the safegame run for a while, you will notice a large number of "lost" trains that shouldn't get lost. In particular, trains leaving the "North End" tend to get lost at a fairly high rate. AFAICS almost all of those trains should be able to reach their destinations without a problem. There may be a couple of exceptions, but those get lost between the false positives.
If you run the game a while longer, you will notice that trains start to accumulate at certain depots and keep trying to leave the depot towards the nearest "Wrong Way" sign, reverse back into the depot and try again. Trains taking an obviously stupid happen all the time, but at this stage my network is large enough to completely confuse YAPF. Ick.
And of course there is the performance issue as well. This particular savegame isn't so bad, but on occasions I have stopped playing because the performance became unbearable and I suppose that YAPF is the cause.
So much for the motivation. Let us take a look at the design next.
A* considered harmful
AFAICS YAPF still uses the A* algorithm, plus a large number of optimizations that help performance and hurt understanding of the code - along with ease of modifications, finding and fixing bugs, etc. Ignoring the optimizations, A* is rightfully considered the best algorithm for finding a path on a map. However, that is not how OpenTTD uses A*.
A* is a good algorithm for finding _one_ path _once_. If OpenTTD invoked A* once every time a trains leaves a station to find the path to the next, that would be fine. But instead it redoes the complete pathfinding calculation over and over again, whenever it encounters another intersection. Plus, OpenTTD invokes A* for a potentially large number of trains operating on the same map. As a result, A* tends to redo the same work many times over. I would expect it to perform just as bad as it does in reality.
Now adding optimizations on top of A* does not really change any of this. A* still has to redo most calculations and the bulk of time is spent redoing work that has been done many times before. Plus, their drawbacks on the maintenance-side can help explain some of the problems seen in my savegame. I would be surprised if many others played the game the same way I do. And as always, untested code and untested scenarios tend to give unwanted results.
The shining solution
Or not so shining. Not even a solution at all yet. It still mostly handwaving, really. And in particular it is a very special solution for my very special style of playing. I believe it can be generalized sufficiently to suit others as well. But when explaining the principle, it helps to keep it simple.
Main idea is, as so often, to cache results. So we create a graph of the map, consising of two types of nodes: Waypoints and Others. Graph is directed, so each node has a list of predecessor nodes (nodes from which a train may reach it) and a list of successor nodes (nodes that can be reached from here).
Next we assign a cost estimate to each edge. One option is to take the distance between the two nodes.
Next step is to sum up the cost estimates for each waypoint. Afterwards each node should have a list of reachable waypoints, along with the cost towards these. Sample pseudocode:
Code: Select all
for (each waypoint) {
set waypoint cost to 0
add waypoint to temp list;
for (each node on temp list) {
for (each predecessor of node) {
set waypoint cost to this node's cost plus predecessor->node cost;
if (predecessor already had lower waypoint cost)
restore old waypoint cost;
else
add predecessor to temp list;
}
}
}
The not so shining details
Construction
Obviously, when the map changes we need to redo the calculations above. One can play some optimization games here as well. An easy one is not to recalculate immediately but wait a short moment.
Congestion
The cost estimate can take this into account and f.e. double the cost estimate for an edge if it is currently occupied. Congestion changes over time, but it doesn't chance all that fast. So it should be sufficient to redo the cost calculation every so often and still have a fairly good estimate.
How to pick nodes
Noticed that I didn't specify this before? An obvious solution would be intersections. But my savegame illustrates that tens of thousands of intersactions are rather possible. For my particular map, signals would be a good choice. But maps with insane numbers of signals are possible as well.
One way to limit the total number of nodes would be to use grid lines. Say we pick all rails that exist on (X, Y) where either X or Y is a multiple of 20. Or we could start with intersections and join two nodes if they are adjacent.
I really don't have a good answer here yet. The main point is that for N nodes and W waypoints, we need to store N*W cost estimates, so we should try to limit N to some reasonable number.
Servicing
This I suspect to be the reason behind many of YAPF's bad decisions. If the service interval calls, a train should go look for a depot. But if the cost estimate after the depot is significantly worse than the current one, it may be a wiser decision to pick a different depot instead.
Also, trains can do opportunistic servicing. If a train has already expended, say, 50% of the service interval and would have to wait for a free path and could enter a depot instead of waiting and the cost estimate after leaving the depot is not (or not significantly) worse, just use the depot.
More
I'm sure there is more I have missed. But in very rough terms, this is the kind of pathfinder I would like to use instead of YAPF.