YAPF rewrite

joern · Post by **joern** » 04 Oct 2010 13:03

"All mail clients suck. This one just sucks less." -- Michael R. Elkins

In that spirit I will point out in many ways why YAPF sucks and can use some improvements, if not replacement. But I am also aware that all existing alternatives suck even more. So if I sound like an ungrateful b******, I really don't mean to.

The (hopefully) attached savegame illustrates the problem I personally like to tinker with. How to move the most ridiculous amount of bulk goods over the longest possible distance. Sadly, the game always disintegrates after some time due to YAPF.

If you let the safegame run for a while, you will notice a large number of "lost" trains that shouldn't get lost. In particular, trains leaving the "North End" tend to get lost at a fairly high rate. AFAICS almost all of those trains should be able to reach their destinations without a problem. There may be a couple of exceptions, but those get lost between the false positives.

If you run the game a while longer, you will notice that trains start to accumulate at certain depots and keep trying to leave the depot towards the nearest "Wrong Way" sign, reverse back into the depot and try again. Trains taking an obviously stupid happen all the time, but at this stage my network is large enough to completely confuse YAPF. Ick.

And of course there is the performance issue as well. This particular savegame isn't so bad, but on occasions I have stopped playing because the performance became unbearable and I suppose that YAPF is the cause.

So much for the motivation. Let us take a look at the design next.

A* considered harmful

AFAICS YAPF still uses the A* algorithm, plus a large number of optimizations that help performance and hurt understanding of the code - along with ease of modifications, finding and fixing bugs, etc. Ignoring the optimizations, A* is rightfully considered the best algorithm for finding a path on a map. However, that is not how OpenTTD uses A*.

A* is a good algorithm for finding _one_ path _once_. If OpenTTD invoked A* once every time a trains leaves a station to find the path to the next, that would be fine. But instead it redoes the complete pathfinding calculation over and over again, whenever it encounters another intersection. Plus, OpenTTD invokes A* for a potentially large number of trains operating on the same map. As a result, A* tends to redo the same work many times over. I would expect it to perform just as bad as it does in reality.

Now adding optimizations on top of A* does not really change any of this. A* still has to redo most calculations and the bulk of time is spent redoing work that has been done many times before. Plus, their drawbacks on the maintenance-side can help explain some of the problems seen in my savegame. I would be surprised if many others played the game the same way I do. And as always, untested code and untested scenarios tend to give unwanted results.

The shining solution

Or not so shining. Not even a solution at all yet. It still mostly handwaving, really. And in particular it is a very special solution for my very special style of playing. I believe it can be generalized sufficiently to suit others as well. But when explaining the principle, it helps to keep it simple.

Main idea is, as so often, to cache results. So we create a graph of the map, consising of two types of nodes: Waypoints and Others. Graph is directed, so each node has a list of predecessor nodes (nodes from which a train may reach it) and a list of successor nodes (nodes that can be reached from here).

Next we assign a cost estimate to each edge. One option is to take the distance between the two nodes.

Next step is to sum up the cost estimates for each waypoint. Afterwards each node should have a list of reachable waypoints, along with the cost towards these. Sample pseudocode:

Code: Select all

for (each waypoint) {
	set waypoint cost to 0
	add waypoint to temp list;
	for (each node on temp list) {
		for (each predecessor of node) {
			set waypoint cost to this node's cost plus predecessor->node cost;
			if (predecessor already had lower waypoint cost)
				restore old waypoint cost;
			else
				add predecessor to temp list;
		}
	}
}

Once we have all this information, we no longer have to run A* all the way to the destination. All we have to do when a train reaches a node is pick one of the successor nodes, preferrably one that has a low cost estimate towards the train's waypoint.

The not so shining details

Construction

Obviously, when the map changes we need to redo the calculations above. One can play some optimization games here as well. An easy one is not to recalculate immediately but wait a short moment.

Congestion

The cost estimate can take this into account and f.e. double the cost estimate for an edge if it is currently occupied. Congestion changes over time, but it doesn't chance all that fast. So it should be sufficient to redo the cost calculation every so often and still have a fairly good estimate.

How to pick nodes

Noticed that I didn't specify this before? An obvious solution would be intersections. But my savegame illustrates that tens of thousands of intersactions are rather possible. For my particular map, signals would be a good choice. But maps with insane numbers of signals are possible as well.

One way to limit the total number of nodes would be to use grid lines. Say we pick all rails that exist on (X, Y) where either X or Y is a multiple of 20. Or we could start with intersections and join two nodes if they are adjacent.

I really don't have a good answer here yet. The main point is that for N nodes and W waypoints, we need to store N*W cost estimates, so we should try to limit N to some reasonable number.

Servicing

This I suspect to be the reason behind many of YAPF's bad decisions. If the service interval calls, a train should go look for a depot. But if the cost estimate after the depot is significantly worse than the current one, it may be a wiser decision to pick a different depot instead.

Also, trains can do opportunistic servicing. If a train has already expended, say, 50% of the service interval and would have to wait for a free path and could enter a depot instead of waiting and the cost estimate after leaving the depot is not (or not significantly) worse, just use the depot.

More

I'm sure there is more I have missed. But in very rough terms, this is the kind of pathfinder I would like to use instead of YAPF.

Eddi · Post by **Eddi** » 04 Oct 2010 23:18

by experience, 99% of all cases where a train gets lost when the player definitely is sure that they should not, it's a missing catenary or wrongly oriented signal.

other common source of lost trains is turning around on long wait times. set the three signal wait times to 255 to disable turning around.

joern · Post by **joern** » 05 Oct 2010 08:21

by experience, 99% of all cases where a train gets lost when the player definitely is sure that they should not, it's a missing catenary or wrongly oriented signal.

You didn't actually look at the savegame, did you?

Between 1926 and about 1945 or so you would have been right. The size of the map, the number of signals, intersections, trains, etc., all that is much smaller. YAPF is coping just fine and if things go wrong, it is often a wrong signal. Not exactly 99%, but surely 50%.

But even at that time, I get another 50% lost trains where a train is just exiting a depot and has missed the correct intersection to the mine by one signal. That is clearly either a YAPF bug or "works as designed" with a design I personally wouldn't be proud of.

After about 1945 something fundamental is changing. I have no idea whether it is the number of trains, signals, intersections or whatnot. But the result is that for every wrong signal and every depot just behind the intersection, I get dozens of trains being lost right after leaving North End. Your 99% number is shrinking to 5% or less.

other common source of lost trains is turning around on long wait times. set the three signal wait times to 255 to disable turning around.

Does not apply either. I did spend a few workdays reading the wiki, searching the forum and reading the source code. If my problem was a common one, I'm pretty sure I should have found the solution already.

Post by **planetmaker** » 05 Oct 2010 08:48

Well, the problem which you have in your savegame, why many (all?) of your trains will get lost is that you use the forced servicing in a way that there's no path to the station except going through a dead-end depot. The solution for this particular behaviour would be to consider these depots not the end-of-line but just a (highly) penalized tile. This behaviour has though nothing to do with using A* or any other PF algorithm - it "just" is a matter of how to weigh tiles and where to consider a path to be possible.

It's a completely different matter that A* is called on every intersection the train encounters. If I read commit logs correctly, the PF code is just in the process of being ported from C to C++. And you're certainly welcome to propose an alternative PF which is less heavy on the CPU especially for large amounts of trains and huge networks (and I don't mean it only in the 'usual' way of "if you don't like it, do it yourself", but it might indeed be interesting to have some more light weight PF there - which offers a similar quality of PF, though) Mind also, that YAPF indeed does already cache its results and re-ususes them - where and how exactly, I don't know.

joern · Post by **joern** » 05 Oct 2010 12:25

Well, the problem which you have in your savegame, why many (all?) of your trains will get lost is that you use the forced servicing in a way that there's no path to the station except going through a dead-end depot.

Sounds reasonable, but a quick test does not agree. Attached savegame has added a track so the depot before the station is no longer forced. Therefore the depot should no longer be considered a dead-end and my trains should no longer get lost. They still do in pretty much the same way. Running the two savegames side by side seems to indicate no change in the rate of trains getting lost either. I would feel confident to claim that at most 10% of my lost trains can be explained by dead-end depots. It may be none at all, but that is a much harder claim to make.

If I read commit logs correctly, the PF code is just in the process of being ported from C to C++. And you're certainly welcome to propose an alternative PF which is less heavy on the CPU

YAPF is certainly written in C++. Being a C person myself, that causes a certain amount of discomfort when trying to read the code.

Post by **planetmaker** » 05 Oct 2010 12:39

joern wrote:
Well, the problem which you have in your savegame, why many (all?) of your trains will get lost is that you use the forced servicing in a way that there's no path to the station except going through a dead-end depot.
Sounds reasonable, but a quick test does not agree. Attached savegame has added a track so the depot before the station is no longer forced. Therefore the depot should no longer be considered a dead-end and my trains should no longer get lost.

I'm unable to follow this in the attached savegame; the trains going to that particular station do not seem to be lost, but there are going a zillion other lost trains that way. Maybe you can create a simpler example; in general, if there's a no-dead-end way, trains don't get lost.

Post by **Rubidium** » 05 Oct 2010 14:13

joern wrote:YAPF is certainly written in C++. Being a C person myself, that causes a certain amount of discomfort when trying to read the code.

I can tell you that, even for people that have seen and written a lot of C++, the YAPF code causes discomfort when trying to understand the code.

Besides that, the rail network is, in the eyes of the pathfinders, by far not a static network. Both pathfinders derive their pathfinder penalties from the layout and state of the network. So, if there are two tracks and one is slightly longer but without a red signal, it will take the slightly longer route. If a route is filled with trains, i.e. path reservations, it will try to find another route with less reserved rail, i.e. the arguably less congested route.

Ofcourse you can update the representation and penalties of the network whenever a train enters or leaves a tile, i.e. tiles are (un)reserved and signal blocks are left, but I can almost assure you that that approach will be more expensive than pathfinding whenever there is an actual choice to be made.

Please show me how to consider congestion without checking for it at every point a choice can be made.

joern · Post by **joern** » 05 Oct 2010 14:35

Rubidium wrote:Please show me how to consider congestion without checking for it at every point a choice can be made.

Calculate a map representation, including the current congestion state. Over time the real congestion state will diverge from the precalculated representation, but it will take a moment before the differences start to really matter[1]. So after a while the representation is completely redone, including the then current congestion state.

[1] If congestion means 10 reserved tiles, that does not really matter. If congestion means 100s of trains creating one massive traffic jam, that matters a lot. But it will also take a moment for the jam to dissolve, so short term changes don't change the overall picture too much.

joern · Post by **joern** » 05 Oct 2010 14:40

planetmaker wrote:Maybe you can create a simpler example; in general, if there's a no-dead-end way, trains don't get lost.

I wish I could. There seems to be a complexity cliff somewhere. Simple networks work just fine. Moderately networks work fine as well. But at some point the network becomes complex enough that trains get lost in large numbers.

"Complex" of course is a bad description. It may be due to a number of reasons, number of trains, number of intersections, etc. But since building a network that falls off the complexity cliff take quite some time, I probably won't run hundreds of testbeds.

Post by **planetmaker** » 05 Oct 2010 14:59

joern wrote:"Complex" of course is a bad description. It may be due to a number of reasons, number of trains, number of intersections, etc. But since building a network that falls off the complexity cliff take quite some time, I probably won't run hundreds of testbeds.

You know... the size definitely is not a measure for "fails to find a path". It is always a failure of the builder to build properly. Look at the ~200 games in the #openttdcoop PublicServer Savegame Archive. Trains are not lost there (usually - it may be different for some network designs which make use of exactly that). And there are networks with more than a bit of complexity...

joern · Post by **joern** » 05 Oct 2010 16:30

Rubidium wrote:I can tell you that, even for people that have seen and written a lot of C++, the YAPF code causes discomfort when trying to understand the code.

Where would be a good place to ask code questions? I was trying to deduce the pathfinder API and relevant data structures. But usually it doesn't take too long for the discomfort to become overwhelming.

Post by **planetmaker** » 05 Oct 2010 16:47

Probably the development forum is a good place for code discussions. Or #openttd on irc.oftc.net.

Nickel_Plate · Post by **Nickel_Plate** » 05 Oct 2010 17:18

Have been running this game.

One must ask if this is a YAPF problem or something else.

1) I have one of the largest train jams in history and upon checking this was caused by a BAD bit of track laying. Also noticed one or two other places where track had not been layed correctly also coursing problems.

2) Looking at all the depots one must ask why, NOTICE depots are only on the inside 8 tracks with NO depots on the 2 outer tracks so trains on these tracks could be getting lost by branching off from main line looking for depot.

Everyone as their own style of play and good luck to them.

Nommy · Post by **Nommy** » 05 Oct 2010 17:47

Your trains are getting lost because it's hitting the yapf.max_search_nodes (10000).

Code: Select all

dbg: [yapf] [YAPFt]! 284- 16897 us - 10001 rounds - 1333 open - 10000 closed - CHR 92.8% - C -1 D -1 - c0(sc0, ts0, o0) --

This is the lost train.
Its hitting this limit because there are so many ways to get to the destination, all slightly different, but with very similar path cost that it needs to compare them all to know for sure which is the best. E.g if there were 3 sections, with 5 parallel paths, when it's searching at the 1st branch is has 5 paths to check, the 2nd there are 5*5, the 3rd 5*5*5... This is not strictly what happens, but it's the participle behind the problem I think, and you have like 100 of these sections/branches.

Some solutions:

set yapf.max_search_nodes higher e.g
Code: Select all
```
patch yapf.max_search_nodes 20000
```
It will still run slow though.
Put in way points along the way so it only has to look across 10 branches, rather than 100.
Make 1 or 2 tracks have a much lower path cost than all the rest by putting something to penalise the other tracks, so it doesn't 'look down' every combination of them all. E.g you can put hills on all the middle tracks and set yapf.rail_slope_penalty 5000 or something. Then every hill is the same as a 50 tile detour, and it will mostly just look down the 2 edge tracks then stop when it knows which one is best. Not sure if that will work with pbs though - normally I'd use a red 2way to force it down the penalised path if needed. There's lots of games on the ottdcoop site in the archive that do stuff like this, using stations that aren't in the orders to discourage that track usually I think. But in your case, I think you might need waypoints too, cost there's just so many branches and if you use too many hills/penalties you might reach the path cost limit. I forgot what the max penalty is, but I can tell you how to find out if you hit it.
Code: Select all
```
developer 5
debug_level yapf=3
If you want to check a lost train, it's easier to write the log to a file:
script log.txt
... run it a day or two before the train is lost, then after lost message:
script log.txt
```
Then open log.txt and search for the train number. If it isn't at max nodes/rounds, and there is actually a way to get to where the train is going from where it is, then the reason it's lost is coz the path cost limit was hit, due to whatever penalties. Lost trains have a ! in front.

Also, you can set

Code: Select all

yapf.rail_look_ahead_max_signals = 2

Or 3 if they are not going the right way. This will let it use the cache after 2, or 3 signals, instead of after 10, so that will help performance a bit, but it's a drop in the ocean with your current setup I'm afraid.

Vitus · Post by **Vitus** » 05 Oct 2010 18:00

Depots aren't treated as end of line, indeed. Default pathfinder penalty for using depot to turn-around is 5000, if my memory serves me right.

Nommy · Post by **Nommy** » 05 Oct 2010 18:04

Oops, I think you wrote that when I was editing and I took that bit out. Yeah, they are not the problem I don't think.

joern · Post by **joern** » 05 Oct 2010 20:02

Nommy wrote:Your trains are getting lost because it's hitting the yapf.max_search_nodes (10000).
[...]

Excellent! That explains things rather well. Given a magic lamp and a free wish, I would still rather rewrite the pathfinder than work around it using one of your solutions. But I have to admit that magic lamps have become a rarity these days.

Many thanks!

Eddi · Post by **Eddi** » 05 Oct 2010 23:38

joern wrote:
Nommy wrote:Your trains are getting lost because it's hitting the yapf.max_search_nodes (10000).
[...]
Excellent! That explains things rather well.

Since you're the first guy i have ever seen to trigger this, i still guess that my 99% figure above is fairly accurate...

Nickel_Plate · Post by **Nickel_Plate** » 06 Oct 2010 16:05

Have you got a copy of an earlier game.

el koeno · Post by **el koeno** » 06 Oct 2010 16:46

joern wrote: ...

If you run the game a while longer, you will notice that trains start to accumulate at certain depots and keep trying to leave the depot towards the nearest "Wrong Way" sign, reverse back into the depot and try again.

Regardless of other YAPF settings that cause you problems, I don't think trains should go to wrong way signals. It's kind of annoying to find an entire section of your network jammed after you've edited a piece of track, causing one silly train on the other side of the map which was planning on using that piece of track to go to the nearest wrong way sign and block a junction. Especially since the place where you edited is sometimes on the other side of the map from the jam, and it might take months for you to discover it....

(Just thought I'd share my pet peeve with YAPF here, since you kind of mentioned it too. Otherwise, I never have lost trains.)

Transport Tycoon Forums

YAPF rewrite

YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Re: YAPF rewrite

Who is online