In fact, the only feature that could really help performance in OTTD would be multi-threading.
Most computers now have 2 or 4 cores, and often even more.
Currently, OTTD mainly works in a single thread. As a result, modern processors are not fully exploited.
But changing a single-threaded program into a multi-threaded one is like turning a train into 200 cars with mad drivers running everywhere in the wild: it takes a lot of work to decide which feature could run on which thread and how to synchronize everyone… In a city there are traffic lights… and traffic jams. Not in the subway.
I had no time yesterday to answer this one, so with a day's delay:
Multi-threading only works if you can do stuff in parallel. Basically this boils down to being able to compute different parts independently, where the input data is fixed or easily accessible, and the order in which the various computations deliver their results has no influence on the final answer.
Unfortunately this does not hold for OpenTTD.
Let's take train movement as an example. The simulation engine has one big central data structure, namely the map. Trains that move read the map to decide where to go, and update the map with block reservations.
So what happens if you try to compute the path of several trains at the same time, one thread for each train for example?
1. Path-finding is mostly "inspect a tile, do a small update, inspect the next tile, do a small update, inspect .." etc., for a few hundred tiles or so (for one train). You constantly need map access to move the computation forward. If nothing writes to the map this is easy, but trains do update the map, at least by claiming new blocks and releasing old ones.
2. Without any protection, if you write to data that other threads read, the written data in the map is not consistently updated between threads (especially if different threads run on different CPUs). Threads may read old data, partly updated data, or the truly updated data. Partly updated data is simply invalid; you never want that to happen (it will crash the program randomly).
3. The standard solution for a data structure that is both read and written from different threads is to give exclusive access to that data structure using a semaphore. Basically, a semaphore gives one thread exclusive access; that thread does its thing, then releases the semaphore for the next thread. (Much like only one train can use a block at a time.)
4. That solves the read/write problem, but creates two new problems.
- One is that your multi-threading is now a complicated form of single threading. As explained before, a thread needs constant map access to do a computation. If you fold a semaphore around that access, at most one thread can access the map and all others are blocked. It's like having, say, ten people at a table with one fork on the table to eat. Only one person can use the fork at any time. Effectively, you have one person eating and all the others waiting for the fork. It's just as fast (if not faster) to have each person eat in turn.
For the map data, still only a single train is being computed (or several at an equally slower rate), so there is no significant gain at all, you only added a complex multi-threading thing around it.
- The second problem is that the input data is now not fixed any more. Tile information may change at any moment (from another thread). This means that inspecting a tile has lost its meaning, since right after you looked at it, it may (or may not) change, so a computation may or may not use obsolete data, and produce incorrect results (two threads read the same tile after each other, both conclude the tile is free, and both reserve that tile for their train).
You can check the final result to avoid collisions, but that makes path-finding more expensive (you have to verify after you found a path), and it doesn't solve the MP game problem, where two different computers have to compute the exact same result each and every tick, for half a day or however long you play the game.
In particular, a very old single-core computer and a 16-core monster in the same MP game should stay in sync. Discarding paths based on the timing of thread execution is a sure-fire way to desync in less than a few seconds.
You can come up with a few variations on the theme, but you either get results that vary on each run, or you lock down so much that you lose any significant gain.
This is why OpenTTD still uses single threading for path finding. It's simply the fastest solution that meets all requirements.