The problem here is that all those tiles need to be stored somewhere, without consuming an infinite amount of memory and without giving a performance hit.
Memory is mostly irrelevant, performance is the only relevant factor here.
You mostly get performance by minimizing memory access, and by localizing the access you cannot avoid. In that way, you get maximum profit of the L1 and L2 cpu caches. To get an idea on their speed, getting data from the L2 cache is about 10x as slow as data from the L1 cache, and data from main memory is at least 10x slower than data from the L2 cache.
The current implementation is a single array of tiles, where the address of the tile is computed, so the first memory access is already reading proper tile information needed for moving the train. That information is not much bigger than a stored pointer. In other words, by the time you have loaded the pointer to the tile-stack, the current implementation has already seen all tile information, and is working on the next tile.
The pointed-to address is likely not in the L1/L2 caches, and this likely also holds for your other stack-segments in the sorted list (ie you're quite likely to work at very slow main-memory speed). I don't know how you intend to find the tile at the correct level, but pretty soon you're at least doubling or tripling the amount of memory accesses, in particular at random points in the pool, ie outside the L1 and L2 caches, so you're likely waiting eons (in cpu ticks) for main memory.
The additional access immediately translates to 1/2 or 1/3 of the number of trains that you can run in the game before you hit the CPU limit. You'd have to implement it for a real test though. openttdcoop has a few nice test-cases for you
You have a nice idea, and it would work if you're not in a hurry, but here, every memory access counts, and jumps to possibly far-away locations are quite deadly for speed, you can do a lot of L1 access compared to main memory access.
This is basically why there is no multi-level implementation yet. A single flat array of tiles is so fast, it's hard or perhaps impossible to find anything which is even close to comparable in performance.
I don't think a new data structure will suffice, it may need a different way of computing paths, without killing the deterministic properties that we have now.