Page 1 of 1

Queue Sorted List v2

Posted: 17 Apr 2020 21:35
by xarick
Updated story 07-05-2020:

I decided to upload the successor to the previously named "Native Heap". I called it Sorted List.
Improvements:
- It still uses a list of indexes, but now it stores the items in a table instead of an array.
- By using the same index name for the table slots and AIList items, the items can now be tracked and deletable from the queue!
- Since the items can be tracked, the "Exists" function can now be used, and it makes the queue no longer require absurd amounts of memory!

You can make Graph.Aystar use Sorted List, by changing the _queue_class.

Code: Select all

	_queue_class = import("queue.sorted_list", "", 2);
Repository: https://github.com/SamuXarick/Queue.SortedList
Bananas: https://bananas.openttd.org/package/ai-library/5155534c

Original topic:
[+] Spoiler
I've been looking for ways to speed up the pathfinders of my AI for some time, and I noticed that Graph.AyStar is a common requirement for pathfinders. What I've also noticed is that Graph.AyStar also makes use of a Queue. Binary Heap has been the choice of Graph.AyStar library available on bananas, but then I noticed, there are more Queue libraries available! https://bananas.openttd.org/en/ailibrary/

So I wondered, what if I used a different queue?

I went on a quest, experimenting the other available queues for pathfinding:
- Priority Queue was extremely slow. Pathfinding took much longer than Binary Heap. It's really really slow.
- Fibonacci Heap was faster than Binary Heap. Nothing major, but still noticeably faster.

Then I decided to look for other Queues not from the library, and I found this "Native" heap in ShipAI.
Decided to test whether it was faster than Binary Heap or Fibonacci Heap.

"Simply using AIList of indexes is faster than any squirrel implementation" was the premise of this queue. It turns out it's way faster than both!

I've decided to add some improvements of my own and uploaded this heap to Bananas.
To use it, set up your own custom 'AyStar.nut', and change the _queue_class.

Code: Select all

	_queue_class = import("queue.native_heap", "", 1);
I've also created a repository for it. Source: https://github.com/SamuXarick/Queue.NativeHeap
You may notice in the source code lines saying "The complexity of this operation is UNKNOWN". I used the word UNKNOWN as a placeholder, as I don't really understand time complexities, nor how it is actually implemented in OpenTTD. If someone knows them, please tell.

Tomorrow I'll be posting some test results. Currently got no time.

Re: Queue Native Heap

Posted: 17 Apr 2020 22:04
by jfs
It's not a heap, it's just an ordered map. Specifically, the ScriptList class is a std::map, and also maintaining a reverse mapping from values to keys in another std::map.
Typically std::map is implemented via a balanced search tree (most often a red-black tree) which is not the same as a heap.

Theoretically, a red-black tree has worse asymptotic performance than a heap, when all you want is to pull items out of a collected in sorted order. But, if you have a heap implemented in a poorly optimised interpreted language like Squirrel and compare it to a red-black tree implemented in C++ native code, the red-black tree ends up having better constant factors.


Finding the smallest element in a search tree is traversing down the left branch to the leftmost leaf node. In a balanced tree this visits log(n) nodes, so this is an O(log(n)) operation. (In a heap this is a constant time operation.)
Inserting an item in a red-black tree is O(log(n)), and can involve a lot or re-balancing. (On average a binary heap is constant time, but worst-case is also O(log(n)).)
Removing an item in a red-black tree is O(log(n)). (Same as a heap.)
Getting count of items is a constant time operation. (Same as a heap.)
Checking for existence of an item in a balanced search tree is O(log(n)). (Better than a heap, which is O(n).)

If you were comparing a priority queue implemented as a heap in C++ with one implemented as a red-black tree in C++, the heap would win, for several of these reasons. One of the big reasons would likely be cache locality, since a heap uses an array with contiguous memory, while a tree uses individual allocations for each item.

But the name "NativeHeap" is wrong. It's not a heap.

Re: Queue Native Heap

Posted: 18 Apr 2020 00:24
by _dp_
ScriptList uses not just map but a map of sets so worst case some logs can be squared.

Re: Queue Native Heap

Posted: 18 Apr 2020 08:28
by jfs
Yes the map of sets is when you access it sorted by value instead of by key. It seems it could as well have been done with a std::multimap and possibly be easier to follow and better time complexity.

Re: Queue Native Heap

Posted: 18 Apr 2020 17:00
by xarick

Code: Select all

From         | To           | Distance  | Binary Heap    | Fibonacci Heap | Native Heap
------------ | ------------ | --------- | -------------- | -------------- | --------------
Wrunnton     | Mondinghead  | 104 tiles |       63 ticks |       60 ticks |       39 ticks
Wrunnton     | Trenfingway  |  95 tiles |       49 ticks |       44 ticks |       28 ticks
Wrunnton     | Londstone    |  55 tiles |       10 ticks |        8 ticks |        6 ticks
Wrunnton     | Grinfingford |  67 tiles |       27 ticks |       26 ticks |       17 ticks
Wrunnton     | Sledingworth | 181 tiles |      228 ticks |      212 ticks |      142 ticks
Mondinghead  | Wrunnton     | 104 tiles |       71 ticks |       67 ticks |       44 ticks
Mondinghead  | Trenfingway  | 145 tiles |      238 ticks |      219 ticks |      148 ticks
Mondinghead  | Londstone    | 155 tiles |      226 ticks |      207 ticks |      137 ticks
Mondinghead  | Grinfingford | 171 tiles |      225 ticks |      206 ticks |      140 ticks
Mondinghead  | Sledingworth | 285 tiles |      787 ticks |      717 ticks |      504 ticks
Trenfingway  | Wrunnton     |  95 tiles |       39 ticks |       37 ticks |       24 ticks
Trenfingway  | Mondinghead  | 145 tiles |      205 ticks |      189 ticks |      126 ticks
Trenfingway  | Londstone    |  40 tiles |        6 ticks |        6 ticks |        4 ticks
Trenfingway  | Grinfingford | 140 tiles |      128 ticks |      114 ticks |       76 ticks
Trenfingway  | Sledingworth | 140 tiles |      130 ticks |      120 ticks |       79 ticks
Londstone    | Wrunnton     |  55 tiles |        8 ticks |        8 ticks |        6 ticks
Londstone    | Mondinghead  | 155 tiles |      196 ticks |      180 ticks |      120 ticks
Londstone    | Trenfingway  |  40 tiles |        7 ticks |        6 ticks |        5 ticks
Londstone    | Grinfingford | 100 tiles |       50 ticks |       45 ticks |       30 ticks
Londstone    | Sledingworth | 130 tiles |       98 ticks |       90 ticks |       60 ticks
Grinfingford | Wrunnton     |  67 tiles | NH    32 ticks |       31 ticks |       20 ticks
Grinfingford | Mondinghead  | 171 tiles |      191 ticks |      178 ticks |      117 ticks
Grinfingford | Trenfingway  | 140 tiles |      122 ticks | NH   114 ticks |       74 ticks
Grinfingford | Londstone    | 100 tiles |       62 ticks |       58 ticks |       38 ticks
Grinfingford | Sledingworth | 114 tiles |       66 ticks |       61 ticks |       40 ticks
Sledingworth | Wrunnton     | 181 tiles | FH   252 ticks |      230 ticks |      153 ticks
Sledingworth | Mondinghead  | 285 tiles |      834 ticks |      767 ticks |      539 ticks
Sledingworth | Trenfingway  | 140 tiles |      132 ticks |      122 ticks |       81 ticks
Sledingworth | Londstone    | 130 tiles |      125 ticks |      109 ticks |       72 ticks
Sledingworth | Grinfingford | 114 tiles |       72 ticks |       68 ticks |       45 ticks
------------ | ------------ | --------- | -------------- | -------------- | --------------
250K#opcodes | Very Fast    | 1024 MiB  | BH  4679 ticks | FH  4299 ticks | NH  2915 ticks

Re: Queue Native Heap

Posted: 18 Apr 2020 17:42
by xarick
Here's an AI that tests road pathfinding towns to towns. It includes all the necessary libraries. I had to create clones of existing ones so that they could be linked to the different queues.

Re: Queue Native Heap

Posted: 19 Apr 2020 08:56
by RailwAI
Interesting. Based on your findings I tried to optimize the current libraries.

For Queue.BinaryHeap I can achieve a 10% speedup by code optimizations. This is without any changes in the algorithm. Using built-in array functions from our squirrel language seems to be faster than implementing count functionality manually.
The Queue.PriorityQueue has a really bad (and slow) implementation for its Insert function. I can make it faster than the BinaryHeap implementation, by using binary search (reduce time complexity, to the same level as BinaryHeap) and using array.insert. Furthermore I've dropped the check if te item is already in the priority queue. That check is already being performed in the pathfinder, and I think an item may occur multiple times in a queue. It's not as fast as using the "native heap", but it is a bit more competitive.

Code: Select all

From                 To               	Tiles	bh1	nh	bh2*	pq3*
Frondhattan Ridge -> Ponningpool Falls	80	566	367	510	458
Frondhattan Ridge -> Matown         	101	2258	1349	2004	1757
Frondhattan Ridge -> Carhill         	36	266	175	242	234
Frondhattan Ridge -> Wutborough        	34	27	19	24	23
Frondhattan Ridge -> Gronfingbridge     60	267	173	243	218
Ponningpool Falls -> Frondhattan Ridge  80	1331	794	1180	1009
Ponningpool Falls -> Matown         	77	670	422	595	532
Ponningpool Falls -> Carhill         	74	886	545	790	693
Ponningpool Falls -> Wutborough        	46	246	159	223	199
Ponningpool Falls -> Gronfingbridge     140	2913	1766	2575	2236

Re: Queue Native Heap

Posted: 19 Apr 2020 10:00
by RailwAI
Bytheway, for the NativeHeap class I would suggest to remove the Exists function. I bet it wouldn't work when I pass any object, similar to the object passed to the insert call :wink:

Re: Queue Native Heap

Posted: 19 Apr 2020 10:03
by xarick
I noticed Exists is currently bugged. Also the Pop is missing the count check. I plan to release a new version sometime.

EDIT: Another issue. Should Insert return true? Fibonacci Heap doesn't have it return anything. All the other libraries have it return true. Does it make a difference?

Re: Queue Native Heap

Posted: 19 Apr 2020 16:23
by xarick
Another question: As I plan on releasing a new version, I was suggested to rename the queue, due to it not being a heap. I'm not too sure if I should proceed with a rename here and how to do it.

If I am to rename it, it would be named "Sorted List Queue". for import it would be:

Code: Select all

_queue_class = import("queue.sorted_list", "", 2);
I'm not sure about the 4 unique characters. Would they remain the same, QUNH, or should be different, QUSL, and would bananas accept a version 2 with a different unique 4 characters than the previous version? I wouldn't want to upload something to bananas just to confirm if such is possible, as then once uploaded, there's no way to delete it.

EDIT: Here's what I have for a possible version 2 .. https://github.com/SamuXarick/Queue.Nat ... -version-2

Re: Queue Native Heap

Posted: 20 Apr 2020 06:18
by RailwAI
xarick wrote: 19 Apr 2020 10:03 EDIT: Another issue. Should Insert return true? Fibonacci Heap doesn't have it return anything. All the other libraries have it return true. Does it make a difference?
I don't think it should be required, unless you would be able to distinguish between a successful insert (return true) or a failed insert (return false). Note that if you don't return a value, it is still faster to add an explicit return; statement.

Re: Queue Native Heap

Posted: 20 Apr 2020 20:50
by Michi_cc
PR#8091 might or might not speed up pathfinding even more.

Re: Queue Native Heap

Posted: 20 Apr 2020 20:56
by xarick
I discovered something really bad about my implementation. Look at the system memory usage.

Image

Re: Queue Native Heap

Posted: 04 May 2020 13:41
by xarick
RailwAI, here's a little bit more speed for pq3.

BTW, I noticed you have made some modifications since, in RailwAI v25, by adding a _priorities list. I tested it, and while that makes it require less ticks, it costs cpu cycles for some reason. It becomes noticeable with 250k ops/very fast.

So there's pq4, based on a mix of pq3 + RailwAI v25 queue.
- I made it not return true for Insert
- Also made it ignore the count check on Pop

If those are really necessary, just remove the commenting out on lines 68 and 73, at the cost of some ticks.

Re: Queue Native Heap

Posted: 05 May 2020 07:52
by RailwAI
I didn't check any influence on the CPU, but the execution time of AI are limited in number of squirrel operations. Some of these operations are light to perform, probably some others might be more complex or cannot be interrupted, giving the AI a small advantage.
Ignoring the count check would indeed be an improvement. For me returning true for insert is not necessary (I suppose the same is valid for most AI, as all have similar pathfinders). Not that it is slightly faster to add explicit return statements, thus with a small change we can save a few more ticks.

Re: Queue Sorted List v2

Posted: 07 May 2020 13:44
by xarick
Just uploaded the sucessor to Native Heap. Called it Sorted List. See first post.