Page 1 of 3

[Patch] Improve large map creation speed

Posted: 12 Jan 2014 18:22
by MJP
Using a bit instead of a bool when generating rivers leads to better speed (by decreasing the amount of memset()-ed memory).
I used Bilbo's extra-large-maps patch with pavel1269's fix to go further the limitations.

Times in millisecond taken by _GenerateWorld() (seed "1234" and default options each time):

Code: Select all

map size    time bytes   time bits
1024*1024       1 686       1 550  ->  91.93 %
2048*2048       9 231       7 623  ->  82.58 %
4096*4096      73 750      49 524  ->  67.15 %
8192*4096     242 810     151 255  ->  62.29 %
I have not enough memory to test with 8192*8192.

Re: [Patch] Faster rivers creation

Posted: 12 Jan 2014 18:44
by Eddi

Code: Select all

+	#define SET_MARK(x) SetBit(marks[x/8], x&7)
+	#define IS_MARKED(x) HasBit(marks[x/8], x&7)
sounds like something you would turn into an object method...

Re: [Patch] Faster rivers creation

Posted: 12 Jan 2014 19:24
by Michi_cc
std::vector<bool>? That one is explicitly specified using something more efficient than a plain byte array (i.e. for all known C++ compilers likely exactly what you implemented manually).

-- Michael Lutz

Re: [Patch] Faster rivers creation

Posted: 12 Jan 2014 20:16
by MJP
I'm not used to templates but if it gives the same result... I'll try to rewrite.

Re: [Patch] Faster rivers creation

Posted: 12 Jan 2014 20:41
by Michi_cc
Hmm, apparently not necessarily a good idea. The replacement for "MemSetT(marks, 0, MapSize());" should be std:fill(), but it seems to be that the standard isn't saying anything about compilers needing to provide a fast overload of fill for vector<bool>. On the wrong compiler, that might eat up anything gained at other places :(

-- Michael Lutz

Re: [Patch] Faster rivers creation

Posted: 15 Jan 2014 22:04
by MJP
Michi_cc wrote:Hmm, apparently not necessarily a good idea. The replacement for "MemSetT(marks, 0, MapSize());" should be std:fill(), but it seems to be that the standard isn't saying anything about compilers needing to provide a fast overload of fill for vector<bool>. On the wrong compiler, that might eat up anything gained at other places :([/i]
OK. So I'll let the code in its simplest expression. The most important thing
is to show the potential benefit behind simple patches.

________________________________________________________________________________
Objective: reduce the time to create a large map.

Settings: OpenGFX, no NewGRF, blitter 8bpp-optimized, town_name = "english"
options.png
options.png (13.39 KiB) Viewed 4474 times
  1. Don't eat all the memory for nothing.
    With the first patch applied, I can create 8192*8192 maps and stay under 1GB of
    memory during the whole process (without it, a 8192*4096 eats up to 6.7GB on my
    system).
  2. Ensure not to create more towns/industries than the pool can allocate.
    Because I'm quite sure 80000 is above 64000 (I could make IndustryID a uint32
    but I won't take the risk of breaking things I don't even know they exist).
    • 8192*4096 -> 244 sec
      8192*8192 -> 1025 sec
  3. Problem: river generation wastes 75% of its time doing memset().
    Quick "fix": reduce the memory volume by using bits instead of bytes for the
    concerned array.
    • 8192*4096 -> 152 sec
      8192*8192 -> 652 sec
  4. Problem: method to find closest town of an industry is unsuited to large maps.
    Use a zone check instead.
    • 8192*4096 -> 145 sec
      8192*8192 -> 603 sec
  5. Problem: method to find conflicting industries is unsuited to large maps.
    Use a zone check instead.
    • 8192*4096 -> 137 sec
      8192*8192 -> 590 sec
  6. Then speed up ReplaceEnglishWords().
    I was tempted to optimize this one by removing the whole function.
    • 8192*4096 -> 126 sec
      8192*8192 -> 489 sec
  7. Finally, town creation may not have enough names in certain languages (your
    settings ask for 25000+ towns? ending up with only 68 is a big problem to me)
    (same patch posted at FS#5864)

Re: [Patch] Improve large map creation speed

Posted: 16 Jan 2014 19:18
by MJP
I reworked the rivers "fix" but it does not give the same result :?
(8192*8192 generated in less than 280 sec)

There is something I fail to understand in landscape.cpp:

Code: Select all

/* Maybe we can make a lake. Find the Nth of the considered tiles. */
TileIndex lakeCenter = 0;
for (int i = RandomRange(count - 1); i != 0; lakeCenter++) {
	if (marks[lakeCenter]) i--;
}
After this loop, lakeCenter can be not null though marks[lakeCenter] is false.
This is verified by:

Code: Select all

if (lakeCenter != 0) assert(marks[lakeCenter] == true);
With the settings previously posted and a mapsize of 1024*1024, the assertion is trigged.
How can the lakeCenter tile can be a considered tile while marks[lakeCenter] is false?
When using the std::map, such a thing can't happen hence the differences in generated maps.
Rubidium (since r22767 was committed by you), are you sure of this loop?

Re: [Patch] Improve large map creation speed

Posted: 16 Jan 2014 19:34
by Rubidium
No, the loop does not look right. I guess the attached diff solves two issues; the resulting lakeCenter being not the wanted lakeCenter, but wanted lakeCenter + 1. Also it could keep lakeCenter = 0, which wouldn't be nice either I guess.

Re: [Patch] Improve large map creation speed

Posted: 16 Jan 2014 21:17
by MJP
Rubidium wrote:No, the loop does not look right. I guess the attached diff solves two issues; the resulting lakeCenter being not the wanted lakeCenter, but wanted lakeCenter + 1. Also it could keep lakeCenter = 0, which wouldn't be nice either I guess.
Yes, I get it working now :) The output is the same in both cases.

8192*4096 -> 72 sec
8192*8192 -> 272 sec
This is getting almost acceptable.

Re: [Patch] Improve large map creation speed

Posted: 17 Jan 2014 02:32
by MJP
I've added a map to speedup the town name uniqueness check.

4096*4096 -> 31 sec
8192*4096 -> 63 sec
8192*8192 -> 140 sec

I don't see anymore obvious optimization.
17 min down to 2.3 min... Mission completed :)

Re: [Patch] Improve large map creation speed

Posted: 17 Jan 2014 13:37
by HackaLittleBit
Clipboard02.png
Clipboard02.png (42.74 KiB) Viewed 998 times
Very nice MJP. :bow:

My computer win xp

Trunk manages to take a whopping 800 mb of memory when generating 2048 x 2048 map
Yours about 150 MB

Question:
Why is memory only released after starting game?
further some warnings.

Code: Select all

[SRC] Compiling town_cmd.cpp
d:/development/ottd/trunk/src/town_cmd.cpp: In function 'Town* CalcClosestTownFromTile(TileIndex, uint)':
d:/development/ottd/trunk/src/town_cmd.cpp:3183:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
[SRC] Compiling town_gui.cpp
[SRC] Compiling townname.cpp
d:/development/ottd/trunk/src/townname.cpp:196:13: warning: 'void ReplaceWords(const char*, const char*, char*)' defined but not used [-Wunused-function]
[SRC] Compiling train_cmd.cpp
$ gcc -v
Using built-in specs.
COLLECT_GCC=d:\MinGW\bin\gcc.exe
COLLECT_LTO_WRAPPER=d:/mingw/bin/../libexec/gcc/mingw32/4.6.2/lto-wrapper.exe
Target: mingw32
Configured with: ../gcc-4.6.2/configure --enable-languages=c,c++,ada,fortran,obj
c,obj-c++ --disable-sjlj-exceptions --with-dwarf2 --enable-shared --enable-libgo
mp --disable-win32-registry --enable-libstdcxx-debug --enable-version-specific-r
untime-libs --build=mingw32 --prefix=/mingw
Thread model: win32
gcc version 4.6.2 (GCC)

Regards

Re: [Patch] Improve large map creation speed

Posted: 17 Jan 2014 15:42
by MJP
HackaLittleBit wrote:further some warnings.

Code: Select all

[SRC] Compiling town_cmd.cpp
d:/development/ottd/trunk/src/town_cmd.cpp: In function 'Town* CalcClosestTownFromTile(TileIndex, uint)':
d:/development/ottd/trunk/src/town_cmd.cpp:3183:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
[SRC] Compiling town_gui.cpp
[SRC] Compiling townname.cpp
d:/development/ottd/trunk/src/townname.cpp:196:13: warning: 'void ReplaceWords(const char*, const char*, char*)' defined but not used [-Wunused-function]
[SRC] Compiling train_cmd.cpp
I indeed did not remove ReplaceWords() but now I think of it, there is no real
need for faster-replaceenglishwords.patch. I removed it since the important one
is use-map-to-check-town-names-uniqueness.patch. All warnings fixed in r4.
HackaLittleBit wrote:Question: Why is memory only released after starting game?
I did not delve into the cache system. I guess the map generation is a special
case where the cache should not be used as it is in the rest of the game. And
the value of 512 is the first one I used, it solved my problem so I did not even
try another one.
HackaLittleBit wrote:Trunk manages to take a whopping 800 mb of memory when generating 2048 x 2048 map
Yours about 150 MB
Now, 800 MB is what you need to generate a 8192*8192 :)
HackaLittleBit wrote:Very nice MJP. :bow:
Thanks!

Re: [Patch] Faster rivers creation

Posted: 18 Jan 2014 23:06
by MJP
MJP wrote:I'm not used to templates [...]
Well... Now I know when I should prefer std::set over std::map.
(times unchanged)

Re: [Patch] Improve large map creation speed

Posted: 19 Jan 2014 19:51
by Michi_cc
In fgw_05_checkconflictindustry-for-large-maps.patch, is the 14 in your addition the same 14 as in the existing code below or just coincidental? How did you determine dmax * dmax * 2, empirically or something else? (Also applies to closest town.) A few more comments might be nice :)

-- Michael Lutz

Re: [Patch] Improve large map creation speed

Posted: 20 Jan 2014 03:08
by MJP
Michi_cc wrote:In fgw_05_checkconflictindustry-for-large-maps.patch, is the 14 in your addition the same 14 as in the existing code below or just coincidental? How did you determine dmax * dmax * 2, empirically or something else? (Also applies to closest town.) A few more comments might be nice :)
In the case where nr towns = very low and nr industries = high, there is too
much industry conflicts, thus industry generation time is very long. I think it
could be prevented right from the World Generation interface but it would be a
new patch. If nr towns = very low, nr industries should not be more than normal.

That being said, dmax*dmax*2 is an error since I initially wanted the number of
tiles in the considered zone. It should have been dmax*dmax*4. But the average
time for 1 iteration in zone check is shorter in such a way that the *2 error
somehow balances things out.

In CheckIfFarEnoughFromConflictingIndustry(), the value of 14 is indeed the same
one specified for DistanceMax().

In CalcClosestTownFromTile(), I chose 50 from observation. But observation only
with nr towns = normal. So I reworked this part to be more context relevant and
it seems to work well on the few maps I tested (not all square maps).

The 2 "radius*radius*2" thresholds could certainly be refined but I don't think
there is much to gain though (it's 4am, I admit a bit of laziness on this one).

Re: [Patch] Improve large map creation speed

Posted: 21 Jan 2014 18:09
by MJP
I changed a few things, added comments, added 05_isclosetotown-for-large-maps.

2048*2048 -> 7 sec
8192*8192 -> 136 sec

The empirical ratio is shown by fgw_09_perf-in-calcclosesttownfromtile.patch.
See CalcClosestTownFromTile.ods, the last column "explains" the /2.

Re: [Patch] Improve large map creation speed

Posted: 06 Feb 2014 21:15
by Rubidium
04 is troublesome to me because of sqrt. This is a floating point operation and on different platforms that can yield different results. In the worst case it could mean that zone_check_radius differs between computers, and since it's called during multiplayer we could get a desync.

08 seems kinda tricky; shouldn't that be used in other town name generation cases as well? Also, what if the fallback is the one falling back to? I'm not really sure whether to implement it at this moment.

Re: [Patch] Improve large map creation speed

Posted: 08 Feb 2014 17:25
by MJP
Rubidium wrote:04 is troublesome to me because of sqrt. This is a floating point operation and on different platforms that can yield different results. In the worst case it could mean that zone_check_radius differs between computers, and since it's called during multiplayer we could get a desync.
I see the problem. If 2 zones differ in size on 2 computers, and 2 towns sitting on that line are equally distant and the order of their position differs from the one in the pool. The probability of desync is very low but not null.
Rubidium wrote:08 seems kinda tricky; shouldn't that be used in other town name generation cases as well? Also, what if the fallback is the one falling back to? I'm not really sure whether to implement it at this moment.
If the fallback is the one falling back to, then the town generation may take longer (it does not loop more than once), a check could prevent that.
Maybe it's better to show a warning in the "World Generation" window if the player settings can't be fulfilled.

Re: [Patch] Improve large map creation speed

Posted: 08 Feb 2014 21:30
by HackaLittleBit
MJP wrote:
Rubidium wrote:04 is troublesome to me because of sqrt. This is a floating point operation and on different platforms that can yield different results. In the worst case it could mean that zone_check_radius differs between computers, and since it's called during multiplayer we could get a desync.
I see the problem. If 2 zones differ in size on 2 computers, and 2 towns sitting on that line are equally distant and the order of their position differs from the one in the pool. The probability of desync is very low but not null.
Could you not use 'IntSqrt'.
I checked that and I saw no difference from 'sqrt' .

And I get warning Rubidium.
d:/development/ottd/trunkhg/src/industry_cmd.cpp:1579:47: warning: comparison be
tween signed and unsigned integer expressions [-Wsign-compare]

Re: [Patch] Improve large map creation speed

Posted: 10 Feb 2014 20:54
by MJP
HackaLittleBit wrote:Could you not use 'IntSqrt'.
I checked that and I saw no difference from 'sqrt' .
Right, IntSqrt() does not issue any floating point operation.
08 is refined but still as debatable as it was.