Patch: Improved acceleration for road vehicles [In trunk]
Moderator: OpenTTD Developers
Re: Patch: Improved acceleration for road vehicles [v12-r17802]
I'm still in the same situation: I can't compile win32 binaries right now. I will set up MinGW again once I have some free time; meanwhile, if someone is able to provide win32 binaries, they will be welcome.
Spanish translation of OpenTTD
Extended heightmaps
Have fun, don't quarrel too much and add as many advanced settings as you can.
Re: Patch: Improved acceleration for road vehicles [v12-r17802]
Or people could learn the simple method of compiling it themselves, which is not hard, is not resource intensive, and does not require knowing the internal workings of your computer.
It's simple. Have a bit of patience, read carefully what the Compiling on MinGW page on the OpenTTD wiki says, and learn. If you follow what it says practically to the letter, you'll be compiling in no time.
Re: Patch: Improved acceleration for road vehicles [v12-r17921]
I finally had some free time to set up compiling in Windows. You can find a win32 binary and a diff file updated to r17921 at the first post.
Re: Patch: Improved acceleration for road vehicles [v12-r17921]
WHOOP! WHOOP!
Something goes here, hell if I know.
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
New version of the patch, with no new features or external changes. The diff file and the win32 binary are at the first post.
v13-r18381
- Update to trunk and a lot of related code changes.
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
Would using a function pointer be faster than virtual functions? It is not really pretty, but it might be an option if you are concerned about the speed of virtual functions.
I have no idea if a function pointer will be faster or slower than a virtual function. It is just an idea.
My OpenTTD contributions (AIs, Game Scripts, patches, OpenTTD Auto Updater, and some sprites)
Junctioneer (a traffic intersection simulator)
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
A function pointer should be one memory read (cache miss in worst case) cheaper, at the expense of having one pointer in the class for each function instead of only one pointer to a class-common vft.
i.e. it would be a performance vs memory use tradeoff.
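To make that tradeoff concrete, here is a minimal C++ sketch. The type names (`VirtualVehicle`, `PointerVehicle`), the three functions and their return values are all hypothetical and are not taken from OpenTTD or the patch; the sketch only shows where the per-object memory goes in each variant.

```cpp
#include <cstddef>

// Hypothetical example types; none of these names come from OpenTTD.

// Variant A: virtual functions. Each object typically stores one hidden
// pointer to a table of function pointers shared by the whole class.
struct VirtualVehicle {
    virtual ~VirtualVehicle() {}
    virtual int GetPower() const { return 100; }
    virtual int GetWeight() const { return 20; }
    virtual int GetMaxSpeed() const { return 80; }
};

// Variant B: one function pointer per function, stored in every object.
// A call reads the pointer straight from the object (no table lookup),
// but the object grows with every function added.
struct PointerVehicle {
    int (*get_power)(const PointerVehicle &);
    int (*get_weight)(const PointerVehicle &);
    int (*get_max_speed)(const PointerVehicle &);
};

static int TruckPower(const PointerVehicle &) { return 100; }
static int TruckWeight(const PointerVehicle &) { return 20; }
static int TruckMaxSpeed(const PointerVehicle &) { return 80; }

// Builds a PointerVehicle wired to the "truck" functions above.
inline PointerVehicle MakeTruck()
{
    PointerVehicle v = { &TruckPower, &TruckWeight, &TruckMaxSpeed };
    return v;
}

// Per-object memory spent on dispatch in variant B.
const std::size_t kPointerVehicleDispatchBytes =
    3 * sizeof(int (*)(const PointerVehicle &));
```

A call then looks like `truck.get_power(truck)` instead of `vehicle.GetPower()`. Whether the saved memory read is measurable in practice would have to be profiled.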
Melt with the Shadows,
Embrace your destiny...
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
It will look ugly, but since the acceleration code can be called for hundreds or thousands of vehicles each tick, I think performance is far more important. I will give function pointers a try once I know how to compare the speed of each implementation.
The virtual functions are mostly one liners, and they would be perfect candidates for inlining if they weren't virtual. This is why I think that the added time of calling the virtual function will be the biggest performance issue.
- andythenorth
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
Terkhen wrote: New version of the patch, with no new features or external changes. The diff file and the win32 binary are at the first post. v13-r18381
I've compiled and tested this with HEQS. Works well.
I've adjusted the TE coefficients for some HEQS articulated vehicles; these will show up in r172 when the HEQS nightly build server catches up: http://bundles.openttdcoop.org/heqs/nightlies/ Because articulated RVs do not use trailer unladen weights, the total vehicle weight is 'incorrect' for many HEQS vehicles, but I'm not too bothered; it works OK for gameplay :)
FIRS Industry Replacement Set (released) | HEQS Heavy Equipment Set (trucks, industrial trams and more) (finished)
Unsinkable Sam (ships) (preview released) | CHIPS Has Improved Players' Stations (finished)
Iron Horse ((trains) (released) | Termite (tracks for Iron Horse) (released) | Busy Bee (game script) (released)
Road Hog (road vehicles and trams) (released)
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
That's great :D
I'll have a test game as soon as the new version shows up on the server. Once I have tested both HEQS and the patch file, I'll start with the optimizations.
Re: Patch: Improved acceleration for road vehicles [v13-r18519]
Patch and build updated to current trunk. I hope to have time for developing the optimizations required for v14 soon.
Re: Patch: Improved acceleration for road vehicles [v14-r18674]
Version 14 ready; the performance changes are still missing.
v14-r18674: Added a configuration option to select the steepness of slopes for road vehicles.
This solves the problem of trains and road vehicles needing different slope steepness: everyone can select the value they prefer. By default, road vehicles use a steepness of 7%.
Edit: Uploaded a new patch that fixes a warning.
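As a rough illustration of what a steepness percentage means physically, here is a small C++ sketch. It is plain textbook mechanics, not the patch's actual formula, and the function name `SlopeForceNewtons` is hypothetical: a gradient of p% rises p metres for every 100 metres travelled horizontally, and gravity pulls the vehicle back along the slope with a force of mass × g × sin(angle).

```cpp
#include <cmath>

// Hypothetical helper, not taken from the patch: force (in newtons) that
// gravity exerts along a slope on a vehicle of the given mass.
// steepness_percent is the gradient as rise over run, times 100.
inline double SlopeForceNewtons(double mass_kg, double steepness_percent)
{
    const double kGravity = 9.81; // m/s^2
    const double angle = std::atan(steepness_percent / 100.0);
    return mass_kg * kGravity * std::sin(angle);
}
```

For small gradients the result is close to mass × g × p / 100, which is why a steepness setting acts almost linearly on how hard a hill is to climb.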
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
I finally started profiling the performance of the unified acceleration. I am using Openttdcoop public server game 169 for this, as it is recent and contains a lot of trains. As I thought, the TrainLocoHandler function (which calls the code that calculates acceleration) was 16% slower using the patch than with trunk. Since this function does a lot more besides calling UpdateTrainSpeed, I can only suppose that UpdateTrainSpeed performance is really low now.
I'm attaching my results in case anybody spots an obvious mistake in my data or calculations (all rows refer to the gprof results for TrainLocoHandler when running the savegame linked above for 2000 ticks). Before I start optimizing my code, I'd also like to know if there is some precise way of profiling UpdateTrainSpeed alone: in the standard --enable-profiling build, UpdateTrainSpeed is optimized away and does not show up on its own. I don't want to change the compiler optimization flags, because then I would be profiling a binary that does not perform like a standard OpenTTD binary.
- Attachments
- data.ods (18.71 KiB)
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
You can try to use TIC/TOC, which counts the number of CPU ticks between the TIC and the TOC. However, make sure that you capture all code paths, and note that TIC/TOC doesn't work on all architectures. Also, context switches and the like can give weird numbers.
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
Regarding optimization, using a cast, i.e. T::From(this) in AcceleratedVehicle::GetAcceleration may help. Since you're using a template anyway, it may help the compiler to optimize away all the virtual function calls. You do this in CargoChanged but not in GetAcceleration, while it may be more useful in the latter case.
Note that I haven't tested anything, so don't shoot me if I'm wrong :D
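For readers unfamiliar with the idea, a simplified C++ sketch of casting `this` to the template parameter follows. The class layout, the `Self()` helper (a stand-in for `T::From(this)`) and the numbers are invented for illustration and do not match the patch. In this sketch the getters are not virtual at all, which is the simplest case; whether a compiler devirtualizes calls made through a `T *` when the functions stay virtual depends on the compiler and would need to be measured.

```cpp
// Simplified sketch; the class layout and the "physics" are invented for
// illustration and do not match the patch.
template <class T>
struct AcceleratedVehicle {
    // Stand-in for the T::From(this) cast mentioned above.
    const T *Self() const { return static_cast<const T *>(this); }

    // Calls made on a T* are resolved at compile time, so the small
    // getters below can be inlined instead of going through a vtable.
    int GetAcceleration() const
    {
        const T *v = this->Self();
        return v->GetPower() / v->GetWeight();
    }
};

struct Train : AcceleratedVehicle<Train> {
    int GetPower() const { return 6000; }
    int GetWeight() const { return 300; }
};

struct RoadVehicle : AcceleratedVehicle<RoadVehicle> {
    int GetPower() const { return 450; }
    int GetWeight() const { return 15; }
};
```

The shared `GetAcceleration` code is written once, but the compiler generates a separate copy per vehicle type, each with direct calls to that type's getters.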
Create your own NewGRF? Check out this tutorial!
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
Hirundo wrote: Note that I haven't tested anything, so don't shoot me if I'm wrong :D
Now TrainLocoHandler is only 13% slower than the trunk version.
Rubidium wrote: You can try to use TIC/TOC, which counts the number of CPU ticks between the TIC and the TOC. However, make sure that you capture all code paths, and note that TIC/TOC doesn't work on all architectures. Also, context switches and the like can give weird numbers.
It seems to be Intel exclusive (either that or my searching failed, as I wasn't able to find any in-depth web page about it) and I'm using an AMD for profiling. I found out about rdtsc; I'll try to make it work.
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
TIC/TOC are macros in, IIRC, debug.h (they use rdtsc internally).
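A minimal sketch of the TIC/TOC idea in C++ follows. `MY_TIC` and `MY_TOC` are hypothetical stand-ins, not OpenTTD's macros, and they use `std::chrono::steady_clock` instead of rdtsc so that the sketch compiles on any architecture; that means they measure elapsed time rather than CPU ticks.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical stand-ins for TIC/TOC; these are NOT OpenTTD's macros.
// They use std::chrono::steady_clock instead of rdtsc, so they measure
// elapsed nanoseconds, not CPU ticks.
#define MY_TIC() \
    const std::chrono::steady_clock::time_point my_tic_start = \
        std::chrono::steady_clock::now()

#define MY_TOC(label) \
    std::printf("%s: %lld ns\n", (label), static_cast<long long>( \
        std::chrono::duration_cast<std::chrono::nanoseconds>( \
            std::chrono::steady_clock::now() - my_tic_start).count()))

// Example workload: sums the integers below n and reports how long it took.
inline long long TimedSum(int n)
{
    MY_TIC();
    long long sum = 0;
    for (int i = 0; i < n; i++) sum += i;
    MY_TOC("TimedSum");
    return sum;
}
```

As with the real macros, every path that leaves the measured region has to reach the closing macro, otherwise that run is simply not recorded.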
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
Thanks! I got the wrong idea; I thought you were referring to something outside the OpenTTD code.
Edit: I have the first results. The initial version of UpdateTrainSpeed is 27% slower than in trunk. Using Hirundo's suggestion, UpdateTrainSpeed is only 10% slower. I'm going to try other optimization suggestions; let's see how optimized this can be :D
In case anyone is curious, I'm using the following diff file and quick-and-dirty bash script for measuring times.
- Attachments
- train_profiling.diff (678 Bytes)
- tictoc.zip (287 Bytes)
Re: Patch: Improved acceleration for road vehicles [v15-18750]
New version released. There are no changes besides optimizations.
I found the TIC/TOC results not very precise, so I went back to gprof. Using more executions (100 per version), I got these results for the TrainLocoHandler function: v14 is 28.7% slower than trunk, and v15 is 18% slower.
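One plausible way to turn repeated timings into a percentage like the ones above is sketched below in C++. The method actually used in the attached spreadsheet is not shown in the thread, so treat this as an assumption; the helper names `Mean` and `SlowdownPercent` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Arithmetic mean of a non-empty list of timings.
inline double Mean(const std::vector<double> &samples)
{
    assert(!samples.empty());
    double total = 0.0;
    for (std::size_t i = 0; i < samples.size(); i++) total += samples[i];
    return total / samples.size();
}

// How much slower `candidate` is than `baseline`, as a percentage of
// the baseline time.
inline double SlowdownPercent(double baseline, double candidate)
{
    assert(baseline > 0.0);
    return (candidate - baseline) / baseline * 100.0;
}
```

Averaging over many runs reduces the influence of a single noisy execution, which is presumably why 100 executions per version give steadier numbers than a handful.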
My problem now is that I'm out of new optimizations; Hirundo's suggestion already took care of the virtual function calls. The implemented versions of these functions are marked FORCEINLINE, but they still appear as separate functions in the gprof output. Since the GetAcceleration function makes a lot of calls to small functions, I think the performance would be much better if I managed to get them inlined for real. Besides that idea, I don't know how to continue reducing the gap between trunk and the unified acceleration code.
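For context, a force-inline macro is usually a thin wrapper around a compiler-specific keyword. The C++ sketch below uses a hypothetical `MY_FORCEINLINE`; OpenTTD's real FORCEINLINE may be defined differently, and the helper functions and their formulas are invented.

```cpp
// Hypothetical macro for illustration; OpenTTD's real FORCEINLINE may be
// defined differently.
#if defined(_MSC_VER)
#  define MY_FORCEINLINE __forceinline
#elif defined(__GNUC__)
#  define MY_FORCEINLINE inline __attribute__((always_inline))
#else
#  define MY_FORCEINLINE inline
#endif

// Small one-line helpers of the kind that benefit from inlining.
// The formulas are made up.
MY_FORCEINLINE int GetRollingFriction(int speed) { return 10 + speed / 4; }
MY_FORCEINLINE int GetAirDrag(int speed) { return speed * speed / 1000; }

// Both helpers are called directly here, so the compiler knows exactly
// which functions to inline.
inline int GetResistance(int speed)
{
    return GetRollingFriction(speed) + GetAirDrag(speed);
}
```

An inline request can only be honoured at call sites where the compiler knows exactly which function is being called. A call that still goes through virtual dispatch cannot be inlined, and an out-of-line copy of the function has to exist for the vtable to point at, which may be one reason such functions still show up in a profile.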
- Attachments
- data_gprof.ods (24.38 KiB)
Re: Patch: Improved acceleration for road vehicles [v15-18750]
I'm impressed that you managed to inline virtual functions at all.
To guarantee that polymorphism works, virtual functions usually have to be called through a vtable: a table of function pointers that the object points to for that purpose.
If the compiler can determine unambiguously that the object is of a certain type (usually because it can see the construction of the object, whether it was passed by value or constructed within the function), then it can call the function directly. If the object is passed or returned by pointer or reference, however, the compiler will usually have to use the vtable.
If you know the type of the object and the overhead of creating a copy is not prohibitive, it might be worthwhile to create a copy and call the functions on the copy; the compiler should be able to inline those.
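A small C++ sketch of that idea follows. The types and function names are hypothetical and not OpenTTD's classes; the sketch only contrasts a call through a reference with a call on a local copy whose exact type is visible to the compiler.

```cpp
// Hypothetical types for illustration; not OpenTTD's classes.
struct Vehicle {
    virtual ~Vehicle() {}
    virtual int GetPower() const { return 0; }
};

struct Train : Vehicle {
    virtual int GetPower() const { return 6000; }
};

// The dynamic type of `v` is unknown here, so the call normally goes
// through the vtable.
inline int PowerViaReference(const Vehicle &v)
{
    return v.GetPower();
}

// The caller promises that `v` really is a Train (the cast is undefined
// behaviour otherwise). The local copy has a type the compiler can see,
// so the call can be bound directly and becomes a candidate for inlining.
inline int PowerViaCopy(const Vehicle &v)
{
    const Train copy = static_cast<const Train &>(v);
    return copy.GetPower();
}
```

Both functions return the same value for a `Train`; the difference is only in how the call is dispatched. Whether the copy pays for itself depends on how large the object is and how many calls are made on it.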
To get a good answer, ask a Smart Question. Similarly, if you want a bug fixed, write a Useful Bug Report. No TTDPatch crashlog? Then follow directions.
Projects: NFORenum (download) | PlaneSet (Website) | grfcodec (download) | grfdebug.log parser