Re: Patch: Improved acceleration for road vehicles [v12-r17802]
Posted: 18 Oct 2009 18:29
by Terkhen
I'm still in the same situation: I can't compile win32 binaries right now. I will set up MinGW again once I have some free time; meanwhile, if someone is able to provide win32 binaries, they will be welcome.
Re: Patch: Improved acceleration for road vehicles [v12-r17802]
Posted: 18 Oct 2009 18:40
by Gremnon
Or people can learn the simple method of compiling it themselves, which is not hard, is not resource intensive, and does not require knowing the internal workings of your computer, etc.
It's simple. Have a bit of patience, read carefully what the Compiling on MinGW page on the OpenTTD wiki says, and learn. If you follow what it says practically to the letter, you'll be compiling in no time.
Re: Patch: Improved acceleration for road vehicles [v12-r17921]
Posted: 31 Oct 2009 18:29
by Terkhen
I finally had some free time to set up compiling in Windows. You can find a win32 binary and a diff file updated to r17921 at the first post.
Re: Patch: Improved acceleration for road vehicles [v12-r17921]
Posted: 31 Oct 2009 19:46
by Comm Cody
WHOOP! WHOOP!
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
Posted: 02 Dec 2009 18:36
by Terkhen
New version of the patch, with no new features or external changes. The diff file and the win32 binary are at the first post.
v13-r18381
- Update to trunk and a lot of related codechanges.
In the current implementation of the unified acceleration code, I'm using a lot of virtual functions. I know this is probably not very efficient, but I have no real data about the actual performance. I want to test the speed of my implementation (for trains) against trunk (and choose the best implementation based on that data), but I don't know the best way of measuring it, other than that I should use the null video driver. Can anyone shed some light on this issue? Any ideas on more efficient ways of implementing these virtual functions will be welcome too.
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
Posted: 02 Dec 2009 19:37
by Zuu
Would using a function pointer be faster than virtual functions? Not really pretty, but it might be an option if you are concerned about the speed of virtual functions.
I have no idea if a function pointer will be faster or slower than a virtual function. It is just an idea.
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
Posted: 02 Dec 2009 20:16
by Thief^
A function pointer should be one memory read (cache miss in worst case) cheaper, at the expense of having one pointer in the class for each function instead of only one pointer to a class-common vft.
i.e. it would be a performance vs memory use tradeoff.
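To make the tradeoff above concrete, here is a minimal sketch of both layouts. All class and function names are hypothetical stand-ins, not taken from the patch, and the exact object sizes depend on the compiler and platform.

```cpp
#include <cassert>
#include <cstddef>

// Variant A: virtual dispatch. Each object carries one hidden pointer to a
// per-class table of function pointers, however many virtual functions exist.
struct VirtualVehicle {
	virtual ~VirtualVehicle() {}
	virtual int GetPower() const { return 100; }
	virtual int GetWeight() const { return 20; }
	virtual int GetMaxSpeed() const { return 80; }
};

// Variant B: every object stores one function pointer per function.
struct PointerVehicle {
	int (*get_power)(const PointerVehicle *);
	int (*get_weight)(const PointerVehicle *);
	int (*get_max_speed)(const PointerVehicle *);
};

// Hypothetical per-type implementations for variant B.
static int TrainPower(const PointerVehicle *) { return 100; }
static int TrainWeight(const PointerVehicle *) { return 20; }
static int TrainMaxSpeed(const PointerVehicle *) { return 80; }

PointerVehicle MakeTrain()
{
	PointerVehicle v = {TrainPower, TrainWeight, TrainMaxSpeed};
	return v;
}

// Variant A call: read the table pointer from the object, read the function
// pointer from the table, then call.
int PowerOf(const VirtualVehicle &v) { return v.GetPower(); }

// Variant B call: read the function pointer straight from the object, then call.
int PowerOf(const PointerVehicle &v) { return v.get_power(&v); }
```

Variant B saves one memory read per call but grows every object by one pointer per function, which adds up with thousands of vehicles. Neither variant lets the compiler inline the call by itself.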
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
Posted: 02 Dec 2009 20:23
by Terkhen
It will look ugly, but since the acceleration code can be called for hundreds or thousands of vehicles each tick, I think performance is far more important. I will give function pointers a try once I know how to compare the speed of each implementation.
The virtual functions are mostly one liners, and they would be perfect candidates for inlining if they weren't virtual. This is why I think that the added time of calling the virtual function will be the biggest performance issue.
Re: Patch: Improved acceleration for road vehicles [v13-r18381]
Posted: 05 Dec 2009 08:39
by andythenorth
Terkhen wrote: New version of the patch, with no new features or external changes. The diff file and the win32 binary are at the first post. v13-r18381
I've compiled and tested this with HEQS. Works well.
I've adjusted TE coefficients for some HEQS articulated vehicles; these will show up in r172 when the HEQS nightly build server catches up: http://bundles.openttdcoop.org/heqs/nightlies/
Due to articulated RVs not using trailer unladen weights, total vehicle weight is 'incorrect' for many HEQS vehicles, but I'm not too bothered; it works OK for gameplay.

Re: Patch: Improved acceleration for road vehicles [v13-r18381]
Posted: 05 Dec 2009 16:36
by Terkhen
That's great
I'll have a test game as soon as the new version shows up at the server. Once I have tested both HEQS and the patch file, I'll start with the optimizations.
Re: Patch: Improved acceleration for road vehicles [v13-r18519]
Posted: 16 Dec 2009 23:08
by Terkhen
Patch and build updated to current trunk. I hope to have time for developing the optimizations required for v14 soon.
Re: Patch: Improved acceleration for road vehicles [v14-r18674]
Posted: 31 Dec 2009 17:58
by Terkhen
Version 14 ready; the performance changes are still missing.
v14-r18674: Added a configuration option to select the steepness of slopes for road vehicles.
This solves the problem of trains and road vehicles needing different slope steepness: everyone can select the value they want. By default, road vehicles use a steepness of 7%.
Edit: Uploaded a new patch that fixes a warning.
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
Posted: 06 Jan 2010 13:29
by Terkhen
I finally started profiling the performance of the unified acceleration. I am using Openttdcoop public server game 169 for this, as it is recent and contains a lot of trains. As I thought, the TrainLocoHandler function (which calls the code that calculates acceleration) was 16% slower using the patch than with trunk. Since this function does a lot more besides calling UpdateTrainSpeed, I can only suppose that UpdateTrainSpeed performance is really low now.
I'm attaching my results, in case anybody spots an obvious mistake in my data or calculations (all rows refer to the gprof results for TrainLocoHandler executing the savegame linked above for 2000 ticks). Before starting to optimize my code, I'd also like to know if there is some precise way of profiling UpdateTrainSpeed alone: the standard --enable-profiling compilation removes UpdateTrainSpeed for optimization purposes. I don't want to change the compiler optimization flags, because then I would be profiling a binary that does not perform like a standard OpenTTD binary.
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
Posted: 06 Jan 2010 13:38
by Rubidium
You can try to use TIC/TOC, which counts the number of CPU ticks between the TIC and the TOC. However, make sure that you capture all paths, and note that TIC/TOC doesn't work on all architectures. Also, context switches and the like can give weird numbers.
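For illustration, a rough sketch of the same idea in portable C++. This is not OpenTTD's TIC/TOC: it swaps the CPU tick counter for std::chrono::steady_clock (which needs C++11), and TimingStats, ScopedTimer and MeasuredWork are hypothetical names. The timer stops in its destructor, so every return path of the measured function is captured, and averaging over many calls dampens the odd context switch.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Accumulates elapsed time over many calls of the measured code.
struct TimingStats {
	std::uint64_t calls = 0;
	std::chrono::nanoseconds total{0};

	double AverageNs() const
	{
		return calls == 0 ? 0.0 : static_cast<double>(total.count()) / calls;
	}
};

// Starts timing on construction and adds the elapsed time on destruction,
// so early returns are measured as well.
class ScopedTimer {
public:
	explicit ScopedTimer(TimingStats &stats) :
		stats(stats), start(std::chrono::steady_clock::now()) {}

	~ScopedTimer()
	{
		stats.total += std::chrono::duration_cast<std::chrono::nanoseconds>(
				std::chrono::steady_clock::now() - start);
		stats.calls++;
	}

private:
	TimingStats &stats;
	std::chrono::steady_clock::time_point start;
};

// Placeholder workload standing in for a function such as UpdateTrainSpeed.
int MeasuredWork(TimingStats &stats, int n)
{
	ScopedTimer timer(stats);
	if (n < 0) return 0; // this early return is still timed
	int sum = 0;
	for (int i = 0; i < n; i++) sum += i;
	return sum;
}
```

After a run, stats.AverageNs() gives the mean time per call, which can then be compared between two builds that executed the same savegame for the same number of ticks.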
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
Posted: 06 Jan 2010 15:52
by Hirundo
Regarding optimization, using a cast, i.e. T::From(this) in AcceleratedVehicle::GetAcceleration may help. Since you're using a template anyway, it may help the compiler to optimize away all the virtual function calls. You do this in CargoChanged but not in GetAcceleration, while it may be more useful in the latter case.
Note that I haven't tested anything, so don't shoot me if I'm wrong
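A minimal sketch of what the cast buys, under some assumptions: the classes below are heavily simplified, the bodies and values are invented, and a plain static_cast stands in for T::From(this). The sketch also makes the called functions non-virtual; if they stayed virtual, the cast alone would not oblige the compiler to skip the vtable.

```cpp
#include <cassert>

// Base class template that knows its concrete type T at compile time.
template <class T>
struct AcceleratedVehicle {
	int GetAcceleration() const
	{
		// The cast tells the compiler the exact type, so the calls below are
		// resolved statically and are candidates for inlining.
		const T *v = static_cast<const T *>(this);
		return v->GetPower() / v->GetWeight();
	}
};

struct Train : AcceleratedVehicle<Train> {
	int GetPower() const { return 6000; }
	int GetWeight() const { return 300; }
};

struct RoadVehicle : AcceleratedVehicle<RoadVehicle> {
	int GetPower() const { return 400; }
	int GetWeight() const { return 40; }
};
```

With this layout, Train().GetAcceleration() and RoadVehicle().GetAcceleration() share one implementation without a single virtual call. The price is that there is no common non-template base to call GetAcceleration through.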

Re: Patch: Improved acceleration for road vehicles [v14-r18709]
Posted: 06 Jan 2010 21:00
by Terkhen
Hirundo wrote: Note that I haven't tested anything, so don't shoot me if I'm wrong

Now TrainLocoHandler is only 13% slower than the trunk version
Rubidium wrote: You can try to use TIC/TOC, which counts the number of CPU ticks between the TIC and the TOC. However, make sure that you capture all paths, and note that TIC/TOC doesn't work on all architectures. Also, context switches and the like can give weird numbers.
It seems to be Intel exclusive (either that or my searching failed, as I wasn't able to find any in-depth information on the web) and I'm using an AMD for profiling. I found out about rdtsc; I'll try to make it work.
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
Posted: 06 Jan 2010 21:03
by Rubidium
TIC/TOC are macros in, IIRC, debug.h (they use rdtsc internally).
Re: Patch: Improved acceleration for road vehicles [v14-r18709]
Posted: 06 Jan 2010 21:05
by Terkhen
Thanks! I got the wrong idea; I thought you were referring to something outside the OpenTTD code.
Edit: I have the first results. The initial version of UpdateTrainSpeed is 27% slower than in trunk. Using Hirundo's suggestion, UpdateTrainSpeed is only 10% slower. I'm going to try other optimization suggestions; let's see how optimized this can be.
In case anyone is curious, I'm using the following diff file and quick-and-dirty bash script for measuring times.
Re: Patch: Improved acceleration for road vehicles [v15-18750]
Posted: 07 Jan 2010 13:36
by Terkhen
New version released. There are no changes besides optimizations.
I found the TIC/TOC results not very precise, so I went back to gprof. Using more executions (100 per version), I got these results for the TrainLocoHandler function: v14 is 28.7% slower than trunk, and v15 is 18% slower.
My problem now is that I'm out of new optimizations; Hirundo's suggestion already took care of the virtual function calls. The implemented versions of these functions have a FORCEINLINE, but they still appear as separate functions in the gprof output. Since the GetAcceleration function makes a lot of calls to small functions, I think the performance would be way better if I managed to inline them for real. Besides that idea, I don't know how to continue reducing the gap between trunk and the unified acceleration code.
Re: Patch: Improved acceleration for road vehicles [v15-18750]
Posted: 10 Jan 2010 06:12
by DaleStan
I'm impressed that you managed to inline virtual functions at all.
To guarantee that polymorphism works, virtual functions usually have to be called through a vtable, a table of function pointers that the object points to for that purpose.
If the compiler can determine unambiguously that the object is of a certain type (usually because it can see the construction of the object, whether passed by value or constructed within the function) then it can call the function directly, but if the object is passed or returned by pointer or reference, the compiler will usually have to use the vtable.
If you know the type of the object and the overhead of creating a copy is not prohibitive, it might be worthwhile to create a copy and call the functions on the copy; the compiler should be able to inline those.
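A small sketch of the difference, using hypothetical Vehicle and Train classes that are not the ones from the patch. Whether a given compiler really emits a direct or inlined call for the copy depends on the compiler and its optimization settings.

```cpp
#include <cassert>

struct Vehicle {
	virtual ~Vehicle() {}
	virtual int GetMaxSpeed() const { return 0; }
};

struct Train : Vehicle {
	int max_speed;
	explicit Train(int speed) : max_speed(speed) {}
	int GetMaxSpeed() const { return max_speed; }
};

// The dynamic type behind the reference is unknown here, so this call
// normally goes through the vtable.
int SpeedViaReference(const Vehicle &v)
{
	return v.GetMaxSpeed();
}

// The local copy is exactly a Train, constructed in plain sight, so the
// compiler is free to call (and inline) Train::GetMaxSpeed directly.
int SpeedViaCopy(const Train &t)
{
	Train copy = t;
	return copy.GetMaxSpeed();
}
```

Both functions return the same value for a Train. Note that the copy only makes sense when the static type is already the concrete type: copying through a Vehicle reference would slice the object and call Vehicle::GetMaxSpeed instead.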