We ran some tests with Simutrans (which uses more or less similar software rendering), comparing the SDL and SDL2 backends. The conversion was not too difficult. One subtle issue: SDL2 cannot use sounds with different sampling rates together.
SDL2 is certainly not faster; in most cases it is at least 20% slower. The exception is the newest Mac OS, where only SDL2 gets hardware acceleration (which is mandatory with a Retina display). Otherwise screen rendering is limited by the memory transfer bandwidth the CPU can sustain.
Here is the performance of SDL2 on Windows against the other backends, each displaying an otherwise identical game (GDI is just the plain old Windows routines, not DirectX):
SDL2 13.6 ms/frame
OpenGL, PBO 9.3 ms/frame
SDL1, ST, -use_hw 9.0 ms/frame
OpenGL 8.3 ms/frame
GDI, ST 7.6 ms/frame
SDL1, ST, UpdateRect 7.4 ms/frame
SDL1, ST, UpdateRects 6.5 ms/frame
GDI, MT 5.4 ms/frame
SDL1, MT 5.3 ms/frame
ST = single-threaded, MT = multi-threaded
At least for 16 bit/pixel bitmaps, SDL2 runs at about half the speed of single-threaded SDL1.2 ... So the majority (i.e. the Windows users) would get a speed loss rather than a gain. (Of course, with 32-bit bitmaps the difference may be less obvious, since these come even closer to the bandwidth bottleneck.) But I am pretty sure 8-bit bitmaps will be better in SDL1.2 too, since OpenGL does not really care about them any more.
As noted above, SDL2 really needs dirty rects to be merged as much as possible. The numbers above were measured with the optimized algorithm, which produces as few dirty rects as possible.
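To illustrate what merging dirty rects means, here is a minimal sketch in C++. It is not the Simutrans algorithm, just a simple greedy variant: any two rectangles that overlap or touch are replaced by their bounding box until nothing more can be merged. The names Rect, touches, bounding_box and merge_dirty_rects are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rectangle type for this sketch (x, y = top-left corner).
struct Rect {
    int x, y, w, h;
};

// True if the two rectangles overlap or touch (shared edge or corner).
static bool touches(const Rect& a, const Rect& b) {
    return a.x <= b.x + b.w && b.x <= a.x + a.w &&
           a.y <= b.y + b.h && b.y <= a.y + a.h;
}

// Smallest rectangle that contains both a and b.
static Rect bounding_box(const Rect& a, const Rect& b) {
    const int x0 = std::min(a.x, b.x);
    const int y0 = std::min(a.y, b.y);
    const int x1 = std::max(a.x + a.w, b.x + b.w);
    const int y1 = std::max(a.y + a.h, b.y + b.h);
    return Rect{x0, y0, x1 - x0, y1 - y0};
}

// Greedy merge: keep joining touching rectangles until no pair is left.
// Each merge removes one rectangle, so the loop always terminates.
std::vector<Rect> merge_dirty_rects(std::vector<Rect> rects) {
    bool merged = true;
    while (merged) {
        merged = false;
        for (std::size_t i = 0; i < rects.size() && !merged; ++i) {
            for (std::size_t j = i + 1; j < rects.size(); ++j) {
                if (touches(rects[i], rects[j])) {
                    rects[i] = bounding_box(rects[i], rects[j]);
                    rects.erase(rects.begin() + static_cast<std::ptrdiff_t>(j));
                    merged = true;  // restart the scan with the grown rect
                    break;
                }
            }
        }
    }
    return rects;
}
```

The trade-off is that a bounding box may cover pixels that did not change, so fewer update calls are bought with somewhat more copied memory. A real implementation would probably limit how much extra area a merge may add; this sketch does not.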
I like to look at great maps and see how things flow. A little like a finished model railway, but it is evolving and actually never finished. http://www.simutrans.com