Use of __builtin_expect in SQLite

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Use of __builtin_expect in SQLite

Dominique Pellé
Hi

I see that SQLite has many small optimizations being
checked-in. Wouldn't it help to use the following macros
and use unlikely(...) for error paths:

#ifdef __GNUC__
#define likely(x)       __builtin_expect(!!(x), 1)
#define unlikely(x)     __builtin_expect(!!(x), 0)
#else
#define likely(x)       (x)
#define unlikely(x)     (x)
#endif

It helps for branch prediction and instruction cache hits
as cold branches are moved away from hot branches.
CPU have branch predictor, but for complex code, with
many 'if', static hints can still helps. Better is to compile
with PGO (profile guided optimization), but few people
use PGO in practice.

Doing a Google search, I found that __builtin_expect was
disabled because it did not work with old gcc and was
then completely removed in 2012:

http://dev.gitblit.com:8080/cgi-bin/repo/sqlite/info/e01f9ed9450d3e23
http://sqlite.org/src4/info/d618b9b1069e66779c298798acb24044664b5109

However, it's possible to check fo gcc version
using something like this as found here:

http://www.opensource.apple.com/source/X11proto/X11proto-57/xproto/xproto-7.0.21/Xfuncproto.h.in?txt

#if defined(__GNUC__) && ((__GNUC__ * 100 + __GNUC_MINOR__) >= 303)
# define _X_LIKELY(x)   __builtin_expect(!!(x), 1)
# define _X_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else /* not gcc >= 3.3 */
# define _X_LIKELY(x)   (x)
# define _X_UNLIKELY(x) (x)
#endif

I'm curious about the outcome on SQLite benchmarks.

Regards
Dominique
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Use of __builtin_expect in SQLite

Roger Binns
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 07/02/16 00:56, Dominique Pellé wrote:
> I'm curious about the outcome on SQLite benchmarks.

About a year ago I tried them out on some tight code (non-SQLite) that
absolutely had to use less CPU time.  I couldn't get them to make any
difference outside the bounds of measurement error.  Since SQLite has
lots of "tight code" places, the instrumentation would have to help in
most of them to make a difference (Amdahl's law).

Taking a step back, the reasons why it had no measureable effect are
simple.  The processors are getting better at branch prediction,
better at mitigating mispredicted branches, getting even faster
compared to memory.  The compilers are getting better all the time too
at the same kind of things.

It takes a more pervasive effort (in the general case) to see
performance improvements, which is what PGO does.  Note that while it
provides branch information, it also provides far more useful
information about what gets called and how much so that the entire
binary can be re-arranged, such as putting hot functions in the same
pages reducing cache pressure which will have a more measureable effect.

Roger

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iEYEARECAAYFAla4DB4ACgkQmOOfHg372QRzyQCg05Z2zyOjqJ58q0jx367wQhqo
+SAAn1AeeydFIdwvJRwh1P+MrSEX1Pkd
=4nwe
-----END PGP SIGNATURE-----
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Use of __builtin_expect in SQLite

Simon Slavin-3

On 8 Feb 2016, at 3:31am, Roger Binns <[hidden email]> wrote:

> Taking a step back, the reasons why it had no measureable effect are
> simple.  The processors are getting better at branch prediction,
> better at mitigating mispredicted branches, getting even faster
> compared to memory.  The compilers are getting better all the time too
> at the same kind of things.

This is the other argument against optimization (of a certain kind).  Computers are far better at optimization than humans are, and they always do it.  No need for you to do it too.

Simon.
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Use of __builtin_expect in SQLite

Matthias-Christian Ott
In reply to this post by Roger Binns
On 2016-02-08 04:31, Roger Binns wrote:
> On 07/02/16 00:56, Dominique Pellé wrote:
>> I'm curious about the outcome on SQLite benchmarks.
>
> About a year ago I tried them out on some tight code (non-SQLite) that
> absolutely had to use less CPU time.  I couldn't get them to make any
> difference outside the bounds of measurement error.  Since SQLite has
> lots of "tight code" places, the instrumentation would have to help in
> most of them to make a difference (Amdahl's law).

Amdahl's law is not applicable here and describes a completely different
problem. SQLite does not involve concurrency.

In typical scenarios SQLite is also limited by the number of IOPS and
optimization to improve branch predictability have no measurable effect
and you just waste time with them. Just use PCIe SSDs that are able to
saturate the bus and all of you performance problems with SQLite are
gone if you use SQLite on typical computer.

- Matthias-Christian


_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

signature.asc (188 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Use of __builtin_expect in SQLite

Scott Hess
On Sun, Feb 7, 2016 at 10:39 PM, Matthias-Christian Ott <[hidden email]>
wrote:

> On 2016-02-08 04:31, Roger Binns wrote:
> > On 07/02/16 00:56, Dominique Pellé wrote:
> >> I'm curious about the outcome on SQLite benchmarks.
> >
> > About a year ago I tried them out on some tight code (non-SQLite) that
> > absolutely had to use less CPU time.  I couldn't get them to make any
> > difference outside the bounds of measurement error.  Since SQLite has
> > lots of "tight code" places, the instrumentation would have to help in
> > most of them to make a difference (Amdahl's law).
>
> Amdahl's law is not applicable here and describes a completely different
> problem. SQLite does not involve concurrency.
>

"Amdahl's law _In computer architecture, Amdahl's law (or Amdahl's
argument[1]) gives the theoretical speedup in latency of the execution of a
task at fixed workload that can be expected of a system whose resources are
improved."

You _can_ use it to analyze why parallelizing an algorithm across N CPUs
won'y cause the runtime to become 1/N, but that's not the only use.

In typical scenarios SQLite is also limited by the number of IOPS and
> optimization to improve branch predictability have no measurable effect
> and you just waste time with them. Just use PCIe SSDs that are able to
> saturate the bus and all of you performance problems with SQLite are
> gone if you use SQLite on typical computer.
>

In fact, your argument here is Amdahl's Law.  If your performance is
primarily limited by I//O latency, then improving CPU efficiency won't help
much.

-scott
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Reply | Threaded
Open this post in threaded view
|

Re: Use of __builtin_expect in SQLite

Roger Binns
In reply to this post by Matthias-Christian Ott
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 07/02/16 22:39, Matthias-Christian Ott wrote:
> Amdahl's law is not applicable here and describes a completely
> different problem. SQLite does not involve concurrency.

Amdahl's law very much applies, and doesn't explicitly only involve
concurrency.  It is about relating speedups of individual pieces and
how that affects the whole.  For example:

Lets say that processing a representative query involves ten different
places in the SQLite code, and that each one of those takes about ten
percent of the total execution time.  And then lets say
likely/unlikely speeds up one of those by ten percent.  The overall
whole improvement will then be 1%.  To get an overall improvement of
10% each of the ten different pieces would need to get about 10%
faster.  That would be a huge amount of work, and the nature of each
of those places would have be that they could be sped up that way.

> SQLite does not involve concurrency.

http://sqlite.org/pragma.html#pragma_threads  :-)

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iEYEARECAAYFAla4u9gACgkQmOOfHg372QScKACeKaDcRUmtllIaCtLrvQOXYAoy
tPsAoNH+TKDtsWsE9XeJHTVwKQ24MjJu
=sPJZ
-----END PGP SIGNATURE-----
_______________________________________________
sqlite-users mailing list
[hidden email]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users