From: <matt <at> finray.com>
Subject: Re: [function] timing trivia
Newsgroups: gmane.comp.lib.boost.devel
Date: 2004-02-15 12:40:23 GMT
I think I found a suitable trick to measure boost::function overhead in
release mode on my platform.

I'm now consistently getting 14-30 nanoseconds by forcing the result into
a volatile double.

Results are:
--------------------------------------------------------------------------------
                        looper invasive timing estimate
                               boost_function_call
--------------------------------------------------------------------------------
median time = 33.52057416267943 microseconds
90% range size = 2.514354066985641 microseconds
widest range size (max - min) = 19.96459330143541 microseconds
minimum time = 25.20813397129187 microseconds
maximum time = 45.17272727272728 microseconds
50% range = (33.51961722488039 microseconds, 33.52057416267943 microseconds)
50% range size = 0.9569377990413671 nanoseconds
--------------------------------------------------------------------------------
                        looper invasive timing estimate
                  boost_function_call_equiv_with_in_line_source
--------------------------------------------------------------------------------
median time = 33.50287081339713 microseconds
90% range size = 2.509569377990427 microseconds
widest range size (max - min) = 16.82918660287082 microseconds
minimum time = 25.18851674641148 microseconds
maximum time = 42.0177033492823 microseconds
50% range = (33.50287081339713 microseconds, 33.50382775119618 microseconds)
50% range size = 0.9569377990413671 nanoseconds
Press any key to continue . . .

That is about an 18ns difference between the medians.  It comes from this code:

#define MAX_FN_LOOP  1e4

static double not_empty() {
	static double sum;
	static double i;

	sum = 0.0;
	for (i = 0.0; i < MAX_FN_LOOP ; ++i) {
		sum += i * i;
	}
	return sum;
}

inline double boost_function_call( matt::timer& t )
{
	boost::function< double (void)> fn = &not_empty;

	double now;
	volatile double x = 0;
	t.restart();
	x += fn();

	now = t.elapsed();

	return now;
}

inline double boost_function_call_equiv_with_in_line_source( matt::timer& t )
{

	double now;
	volatile double x;
	t.restart();

	static double sum;
	static double i;

	sum = 0.0;
	for (i = 0.0; i < MAX_FN_LOOP ; ++i) {
		sum += i * i;
	}

	x = sum;
	now = t.elapsed();

	return now;
}
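
For readers without matt::timer to hand, here is a minimal sketch of the same measurement idea. It is not the code that produced the listings above. It assumes std::function as a stand-in for boost::function and std::chrono::steady_clock in place of matt::timer. The names timed_function_call_ns, median and median_call_time_ns are hypothetical. The volatile sink plays the same role as x in the post: the optimiser has to keep the store, so it cannot discard the call.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Same workload as not_empty() in the post (1e4 iterations).
static double not_empty() {
    double sum = 0.0;
    for (double i = 0.0; i < 1e4; ++i) {
        sum += i * i;
    }
    return sum;
}

// Elapsed nanoseconds for one call through std::function.
// Assumption: std::function stands in for boost::function here.
double timed_function_call_ns() {
    std::function<double()> fn = &not_empty;
    volatile double x = 0.0;  // volatile sink: the store must be performed
    const auto start = std::chrono::steady_clock::now();
    x = x + fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count();
}

// Median of the samples (upper median for an even count).
// Takes a copy because nth_element reorders its input.
double median(std::vector<double> samples) {
    assert(!samples.empty());
    const auto mid = samples.begin() + samples.size() / 2;
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}

// Repeat the measurement and report the median of the samples.
double median_call_time_ns(std::size_t repeats) {
    std::vector<double> samples;
    samples.reserve(repeats);
    for (std::size_t i = 0; i < repeats; ++i) {
        samples.push_back(timed_function_call_ns());
    }
    return median(samples);
}
```

Calling median_call_time_ns(1000) gives one median in nanoseconds. The absolute value depends on the clock resolution and the compiler flags, so only the difference against an inlined variant timed the same way is meaningful.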

If I change the size of the loop to something much smaller, say 1e1, I get
a reasonably consistent result now:

--------------------------------------------------------------------------------
                        looper invasive timing estimate
                               boost_function_call
--------------------------------------------------------------------------------
median time = 60.28708133971293 nanoseconds
90% range size = 0.9569377990430612 nanoseconds
widest range size (max - min) = 2.90622009569378 microseconds
minimum time = 60.28708133971293 nanoseconds
maximum time = 2.966507177033494 microseconds
50% range = (60.28708133971293 nanoseconds, 61.244019138756 nanoseconds)
50% range size = 0.9569377990430612 nanoseconds
--------------------------------------------------------------------------------
                        looper invasive timing estimate
                  boost_function_call_equiv_with_in_line_source
--------------------------------------------------------------------------------
median time = 43.54066985645935 nanoseconds
90% range size =  nanoseconds
widest range size (max - min) = 34.92822966507178 nanoseconds
minimum time = 43.54066985645935 nanoseconds
maximum time = 78.46889952153111 nanoseconds
50% range = (43.54066985645935 nanoseconds, 43.54066985645935 nanoseconds)
50% range size =  nanoseconds
Press any key to continue . . .

About a 17ns difference.  That is pretty consistent with the previous
measurement, even though the function workload differs by three orders of
magnitude (1e4 versus 1e1 loop iterations).
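
The two differences quoted above follow directly from the reported medians. A small sketch of the arithmetic, using the median values copied from the listings (the helper name overhead_ns is hypothetical):

```cpp
// Difference between two median times, both given in nanoseconds.
constexpr double overhead_ns(double via_function_ns, double inlined_ns) {
    return via_function_ns - inlined_ns;
}

// 1e4-iteration run: medians were reported in microseconds, so they are
// written here multiplied by 1e3 to convert to nanoseconds.
constexpr double large_loop_ns =
    overhead_ns(33.52057416267943e3, 33.50287081339713e3);

// 1e1-iteration run: medians were reported in nanoseconds already.
constexpr double small_loop_ns =
    overhead_ns(60.28708133971293, 43.54066985645935);
```

This works out to roughly 17.7 ns for the large loop and 16.7 ns for the small one, which is where the rounded 18ns and 17ns figures come from.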

I think my quotable message for Doug would read:

<message>
The cost of boost::function can be reasonably consistently measured at
around 20ns +/- 10 ns on a modern >2GHz platform versus directly inlining
the code.

However, the performance of your application may benefit from or be
disadvantaged by boost::function depending on how your C++ optimiser
optimises.  As with a standard function pointer, differences on the order
of 10% have been noted to the _benefit_ or _disadvantage_ of using
boost::function to call a function that contains a tight loop, depending on
your compilation circumstances.
</message>
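
Since the message compares boost::function with a plain function pointer, here is a rough sketch of timing both call paths with one helper. It assumes std::function as a stand-in for boost::function and std::chrono::steady_clock as the timer. The names work, time_once_ns, time_via_pointer and time_via_function are hypothetical and do not come from the original test program.

```cpp
#include <chrono>
#include <functional>

// Small workload, matching the 1e1-iteration variant of the loop.
static double work() {
    double sum = 0.0;
    for (double i = 0.0; i < 10.0; ++i) {
        sum += i * i;
    }
    return sum;
}

// Time a single invocation of any callable returning double, in nanoseconds.
template <class Callable>
double time_once_ns(Callable&& call) {
    volatile double sink = 0.0;  // keeps the result observable to the optimiser
    const auto start = std::chrono::steady_clock::now();
    sink = call();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count();
}

// Call through a plain function pointer.
double time_via_pointer() {
    double (*fp)() = &work;
    return time_once_ns(fp);
}

// Call through the type-erased wrapper.
double time_via_function() {
    std::function<double()> fn = &work;
    return time_once_ns(fn);
}
```

A single sample from either function is noisy. As in the listings above, it makes more sense to collect many samples and compare medians, and whether either path comes out ahead will depend on the compiler and its settings.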

HTH...

Which is where I'll leave it.  I think I'm satisfied with my lack of
understanding of this trivial trivia now.

Regards,

Matt Hurd.