From: Henry Nestler <henry.ne-KvP5wT2u2U0 <at> public.gmane.org>
Subject: Re: Very large time offset in coLinux
Newsgroups: gmane.linux.colinux.general
Date: Friday 17th September 2010 00:47:20 UTC
Hello Ron,

thanks for tracing all this, and many thanks for pointing out the div64 
bug. It would be nice if you could open a bug report on sf.net, so we 
don't forget to change co_div64 at some point. Currently I have no idea 
for a better function.

I don't think that is the problem, because the rounding error is 
corrected later by multiplying back and storing the rest in the variable 
timestamp_reminder. I mean this line:
cmon->timestamp_reminder = timestamp_diff - (jiffies * cmon->timestamp_freq.quad);
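
To illustrate the point about the remainder, here is a minimal sketch and not the actual coLinux code. The struct, the function names and the deliberately imprecise lossy_div() are hypothetical; only the form of the remainder line is taken from the line quoted above. It shows that ticks dropped by a divider that returns too small a quotient stay in the remainder and are counted on a later call.

```c
#include <stdint.h>

/* Hypothetical divider signature, standing in for co_div64(). */
typedef uint64_t (*div64_fn)(uint64_t, uint64_t);

/* Hypothetical stand-in for the monitor state; only the two values
 * discussed here are modelled. */
struct clock_state {
    uint64_t freq;       /* timestamp ticks per jiffy (assumed unit) */
    uint64_t remainder;  /* ticks not yet converted to jiffies */
};

/* Exact division, for comparison. */
static uint64_t exact_div(uint64_t a, uint64_t b)
{
    return a / b;
}

/* Deliberately reports a quotient that is one too small,
 * to mimic an imprecise divider. */
static uint64_t lossy_div(uint64_t a, uint64_t b)
{
    uint64_t q = a / b;
    return q > 0 ? q - 1 : 0;
}

/* Convert newly elapsed ticks to whole jiffies and carry the rest. */
static uint64_t ticks_to_jiffies(struct clock_state *s, uint64_t new_ticks,
                                 div64_fn div)
{
    uint64_t diff = s->remainder + new_ticks;
    uint64_t jiffies = div(diff, s->freq);
    /* Same form as the quoted line: whatever the divider did not
     * account for stays in the remainder. */
    s->remainder = diff - jiffies * s->freq;
    return jiffies;
}
```

In this sketch the invariant "jiffies * freq + remainder == total ticks" holds as long as the divider never returns too large a quotient. If it did, the subtraction would go below zero (or wrap, with unsigned types), so the compensation only covers errors in one direction.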

A debug version is available from here:
http://www.henrynestler.com/colinux/testing/devel-0.7.8/20100916-jiffies

I have changed the casts from "long long" to "unsigned long long" and 
removed the casts we don't need. So we get one more bit and 
no negative values.

Old:
long long timestamp_diff;
timestamp_diff += 100 * (((long long)timestamp.quad) - ((long long)cmon->timestamp.quad));

New:
unsigned long long timestamp_diff;
timestamp_diff += 100 * (timestamp.quad - cmon->timestamp.quad);
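
As a small illustration of what the unsigned form does, here is a sketch with a hypothetical helper name (scaled_diff is not from the coLinux source). Unsigned 64-bit subtraction is defined modulo 2^64, so a counter that wraps around still gives the right difference. The other side of this is that a reading slightly behind the previous one no longer gives a small negative value but a very large positive one. Whether that can happen here is not something this sketch can tell.

```c
#include <stdint.h>

/* Elapsed ticks between two counter readings, scaled by 100 as in the
 * "New" snippet. Hypothetical helper, for illustration only. */
static uint64_t scaled_diff(uint64_t now, uint64_t prev)
{
    /* Unsigned arithmetic wraps modulo 2^64 and never overflows
     * in the undefined-behaviour sense. */
    return 100 * (now - prev);
}

/* Hypothetical guard: nonzero if the new reading is behind the old one,
 * in which case scaled_diff() returns a huge value. */
static int reading_went_backwards(uint64_t now, uint64_t prev)
{
    return now < prev;
}
```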

Henry

On 16.09.2010 19:06, Ron Avriel wrote:
> Hi,
>
> Any update on this issue? The server leaped again with almost an 
> identical value (30949 seconds).
> Is it possible to at least have a debug version with log prints in 
> case of large leap?
> I also suggest replacing co_div64() - see below.
>
> Thanks,
> Ron
>
>
> From: [email protected]
> To: [email protected]
> Date: Sun, 12 Sep 2010 14:29:25 +0000
> Subject: Re: [coLinux-users] Very large time offset in coLinux
>
> Hi Henry,
>
> One of our servers leaped forward again. The interesting part is that 
> the leap is almost identical to a previous leap.
> Last time it leaped forward by 30944 seconds, and this time by 30961 
> seconds.
> Performance frequency is 3579545.
>
> Since these two leaps are very close, I have a feeling it's not some 
> random error, but rather a calculation error.
> It's possible that Windows/Linux were loaded at time of leap.
>
> I went over some of the code and found that co_div64() isn't accurate 
> (!), although I couldn't explain the leap by this bug.
>
> For example,
> co_div64(0x100000000,0x10000000) returns 15 instead of 16.
> co_div64(0x1000000000000,0x10000000) returns 983055 instead of 1048576.
>
> I'm sure you'll find more accurate algorithms.
>
> Could you also go over relevant code and see if you notice any 
> overflow, signed/unsigned error that can explain the leap with the 
> above data?
> Would it be possible to get a debug version to get more information 
> next time the problem occurs?
>
> Thanks in advance,
> Ron
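
For reference, the two co_div64() examples quoted above can be checked against a plain shift-and-subtract (restoring) division. This is only a sketch under assumptions: the name div64_shift_subtract is made up, it is not a proposed replacement taken from the coLinux tree, and it assumes the built-in 64-bit "/" operator should be avoided in that context. Returning 0 for a zero divisor is an arbitrary choice for the sketch.

```c
#include <stdint.h>

/* Unsigned 64-bit division by bitwise long division.
 * Hypothetical helper, not the coLinux co_div64(). */
static uint64_t div64_shift_subtract(uint64_t n, uint64_t d)
{
    uint64_t q = 0;  /* quotient, built one bit at a time */
    uint64_t r = 0;  /* running remainder, always < d between steps */
    int i;

    if (d == 0)
        return 0;    /* arbitrary choice for the sketch */

    for (i = 63; i >= 0; i--) {
        /* Remember the bit shifted out of r: if it is set, the true
         * value of the shifted remainder is >= 2^64 and so >= d. */
        uint64_t carry = r >> 63;

        /* Bring down bit i of the dividend. */
        r = (r << 1) | ((n >> i) & 1);

        if (carry || r >= d) {
            r -= d;  /* correct modulo 2^64 even when carry is set */
            q |= (uint64_t)1 << i;
        }
    }
    return q;
}
```

With the values from the mail, div64_shift_subtract(0x100000000, 0x10000000) gives 16 and div64_shift_subtract(0x1000000000000, 0x10000000) gives 1048576. The loop always runs 64 iterations, so it is simple rather than fast.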
 