
Re: [E-devel] fix for that silly -mfpmath=sse weirdness



David Sharp wrote:
> On 9/25/06, The Rasterman Carsten Haitzler <raster@rasterman.com> wrote:
>> On Mon, 25 Sep 2006 10:50:09 -0700 Blake Barnett <shadoi@nanovoid.com> babbled:
>>
>> >
>> > On Sep 25, 2006, at 10:44 AM, David Sharp wrote:
>> >
>> > > amd64 users, rejoice!
>> > >
>> > > i got tired of slowing down e with -mfpmath=387 all the time, so i
>> > > finally dug in to this bug after realizing it was probably a problem
>> > > with floating point cancellation (occurs when subtracting two numbers
>> > > that are very nearly equal, and causes an extreme loss of precision).
>> > >
>> > > the problem is in _edje_part_recalc() when it linearly interpolates
>> > > all the part parameters. all of the calculations are of this form:
>> > > p3.x = (p1.x * (1.0 - pos)) + (p2.x * (pos));
>> > > i believe there is some cancellation occurring in the (1.0 - pos) part
>> > > of this, especially as pos approaches 1.0.
>> > >
>> > > replacing the above line with the following fixes the problem:
>> > > p3.x = p1.x + (p2.x - p1.x) * pos;
>> > > mathematically equivalent, but, alas, computers aren't as good at
>> > > math as we think.
>> >
>> > Awesome!  My 3500+ CPU has felt fairly sluggish in E, my video card
>> > has 128MB of video memory (ATI 9700PRO, using ATI's drivers) and I've
>> > noticed that things just aren't as fluid as in Raster's video
>> > captures.  Hopefully this'll help. Nice work.
>>
>> it won't make any difference you can measure. the amount of fp math done is
>> < 0.1% of the work - by using sse math you lose a bit of precision and maybe
>> gain 50% speedup - on that < 0.01%. frankly - you will not be able to even
>> measure the speedup let alone notice it. it's a fallacy to think it will
>> help. trust me.
> 
> yah, it will only save about 37 instructions (b/c the new formulas use
> one less operation, and there are 37 of them), and the sse
> instructions will only be slightly faster than the FPU ones. that's a
> savings of about, say 25 ns each time _edje_part_recalc is run.
> 
> btw, you gave blake my credit in the CVS log... :(

Bummer. I guess ATI's drivers just aren't as fast as Nvidia's.  At least
we don't need a special case for amd64 when building packages now.

-Blake
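
For reference, the two interpolation forms quoted above can be put side by
side as plain C functions. This is only a minimal sketch: the names
lerp_weighted and lerp_delta are made up for illustration and do not appear
in edje, and the real _edje_part_recalc() applies the formula to struct
fields rather than to bare doubles.

```c
/* Original form from the thread: weighted sum of both endpoints.
 * Costs one subtraction, two multiplications and one addition.
 * (Hypothetical function name, not from the edje source.) */
static double lerp_weighted(double p1, double p2, double pos)
{
    return (p1 * (1.0 - pos)) + (p2 * pos);
}

/* Replacement form from the thread: start value plus a scaled delta.
 * Costs one subtraction, one multiplication and one addition, which is
 * the "one less operation" mentioned above.
 * (Hypothetical function name, not from the edje source.) */
static double lerp_delta(double p1, double p2, double pos)
{
    return p1 + (p2 - p1) * pos;
}
```

Both functions compute the same value on paper, but the intermediate
results are rounded at different points, so the last bits of the result can
differ depending on the inputs and on whether the compiler emits x87 or SSE
code.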