I'm implementing a floating-point library in C, based on the IEEE 754 standard. I've started with addition.
I have a problem understanding rounding.
Maybe an example will help:
x = -3.652248e-11  y = 1.263346e-10
cz = 8.981213e-11  mz = 8.981214e-11
cz = 8.98121e-11   mz = 8.98121e-11
x = ae20a0a7   y = 2f0ae808
cf = 2ec57fbc  mf = 2ec57fbd
cz = 457fbc    mz = 457fbd
ce = 5d        me = 5d
In the above, x and y are random input values, cz is the result of the C operator +, and mz is the result of my implementation. cf and mf are the bit representations of the final results. You can see that the final bit is different, and I don't understand why. I took inspiration for my implementation from the Handbook of Floating-Point Arithmetic.
What I don't understand, I guess, is how the rounding is performed. The addition algorithm is based on the identity
x + y = (-1)^{s_x} 2^{e_x} (|x| + (-1)^{s_x xor s_y} |y| 2^{e_y - e_x})
where, if we name the quantity
|z| = |x| + (-1)^{s_x xor s_y} |y| 2^{e_y - e_x}
the problem basically arises when I need to post-normalize the result using left shifting. I have been careful in the case where |z| is positive. Which rounding technique should be applied in this case?
My copy of Muller et al. is on loan to a friend, so I can't double-check the algorithm you're using specifically, but walking through the addition of the values you list:
x = 0xae20a0a7 = -b1.01000001010000010100111 * 2^-35
y = 0x2f0ae808 = +b1.00010101110100000001000 * 2^-33
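If you want to check that decoding yourself, here is a minimal sketch that splits a single-precision bit pattern into sign, unbiased exponent, and 24-bit significand. The names fp_fields and fp_decode are made up for illustration, and it only handles normal numbers (no zeros, subnormals, infinities, or NaNs).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not from the Handbook. */
typedef struct {
    uint32_t sign;        /* 0 for positive, 1 for negative */
    int      exponent;    /* unbiased exponent */
    uint32_t significand; /* 24 bits, implicit leading 1 made explicit */
} fp_fields;

/* Decode a binary32 bit pattern. Assumes a normal number. */
static fp_fields fp_decode(uint32_t bits)
{
    uint32_t biased = (bits >> 23) & 0xFFu;
    fp_fields f;

    /* Exponent fields 0 and 255 encode zero/subnormals and inf/NaN. */
    assert(biased != 0u && biased != 0xFFu);

    f.sign        = bits >> 31;
    f.exponent    = (int)biased - 127;
    f.significand = (bits & 0x7FFFFFu) | 0x800000u;
    return f;
}
```

For your inputs, fp_decode(0xae20a0a7) gives sign 1, exponent -35, significand 0xa0a0a7, and fp_decode(0x2f0ae808) gives sign 0, exponent -33, significand 0x8ae808, which matches the binary expansions above.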
If we normalize x and y to a common exponent and add, we get the un-normalized, infinitely precise result:
  b100.01010111010000000100000 * 2^-35
-   b1.01000001010000010100111 * 2^-35
-------------------------------------
   b11.00010101111111101111001 * 2^-35
Now we normalize, without rounding yet:
b1.10001010111111110111100 1 * 2^-34
                          ^ rounding point
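The alignment and subtraction above can be reproduced with plain integer arithmetic. This is only a sketch under simplifying assumptions: the function names are hypothetical, the operand with the larger exponent must also have the larger magnitude, and the exponent gap must be small enough that nothing is shifted out of 32 bits. A real implementation would also need guard/sticky bits for large gaps and a left shift after cancellation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Exact value of (big - small) for two 24-bit significands, expressed in
 * units of 2^(small_exp - 23). The larger-exponent operand is shifted LEFT
 * so that no bits are lost before the subtraction.
 */
static uint32_t exact_difference(uint32_t big_sig, int big_exp,
                                 uint32_t small_sig, int small_exp)
{
    int gap = big_exp - small_exp;
    uint32_t aligned;

    assert(gap >= 0 && gap <= 7);   /* 24 + 7 bits still fit in 32 bits */
    aligned = big_sig << gap;
    assert(aligned >= small_sig);   /* sketch assumes a non-negative result */
    return aligned - small_sig;
}

/* How many low-order bits must be dropped to get back to 24 bits. */
static unsigned excess_bits(uint32_t sig)
{
    unsigned n = 0;
    while ((sig >> n) > 0xFFFFFFu)
        n++;
    return n;
}
```

With your values, exact_difference(0x8ae808, -33, 0xa0a0a7, -35) is 0x18aff79, which has 25 significant bits, so excess_bits returns 1. The 24 bits you keep are 0xc57fbc, and the single bit you drop is a 1.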
The infinitely precise result is exactly halfway between the two nearest floating-point numbers, so we choose the even one and round down to
b1.10001010111111110111100 * 2^-34 = 0x2ec57fbc
Given that this is an exact halfway case, the most likely explanation for why you're not getting the correct answer is that you're not handling the ties-to-even part of the rounding rule correctly. If you round by adding half an ulp and truncating, you get exactly the result you are observing.
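To make the difference concrete, here is a small sketch of both rounding strategies applied to a wide integer significand. I'm guessing at how your code is structured, so the function names and the "drop the low shift bits" interface are assumptions, not anything taken from the Handbook.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Drop the low `shift` bits of sig, rounding to nearest with ties to even.
 * The caller must still renormalize if the increment carries out of 24 bits.
 */
static uint32_t round_ties_even(uint32_t sig, unsigned shift)
{
    uint32_t kept, rem, half;

    assert(shift >= 1 && shift < 32);
    kept = sig >> shift;                 /* bits that survive */
    rem  = sig & ((1u << shift) - 1u);   /* bits being discarded */
    half = 1u << (shift - 1);            /* exactly half an ulp */

    /* Round up when above half, or when exactly half and kept is odd. */
    if (rem > half || (rem == half && (kept & 1u)))
        kept += 1u;
    return kept;
}

/*
 * Add half an ulp and truncate. This agrees with the function above
 * except on exact ties, which it always rounds up.
 * Assumes sig is small enough that the addition does not overflow.
 */
static uint32_t round_half_up(uint32_t sig, unsigned shift)
{
    assert(shift >= 1 && shift < 32);
    return (sig + (1u << (shift - 1))) >> shift;
}
```

For the exact difference in this example, 0x18aff79 with one bit to drop, round_ties_even gives 0xc57fbc (mantissa field 0x457fbc, your cz), while round_half_up gives 0xc57fbd (mantissa field 0x457fbd, your mz).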