2Tessa Posted March 15 Share Posted March 15 (edited) Yesterday I was looking for ways to normalize rotations faster and i found this way that increase the speed by ~20%: instead to execute 4 divisions: rotation NormRot(rotation Q) { float MagQ = llSqrt(Q.x*Q.x + Q.y*Q.y +Q.z*Q.z + Q.s*Q.s); return <Q.x/MagQ, Q.y/MagQ, Q.z/MagQ, Q.s/MagQ>; } you can execute 4 multiplication by the inverse rotation NormRot(rotation Q) { float iMagQ = 1/llSqrt(Q.x*Q.x + Q.y*Q.y +Q.z*Q.z + Q.s*Q.s); return <Q.x*iMagQ, Q.y*iMagQ, Q.z*iMagQ, Q.s*iMagQ>; } I also tried to find ways to compute the SQRT faster for NLERP rotation interpolation rotation NLERP(rotation a, rotation b, float t) { float ti = 1-t; rotation r = <a.x*ti, a.y*ti, a.z*ti, a.s*ti>+<b.x*t, b.y*t, b.z*t, b.s*t>; float m = llSqrt(r.x*r.x+r.y*r.y+r.z*r.z+r.s*r.s); // normalize return <r.x/m, r.y/m, r.z/m, r.s/m>; } and in C there are plenty ways, but they use operators which are not available in LSL, or they lose the speed when interpreted. Who have in-depth knowledge can understand if any of these ways can be effective in LSL. The fastest SQRT hacks the 32bit memory representation of the float value. float sqrt7(float x) { // convert float to bitwise but LSL don't have ADDRESS type and unary operators * and & unsigned int i = *(unsigned int*) &x; i += 127 << 23; i >>= 1; return *(float*) &i; } float Q_rsqrt( float number ) { long i; float x2, y; const float threehalfs = 1.5F; x2 = number * 0.5F; y = number; i = * ( long * ) &y; i = 0x5f3759df - ( i >> 1 ); y = * ( float * ) &i; y = y * ( threehalfs - ( x2 * y * y ) ); return y; } float fastsqrt(float val) { union { int tmp; float val; } u; u.val = val; u.tmp -= 1<<23; u.tmp >>= 1; u.tmp += 1<<29; return u.val; } func NewtonSqrt(x float64) float64 { z := 1.0 // let initial guess to be 1 for i := 1; i <= 10; i++ { z -= (z*z - x) / (2*z) // MAGIC LINE!! } return z } double ApproximateSqrtBabylonian(double x, int precision) { double res = x; for (int n = 0; n < precision; n++) { res = 0.5f * (res + x / res); } return res; } /* by Jim Ulery */ static unsigned julery_isqrt(unsigned long val) { unsigned long temp, g=0, b = 0x8000, bshft = 15; do { if (val >= (temp = (((g << 1) + b)<<bshft--))) { g += b; val -= temp; } } while (b >>= 1); return g; } double fsqrt (double y) { double x, z, tempf; unsigned long *tfptr = ((unsigned long *)&tempf) + 1; tempf = y; *tfptr = (0xbfcdd90a - *tfptr)>>1; /* estimate of 1/sqrt(y) */ x = tempf; z = y*0.5; /* hoist out the “/2” */ x = (1.5*x) - (x*x)*(x*z); /* iteration formula */ x = (1.5*x) – (x*x)*(x*z); x = (1.5*x) – (x*x)*(x*z); x = (1.5*x) – (x*x)*(x*z); x = (1.5*x) – (x*x)*(x*z); return x*y; } Edited March 15 by 2Tessa Link to comment Share on other sites More sharing options...
elleevelyn Posted March 15 Share Posted March 15 i think the best we can get without using any LSL library function is the Newton example. Probably using the precision parameter from the babylonian example for finer control on the result not sure tho that it would be any faster than just using llSqrt the quake method (Q_rsqrt) might be possibly faster if there is a substitute method for the magic number 0x5f3759df, but I dunno what that method might be, or even if there is a substitute at all 1 Link to comment Share on other sites More sharing options...
Wulfie Reanimator Posted March 15 Share Posted March 15 3 minutes ago, elleevelyn said: the quake method (Q_rsqrt) might be possibly faster if there is a substitute method for the magic number 0x5f3759df, but I dunno what that method might be, or even if there is a substitute at all There's no substitute for it. The trick relies on type-punning the value with pointers. &i ; creates an integer pointer to the value of i. (float *) &i ; typecasts the integer pointer to a float pointer. * (float *) &i ; dereferences the float pointer into the value being pointed to. The "pun" here is that the integer value of "i" is now interpreted as if it was a float all along. And since LSL has no pointers... 1 Link to comment Share on other sites More sharing options...
Frionil Fang Posted March 15 Share Posted March 15 Like Wulfie said, most of these are no help without type punning and/or unions. There's also the fact that user-defined functions are *slow*. From a quick test, 100000 square roots takes less than a second; 100000 user-defined square root function calls to a single-iteration Newton approximation (which is obviously terrible and useless) already takes more than twice as long. With a reasonable number of iterations, we're talking ~10x as long. Consider what you are using these rotations for, too: nlerp is already meant to be fast and loose, functions that don't require normalization (like plain old llSetRot/PRIM_ROT) will work even if the results might not be accurate, that probably wasn't the goal in the first place. Functions that do require normalization (llSetKeyframedMotion comes to mind) won't be satisfied with a sloppy approximation, they need the magnitude of the rotation to be close enough to 1. TL;DR stick with multiplying by 1/llSqrt(mag) for accuracy, leave it out entirely for speed if it works. 1 Link to comment Share on other sites More sharing options...
2Tessa Posted March 15 Author Share Posted March 15 As alternatives of: llSqrt( llFrand(1) ) I tried: llPow( e , 0.5 * llLog( llFrand(1) ) ) llPow( 10 , 0.5 * llLog10( llFrand(1) ) ) llPow( llFrand(1) , 0.5 ) but even the single call of them [ llPow(), llLog(), llLog10() ] is 50-100% slower than llSqrt() 😆 Link to comment Share on other sites More sharing options...
elleevelyn Posted March 15 Share Posted March 15 6 hours ago, Frionil Fang said: From a quick test, 100000 square roots takes less than a second; 100000 user-defined square root function calls to a single-iteration Newton approximation (which is obviously terrible and useless) already takes more than twice as long. With a reasonable number of iterations, we're talking ~10x as long. i haven't done any testing, but I think with Newton then is about up to 24 iterations (precision) on a 32-bit float type, breaking early on match thinking about the quake method a bit more then we can get a union of some kind using fui() and uif() from the LSL Combined Library which I am not going to test as would be terribly slower (two calls to llPow) than llSqrt this said, even if we did this then as I haven't tested it (so not sure what the fui() bit arrangement is) then if the bit arrangement is something other than a C-like union then the magic number would be something other than quake which would be discoverable but yeah so agree with your TLDR. Use llSqrt 1 Link to comment Share on other sites More sharing options...
2Tessa Posted March 16 Author Share Posted March 16 (edited) Very interesting to know that float can be stored into integer 😀 //================================================================= // Float Union Integer (see LSL wiki) integer fui(float input) { if((input) != 0.0){ integer sign = (input < 0) << 31; if((input = llFabs(input)) < 2.3509887016445750159374730744445e-38) return sign | (integer)(input / 1.4012984643248170709237295832899e-45); integer exp = llFloor((llLog(input) / 0.69314718055994530941723212145818)); return (0x7FFFFF & (integer)(input * (0x1000000 >> sign))) | (((exp + 126 + (sign = ((integer)input - (3 <= (input /= (float)("0x1p"+(string)(exp -= ((exp >> 31) | 1)))))))) << 23 ) | sign); } return ((string)input == (string)(-0.0)) << 31; } float iuf(integer input) { return llPow(2.0, (input | !input) - 150) * (((!!(input = (0xff & (input >> 23)))) << 23) | ((input & 0x7fffff))) * (1 | (input >> 31)); } //----------------------------------------------------------------- float e = 2.71828182845904523536028747135266249; integer inte; float floe; default { touch_start(integer tot_num) { llOwnerSay("Float Union Integer (float stored into integer and way back)"); llOwnerSay("e: "+Float2Sci(e)); // does not truncate the decimals inte = fui(e); llOwnerSay("inte = fui(e): "+(string)inte); floe = iuf(inte); llOwnerSay("floe = iuf(inte): "+Float2Sci(floe)); // does not truncate the decimals llOwnerSay("floe = iuf(inte): "+(string)floe+" (typecasted)"); } } Edited March 16 by 2Tessa Link to comment Share on other sites More sharing options...
2Tessa Posted March 16 Author Share Posted March 16 (edited) int_value = fui(fl_value); int_value += 127 << 23; int_value = int_value >> 1; fl_result = iuf(int_value); this simple code works as SQRT for large float values: llSqrt( 3.40282346e38 ) gives: 1.84467429e19 bitwiseSqrt( 3.40282346e38 ) gives: -1.84467429e19 only the sign is wrong but lose precision for small values: llSqrt( 2 ) gives: 1.41421353 bitwiseSqrt( 2 ) gives: 1.5 😆 Edited March 16 by 2Tessa Link to comment Share on other sites More sharing options...
Recommended Posts
Please sign in to comment
You will be able to leave a comment after signing in
Sign In Now