I want advice on this code, which is still incomplete, when it comes to unrolling and designing it for max speed.
I want to pass in general regs before this, as pointers, to change the rotation axis.
Given the ALU units and clock cycles, where should I place the SS opcodes that start the minus/plus ops after the mulps operation they depend on? If it doesn't slow things down, I want to interleave the operations,
maybe use addps instead of addss, and use these ops for translation as well.
??? = represents an unused 32-bit float
Rotations in the third dimension simply involve rotating on all 3 planes.
To rotate a point (X,Y,Z) around the point (0,0,0) you would use this
algorithm:
1st, rotate on the X axis
Yt = Y*COS(Xan) - Z*SIN(Xan)
Zt = Y*SIN(Xan) + Z*COS(Xan)
Y = Yt -- Note that you must not alter the coordinates
Z = Zt -- until both transforms are performed
movaps XMM0,XYZW
movups XMM4,XYZW+24
movaps XMM5,XYZW+48
movups XMM6,XYZW+24*3
movaps XMM7,XYZW+96
movaps XMM1,xscale1cossin??? ;the X lane is free, so why not use it for a scale factor?
movaps XMM2,xscale2sincos???
mulps XMM0,XMM1
movaps ???ycosminuszsin???,XMM0
mulps XMM4,XMM1
movaps ???ycosminuszsin???+16,XMM4
mulps XMM5,XMM1
movaps XMM0,XYZW ;reload: how many clock cycles after the store?
movaps ???ycosminuszsin???+32,XMM5
mulps XMM6,XMM1
movaps ???ycosminuszsin???+48,XMM6
mulps XMM7,XMM1
movaps ???ycosminuszsin???+64,XMM7
mulps XMM2,XMM0
movaps ???ycosminuszsin???,XMM0
movaps ???ysinpluszcos???,XMM2
;the movss/subss/movss and movss/addss/movss section
;comes after this
movss X,XMM1
movss XMM3,Ycos ;if I pass a reg, for example eax, into this proc
subss XMM3,zsin ;I could change which axis I rotate around
movss Y,XMM3
movss XMM4,ysin
addss XMM4,zcos
movss Z,XMM4
data XYZWUV ;24 bytes with UV coordinates
data coscossindummy ;dummy lane is unused in the rotate, can be used for a scale variable
data sinsincosdummy
data xcosycosminuszsindummy
data xcosycosminuszsindummy
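A scalar reference version is handy for checking the output of the SSE code above. The following is only a sketch in C: the `VertexXYZWUV` struct and the function name are made up for illustration, the 24-byte layout is taken from the `XYZWUV` comment, and `c` and `s` stand for cos(Xan) and sin(Xan) computed once by the caller.

```c
#include <stddef.h>

/* Hypothetical 24-byte vertex matching the XYZWUV layout (6 floats). */
typedef struct { float x, y, z, w, u, v; } VertexXYZWUV;

/* Rotate n vertices around the X axis in place.
   c = cos(Xan), s = sin(Xan), both precomputed by the caller. */
void rotate_x_array(VertexXYZWUV *verts, size_t n, float c, float s)
{
    for (size_t i = 0; i < n; ++i) {
        /* Form both results first, then write them back, so the old Y
           is still available when Zt is computed. */
        float yt = verts[i].y * c - verts[i].z * s;
        float zt = verts[i].y * s + verts[i].z * c;
        verts[i].y = yt;
        verts[i].z = zt;
        /* x, w, u and v are left untouched by an X-axis rotation. */
    }
}
```

Running the same input through this loop and through the unrolled SSE version should give matching Y and Z values, up to float rounding.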
Next, rotate on the Y axis
Xt = X*COS(Yan) - Z*SIN(Yan)
Zt = X*SIN(Yan) + Z*COS(Yan)
X = Xt
Z = Zt
And finally, the Z axis
Xt = X*COS(Zan) - Y*SIN(Zan)
Yt = X*SIN(Zan) + Y*COS(Zan)
X = Xt
Y = Yt
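Put together, the three axis steps can be sketched in C roughly like this. The `Vec3` type and the function name are illustrative assumptions, and the cos/sin pairs are passed in precomputed rather than calculated inside the routine.

```c
typedef struct { float x, y, z; } Vec3;

/* Rotate p around X, then Y, then Z (4 muls per axis, 12 in total).
   cx/sx = cos/sin of Xan, cy/sy of Yan, cz/sz of Zan. */
Vec3 rotate_xyz(Vec3 p, float cx, float sx, float cy, float sy,
                float cz, float sz)
{
    float t;

    /* X axis: t holds Yt so the old Y is still there for Zt. */
    t   = p.y * cx - p.z * sx;
    p.z = p.y * sx + p.z * cx;
    p.y = t;

    /* Y axis */
    t   = p.x * cy - p.z * sy;
    p.z = p.x * sy + p.z * cy;
    p.x = t;

    /* Z axis */
    t   = p.x * cz - p.y * sz;
    p.y = p.x * sz + p.y * cz;
    p.x = t;

    return p;
}
```

One temporary per axis is enough, because the second result of each pair is computed before the first one is written back.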
Before optimizing something, try to find the best algorithm. Yours uses 6 muls for every axis, so 3 axes is 18 muls; in the old days of the scene there was a 6-mul algo for all 3 axes. MMX instructions aren't pairable, but it's always better to keep the code in parts: first read all the data into regs, then do all the muls, and at the end write all the data back.
"Avatar's Guide To 3D-Rotations"
StEP oNE! - 12 muls / rotation
-------------------------------
rotate around z-axis:
x' = x*cos(A) + y*sin(A)
y' = x*sin(A) - y*cos(A)
rotate around y-axis:
x'' = x'*cos(B) + z*sin(B)
z' = x'*sin(B) - z*cos(B)
rotate around x-axis:
y'' = y'*cos(C) + z'*sin(C)
z'' = y'*sin(C) - z'*cos(C)
after this the rotated vector is (x'',y'',z'')
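As a rough C sketch of step one, with the signs kept exactly as the guide writes them (the type and function names are made up here). Note that with these signs each 2x2 step has determinant -1, so it mirrors as well as turns; swap the signs on the second line of each pair if you need a plain rotation.

```c
typedef struct { float x, y, z; } Vec3;

/* Step one: 12 muls per vertex.
   cA/sA = cos/sin of A (z-axis), cB/sB of B (y-axis), cC/sC of C (x-axis). */
Vec3 avatar_step_one(Vec3 p, float cA, float sA, float cB, float sB,
                     float cC, float sC)
{
    float x1 = p.x * cA + p.y * sA;   /* x'  */
    float y1 = p.x * sA - p.y * cA;   /* y'  */
    float x2 = x1 * cB + p.z * sB;    /* x'' */
    float z1 = x1 * sB - p.z * cB;    /* z'  */
    Vec3 r;
    r.x = x2;
    r.y = y1 * cC + z1 * sC;          /* y'' */
    r.z = y1 * sC - z1 * cC;          /* z'' */
    return r;
}
```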
StEP tWO! - 9 muls / rotation + 14 muls init
---------------------------------------------
If we evaluate the rotations from the first step we get
x'' = x * [cos(A)cos(B)] +
+ y * [sin(A)cos(B)] +
+ z * [sin(B)]
y'' = x * [sin(A)cos(C) + cos(A)sin(B)sin(C)] +
+ y * [-cos(A)cos(C) + sin(A)sin(B)sin(C)] +
+ z * [-cos(B)sin(C)]
z'' = x * [sin(A)sin(C) - cos(A)sin(B)cos(C)] +
+ y * [-cos(A)sin(C) - sin(A)sin(B)cos(C)] +
+ z * [cos(B)cos(C)]
consisting of nine constants multiplied by the original x/y/z-coordinates.
We precalculate these constants every time we change an angle
xx = [cos(A)cos(B)]
xy = [sin(A)cos(B)]
xz = [sin(B)]
yx = [sin(A)cos(C) + cos(A)sin(B)sin(C)]
yy = [-cos(A)cos(C) + sin(A)sin(B)sin(C)]
yz = [-cos(B)sin(C)]
zx = [sin(A)sin(C) - cos(A)sin(B)cos(C)]
zy = [-cos(A)sin(C) - sin(A)sin(B)cos(C)]
zz = [cos(B)cos(C)]
and the rotation becomes somewhat easier
x'' = x * xx + y * xy + z * xz
y'' = x * yx + y * yy + z * yz
z'' = x * zx + y * zy + z * zz
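Step two might look like this in C. This is a sketch only: the `Mat9` struct and both function names are assumptions for illustration. The two shared products cos(A)sin(B) and sin(A)sin(B) are computed once, which is how the init comes to 14 muls.

```c
typedef struct { float x, y, z; } Vec3;
typedef struct { float xx, xy, xz, yx, yy, yz, zx, zy, zz; } Mat9;

/* Init: 14 muls, redone only when an angle changes. */
Mat9 avatar_constants(float cA, float sA, float cB, float sB,
                      float cC, float sC)
{
    float cAsB = cA * sB;   /* shared by yx and zx */
    float sAsB = sA * sB;   /* shared by yy and zy */
    Mat9 m;
    m.xx = cA * cB;
    m.xy = sA * cB;
    m.xz = sB;
    m.yx = sA * cC + cAsB * sC;
    m.yy = -cA * cC + sAsB * sC;
    m.yz = -cB * sC;
    m.zx = sA * sC - cAsB * cC;
    m.zy = -cA * sC - sAsB * cC;
    m.zz = cB * cC;
    return m;
}

/* Per vertex: 9 muls. */
Vec3 avatar_step_two(Vec3 p, const Mat9 *m)
{
    Vec3 r;
    r.x = p.x * m->xx + p.y * m->xy + p.z * m->xz;
    r.y = p.x * m->yx + p.y * m->yy + p.z * m->yz;
    r.z = p.x * m->zx + p.y * m->zy + p.z * m->zz;
    return r;
}
```

The three rows line up with the way mulps works on packed floats, so each row of constants could sit in one XMM register.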
StEP tHREE! - 6 muls / rotation + 17 muls init
-----------------------------------------------
In this step we use the fact that
(a+y)(b+x) = ab + ax + by + xy
which we can transform into
ax + by = (a+y)(b+x) - (ab + xy), and
ax + by + cz = (a+y)(b+x) + cz - (ab + xy)
Doing that for each of the rotations x',y',z' gives us
x' = (xx + y)(xy + x) + z*xz - (xx*xy + x*y)
y' = (yx + y)(yy + x) + z*yz - (yx*yy + x*y)
z' = (zx + y)(zy + x) + z*zz - (zx*zy + x*y)
If the object is kept intact then x*y is constant and can be
precalced. Add this to the init and precalculate x_y = x*y
for each vertex
xx_xy = xx*xy
yx_yy = yx*yy
zx_zy = zx*zy
The rotation then becomes
x' = (xx + y)(xy + x) + z*xz - (xx_xy + x_y)
y' = (yx + y)(yy + x) + z*yz - (yx_yy + x_y)
z' = (zx + y)(zy + x) + z*zz - (zx_zy + x_y)
^^precalced^^
This leaves us with 6 muls per rotation, quite an improvement
compared to the initial 12.
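A C sketch of the 6-mul form follows, assuming the nine constants have already been filled in. The `RotConsts` struct and the function names are hypothetical. In floating point the rearranged expression rounds differently from the plain 9-mul sum, so expect small differences in the last bits rather than identical results.

```c
typedef struct { float x, y, z; } Vec3;

/* Hypothetical container: the nine constants plus the three products
   that are precalculated whenever an angle changes. */
typedef struct {
    float xx, xy, xz, yx, yy, yz, zx, zy, zz;
    float xx_xy, yx_yy, zx_zy;
} RotConsts;

/* 3 extra muls in the init (14 + 3 = 17). */
void precalc_products(RotConsts *m)
{
    m->xx_xy = m->xx * m->xy;
    m->yx_yy = m->yx * m->yy;
    m->zx_zy = m->zx * m->zy;
}

/* x_y is x*y for this vertex, precalculated once while the object
   stays intact. Each line costs 2 muls, so 6 muls per vertex. */
Vec3 rotate_6mul(Vec3 p, float x_y, const RotConsts *m)
{
    Vec3 r;
    r.x = (m->xx + p.y) * (m->xy + p.x) + p.z * m->xz - (m->xx_xy + x_y);
    r.y = (m->yx + p.y) * (m->yy + p.x) + p.z * m->yz - (m->yx_yy + x_y);
    r.z = (m->zx + p.y) * (m->zy + p.x) + p.z * m->zz - (m->zx_zy + x_y);
    return r;
}
```

The price of saving 3 muls is several extra adds per vertex plus one stored float (x_y) per vertex, so it is worth timing both versions on the target CPU.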
Now go out and abuse this stuff!!
Avatar 1995
In a 2D plane you can rotate a point by multiplying it by z, where z is e^(Ai),
or simply by adding A to its angle in polar form. Dunno how polar coordinates work out in 3D, but maybe that's an option.
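For the 2D case, multiplying by e^(Ai) = cos(A) + i*sin(A) expands to the usual pair of rotation formulas. A small C sketch, with a hand-rolled complex type and made-up names, and with cos(A) and sin(A) passed in precomputed:

```c
typedef struct { float re, im; } Complex2;

/* (a.re + i*a.im) * (b.re + i*b.im) */
Complex2 complex_mul(Complex2 a, Complex2 b)
{
    Complex2 r;
    r.re = a.re * b.re - a.im * b.im;
    r.im = a.re * b.im + a.im * b.re;
    return r;
}

/* Rotate the point (p.re, p.im) by angle A, given c = cos(A), s = sin(A).
   This is x' = x*c - y*s, y' = x*s + y*c written as a complex product. */
Complex2 rotate_2d(Complex2 p, float c, float s)
{
    Complex2 z = { c, s };   /* z = e^(Ai) */
    return complex_mul(p, z);
}
```

It still costs 4 muls per point, the same as the plain 2D rotation, so the gain is in notation rather than speed.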
thanks
OK, I want a general rotation too, but I also want to create a specialized one for making turns out of straight meshes, which means each tile gets moved/rotated by a small angle relative to the previous tile.
Also, the data could be organized the same way a 3D .obj file is, with a section of vertices only (xyzxyzxyzxyz), and only the final output going to the custom vertex format xyzwvu. That way I rotate 3*4 floats (4 vertices) = a quad at a time.
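For the turn case, one possible approach (an assumption, not something settled in this thread) is to step the cos/sin pair from tile to tile with the angle-addition formulas, so no trig call is needed per tile. The function and parameter names below are hypothetical; `cd` and `sd` are the cos and sin of the small per-tile angle.

```c
#include <stddef.h>

/* Fill cos_out[i], sin_out[i] with cos(i*d), sin(i*d) for n tiles,
   given cd = cos(d) and sd = sin(d) for the per-tile angle d.
   Uses cos(a+d) = cos(a)cos(d) - sin(a)sin(d)
   and  sin(a+d) = sin(a)cos(d) + cos(a)sin(d). */
void tile_rotations(float cd, float sd, float *cos_out, float *sin_out,
                    size_t n)
{
    float c = 1.0f, s = 0.0f;           /* tile 0 is not rotated */
    for (size_t i = 0; i < n; ++i) {
        cos_out[i] = c;
        sin_out[i] = s;
        float cn = c * cd - s * sd;     /* advance by one tile angle */
        s = s * cd + c * sd;
        c = cn;
    }
}
```

Rounding error builds up over a long run of tiles, so for long turns it may be safer to recompute the pair from the real angle every so often.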