All times are UTC - 5 hours




 Post subject: Combined sin & cos for math library (FSINCOS by x87)
PostPosted: Sat Dec 06, 2008 3:06 am 

Joined: Wed Aug 06, 2008 7:53 pm
Posts: 182
Location: Russia
Every time we calculate the coordinates of a point on an elliptical curve, build a rotation matrix, and so on, we call both sin( alpha ) and cos( alpha ). Yet the x87 FPU (the math coprocessor of x86 CPUs) has provided the FSINCOS instruction since the 80387. It computes the sine and the cosine of the angle in ST(0), given in radians, in a single instruction. FSINCOS works on the FPU's internal extended-precision format, so a float argument is widened on load and the results are rounded back to single precision on store. It is slower than computing just a sine or just a cosine, but faster than computing both one after the other. I believe the absence of a standard C library function for a combined sine-cosine is merely an oversight. Let's fix this.
#include <math.h>

#define IN   // documentation-only parameter markers
#define OUT

#define GAME_ENGINE_USE_ASM_X86

//------------------------------------------------------------------------------
// Combined 'sin & cos' is going to be faster than two separate calls.
//------------------------------------------------------------------------------
inline void sin_cos( IN const float radians, OUT float & sin_, OUT float & cos_ )
{
#if defined( GAME_ENGINE_USE_ASM_X86 ) && 1
    __asm
    {
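        fld     radians             // ST(0) = radians (float widened on load)
        fsincos                     // ST(0) = cos, ST(1) = sin
        mov     eax, sin_           // eax = address behind the sin_ reference
        mov     ecx, cos_           // ecx = address behind the cos_ reference
        fstp    dword ptr [ecx]     // cos_ = ST(0) as a 32-bit float, pop
        fstp    dword ptr [eax]     // sin_ = new ST(0) as a 32-bit float, pop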
    };
#else // !GAME_ENGINE_USE_ASM_X86
    sin_ = ::sinf( radians );
    cos_ = ::cosf( radians );
#endif // GAME_ENGINE_USE_ASM_X86
}





An example of usage:

void calculate_points( const float radius, const float step = 0.01f ) {

    float angle, s_i_n, c_o_s;

    m_points.clear();

    for ( angle = 0.0f; angle < FLOAT_2_PI; angle += step ) {

        sin_cos( angle, s_i_n, c_o_s );

        m_points.push_back( Vector2( radius * c_o_s, radius * s_i_n ) );
    }
}
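
The snippet above is a fragment of a class, so it does not compile on its own. Here is a minimal self-contained sketch of the same idea. The Vector2 struct and the circle_points() name are hypothetical stand-ins of mine, not engine code, and plain library calls stand in for the FSINCOS branch. I also derive the angle from an integer index instead of repeating angle += step, which is only my suggestion for keeping float rounding error from accumulating.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the engine's Vector2 type.
struct Vector2 {
    float x, y;
    Vector2(float x_, float y_) : x(x_), y(y_) {}
};

// Portable stand-in for sin_cos(); swap in the FSINCOS version where available.
inline void sin_cos(const float radians, float& sin_, float& cos_) {
    sin_ = std::sin(radians);
    cos_ = std::cos(radians);
}

// Returns `count` points evenly spaced on a circle of the given radius.
// The angle is computed from the loop index, so rounding error does not
// build up from one iteration to the next.
std::vector<Vector2> circle_points(const float radius, const std::size_t count) {
    const float two_pi = 6.28318530718f;
    std::vector<Vector2> points;
    points.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const float angle =
            two_pi * static_cast<float>(i) / static_cast<float>(count);
        float s, c;
        sin_cos(angle, s, c);
        points.push_back(Vector2(radius * c, radius * s));
    }
    return points;
}
```

For example, circle_points( 2.0f, 8 ) yields eight points, starting at ( 2, 0 ) and going counter-clockwise.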


Last edited by BugHunter on Mon Dec 08, 2008 12:19 pm, edited 4 times in total.


 
 Post subject:
PostPosted: Sat Dec 06, 2008 6:28 am 

Joined: Sat Jun 23, 2007 7:56 pm
Posts: 145
nice 8)


 
 Post subject: A quick test of sin_cos() VS sin(), cos()
PostPosted: Mon Dec 08, 2008 12:24 pm 

Joined: Wed Aug 06, 2008 7:53 pm
Posts: 182
Location: Russia
A quick test shows that the combined sin_cos() is almost twice as fast as two separate calls to sinf() and cosf():
Quote:
=---------------------------------------------------------------=
A comparison of sin_cos() against separate calls of sin(), cos().
=---------------------------------------------------------------=
Total iterations per each pass: 5000000

TEST sin_cos()
pass 1: 0.500000 seconds
pass 2: 0.485000 seconds
pass 3: 0.531000 seconds
pass 4: 0.469000 seconds
pass 5: 0.500000 seconds
RESULT: 0.497000 seconds

TEST sin(), cos()
pass 1: 0.984000 seconds
pass 2: 0.984000 seconds
pass 3: 0.953000 seconds
pass 4: 0.938000 seconds
pass 5: 0.953000 seconds
RESULT: 0.962400 seconds

RATIO: sin_cos() 1.936419 times faster.





Complete benchmark code:
#include <stdio.h>
#include <math.h>
#include <boost/timer.hpp>

#define IN
#define OUT

#define GAME_ENGINE_USE_ASM_X86

//------------------------------------------------------------------------------
// Combined 'sin & cos' is going to be faster than two separate calls.
//------------------------------------------------------------------------------
inline void sin_cos( IN const float radians, OUT float & sin_, OUT float & cos_ )
{
#if defined( GAME_ENGINE_USE_ASM_X86 ) && 1
    __asm
    {
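        fld     radians             // ST(0) = radians (float widened on load)
        fsincos                     // ST(0) = cos, ST(1) = sin
        mov     eax, sin_           // eax = address behind the sin_ reference
        mov     ecx, cos_           // ecx = address behind the cos_ reference
        fstp    dword ptr [ecx]     // cos_ = ST(0) as a 32-bit float, pop
        fstp    dword ptr [eax]     // sin_ = new ST(0) as a 32-bit float, pop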
    };
#else // !GAME_ENGINE_USE_ASM_X86
    sin_ = ::sinf( radians );
    cos_ = ::cosf( radians );
#endif // GAME_ENGINE_USE_ASM_X86
}


//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
float sin_, cos_; // globals, so that the compiler cannot optimize the calls away

const size_t PASSESS = 5;
const size_t ITERATIONS = 5 * 1000 * 1000;

//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
void test_combined_sin_cos( float angle )
{
    sin_cos( angle, sin_, cos_ );
}

void test_separate_sin_cos( float angle )
{
    sin_ = sinf( angle );
    cos_ = cosf( angle );
}

typedef void (* test_fun)( float );

//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
double estimate( test_fun fun )
{
    double milestones[ PASSESS ];
    double result;
       
    boost::timer tm;

    size_t t, n;

    for (n = 0; n < PASSESS; ++n )
    {
        printf("pass %u: ", (unsigned)( n + 1 ) ); // %u expects unsigned, not size_t
        tm.restart();
        for (t = 0; t < ITERATIONS; ++t)
        {
            fun( float(t) );
        }
        milestones[ n ] = tm.elapsed();
        printf("%f seconds\n", milestones[ n ]);
    }
    result = 0.0;
    for (n = 0; n < PASSESS; ++n )
    {
        result += milestones[n];
    }
    result /= PASSESS;
    printf("RESULT: %f seconds\n", result);
    return result;
}

//------------------------------------------------------------------------------
//
//------------------------------------------------------------------------------
int main(int argc, char* argv[])
{
    printf("=---------------------------------------------------------------=\n");
    printf("A comparison of sin_cos() against separate calls of sin(), cos().\n");
    printf("=---------------------------------------------------------------=\n");

    printf("Total iterations per each pass: %u\n\n", (unsigned)ITERATIONS); // %u expects unsigned
    printf("TEST sin_cos()\n");
    double r1 = estimate( test_combined_sin_cos );
    printf("\n");
    printf("TEST sin(), cos()\n");
    double r2 = estimate( test_separate_sin_cos );
    printf("\n");
    printf("RATIO: sin_cos() %f times faster." "\n\n", r2 / r1);
   
    return 0;
}
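
If Boost is not at hand, the timing loop can be sketched with std::chrono instead. This is only an illustration under my own assumptions: it needs C++11, the function names time_pass() and average_time() are made up, and steady_clock measures wall-clock time while the old boost::timer class is built on std::clock(), so the absolute numbers are not directly comparable with the ones above.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>

// Volatile globals keep the compiler from discarding the computed values.
volatile float g_sin = 0.0f;
volatile float g_cos = 0.0f;

typedef void (*test_fun)(float);

// Same workload as test_separate_sin_cos() in the benchmark.
void test_separate_sin_cos(float angle) {
    g_sin = std::sin(angle);
    g_cos = std::cos(angle);
}

// Calls `fun` `iterations` times and returns the elapsed wall-clock seconds.
double time_pass(test_fun fun, std::size_t iterations) {
    typedef std::chrono::steady_clock clock;
    const clock::time_point start = clock::now();
    for (std::size_t t = 0; t < iterations; ++t) {
        fun(static_cast<float>(t));
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count();
}

// Averages `passes` timed passes, like estimate() but without the printing.
double average_time(test_fun fun, std::size_t passes, std::size_t iterations) {
    double total = 0.0;
    for (std::size_t n = 0; n < passes; ++n) {
        total += time_pass(fun, iterations);
    }
    return passes ? total / static_cast<double>(passes) : 0.0;
}
```

Calling average_time( test_separate_sin_cos, 5, 5000000 ) would correspond to one TEST section of the output.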


 
 Post subject: sin_cos() with AT&T syntax (for GCC & GAS)
PostPosted: Sun Dec 21, 2008 12:53 pm 

Joined: Wed Aug 06, 2008 7:53 pm
Posts: 182
Location: Russia
I have updated the code so that it also builds with GCC, which hands its inline assembly to GAS (the GNU Assembler). By default GAS uses AT&T syntax rather than Intel syntax.


First, a code snippet from the project setup file ( mine is called “GeProjectSetup.h” ) which selects the code branch:
//
// Enable/disable built-in assembler.
//
#define GAME_ENGINE_USE_ASM_X86

//
// Determine x86 assembler syntax: "Intel" or "AT&T"
//
#if defined( GAME_ENGINE_USE_ASM_X86 )
#   if defined( _MSC_VER ) || defined( __BORLANDC__ ) // || defined( __INTEL_COMPILER )
#       define GAME_ENGINE_USE_ASM_X86_INTEL
#   elif defined( __GNUC__ )
#       define GAME_ENGINE_USE_ASM_X86_ATNT
#   endif
#endif
//
// From here on, check GAME_ENGINE_USE_ASM_X86_INTEL or
// GAME_ENGINE_USE_ASM_X86_ATNT instead of the generic switch.
//
#undef GAME_ENGINE_USE_ASM_X86



Second, the part of the library file ( GeUtilites.h in my case ) that contains the enhanced sin_cos():
//------------------------------------------------------------------------------
// Combined 'sin & cos' is going to be faster than two separate calls.
//------------------------------------------------------------------------------
inline void sin_cos( IN const float radians, OUT float & sin_, OUT float & cos_ )
{
#if defined( GAME_ENGINE_USE_ASM_X86_INTEL )

    __asm
    {
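        fld     radians             // ST(0) = radians (float widened on load)
        fsincos                     // ST(0) = cos, ST(1) = sin
        mov     eax, sin_           // eax = address behind the sin_ reference
        mov     ecx, cos_           // ecx = address behind the cos_ reference
        fstp    dword ptr [ecx]     // cos_ = ST(0) as a 32-bit float, pop
        fstp    dword ptr [eax]     // sin_ = new ST(0) as a 32-bit float, pop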
    };

#elif defined( GAME_ENGINE_USE_ASM_X86_ATNT )

    __asm__
    (
        "fsincos"       :   // opcode
        "=t" ( cos_ )   ,   // output: ST(0) to cos_
        "=u" ( sin_ )   :   // output: ST(1) to sin_
        "0" ( radians )     // input:  radians to ST(0)
    );

#else // !GAME_ENGINE_USE_ASM_X86

    sin_ = ::sinf( radians );
    cos_ = ::cosf( radians );

#endif // GAME_ENGINE_USE_ASM_X86
}


With this change the code also compiles with GCC, for instance in the free Code::Blocks IDE with the MinGW GCC compiler selected.
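
Since the benchmark only measures speed, it may be worth checking that the assembler branch agrees with the library functions. A rough sketch follows. The max_abs_error() helper is hypothetical, and the architecture guard around the asm statement is my addition so that the snippet still builds on non-x86 targets. Keep in mind that FSINCOS only accepts arguments with a magnitude below 2^63; outside that range it leaves ST(0) unchanged and sets the C2 status flag, which this sketch does not handle.

```cpp
#include <algorithm>
#include <cmath>

// Combined sine/cosine: x87 FSINCOS on GCC-compatible x86 compilers,
// plain library calls everywhere else.
inline void sin_cos(const float radians, float& sin_, float& cos_) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("fsincos" : "=t"(cos_), "=u"(sin_) : "0"(radians));
#else
    sin_ = std::sin(radians);
    cos_ = std::cos(radians);
#endif
}

// Largest absolute difference between sin_cos() and the library functions
// over `samples` angles evenly spaced in [-limit, +limit].
float max_abs_error(const float limit, const int samples) {
    if (samples < 2) {
        return 0.0f;  // not enough samples to span the interval
    }
    float worst = 0.0f;
    for (int i = 0; i < samples; ++i) {
        const float angle = -limit + 2.0f * limit * static_cast<float>(i)
                                     / static_cast<float>(samples - 1);
        float s, c;
        sin_cos(angle, s, c);
        worst = std::max(worst, std::fabs(s - std::sin(angle)));
        worst = std::max(worst, std::fabs(c - std::cos(angle)));
    }
    return worst;
}
```

For angles within a few turns of zero I would expect the difference to stay around one float rounding step, but that is an expectation, not something I have measured here.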

