
Board index » FANN » Enhancing the C library




Author Message
 Post subject: Get OpenCL enhanced FANN here!
 Post Posted: Wed Jun 02, 2010 11:36 pm 
Site Admin

Joined: Tue Mar 06, 2007 10:03 pm
Posts: 134
Location: Corn Desert (IL, USA)
I've added three new functions in an OpenCL-optimized version of the FANN library:
fann_run_many()
- This runs an ANN on thousands of inputs, resulting in an array of thousands of output values.

fann_train_on_data_cl()
- This trains an ANN on thousands of examples. When run on an appropriately fast GPU it should result in a major speedup. On a slow GPU like my 9500 GT, it runs at CPU speed.

fann_train_epoch_cl()
- One epoch of the above function. Does basically the same thing, but only once instead of iterating until the ANN is fully trained.

All of the above functions are designed to work with multiple ANNs simultaneously, but that path hasn't been tested, so I assume it doesn't quite work yet.

I haven't gone in and cleaned or documented the functions well, but if you're interested start with fann/src/optimized/opencl/fann.c. I need to put down FANN for at least a few months and work on another pressing project. I just wanted to get it out for other people to use now. Feel free to build on my work, make it more user-friendly, do more error checking, clean the build process, etc. and post the new code back here.

I apologize, but I don't have much free time for assistance, so expect to be on your own beyond basic advice.

Re: speed
On my current GPU (GeForce 9500 GT), I'm getting roughly the same speed between the normal and OpenCL versions. I currently have a GTX 285 on order, and it should be at least 10x faster. With a modern GPU, such as the GTX 480, I expect it to be at least 20x faster than my 2.26GHz Nehalem Mac Pro. If you are able to benchmark, please post your results. Please try using an ANN with more neurons per layer than the xor example, though.

Here is some code to test fann_run_many() to get you started in your comparisons:
Code:
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "fann.h"

void fann_run_many(struct fann **anns, fann_type *input,
                   fann_type **output, int num_anns, int num_runs)
{
    int ann_num, i;

    printf("Running Scalar!\n");

    for(ann_num = 0; ann_num < num_anns; ++ann_num) {
        unsigned int num_outputs, num_inputs;
        struct fann *ann = anns[ann_num];

        num_inputs = ann->num_input;
        num_outputs = ann->num_output;

        for(i = 0; i < num_runs; ++i)
            memcpy(&(output[ann_num][num_outputs*i]),
                   fann_run(ann, &input[num_inputs*i]),
                   sizeof(fann_type)*num_outputs);
    }
}

int main(int argc, char *argv[])
{
    fann_type *input, *output;
    int i, j;
//    int num_runs = 1000000;
    int num_runs = 6;

    // Note: fopen() does not expand '~', so you may need a full path here
    struct fann *ann = fann_create_from_file("~/fann/tests/xor_float.net");
    assert(ann != NULL);

    // Use this to make sure we're linking to the right header
    int chk = ann->first_layer->num_inputs;
    (void)chk;

    // Get net params
    int num_in = fann_get_num_input(ann);
    int num_out = fann_get_num_output(ann);

    printf("Inputs:%5d Outputs:%5d Total:%5d\n",
           ann->num_input, ann->num_output, ann->num_neurons);

    input = (fann_type*)calloc(num_runs*num_in, sizeof(fann_type));
    output = (fann_type*)calloc(num_runs*num_out, sizeof(fann_type));

    // Make a gamut of input values in [-1, 1); fmodf() needs -lm
    for(i = 0; i < num_runs*num_in; ++i) {
        float dig_frac = ((float)(i % num_in)+1.0)/((float)num_in);
        float tot_frac = ((float)i)/((float)num_runs*num_in);
        input[i] = fmodf(tot_frac, dig_frac)*2.0/dig_frac-1.0;
    }

    fann_run_many(&ann, input, &output, 1, num_runs);

    //* // Use for output comparisons
    for(i = 0; i < num_runs; ++i) {
        for(j = 0; j < num_in; ++j) {
            printf("%9f ", input[i*num_in+j]);
        }
        printf("->");
        for(j = 0; j < num_out; ++j) {
            printf(" %9f", output[i*num_out+j]);
        }
        printf("\n");
    }//*/

    free(input);
    free(output);
    fann_destroy(ann);
    return 0;
}


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Wed Jun 02, 2010 11:41 pm 
Site Admin

Joined: Tue Mar 06, 2007 10:03 pm
Posts: 134
Location: Corn Desert (IL, USA)
Here's the package:
http://pricepages.org/temp/fann_opencl.tar.gz


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Mon Jun 28, 2010 9:52 am 

Joined: Fri Mar 13, 2009 1:16 am
Posts: 5
Were you able to benchmark it with the new video card? :)


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Mon Jun 28, 2010 4:02 pm 
Site Admin

Joined: Tue Mar 06, 2007 10:03 pm
Posts: 134
Location: Corn Desert (IL, USA)
Yep, the new card (GTX 285) runs the kernel about 20x faster. I didn't look at the overall speed, but that's pretty amazing. :)


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Tue Jun 29, 2010 11:06 pm 

Joined: Mon Jun 29, 2009 5:31 pm
Posts: 15
Really nice! I'll test on my 9500 GT (finally I have an excuse to buy a more powerful GPU).


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Wed Jun 30, 2010 7:09 pm 

Joined: Fri Mar 13, 2009 1:16 am
Posts: 5
kknd wrote:
Really nice! I'll test on my 9500 GT (finally I have an excuse to buy a more powerful GPU).

There is one other problem with GPUs besides speed: memory transfers. You *SHOULD* keep the number of transfers between host memory and the GPU to an absolute minimum. If you don't, you'll lose more time moving data than actually training, so the CPU can even end up faster than the GPU if you are not careful.


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Wed Jun 30, 2010 11:13 pm 
Site Admin

Joined: Tue Mar 06, 2007 10:03 pm
Posts: 134
Location: Corn Desert (IL, USA)
Yep, it works best for running an ANN on thousands of inputs at once, so it's one batch transfer. The ANN training routine should be fairly efficient because most of the data stays on the card, but there are a number of small transfers after each epoch to update the weights, etc.


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Wed Apr 06, 2011 6:05 am 

Joined: Sun Feb 27, 2011 12:28 pm
Posts: 5
Has anybody who is using it made a workspace for VC++?


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Mon Jun 06, 2011 9:07 am 

Joined: Fri Jun 03, 2011 6:04 pm
Posts: 3
I have downloaded the code, but I don't understand how to compile the optimized version.
I tried with Visual Studio, and with Cygwin I tried to build static .lib and DLL libraries, but I can't get the library to build.
I'm probably missing something, because I can only compile the standard FANN library without the optimizations.
Could you give me some advice on compiling the SSE- or OpenCL-optimized library? Thanks.


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Fri Jun 17, 2011 10:55 am 

Joined: Fri Jun 17, 2011 9:52 am
Posts: 5
Location: Khanty-Mansi Autonomous Okrug, Russia
Here is a little example for Microsoft(R) Visual Studio 2010. You need to customize the include and lib paths for your OpenCL software in the libfann project properties. I use the Intel(R) OpenCL SDK 1.1, so I add "$(INTELOCLSDKROOT)\include" to the include path and "$(INTELOCLSDKROOT)\lib\x86" plus "OpenCL.lib" to the linker settings.
Set the project "tal" as the startup project and run.


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Mon Jun 20, 2011 10:05 am 

Joined: Fri Jun 17, 2011 9:52 am
Posts: 5
Location: Khanty-Mansi Autonomous Okrug, Russia
There was a mistake in the previous version. Here is a new version. Sorry :)


Attachments:
Projects.rar [131.51 KiB]
Downloaded 571 times
 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Tue Jun 21, 2011 9:21 pm 

Joined: Tue Jun 21, 2011 9:18 pm
Posts: 1
Hi,
I'm trying to use Tal's version, but it fails at line 671:
Code:
    program = clCreateProgramWithSource(context, 1, (const char**)&fin_program_src,
                                        NULL, &err);
    assert(err == CL_SUCCESS);


I am using the latest CUDA/OpenCL from NVIDIA.

thanks,
shul


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Mon Jun 27, 2011 8:30 am 

Joined: Sun Jan 23, 2011 3:07 pm
Posts: 2
Tal's version is bugged, IMHO; it tries to load the CPU as an OpenCL device.


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Wed Jun 29, 2011 3:22 am 

Joined: Fri Jun 17, 2011 9:52 am
Posts: 5
Location: Khanty-Mansi Autonomous Okrug, Russia
I wrote that I use the Intel OpenCL SDK. If you also have an Intel processor, you can easily check it: http://software.intel.com/en-us/articles/download-intel-opencl-sdk/


 Post subject: Re: Get OpenCL enhanced FANN here!
 Post Posted: Fri Jul 08, 2011 7:23 am 

Joined: Fri Jun 17, 2011 9:52 am
Posts: 5
Location: Khanty-Mansi Autonomous Okrug, Russia
Previously, I did not mention that I was working with a beta version of the Intel SDK, and everything was fine then. Yesterday I updated the SDK to the latest version, and now I get an error at line 673 in fann_cl_kernel.c. I checked the generated code (in the variable fin_program_src) with the Intel OpenCL Offline Compiler. The problem comes when backpropagate_MSE is called; if that call is removed, the code builds successfully. I will try to ask a question at Intel.

