| Author |
Message |
|
mailseth
|
Post subject: Get OpenCL enhanced FANN here! Posted: Wed Jun 02, 2010 11:36 pm |
|
 |
| Site Admin |
 |
Joined: Tue Mar 06, 2007 10:03 pm Posts: 134 Location: Corn Desert (IL, USA)
|
I've added three new functions in an OpenCL-optimized version of the FANN library:

fann_run_many() - Runs an ANN on thousands of inputs, producing an array of thousands of output values.
fann_train_on_data_cl() - Trains an ANN on thousands of examples. When run on an appropriately fast GPU it should result in a major speedup. On a slow GPU like my 9500 GT, it runs at CPU speed.
fann_train_epoch_cl() - One epoch of the above function. Does basically the same thing, but only once instead of iterating until the ANN is fully trained.

All of the above functions are designed to work with multiple ANNs simultaneously, but that hasn't been tested, so I assume the functionality doesn't quite work. I haven't gone in and cleaned up or documented the functions well, but if you're interested, start with fann/src/optimized/opencl/fann.c.

I need to put down FANN for at least a few months and work on another pressing project. I just wanted to get it out for other people to use now. Feel free to build on my work, make it more user-friendly, do more error checking, clean up the build process, etc. and post the new code back here. I apologize, but I don't have much free time for any assistance, so expect to be on your own besides basic advice.

Re: speed
On my current GPU (GeForce 9500 GT), I'm getting roughly the same speed from the normal and OpenCL versions. I currently have a GTX 285 on order, and it should be at least 10x faster. With a modern GPU, such as the GTX 480, I expect it to be at least 20x faster than my 2.26GHz Nehalem Mac Pro. If you are able to benchmark, please post your results. Please try using an ANN with more neurons per layer than the xor example, though.
Here is some code to test fann_run_many() to get you started in your comparisons:
Code:
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "fann.h" /* assumed header name; use the one from the optimized tree */

/* Scalar reference version: runs every ANN on every input row with fann_run(). */
void fann_run_many(struct fann **anns, fann_type * input, fann_type **output, int num_anns, int num_runs)
{
    unsigned int ann_num, i;
    printf("Running Scalar!\n");

    for(ann_num = 0; ann_num < num_anns; ++ann_num) {
        unsigned int num_outputs, num_inputs;
        struct fann *ann = anns[ann_num];

        num_inputs = ann->num_input;
        num_outputs = ann->num_output;

        /* Row i of the flat input array maps to row i of this ANN's output array. */
        for(i=0; i<num_runs; ++i)
            memcpy(&(output[ann_num][num_outputs*i]),
                   fann_run(ann, &input[num_inputs*i]),
                   sizeof(fann_type)*num_outputs);
    }
}

int main(int argc, char *argv[])
{
    fann_type *input, *output;
    int i, j;
    // int num_runs = 1000000;
    int num_runs = 6;

    // Note: "~" is expanded by the shell, not by the C library;
    // use a full path if the net fails to load.
    struct fann *ann = fann_create_from_file("~/fann/tests/xor_float.net");
    assert(ann != NULL);

    // Use this to make sure we're linking to the right header
    int chk = ann->first_layer->num_inputs;

    //Get net params
    int num_in = fann_get_num_input(ann);
    int num_out = fann_get_num_output(ann);

    printf("Inputs:%5d Outputs:%5d Total:%5d\n", ann->num_input, ann->num_output, ann->num_neurons);

    input = (fann_type*)calloc(num_runs*num_in, sizeof(fann_type));
    output = (fann_type*)calloc(num_runs*num_out, sizeof(fann_type));

    //Make a gamut of input values
    for(i=0; i<num_runs*num_in; ++i){
        float dig_frac = ((float)(i % num_in)+1.0)/((float)num_in);
        float tot_frac = ((float)i)/((float)num_runs*num_in);
        input[i] = fmodf(tot_frac, dig_frac)*2.0/dig_frac-1.0;
    }

    fann_run_many(&ann, input, &output, 1, num_runs);

    //*
    // Use for output comparisons
    for(i=0; i<num_runs; ++i){
        for(j=0; j<num_in; ++j){
            printf("%9f ", input[i*num_in+j]);
        }
        printf("->");
        for(j=0; j<num_out; ++j){
            printf(" %9f", output[i*num_out+j]);
        }
        printf("\n");
    }//*/

    free(input);
    free(output);
    fann_destroy(ann);
    return 0;
}
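The test code above passes all inputs as one flat array and reads results from one flat output array, with row i starting at offset i*num_in and i*num_out respectively. Here is a minimal sketch of just that indexing, in case the layout is unclear. toy_net() and toy_run_many() are hypothetical stand-ins invented for this illustration, not FANN functions, and the 2-input/1-output shape is an assumption chosen to resemble the xor example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical network shape, chosen only for illustration. */
enum { NUM_IN = 2, NUM_OUT = 1 };

/* Hypothetical stand-in for a trained network; NOT a FANN call.
   It multiplies its two inputs so the result is easy to check by hand. */
static void toy_net(const float *in, float *out)
{
    out[0] = in[0] * in[1];
}

/* Run the toy network over num_runs rows stored back to back:
   row i reads  input[i*NUM_IN]  .. input[i*NUM_IN + NUM_IN - 1]
   and writes  output[i*NUM_OUT] .. output[i*NUM_OUT + NUM_OUT - 1]. */
static void toy_run_many(const float *input, float *output, size_t num_runs)
{
    size_t i;
    assert(input != NULL && output != NULL);
    for (i = 0; i < num_runs; ++i)
        toy_net(&input[i * NUM_IN], &output[i * NUM_OUT]);
}
```

Because every row sits in one contiguous buffer, the whole input set can be handed over in a single call, which is what makes the many-inputs form attractive for a GPU version.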
|
|
 |
|
 |
|
mailseth
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Wed Jun 02, 2010 11:41 pm |
|
 |
| Site Admin |
 |
Joined: Tue Mar 06, 2007 10:03 pm Posts: 134 Location: Corn Desert (IL, USA)
|
|
 |
|
 |
|
tbyte
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Mon Jun 28, 2010 9:52 am |
|
Joined: Fri Mar 13, 2009 1:16 am Posts: 5
|
Were you able to benchmark it with the new video card?
|
|
 |
|
 |
|
mailseth
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Mon Jun 28, 2010 4:02 pm |
|
 |
| Site Admin |
 |
Joined: Tue Mar 06, 2007 10:03 pm Posts: 134 Location: Corn Desert (IL, USA)
|
Yep, the new card (GTX 285) runs the kernel about 20x faster. I didn't look at the overall speed, but that's pretty amazing. 
|
|
 |
|
 |
|
kknd
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Tue Jun 29, 2010 11:06 pm |
|
Joined: Mon Jun 29, 2009 5:31 pm Posts: 15
|
|
Really nice! I'll test on my 9500 GT (finally I have an excuse to buy a more powerful GPU).
|
|
 |
|
 |
|
tbyte
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Wed Jun 30, 2010 7:09 pm |
|
Joined: Fri Mar 13, 2009 1:16 am Posts: 5
|
kknd wrote:
Really nice! I'll test on my 9500 GT (finally I have an excuse to buy a more powerful GPU).

There is one other problem with the GPU besides speed: the memory transfer. You *SHOULD* keep the number of transfers between main memory and the GPU to an absolute minimum. If you don't, you'll lose more time moving data than actually training, so the CPU could even end up faster than the GPU if you are not careful.
|
|
 |
|
 |
|
mailseth
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Wed Jun 30, 2010 11:13 pm |
|
 |
| Site Admin |
 |
Joined: Tue Mar 06, 2007 10:03 pm Posts: 134 Location: Corn Desert (IL, USA)
|
|
Yep, it works best for running an ANN on thousands of inputs simultaneously, so it's one batch transfer. The ANN training routine should be fairly efficient because a large amount of data is left on the card, but there are a number of small transfers after each epoch for updating the weights, etc.
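To put rough numbers on the batching point, here is a back-of-envelope sketch. It assumes a simple cost model where every transfer pays a fixed overhead plus the time to move its bytes at a sustained bandwidth. The function name transfer_time_us() and the model itself are assumptions made for this illustration; real overhead and bandwidth have to be measured on your own hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Estimated time in microseconds to move `bytes` split into `num_transfers`
   pieces. overhead_us (fixed cost per transfer) and bytes_per_us (sustained
   bandwidth) are caller-supplied assumptions, not measured values. */
static double transfer_time_us(size_t bytes, size_t num_transfers,
                               double overhead_us, double bytes_per_us)
{
    assert(num_transfers > 0 && bytes_per_us > 0.0);
    return (double)num_transfers * overhead_us + (double)bytes / bytes_per_us;
}
```

With placeholder figures of 10 us overhead and 4000 bytes/us, moving 4,000,000 bytes in one batch comes out at 1010 us under this model, while splitting the same data into 1000 transfers comes out at 11000 us. The data volume is identical; only the per-transfer overhead grows.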
|
|
 |
|
 |
|
gluhov
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Wed Apr 06, 2011 6:05 am |
|
Joined: Sun Feb 27, 2011 12:28 pm Posts: 5
|
|
Has anybody who is using it made a workspace for VC++?
|
|
 |
|
 |
|
muntain
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Mon Jun 06, 2011 9:07 am |
|
Joined: Fri Jun 03, 2011 6:04 pm Posts: 3
|
mailseth wrote: I have downloaded the code, but I don't understand how to compile the optimized version. I tried with Visual Studio and with Cygwin. I tried to build a static .lib and a DLL, but I can't get either library to build. I'm probably missing something, but I can only compile the standard FANN library without optimization. Could you give me some advice on how to compile the SSE or OpenCL optimized library? Thanks
|
|
 |
|
 |
|
tal1974
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Fri Jun 17, 2011 10:55 am |
|
Joined: Fri Jun 17, 2011 9:52 am Posts: 5 Location: Khanty-Mansi Autonomous Okrug, Russia
|
|
Here is a small example for Microsoft(R) Visual Studio 2010. You need to customize the include and lib paths for your OpenCL software in the libfann properties. I use Intel(R) OpenCL SDK 1.1, so I added "$(INTELOCLSDKROOT)\include" to the include path, and "$(INTELOCLSDKROOT)\lib\x86" and "OpenCL.lib" to the lib settings. Set the project "tal" as the startup project and run it.
|
|
 |
|
 |
|
tal1974
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Mon Jun 20, 2011 10:05 am |
|
Joined: Fri Jun 17, 2011 9:52 am Posts: 5 Location: Khanty-Mansi Autonomous Okrug, Russia
|
The previous version had a mistake. Here is a new version. Sorry.
|
|
 |
|
 |
|
shul
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Tue Jun 21, 2011 9:21 pm |
|
Joined: Tue Jun 21, 2011 9:18 pm Posts: 1
|
Hi, I'm trying to use Tal's version but it fails at line 671:
Code:
program = clCreateProgramWithSource(context, 1, (const char**)&fin_program_src, NULL, &err);
assert(err == CL_SUCCESS);
I am using the latest CUDA/OpenCL from NVIDIA.
thanks, shul
|
|
 |
|
 |
|
N4rk0
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Mon Jun 27, 2011 8:30 am |
|
Joined: Sun Jan 23, 2011 3:07 pm Posts: 2
|
|
Tal's version is buggy IMHO; it tries to load the CPU as the OpenCL device.
|
|
 |
|
 |
|
tal1974
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Wed Jun 29, 2011 3:22 am |
|
Joined: Fri Jun 17, 2011 9:52 am Posts: 5 Location: Khanty-Mansi Autonomous Okrug, Russia
|
|
 |
|
 |
|
tal1974
|
Post subject: Re: Get OpenCL enhanced FANN here! Posted: Fri Jul 08, 2011 7:23 am |
|
Joined: Fri Jun 17, 2011 9:52 am Posts: 5 Location: Khanty-Mansi Autonomous Okrug, Russia
|
|
I did not mention earlier that I was working with a beta version of the Intel SDK, and everything was fine with it. Yesterday I updated the SDK to the latest version, and now I get an error at line 673 in fann_cl_kernel.c. I checked the generated code (in the variable fin_program_src) with the Intel OpenCL Offline Compiler. The problem appears at the call to backpropagate_MSE. If this call is removed, the code builds successfully. I will try asking Intel about it.
|
|
 |
|
 |
|