In the scope of the NeoBrain project, several algorithms were designed to solve large-scale problems (involving 60'000x10'000 matrices over thousands of iterations) in the field of neuronal activity visualization. NeoBrain aims to combine MRI and MEG scans to produce 3D brain models with neuronal activity animated directly on the cerebral cortex. The algorithm for retrieving the neuronal map from the MEG sensors takes 3 weeks to execute on a standard PC. The goal of this project was to achieve the highest possible speedup using CUDA technology with a limited number of GPUs.
Large matrices with non-obvious metrics & parallelization
A substantial mathematical background underlies the simple (confidential!) algorithms used in NeoBrain. Some understanding of this mathematics was necessary to see how the algorithm could be optimized and which parts of the matrices it relies on. Moreover, parallelizing the whole computation efficiently was also challenging, because it required advanced CUDA features.
Using estimation methods to build up neural activity maps
As the base equation system is not solvable by nature (108 variables), the only way to obtain a usable neuronal activity map from the MEG sensors is to estimate it using an iterative process with an error estimator. This algorithm was implemented in tens of different ways, exploiting various CUDA capabilities to find the most efficient one. From the sequential version to the most efficient one, about 30 implementations were written, with a final speedup of over 100'000.
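The actual NeoBrain algorithm is confidential, so the following is only a generic illustration of what "an iterative process with an error estimator" can look like. It is a minimal sequential sketch that assumes a plain Landweber (gradient) iteration on a tiny underdetermined system, with the residual norm standing in as the error estimator. The function names, the step size, and the toy matrix are all hypothetical and are not taken from the project.

```python
# Illustrative only: a generic Landweber iteration, NOT the NeoBrain algorithm.
# It estimates x in A x = b when there are more unknowns than equations.

def matvec(A, x):
    """Return A @ x for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matvec_t(A, r):
    """Return A^T @ r."""
    return [sum(A[i][j] * r[i] for i in range(len(A)))
            for j in range(len(A[0]))]

def estimate(A, b, step, max_iter=1000, tol=1e-9):
    """Iteratively estimate x; stop once the residual norm drops below tol."""
    x = [0.0] * len(A[0])
    err = float("inf")
    iterations = 0
    while iterations < max_iter:
        residual = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        err = sum(r * r for r in residual) ** 0.5  # error estimator
        if err < tol:
            break
        correction = matvec_t(A, residual)
        x = [xi + step * c for xi, c in zip(x, correction)]
        iterations += 1
    return x, err, iterations

# Toy system (hypothetical): 2 equations, 3 unknowns.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
b = [2.0, 3.0]
x, err, iterations = estimate(A, b, step=0.3)
```

In a sketch like this, the two matrix-vector products dominate each iteration, which is why this kind of loop is a natural candidate for GPU parallelization when the matrices are large.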
To make the fastest implementation of the algorithm actually produce correct results, some parameters had to be tuned. Long tests were run, using common machine learning methods to find the minima of cost estimators, leading to optimized settings that give the most accurate neuronal activity map. Various cost estimators and minimization methods were used, including grid search and gradient descent.
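As a rough illustration of the two minimization methods named above, the sketch below runs a coarse grid search over a single parameter and then refines the best candidate with gradient descent. The cost function `toy_cost`, the candidate grid, and the learning rate are hypothetical placeholders: the real cost estimators and tuned parameters of the project are not described here.

```python
# Illustrative only: generic grid search followed by gradient descent.

def grid_search(cost, candidates):
    """Return the candidate with the lowest cost."""
    return min(candidates, key=cost)

def gradient_descent(cost, start, lr=0.1, steps=200, h=1e-6):
    """Refine a parameter using a central-difference gradient estimate."""
    p = start
    for _ in range(steps):
        grad = (cost(p + h) - cost(p - h)) / (2 * h)
        p -= lr * grad
    return p

def toy_cost(p):
    """Hypothetical cost estimator with its minimum at p = 0.3."""
    return (p - 0.3) ** 2 + 1.0

# Coarse pass over 0, 0.25, 0.5, 0.75, 1.0, then local refinement.
coarse = grid_search(toy_cost, [i / 4 for i in range(5)])
refined = gradient_descent(toy_cost, coarse)
```

Combining the two is a common pattern: the grid search is cheap to parallelize and gives a reasonable starting point, while gradient descent closes the remaining gap to the minimum.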