My research interests are in parallel programming models, emphasising approaches which exploit "skeletons" [1] to package well-known patterns of computation and interaction as customisable frameworks. Each skeleton specification captures the behaviour of a commonly occurring pattern of computation and interaction, while packaging and hiding the details of its concrete implementation. The application programmer selects the appropriate skeleton and specialises it for a particular application. This simplifies programming by encouraging the combination of appropriate skeletons. Furthermore, it enables optimisations by capturing macro-level knowledge about the application structure. In contrast, conventional language and library mechanisms expose only the micro-level implementation structure of the individual operations from which a non-skeletal equivalent would be built. Essentially, the skeleton knows what will happen next and can use this knowledge to choose and adjust implementation details. For example, a skeleton implementation may be able to place threads that communicate frequently on cores that share some level of cache memory. This enhances prefetching of data for the next step of a thread's computation. Similarly, the implementation can assign additional worker threads to more time-consuming operations for load-balancing purposes. Such optimisations can be applied to any parallel application for which the programmer has used the corresponding skeleton.
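As a minimal illustration of selecting and specialising skeletons, the sketch below defines two small skeletons in Python and composes them. It is only a sketch under stated assumptions: the names `farm`, `pipeline` and `n_workers` are hypothetical and are not the eSkel interface, and the thread pool stands in for whatever concrete implementation a real skeleton library would hide. Cache-aware thread placement is not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def farm(worker: Callable[[A], B], n_workers: int = 4) -> Callable[[Iterable[A]], List[B]]:
    """Hypothetical task-farm skeleton: apply `worker` to each input independently.

    The caller supplies only the per-item function. Thread creation,
    scheduling and result ordering stay hidden inside the skeleton.
    """
    def run(inputs: Iterable[A]) -> List[B]:
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            # Executor.map returns results in the order of the inputs.
            return list(pool.map(worker, inputs))
    return run


def pipeline(*stages: Callable[[list], list]) -> Callable[[Iterable], list]:
    """Hypothetical pipeline skeleton: pass each stage's output to the next stage.

    For brevity the stages run one after another over the whole batch;
    a production skeleton would typically stream items between stages.
    """
    def run(inputs: Iterable) -> list:
        data = list(inputs)
        for stage in stages:
            data = stage(data)
        return data
    return run


# Application code: specialise the skeletons with sequential functions.
def parse(text: str) -> int:
    return int(text)


def square(x: int) -> int:
    return x * x


# Assumption for illustration: `square` is the more expensive stage,
# so its farm is given more worker threads than the `parse` farm.
process = pipeline(farm(parse, n_workers=1), farm(square, n_workers=4))
result = process(["1", "2", "3", "4"])
print(result)  # [1, 4, 9, 16]
```

The application code contains no thread management at all. Because the structure (a pipeline of farms) is explicit, an implementation is free to change worker counts or thread placement without any change to `parse` or `square`.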
Recent efforts have focused on the eSkel and Enhance projects, which investigate these ideas in the contexts of single machine parallelism and Grid computing [2] respectively. Current interests include the exploitation of skeletons in manycore and transactional settings [3] and the transparent exploitation of GPGPU and other heterogeneous platforms.
[1] M. Cole, Bringing Skeletons out of the Closet: A Pragmatic Manifesto for Skeletal Parallel Programming, Parallel Computing, vol. 30, num. 3, pp. 389-406 (2004)
[2] H. Gonzalez-Velez and M. Cole, Parallelism for Distributed Heterogeneous Architectures: A Methodological Approach with Pipelines and Farms, Concurrency and Computation: Practice and Experience, vol. 22, num. 4, pp. 2073-2094 (2010)
[3] M. Castro, L.F. Wanderley-Goes, C.P. Ribeiro, M. Cole, M. Cintra and J. Mehaut, A Machine Learning-Based Approach for Thread Mapping on Transactional Memory Applications, to appear in Proceedings of the 18th Annual International Conference on High Performance Computing (HiPC11) (2011)