Abstract
The actor model of computation was designed to support concurrency and distribution seamlessly. However, it remains unspecific about data-parallel program flows, while the processing power available on modern many-core hardware such as graphics processing units (GPUs) or coprocessors increases the relevance of data parallelism for general-purpose computation.
In this work, we introduce OpenCL-enabled actors to the C++ Actor Framework (CAF). They offer a high-level interface for accessing any OpenCL device without leaving the actor paradigm. The new type of actor is integrated into the runtime environment of CAF and enables transparent message passing in distributed systems on heterogeneous hardware. Following the actor logic in CAF, OpenCL kernels can be composed while encapsulated in C++ actors and hence operate in a multi-stage fashion on data resident on the GPU. Developers can thus build complex data-parallel programs from primitives without leaving the actor paradigm or sacrificing performance. Our evaluations on commodity GPUs, an Nvidia Tesla, and an Intel Xeon Phi reveal the expected linear scaling behavior when offloading larger workloads. For sub-second tasks, the efficiency of offloading was found to differ considerably between devices. Moreover, our findings indicate a negligible overhead compared to programming with the native OpenCL API.
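The multi-stage composition described above can be illustrated with a minimal sketch in standard C++. It deliberately uses neither CAF nor OpenCL and should not be read as the framework's interface: `DeviceBuffer`, `TransferLog`, `Stage`, `compose`, `scale_by`, `add_const`, and `run_example` are hypothetical names introduced only for this illustration, and the "device" is simulated in host memory. The point it tries to show is that composed stages hand the intermediate buffer to each other directly, so the data crosses the host boundary only once in each direction.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a buffer that lives in device memory.
// In this sketch it simply wraps host memory; no OpenCL is involved.
struct DeviceBuffer {
  std::vector<int> data;
};

// Counts simulated host/device copies so the effect of staging is visible.
struct TransferLog {
  std::size_t uploads = 0;
  std::size_t downloads = 0;
};

// Simulated copy from host to device.
DeviceBuffer upload(const std::vector<int>& host, TransferLog& log) {
  ++log.uploads;
  return DeviceBuffer{host};
}

// Simulated copy from device to host.
std::vector<int> download(const DeviceBuffer& dev, TransferLog& log) {
  ++log.downloads;
  return dev.data;
}

// A stage plays the role of one kernel wrapped in an actor: it receives a
// device buffer and returns a device buffer.
using Stage = std::function<DeviceBuffer(DeviceBuffer)>;

// Composition forwards the intermediate buffer directly to the next stage,
// so no download or upload happens between the two.
Stage compose(Stage first, Stage second) {
  assert(first && second);  // both stages must be callable
  return [first = std::move(first),
          second = std::move(second)](DeviceBuffer in) {
    return second(first(std::move(in)));
  };
}

// Example stage: multiply every element by a constant.
Stage scale_by(int factor) {
  return [factor](DeviceBuffer buf) {
    for (int& x : buf.data) x *= factor;
    return buf;
  };
}

// Example stage: add a constant to every element.
Stage add_const(int offset) {
  return [offset](DeviceBuffer buf) {
    for (int& x : buf.data) x += offset;
    return buf;
  };
}

// Runs a two-stage pipeline on {1, 2, 3}: one upload, two stages, one download.
std::vector<int> run_example(TransferLog& log) {
  Stage pipeline = compose(scale_by(2), add_const(1));
  DeviceBuffer on_device = upload({1, 2, 3}, log);
  return download(pipeline(std::move(on_device)), log);
}
```

Calling `run_example` with a fresh `TransferLog` from a `main` function is enough to try the sketch. In an actual OpenCL setting the stages would enqueue kernels and the buffer would be a handle to device memory, but the composition pattern would stay the same.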
Notes
1. https://github.com/boostorg/compute (Feb. 2017).
2. https://github.com/ddemidov/vexcl (Feb. 2017).
Acknowledgments
The authors would like to thank Marian Triebe and Sebastian Bartels for implementing benchmarks, testing, and bug fixing. We further thank Matthias Vallentin for raising the indexing use case, and the iNET working group for lively discussions and inspiring suggestions. Funding by the German Federal Ministry of Education and Research within the projects ScaleCast and X-CHECK is gratefully acknowledged.
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Hiesgen, R., Charousset, D., Schmidt, T.C. (2018). OpenCL Actors – Adding Data Parallelism to Actor-Based Programming with CAF. In: Ricci, A., Haller, P. (eds) Programming with Actors. Lecture Notes in Computer Science, vol 10789. Springer, Cham. https://doi.org/10.1007/978-3-030-00302-9_3
DOI: https://doi.org/10.1007/978-3-030-00302-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00301-2
Online ISBN: 978-3-030-00302-9
eBook Packages: Computer Science (R0)