
I am currently a member of the Computer Architecture and VLSI Systems Laboratory (CARV) of the Institute of Computer Science (ICS) at the Foundation for Research and Technology - Hellas (FORTH) in Heraklion, Crete.

Previously, I was at the Microprocessor and Hardware Laboratory (MHL) of the Electrical and Computer Engineering Department of the Technical University of Crete in Chania, where I completed my master's thesis on applying priorities to threads in a simultaneous multithreading architecture.

 

Research Interests

My research interests include processor and memory hierarchy microarchitecture, mechanisms for communication, synchronization, and cache coherence in CMPs, as well as their reconciliation in (or redesign for) heterogeneous systems.

My current research focuses on runtime systems, scheduling, and the related hardware support and network interfaces for multiprocessor systems.

I would like to join an R&D environment that plans and directs research results toward real-world products. I would also welcome the opportunity to do basic research as a postdoc in an academic or research institution.

 

Participation in Academic Projects

· Active Networks (PENED 1999)

· Unifying High-speed Interconnects (USENIX 2005-2006)

· Scalable computer ARChitecture (SARC 2006-2010)

· ENabling technologies for a programmable many-CORE (ENCORE 2010-2011)

 

PhD Dissertation

Thesis Title: Direct Communication and Synchronization Mechanisms in Chip Multiprocessors (also available as a FORTH-ICS technical report)

(Funded by FORTH-ICS bursary, HiPEAC Network of Excellence, and SARC FP6 IP project)

Supervisor: Prof. Manolis G. H. Katevenis

Brief Synopsis of Research:

The use of per-core on-chip memories managed in software with RDMA, as adopted in the IBM Cell processor, has challenged the mainstream approach of using coherent caches for the on-chip memory hierarchy of CMPs. We demonstrate a combination of the two approaches, with cache integration of a network interface (NI) for explicit interprocessor communication, and flexible dynamic allocation of on-chip memory to hardware-managed (cache) and software-managed parts.

The NI supports RDMA and messaging among the explicitly (software) managed on-chip memory portions, as well as transfers from and to non-cacheable main memory. An FPGA prototype shows reasonable logic overhead (< 20%) for the basic NI functionality. The NI also enables software-configurable synchronization primitives inside the explicitly managed portions of a cache. These are novel queues, which efficiently support multiple readers and provide hardware lock and job-dispatching services, and counters, which allow selective fences for explicit transfers and can be composed to implement barriers in the memory system.

Simulations of CMPs with up to 128 cores, using a combination of the SIMICS and GEMS simulators, show that our synchronization primitives can provide significant benefits for contended locks and barriers, and improve task scheduling efficiency in the Cilk run-time system, especially for regular codes.

 

Master’s Thesis

Thesis Title: Limited, Priority-Thread Based Sharing of Simultaneous Multithreaded Processor Resources

Supervisor: Prof. Apostolos Dollas

Brief Synopsis of Research:

My master’s thesis studied the potential of applying priorities among the hardware threads of simultaneous multithreaded (SMT) processors. The study was based on simulation of several variants of the processor architecture, designed to limit interference and contention for processor resources among the simultaneously executing threads. A configurable, trace-driven simulator was built from scratch, capable of simulating out-of-order processor architectures with state-of-the-art branch prediction and register renaming, and was then extended for simultaneous multithreading. The simulator consisted of over 13K lines of C++ code.

 

 

 

Publications

[1] Stamatis Kavadias, Manolis Katevenis, and Dionisios Pnevmatikatos, “Network Interface Design for Explicit Communication in Chip Multiprocessors,” in the book Designing Network-on-Chip Architectures in the Nanoscale Era, December 2010, CRC Press.

[2] Stamatis Kavadias, Manolis G.H. Katevenis, Michail Zampetakis, and Dimitris Nikolopoulos, “Cache-Integrated Network Interfaces: Flexible On-Chip Communication and Synchronization for Large-Scale CMPs,” in International Journal of Parallel Programming, pp. 1--22, June 2011, Springer Netherlands.

[3] Manolis Katevenis, Vassilis Papaefstathiou, Stamatis Kavadias, Dionisios Pnevmatikatos, Federico Silla, Dimitrios Nikolopoulos, “Explicit Communication and Synchronization in SARC,” IEEE Micro, vol. 30, pp. 30--41, Los Alamitos, CA, USA, 2010, IEEE Computer Society.

[4] S. G. Kavadias, M. G. Katevenis, M. Zampetakis, and D. S. Nikolopoulos, "On-chip Communication and Synchronization Mechanisms with Cache-integrated Network Interfaces," in Proceedings of the 7th ACM International Conference on Computing Frontiers (Bertinoro, Italy, May 17 - 19, 2010), pp. 217--226, CF '10, ACM, New York, NY.

[5] Christoforos Kachris, George Nikiforos, Stamatis Kavadias, Vassilis Papaefstathiou, Manolis Katevenis, "Network processing in Multi-core FPGAs with Integrated Cache-Network Interface," IEEE International Conference on Reconfigurable Computing and FPGAs (Reconfig 2010), Cancun, Mexico, December 2010.

[6] Christoforos Kachris, George Nikiforos, Stamatis Kavadias, Vassilis Papaefstathiou, Manolis Katevenis, "Low-latency Explicit Communication and Synchronization in Scalable Multi-core Clusters", IEEE International Conference on Cluster Computing (Cluster 2010), Heraklion, Greece, September 2010.

[7] G. Kalokerinos, V. Papaefstathiou, G. Nikiforos, S. Kavadias, M. Katevenis, D. Pnevmatikatos, Xiaojun Yang, "FPGA Implementation of a Configurable Cache/Scratchpad Memory with Virtualized User-level RDMA Capability," International Symposium on Systems, Architectures, Modeling, and Simulation, 2009 (SAMOS '09), pp. 149--156, 20 - 23 July 2009.

[8] G. Nikiforos, G. Kalokairinos, V. Papaefstathiou, S. Kavadias, D. Pnevmatikatos and M. Katevenis, "A run-time Configurable Cache/Scratchpad Memory with Virtualized User-Level RDMA Capability," in the 6th HiPEAC Industrial Workshop on Embedded Computing, 26 November 2008, THALES Research and Development - Palaiseau, Paris, France.

[9] V. Papaefstathiou, D. Pnevmatikatos, M. Marazakis, G. Kalokairinos, A. Ioannou, M. Papamichael, S. Kavadias, G. Mihelogiannakis, M. Katevenis, "Prototyping Efficient Interprocessor Communication Mechanisms," International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, 2007 (IC-SAMOS 2007), pp. 26--33, 16 - 19 July 2007.

[10] V. Papaefstathiou, G. Kalokairinos, A. Ioannou, M. Papamichael, G. Mihelogiannakis, S. Kavadias, E. Vlahos, D. Pnevmatikatos and M. Katevenis, "An FPGA-based Prototyping Platform for Research in High-Speed Interprocessor Communication," in the 2nd HiPEAC Industrial Workshop on Embedded Computing, 17 October 2006, Philips (NXP), Eindhoven, Netherlands.

[11] A. Dollas, D. Pnevmatikatos, N. Aslanides, E. Sotiriades, S. Kavvadias, S. Zogopoulos, "Experimental Testing of PLATO, a Reconfigurable Active ATM Network Node," in Proceedings of the 8th Panhellenic Informatics Conference, Cyprus, November 2001.

[12] A. Dollas, D. Pnevmatikatos, N. Aslanides, S. Kavvadias, E. Sotiriades, K. Papademetriou, “Rapid Prototyping of Reusable 4x4 Active ATM Switch Core with the PCI Pamette,” in Proceedings of the 12th International IEEE Workshop on Rapid System Prototyping (RSP-2001), pp. 17--23, Monterey, CA, June 25-27, 2001, IEEE.

[13] A. Dollas, D. Pnevmatikatos, N. Aslanides, S. Kavvadias, E. Sotiriades, S. Zogopoulos, K. Papademetriou, N. Chrysos, K. Harteros, E. Antonidakis, N. Petrakis, “Architecture and Applications of PLATO, a Reconfigurable Active Network Platform,” in Proceedings of the 9th International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM-2001), Rohnert Park, CA, April 30 - May 2, 2001, IEEE.