Integrated Tracing, Profiling and Debugging for Tuning Large Heterogeneous Clusters

The complexity of communication and computing infrastructure has been increasing exponentially over the last several decades. Adding to this complexity is the growing reliance on virtualization and transparent migration, which are useful for resource optimization, scalability and fault tolerance. Understanding the performance of system services has become extremely difficult, and the tools for that purpose are severely lacking. Tracing tools can potentially provide, with low overhead, all the needed information about the different parts of a system; however, they lack the integration and correlation needed to link events originating from different layers and different pieces of the distributed system puzzle.

This project will i) complement the existing data-gathering tools, ii) integrate and correlate the information they collect, iii) accumulate all that information in a system model of the current state, and iv) provide analysis and visualization modules.

 
