Track 4: Tracing and debugging support for advanced programming environments

Abstract:

Track 4 focuses on providing specialised algorithms, analysis and views to support complex software environments, with a large number of parallel co-processor cores, a large cloud of physical nodes, and distributed applications with mobile code.

Challenges:

The focus of this project is to adequately support large scale heterogeneous systems. The first three tracks propose algorithms and an architecture to achieve this goal. Nonetheless, specialised analyses and views are required to support specific, representative applications that run on such systems. Three different classes of applications are targeted. The applications selected are in widespread use, come with an open source reference implementation, and exercise the large scale heterogeneous systems targeted by this project.

The first application class is parallel programming environments for manycore systems and GPGPUs, such as OpenMP and OpenCL, as well as dataflow programming, which has been used for signal processing on earlier Epiphany chips and for TensorFlow on GPGPUs. In addition to the framework proposed in tracks 1 to 3, supporting this class requires linking each individual computation event (one computation on one core) with the associated line of code in the high level parallel programming model. An equally difficult challenge is designing a graphical interface that shows an overview of the system state across thousands of parallel cores, yet also lets the user dig into a problem and display the detailed state of a specific core.
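As an illustration of the event-to-source linking described above, the following is a minimal sketch, not the project's design. It assumes that each trace event carries a core identifier and an instruction offset within a kernel, and that a sorted line table (similar in spirit to compiler debug information) is available. The names `LINE_TABLE`, `resolve` and `hits_per_line`, and the file `saxpy.cl`, are hypothetical.

```python
from bisect import bisect_right
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class SourceLoc(NamedTuple):
    file: str
    line: int


# Hypothetical line table for one kernel: (start offset, source location),
# sorted by start offset. Each entry covers offsets up to the next entry.
LINE_TABLE = [
    (0x00, SourceLoc("saxpy.cl", 3)),
    (0x20, SourceLoc("saxpy.cl", 4)),
    (0x48, SourceLoc("saxpy.cl", 6)),
]
_STARTS = [start for start, _ in LINE_TABLE]


def resolve(offset: int) -> SourceLoc:
    """Return the source location whose offset range contains `offset`."""
    idx = bisect_right(_STARTS, offset) - 1
    if idx < 0:
        raise ValueError(f"offset {offset:#x} precedes the line table")
    return LINE_TABLE[idx][1]


def hits_per_line(events: Iterable[Tuple[int, int]]) -> Counter:
    """Count events per source line; events are (core_id, offset) pairs."""
    hits = Counter()
    for _core_id, offset in events:
        hits[resolve(offset)] += 1
    return hits
```

Aggregating per source line in this way is one possible basis for an overview display, while keeping the `core_id` of each event would allow a detailed per-core view.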

The second target is cloud based applications. The environment consists of a very large number of nodes running a Cloud Computing stack with virtualisation, such as OpenStack and OPNFV. On top of that, distributed applications such as Web services (e.g., the LAMP stack: Linux, Apache, MySQL and PHP) and MapReduce-style parallel computing (e.g., Apache Spark) complete the stack. Here again, the monitoring infrastructure relies on the work in tracks 1 to 3, but also requires support for mapping logical computations and network nodes to physical hardware nodes and network switches.
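The logical-to-physical mapping mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the placement tables, the node names, and the functions `physical_path` and `cpu_by_host` are all hypothetical, and a real deployment would obtain placement information from the cloud stack rather than from static dictionaries.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical placement data: virtual machine -> physical host,
# and physical host -> network switch.
VM_TO_HOST = {
    "vm-web-1": "host-a",
    "vm-db-1": "host-a",
    "vm-spark-1": "host-b",
}
HOST_TO_SWITCH = {
    "host-a": "switch-1",
    "host-b": "switch-1",
}


def physical_path(vm: str) -> Tuple[str, str]:
    """Return the (host, switch) pair on which a logical node runs."""
    host = VM_TO_HOST[vm]
    return host, HOST_TO_SWITCH[host]


def cpu_by_host(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum per-VM CPU time samples (vm, milliseconds) per physical host."""
    totals: Dict[str, float] = defaultdict(float)
    for vm, cpu_ms in samples:
        totals[VM_TO_HOST[vm]] += cpu_ms
    return dict(totals)
```

With such a mapping, measurements collected at the logical level (per virtual machine) can be attributed to the physical host and switch where contention actually occurs.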

The third and last application class examined is complex, layered, interactive distributed applications with mobile code, of the kind that will be used for the monitoring framework proposed in this project. New software development environments are transitioning from single-process executables to distributed multi-process systems. While modular frameworks such as Eclipse offer a wealth of libraries and plugins that can be harnessed to build very sophisticated applications, there remains the constraint of implementing everything in the same programming language, environment and process. Already, some large pieces of functionality, for instance the programming language compilers and debuggers, were deemed too costly to reprogram within Eclipse and were kept separate and accessed remotely. This has led to some duplication between these pieces; for instance, both the compiler and Eclipse implement programming language parsing. A newer architecture for software development tools has been proposed, in which the functionality is spread among a few processes, such as programming language servers, debuggers, and a browser based user interface. Open source examples of this new trend include Electron (an open source framework for building applications in JavaScript, using Node.js on top of the Chromium browser), Visual Studio Code (an open source software development environment built over Electron and communicating with language servers through the Language Server Protocol), and Eclipse Orion (an open source software development environment that runs in the cloud and executes JavaScript in a browser-based client for interactions).

Plan:

The aim of track 4, tracing and debugging support for advanced programming environments, is to propose new algorithms and views to support complex software environments, with a large number of parallel co-processor cores, a large cloud of physical nodes, and distributed applications with mobile code. This will serve a dual purpose: adding support for these complex use cases, and validating that the algorithms and architecture proposed in the first three tracks can indeed support such use cases efficiently.