Motivation
In the last decade, application performance management (APM) solutions have been developed that support enterprises with monitoring capabilities and early detection of performance problems. Leading APM solutions mostly support only alerting and visualization of performance-relevant measures. The configuration of the software instrumentation, the diagnosis of performance problems, and the isolation of the concrete root cause(s) often remain error-prone and frustrating manual tasks. To this day, these tasks are performed by performance experts, who are costly and rare. To improve this situation, NovaTec Consulting GmbH and the University of Stuttgart (Reliable Software Systems Group) launched the collaborative research project diagnoseIT on "Expert-guided Automatic Diagnosis of Performance Problems in Enterprise Applications". The core idea is to formalize APM expert knowledge in order to automatically execute recurring APM tasks, such as configuring a meaningful software instrumentation and diagnosing performance problems to isolate their root cause. By delegating these tasks to diagnoseIT, experts no longer have to deal with similar problems over and over again and can instead focus on more challenging (and interesting) tasks.
Problem
diagnoseIT provides the analysis results to its user in the form of comprehensive reports that include qualitative information (e.g., the problem's location, type, and anti-pattern) and quantitative information (the impact of the problem in numbers). A major goal of the report is to describe the problem in a way that non-experts can understand, which makes it possible to provide different report types for individual roles. To include the definition of known performance anti-patterns in the description, the anti-patterns have to be detected as part of the diagnosis. The automated diagnosis is designed as follows: possible symptoms of performance problems are provided as formalized expert knowledge in the form of an extensible set of rules. When a symptom is detected in a trace, the root cause diagnosis is started without the need for human interaction. Rules that localize the problem are applied first, followed by technology-specific and/or domain-specific rules, which are used to attach semantic meaning to the isolated root cause.
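The rule ordering described above (symptom check, then localization, then technology-specific labeling) can be illustrated with a minimal Python sketch. It is not the diagnoseIT implementation: the Call trace model, the 1000 ms threshold, the rule functions, and the "jdbc." naming convention are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class Call:
    """One method invocation in a trace (hypothetical, simplified model)."""
    name: str
    duration_ms: float  # inclusive duration, i.e. including all callees
    children: List["Call"] = field(default_factory=list)

    def exclusive_ms(self) -> float:
        """Time spent in this call itself, without its direct callees."""
        return self.duration_ms - sum(c.duration_ms for c in self.children)


def walk(call: Call) -> Iterator[Call]:
    """Yield the call and all of its descendants in depth-first order."""
    yield call
    for child in call.children:
        yield from walk(child)


def slow_trace_symptom(root: Call, threshold_ms: float = 1000.0) -> bool:
    """Symptom rule (assumed): the whole trace exceeds a response time threshold."""
    return root.duration_ms > threshold_ms


def localize_hotspot(root: Call) -> Call:
    """Localization rule (assumed): pick the call with the largest exclusive time."""
    return max(walk(root), key=lambda c: c.exclusive_ms())


def label_database_call(call: Call) -> Optional[str]:
    """Technology-specific rule (assumed): recognize database calls by name prefix."""
    return "slow database call" if call.name.startswith("jdbc.") else None


def diagnose(root: Call) -> Optional[Dict[str, object]]:
    """Apply the rules in order: symptom, then localization, then labeling."""
    if not slow_trace_symptom(root):
        return None  # no symptom detected, so no diagnosis is started
    culprit = localize_hotspot(root)
    return {
        "location": culprit.name,
        "type": label_database_call(culprit) or "unclassified",
        "exclusive_ms": culprit.exclusive_ms(),
    }


# Example trace: a request that spends most of its time in one query.
example = Call("HttpServlet.doGet", 1500.0, [
    Call("OrderService.load", 1400.0, [Call("jdbc.executeQuery", 1200.0)]),
    Call("View.render", 50.0),
])
report = diagnose(example)
```

In this sketch, the location and the exclusive time correspond to the qualitative and quantitative parts of a report. Further rules could be added as additional functions without changing the order in which they are applied.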
Tasks
- Development of concepts and rules to detect software performance anti-patterns in traces
- Prototyping of the concepts and rules as part of the diagnoseIT implementation
- Evaluation of the false-positive and false-negative rate of the rules
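For the evaluation task, one common way to compute the two rates is shown in the following Python sketch. It assumes that each trace has a manually assigned ground-truth label stating whether the anti-pattern is present, and that a rule returns a boolean verdict per trace. The function name and the sample data are hypothetical.

```python
from typing import Sequence, Tuple


def error_rates(predicted: Sequence[bool], actual: Sequence[bool]) -> Tuple[float, float]:
    """Return (false-positive rate, false-negative rate) of a detection rule.

    predicted[i] is the rule's verdict for trace i, actual[i] the ground truth.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    false_neg = sum(1 for p, a in zip(predicted, actual) if a and not p)
    positives = sum(1 for a in actual if a)
    negatives = len(actual) - positives
    # Rates are defined relative to the actual negatives and positives.
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr


# Made-up verdicts for five traces, for illustration only.
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
fpr, fnr = error_rates(predicted, actual)
```

Here the false-positive rate is the share of anti-pattern-free traces that the rule flags, and the false-negative rate is the share of affected traces that the rule misses.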
Challenges
- Categorization of software performance anti-patterns by single-trace detectability
- Determination of how anti-patterns manifest concretely in a trace
Locations
- Stuttgart (preferred)
- Frankfurt (remote supervision)
- Munich (remote supervision)
- Berlin (remote supervision)
Contact
- Christoph Heger (NovaTec Consulting GmbH)
- Andre van Hoorn (University of Stuttgart)