An Improved Hidden Markov Model for Anomaly Detection Using Frequent Common Patterns

Host-based intrusion detection techniques are needed to ensure the safety and security of software systems, especially if these systems handle sensitive data. Most host-based intrusion detection systems build a reference model offline, usually from execution traces (in the absence of the source code), to characterize the healthy behavior of the system. The model can later be used as a baseline for online detection of abnormal behavior. Perhaps the most popular techniques are those based on Hidden Markov Models (HMMs). These techniques, however, require long training times, which can make them computationally infeasible; the main reason is the large size of typical traces, which are often millions of lines long. In this paper, we propose an improved HMM that uses the concept of frequent common patterns. In other words, we build models from the largest n-grams (patterns) extracted from the traces instead of treating each trace event on its own. We show through a case study that our approach can reduce training time by 31.96% to 48.44% compared to the original HMM algorithms while keeping almost the same accuracy rate.
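To illustrate the idea of training on patterns rather than on individual events, the following is a minimal sketch in Python. It is not the procedure from the paper, which this abstract does not detail: the fixed pattern length `n`, the `min_count` threshold, the greedy left-to-right matching, and the function names `frequent_ngrams` and `compress_trace` are all assumptions made for illustration. The sketch only shows why pattern-based encoding shortens the observation sequences that an HMM would be trained on.

```python
from collections import Counter

def frequent_ngrams(traces, n, min_count):
    """Return the n-grams (as tuples) that occur at least min_count times
    across all traces. Hypothetical helper, not taken from the paper."""
    counts = Counter()
    for trace in traces:
        for i in range(len(trace) - n + 1):
            counts[tuple(trace[i:i + n])] += 1
    return {gram for gram, c in counts.items() if c >= min_count}

def compress_trace(trace, patterns, n):
    """Scan left to right and replace each occurrence of a frequent n-gram
    with one composite symbol; other events are kept unchanged."""
    out = []
    i = 0
    while i < len(trace):
        gram = tuple(trace[i:i + n])
        if len(gram) == n and gram in patterns:
            out.append("+".join(gram))  # one symbol stands for n events
            i += n
        else:
            out.append(trace[i])
            i += 1
    return out

# Toy traces of system-call names (made-up data for illustration only).
traces = [
    ["open", "read", "write", "close", "open", "read", "write", "close"],
    ["open", "read", "write", "close", "mmap"],
]
patterns = frequent_ngrams(traces, n=3, min_count=2)
compressed = [compress_trace(t, patterns, n=3) for t in traces]
```

In this toy example the first trace shrinks from eight events to four symbols. An HMM trained on the compressed sequences would process fewer observations per trace, at the cost of a larger symbol alphabet.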

Date: Wednesday, June 13, 2012
Publication authors (members): Abdelwahab Hamou-Lhadj, Afroza Sultana, Mario Couture
Publisher: IEEE Communications Society
Bibtex: A. Sultana, A. Hamou-Lhadj, M. Couture, "An Improved Hidden Markov Model for Anomaly Detection Using Frequent Common Patterns", The IEEE International Conference on Communications (ICC’12), Communication and Information Systems Security Symposium, Ottawa, ON, 2012.
Publication status: Published

Tracks concerned: