iDEA: DREXEL LIBRARIES E-REPOSITORY AND ARCHIVES
Adaptive Sampling and Statistical Inference for Anomaly Detection
Details
Title
Adaptive Sampling and Statistical Inference for Anomaly Detection
Author(s)
Huang, Tingshan
Advisor(s)
Kandasamy, Nagarajan; Sethu, Harish
Keywords
Electrical engineering; Cyberinfrastructure--Security measures; Computer networks--Security measures; Data protection
Date
2015-12
Publisher
Drexel University
Thesis
Ph.D., Electrical Engineering -- Drexel University, 2015
Abstract
Given the rising threat of malware and the increasing inadequacy of signature-based solutions, online performance monitoring has emerged as a critical component of the security infrastructure of data centers and networked systems. Most systems that require monitoring are large-scale, highly dynamic, and time-evolving, which adds to the complexity of both monitoring and the underlying anomaly detection techniques. Furthermore, monitoring and detection carry costs of their own: they can interfere with the normal operation of a system and deplete the resources available to it. Securing modern systems therefore calls for monitoring strategies and anomaly detection techniques that handle massive data efficiently and report unusual events effectively. This dissertation contributes new algorithms and implementation strategies that significantly improve the effectiveness and efficiency of two components of security infrastructure: (1) system monitoring and (2) anomaly detection.

For system monitoring, we develop two techniques that reduce the cost of information collection: (i) a non-sampling technique and (ii) a sampling technique. The non-sampling technique is based on compression and employs the best basis algorithm to automatically select the basis for compressing the data according to the structure of the data. The sampling technique improves upon compressive sampling, a recent signal processing technique for acquiring data at low cost, by employing it in an adaptive-rate model in which the sampling rate is tuned to the data being sampled. Our simulation results on measurements collected from a data center show that both data collection techniques incur little information loss while reducing monitoring cost. The best basis algorithm selects the basis in which the data is most concisely represented, allowing a reduced sample size for monitoring. The adaptive-rate model for compressive sampling reduces the sample size by 70% compared with the constant-rate model.

For anomaly detection, this dissertation develops three techniques. In the first, we exploit the properties preserved in the samples of compressive sampling and apply state-of-the-art anomaly detection techniques directly to the compressed measurements. Simulation results show that the detection rate for abrupt changes using the compressed measurements exceeds 95% when the compressed measurements are only 18% of the original data size. In the second, we characterize performance-related measurements as a stream of covariance matrices, one for each designated window of time, and propose a new metric to quantify changes in the covariance matrices. The observed changes are then used to infer anomalies in the system. In the third, anomalies are detected using a low-complexity distributed algorithm when only streams of raw measurement vectors, one for each time window, are available and distributed among multiple locations. We apply our techniques to real network traffic data and show that these latter two techniques supplement existing methods with more detail about the anomalous changes.
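As a rough illustration of the covariance-stream idea mentioned in the abstract, the sketch below splits a stream of measurement vectors into fixed-size windows, computes one sample covariance matrix per window, and scores the change between consecutive windows. It is only a minimal sketch under stated assumptions: the abstract does not define the dissertation's metric, so the Frobenius norm of the difference is used here as a generic stand-in, and the function names, window size, and synthetic data are all hypothetical.

```python
import random


def covariance_matrix(window):
    """Sample covariance of a window, given as a list of equal-length vectors."""
    n = len(window)
    dim = len(window[0])
    means = [sum(row[j] for row in window) / n for j in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for row in window:
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += (row[i] - means[i]) * (row[j] - means[j])
    return [[c / (n - 1) for c in r] for r in cov]


def frobenius_distance(a, b):
    """Frobenius norm of (a - b). A stand-in metric, NOT the dissertation's."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) ** 0.5


def change_scores(stream, window_size):
    """One covariance matrix per non-overlapping window; score consecutive pairs."""
    windows = [stream[i:i + window_size]
               for i in range(0, len(stream) - window_size + 1, window_size)]
    covs = [covariance_matrix(w) for w in windows]
    return [frobenius_distance(covs[k], covs[k + 1]) for k in range(len(covs) - 1)]


# Synthetic data (an assumption for illustration): two metrics that are
# independent for 400 samples, then become strongly correlated.
rng = random.Random(0)


def sample(correlated):
    x = rng.gauss(0, 1)
    y = x + 0.1 * rng.gauss(0, 1) if correlated else rng.gauss(0, 1)
    return [x, y]


stream = [sample(False) for _ in range(400)] + [sample(True) for _ in range(200)]
scores = change_scores(stream, window_size=200)
print(scores)  # two scores; the second spans the change in correlation
```

In this toy setup the second score, which compares the last uncorrelated window with the correlated one, should be the larger of the two. A practical detector would also need a threshold or statistical test to decide when a score indicates an anomaly, which this sketch leaves out.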
URI
http://hdl.handle.net/1860/idea:6647
In Collections
Theses, Dissertations, and Projects