Top PDF Anomaly-based network intrusion detection using machine learning

Anomaly-based network intrusion detection using machine learning

3.9. OPEN ISSUES AND CHALLENGES. [...] on a chip, which are a lot more computationally constrained. If the probes monitor the same network, one probe could train a new neural network model and propagate it to the others to continuously adapt to the evolutions of the network. Conversely, a centralized IDS means a single point of failure, whereas a distributed IDS is much more resilient. This architecture choice depends on multiple factors, including the goal of the IDS in a global network security plan, or the topology of the monitored network. A combination of these two solutions could translate into a centralized neural network with a high-level view of the network, capable of detecting threats that would go unnoticed by the probes. For a centralized solution, it is still possible to use the probes by having them preprocess the data they send to the server. This reduces both the load on the network and the amount of work assigned to the central server. However, IDSs presented in this section are, when this information is available, always trained and tested on a single computer. But even in this case, their analyses are often performed offline, with only a small number capable of real-time detection for high-bandwidth networks. While this is also the case with certain commercial cybersecurity solutions, this outcome is not fully satisfying. This would be an important problem in a real-world setting: if the IDS takes too much time to analyze incoming traffic, malicious data could harm the network before being detected. Or, even worse, the saturated IDS could drop packets containing the attack, making its detection impossible. Beyond a certain threshold, the IDS indeed simply ceases to function [72], a flooding attack that is particularly cheap and effective. This is why the security of the IDS itself is extremely important. This issue is barely covered, if at all, in the reviewed papers, but it remains a major concern for any IDS. In a context where attackers

ONTIC: D5.4: Use Case #1 Network Intrusion Detection

known attacks are continuously emerging. A general anomaly detection system should therefore be able to detect a wide range of anomalies with diverse structures, using the least amount of previous knowledge and information, ideally none. ONTIC UC #1 designed a new autonomous anomaly detection system based on original unsupervised machine learning algorithms designed for that purpose. The most important feature of the anomaly detector is that it does not rely on previously acquired knowledge, nor does it need any training phase or labelled data. In most cases, it is also expected not to rely on human operators for making decisions on the status of detected anomalies (legitimate vs. attack or intrusion, for instance). It also aims at triggering the appropriate counter-measures in most cases. However, as the project research results in WP4 highlight, it is not possible for the anomaly detection to autonomously make decisions for every anomaly. Therefore, a tool is needed that lets a human administrator decide whether a spotted anomaly is legitimate or not and subsequently apply the suited counter-measure. The new functionality that is required, and has been added in the design of the new anomaly detection system, is a network traffic analytics dashboard. It aims at providing human administrators with the appropriate information elements from the detection algorithms to decide what to do. The dashboard provides two sets of information:

Low-Rate False Alarm Anomaly-Based Intrusion Detection System with One-Class SVM

the boldface letters indicate a d-dimensional vector. Each of these inputs is assigned a label y_i ∈ {−1, 1} as output. Moreover, the number of components of each vector is the number of features, since each observation has some characteristics. The provided real data consist of rows depicting examples and columns showing features. These features are low-level indicators captured from server log files, such as CPU usage, CPU load and so forth. These data are in two files: one containing non-attack observations and one capturing non-attack and attack observations. These two datasets are provided by the Groupe Access company, and they were captured from server log files. We have started a project titled "A machine learning method for anomaly-based intrusion detection system". We aimed to construct a system that can detect attacks with the contribution of reducing the false alarm rate. Thus, the main perspective of this research is on reducing this rate based on the one-class SVM algorithm. Groupe Access was founded in 1993, and it provides information technology and hardware services, such as data protection and recovery, infrastructure core network design and management, and monitoring and customizing private, public, and hybrid clouds.

Energy performance based anomaly detection in non-residential buildings using symbolic aggregate approximation

opportunity for building operators and energy data scientists, given the availability of data-driven tools for building monitoring, benchmarking, and control. However, these systems accumulate a vast amount of data over time, creating new challenges in efficient data interpretation. As a result, building energy researchers have been paying closer attention to methods that can enable them to analyse large building datasets for the purpose of remote auditing and automated anomaly detection. A recent review [6] indicates that researchers are widely targeting the topic of anomaly detection using analyses of the whole building energy demand. The two major approaches used in such analyses are data-driven methods [7] and model-based methods [8]. While model-based approaches are capable of simulating the behaviour of buildings under various boundary conditions, they rely on a relatively detailed level of knowledge about the building structure, parameters, and occupant behaviour. On the contrary, data-driven methods can be deployed more quickly by data analysts, though some knowledge of building physics and systems is essential for meaningful interpretation. In addition, the recent advancements in machine learning and data mining techniques have led to several methods that may be leveraged by building energy scientists [9].

Cache-Based Side-Channel Intrusion Detection using Hardware Performance Counters

We present a novel run-time detection approach for cache-based side channel attacks (SCAs). It comprises machine learning models which take real-time data from hardware performance counters for detection purposes. We have performed our experiments with two state-of-the-art cache-based side channel attacks, namely Flush+Reload and Flush+Flush, to evaluate the effectiveness of our detection approach. We have provided the experimental evaluation using real-time system load conditions and analyzed the results on detection accuracy, detection speed, system-wide performance overhead and confusion matrix for the models used. The proposed detection mechanism uses three different machine learning models, namely LDA, LR, and SVM, for intrusion detection. We collect data related to the real-time behavior of running processes through selected CPU events, which are stored in registers called Hardware Performance Counters (HPCs). This data is used as features for our machine learning models. The method is designed for run-time detection of cache-based SCAs on RSA and AES crypto-systems. The proposed method uses carefully selected unique hardware events in a multiplexed fashion to reduce false positives and false negatives. We perform experiments on Intel's Core i5 and i7 machines under No load, Average load, and Full load system conditions. Our results show detection accuracy of up to 99.51% in the best case for Flush+Reload and 99.97% in the best case for Flush+Flush SCA. Our detection approach shows considerably high detection efficiency under realistic system load conditions.

Network intrusion detection system for drone fleet using both spectral analysis and robust controller / observer

Fig. 20. Estimation with real traffic replay - PFC attack. V. CONCLUSION AND FUTURE WORK. In this paper, we have explained how a new hybrid method can improve intrusion detection systems in the specific context of a drone fleet. We have combined the use of a linear controller / observer and spectral analysis of the traffic. Based on a wavelet analysis, this traffic characterization process provides a preliminary level of knowledge about which type of intrusion is performed in the network. Based on this information, our linear controller / observer can be tuned and can perform traffic reconstruction in order to estimate accurately the level of attack observed in the network. Consequently, our design methodology provides a simple way to construct and instantiate our gain matrices for both the AQM controller and the observer. This approach has given us promising results with a simple topology within a time-delay framework. Indeed, two different types of anomaly have been considered in this paper (constant and progressive flash-crowds) and they are both accurately detected by the intrusion detection process proposed in Section 3 and validated in Section 4.

Machine learning-based EDoS attack detection technique using execution trace analysis

accuracy of having a single framework for detecting different types of EDoS attacks at the same time is better than having a separate framework for each attack. In this work we demonstrated that the metrics of one kind of attack play an important role in detecting another type, and vice versa. We considered one representative for each general type of attack introduced in the paper. Thus, the HTTP attack represented Bandwidth-consuming attacks, the Database attack represented attacks that target specific applications, and the TCP SYN Flood attack represented Connection-layer exhaustion attacks. This work can be further expanded to find features for other attacks in addition to HTTP Attacks, Database Attacks, and TCP SYN Flood Attacks. A very interesting elaboration on this work would be to allocate weights to the metrics and then to design a machine-learning algorithm that can automatically predict and detect other types of attacks using this expanded model. Also, access to real-world network traffic was limited in this work, and the evaluation of the work under real network traffic will be addressed in future work.

A new statistical approach to network anomaly detection

IV. CONCLUSIONS. In this paper we have presented an anomaly-based network intrusion detection system, which detects anomalies using statistical characterizations of the TCP traffic. We have compared several stochastic models, such as first-order homogeneous and non-homogeneous Markov chains, high-order homogeneous Markov chains, and stationary and non-stationary ECDF. We have detailed the estimation of the parameters of the models and we have shown the results obtained with the DARPA 1999 data set. The performance analysis has highlighted that the best results are obtained with the use of homogeneous Markov chains and that some improvements can be achieved using high-order Markovian models: for instance, 4th-order Markov chains lead to the same detection rate as first-order models, with almost half as many false alarms.
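
As a rough illustration of the first-order homogeneous Markov chain idea summarized above, the sketch below estimates transition probabilities from attack-free symbol sequences and scores a new sequence by its average negative log-likelihood. The state alphabet (TCP flag names), the Laplace smoothing constant `alpha`, and all function names are assumptions made for this example; the excerpt does not specify the authors' states or estimator.

```python
import math
from collections import defaultdict


def fit_markov_chain(sequences, alpha=1.0):
    """Estimate first-order transition probabilities with Laplace smoothing.

    `alpha` is an assumed smoothing constant, not a value from the paper.
    """
    states = sorted({s for seq in sequences for s in seq})
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    probs = {}
    for prev in states:
        total = sum(counts[prev].values()) + alpha * len(states)
        probs[prev] = {nxt: (counts[prev][nxt] + alpha) / total for nxt in states}
    return probs


def anomaly_score(seq, probs, floor=1e-6):
    """Average negative log-likelihood per transition; higher means more unusual."""
    transitions = list(zip(seq, seq[1:]))
    if not transitions:
        return 0.0
    nll = 0.0
    for prev, nxt in transitions:
        # Unseen states fall back to a small assumed floor probability.
        p = probs.get(prev, {}).get(nxt, floor)
        nll -= math.log(p)
    return nll / len(transitions)


# Hypothetical training data: flag sequences from attack-free connections.
normal = [["SYN", "SYN-ACK", "ACK", "FIN"]] * 50
model = fit_markov_chain(normal)

typical = anomaly_score(["SYN", "SYN-ACK", "ACK", "FIN"], model)
odd = anomaly_score(["SYN", "SYN", "SYN", "SYN"], model)
```

A sequence would be flagged when its score exceeds a threshold chosen on held-out normal traffic. A higher-order chain, as discussed in the excerpt, could be sketched the same way by using tuples of the previous k symbols as the conditioning state.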

Machine learning for IoT network monitoring

VI. CONCLUSION. In this work we presented different machine learning based approaches for IoT network monitoring. First, an experimental smart home network was built to generate network traffic data. Bidirectional TCP flows are then extracted from the generated network traffic. Features used to describe bidirectional flows include the size of the first N packets sent and received, along with the corresponding inter-arrival times. The collected data are used to train different classification algorithms to recognize the IoT device type. An overall accuracy of 99.9% is achieved by the Random Forest classifier. Further details about the work are available in [12]. Finally, we propose to use unsupervised deep learning algorithms, such as autoencoders, to detect attacks in IoT networks. In the future, the developed device type recognition and anomaly detection models will be integrated into a software defined networking (SDN) environment.
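
The flow description above (sizes of the first N packets in each direction plus inter-arrival times) can be sketched as a fixed-length feature vector. The `Packet` record, the zero-padding of short flows, and the ordering of the vector are assumptions for illustration; the excerpt does not give the exact layout used by the authors.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    timestamp: float  # seconds since the start of the capture
    size: int         # bytes
    outbound: bool    # True if sent by the device, False if received


def flow_features(packets, n=3):
    """Build a fixed-length vector from the first `n` packets in each direction.

    Assumed layout: [sent sizes, received sizes, sent inter-arrival times,
    received inter-arrival times], zero-padded when a flow is short.
    """
    def direction_features(pkts):
        pkts = sorted(pkts, key=lambda p: p.timestamp)[:n]
        sizes = [float(p.size) for p in pkts]
        gaps = [b.timestamp - a.timestamp for a, b in zip(pkts, pkts[1:])]
        sizes += [0.0] * (n - len(sizes))
        gaps += [0.0] * (n - 1 - len(gaps))
        return sizes, gaps

    sent_sizes, sent_gaps = direction_features([p for p in packets if p.outbound])
    recv_sizes, recv_gaps = direction_features([p for p in packets if not p.outbound])
    return sent_sizes + recv_sizes + sent_gaps + recv_gaps


# Hypothetical bidirectional flow with two packets in each direction.
flow = [
    Packet(0.00, 60, True),
    Packet(0.02, 60, False),
    Packet(0.05, 120, True),
    Packet(0.09, 1500, False),
]
features = flow_features(flow, n=3)
```

Vectors of this shape could then be passed to any classifier, such as the Random Forest mentioned in the excerpt.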

Anomaly detection in mixed telemetry data using a sparse representation and dictionary learning

ML-based algorithms for AD in telemetry can be divided in two categories depending on their application to univariate or multivariate data. Univariate AD strategies process the different telemetry parameters independently, which is the most widely used approach. Popular ML methods that have been investigated in this framework include the one-class support vector machine [7], nearest neighbour techniques [8-10] or neural networks [11, 12]. These solutions showed competitive results and significantly improved spacecraft health monitoring. However, in order to improve AD in telemetry, it is important to formulate the problem in a multivariate framework and take into account possible correlations between the different parameters, allowing contextual anomalies to be detected. An example of contextual anomaly is shown in Fig. 1 (box #7). The detection of this kind of abnormal behaviour requires a multivariate detection rule. Some recent multivariate AD methods are based on feature extraction and dimensionality reduction [13] or on a probabilistic model for mixed discrete and continuous telemetry parameters [14].

Towards privacy preserving cooperative cloud based intrusion detection systems

Abstract. Cloud systems are becoming more sophisticated, dynamic, and vulnerable to attacks. Therefore, it's becoming increasingly difficult for a single cloud-based Intrusion Detection System (IDS) to detect all attacks, because of limited and incomplete knowledge about attacks and their implications. Recent works on cybersecurity have shown that cooperation among cloud-based IDSs can bring higher detection accuracy in such complex computer systems. Through collaboration, cloud-based IDSs can consult and share knowledge with other IDSs to enhance detection accuracy and achieve mutual benefits. One fundamental barrier within cooperative IDS is the anonymity of the data the IDS exchanges. A malicious IDS can obtain sensitive information from other IDSs by inferring from the observed data. To address this problem, we propose a new framework for achieving a privacy-preserving cooperative cloud-based IDS. Specifically, we design a unified framework that integrates privacy-preserving techniques into machine learning-based IDSs to obtain a privacy-aware cooperative IDS. This allows an IDS to hide private and sensitive information in the shared data while improving or maintaining detection accuracy. The proposed framework has been implemented by considering several machine learning and privacy-preserving techniques. The results suggest that the consulted IDSs can detect intrusions without the need to use the original data. The results (i.e., no records of significant degradation in accuracy) can be achieved using the newly generated data, similar to the original data semantically but not synthetically.

Random Partitioning Forest for Point-Wise and Collective Anomaly Detection - Application to Network Intrusion Detection

We can think of a sixth class of method covering recent advances in deep learning and self-encoding based methods. These approaches were historically initiated by Kramer [19] and adapted recently to a deep learning framework in the form of the auto-encoder (AE) [20] and the Variational Auto-Encoder (VAE) [21]. In the context of anomaly detection, reconstruction error is the criterion used to decide whether a data item is normal or deviates too much from normality. The main advantage of VAEs over AEs is that their latent spaces are, by design, continuous, thanks to the prediction of a mean vector and a variance vector, which locally smooths the latent space. In [22] the authors have proposed KitNET, an online unsupervised anomaly detector based on an ensemble of autoencoders, which are trained to reconstruct the input data, and whose performance is expected to improve incrementally over time. One particularity of KitNET is that it estimates in an unsupervised manner the number of auto-encoders in the ensemble and the dimensions of the encoding layers. The last layer of the KitNET architecture is also an auto-encoder that takes as inputs the Root Mean Square Errors (RMSE) of the auto-encoders in the ensemble and provides as output the final reconstruction vector and RMSE. KitNET is considered the state-of-the-art unsupervised online anomaly detector for intrusion detection on network systems. In 2008, Isolation Forest (IF) [23], an approach conceptually quite different from the previously referenced methods, was proposed. The IF paradigm is based on the difficulty of isolating a particular instance inside the whole set of instances when using (random) partitioning tree structures. It relies on the assumption that an anomaly is in general much easier to isolate than a 'normal' data instance. Hence, IF is an unsupervised approach that relates somehow to the information theoretic based methods since the isolation difficulty is addressed through
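
The isolation principle described above (anomalies need fewer random splits to be separated) can be sketched in plain Python. This is a simplified variant under stated assumptions: instead of building trees once and reusing them, it draws fresh random axis-parallel splits for each scored point, and the names `c_factor`, `path_length`, and `isolation_score`, as well as the parameter defaults, are chosen for this example. The normalisation follows the usual Isolation Forest score, 2 raised to minus the mean path length divided by c(n).

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary search tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def path_length(point, sample, rng, depth, max_depth):
    """Number of random splits needed to separate `point` from `sample`."""
    if len(sample) <= 1 or depth >= max_depth:
        return depth + c_factor(len(sample))
    dim = rng.randrange(len(point))
    lo = min(row[dim] for row in sample)
    hi = max(row[dim] for row in sample)
    if lo == hi:
        return depth + c_factor(len(sample))
    split = rng.uniform(lo, hi)
    # Keep only the side of the split that contains the point being scored.
    if point[dim] < split:
        side = [row for row in sample if row[dim] < split]
    else:
        side = [row for row in sample if row[dim] >= split]
    return path_length(point, side, rng, depth + 1, max_depth)


def isolation_score(point, data, n_trees=100, sample_size=64, seed=0):
    """Score in (0, 1]; values closer to 1 mean the point is easier to isolate."""
    size = min(sample_size, len(data))
    if size < 2:
        raise ValueError("need at least two data points")
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(size))
    total = 0.0
    for _ in range(n_trees):
        sample = rng.sample(data, size)
        total += path_length(point, sample, rng, 0, max_depth)
    return 2.0 ** (-(total / n_trees) / c_factor(size))


# Hypothetical data: a Gaussian cluster plus one distant point.
gen = random.Random(1)
cluster = [(gen.gauss(0, 1), gen.gauss(0, 1)) for _ in range(200)]
data = cluster + [(10.0, 10.0)]
score_outlier = isolation_score((10.0, 10.0), data)
score_center = isolation_score((0.0, 0.0), data)
```

In this toy setting the distant point receives the higher score because it is usually separated after very few splits, whereas a point in the middle of the cluster reaches the depth limit.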

Distance Measures for Anomaly Intrusion Detection

In general, intrusion detection methods based on the transition information model the temporal variation of the audit data. The intrusion detection methods using the frequency information, on the other hand, convert the temporal sequences into some non-temporal representation, typically in the form of multidimensional feature vectors with no time dimension. Our previous work [19] is consistent with Ye's work [20] and indicates that considering the transition information of audit data can improve detection accuracy, but at the cost of some real-time performance compared to using the frequency information. In practice, audit data in intrusion detection problems are typically very large. For example, in collecting system calls of sendmail on a host machine, only 112 messages produced a combined trace with a length of over 1.5 million system calls [5]. Fast processing of massive audit data in real time is therefore essential for a practical Intrusion Detection System (IDS) so that actions for response can be taken as soon as possible. However, intrusion detection methods considering the transition information of audit data usually require much time to
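
To make the contrast between frequency information and transition information concrete, the sketch below turns system call traces into frequency vectors compared with a Euclidean distance, and into sets of adjacent call pairs as a minimal transition view. The toy traces, the choice of Euclidean distance, and the function names are assumptions for illustration, not the measures evaluated by the authors.

```python
import math
from collections import Counter


def frequency_vector(trace, vocabulary):
    """Relative frequency of each system call; the ordering of calls is discarded."""
    counts = Counter(trace)
    total = len(trace) or 1
    return [counts[call] / total for call in vocabulary]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def transition_set(trace):
    """Set of adjacent call pairs: a simple transition-based view for contrast."""
    return set(zip(trace, trace[1:]))


# Hypothetical traces: the second one reorders the calls of the first.
normal = ["open", "read", "write", "close"]
reordered = ["close", "write", "read", "open"]
flood = ["read", "read", "read", "read"]

vocabulary = sorted(set(normal))
d_reordered = euclidean(frequency_vector(normal, vocabulary),
                        frequency_vector(reordered, vocabulary))
d_flood = euclidean(frequency_vector(normal, vocabulary),
                    frequency_vector(flood, vocabulary))
same_transitions = transition_set(normal) == transition_set(reordered)
```

The reordered trace is indistinguishable from the normal one in the frequency view (distance zero) but differs in its transitions, which reflects the trade-off in the excerpt: frequency vectors are cheap to compute but blind to ordering.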

CP-based cloud workload annotation as a preprocessing for anomaly detection using deep neural networks

Institut Mines-Telecom Atlantique, LS2N, Nantes, France. last-name.first-name@imt-atlantique.fr. Abstract. Over the last years, supervised learning has been a subject of great interest. However, in the presence of unlabelled data, we face the problem of deep unsupervised learning. To overcome this issue in the context of anomaly detection in a cloud workload, we propose a method that relies on constraint programming (CP). After defining the notion of quasi-periodic extreme pattern in a time series, we propose an algorithm to acquire a CP model that is further used to annotate the cloud workload dataset. We finally propose a neural network model that learns from the annotated data to predict anomalies in a cloud workload. The relevance of the proposed method is shown by running simulations on real-world data traces and by comparing the accuracy of the predictions with those of a state-of-the-art unsupervised learning algorithm.

Machine learning and extremes for anomaly detection

Chapter 4. Background on classical Anomaly Detection algorithms. [...] actions to be taken, especially in situations where the human expertise required to check each observation is time-consuming. From a machine learning perspective, anomaly detection can be considered as a specific classification/ranking task, where the usual assumption in supervised learning stipulating that the dataset contains structural information regarding all classes breaks down, see Roberts (1999). This typically happens in the case of two highly unbalanced classes: the normal class is expected to regroup a large majority of the dataset, so that the very small number of points representing the abnormal class does not allow to learn information about this class. In a clustering based approach, it can be interpreted as the presence of a single cluster, corresponding to the normal data. The abnormal ones are too limited to share a common structure, i.e. to form a second cluster. Their only characteristic is precisely to lie outside the normal cluster, namely to lack any structure. Thus, common classification approaches may not be applied as such, even in a supervised context. Supervised anomaly detection consists in training the algorithm on a labeled (normal/abnormal) dataset including both normal and abnormal observations. In the novelty detection framework (also called one-class classification or semi-supervised anomaly detection), only normal data are available for training. This is the case in applications where normal operations are known but intrusions/attacks/viruses are unknown and should be detected. In the unsupervised setup (also called outlier detection), no assumption is made on the data, which consist of unlabeled normal and abnormal instances. In general, a method from the novelty detection framework may apply to the unsupervised one, as soon as the number of anomalies is sufficiently small to prevent the algorithm from fitting them when learning the normal behavior. Such a method should be robust to outlying observations.
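
The last point, that a novelty detector can be fitted on unlabeled data if it is robust to a few outliers, can be illustrated with a deliberately simple one-dimensional profile based on the median and the median absolute deviation (MAD). This is a sketch under assumptions, not a method from the thesis: the threshold of 3.5, the toy values, and the function names are chosen for the example.

```python
import statistics


def fit_robust_profile(values):
    """Fit location and scale with the median and MAD, which a few outliers barely move."""
    center = statistics.median(values)
    mad = statistics.median(abs(v - center) for v in values)
    # 1.4826 rescales the MAD to match a Gaussian standard deviation.
    return center, 1.4826 * mad


def is_anomalous(value, profile, threshold=3.5):
    """Flag values whose robust z-score exceeds an assumed threshold."""
    center, scale = profile
    if scale == 0:
        return value != center
    return abs(value - center) / scale > threshold


# Hypothetical unlabeled training set: mostly normal readings, two anomalies.
values = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.7, 55.0, 60.0]
profile = fit_robust_profile(values)

flag_outlier = is_anomalous(55.0, profile)
flag_normal = is_anomalous(10.1, profile)

# Non-robust counterpart for contrast: the two anomalies inflate mean and deviation.
naive_z = abs(55.0 - statistics.mean(values)) / statistics.pstdev(values)
```

With this toy data the robust profile still flags 55.0, while the mean and standard deviation are pulled so far by the two anomalies that the same value falls below the threshold, which is the failure mode the excerpt warns about.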

Joint Optimization of Monitor Location and Network Anomaly Detection

Recently, Zhao et al. [1] argued that link and monitor capacities to handle monitoring flows should be considered while selecting monitor locations. The authors claimed that the problem is quite complex, and proposed a multi-round monitoring scheme that reduces the complexity by a factor of the number of rounds. The major limitation of such an approach is that it increases the delay to detect anomalies by a factor of the number of rounds. In this paper we investigate and reduce the trade-off between the optimization objectives of the two steps. Toward this end, we propose two different ILP formulations that model a joint optimization of monitor location and network anomaly detection problems. Given a set of operational constraints, our ILPs provide optimal locations for monitors and an optimal set of paths to be monitored that minimize the total monitoring cost and satisfy the constraints. The two ILPs were solved on randomly generated network topologies, in order to investigate the complexity of the problem and to obtain a deeper understanding of the interplay between the optimization objectives and their impact on the quality of the solution.

Cyber security risk analysis framework: network traffic anomaly detection

The experiments were conducted utilizing various time series algorithms (Seasonal ETS, Seasonal ARIMA, TBATS, Double-Seasonal Holt-Winters, and Ensemble methods) and Lo[r]

Heuristics for Joint Optimization of Monitor Location and Network Anomaly Detection

Now, we investigate the quality of the solutions. For this purpose, we compute the total monitoring cost as given in (1). Simulation results for the five approaches and the 8 considered topologies are depicted in TABLE IV. This metric illustrates the cost gap between the different approaches and sheds light on the impact of the selective heuristics on the quality of the solution. Surprisingly, the selective algorithm performs better than the exhaustive algorithm and the LP-assisted greedy algorithm, although it does not explore all the network paths. This is because the selective algorithm starts by covering the maximum number of paths using only 2 monitors and without generating redundant measurements, and then balances the load between monitors and links while covering the remaining links. Furthermore, for small networks, the gap between the exact solutions of the ILP and the solutions of the selective algorithm is negligible.

Sequence Covering for Efficient Host-Based Intrusion Detection

Index Terms: Sequence Covering Similarity, Host-based Intrusion Detection, System Calls, Semi-Supervised Learning, Zero-Day. 1 INTRODUCTION. Intrusion Detection Systems (IDS) are more and more heavily challenged by intrusion scenarios developed by today's hackers. The number of reported intrusion incidents has dramatically increased during the last few years with very serious consequences for organizations, companies and individuals. As an example, the Trojan horse TINBA (which stands for TINy BAnker) has targeted with apparent success the worldwide banking system during the last three years [1], [2], [3]. The detection of zero-day attacks (attacks that have never been detected before) is even more challenging since no pattern or signature characterizing this kind of attack can be used to identify it. Furthermore, with the development of the IoT, the rate of the production of sequences of system calls, i.e. sequential data used to access, manage, or administrate connected equipment, is exploding. Hence, the need to develop and use efficient intrusion detection algorithms that can identify, isolate and handle suspicious patterns in sequential information flows is ever more pressing.
