2 Chapter 2
Past decades have witnessed great advances in information and communication technology (ICT), a field that is expected to keep growing and attracting attention in the near future. Among the many applications of information technology, components embedded in different devices have come to play a significant role in many aspects of our lives.
Embedded systems are computer systems designed to perform specific functions, usually in real time, and embedded as part of a larger system. They range from portable devices (e.g. smart phones and MP3 players) to large installations (e.g. plant control systems).
Nowadays, the tight relation between the cyber world and the physical world distinguishes cyber-physical systems (CPS) from traditional embedded systems. The main characteristic of CPS is the integration of computation and physical processes. In these systems, networked devices with computational components work together to monitor, sense and actuate elements in the physical world.
There are many examples and applications of CPS. They include large-scale systems such as health care, automation, transportation and smart grid systems. In addition, a new class of mobile cyber-physical applications is emerging that uses smart phones and mobile Internet devices equipped with multiple sensors. In all types of CPS, the most important issue is properly understanding and resolving the complicated interaction between physical and computational elements. In these systems, the cyber section is a set of control logic and sensor units and the physical section is a set of actuator units.
The term “Cyber-Physical Systems” has emerged as an important research topic that combines computation, communication and control. Although the breadth of the term makes a single definition difficult, a CPS can be described as a physical engineering system whose operation is controlled, monitored and integrated by a computational core.
In CPS, understanding physical and computational processes separately is not sufficient. CPS is about their intersection, so it is important to understand how physical and computational components interact.
As the connection diagram in Figure 2-1 shows, sensors capture the status of the physical world and transmit it to the CPS, where the computational components process the received information. The system then decides which action to take based on the result, and the actuators perform that action. Eric Ke Wang defines the following main steps of the CPS workflow:
- Monitoring: This fundamental function monitors the physical environment and its processes. It lets the CPS give feedback on previous actions and ensure that future operations are carried out correctly.
- Networking: This function is responsible for aggregating the data received from sensors. Many networked sensors in a CPS generate data in real time. Meanwhile, different services need to interact and communicate with the network.
- Computing: The data monitored in step one and aggregated in step two is analyzed in step three, the computing step. This function checks whether the results of the analysis satisfy the previously defined criteria. If they do not, the computing section initiates corrective actions. For example, a data-center CPS can detect a rise in temperature.
- Actuation: The results from the computing elements are sent to the actuators, which execute them. Possible actions include correcting the cyber behavior and modifying the physical process, for example shutting down a system before a probable explosion.
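The four steps above can be illustrated with a minimal Python sketch of one pass through the workflow. The sensor names, the temperature limit and the "shutdown" action are assumptions made only for this illustration (loosely following the data-center example) and are not taken from the cited workflow.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List


@dataclass
class Reading:
    sensor_id: str
    temperature_c: float


def monitor(sensors: Dict[str, Callable[[], float]]) -> List[Reading]:
    """Monitoring: sample every sensor once."""
    return [Reading(sid, read()) for sid, read in sensors.items()]


def aggregate(readings: List[Reading]) -> float:
    """Networking: combine the readings into one value (here, the mean)."""
    return mean(r.temperature_c for r in readings)


def compute(avg_temp_c: float, limit_c: float) -> str:
    """Computing: check the predefined criterion and choose an action."""
    return "shutdown" if avg_temp_c > limit_c else "none"


def actuate(action: str, log: List[str]) -> None:
    """Actuation: carry out the action (in this sketch it is only recorded)."""
    if action != "none":
        log.append(action)


def control_step(sensors: Dict[str, Callable[[], float]],
                 limit_c: float, log: List[str]) -> str:
    """One pass through monitoring, networking, computing and actuation."""
    action = compute(aggregate(monitor(sensors)), limit_c)
    actuate(action, log)
    return action


# Hypothetical data-center example with two fixed-value sensors.
sensors = {"rack-1": lambda: 41.0, "rack-2": lambda: 47.0}
actions: List[str] = []
print(control_step(sensors, limit_c=40.0, log=actions))  # mean is 44.0 -> shutdown
```

In a real CPS each step would run continuously and on different devices; the sketch only shows how the output of one step feeds the next.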
CPS offer many more functions, capabilities and services than embedded systems can provide. Users cannot influence embedded systems. Embedded systems only help with automated tasks and are usually not visible to the user. Any action needed in an embedded system is performed under the complete control of the user. In contrast to embedded systems, CPS act on data gathered from the physical world in real time and react to this data through predefined orders while cooperating with services, local systems, the Internet of Services (IoS) and the Internet of Things (IoT).
In summary, CPS focus on the connection between the physical world and the cyber world, while traditional embedded systems concentrate on computational elements. Figure 2-2 shows the structure of a CPS more clearly.
CPS surpass embedded systems in several respects: they are more reliable, safe, efficient, robust and adaptable. For instance, a fast response to sensor readings can help avoid the damage caused by an explosion at a gas station or by a car accident. These systems can also enable more precise robotic surgery, resulting in less pain, blood loss, etc. As a result, research on CPS is growing quickly and is highly important today. Given their wide range of applications, CPS can enhance the quality of human life.
1.1. CPS characteristics
Key characteristics of CPS can be summarized as follows:
- System of systems: In contrast to embedded systems, a CPS is a complex system consisting of many subsystems that interact with each other and can also stand alone. Therefore, a CPS is much more complex than a traditional individual embedded system.
- Interactions between control, communication and computation: CPS should be automated, and non-technical (human) factors should be removed from the control loop. Therefore, the control, communication and computing elements should be considered in parallel while designing the system.
- Coupled cyber and physical world: The physical world in CPS should be tightly coupled with the cyber world. Consequently, large-scale wireless and wired networks become highly important.
1.2. CPS background
In 2008, the US President’s Council of Advisors on Science and Technology (PCAST) prioritized CPS as a top federal research investment. Following PCAST’s recommendations on the importance of CPS, the CPS program was started at the National Science Foundation (NSF) in 2009. The NSF CPS program concentrates on fundamental issues concerning sectors such as health care, automotive, energy, transportation and aerospace.
In addition, this program supports the development of methods, hardware, software and tools to accelerate the realization and usage of CPS in different domains. The NSF CPS program also established the CPS Virtual Organization to prepare and grow the research and education community for emerging innovations and applications of CPS.
In 2010, several workshops and conferences on CPS and their applications were started, mainly through the cooperation of the ACM and the IEEE. The first conference on CPS (ICCPS) was successful, with many interesting papers and sessions. A timeline of CPS history is shown in Figure 2-3.
1.3. CPS applications
Nowadays CPS are used in various domains including health care systems, assisted living, advanced automotive systems, traffic control and safety, energy conservation, environmental control, critical infrastructure (e.g. power, water), robotics and manufacturing. In this section, four examples of CPS applications are explained: health care and medicine, aerospace, the electric power grid and automotive systems.
1.3.1. Health care and medicine
The health care domain includes home care, operating rooms, robotic surgery, national health data, electronic patient records, etc. These systems, which are mostly computer-controlled and in some cases work in real time, require a high degree of safety and timing accuracy.
The health care domain of CPS, usually called Medical CPS (MCPS), uses the cyber section of CPS to help doctors and patients interact more easily so that patients receive better treatment. With MCPS, doctors can monitor patients remotely rather than through local stand-alone systems.
Wireless technology significantly improves the safety of health care systems: the massive wired connections between systems in current health care environments often result in a crisscross of cables named “malignant spaghetti”, which is a serious vulnerability that puts patients’ lives in danger. Reliability is of high importance in MCPS, and enhancing overall reliability through new technologies and theories is one of the researchers’ priorities.
1.3.2. Aerospace
CPS research has a significant impact on the design of aircraft as well as on air traffic management, with the aim of significantly improving aviation safety. Some key research issues in aerospace CPS are as follows:
(i) new functionalities to achieve higher capacity, greater safety and more efficiency, as well as tradeoffs among their possibly conflicting goals; (ii) integrated flight deck systems, moving from displays and concepts for pilots to future autonomous systems; (iii) vehicle health monitoring and management; (iv) safety research relative to aircraft control systems.
One of the main challenges is the design verification and validation of extremely complex flight systems. Since the complexity of flight systems is constantly increasing, the cost of verification and validation also increases. Research on verification and validation of flight-critical aviation systems includes how to provide methodologies for rigorous and systematic high-level validation of various system safety properties and requirements. This is evaluated in all phases, from initial design through implementation, maintenance and modification; it is also highly required to understand tradeoffs between.
1.3.3. Power grid
A power grid CPS is formed from power electronics, the power grid and embedded control software. Designing this type of CPS requires a high level of security, fault tolerance and decentralized control. Research on smart power grids has recently gained tremendous interest, and their development is of great public interest, which makes it a high priority for policymakers. Protecting the energy infrastructure from failures as well as outside attacks is a top priority. For example, under certain unexpected conditions, a failure in one location of the electric power grid can propagate across the grid, leading to many further failures and blackouts.
The key objective is to design a robust power grid network by introducing real-time control into the composition of cyber and physical elements in the grid. In particular, security policy, intrusion detection and mitigation must be carefully designed to deal with possible outside attacks.
1.3.4. Automotive systems
Nowadays vehicle systems are far more advanced than purely mechanical systems. Automotive systems are used everywhere. Around 30-90 processors are embedded and networked in each car, in subsystems such as the brake system, engine control, airbag system and door locks. Furthermore, cars may connect to each other and communicate using the Internet, vehicle-to-vehicle networks or cellular networks. In this situation, the safety and security of these systems become highly critical. They should guarantee reliability for complex networked software.
Automotive CPS are among the most widely used in everyday life and among the most critical, so they must be highly secured: even a small accident can cause great damage and take many lives. In the US, almost 42,000 fatal accidents currently happen each year, a number that could be dramatically reduced by more intelligent systems that help drivers. Current technologies for collision avoidance are passive and depend heavily on the driver’s interaction. Consequently, the automation of collision avoidance is of great interest. With advanced technologies for onboard sensing and in-vehicle computation, as well as global positioning systems (GPS) and inter-vehicle information exchange, near-zero automotive traffic fatalities and significantly reduced traffic congestion are expected to be achievable. With the growing development of CPS, new solutions can also be applied to unmanned vehicles. Researchers are working on a program to integrate unmanned vehicles and intelligent roads as a CPS.
1.4. SCADA
SCADA (supervisory control and data acquisition) is one type of industrial control system (ICS) that helps to monitor and control operations remotely via communication channels. ICSs can monitor existing industrial processes in the physical world. SCADA cooperates with many types of CPS. It can be employed to acquire raw data about the status of remote equipment, and it usually uses a different communication channel for each remote station.
SCADA is at the core of industries such as transportation, manufacturing, power, gas, oil, water and many others. It cooperates well with many types of CPS, since SCADA systems range from simple configurations to large, complex projects. SCADA can be found almost everywhere in daily life, but it is not easily seen because it works behind the scenes. You can find SCADA at your local supermarket, at a wastewater treatment plant or, more importantly, at gas stations. SCADA systems employ many software and hardware elements that allow industrial organizations to:
- Aggregate, control, monitor and process data.
- Control devices and interact with them. Devices are connected through human-machine interfaces (HMI).
- Store all events in log files.
In SCADA, PLCs (programmable logic controllers) and/or RTUs (remote terminal units) receive data from sensors and/or manual inputs and send these data to computers running SCADA software. These computers then analyze the data and display the results. SCADA reduces wasted time and improves the efficiency of the manufacturing process, which may result in significant savings of money and time. Figure 2-4 shows the SCADA architecture and the connections between sensors, RTUs, PLCs and the SCADA system.
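The data flow described above, from sensors through RTUs to the supervisory computer that analyzes and logs the values, can be sketched as follows. The class names, the tag "pressure_bar" and the limit value are hypothetical and serve only to illustrate the flow; they do not correspond to any particular SCADA product or protocol.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class RTU:
    """Hypothetical remote terminal unit: maps a tag name to a sensor read function."""
    station: str
    sensors: Dict[str, Callable[[], float]]

    def poll(self) -> Dict[str, float]:
        """Read every attached sensor once."""
        return {tag: read() for tag, read in self.sensors.items()}


@dataclass
class ScadaHost:
    """Hypothetical supervisory computer: collects, checks and logs values."""
    high_limits: Dict[str, float]
    event_log: List[Tuple[str, str, float]] = field(default_factory=list)

    def collect(self, rtus: List[RTU]) -> Dict[str, Dict[str, float]]:
        """Poll each RTU and log every value that exceeds its high limit."""
        snapshot: Dict[str, Dict[str, float]] = {}
        for rtu in rtus:
            values = rtu.poll()
            snapshot[rtu.station] = values
            for tag, value in values.items():
                limit = self.high_limits.get(tag)
                if limit is not None and value > limit:
                    self.event_log.append((rtu.station, tag, value))
        return snapshot


rtus = [
    RTU("pump-station-A", {"pressure_bar": lambda: 6.2}),
    RTU("pump-station-B", {"pressure_bar": lambda: 9.1}),
]
host = ScadaHost(high_limits={"pressure_bar": 8.0})
snapshot = host.collect(rtus)
print(host.event_log)  # [('pump-station-B', 'pressure_bar', 9.1)]
```

The snapshot corresponds to what an HMI would display, while the event log corresponds to the stored record of events.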
1.4.1. Security challenge in SCADA
In contrast to other ICSs, SCADA systems are used over large distances for large-scale processes that span multiple sites. Many security attacks against utility assets have been reported. Security vulnerabilities in critical systems can lead to fatal disruptions and may disclose sensitive information. Once a system vulnerability is known, an attacker can execute an attack in less than one hour. The growing usage of the Internet helps attackers to launch an attack from multiple locations. The most dangerous type of attack is one in which attackers gain supervisory control access and execute disruptive commands.
1.5. Architectural topology
There are three main topologies for CPS:
- Centralized topology: In this topology, data is collected from distributed sensors, and all sensors and actuators are monitored through a single deployed middleware. This makes the CPS easier to manage and control and provides a more secure environment. However, more devices are added every day, which makes every system more complex, so this tightly coupled centralized topology becomes problematic.
- Distributed topology: In this topology, a very small “middleware” is implemented on each physical device. This middleware is responsible for controlling the physical part and for peer-to-peer connections with other sections. For instance, a middleware may consist of agents and actors, entities that provide adaptive load balancing and monitoring while moving across networked sensors. This topology yields scalable systems, since as many physical devices and computing elements as needed can be added without interfering with other elements. On the one hand, this topology minimizes network congestion because it has no bottleneck. On the other hand, complex computations cannot be executed because the physical devices have limited resources, and managing the devices becomes harder.
- Nested topology: Combining the two previous topologies gives the nested topology. A cyber-physical system in this topology can contain more than one local CPS network, each of which can be either centralized or distributed.
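As a rough illustration of how the three topologies relate, the sketch below models a nested CPS as a collection of local networks, each marked as centralized or distributed, and counts the middleware instances each arrangement implies. The class and device names are assumptions introduced for this example only.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Topology(Enum):
    CENTRALIZED = "centralized"
    DISTRIBUTED = "distributed"


@dataclass
class LocalCPS:
    name: str
    topology: Topology
    devices: List[str] = field(default_factory=list)

    def middleware_instances(self) -> int:
        """One shared middleware if centralized, one per device if distributed."""
        if self.topology is Topology.CENTRALIZED:
            return 1 if self.devices else 0
        return len(self.devices)


@dataclass
class NestedCPS:
    """A nested CPS is a collection of local CPS networks of either kind."""
    networks: List[LocalCPS]

    def middleware_instances(self) -> int:
        return sum(n.middleware_instances() for n in self.networks)


plant = NestedCPS([
    LocalCPS("boiler-room", Topology.CENTRALIZED, ["temp-1", "temp-2", "valve-1"]),
    LocalCPS("field-sensors", Topology.DISTRIBUTED, ["node-1", "node-2"]),
])
print(plant.middleware_instances())  # 1 + 2 = 3
```

The count makes the trade-off visible: the centralized network has a single point of management, while the distributed one grows by one small middleware per added device.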
1.6. Three-Tiers CPS architecture
Three-tiers CPS architecture includes the following parts: 
- Service tier: Employs different services such as cloud computing (CC) and builds a computing environment.
- Environmental tier: Includes the physical devices and is in contact with the end users, who form the target environment.
- Control tier: Makes control decisions based on the monitored information aggregated from sensors. This tier finds the right services by consulting the service tier and provides the services requested by physical devices.
The three-tiers architecture is shown in Figure 2-5.
1.7. CPS Challenges
CPS is a very active and critical research subject, and researchers need to answer many questions concerning architecture layers, system design and, most importantly, the integration of the physical and cyber worlds. Six main challenges are discussed for CPS:
- Control and hybrid systems. An updated mathematical theory is required to integrate time-based and event-based systems for feedback control. The theory should suit hybrid systems with different geographic scopes and timescales.
- Sensor and mobile networks. Gathering and aggregating information from huge amounts of unprocessed data is one of the critical steps in CPS. Therefore, a self-organizing and reorganizing mobile network is required for CPS.
- Abstractions. New resource allocation schemes are required for real-time embedded systems and computational abstractions to ensure that the system achieves scalability, fault tolerance, optimization, etc. Therefore, with emerging technologies and new needs, new distributed real-time computing and communication methods are required.
- Model-based development. Although several model-based development methods exist, they are far from meeting the demands of CPS. Abstraction and modeling are needed for computing, communications and physical dynamics across a variety of scales, localities and time scopes.
- Verification, validation and certification. Formal methods in different sections should be able to interact easily and safely. Testing methods and compositional verification should be applied to the system.
- Robustness, reliability, safety and security. The most critical issue in CPS is safety and security. System safety, robustness, reliability and security should be guaranteed in the face of uncertain environments, security attacks and errors coming from the physical world and physical devices. The mechanisms in CPS should be time-based, location-based and tag-based to be able to solve and mitigate security problems.
Huang-Ming Huang has raised and discussed three important, interconnected fundamental issues in developing and evaluating real-time hybrid CPS:
- Integration of physical and simulated versions of components using a reusable middleware architecture;
- Achieving predictable timing over available hardware and software platforms; and
- Interchanging physical and simulated versions of components within a system with high accuracy.
Security of CPS can be categorized into three main aspects:
- Perception security, which is to ensure the security and accuracy of the information collected from physical environment;
- Transport security, which is to prevent the data from being destroyed during the transmission processes;
- Processing center security, such as physical security and safety procedures in servers or workstations .
1.7.1. CPS intrusion detection
In conclusion, the most important CPS issues are availability, reliability and security, and the first two can be affected by security issues. Security is the most significant challenge in critical infrastructures such as CPS, which are highly integrated into today’s world. As this integration increases, securing CPS becomes more and more important. A compromised sub-system, sensor or node can disrupt CPS functionality. Therefore, it is of high importance to increase the security of CPS and to decrease the possibility of attacks and intrusions. The best security solutions nowadays are designed with an intrusion detection system (IDS). CPS intrusion detection has one of the highest priorities in security research in companies around the world. This thesis aims to survey the challenges of CPS intrusion detection, compare the existing techniques and improve one of the common intrusion detection techniques used in CPS to increase CPS security against cyber-attacks. In the next chapter, a comprehensive survey of the existing CPS intrusion detection techniques is provided. In addition, a classification tree is introduced to organize existing CPS intrusion detection schemes.
3 Chapter 3
Nowadays, with the increasing speed, efficiency, number and interconnection of computer systems in critical infrastructure, the need for more reliable security systems has grown significantly. From 1984 to 1986, Peter Neumann and Dorothy Denning carried out research on real-time system security, which resulted in an IDS based on expert systems. This system was named IDES (Intrusion Detection Expert System). The idea behind their project became the foundation of many later IDSs.
CERT (Computer Emergency Response Team) reports that, with the daily growth of computer systems and the Internet, the number of intrusions has increased dramatically. NIST (the National Institute of Standards and Technology) defines an intrusion as an attempt by an attacker to compromise confidentiality, integrity and availability (CIA) or to bypass security rules. It also defines intrusion detection as the process of monitoring and analyzing running events for signs of attempted intrusion. In addition, because programming cannot be done without mistakes, rapidly changing software leaves behind many exploitable vulnerabilities. These issues show the necessity of having an IDS as a stronger wall to protect networks against attacks. This necessity is more apparent in infrastructural systems such as CPS, since they have critical missions and are connected to the physical world, and any damage to them may cause huge financial losses and loss of life. IDSs analyze network data to detect and record suspicious events in real time; they are systems that automate the detection process. An intrusion prevention system (IPS) is similar to an IDS but can also stop possible attacks. In some articles, the term intrusion detection and prevention system (IDPS) is used as a synonym for IPS, but the term IDPS is rarely used.
Although traditional prevention methods, including user authentication, access control, firewalls and encryption, serve as the first line of defense for network security, they are not enough. If users choose weak passwords, user authentication can be compromised. Firewalls, on the other hand, are vulnerable to configuration errors and cannot check the content of packets. They can also be bypassed by tunnels and are unable to detect malicious mobile code or insider attacks. There are three groups of attackers:
- Attackers who gain access to the network through Internet connections;
- Valid users who try to gain privileged access that they are not allowed to have;
- Valid users who have complete access but abuse it.
In the following, IDS complements, a computer attack taxonomy, potential attacks on CPS and different types of IDSs are examined.
2.1. IDS complements
IDSs are one of the steps taken to secure computer networks. In other words, IDSs are one of several security systems that protect the network, and the other security systems should cooperate with them as complements. These systems have different positions and responsibilities in the network. Some of these complements are introduced in this section.
2.1.1. Security policy
A security policy is an important step in every security project. Its aim is to specify the rules that define user privileges and the security properties that should be satisfied. Several standards, such as ISO 17799 and ISO 27001, can be used for designing system frameworks. ISO security standards hold companies to specific norms and requirements.
2.1.2. Firewall
The main goal of a firewall is to protect the network behind it. It is important to have appropriately configured firewalls on every network. Firewalls are capable of inspecting each packet and detecting any known attack that matches the rules in their table. Security engineers should configure firewalls very carefully, based on security policies appropriate to the requirements of the organization. A firewall can restrict malicious access from outside the network. It examines information at the network layer (IP address), transport layer (port address, multiplexing) and application layer (application).
2.1.3. Authentication and encryption
Encryption is one of the most common and effective ways to protect information. It can provide reliable point-to-point transmission of data between clients, servers and routers. However, encryption cannot be the only way to maintain security. By discovering a vulnerability in the network, an attacker can still threaten critical information, and the system can remain defenseless against DoS attacks. Authentication mechanisms that use usernames and passwords are also not sufficient on their own, especially when the majority of users choose weak passwords.
2.1.4. Access control list
Firewalls use sets of rules called access control lists, which each organization writes separately based on its security policies. These lists are used to restrict malicious traffic or to define permissions. The lists alone cannot stop attacks; they must be enforced by firewalls. As an example, access to specific services can be restricted based on a predefined range of IP addresses.
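The IP-range example can be made concrete with a small sketch of first-match rule evaluation, using Python's standard `ipaddress` module. The rule format, the addresses and the choice of port 502 (Modbus/TCP) are assumptions made for illustration and do not reproduce the syntax of any particular firewall.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AclRule:
    action: str   # "permit" or "deny"
    source: str   # CIDR range, e.g. "10.0.5.0/24"
    port: int     # destination port of the protected service


def evaluate(rules: List[AclRule], src_ip: str, dst_port: int) -> str:
    """Return the action of the first matching rule; deny if no rule matches."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        if dst_port == rule.port and addr in ipaddress.ip_network(rule.source):
            return rule.action
    return "deny"


# Hypothetical policy: only the engineering subnet may reach port 502,
# except for one host that is explicitly blocked.
acl = [
    AclRule("deny", "10.0.5.13/32", 502),
    AclRule("permit", "10.0.5.0/24", 502),
]
print(evaluate(acl, "10.0.5.20", 502))    # permit
print(evaluate(acl, "10.0.5.13", 502))    # deny
print(evaluate(acl, "192.168.1.7", 502))  # deny
```

Rule order matters in this sketch: the more specific deny rule is listed first so that it takes effect before the broader permit rule.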
2.2. Computer attack taxonomy
A well-defined taxonomy classifies attacks into categories that share common properties. Categories should not overlap: placing an attack in one category excludes it from the others. Existing and probable attacks on CPS are categorized into four main groups: probing, denial of service, user to root and remote to user.
2.2.1. Probing attacks
An attack belongs to the probing class if an attacker or intruder tries to gather information or detect vulnerabilities by scanning the network. An intruder who obtains a map of the devices on the network can easily scan for vulnerabilities and exploit them. There are various kinds of probes: for example, some use social engineering methods and some abuse features of the device. Probing is one of the most common classes of attack since the attacker does not need to be a technical expert. Attacks in the probing class are listed in Table 3-1.
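One simple way to spot scanning behavior of this kind is to count how many distinct host and port pairs each source contacts. The sketch below is a minimal illustration under that assumption; the threshold and the traffic records are invented for the example, and real probes can be slower or more evasive than this heuristic can catch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Connection = Tuple[str, str, int]  # (source IP, destination IP, destination port)


def find_scanners(connections: Iterable[Connection], max_targets: int) -> List[str]:
    """Return sources that contacted more than max_targets distinct (host, port) pairs."""
    targets: Dict[str, Set[Tuple[str, int]]] = defaultdict(set)
    for src, dst, port in connections:
        targets[src].add((dst, port))
    return sorted(src for src, seen in targets.items() if len(seen) > max_targets)


# Hypothetical traffic: one host sweeps ports 20-29, another makes two normal requests.
traffic: List[Connection] = [("10.0.0.9", "10.0.0.1", p) for p in range(20, 30)]
traffic += [("10.0.0.4", "10.0.0.1", 80), ("10.0.0.4", "10.0.0.1", 443)]
print(find_scanners(traffic, max_targets=5))  # ['10.0.0.9']
```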
2.2.2. Denial of service (DoS) attacks
In a DoS attack, the intruder consumes so much memory or computing capacity that the server and network are kept busy and cannot handle requests for other services. This stops legitimate users from accessing the system and using its services. DoS attacks can be carried out in different ways:
- Abusing legitimate features of the system;
- Targeting implementation bugs;
- Exploiting the system’s vulnerabilities.
These attacks are categorized by the services that become unavailable to authorized users during the attack. Table 3-2 presents some types of DoS attacks and their impacts.
2.2.3. User to root (U2R) attacks
In this group of attacks, the attacker starts by accessing a normal user account on a system and then gains root access by exploiting system vulnerabilities. Many of the attacks in this group are due to common mistakes and programming bugs. Some of the attacks in the U2R class are listed in Table 3-3.
Table 3‑3. User to Root Attacks and their impacts
2.2.4. Remote to user attacks
In this class of attacks, the intruder sends packets to a system in order to exploit a vulnerability and gain local access. There are various kinds of remote to user (R2U) attacks. For instance, social engineering methods are among the most common in this class. Table 3-4 lists some common attack types in this class.
Table 3‑4. Remote to User Attacks
2.3. Potential threats to CPS
Figure 3-1 shows the general architecture of a CPS and its potential threats. Attacks A1, A2 and A3 denote threats against the physical system, sensors and actuators, respectively. Such attacks cannot be mitigated by traditional security measures alone, since significant physical measures are needed to prevent and block them. The A2 and A3 attacks aim to compromise sensor and actuator data in order to mount deception and DoS attacks. Therefore, strict security measures must be deployed against such threats.
A4, A5 and A6 attacks aim to disrupt system computations, cause denial of service, eavesdrop on the system, disclose confidential information by compromising security keys, or a combination of these. A4 refers to all forms of cyber-attack on the computer networks. Preventing such attacks requires appropriate mechanisms such as authentication and encryption for secure connections, and the elimination of hidden communication channels that can be eavesdropped. A5 and A6 are external and internal attacks against computing devices or controllers. External attacks and intrusions can be carried out by a malicious entity such as a user with physical access to controllers.
2.4. Different types of IDS used in CPS
Various types of IDS are used in CPS today. They are classified into five categories: network-based IDS (NIDS), host-based IDS (HIDS), wireless IDS (WIDS), application-based IDS and distributed IDS (DIDS). This section briefly reviews the different IDSs employed in CPS.
2.4.1. Host-based intrusion detection system
These systems, called HIDS, protect a single host against malicious incidents and attacks by monitoring and collecting the host’s information. A HIDS detects intrusions by analyzing system files, system calls, events, etc. and can be deployed easily on a hypervisor or virtual machine (VM). A HIDS can also enforce special security policies on the host.
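One of the checks a HIDS can perform on system files is integrity monitoring: record a digest of each monitored file and report any file whose digest later changes. The following is a minimal sketch of that idea using the standard `hashlib` module; the file name and contents are hypothetical, and a real HIDS would combine this with system call and event analysis.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, Iterable, List


def fingerprint(paths: Iterable[Path]) -> Dict[str, str]:
    """Record a SHA-256 digest for each monitored file."""
    return {str(p): hashlib.sha256(p.read_bytes()).hexdigest() for p in paths}


def changed_files(baseline: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Files whose digest differs from the baseline, or that have disappeared."""
    return sorted(path for path, digest in baseline.items()
                  if current.get(path) != digest)


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "service.conf"        # hypothetical monitored file
    cfg.write_text("mode=safe\n")
    baseline = fingerprint([cfg])
    cfg.write_text("mode=unsafe\n")         # simulated tampering
    alerts = changed_files(baseline, fingerprint([cfg]))
    print(len(alerts))  # 1
```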
2.4.2. Network-based intrusion detection system
A NIDS detects threats (e.g. DoS, port scans) by monitoring network traffic. This type of IDS collects data from across the network and compares it with predefined patterns of known attacks.
A NIDS detects intrusions by monitoring the IP and transport layer headers of each individual packet. It employs signature-based and anomaly-based intrusion detection methods and is largely invisible to the users of the network. A NIDS cannot decrypt encrypted traffic to analyze its contents.
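The difference between the two detection methods can be sketched briefly. Signature-based detection compares header fields with known patterns, whereas anomaly-based detection flags values that deviate strongly from learned normal behavior. The header fields, the single signature and the three-standard-deviation rule below are assumptions chosen for illustration, not the method of any specific NIDS.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Optional


@dataclass(frozen=True)
class Header:
    src_ip: str
    dst_port: int
    tcp_flags: str  # e.g. "S" for SYN, "SF" for SYN+FIN


@dataclass(frozen=True)
class Signature:
    name: str
    dst_port: Optional[int]  # None matches any port
    tcp_flags: str


def match_signatures(header: Header, signatures: List[Signature]) -> List[str]:
    """Signature-based detection: compare header fields with known patterns."""
    return [s.name for s in signatures
            if (s.dst_port is None or s.dst_port == header.dst_port)
            and s.tcp_flags == header.tcp_flags]


def is_rate_anomaly(history: List[int], current: int, k: float = 3.0) -> bool:
    """Anomaly-based detection: flag a packet rate far above the learned mean."""
    return current > mean(history) + k * pstdev(history)


signatures = [Signature("syn-fin-scan", None, "SF")]
print(match_signatures(Header("10.0.0.9", 502, "SF"), signatures))  # ['syn-fin-scan']
print(is_rate_anomaly([100, 110, 90, 105, 95], current=400))        # True
```

Signature matching can only find attacks that are already described, while the anomaly check can flag unknown attacks at the cost of possible false alarms.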
2.4.3. Wireless intrusion detection system
NIDS and WIDS are similar, with one major difference: unlike NIDS, WIDS also records traffic from ad hoc and wireless sensor networks.
2.4.4. Distributed intrusion detection system
A distributed IDS combines more than one NIDS, WIDS and HIDS, which communicate with a central server and with each other.
2.4.5. Application-based intrusion detection system
Some articles treat this category as a subcategory of HIDS. This type of IDS focuses on one assigned application and analyzes only that application, so its monitoring is limited to that application on the host. It is therefore able to detect malicious activity from virtual users who try to abuse that specific application.
2.5. Evaluation metric
The performance of the proposed model is analyzed in terms of accuracy, time and CPU usage. The basic data structure used for evaluation is the confusion matrix. Table 3-5 shows the confusion matrix.
Table 3‑5. Confusion Matrix 
In the context of IDS, attacks that are identified correctly are referred to as true positives (TP), and normal packets that are identified correctly are referred to as true negatives (TN). A false positive (FP) is normal traffic that is mistakenly classified as an attack, and a false negative (FN) is an attack that is incorrectly classified as normal.
- Classification Accuracy. It is the percentage of packets that have been correctly classified:
Accuracy = [(TP + TN) / (TP + TN + FP + FN)] * 100    (3-1)
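As a minimal sketch of Equation (3-1), the accuracy can be computed directly from the four confusion matrix counts. The function name and the example counts are assumptions made for illustration and are not results from this work.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Classification accuracy in percent, following Equation (3-1)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Multiply first so that the division happens only once.
    return 100 * (tp + tn) / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(accuracy(tp=90, tn=880, fp=20, fn=10))  # 97.0
```

With these counts, 970 of 1000 packets are classified correctly, which gives 97 percent.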
2.6. Intrusion detection classification
Figure 3-2 illustrates the categorization tree of IDSs. Recent research categorizes IDSs based on five criteria [22, 21]:
- Detection Technique: This criterion distinguishes IDSs by their basic approach to detection.
- Collection Process: This criterion separates behavior-based IDS from traffic-based IDS.
- Trust Model: This criterion distinguishes IDSs that depend on raw data or analysis results from independent IDSs.
- Response Technique: This criterion compares active and passive responses used to fend off an attack.
- Analysis Technique: This criterion separates complex data mining from simple pattern matching. While the detection technique specifies what the intrusion detection is looking for, the analysis technique specifies how to look for it.
Although some researchers divide intrusion detection techniques into four subcategories, most divide them into two general categories: signature-based and anomaly-based detection. Anomaly-based intrusion detection compares all events with normal behavior to recognize significant behavioral deviations. This detection method looks for features at run time that are uncommon.
Figure 3‑2. Categorization tree of IDSs
In contrast, signature-based intrusion detection depends on a dictionary (database) of signatures (also called rules) and uses a specific pattern to detect a known threat and recognize known malicious behavior. Signature-based methods look for run-time characteristics that correspond to a specific pattern of abuse.
Some sources refer to the signature-based technique as misuse detection, supervised detection or pattern-based detection. The main benefit of this class of techniques is its low false positive rate. Its key issue is that the technique must search for a specific pattern in a huge dictionary. One of the most important issues in this field is building an effective dictionary of attacks, which is discussed later in chapter 5. The following section discusses the classification tree in detail.
2.6.1. Detection techniques in CPS
IDSs were introduced earlier. This section describes the intrusion detection techniques used in CPS: signature-based, anomaly-based, specification-based and reputation-based intrusion detection. Figure 3-3 gives a brief overview of IDS techniques and their main characteristics.
Figure 3‑3. IDS techniques and their main characteristics
2.6.1.1. Anomaly-based techniques
Anomaly-based detection finds anomalies by comparing normal profiles and behaviors with logs of observed events. Techniques in this category look for run-time features that differ from normal, safe profiles. By the supervision they require, these techniques are classified into three groups:
- Unsupervised: Real-time input is used and there is no information about the output;
- Supervised: Information about both the input and the related output is used;
- Semi-supervised: This group combines the first two and works with two different sets of audit data.
In some articles, audit data are also referred to as signatures. An advantage of anomaly detection techniques is that they do not look for specific patterns, which eliminates the need for a dictionary of attacks and the resources it consumes. Another significant advantage is that they can detect zero-day attacks. Their disadvantage is a high false positive rate. By the method they use, these techniques are also classified into three categories: 1) Statistical; 2) Rule-based; 3) Data mining.
Statistical techniques are the most common techniques in anomaly detection. They record a traffic model over specific time intervals and continuously compare incoming traffic with these records. Rule-based techniques analyze previously received information. For example, they examine system models, the distribution of data and radio propagation models. In rule-based techniques, the steps recorded by a packet or even the incoming traffic rate can help the system find threats. Data mining techniques use machine learning methods such as clustering for intrusion detection. Table 3-6 summarizes the anomaly detection techniques.
Table 3‑6. Anomaly detection techniques
|Technique|Advantages|Disadvantages|
|Statistical|Medium complexity, distributed|Dependent on pre-knowledge, high complexity|
|Rule-based|Easy to use and fast|Dependent on pre-knowledge|
|Data mining|Distributed|Highest complexity|
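The statistical approach described above can be sketched as a comparison between a recorded traffic baseline and a new observation. This is only an illustration under stated assumptions: the z-score test, the threshold of three standard deviations and the sample values are hypothetical choices, not the method of any cited work.

```python
from statistics import mean, stdev


def is_anomalous(history, observed, threshold=3.0):
    """Flag `observed` if it deviates from the recorded traffic model
    by more than `threshold` standard deviations."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A constant baseline: any different value is a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# Hypothetical packets-per-second samples recorded during normal operation.
baseline = [100, 104, 98, 101, 97, 103, 99, 102]
print(is_anomalous(baseline, 101))  # False
print(is_anomalous(baseline, 450))  # True
```

A lower threshold detects more deviations but also raises the false positive rate, which is the main weakness of anomaly detection noted above.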
A traffic-based anomaly detection system called Multi-Level Intrusion Detection System (ML-IDS) is studied in . Its levels analyze and inspect the traffic flow, the payload and the packet header. ML-IDS improves the detection rate and decreases the false positive rate by using an effective fusion decision algorithm. A supervised learning technique is deployed in , which shows an improvement in the detection rate of DoS attacks, the R2L class and the probing class.
2.6.1.2. Signature-based techniques
Signature-based intrusion detection compares incoming packets with signatures (rules) and behavior patterns to detect probable threats. Each signature is a pattern that is related to an attack and describes the attack's details.
Signature-based techniques look for run-time features that resemble malicious behavior. One of the main advantages of this technique is its low false positive rate. Its key problem is that it must search for a specific pattern in a large dictionary. An attack dictionary should store all attack signatures. It is very important to use the most effective dictionary possible and to keep it updated. Too many signatures in the dictionary make finding an attack harder and slower.
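The dictionary lookup described above can be sketched as follows. The signature patterns, labels and function name are hypothetical and serve only to show why the matching cost grows with the size of the dictionary.

```python
# Hypothetical signature dictionary: byte pattern -> attack label.
SIGNATURES = {
    b"' OR '1'='1": "sql-injection",
    b"../../": "path-traversal",
    b"<script>": "cross-site-scripting",
}


def match_signatures(payload: bytes) -> list:
    """Return the labels of all signatures found in the payload.
    Every signature is checked, so the cost grows with the dictionary size."""
    return [label for pattern, label in SIGNATURES.items() if pattern in payload]


print(match_signatures(b"GET /index.php?id=1' OR '1'='1 HTTP/1.1"))  # ['sql-injection']
print(match_signatures(b"GET /index.html HTTP/1.1"))  # []
```

A payload that matches no signature is treated as normal, which is why an attack that is missing from the dictionary goes undetected.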
2.6.1.3. Specification-based techniques
Specification-based techniques are similar to anomaly detection techniques. They record the properties of a healthy, normal system (such as CPU usage) and formalize them using a state machine. They are mostly considered a subcategory of anomaly detection techniques.
However, specification-based detection looks for anomalies at the system level, whereas anomaly detection examines user profiles, data flows and traffic at the network level. The method discussed in  employs a Behavioral Monitoring Specification Language (BMSL), which formulates normal and abnormal behavior for each system by collecting the system's behavior and properties and its incoming and outgoing traffic. BMSL models the details and ordering of events. The method starts with the general behavior of the system and then concentrates on the system's high-priority tasks. It combines specification-based detection with signature-based detection to achieve a higher detection rate. Its weakness is that it cannot model time properly. For example, it would be important to know whether ten failed access attempts occurred within a single minute, since this may be an attempt to attack the system and gain access. BMSL's detection rate is about 82 percent when it is not combined with signature-based detection.
A specification-based detection method that collects data traffic in a Wireless Mesh Network (WMN) is proposed in . This method uses a trust model and compares the reports coming from each node with reports from other nodes about the target node. Low similarity indicates a potential attack.
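The timing example above (ten failed access attempts within one minute) can be expressed with a sliding time window. The sketch below is not part of BMSL; the class name, the limit and the window length are assumptions chosen to mirror the example.

```python
from collections import deque


class FailedAccessMonitor:
    """Raise a flag when `limit` failed attempts fall within `window` seconds."""

    def __init__(self, limit=10, window=60.0):
        self.limit = limit
        self.window = window
        self.timestamps = deque()

    def record_failure(self, timestamp):
        """Record one failed attempt; return True if the limit is reached."""
        self.timestamps.append(timestamp)
        # Discard attempts that are older than the time window.
        while timestamp - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        return len(self.timestamps) >= self.limit


monitor = FailedAccessMonitor()
# Ten failures, five seconds apart: all of them fall into one minute.
flags = [monitor.record_failure(t) for t in range(0, 50, 5)]
print(flags[-1])  # True
```

The same ten failures spread over a longer period would not raise the flag, which is the distinction that a model without a notion of time cannot make.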
2.6.1.4. Reputation-based model
In this category, instead of looking for security threats, the system looks for nodes that behave selfishly. If a node tries to increase its reputation so that other nodes call on it more often, the system administrators are warned. The important issue in this technique is defining the initial value of a node's reputation. This technique is mostly used in ad-hoc networks.
2.6.2. Collection process
There are two methods for collecting data: behavior-based collection and traffic-based collection. In some articles, researchers refer to the signature-based technique as behavior-based and to the anomaly-based technique as traffic-based.
2.6.2.1. Behavior-based data collection
In behavior-based data collection, an IDS collects and analyzes very detailed information such as system files or logs to determine whether a node is in danger. These methods are scalable and decentralized, and are therefore useful for large-scale distributed networks.
A disadvantage of these methods is that each node must do extra work to collect and send its own audit data. This is a drawback for systems with minimal resources, such as RTUs in CPS. Another disadvantage is that an attacker can change the audit data, logs and system files after the intrusion. Because of these disadvantages, this method is not widely used.
Systems that use traffic-based data collection gather information from network activity. The analysis can be based on general data (traffic and frequencies) as well as on specific protocols, in order to analyze packet contents. In this method, nodes are not forced to keep and analyze their own log files; data is collected by assigned nodes. The drawback is that extra nodes must be added to collect data, and each of them must have a sufficient view of all the nodes around it.
2.6.3. Trust model
The trust model defines which data should be monitored to evaluate trusted nodes. There are two trust models: Unitrust and Multitrust. Buchegger and Le Boudec in  proposed a distributed reputation manager called CONFIDANT. It considers three levels of data to evaluate trust:
- Experienced data: Data that has not been processed or changed and comes from the original node. It has the highest priority and value.
- Observed data: Data collected from the neighborhood. It has less value.
- Reported data: Data reported from outside the neighborhood. It has the least value in the CONFIDANT model.
The model gives nodes a recovery time to fix their reputation. There are three types of nodes:
- Suckers: The ones that always support their neighbors;
- Cheats: The ones that never support a neighbor;
- Grudgers: The ones that only support if they are supported by the same node.
The Multitrust model uses reported data, which has the least value since it comes from nodes outside the neighborhood. This data can be raw or processed. In contrast, the Unitrust model uses direct observation and does not collect reported data, so its data is more trustworthy.
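To make the ordering of the three data levels concrete, the sketch below combines ratings into one trust score using a weighted average. It is not the CONFIDANT algorithm: the weights, the rating scale from 0 to 1 and the function name are assumptions chosen only to reflect that experienced data counts more than observed data, which counts more than reported data.

```python
# Hypothetical weights: experienced > observed > reported.
WEIGHTS = {"experienced": 1.0, "observed": 0.5, "reported": 0.2}


def trust_score(evidence, use_reported=True):
    """Weighted average of ratings in [0, 1].
    use_reported=False mimics a Unitrust-style model that ignores reported data."""
    weighted_sum = 0.0
    total_weight = 0.0
    for level, ratings in evidence.items():
        if level == "reported" and not use_reported:
            continue
        for rating in ratings:
            weighted_sum += WEIGHTS[level] * rating
            total_weight += WEIGHTS[level]
    if total_weight == 0:
        raise ValueError("no usable evidence")
    return weighted_sum / total_weight


# Hypothetical evidence about one node: good first-hand data, bad reports.
evidence = {"experienced": [1.0], "observed": [1.0], "reported": [0.0, 0.0]}
print(trust_score(evidence, use_reported=False))  # 1.0
print(trust_score(evidence) < 1.0)  # True
```

In this example two negative reports from outside the neighborhood lower the score only when reported data is taken into account.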
2.6.4. Analysis technique
Analysis is one of the most important tasks of an IDS. There are two main techniques for analyzing data: pattern matching and data mining. The pattern matching process is introduced briefly in this section.
2.6.4.1. Pattern matching process
Although much academic research concentrates on anomaly-based IDS, many organizations instead use pattern matching to protect their networks. Snort is one of the most widely used pattern matching IDSs. Although Snort includes features that use anomaly detection techniques, it is considered a content-matching IDS. Snort checks the content of network packets to see whether it matches any known signature. Each signature in Snort is defined by a rule that describes a known attack. Each rule specifies an action that is taken if the rule matches a packet. The action may be triggering an alert, recording a log entry, rejecting the packet, blocking the sender, etc.
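The relation between a rule, its match conditions and its action can be sketched as below. This is neither Snort's rule language nor its detection engine; the `Rule` and `Packet` classes, their fields and the example rules are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    action: str      # e.g. "alert", "log", "reject"
    protocol: str    # e.g. "tcp"
    dst_port: int
    content: bytes   # byte pattern searched for in the payload


@dataclass
class Packet:
    protocol: str
    dst_port: int
    payload: bytes


def evaluate(packet, rules):
    """Return the action of the first matching rule, or "pass" if none matches."""
    for rule in rules:
        if (rule.protocol == packet.protocol
                and rule.dst_port == packet.dst_port
                and rule.content in packet.payload):
            return rule.action
    return "pass"


# Hypothetical rule set.
rules = [
    Rule("reject", "tcp", 80, b"/etc/passwd"),
    Rule("alert", "tcp", 80, b"cmd.exe"),
]
print(evaluate(Packet("tcp", 80, b"GET /scripts/cmd.exe"), rules))  # alert
print(evaluate(Packet("tcp", 80, b"GET /index.html"), rules))  # pass
```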
2.6.5. Response strategy
There are two response strategies based on the response time: active and passive. Active techniques can be divided into two categories: reactive and proactive.
- Passive: Response is given a while after the attack is recognized;
- Proactive: Response is given before the attack is done;
- Reactive: Response is given immediately when the attack is recognized.
Figure 3-4 illustrates the response strategies in attack detection on a time axis.
2.7. Security failure
Two main security failure situations are considered in CPS. The first, which is based on the Byzantine fault model, states that the system fails if one-third or more of the nodes are attacked, threatened or have their vulnerabilities revealed. Since it then becomes impossible to reach a secure condition in an appropriate time, a security failure is registered. The second situation, impairment failure, occurs when an inside sensor or node in the CPS is compromised and performs active attacks while the system is unable to detect the malicious node. This impairs the system's functionality and results in its failure.
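The first failure condition reduces to a simple threshold check. The function below is a minimal sketch with a hypothetical name; it only encodes the one-third rule stated above.

```python
def security_failure(compromised: int, total: int) -> bool:
    """True if one-third or more of the nodes are compromised."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Integer comparison avoids floating-point rounding at the boundary.
    return 3 * compromised >= total


print(security_failure(3, 9))   # True
print(security_failure(3, 10))  # False
```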
2.1. Comparing intrusion detection techniques
CPS are mission-oriented and purpose-built, with predictable profiles and limited resources. Current signature-based intrusion detection techniques are not appropriate for CPS, since updating a huge number of signatures in the dictionary is too difficult and consumes many resources.
Anomaly-based and specification-based techniques have higher false alarm rates, which is a disadvantage for CPS. Reputation-based techniques are also not a good choice for CPS, because they are based on the Multitrust model. Table 3-7 describes the advantages and disadvantages of signature-based and anomaly-based techniques.
Critical Infrastructures (CI) such as CPS are physical and logical facilities of essential importance for public welfare. Their failure or disruption could potentially have a dramatic impact on the economic and social welfare of a nation, a society, or an economy.
Since the data gathered in CPS is highly sensitive and vulnerable to attacks, secure networks must be built to protect it. Combining the physical world and the cyber world through software, hardware, user interfaces, protocols, sensors, Internet access and so on requires an appropriate security framework. The following section presents some existing security frameworks that are used, or could be used, in CPS.
2.2.1. MonALISA framework
The MonALISA framework proposed in  consists of distributed services that manage, control and monitor large-scale systems. MonALISA registers dynamic services that are agent-based, multi-threaded subsystems. These services are used by clients and/or other services.
Distributed services or agents receive real-time information and cooperate in a wide range of management, control and optimization tasks. Monitoring is one of the most important steps in CPS; it tracks all devices, facilities, sensors, networks and tasks in real time. The higher-level services in MonALISA include distributed job scheduling, optimized dynamic routing, scheduled data transfer and automated management of remote services.
2.2.1.1. System design
MonALISA increases system reliability, eases the management of distributed systems and speeds up problem detection through real-time, multi-threaded monitoring and control. The Lookup Services (LUS) layer provides dynamic registration of services. Through this layer, services can easily be discovered by each other, by agents or by clients. Services communicate via agents using their predefined protocol or via dynamic proxies.
Registration in the LUS uses a lease mechanism. Services must renew their leases; a service that fails to renew its lease is removed from the Lookup Services network. As soon as a service is added to or removed from the network, the services subscribed to it are notified. The subscription option lets services register their interest in a service for later use. The second layer is the MonALISA layer, the main layer, which contains the multi-threaded engines that execute monitoring tasks. This layer collects and stores data in real time, and agents can process the aggregated data locally.
The Proxy layer provides secure communication among agents. Aggregated information can also be accessed through secure channels provided by proxy servers. Higher-level services and clients access the gathered data through the channels provided by the Proxy layer. Proxy services are allocated using a load balancing mechanism.
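The lease mechanism can be illustrated with a small in-memory registry. The sketch does not reproduce MonALISA's actual interfaces; the class, its method names and the explicit `now` parameter are assumptions introduced to keep the example self-contained and testable.

```python
class LookupService:
    """Toy lease-based registry; time is passed in explicitly."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.expiry = {}        # service name -> lease expiry time
        self.subscribers = {}   # service name -> list of callbacks

    def subscribe(self, name, callback):
        """Register interest in a service; callback(name, event) is called later."""
        self.subscribers.setdefault(name, []).append(callback)

    def _notify(self, name, event):
        for callback in self.subscribers.get(name, []):
            callback(name, event)

    def register(self, name, now):
        """Grant or extend a lease; notify subscribers only for new services."""
        is_new = name not in self.expiry
        self.expiry[name] = now + self.lease_seconds
        if is_new:
            self._notify(name, "added")

    # Renewing a lease is the same operation as registering again.
    renew = register

    def expire(self, now):
        """Remove every service whose lease has run out."""
        for name in [n for n, t in self.expiry.items() if t <= now]:
            del self.expiry[name]
            self._notify(name, "removed")


events = []
lus = LookupService(lease_seconds=30)
lus.subscribe("monitor", lambda name, event: events.append((name, event)))
lus.register("monitor", now=0)
lus.renew("monitor", now=20)   # lease now runs until t = 50
lus.expire(now=40)             # still valid, nothing happens
lus.expire(now=60)             # lease not renewed in time, service removed
print(events)  # [('monitor', 'added'), ('monitor', 'removed')]
```

Passing the current time as an argument, instead of reading a clock inside the class, is a design choice that makes the expiry behavior reproducible.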
2.2.1.2. Security infrastructure
Since important data is collected by MonALISA services, each administrator should configure access policies to increase security and ensure the confidentiality of information. MonALISA's security infrastructure is based on access control lists (ACLs) and the Secure Sockets Layer (SSL) communication protocol, which supports a variety of authorization methods. In the MonALISA framework, security is quite high, since clients interact with services through proxy channels based on a lease time.
2.2.2. PolyOrBAC framework
The PolyOrBAC framework proposed in  is based on and extends OrBAC , a model for access/flow control. PolyOrBAC defines a solution for inter-organizational access and specifies the corresponding policies. For instance, it uses e-contracts that can be agreed among departments working on critical infrastructures, and it provides solutions for enforcing access control and auditing.
2.2.3. RAIM framework
The RAIM framework (Real-time monitoring, Anomaly detection, Impact analysis and Mitigation) is mostly used for SCADA systems. To evaluate the security of a CPS, it is recommended to examine the impact and damage of threats and attacks and to know all the characteristics of the system exactly. Security engineering should identify all possible attacks, their aims and their impact on the system. Figure 3-5 shows the RAIM framework used in SCADA, which consists of four main steps: real-time monitoring, intrusion detection, impact analysis and decision-making to prevent and detect malicious attacks (mitigation).
2.2.3.1. Real-time monitoring
Many different networks and subsystems cooperate with each other with different goals, including monitoring, control and information aggregation. For example, a SCADA system contains distributed sensors, PLCs and central controllers that collect a large amount of varied information from across the network and are connected through different communication methods, all of which should be monitored in real time.
2.2.3.2. Anomaly detection
As presented before, anomaly detection compares events and different logs, including system logs, file logs, security logs and central controller logs. Auditing these logs helps the system detect anomalies.
2.2.3.3. Impact analysis
This step analyzes intrusion behaviors and the results or impacts of a threat or attack. All subsystem vulnerabilities should be analyzed and evaluated in this step to help improve security.
All the information gathered in the previous steps is evaluated to find solutions for mitigating probable future risks, especially those that may cause fatal damage.
4 Chapter 4
Signature-based IDSs are very effective against known attacks. Since these systems are fast and easy to install, they can start working immediately. A signature-based IDS analyzes each packet and compares its content with the dictionary of known attacks. Normal packets are sometimes mistaken for attacks (false positives), but this does not occur often. These systems generate easy-to-understand reports and label each packet as normal or as one class of attack.
Although signature-based IDSs are efficient for known attacks, they cannot find zero-day attacks. Hackers use zero-day attacks to attack many systems before administrators adapt their organizations' IDSs. For this reason, a signature-based IDS should be updated continuously. Attack reports should be collected from all over the world as soon as a new attack is detected. In addition, security engineers should analyze the attack and develop a defense against it. The solution should be distributed to all subsystems, and IDS/IPS systems should be updated accordingly. However, the first subsystem that was attacked is already compromised and may have been damaged.
Figure 4-1 illustrates the detection process in signature-based techniques. This type of intrusion detection analyzes packet features and tries to find a match in stored dictionaries that record all known attacks. Some articles call this technique misuse detection or pattern-based detection. A low false positive rate is the most important advantage of signature-based IDS.
This technique reacts to known malicious behavior. In other words, it defines a node as a good node if it does not exhibit any attack signature. The most significant issue here is creating an efficient attack database. If a signature is too large, it consumes too much memory, and if it is not detailed enough, accuracy is reduced. Signature-based IDSs are more efficient and accurate at detecting outsider attacks than other IDS techniques.
3.1. Decision tree based intrusion detection
One disadvantage of signature-based IDS is that it drops many packets, because it does not have enough time to check each packet against all the rules in the attack dictionary. To solve this problem, many researchers have used decision tree search methods. SQL injection attacks are examined in , where a decision tree is used to detect them. In , incoming HTTP requests are filtered using the tree.
A decision tree is a classification algorithm in data mining, and its fundamental algorithm is ID3 (Iterative Dichotomiser 3). The algorithm builds a tree from the given classified data, where each record is identified by the values of its features. Classification in decision trees is done in a reverse order, and the main challenge is to define the key features for the nodes. Each node in the tree represents a feature of the signatures, and construction is complete when all signatures are registered in the tree. The leaves mark the end of each connection and are labeled with a type of attack. These trees can handle large amounts of data, which is an advantage for CPS because CPS carry heavy traffic. In addition, the high performance of decision trees makes them a good option for real-time systems. Figure 4-2 shows a small part of a decision tree. In this figure, the destination node is decided based on the value of the source port.
The accuracy of decision trees and their ease of construction are further benefits that make them a suitable choice for IDS in CPS. Researchers in  proposed a classification method using machine learning and simulated it on the KDD data set. The algorithm has a similar function but tries to find the attack with the minimum number of comparisons. Although their results demonstrate a good performance improvement, the problem of the huge amount of data in the dictionary remains unsolved.
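The choice of the key feature for a node, named above as the main challenge, is made in ID3 by selecting the feature with the highest information gain. The sketch below shows only this selection step on a handful of hypothetical records; the feature names and values are assumptions, and it does not build the full tree.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(records, feature, label="label"):
    """Entropy reduction obtained by splitting `records` on `feature`."""
    base = entropy([r[label] for r in records])
    remainder = 0.0
    for value in {r[feature] for r in records}:
        subset = [r[label] for r in records if r[feature] == value]
        remainder += len(subset) / len(records) * entropy(subset)
    return base - remainder


def best_feature(records, features):
    """ID3 places the feature with the highest information gain at the next node."""
    return max(features, key=lambda f: information_gain(records, f))


# Hypothetical labeled connections.
records = [
    {"protocol": "tcp", "dst_port": 80, "label": "attack"},
    {"protocol": "tcp", "dst_port": 80, "label": "attack"},
    {"protocol": "tcp", "dst_port": 443, "label": "normal"},
    {"protocol": "udp", "dst_port": 443, "label": "normal"},
]
print(best_feature(records, ["protocol", "dst_port"]))  # dst_port
```

In this data the destination port separates attacks from normal traffic completely, so it yields the full gain of one bit and becomes the root of the tree.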
3.2. Intrusion detection with filtering mechanism
Wireless sensor networks currently face many security concerns. Sensor networks, which are a main part of critical infrastructures such as CPS, require strong security mechanisms. These systems are typically deployed in critical environments that are very vulnerable to attacks. Traditional security methods, including encryption, VPN, authentication and firewalls, are not adequate since they address only external threats. Therefore, many organizations employ IDSs to overcome this issue. Deciding which type of IDS is best for an organization's architecture, size and budget is an important step. It should be considered that not every company can afford expensive IDSs.
In signature-based IDS, the quality of security depends on the quality of the signatures in the dictionary. However, a large amount of information in the dictionary consumes plenty of resources. Furthermore, each event is logged and each comparison may record a warning in the system. Recording too many warnings and log files is another problem of signature-based IDS, which makes it hard to analyze all the information later.
Researchers in  have proposed a new type of signature that combines traditional signatures with contextual information from the network. They have also defined a hash function that drops unimportant and uncritical warnings. In their method, each signature is an ordered pair (CI, Sig), in which CI contains the contextual information of the network and Sig is the related signature. They achieved a warning filtering rate of 66.1%.
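The idea of pairing each signature with contextual information can be sketched as a lookup of (CI, Sig) pairs in a hash-based set. The cited hash function is not described here, so this is an assumption-laden illustration only: the context values (operating system of the target host), the signature names and both function names are hypothetical.

```python
# Hypothetical context-aware signatures: (contextual information, signature).
RELEVANT = {
    ("linux", "ssh-bruteforce"),
    ("windows", "smb-exploit"),
}


def filter_warnings(warnings):
    """Keep only warnings whose (CI, Sig) pair is relevant.
    The set lookup hashes the pair, so each check takes constant time on average."""
    return [w for w in warnings if (w["context"], w["signature"]) in RELEVANT]


def filtering_rate(warnings):
    """Percentage of warnings dropped by the filter."""
    kept = filter_warnings(warnings)
    return 100 * (len(warnings) - len(kept)) / len(warnings)


# Hypothetical warnings raised by a signature-based IDS.
warnings = [
    {"context": "linux", "signature": "ssh-bruteforce"},
    {"context": "linux", "signature": "smb-exploit"},
    {"context": "windows", "signature": "smb-exploit"},
    {"context": "windows", "signature": "ssh-bruteforce"},
]
print(filtering_rate(warnings))  # 50.0
```

A warning about an exploit that cannot affect the targeted host is dropped, which reduces the volume of logs that must be analyzed later.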