Introduction to the Special Issue on Human-Interaction-Aware Data Analytics for Cyber-Physical Systems
Cities are deploying millions of sensors and actuators and developing smart services with sophisticated models and decision-making policies supported by Cyber-Physical Systems and Internet of Things technologies. The increasing number of sensors collects a large amount of city data from multiple domains. The collected data has great value, but has not yet been fully exploited. Focusing on the domains of transportation, environment, emergency and public safety, energy, and social sensing, this paper carefully reviews the data sets being collected across 14 smart cities and the state-of-the-art work in models and decision making for smart cities. The paper also points out the capabilities, limitations, and challenges regarding data, models, and decision making. Five overarching challenges that are faced today and will be further exacerbated in the future, namely security, privacy, uncertainty, human in the loop, and economic and social challenges, are also discussed.
This paper proposes new tools to detect the tampering of video feeds from surveillance cameras. Our proposal illustrates the unique cyber-physical properties that sensor devices can leverage for their cyber-security. While traditional authentication and attestation algorithms exchange digital challenges between devices authenticating each other, our work instead proposes challenges that manifest physically in the field of view of the camera (e.g., a QR code in a display, a change of color in lighting, an infrared light, etc.). This physical (challenge) and cyber (verification) attestation mechanism can help protect systems even when the sensors (cameras) and actuators (Display, IR LEDs, Color Lightbulbs) are compromised.
Many real-world attacks on cyber-physical systems involve physical intrusions to cause direct damage or to facilitate cyber attacks. Hence, in this work, we investigate the security risk of organizations with respect to different adversarial models of physical movement behavior. We study the case where an intrusion detection mechanism is in place to alert the system practitioner when users deviate from their normal movement behavior. We then analyze how different user behaviors may present themselves as different levels of threats in terms of their normal movement behavior within a given building topology. To quantify the difference in movement behavior, we define a WeightTopo metric that takes into account the building topology in addition to the movement pattern. We demonstrate our approach on a railway system case study and show how certain user roles are more vulnerable to attackers in terms of the physical intrusion detection probability when these roles are abused by attackers. We also determine quantitatively the amount of knowledge an attacker needs to possess in order to remain undetected. Certain individual users are found to pose a higher threat, implying the need for customized monitoring.
Pipelined control is an image-based control that uses parallel instances of its image-processing algorithm in a pipelined fashion to improve the quality of control. A higher number of pipes improves the controller settling time, resulting in a trade-off between resources and control performance. In real-life applications, it is common to have a continuous-time model with additive uncertainties in one or more parameters that may affect the controller performance and, therefore, the trade-off analysis. We consider models with uncertainties denoted by matrices with a single non-zero element, potentially caused by multiple uncertain parameters in the model. We analyse the impact of such uncertainties on the aforementioned trade-off. To do so, we introduce a discretization technique for the uncertain model. Next, we use the discretized model with uncertainties to analyse the robustness of a pipelined controller designed to enhance performance. Such an analysis captures the relationship between resource usage, control performance, and robustness. Our results show that the tolerable uncertainties for a pipelined controller decrease as the number of pipes increases. We also show the feasibility of our technique by implementing a realistic example in a Hardware-In-the-Loop simulation.
Vehicular cyber-physical systems (VCPS), among several other applications, may help in addressing the ever-increasing problem of congestion in large cities. Nevertheless, this may be hindered by the problem of data falsification, which results from either a wrong perception of a traffic event or the generation of fake information by the participating vehicles. Such information fabrication may cause re-routing of vehicles and artificial congestion, leading to economic, public safety, environmental, and health hazards. Thus, it is imperative to infer truthful traffic information in real time to restore the operational reliability of the VCPS. In this work, we propose a novel reputation scoring and decision support framework, called Spoofed and False Report Eradicator (SAFE), which offers a cost-effective and efficient solution to the data falsification problem in the VCPS domain. It includes humans in the sensing loop by exploiting the paradigm of participatory sensing and the concept of a mobile security agent (MSA) to nullify the effects of deliberate false contributions, and a variant of the distance bounding mechanism to thwart location-spoofing attacks. A regression-based model integrates these effects to generate the expected truthfulness of a participant's contribution. To determine whether a contribution is true, a generalized linear model is used to transform the expected truthfulness into a Quality of Contribution (QoC) score. The QoC scores of different contributions are aggregated to compute the user reputation. Such reputation enables classification of different participation behaviors. Finally, an Expected Utility Theory (EUT)-based decision model is proposed, which utilizes the reputation score to determine whether a piece of information should be published or dropped. To evaluate SAFE experimentally, we compare the reputation-based user segregation performance achieved by our framework with that of state-of-the-art reputation mechanisms. Experimental results demonstrate that SAFE is able to better capture subtle differences in user behaviors based on quality, quantity, and location accuracy, and significantly improves operational reliability through accurate publishing of only legitimate information.
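The scoring chain described in the SAFE abstract (regression score, generalized linear model, QoC aggregation, expected-utility decision) can be sketched roughly as follows. This is a minimal illustration, not the authors' model: the feature names, the coefficients, the logistic link, the mean aggregation, and the utility values are all assumptions introduced here.

```python
import math

# Hypothetical coefficients; the abstract does not report the fitted values.
WEIGHTS = {"msa_agreement": 2.0, "location_verified": 1.5, "peer_support": 1.0}
BIAS = -2.0


def expected_truthfulness(features):
    """Linear regression-style score over evidence features (each assumed in [0, 1])."""
    return BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


def quality_of_contribution(features):
    """Map the score into (0, 1) with a logistic link, one common GLM choice (assumed)."""
    return 1.0 / (1.0 + math.exp(-expected_truthfulness(features)))


def reputation(qoc_scores):
    """Aggregate a user's QoC scores; a plain mean is assumed here."""
    return sum(qoc_scores) / len(qoc_scores)


def should_publish(rep, gain_true=1.0, loss_false=3.0):
    """Publish only if the expected utility of publishing beats dropping (utility 0)."""
    return rep * gain_true - (1.0 - rep) * loss_false > 0.0


honest = {"msa_agreement": 1.0, "location_verified": 1.0, "peer_support": 1.0}
spoofer = {"msa_agreement": 0.0, "location_verified": 0.0, "peer_support": 0.0}

honest_rep = reputation([quality_of_contribution(honest)] * 3)
spoofer_rep = reputation([quality_of_contribution(spoofer)] * 3)
```

With the assumed utilities, a report is published only when the reputation exceeds 0.75, so the asymmetric cost of publishing false information is what sets the threshold.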
In this article, we describe a motion planning framework in a cyber-physical system (CPS) that takes into account the human's safety perception in the presence of a flying robot. We use virtual reality (VR) as a safe testing environment to collect psychological signals from the test subjects experiencing a flying robot in their vicinity. The collected data show that the sensor signals from the physical part (the human) of the CPS are influenced by unknown factors when the human's attention is focused not only on the robot but also on other stimuli. To overcome this issue, we propose to model the change of focus in the human's attention as a latent discrete random variable, which clusters the data samples into two groups of relevant and irrelevant samples. The proposed model improves the likelihood over the Gaussian noise model, which only minimizes the squared error. We also present a numerical optimal path planning method that ensures spatial separation from the obstacle despite the time discretization in the CPS. Optimal paths generated using the proposed model result in a reasonable safety distance from the human. In contrast, the paths generated by the standard regression model with the Gaussian noise assumption have undesirable shapes due to over-fitting.
Android users are increasingly concerned with the privacy of their data and security of their devices. To improve the security awareness of users, recent automatic techniques produce security-centric descriptions by performing program analysis. However, the generated text does not always address users' concerns, as it is generally too technical to be understood by ordinary users. Moreover, different users have varied linguistic preferences, which the text does not match. Motivated by this challenge, we develop an innovative scheme to help users avoid malware and privacy-breaching apps by generating security descriptions that explain the privacy and security related aspects of an Android app in clear and understandable terms. We implement a prototype system, PERSCRIPTION, to generate personalised security-centric descriptions that automatically learn users' security concerns and linguistic preferences to produce user-oriented descriptions. We evaluate our scheme through experiments and user studies. The results clearly demonstrate the improvement in readability and users' security awareness of PERSCRIPTION's descriptions compared to existing description generators.
Introduction to the Special Issue on Real-Time aspects in Cyber-Physical Systems
Energy harvesters are becoming increasingly popular as power sources for IoT edge devices. However, one of the intrinsic problems of energy harvesters is that the harvested power is often weak and frequently interrupted. Therefore, edge devices powered by energy harvesting have to work intermittently. To maintain execution progress, execution states need to be checkpointed into non-volatile memory before each power failure. In this way, previous execution states can be resumed after power comes back. Nevertheless, frequent checkpointing and low charging efficiency generate significant energy overhead. To alleviate these problems, this paper conducts a thorough energy efficiency analysis and proposes three algorithms to maximize the energy efficiency of program execution. First, a non-volatile processor (NVP) aware task scheduling (NTS) algorithm is proposed to reduce the size of checkpointing data. Second, a tentative checkpointing avoidance (TCA) technique is proposed to avoid checkpointing for further reduction of checkpointing overhead. Finally, a dynamic wake-up strategy (DWS) is proposed to wake up the edge device at proper voltages where the total hardware and software overhead is minimized for further energy efficiency maximization. The experiments on a real testbed demonstrate that, with the proposed algorithms, an edge device is resilient to extremely weak and intermittent power supply and the energy efficiency is $2\times$ as high as that of the baseline technique.
In recent years, rapid development of sensing and computing has led to very large data sets. There is an urgent demand for innovative data analysis and processing techniques that are secure, privacy-protected, and sustainable. In this paper, taking human activities and interactions with Cyber-Physical Systems (CPS) into consideration, we propose a human behavior learning system based on Channel State Information (CSI) utilizing a series of algorithms for data analysis and processing. Aiming to recognize a set of gestures, our system is designed based on the observation that different gestures have different effects on signals and that specific gesture signals have a unique energy spectrum. Specifically, an improved Linear Discriminant Analysis Algorithm (I-LDA) is devised to reduce the dimension of human behavior signals and lower the computational cost. Additionally, behaviors are learned by a Logistic Regression Algorithm (LRA), where bandwidth ratios in the energy spectrum are selected as features to eliminate the impact of different speeds. We implement our system on commercial off-the-shelf WiFi devices and conduct a large number of experiments in a typical indoor environment to evaluate its performance. Experimental results show that our system is robust, with an average recognition accuracy of up to 96%.
There is a growing trend of employing cyber-physical systems in smart homes to improve the comfort of residents. However, a residential cyber-physical system differs from a common cyber-physical system since it directly involves human interaction, which is full of uncertainty. The existing solutions could be effective for performance enhancement in some cases when no inherent and dominant human factors are involved. Besides, the rapidly rising number of deployments of cyber-physical systems at home is not normally integrated with energy management schemes, which is a central issue that smart homes have to face. In this paper, we propose a cyber-physical system based energy management framework to enable a sustainable edge computing paradigm while meeting the needs of home energy management and residents. This framework aims to enable the full use of renewable energy while reducing electricity bills for households. A prototype system was implemented using real-world hardware. The experimental results demonstrated that renewable energy is fully capable of supporting the reliable running of home appliances most of the time and that electricity bills could be cut by up to 60% when our proposed framework was employed.
Low-power wireless communication has been widely used in cyber-physical systems that require time-critical data delivery. Achieving this goal is challenging because of link burstiness and interference. Based on significant empirical evidence of 21 days and over 3,600,000 packet transmissions per link, we propose both routing and scheduling algorithms that produce latency bounds for real-time periodic streams and account for both link bursts and interference. The solution is achieved through the definition of a new metric, Bmax, that characterizes links by their maximum burst length, and by choosing a novel least-burst-route that minimizes the sum of worst-case burst lengths over all links in the route. With extensive data-driven analysis, we show that our algorithms outperform existing solutions by achieving accurate latency bounds with much less energy consumption. In addition, a testbed evaluation consisting of 48 nodes spread across a floor of a building shows that we obtain 100% reliable packet delivery within the derived latency bounds. We also demonstrate how performance deteriorates and discuss its implications for wireless networks with insufficient high-quality links.
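The least-burst-route idea, choosing the route that minimizes the sum of per-link worst-case burst lengths, maps naturally onto a shortest-path search with Bmax as the link weight. The sketch below assumes Dijkstra's algorithm and a hypothetical four-node topology with made-up Bmax values; the abstract does not say which search procedure the authors use, and the scheduling part is not covered.

```python
import heapq


def least_burst_route(links, source, sink):
    """Return (total_bmax, path) minimizing the sum of per-link Bmax values.

    links maps each node to a list of (neighbor, bmax) pairs, where bmax is an
    assumed, pre-measured maximum burst length for that link.
    Returns None when the sink is unreachable.
    """
    best = {source: 0}
    parent = {}
    frontier = [(0, source)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == sink:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, bmax in links.get(node, []):
            new_cost = cost + bmax
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(frontier, (new_cost, neighbor))
    if sink not in best:
        return None
    path = [sink]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return best[sink], path[::-1]


# Hypothetical topology: the two-hop route via C has the smaller burst sum.
links = {
    "A": [("B", 2), ("C", 5)],
    "B": [("D", 6)],
    "C": [("D", 1)],
    "D": [],
}
result = least_burst_route(links, "A", "D")
```

Because burst lengths are non-negative, Dijkstra's algorithm is sufficient here; the route via B has a better first hop but a worse total, which is exactly the case a per-link greedy choice would get wrong.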
Coordinated vehicles for intelligent traffic management are instances of cyber-physical systems with strict correctness requirements. A key building block for these systems is the ability to establish a group membership view that accurately captures the locations of all vehicles in a particular area of interest. We formally define view correctness in terms of soundness and completeness and establish theoretical bounds for the ability to verify view correctness. Moreover, we present an architecture for an online view detection and verification process that uses the information available locally to a vehicle. This architecture uses an SMT solver to automatically prove view correctness. We evaluate this architecture and demonstrate that the ability to verify view correctness is on par with the ability to detect view violations.
Cyber-Physical Systems (CPS) play a significant role in our critical infrastructure networks, from power distribution to utility networks. In fact, the emerging smart-grid concept is an effective critical CPS infrastructure that relies on two-way communications between smart devices to increase efficiency, enhance reliability, and reduce costs. However, compromised devices in the smart grid pose several security challenges. Consequences of propagating fake data or stealing sensitive smart grid information via compromised devices are costly. Hence, early behavioral detection of compromised devices is critical for protecting the smart grid's components and data. To address these concerns, in this paper, we introduce a novel and configurable system-level framework to identify compromised smart grid devices. The framework combines system and function call tracing techniques with signal processing and statistical analysis to detect compromised devices based on their behavioral characteristics. We measure the efficacy of our framework with a realistic smart grid substation testbed that includes both resource-limited and resource-rich devices. In total, using our framework we analyze six different types of compromised device scenarios with different resources and attack payloads. To the best of our knowledge, the proposed framework is the first to detect compromised CPS smart grid devices with system and function-level call tracing techniques. The experimental results reveal an excellent detection rate for the compromised devices. Specifically, performance metrics include accuracy values between 0.95 and 0.99 for the different attack scenarios. Finally, the performance analysis demonstrates that the use of the proposed framework has a minimal overhead on the smart grid devices' computing resources.
Cyber-Physical-Social Systems (CPSS), which integrate the cyber, physical, and social worlds, are a key technology for providing proactive and personalized services for humans. In this paper, we study CPSS by taking human-interaction-aware big data (HIBD) as the starting point. However, the HIBD collected from all aspects of our daily lives are high-order and large-scale, which brings ever-increasing challenges for their cleaning, integration, processing, and interpretation. Therefore, new strategies for representing and processing HIBD become increasingly important in the provision of CPSS services. As an emerging technique, the tensor is proving to be a suitable and promising representation and processing tool for HIBD. In particular, tensor networks, as a significant kind of tensor decomposition, bring advantages in the computing, storage, and application of HIBD. Furthermore, Tensor-Train (TT), one kind of tensor network, is particularly well suited for representing and processing high-order data by decomposing a high-order tensor into a series of low-order tensors. However, at present, there is still a need for an efficient Tensor-Train decomposition method for massive data. Therefore, for larger-scale HIBD, a highly efficient computational method for Tensor-Train is required. In this paper, a distributed Tensor-Train (DTT) decomposition method is proposed to process high-order and large-scale HIBD. The high performance of the proposed DTT, such as its execution time, is demonstrated with a case study on a typical kind of CPSS data, CT (Computed Tomography) image data. Furthermore, recognition, a typical CPSS application for HIBD, is carried out in TT to illustrate the advantage of DTT.
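To make concrete what "decomposing a high-order tensor into a series of low-order tensors" means, the sketch below evaluates a single tensor entry directly from Tensor-Train cores and counts the stored values. It illustrates the standard TT representation only, not the distributed DTT decomposition the abstract proposes; the helper names and the small rank-1 example are assumptions made for illustration.

```python
def tt_element(cores, index):
    """Evaluate one tensor entry from its Tensor-Train cores.

    cores[k][a][i][b] is core k with left rank index a, mode index i, and
    right rank index b; the first and last cores have boundary rank 1.
    """
    row = [1.0]  # 1 x r_0 row vector, with r_0 = 1
    for core, i in zip(cores, index):
        right_rank = len(core[0][i])
        # Multiply the running row vector by the slice core[:, i, :].
        row = [
            sum(row[a] * core[a][i][b] for a in range(len(row)))
            for b in range(right_rank)
        ]
    return row[0]


def tt_storage(cores):
    """Number of stored values across all cores."""
    return sum(len(c) * len(c[0]) * len(c[0][0]) for c in cores)


def rank_one_cores(vectors):
    """Build rank-1 cores so that T[i1, i2, ...] = v1[i1] * v2[i2] * ..."""
    return [[[[x] for x in v]] for v in vectors]


# A 2 x 3 x 2 tensor (12 entries) stored as three rank-1 cores (7 values).
cores = rank_one_cores([[1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0]])
entry = tt_element(cores, (1, 2, 0))  # 2 * 5 * 6
stored = tt_storage(cores)
```

The storage of the cores grows with the sum of the mode sizes (times the squared ranks), while the full tensor grows with their product, which is why the format suits high-order data.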
It is challenging to design a secure and efficient multi-factor authentication scheme for real-time user data access in wireless sensor networks (WSNs). On the one hand, such real-time applications are generally security-critical, and various security goals need to be met. On the other hand, sensor nodes and users' mobile devices are typically resource-constrained, and expensive cryptographic primitives cannot be used. In this work, we first revisit four foremost multi-factor authentication schemes, i.e., Srinivas et al.'s (IEEE TDSC'18), Amin et al.'s (JNCA'18), Li et al.'s (JNCA'18) and Li et al.'s (IEEE TII'18) schemes, and use them as case studies to reveal the difficulties and challenges in correctly designing a multi-factor authentication scheme for WSNs. We identify the root causes of their failures in achieving truly multi-factor security and forward secrecy. We further propose a robust multi-factor authentication scheme that makes use of the imbalanced computational nature of the RSA cryptosystem and is particularly suitable for scenarios where sensor nodes (but not the user's device) are the main energy bottleneck. Comparison results demonstrate the superiority of our scheme. As far as we know, it is the first one that can satisfy all twelve criteria of the state-of-the-art evaluation metric under the harshest adversary model so far.
The Internet of Things (IoT) is becoming a backbone of sensing infrastructure for several mission-critical applications such as smart health, disaster management, and smart cities in distributed networks. Due to resource-constrained sensing devices, IoT infrastructures use Edge datacenters (EDCs) for real-time data processing. EDCs can be either static or mobile in nature, and this paper considers both scenarios. Generally, EDCs communicate with IoT devices in emergency scenarios to evaluate the data in real time. Protecting data communications from malicious activity becomes a key factor, as all the communication flows through insecure channels. In such infrastructures, it is a challenging task for an EDC to ensure the trustworthiness of the data for emergency evaluations. The current communication security pattern of "communication before authentication" leaves a "black hole" for intruders to become part of communication processes without authentication. To overcome this issue and to develop security infrastructures for IoT and distributed Edge datacenters, this paper proposes a user-centric security solution. The proposed security solution shifts from a network-centric approach to a user-centric security approach by authenticating devices before communication is established. A trusted controller is initialized to authenticate the devices and establish the secure channel between them before they start communicating with each other. The centralized controller draws a perimeter for secure communications within the boundary. Theoretical analysis and experimental evaluation of the proposed security model show that it not only secures the communication infrastructure but also improves the overall network performance.
The rapid increase in the number and type of malicious programs has made malware forensics a daunting task and put users' systems in danger. Timely identification of malware characteristics, including its origin and the malware sample family, would significantly limit the potential damage of the malware. This is a more profound risk in Cyber-Physical Systems (CPS), where a malware attack may cause significant physical damage to the infrastructure. Due to the limited on-device memory and processing power available in CPS and Internet of Things (IoT) devices, most of the efforts for protecting CPS networks are focused on the Edge layer, where the majority of security mechanisms are deployed. In this paper, we propose a novel fuzzy clustering system for malware attack attribution. Our system is deployed on the edge layer to provide insight into malware threats applicable to the CPS network. Existing binary malware classification techniques are only capable of identifying whether a malware sample belongs to a given family or not. However, the majority of advanced and sophisticated malware programs combine features from different families. Accordingly, these malicious programs are not similar enough to any existing malware family and easily evade detection by binary classifiers. This paper proposes a multi-label fuzzy relevance classifier to detect similarities between a given malware sample and other known malware families. We leverage static analysis by utilizing Opcode frequencies as the feature space to classify malware families. We observed that a multi-label classifier leaves a part of the samples unclassified. We call this the instance coverage problem. To overcome this problem, we developed an ensemble-based multi-label fuzzy classification method to suggest the relevance of a malware instance to the stricken families. We tested our technique with three widely used datasets collected from three major malware repositories, namely VirusShare, RandsomwareTracker and Microsoft Malware Classification Challenge (BIG2015). Our results on BIG2015 indicated an accuracy of 97.56%, a precision of 90.68%, and an f-measure of 89.21%. Also, our results on RandsomwareTracker revealed an accuracy of 94.26%, a precision of 87.21%, and an f-measure of 83.52%. Moreover, our results on samples collected from VirusShare demonstrated an accuracy of 94.66%, a precision of 86.41%, and an f-measure of 84.37%. Our system is most suitable for deployment on the edge layer of CPS or other resource-constrained networks to provide a real-time view of malware threats applicable to the underlying network.
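The idea of assigning a sample graded relevance to several families from Opcode frequencies can be illustrated with a small sketch. It is not the authors' ensemble classifier: the opcode vocabulary, the family centroids, the fuzzy c-means style membership formula, and the 0.1 threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def opcode_frequencies(opcodes, vocabulary):
    """Relative frequency of each vocabulary opcode in a disassembled sample."""
    counts = Counter(opcodes)
    total = sum(counts[op] for op in vocabulary) or 1
    return [counts[op] / total for op in vocabulary]


def fuzzy_memberships(sample, centroids, m=2.0):
    """Fuzzy c-means style memberships of a sample to each family centroid.

    Memberships sum to 1; closer centroids receive larger values. The
    fuzzifier m > 1 controls how soft the assignment is.
    """
    dists = {family: math.dist(sample, c) for family, c in centroids.items()}
    exact = [family for family, d in dists.items() if d == 0.0]
    if exact:  # sample coincides with one or more centroids
        return {f: (1.0 / len(exact) if f in exact else 0.0) for f in dists}
    power = 2.0 / (m - 1.0)
    return {
        family: 1.0 / sum((d / other) ** power for other in dists.values())
        for family, d in dists.items()
    }


def relevant_families(memberships, threshold=0.1):
    """Multi-label output: every family whose membership reaches the threshold."""
    return sorted(f for f, u in memberships.items() if u >= threshold)


# Hypothetical vocabulary and per-family mean opcode frequencies.
vocabulary = ["mov", "push", "call", "xor"]
centroids = {
    "family_a": [0.5, 0.3, 0.2, 0.0],
    "family_b": [0.2, 0.2, 0.2, 0.4],
    "family_c": [0.0, 0.1, 0.1, 0.8],
}
sample_opcodes = ["mov"] * 3 + ["push"] * 2 + ["call"] * 2 + ["xor"] * 3
features = opcode_frequencies(sample_opcodes, vocabulary)
memberships = fuzzy_memberships(features, centroids)
labels = relevant_families(memberships)
```

In this toy case the sample is closest to family_b but still clears the threshold for family_a, so it receives two labels instead of being forced into a single family; a sample that clears no threshold would be the kind of uncovered instance the abstract refers to.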