ARCH19: Papers with Abstracts

Papers
Abstract. This report presents the results of a friendly competition for formal verification of continuous and hybrid systems with piecewise constant dynamics. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2019. In this third edition, six tools have been applied to solve five different benchmark problems in the category for piecewise constant dynamics: BACH, Lyse, HyCOMP, PHAVer/SX, PHAVerLite, and VeriSiMPL. Compared to last year, a new tool has participated (HyCOMP) and PHAVerLite has replaced PHAVer-lite. The result is a snapshot of the current landscape of tools and the types of benchmarks they are particularly suited for. Due to the diversity of problems, we are not ranking tools, yet the presented results probably provide the most complete assessment of tools for the safety verification of continuous and hybrid systems with piecewise constant dynamics to date.
Abstract. This report presents the results of a friendly competition for formal verification of continuous and hybrid systems with linear continuous dynamics. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2019. In its third edition, seven tools have been applied to solve six different benchmark problems in the category for linear continuous dynamics (in alphabetical order): CORA, CORA/SX, HyDRA, Hylaa, JuliaReach, SpaceEx, and XSpeed. This report is a snapshot of the current landscape of tools and the types of benchmarks they are particularly suited for. Due to the diversity of problems, we are not ranking tools, yet the presented results provide one of the most complete assessments of tools for the safety verification of continuous and hybrid systems with linear continuous dynamics up to this date.
Abstract. We present the results of a friendly competition for formal verification of continuous and hybrid systems with nonlinear continuous dynamics. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2019. This year, six tools participated (in alphabetical order): Ariadne, CORA, DynIbex, Flow*, Isabelle/HOL, and JuliaReach. They were applied to reachability analysis of four benchmark problems, one of them with hybrid dynamics. We do not rank the tools based on the results, but show the current status and identify the potential advantages of different tools.
Abstract. This report presents the results of a friendly competition for formal verification and policy synthesis of stochastic models. It also introduces new benchmarks within this category, and recommends next steps for this category towards next year’s edition of the competition. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in Spring 2019.
Abstract. This report presents the results of a friendly competition for formal verification of continuous and hybrid systems with artificial intelligence (AI) components. Specifically, it considers machine learning (ML) components in cyber-physical systems (CPS), such as feedforward neural networks used as feedback controllers in closed-loop systems. This class of systems is classically known as intelligent control systems or, in more modern and specific terms, neural network control systems (NNCS). For future iterations, we more broadly refer to this category as AI and NNCS (AINNCS). The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2019. In the first edition of this AINNCS category at ARCH-COMP, three tools have been applied to solve five different benchmark problems (in alphabetical order): NNV, Sherlock, and Verisig. This report is a snapshot of the current landscape of tools and the types of benchmarks for which these tools are suited. Because the problems are diverse and this is the first iteration of this category, we are not ranking tools in terms of performance, yet the presented results probably provide the most complete assessment of tools for the safety verification of NNCS.
Abstract. This report presents results of a friendly competition for formal verification of continuous and hybrid systems with linear continuous dynamics. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2019. In its third edition, three tools have been applied to solve three different benchmark problems in the category of bounded model checking of hybrid systems with piecewise constant dynamics (in alphabetical order): BACH, HyDRA, and XSpeed. Compared to last year, HyDRA is equipped with new optimization techniques and its performance has improved accordingly. This report is a snapshot of the current landscape of tools and the types of benchmarks they are particularly suited for. Due to the diversity of problems, we are not ranking tools, and we welcome more tools to join this friendly competition in future events.
Abstract. This report presents the results from the 2019 friendly competition in the ARCH workshop for the falsification of temporal logic specifications over Cyber-Physical Systems. We describe the organization of the competition and how it differs from previous years. We give background on the participating teams and tools and discuss the selected benchmarks and results. The benchmarks are available on the ARCH website, as well as in the competition’s GitLab repository. The main outcome of the 2019 competition is a common benchmark repository, and an initial baseline for falsification, with results from multiple tools, which will facilitate comparisons and tracking of the state-of-the-art in falsification in the future.
Abstract. This paper reports on the Hybrid Systems Theorem Proving (HSTP) category in the ARCH-COMP Friendly Competition 2019. The most important characteristic features of the HSTP category remain as in the previous edition [MST+18]: i) The flexibility of programming languages as structuring principles for hybrid systems, ii) The unambiguity and precision of program semantics, and iii) The mathematical rigor of logical reasoning principles. The HSTP category especially features many nonlinear and parametric continuous and hybrid systems. Owing to the nature of theorem proving, HSTP again accommodates three modes: A) Automatic in which the entire verification is performed fully automatically without any additional input beyond the original hybrid system and its safety specification. H) Hints in which select proof hints are provided as part of the input problem specification, allowing users to communicate specific advice about the system such as loop invariants. S) Scripted in which a significant part of the verification is done with dedicated proof scripts or problem-specific proof tactics. This threefold split makes it possible to better identify the sources of scalability and efficiency bottlenecks in hybrid systems theorem proving. The existence of all three categories also makes it easier for new tools with a different focus to participate in the competition, wherever they focus in the spectrum from fast proof checking all the way to full automation. The types of benchmarks considered and experimental findings are described in this paper as well.
Abstract. This report presents the results of the repeatability evaluation for the 3rd International Competition on Verifying Continuous and Hybrid Systems (ARCH-COMP'19). The competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2019, affiliated with the Cyber-Physical Systems and Internet of Things Week (CPS-IoT Week'19). In its third edition, twenty-five tools submitted artifacts through a Git repository for the repeatability evaluation, applied to solve benchmark problems for eight competition categories. The majority of participants adhered to new requirements for this year's repeatability evaluation, namely to submit scripts to automatically install and execute tools in containerized virtual environments (specifically Dockerfiles to execute within Docker). The repeatability results represent a snapshot of the current landscape of tools and the types of benchmarks for which they are particularly suited and for which others may repeat their analyses. Due to the diversity of problems in verification of continuous and hybrid systems, and following standard practice in repeatability evaluations, we evaluate each tool as passing or failing repeatability.
Abstract. Collision detection algorithms are used in aerospace, swarm robotics, automotive, video gaming, dynamics simulation and other domains. As many applications of collision detection run online, timing requirements are imposed on the algorithm runtime: algorithms must, at a minimum, keep up with the passage of time. Even offline reachability computation can be slowed down by the process of safety checking when n is large and the specification is n-to-n collision avoidance. In practice, this places a limit on the number of objects, n, that can be concurrently tracked or verified. In this paper, we present an improved method for efficient object tracking and collision detection, based on a modified version of the axis-aligned bounding-box (AABB) tree data structure. We consider 4D AABB Trees, where a time dimension is added to the usual three space dimensions, in order to enable per-object time steps when checking for collisions in space-time. We evaluate the approach on a space debris collision benchmark, demonstrating efficient checking beyond the full catalog of n = 16848 space objects made public by the U.S. Strategic Command on www.space-track.org.
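To illustrate the idea of adding a time dimension to an axis-aligned bounding box, here is a minimal sketch in Python. It is not the paper's implementation: the `AABB4` class, its field names, and the `union` helper (the kind of merge a tree would use to build parent nodes) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical layout: each corner is (x, y, z, t).
Vec4 = Tuple[float, float, float, float]


@dataclass(frozen=True)
class AABB4:
    """Axis-aligned box in space-time (illustrative, not the paper's type)."""
    lo: Vec4  # minimum corner
    hi: Vec4  # maximum corner

    def overlaps(self, other: "AABB4") -> bool:
        # Two boxes intersect only if their intervals overlap on every axis,
        # including the time axis.
        return all(
            a_lo <= b_hi and b_lo <= a_hi
            for a_lo, a_hi, b_lo, b_hi in zip(self.lo, self.hi, other.lo, other.hi)
        )

    def union(self, other: "AABB4") -> "AABB4":
        # Smallest box enclosing both; a tree node would bound its children this way.
        lo = tuple(min(a, b) for a, b in zip(self.lo, other.lo))
        hi = tuple(max(a, b) for a, b in zip(self.hi, other.hi))
        return AABB4(lo, hi)


# Same region of space, but disjoint time intervals: no space-time collision.
a = AABB4((0.0, 0.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0))
b = AABB4((0.5, 0.5, 0.5, 2.0), (1.5, 1.5, 1.5, 3.0))
# Same region of space and overlapping time intervals: candidate collision.
c = AABB4((0.5, 0.5, 0.5, 0.5), (1.5, 1.5, 1.5, 1.5))
```

Because time is just another axis, objects sampled with different time steps can be compared directly: a box pair is a collision candidate only when it overlaps in both space and time.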
Abstract. Benchmark Proposal: The implementation of digital control systems in complex multi-core or distributed real-time systems results in non-deterministic input/output timing. Such timing deviations typically lead to degraded performance or even instability, which in turn may jeopardize safety goals. We present the problem of proving worst-case guarantees for given input/output timing bounds as a benchmark for the verification of hybrid dynamical systems.
Abstract. This benchmark suite presents a detailed description of a series of closed-loop control systems with artificial neural network controllers. In many applications, feed-forward neural networks are heavily involved in the implementation of controllers by learning and representing control laws through several methods such as model predictive control (MPC) and reinforcement learning (RL). The networks that we consider in this manuscript are feed-forward neural networks consisting of multiple hidden layers with ReLU activation functions and a linear activation function in the output layer. While neural network controllers have been able to achieve desirable performance in many contexts, they also present a unique challenge in that it is difficult to provide any guarantees about the correctness of their behavior or reason about the stability of a system that employs them. Thus, from a controls perspective, it is necessary to verify them in conjunction with their corresponding plants in closed-loop. While there have been a handful of works proposed towards the verification of closed-loop systems with feed-forward neural network controllers, this area still lacks attention and a unified set of benchmark examples on which verification techniques can be evaluated and compared. Thus, to this end, we present a range of closed-loop control systems ranging from two to six state variables, and a range of controllers with sizes in the range of eleven neurons to a few hundred neurons in more complex systems.
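The controller structure described above (ReLU hidden layers, linear output layer, evaluated in closed loop with a plant) can be sketched in a few lines of Python. Everything concrete here is an assumption for illustration: the hand-picked weights, the two-state double-integrator `plant_step`, and the step size `dt` are hypothetical and are not taken from the benchmark suite.

```python
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight matrix, bias vector)


def affine(W: List[List[float]], b: List[float], x: Sequence[float]) -> List[float]:
    # Computes W @ x + b row by row.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(v: Sequence[float]) -> List[float]:
    return [max(0.0, vi) for vi in v]


def controller(layers: List[Layer], x: Sequence[float]) -> List[float]:
    # Hidden layers use ReLU; the last layer is purely linear.
    *hidden, (W_out, b_out) = layers
    h = list(x)
    for W, b in hidden:
        h = relu(affine(W, b, h))
    return affine(W_out, b_out, h)


def plant_step(x: Sequence[float], u: float, dt: float = 0.1) -> List[float]:
    # Hypothetical plant: forward-Euler step of a double integrator.
    pos, vel = x
    return [pos + dt * vel, vel + dt * u]


def simulate(layers: List[Layer], x0: Sequence[float], steps: int) -> List[List[float]]:
    # Closed loop: the network reads the state, the plant applies its output.
    x = list(x0)
    trajectory = [x]
    for _ in range(steps):
        u = controller(layers, x)[0]
        x = plant_step(x, u)
        trajectory.append(x)
    return trajectory


# Hypothetical weights: one hidden layer with two neurons, chosen so that
# the network computes u = -(pos + vel).
layers: List[Layer] = [
    ([[1.0, 1.0], [-1.0, -1.0]], [0.0, 0.0]),  # hidden layer (ReLU)
    ([[-1.0, 1.0]], [0.0]),                    # output layer (linear)
]
trajectory = simulate(layers, [1.0, 0.0], steps=5)
```

A single simulated trajectory like this gives no guarantee; the verification problem posed by the benchmarks is to bound the behavior of the loop for all initial states in a given set.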
Abstract. Tool presentation: We present work in progress on a stand-alone implementation of Lagrangian reachability, a recently introduced over-approximation technique for nonlinear continuous systems. Unlike the previous prototype, the current implementation does not depend on the over-approximation tool CAPD, and invokes an improved Lohner’s QR method to tame the infamous wrapping effect.