Online Learning with Long Short-Term Memory Networks for Anomaly Detection in Time Series
ABSTRACT. Time series forecasting for streaming data plays an important role in many real-world applications. In practice, streaming time series are often contaminated by anomalies or outliers, abrupt observations that deviate from normal behaviour, while the underlying patterns of the series themselves keep changing over time. The presence of outliers and change points can adversely affect time series analysis and complicate the learning process, especially in an online learning setting. In this research, we focus on the task of anomaly detection, building a model based on Long Short-Term Memory (LSTM) networks that can detect anomalies and adapt whenever the data behaviour changes, in an online fashion. We perform experimental analysis on real datasets to evaluate the performance of the proposed method.
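As a rough illustration of how such a system could work (this is not the authors' implementation; the window length, threshold rule and network size below are illustrative assumptions), an LSTM forecaster can be updated one observation at a time and flag points whose forecast error lies far outside the running error distribution:

```python
import torch
import torch.nn as nn

class OnlineLSTMForecaster(nn.Module):
    """One-step-ahead forecaster; unusually large forecast errors are flagged as anomalies."""
    def __init__(self, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, window):
        out, _ = self.lstm(window)        # window: (1, window_len, 1)
        return self.head(out[:, -1, :])   # prediction of the next value

def detect_stream(stream, window_len=30, k=3.0, lr=1e-3, warmup=10):
    """Process a stream online: forecast, flag anomalies, then adapt with one gradient step."""
    model = OnlineLSTMForecaster()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    errors, flags = [], []
    for t in range(window_len, len(stream)):
        x = torch.tensor(stream[t - window_len:t], dtype=torch.float32).view(1, -1, 1)
        y = torch.tensor([[stream[t]]], dtype=torch.float32)
        pred = model(x)
        err = (pred - y).abs().item()
        if len(errors) > warmup:          # flag if the error is far from the running statistics
            mu = sum(errors) / len(errors)
            sd = (sum((e - mu) ** 2 for e in errors) / len(errors)) ** 0.5
            flags.append(err > mu + k * (sd + 1e-8))
        else:
            flags.append(False)
        errors.append(err)
        loss = nn.functional.mse_loss(pred, y)   # online adaptation to drifting behaviour
        opt.zero_grad(); loss.backward(); opt.step()
    return flags
```

In practice one would suppress the parameter update on flagged points so that anomalies do not contaminate the model, but the skeleton above conveys the detect-and-adapt loop.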
Mindreading for Robots: Predicting Intentions via Dynamical Clustering of Human Postures
ABSTRACT. Recent advancements in robotics suggest a future where social robots will be deeply integrated into our society. In order to understand humans and engage in finer interactions, robots would greatly benefit from the ability to perform intention reading: the capacity to discern the high-level goal that is driving the low-level actions of an observed agent. This is particularly useful in joint action scenarios, where human and robot must collaborate to reach a shared goal: if the robot can predict the actions of its human partner, it can use this information for decision making and improve the quality of the cooperation. This research proposes an artificial cognitive architecture, based on the developmental robotics paradigm, that can estimate the goals of a human partner engaged in a joint task in order to modulate synergistic behavior. This is accomplished using unsupervised dynamical clustering of human skeletal data and a hidden semi-Markov chain. The effectiveness of this architecture has been tested through an interactive cooperative experiment based on a block building game, involving the iCub robot and a human. The results show that the robot is able to adopt a collaborative behavior by correctly performing intention reading based on the partner's physical cues.
Exploring Matrix Completion for Single Word to Multi-Word Transition in Cognitive Robotics
ABSTRACT. Infants acquire language in distinct stages, starting from single gestures and single words, and through the use of gestures they learn multi-word combinations. To achieve this kind of language development in artificial agents, we propose a multimodal computational model for the single- to multi-word transition through gesture-word combinations. Our approach relies on advances in deep models for feature extraction and on casting the supplementary word generation problem as a matrix completion task. Experimental evaluation is carried out on a dataset recorded directly from the humanoid iCub's cameras, comprising the deictic gesture of pointing and real-world objects. Our results further support the potential of the proposed architecture to model early-stage language acquisition.
Using Common-sense Knowledge to detect Human-Object Interactions
ABSTRACT. The task of Object Detection has seen significant progress in recent years, as deep convolutional neural networks, the machine learning model most commonly used in this field, have become increasingly powerful and efficient. However, in order for machines to "understand" an image they have to be able to detect the interactions in it. This is why detecting Human-Object Interactions (HOI) has received attention lately.
However, relying purely on Deep Learning to deal with this task has two main disadvantages. First, using only visual cues makes it challenging to make a prediction in the presence of "noise" such as occlusions, poor resolution, visually similar objects (e.g., surfboard-snowboard) or visually similar actions (e.g., eating-feeding). Second, the discriminating capability of a deep model is dictated by the data it is trained on, which may not be ideal when significantly different visual features correspond to the same action (for example, "riding" may be associated with a sitting subject, yet the same action applies to a person riding a surfboard, where the subject is not sitting). Motivated by this, we use information coming from different common-sense knowledge bases to improve HOI detection.
Use of Graphic and Phonetic Features in Clinical Named Entity Recognition in Chinese
ABSTRACT. Chinese characters carry abundant information in their graphical features, and recent research on Chinese word embeddings tries to use this graphical information at the subword level. This research uses both graphical and phonetic features, based on the presence of phono-semantic characters, to improve the performance of Chinese Clinical Named Entity Recognition.
A Walk-based Model on Entity Graphs for Relation Extraction
ABSTRACT. Relation Extraction is a task of Natural Language Processing (NLP) that aims to identify associations between named entities in text. This talk will focus on providing the background of relation extraction and its importance for NLP.
Additionally, a graph-based neural network model will be introduced for relation extraction in sentences. The model aims to tackle the limitations of existing relation extraction models. The main intuition behind the proposed approach lies in the fact that existing named entities in sentences can provide supportive evidence for intra-sentential relation extraction.
Experiments on two datasets from different domains indicate the effectiveness of the proposed method for sentence-level relation extraction.
Topic-centric sentiment analysis of UK parliamentary debate transcripts
ABSTRACT. Debate transcripts from the UK House of Commons provide access to a wealth of information concerning the opinions and attitudes of politicians and their parties towards arguably the most important topics facing societies and their citizens, as well as potential insights into the democratic processes that take place within Parliament.
By applying natural language processing and machine learning methods to debate speeches, it is possible to automatically determine the attitudes and positions expressed by speakers towards the topics they discuss.
This talk will focus on research on speech-level sentiment analysis and opinion-topic/policy detection, as well as discussing the challenges of working in this domain.
An Empirical Study on End-to-End Sentence Modelling
ABSTRACT. Accurately representing the meaning of a piece of text, otherwise known as sentence modelling, is an important component in many natural language inference tasks. We survey the spectrum of these methods, which lie along two dimensions: input representation granularity and compositional model complexity. Using this framework, our quantitative and qualitative experiments reveal the limitations of the current state-of-the-art model in the context of sentence similarity tasks.
Modelling Instance-Level Annotator Reliability for Natural Language Labelling Tasks
ABSTRACT. When constructing models that learn from noisy labels produced by multiple annotators, it is important to accurately estimate the reliability of annotators. Annotators may provide labels of inconsistent quality due to their varying expertise and reliability in a domain. Previous studies have mostly focused on estimating each annotator’s overall reliability on the entire annotation task. However, in practice, the reliability of an annotator may depend on each specific instance. Only a limited number of studies have investigated modelling per-instance reliability and these only considered binary labels. In this paper, we propose an unsupervised model which can handle both binary and multi-class labels. It can automatically estimate the per-instance reliability of each annotator and the correct label for each instance. We specify our model as a probabilistic model which incorporates neural networks to model the dependency between latent variables and instances. For evaluation, the proposed method is applied to both synthetic and real data, including two labelling tasks: text classification and textual entailment. Experimental results demonstrate our novel method can not only accurately estimate the reliability of annotators across different instances, but also achieve superior performance in predicting the correct labels and detecting the least reliable annotators compared to state-of-the-art baselines.
ABSTRACT. Entity mentions embedded in longer entity mentions are referred to as nested entities. Most named entity recognition (NER) systems deal only with the flat entities and ignore the inner nested ones, which fails to capture finer-grained semantic information in underlying texts. To address this issue, we propose a novel neural model to identify nested entities by dynamically stacking flat NER layers. Each flat NER layer is based on the state-of-the-art flat NER model that captures sequential context representation with a bidirectional long short-term memory (LSTM) layer and feeds it to the cascaded CRF layer. Our model merges the output of the LSTM layer in the current flat NER layer to build new representations for detected entities and subsequently feeds them into the next flat NER layer. This allows our model to extract outer entities by taking full advantage of information encoded in their corresponding inner entities, in an inside-to-outside way. Our model dynamically stacks the flat NER layers until no outer entities are extracted. Extensive evaluation shows that our dynamic model outperforms state-of-the-art feature-based systems on nested NER, achieving 74.7% and 72.2% on GENIA and ACE2005 datasets, respectively, in terms of F-score.
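A structural sketch of the layer-stacking idea may help (this is illustrative only: the paper's flat layers end in a CRF, which is replaced here by a per-token softmax classifier to keep the example short, and all names and sizes are invented):

```python
import torch
import torch.nn as nn

class FlatNERLayer(nn.Module):
    """One flat NER layer: BiLSTM encoder plus a per-token tag classifier
    (a CRF would sit here in the full model)."""
    def __init__(self, input_dim, hidden_dim, num_tags):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, bidirectional=True, batch_first=True)
        self.clf = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, x):
        h, _ = self.lstm(x)              # contextual token representations
        tags = self.clf(h).argmax(-1)    # predicted tags for this layer (0 = 'O')
        return h, tags

def merge_entities(h, tags):
    """Build the next layer's input: average the LSTM states inside each
    detected entity span and broadcast the average back over its tokens."""
    out = h.detach().clone()
    start = None
    for i, t in enumerate(tags[0].tolist() + [0]):   # batch size 1 assumed
        if t != 0 and start is None:
            start = i
        elif t == 0 and start is not None:
            out[0, start:i] = h[0, start:i].mean(0, keepdim=True)
            start = None
    return out

def nested_ner(embeddings, num_tags, hidden_dim=64, max_depth=4):
    """Dynamically stack flat NER layers until a layer predicts no entities."""
    predictions, x, dim = [], embeddings, embeddings.size(-1)
    for _ in range(max_depth):
        h, tags = FlatNERLayer(dim, hidden_dim, num_tags)(x)
        predictions.append(tags)
        if (tags != 0).sum() == 0:       # no outer entities left: stop stacking
            break
        x, dim = merge_entities(h, tags), 2 * hidden_dim
    return predictions
```

The layers here are untrained, so the sketch only shows the control flow: inner entities detected at one level are collapsed into single representations that the next level can label as parts of outer entities.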
Echo State: Embracing Randomness in Recurrent Neural Networks
ABSTRACT. Echo State Networks (ESNs) are a recurrent neural network model in which a large random recurrent state is used. In this talk, I discuss the advantages of this kind of model over conventional recurrent architectures, as well as progress that has been made in understanding exactly how they work. I will also discuss my research in this field, looking at fundamental properties of linear ESNs.
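To make the idea concrete, a minimal ESN can be written in a few lines (the reservoir size, spectral radius and ridge penalty below are illustrative choices): the random recurrent reservoir is never trained, and only the linear readout is fitted by ridge regression.

```python
import numpy as np

def esn_fit_predict(u_train, y_train, u_test, n_res=200, rho=0.9, ridge=1e-6, seed=0):
    """Minimal Echo State Network: fixed random reservoir, trained linear readout."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, (n_res, 1))          # random input weights (fixed)
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))         # random recurrent weights (fixed)
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))    # rescale spectral radius below 1

    def run(u):
        x, states = np.zeros(n_res), []
        for u_t in u:
            x = np.tanh(W_in @ np.array([u_t]) + W @ x)  # reservoir update (never trained)
            states.append(x.copy())
        return np.array(states)

    X = run(u_train)
    # the readout is the only trained component: ridge regression on reservoir states
    W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ np.asarray(y_train))
    return run(u_test) @ W_out
```

For a one-step-ahead forecasting task, y_train would simply be u_train shifted by one step.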
Novel approaches towards reducing switching current in magnetic tunnel junctions for STT-MRAM
ABSTRACT. Advances in modern consumer electronics have put an ever-increasing demand on the data storage industry to provide high-capacity and smaller-scale storage media. Due to its fast read/write speed and its potential to overcome current DRAM scaling limitations, STT-MRAM (spin transfer torque magnetic random access memory) is a promising candidate for next-generation non-volatile memory [1]. However, STT-MRAM faces challenges for future development, which depend on optimising its key component, the magnetic tunnel junction (MTJ) [2]. Critical parameters required of the MTJ device in order to improve STT-MRAM performance include a high tunnelling magnetoresistance (TMR) and a low switching current density [3]. This work therefore investigates the potential of replacing the device's magnetic free layer with multiple magnetic layers (a graded free layer) to reduce the MTJ switching current density. MTJs have been fabricated by sputter deposition, and deposition conditions have been optimised in order to achieve good TMR. Magnetic and electrical transport properties of the fabricated MTJs have been investigated to evaluate the benefit of including a graded free layer for improving STT-MRAM performance.
[1] Y. Chen et al., IEEE Xplore, 11283456 (March 2010).
[2] D. Apalkov, B. Dieny & J. M. Slaughter, Proceedings of the IEEE 104, 1796-1830 (2016).
[3] D.-Y. Lee, S.-H. Hong, S.-E. Lee & J.-G. Park, Sci. Rep. 6, 38125 (2016).
ABSTRACT. Polyhedral techniques are, when applicable, an effective instrument for automatic parallelization and data-locality optimization of sequential programs. This talk motivates their adoption in OpenStream, a task-parallel streaming language following the dataflow model of execution. We show (1) that it is possible to exploit the parallelism that naturally arises from dataflow task graphs with the loop tiling transformations provided by the polyhedral model, and (2) that a combination of dataflow task-parallelism and polyhedral optimizations performs significantly better than polyhedral parallelization techniques applied to sequential programs. Our technique obtains parallel speedups greater than 1.3x compared to state-of-the-art polyhedral-tiled OpenMP for a simple Gauss-Seidel kernel. However, stream indexing is often polynomial in the general case, severely limiting the set of OpenStream programs amenable to polyhedral tools and hindering the automation of our technique. We further investigate how the approach of Feautrier may offer a path not only to automatically convert fine-grained task-level concurrency into coarser-grained tasks, but also to schedule under resource constraints.
Adaptation of Instantaneous Time Mirror from Water Waves to Electromagnetic Waves
ABSTRACT. Cancer treatments come with risks, and it would be desirable to mitigate or reduce these risks as much as possible. This research project explores the possibility of developing a new cancer treatment that would solve problems with current treatments, such as radiation misalignment in radiotherapy and non-uniform heat distribution in hyperthermia (thermotherapy). Hyperthermia with a technique known as the Time Reversal Mirror (TRM) uses antennas to back-propagate and focus radiation on the tumour to apply heat. Here we study an alternative approach to creating reversed propagations, known as the Instantaneous Time Mirror (ITM), which does not require the use of antennas and might well provide a better focus of heat on the tumour. In this presentation, we discuss how we adapt the principles of ITM, previously demonstrated with water waves, to electromagnetic waves, and conduct computer simulations to study ITM-inducing methods. We present promising preliminary results in 2D which demonstrate that ITM is possible for electromagnetic waves, and provide a characterization of the reversed propagations produced.
A functional linear regression approach for electric load forecasting
ABSTRACT. As more and more renewable energy options have been added to the electrical grid, the need for a more efficient, robust and smarter grid has increased. The number of electric vehicles is also expected to increase in the future, which would place a significant amount of strain on the electrical grid. Therefore, there is an increased need for advanced short-term load forecasting techniques in order to maintain the quality of the current electrical grid and ensure that all the generation resources available are utilized efficiently. In this talk, a functional linear regression approach is proposed to forecast short-term electrical load one day in advance. The functional approach is useful as it gives a complete demand curve, which makes planning easier for an electric utility. The forecast is obtained by using a functional B-spline approximation of past values. The performance of this functional data technique has been assessed using historical hourly electric load data from an American electric utility. The results were obtained for four different regions separately and then aggregated. The aggregated approach is more useful than an overall prediction, as individual models can capture details unique to a particular region. The aggregated result was compared with the overall result for the whole region and with an ARIMA model.
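One plausible way to realise this pairing of a B-spline basis with a linear predictor is sketched below (this is not the authors' model: the knot placement, basis size and the coefficient-to-coefficient regression are illustrative simplifications of a functional linear model):

```python
import numpy as np
from scipy.interpolate import make_lsq_spline, BSpline

def spline_coeffs(curve, hours, knots, k=3):
    """Project one daily load curve (24 hourly values) onto a B-spline basis."""
    return make_lsq_spline(hours, curve, knots, k=k).c

def forecast_next_day(daily_curves, k=3, n_interior=6):
    """Regress tomorrow's B-spline coefficients on today's, then
    reconstruct tomorrow's full demand curve from the predicted coefficients."""
    hours = np.arange(24, dtype=float)
    interior = np.linspace(2.0, 21.0, n_interior)
    knots = np.r_[[hours[0]] * (k + 1), interior, [hours[-1]] * (k + 1)]

    C = np.array([spline_coeffs(c, hours, knots, k) for c in daily_curves])
    X, Y = C[:-1], C[1:]                        # today's coefficients -> tomorrow's
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)   # least-squares coefficient map
    c_next = C[-1] @ B                          # predicted coefficients for tomorrow
    return BSpline(knots, c_next, k)(hours)     # reconstructed 24-hour demand curve
```

The same routine could be fitted per region and the regional forecasts summed, mirroring the aggregated approach described above.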
ABSTRACT. Regulation of gene expression occurs through binding of proteins called transcription factors (TFs) to the DNA. I will show how CNN models can be trained to predict experimental outcomes, how the prediction can be attributed to the elements of the input, and how to perform multitask learning with a bigger dataset and regularise a high-dimensional classification problem.
Modelling Expression Rates of Hypoxia-inducible Factors in Growing Tumours: Investigating Potentials for Non-Invasive Cancer Therapies
ABSTRACT. In-silico modelling has the potential to complement laboratory studies by accurately simulating biological processes, offering insights into tumour dynamics, significantly reducing experimental costs, and improving the quality of measurements. Furthermore, in-silico studies make it possible to explore counterfactual situations and test multiple hypotheses in a quick and efficient manner. We have developed a comprehensive artificial model (informed by the medical literature) to investigate the potential for a novel non-invasive therapy aimed at containing cancer growth in tumour cells. Hypoxia-Inducible Factors have previously been identified as a target for non-invasive cancer therapy, the requirement being that their expression could be artificially enhanced. While anti-angiogenic therapy has been suggested as a potential vehicle to achieve this, the extent to which this is possible remains unclear, due to the challenges of making the necessary in-vivo measurements.
We executed in-silico experiments to explore the effects of anti-angiogenic therapy. The results suggest that it can successfully act as a necessary precursor, predisposing the tumour mass to non-invasive anti-cancer therapies. In addition to their medical value, the results also highlight the importance of in-silico approaches as a complement to laboratory studies. The present study paves the way for future in-silico and laboratory experiments aimed at devising non-invasive therapies based on the joint action of anti-angiogenic therapy and hypoxia-inducible-factor targeting.
ABSTRACT. The validation of physically-based deformable models in computer graphics is typically qualitative (through visual plausibility), which is necessarily subjective. This research facilitates quantitative validation through the construction of a single scalar output that quantifies the agreement between the complete deformation histories of test and reference models. The proposed software framework can provide an objective measure of accuracy and a standardised way to quantitatively compare the accuracy of one method against another, whilst also supplying a quantitative rationale for trading off accuracy against performance.
Gaussian process modelling of count data with application to bulk and single-cell RNA-Seq
ABSTRACT. Many biological datasets can be summarised as count data, including data from high-throughput sequencing assays such as RNA-Seq or ATAC-Seq, or data from smFISH assays. Gaussian process (GP) regression provides a useful non-parametric approach for modelling time-series or spatial data. However, current implementations of GP regression do not support scalable inference over count data for large-scale datasets derived from genomic technologies. We have developed a GP regression method (GPcount) for data with negative binomial and zero-inflated negative binomial likelihoods. We show that our method provides robust inference for these intractable likelihood functions and good performance for data with high dispersion or dropout. We use a transformed GP that relates the GP prior to a negative binomial likelihood through an exponentiated link function. We model dropout using a zero-inflated negative binomial likelihood in which the gene-specific dropout rate is linked to the GP prior rate through a Michaelis-Menten function. We use our model to identify differentially expressed genes in bulk RNA-seq time series and in single-cell RNA-seq (scRNA-seq) after pseudotime inference. We validate our model on simulated bulk RNA-seq and scRNA-seq time series and assess its performance using ROC curves. Our results confirm that GPcount can efficiently model high dispersion and dropout in bulk and scRNA-seq time series where simpler Gaussian and Poisson likelihoods fail.
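The abstract does not spell out the likelihood, but a formulation consistent with its description would look like the following (here $k(\cdot,\cdot)$ is the GP covariance, $\alpha$ a dispersion parameter and $K$ a Michaelis-Menten constant; the exact parameterisation used by GPcount may differ):

```latex
\begin{align}
  f(t) &\sim \mathcal{GP}\big(0,\, k(t, t')\big), \qquad \mu(t) = \exp\big(f(t)\big) \\
  p_{\mathrm{drop}}(t) &= \frac{K}{K + \mu(t)} \qquad \text{(Michaelis--Menten dropout link)} \\
  y(t) &\sim p_{\mathrm{drop}}(t)\,\delta_{0} + \big(1 - p_{\mathrm{drop}}(t)\big)\,\mathrm{NB}\big(\mu(t), \alpha\big)
\end{align}
```

Setting $p_{\mathrm{drop}} = 0$ recovers the plain negative binomial case used for bulk RNA-seq.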
GeVIR: Gene Variation Intolerance Rank – an in silico approach to prioritise disease candidate genes.
ABSTRACT. With the advent of large population variant databases, it has become possible to quantitatively assess the ability of genes to tolerate functional genetic variation, leading to the development of gene-level variation intolerance metrics. However, previous methods focused on analysing the variant load in a gene, not the distribution pattern of variants within a gene. Utilising sequence data from 138,632 whole exome/genome sequences, we developed the Gene Variation Intolerance Rank (GeVIR), which is based on the hypothesis that long regions lacking protein-altering variants within a gene can be used to prioritise disease candidate genes. To evaluate GeVIR, we compared its ability to prioritise known genes responsible for Mendelian diseases with that of the gnomAD gene constraint metrics (https://gnomad.broadinstitute.org/). GeVIR showed performance superior to missense constraint metrics and comparable, but complementary, to the loss-of-function (LoF) constraint metric, since it was able to prioritise short genes, for which LoF constraint currently cannot be confidently estimated.