Departmental workshop on applications of machine learning and data mining

We will be running departmental mini-workshop on the applications of machine learning and data mining on the 10th of June, 2016.

For the workshop we will be welcoming Borja Balle from the University of Lancaster, whose talk will be given along others from members of the department.

For further information please contact Alexander Kurz.

The morning's talks will be on the 8th floor of the Attenborough Tower in room ATT802. Borja Balle's seminar will be in ATT LT3.

The current schedule is as follows:

Leandro Minku

Online Ensemble Learning of Data Streams with Gradually Evolved Classes

10:00-10:30

ATT 802

In machine learning, class evolution is the phenomenon of class emergence and disappearance. It is likely to occur in many data stream problems, where additional training data is made available over time. For example, in the problem of classifying tweets according to their topic, new topics may emerge over time, and certain topics may become unpopular and not discussed anymore. Therefore, class evolution is an important research topic for data stream mining. Existing work implicitly regards class evolution as an abrupt change. However, in many real world problems, classes emerge or disappear gradually. In this talk, I will present a class-based ensemble approach able to deal with gradually evolved classes. The ensemble maintains a base learner for each class and dynamically creates, updates and (de)activates base learners whenever new training data become available in order to quickly adjust to class evolution. A dynamic under-sampling mechanism is also used to deal with the class-imbalance problem resulting from gradual class evolution. Empirical studies demonstrate the effectiveness of the proposed approach in various class evolution scenarios in comparison to existing class evolution approaches.

Abdullah Alshanqiti and Reiko Heckel

Extracting Visual Contracts from Java Programs

10:30-11:00 ATT 802

Visual contracts model the operations of components or services by pre- and post-conditions formalised as graph transformation rules. They provide a precise intuitive notation to support testing, understanding and analysis of software. However, due to their detailed specification of data states and transformations, modelling real applications is an error-prone process. In this paper we propose a dynamic approach to reverse engineering visual contracts from Java based on tracing the execution of Java operations. The resulting contracts give an accurate description of the observed object transformations, their effects and preconditions in terms of object structures, parameter and attribute values, and their generalised specification by universally quantified (multi) objects. An interesting question is the notion of correctness and completeness of this method. In particular the approach, being based on testing, only explores a subset of the possible behaviours and only records the conditions and effects for a subset of the classes in the system.

For the slides see https://prezi.com/eqwd9wyesdzy.

30 minute break
Neil Walkinshaw Black-box Test Data Generation by Uncertainty Sampling
11:30-12:00

ATT 802

Our work is concerned with the problem of generating test inputs for a program in the absence of any source code that can be used as guidance. Several papers have investigated the possibility of using models inferred by observing its behaviour during test executions to drive the identification of new tests. So far, test-generation from inferred models has been limited, and has often depended on the type of model inferred. In this work, we investigate the application of a Machine Learning technique known as `Uncertainty Sampling'; random tests are generated, and the tests that are selected are those about which the inferred model is `least certain'. We provide an implementation (including a GP engine that is used for model inference), and an evaluation on six subject systems from the Apache Commons Math framework. The results indicate that this approach outperforms conventional and Adaptive Random Testing.
Nir Piterman

Extending Probabilistic Temporal Logics

12:00-12:30 ATT 802

The increase in popularity of probabilistic model checking has led to further interest in the studies of probabilistic temporal logics. In this talk I will discuss our work on the introduction of a probabilistic $\mu$-calculus that is strong enough to encode well-known logics such as pCTL and pCTL*. Our logic uses probabilistic quantification over paths as an atomic operation, leading to a logic that is tightly connected to two-player games, with a game semantics that is reminiscent of that of the classical $\mu$-calculus. In addition, the complexity of its model-checking procedure is the same as that of the classical $\mu$-calculus and the algorithms are similar. I will further highlight the ability of this logic to identify repeating patterns of probability by discussing an extention of pCTL with recursive operators.

The talk is based on joint publications with Pablo Castro and Cecilia Kilmurray that appeared in STACS'15 and ICFEM'15

Borja de Balle Pigem(U.o. Lancaster) Differentially Private Policy Evaluation
2pm-3pm, ATT LT3 Differential privacy is a mathematical model of privacy which has recently gained a lot of attention within the machine learning community. Essentially, an algorithm operating on a database is differentially private if its output only reveals a small amount of information about any entity in the database. It can be shown that this provides a notion of privacy which is robust against attackers with side knowledge. The first part of this talk will review the basics of differential privacy. In the second part we will present our recent work on differentially private algorithms for policy evaluation in Markov decision processes. Policy evaluation is an important part of reinforcement learning algorithms. In the batch setting, Monte Carlo algorithms for policy evaluation can be formalized using a least squares regression problem with correlated data. Following this approach, we will present two private algorithms for policy evaluation, explain their privacy and utility guarantees, and discuss empirical results on a simple problem that mimics typical applications of Markov decision processes to medical domains.
Rest of afternoon Open discussion.

Share this page: