StatScale Workshop 2020 Abstracts

Tom Berrett (University of Warwick)

Title:  Online nonparametric changepoint detection under local privacy constraints


Abstract: In recent years there has been a surge of interest in data analysis methodology that is able to achieve strong statistical performance without compromising the privacy and security of individual data holders, for example by only publishing randomised versions of data sets. In this talk I will present new work that studies the problem of online changepoint detection under local differential privacy (LDP) constraints, where the statistician does not have direct access to the raw data at any point.

 

At each time point t, the raw data is assumed to be of the form (X_t, Y_t), where X_t is a d-dimensional feature vector and Y_t is a response variable. Our aim is to detect changes in the regression function. We provide algorithms which respect the LDP constraint, which control the false alarm probability, and which detect changes with a minimal (rate-optimal) delay. We compare the optimal rate to the optimal rate in the benchmark, non-private setting, thus quantifying the cost of privacy. Our techniques also apply to give optimal algorithms for the simpler problem of the detection of a change in mean in univariate data, under privacy constraints.
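For concreteness, the local-privatisation step can be sketched as follows — a minimal stdlib illustration of the standard Laplace mechanism for bounded data, not the algorithm of the talk; all names and parameter choices are ours.

```python
import math
import random

def laplace_noise(scale):
    # Inverse-CDF sampling of a Laplace(0, scale) variable.
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def privatise(x, epsilon, lo=0.0, hi=1.0):
    """Release a clipped value plus Laplace noise: epsilon-LDP for data in [lo, hi]."""
    clipped = min(max(x, lo), hi)
    return clipped + laplace_noise((hi - lo) / epsilon)

random.seed(1)
# Raw stream with a change in mean at t = 100 (values in [0, 1]); the
# statistician only ever sees the privatised stream.
raw = [0.3] * 100 + [0.7] * 100
private_stream = [privatise(x, epsilon=2.0) for x in raw]
```

The detection algorithm then operates on `private_stream` alone; the noise scale (hi − lo)/ε is the price paid for the privacy guarantee.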

 

This is joint work with Yi Yu.

Oliver Feng (University of Cambridge)

Title:  Nonparametric, tuning-free estimation of S-shaped functions


Abstract: We consider the nonparametric estimation of an S-shaped regression function. The least squares estimator provides a very natural, tuning-free approach, but results in a nonconvex optimisation problem, since the inflection point is unknown. We show that the estimator may nevertheless be regarded as a projection onto a finite union of convex cones, which allows us to propose a mixed primal-dual bases algorithm for its efficient, sequential computation. After developing a general projection framework that demonstrates the consistency and robustness to misspecification of the estimator, we prove worst-case and adaptive risk bounds for the estimation of the regression function, in the form of sharp oracle inequalities, and establish bounds on the rate of convergence of the estimated inflection point. These theoretical results reveal not only that the estimator achieves the minimax optimal rate for both the estimation of the regression function and its inflection point (up to a logarithmic factor in the latter case), but also that it is able to achieve an almost-parametric rate when the true regression function is piecewise affine with not too many affine pieces. Simulations also confirm the desirable finite-sample properties of the estimator.

 

 

 

Piotr Fryzlewicz (LSE)

Title:  Narrowest Significance Pursuit: inference for multiple change-points in linear models


Abstract: We propose Narrowest Significance Pursuit (NSP), a general and flexible methodology for automatically detecting localised regions in data sequences which each must contain a change-point, at a prescribed global significance level. Here, change-points are understood as abrupt changes in the parameters of an underlying linear model. NSP works by fitting the postulated linear model over many regions of the data, using a certain multiresolution sup-norm loss, and identifying the shortest interval on which the linearity is significantly violated. The procedure then continues recursively to the left and to the right until no further intervals of significance can be found. The use of the multiresolution sup-norm loss is a key feature of NSP, as it enables the transfer of significance considerations to the domain of the unobserved true residuals, a substantial simplification. It also guarantees important stochastic bounds which directly yield exact desired coverage probabilities, regardless of the form or number of the regressors. NSP works with a wide range of distributional assumptions on the errors, including Gaussian with known or unknown variance, some light-tailed distributions, and some heavy-tailed, possibly heterogeneous distributions via self-normalisation. It also works in the presence of autoregression. The mathematics of NSP is, by construction, uncomplicated, and its key computational component uses simple linear programming. In contrast to the widely studied “post-selection inference” approach, NSP enables the opposite viewpoint and paves the way for the concept of “post-inference selection”. Pre-CRAN R code implementing NSP is available at https://github.com/pfryz/nsp.
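For the simplest piecewise-constant case, the core NSP step can be caricatured in a few lines — a brute-force sketch under Gaussian-noise assumptions, not the paper's linear-programming implementation; function names are ours.

```python
import math

def multiresolution_stat(z):
    """Multiresolution sup-norm of a sequence: the maximum over all intervals
    of the normalised absolute partial sum |sum z[s:e]| / sqrt(e - s)."""
    n = len(z)
    ps = [0.0]  # prefix sums for O(1) interval sums
    for v in z:
        ps.append(ps[-1] + v)
    best = 0.0
    for s in range(n):
        for e in range(s + 1, n + 1):
            best = max(best, abs(ps[e] - ps[s]) / math.sqrt(e - s))
    return best

def shortest_significant_interval(y, threshold):
    """Shortest interval on which fitting a constant leaves residuals whose
    multiresolution norm exceeds the threshold, i.e. an interval that must
    contain a change-point.  Returns (s, e) for y[s:e], or None."""
    n = len(y)
    best = None
    for s in range(n):
        for e in range(s + 2, n + 1):
            seg = y[s:e]
            mean = sum(seg) / len(seg)
            resid = [v - mean for v in seg]
            if multiresolution_stat(resid) > threshold:
                if best is None or (e - s) < (best[1] - best[0]):
                    best = (s, e)
    return best
```

NSP's recursion would then repeat this to the left and right of the interval found; the brute-force search here is O(n⁴) and only meant to convey the logic.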

 

 

 

Ilmun Kim (University of Cambridge)

Title:  Minimax optimality of permutation tests


Abstract: Permutation tests are widely used in statistics, providing a finite-sample guarantee on the type I error rate whenever the distribution of the samples under the null hypothesis is invariant to some rearrangement. Despite their increasing popularity and empirical success, the theoretical properties of permutation tests, especially their power, have not been fully explored beyond simple cases.

 

In this talk, we attempt to fill this gap by introducing a general framework for analyzing the non-asymptotic power of the permutation test. Our framework will be illustrated in the context of two-sample and independence testing under both discrete and continuous settings. In each setting, we introduce permutation tests based on U-statistics and study their minimax performance. This talk is based on joint work with Sivaraman Balakrishnan and Larry Wasserman.
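A minimal sketch of the two-sample setting (with the difference in means rather than a U-statistic, and with Monte Carlo permutations; names are ours) shows where the finite-sample guarantee comes from:

```python
import random

def perm_test_mean_diff(x, y, n_perm=999, seed=0):
    """Two-sample permutation test with the absolute difference in means as
    the test statistic.  Under exchangeability the returned p-value is
    exactly valid in finite samples."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n = len(x)

    def stat(z):
        return abs(sum(z[:n]) / n - sum(z[n:]) / (len(z) - n))

    observed = stat(pooled)
    count = 1  # the identity permutation always counts
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if stat(pooled) >= observed:
            count += 1
    return count / (n_perm + 1)
```

Replacing `stat` by a two-sample U-statistic gives the flavour of the tests analysed in the talk.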

 

 

 

Hyeyoung Maeng (Lancaster University)

Title:  Collective anomaly detection in high-dimensional VAR models


Abstract: There is an increasing interest in detecting collective anomalies: potentially short periods of time where the features of data change before reverting back to normal behaviour. We propose a new method for detecting a collective anomaly in VAR models. Our focus is on situations where the change in the VAR coefficient matrix at an anomaly is sparse (i.e. a small number of entries of the VAR coefficient matrix change). To tackle this problem, we propose a test statistic for a local segment which is built on the lasso estimator of the change in model parameters. This enables us to detect a sparse change more efficiently, and our lasso-based approach becomes especially advantageous when the anomalous interval is relatively short. We show that the new procedure controls Type I error and has asymptotic power tending to one. The practicality of our approach is demonstrated through simulations and two data examples, involving New York taxi trip data and EEG data.
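The benefit of thresholding under sparsity can be conveyed with a toy statistic — a sketch of the general idea, not the talk's test statistic; all names are ours:

```python
def soft_threshold(x, lam):
    # Soft-thresholding, the building block of the lasso.
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def sparse_change_stat(coef_before, coef_after, lam):
    """Toy test statistic for a sparse change in a coefficient matrix:
    soft-threshold each entrywise difference, then aggregate the survivors.
    Thresholding zeroes the many near-zero noise entries, so the statistic
    concentrates on the few genuinely changed coefficients."""
    stat = 0.0
    for row_b, row_a in zip(coef_before, coef_after):
        for b, a in zip(row_b, row_a):
            stat += soft_threshold(a - b, lam) ** 2
    return stat
```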

 

 

 

Florian Pein (University of Cambridge)

Title:  Robust change-point regression


Abstract: Piecewise constant regression is undeniably the most common form of change-point regression. However, the assumption of a piecewise constant signal is questionable, to varying degrees, in many applications. To mitigate those issues, we will assume a signal that can be decomposed into a piecewise constant function and a smooth function. A nonparametric estimate is obtained by combining the fused lasso with smoothing techniques. Tuning parameters are selected by cross-validation using L1-loss. We will explain in detail why L2-loss is a bad choice.

 

To this end, we will consider in the second part of the talk a piecewise constant signal. We will demonstrate that cross-validation using L2-loss not only tends to overestimate the number of change-points in some examples, but also underestimates it in others. Consequently, even L2-consistency cannot be guaranteed.

We will discuss these points theoretically and in simulated examples. We will then propose a modified cross-validation criterion for which consistent estimation of the change-points can be shown. Moreover, we will argue and verify by simulations that cross-validation using L1-loss can be a good alternative.
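The generic interleaved cross-validation scheme can be sketched as follows — fit on the odd-indexed points, score the held-out even-indexed ones under either loss. This is a stdlib illustration of the generic scheme (for a piecewise constant fit rather than the fused lasso), not the modified criterion of the talk; all names are ours.

```python
def best_fit(y, nseg):
    """Least-squares piecewise-constant fit with nseg segments, by dynamic
    programming over the last change location; returns the fitted values."""
    n = len(y)
    ps, ps2 = [0.0], [0.0]
    for v in y:
        ps.append(ps[-1] + v)
        ps2.append(ps2[-1] + v * v)

    def cost(s, e):
        # Sum of squared errors of y[s:e] around its mean.
        tot = ps[e] - ps[s]
        return (ps2[e] - ps2[s]) - tot * tot / (e - s)

    INF = float("inf")
    dp = [[INF] * (n + 1) for _ in range(nseg + 1)]
    back = [[0] * (n + 1) for _ in range(nseg + 1)]
    dp[0][0] = 0.0
    for j in range(1, nseg + 1):
        for e in range(j, n + 1):
            for s in range(j - 1, e):
                c = dp[j - 1][s] + cost(s, e)
                if c < dp[j][e]:
                    dp[j][e], back[j][e] = c, s
    fit, e = [0.0] * n, n
    for j in range(nseg, 0, -1):
        s = back[j][e]
        mean = (ps[e] - ps[s]) / (e - s)
        for i in range(s, e):
            fit[i] = mean
        e = s
    return fit

def cv_choose_k(y, kmax=4):
    """Interleaved cross-validation: fit on odd-indexed points, score the
    even-indexed ones, once with L1-loss and once with L2-loss.
    Returns (k chosen by L1, k chosen by L2)."""
    train, test = y[0::2], y[1::2]
    m = min(len(train), len(test))
    chosen = []
    for loss in ("L1", "L2"):
        scores = []
        for k in range(1, kmax + 1):
            fit = best_fit(train, k)
            if loss == "L1":
                scores.append(sum(abs(test[i] - fit[i]) for i in range(m)))
            else:
                scores.append(sum((test[i] - fit[i]) ** 2 for i in range(m)))
        chosen.append(1 + scores.index(min(scores)))
    return tuple(chosen)
```

On clean data both losses agree; the talk's examples are precisely the noisy ones where the L2 score misbehaves.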

 

 

 

Gaetano Romano (Lancaster University)

Title:  FOCuS: a CUSUM statistic for fast online changepoint detection


Abstract: Changepoint analysis has been of major interest in recent times, with an increasing number of applications demanding an online analysis of a data stream. As one enters the real-time domain, several challenges appear that render most of the current methods infeasible. We will present the FOCuS procedure, a fast online changepoint detection algorithm based on the Page-CUSUM sequential likelihood ratio test, and show how it is possible to solve the online changepoint detection problem sequentially through an efficient dynamic programming recursion. The FOCuS procedure outperforms current state-of-the-art algorithms both in terms of efficiency and statistical power.
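The Page-CUSUM recursion that FOCuS builds on can be sketched in a few lines — a simplified one-sided version for a known pre-change mean and a fixed drift of zero; FOCuS itself additionally optimises over the unknown post-change mean. Names are ours.

```python
def page_cusum(stream, mu0, threshold):
    """Page's one-sided CUSUM for an increase in mean above the pre-change
    level mu0:  S_t = max(0, S_{t-1} + (x_t - mu0)),  flagging a change the
    first time S_t crosses the threshold.  Returns the detection time, or
    None if the stream ends without a detection."""
    s = 0.0
    for t, x in enumerate(stream):
        s = max(0.0, s + (x - mu0))
        if s > threshold:
            return t
    return None
```

The statistic resets to zero while the stream behaves, so memory is O(1) per step; FOCuS keeps a small set of such statistics to cover all post-change means at once.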
We will discuss the use of FOCuS in collaboration with Italian astrophysicists for detecting gamma ray bursts from a grid of satellites in space. Adaptations include moving from the Gaussian setting to Poisson data representing underlying photon counts, and working with significant computational and memory limitations due to the cost of launching hardware into orbit.

Joint presentation with Kim Ward

 

 

 

Martin Tveten (University of Oslo)

Title:  Scalable changepoint and anomaly detection in cross-correlated data


Abstract: Motivated by a problem of detecting time-periods of suboptimal operation of a sensor-monitored subsea pump, this talk presents work on detecting anomalies or changes in the mean of a subset of variables in cross-correlated data. The approach is based on penalised likelihood methodology, but the maximum likelihood solution scales exponentially in the number of variables; that is, an approximation becomes necessary already for a moderate number of variables. We propose an approximation in terms of a binary quadratic program and derive a dynamic programming algorithm for computing its solution in linear time in the number of variables, given that the precision matrix is banded. Our simulations indicate that little power is lost by using the approximation in place of the exact maximum likelihood, and that our method performs well even if the sparsity structure of the precision matrix estimate is misspecified. Through the simulation study, we also aim to understand when, in terms of detection power and accuracy, it is worth the effort to incorporate correlations rather than assuming all variables to be independent.
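In the independent-variables baseline the penalised likelihood separates across coordinates, so the optimal subset is found by a per-variable threshold — a sketch of that baseline (names and normalisation ours), which is exactly what fails once cross-correlations enter and the binary-quadratic approximation is needed:

```python
def anomalous_subset_independent(segment_means, seg_len, penalty):
    """Penalised-likelihood subset selection when all variables are assumed
    independent with unit variance: variable j joins the anomalous subset
    iff its saving  seg_len * xbar_j**2  exceeds the per-variable penalty.
    Returns the selected subset and the total penalised saving."""
    subset, saving = [], 0.0
    for j, xbar in enumerate(segment_means):
        s = seg_len * xbar * xbar
        if s > penalty:
            subset.append(j)
            saving += s - penalty
    return subset, saving
```

With a non-diagonal precision matrix the savings of different variables interact, which is why the exact solution scales exponentially and a banded approximation is proposed instead.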

 

 

 

Nicolas Verzelen (INRAE)

Title: Optimal univariate change-point detection and localization


Abstract: Given a time series Y in ℝ^n, with a piecewise constant mean and independent components, the twin problems of change-point detection and change-point localization respectively amount to detecting the existence of times where the mean varies and estimating the positions of those change-points. In this work, we tightly characterize optimal rates for both problems and uncover the phase transition phenomenon from a global testing problem to a local estimation problem. Introducing a suitable definition of the energy of a change-point, we first establish in the single change-point setting that the optimal detection threshold is √(2 log log(n)). When the energy is just above the detection threshold, then the problem of localizing the change-point becomes purely parametric: it only depends on the difference in means and not on the position of the change-point anymore. Interestingly, for most change-point positions, it is possible to detect and localize them at a much smaller energy level. In the multiple change-point setting, we establish the energy detection threshold and show similarly that the optimal localization error of a specific change-point becomes purely parametric. Along the way, tight optimal rates for Hausdorff and ℓ1 estimation losses of the vector of all change-point positions are also established. Two procedures achieving these optimal rates are introduced. The first one is a least-squares estimator with a new multiscale penalty that favours well-spread change-points. The second one is a two-step multiscale post-processing procedure whose computational complexity can be as low as O(n log(n)). Notably, these two procedures accommodate the presence of possibly many low-energy and therefore undetectable change-points, and are still able to detect and localize high-energy change-points even in the presence of those nuisance parameters.
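In symbols, with one common normalisation (our notation; constants and definitions may differ from the talk's exact ones), the energy of a single change at time τ with jump δ under unit-variance noise, and the corresponding detection threshold, read:

```latex
E(\tau) \;=\; \sqrt{\frac{\tau\,(n-\tau)}{n}}\;\lvert\delta\rvert,
\qquad
\text{detection is possible (asymptotically) iff } \;
E(\tau) \;\gtrsim\; \sqrt{2\,\log\log n}.
```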

 

 

 

Kim Ward (Lancaster University)

Title:  FOCuS: a CUSUM statistic for fast online changepoint detection


Abstract: Changepoint analysis has been of major interest in recent times, with an increasing number of applications demanding an online analysis of a data stream. As one enters the real-time domain, several challenges appear that render most of the current methods infeasible. We will present the FOCuS procedure, a fast online changepoint detection algorithm based on the Page-CUSUM sequential likelihood ratio test, and show how it is possible to solve the online changepoint detection problem sequentially through an efficient dynamic programming recursion. The FOCuS procedure outperforms current state-of-the-art algorithms both in terms of efficiency and statistical power.
We will discuss the use of FOCuS in collaboration with Italian astrophysicists for detecting gamma ray bursts from a grid of satellites in space. Adaptations include moving from the Gaussian setting to Poisson data representing underlying photon counts, and working with significant computational and memory limitations due to the cost of launching hardware into orbit.

Joint presentation with Gaetano Romano

 

 

 

Ines Wilms (University of Maastricht)

Title: Tree-based Node Aggregation in Sparse Graphical Models

Abstract: High-dimensional graphical models are often estimated using regularization by relying on edge sparsity as a simplifying assumption. In this work, we aggregate the nodes of the graphical model to produce parsimonious graphical models that provide an even simpler description of the dependence structure than would otherwise be possible. We develop a convex regularizer, called the tree-aggregated graphical lasso or tag-lasso, that estimates graphical models that are both edge-sparse and node-aggregated. The aggregation is performed in a data-driven fashion by leveraging side information in the form of a tree that encodes node similarity and facilitates the interpretation of the resulting aggregated nodes. We illustrate our proposal's practical advantages in simulations and applications.

Authors: Ines Wilms and Jacob Bien

 

 

 

Yi Yu (University of Warwick)

Title:  Localising change points in general graphs


Abstract: In this talk, I will present some ongoing work on detecting and localising change points in general graphs. I will start with formalising the “change points” in general graphs and the rich history of this problem. I will then compare the results with those in chain graphs. The essence of our problem lies in the VC-dimension of general graphs. I will show the minimal signal-to-noise ratio condition needed to achieve consistent localisation and show the optimal localisation rates in some scenarios. I will end this talk with various open problems. This is joint work with Alessandro Rinaldo (CMU).

 

 

 

Yoav Zemel (University of Cambridge)

Title:  On estimation of log concave densities in Wasserstein distance


Abstract: The class of log concave densities provides a flexible nonparametric generalisation of many common parametric families.  Unlike many other nonparametric problems, estimation of log concave densities can be carried out without the need to select any tuning parameters.  The study of the minimax rate of estimation of a log concave density has mostly been focussed on total variation, Hellinger, or Kullback-Leibler loss.  In this work, we investigate the minimax rate when the loss function is the Wasserstein distance of optimal transport.  In the last two decades, Wasserstein distances have become popular among statisticians and machine learners, in part because of their ability to incorporate the geometry underlying the data.  In particular, they are qualitatively of a different nature than that of f-divergences, most notably in terms of invariance and equivariance properties.  We will give some upper and lower bounds with respect to the Wasserstein loss, and put them in the context of existing work.
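In one dimension the Wasserstein distance has an elementary form, which illustrates the geometry the abstract alludes to — for two empirical distributions with equally many atoms, optimal transport simply matches sorted samples in order (a standard fact, not specific to this work; names are ours):

```python
def wasserstein1_empirical(x, y):
    """W_1 distance between two empirical distributions with equally many
    atoms: in one dimension the optimal coupling for a convex cost matches
    order statistics, so W_1 is the mean absolute difference of the sorted
    samples."""
    assert len(x) == len(y), "equal-size samples assumed for this sketch"
    return sum(abs(a - b) for a, b in zip(sorted(x), sorted(y))) / len(x)
```

Note the equivariance under translation: shifting one sample by c changes the distance by at most |c|, in contrast with f-divergences, which ignore the metric structure entirely.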

 

This talk is based on work in progress with Jonathan Weed.

 

 

 

Chao Zheng (University of Southampton)

Title: Nonparametric changepoint estimation for high-dimensional data


Abstract: In this talk, we will introduce methods for estimating single and multiple distributional changepoints in high-dimensional data. We start with a projection-pursuit-type moving-window test statistic to investigate the existence of a single changepoint. To perform the hypothesis test, we derive a bootstrap approximation to calculate critical values in practice and prove its theoretical validity. Next, to extend the method to detect multiple changepoints, we combine the moving-window test statistic with the binary segmentation algorithm. The consistency properties of the algorithm are carefully studied and guidance on how to select the tuning parameters is provided.
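The moving-window idea can be sketched with the simplest stand-in statistic — a coordinate-wise sup-norm mean comparison rather than the projection-based statistic of the talk, and without the bootstrap calibration; all names are ours:

```python
def moving_window_scan(rows, h):
    """Moving-window scan for a single change: at each time t, compare the
    coordinate-wise means of the h observations before and after t, and
    track the largest sup-norm difference.  Returns (argmax t, max stat)."""
    n, d = len(rows), len(rows[0])
    best_t, best_stat = None, -1.0
    for t in range(h, n - h + 1):
        stat = 0.0
        for j in range(d):
            left = sum(rows[i][j] for i in range(t - h, t)) / h
            right = sum(rows[i][j] for i in range(t, t + h)) / h
            stat = max(stat, abs(right - left))
        if stat > best_stat:
            best_stat, best_t = stat, t
    return best_t, best_stat
```

Binary segmentation would then recurse on the data to either side of the detected time once the statistic exceeds a (bootstrap-calibrated) critical value.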
