We address the problem of detecting changes in multivariate datastreams, and we investigate the intrinsic difficulty that change-detection methods face as the data dimension grows. In particular, we consider the general approach that detects changes by comparing the distribution of the log-likelihood of the datastream over different time windows. Although this approach underlies several change-detection methods, its effectiveness as the data dimension grows has never been investigated; doing so is the goal of our paper. We show that the magnitude of a change can be naturally measured by the symmetric Kullback-Leibler divergence between the pre- and post-change distributions, and that the detectability of a change of a given magnitude worsens as the data dimension increases. This structural problem, which we refer to as detectability loss, is due to the linear relationship between the variance of the log-likelihood and the data dimension, and it proves harmful even at low data dimensions (say, 10). We analytically derive the detectability loss on Gaussian-distributed datastreams, and we empirically demonstrate that the problem also arises on real-world datasets.
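The effect described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the experimental setup of the paper: the pre-change distribution is taken to be a standard Gaussian N(0, I_d), the change is a mean shift v, and the symmetric Kullback-Leibler divergence is taken as the sum of the two directed divergences, which for this pair equals ||v||^2. The function names `gaussian_loglik` and `detectability_snr`, the sample size, and the set of dimensions are all hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_loglik(x):
    """Log-likelihood of each row of x under a standard Gaussian N(0, I_d)."""
    d = x.shape[1]
    return -0.5 * d * np.log(2 * np.pi) - 0.5 * np.sum(x ** 2, axis=1)

def detectability_snr(d, skl=1.0, n=40000, seed=0):
    """Estimate the log-likelihood variance and a signal-to-noise ratio.

    Assumption: the change is a mean shift v with ||v||^2 = skl, so the
    change magnitude (symmetric KL) is held fixed while d varies.
    """
    rng = np.random.default_rng(seed)
    v = np.zeros(d)
    v[0] = np.sqrt(skl)  # put the whole shift on one coordinate
    pre = gaussian_loglik(rng.standard_normal((n, d)))       # pre-change samples
    post = gaussian_loglik(rng.standard_normal((n, d)) + v)  # post-change samples
    shift = pre.mean() - post.mean()  # drop in mean log-likelihood after the change
    return pre.var(), shift / pre.std()

# Same change magnitude, increasing data dimension.
results = {d: detectability_snr(d) for d in (1, 4, 16, 64)}
for d, (var, snr) in results.items():
    print(f"d={d:3d}  var(log-lik)={var:7.3f}  (d/2={d / 2:5.1f})  snr={snr:.3f}")
```

In this setting the variance of the log-likelihood is d/2, since the squared norm of a standard Gaussian vector is chi-squared with d degrees of freedom, while the mean shift of the log-likelihood depends only on ||v||^2. The ratio of shift to standard deviation therefore shrinks roughly as 1/sqrt(d), which is one simple way to see why a change of fixed magnitude becomes harder to detect as the dimension grows.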
Conference proceedings
25th International Joint Conference on Artificial Intelligence (IJCAI-16)