IEEE Information Theory Society

Learning Algorithm Generalization Error Bounds via Auxiliary Distributions

Submitted by admin on Mon, 06/10/2024 - 05:00

Generalization error bounds are essential for comprehending how well machine learning models work. In this work, we suggest a novel method, i.e., the Auxiliary Distribution Method, that leads to new upper bounds on expected generalization errors that are appropriate for supervised learning scenarios.

Neural Distributed Compressor Discovers Binning

Submitted by admin on Mon, 06/10/2024 - 05:00

We consider lossy compression of an information source when the decoder has lossless access to a correlated one. This setup, also known as the Wyner-Ziv problem, is a special case of distributed source coding. To this day, practical approaches for the Wyner-Ziv problem have neither been fully developed nor heavily investigated. We propose a data-driven method based on machine learning that leverages the universal function approximation capability of artificial neural networks.

Differentially Private Stochastic Linear Bandits: (Almost) for Free

Submitted by admin on Mon, 06/10/2024 - 05:00

In this paper, we propose differentially private algorithms for the problem of stochastic linear bandits in the central, local and shuffled models. In the central model, we achieve almost the same regret as the optimal non-private algorithms, which means we get privacy for free. In particular, we achieve a regret of $\tilde {O}\left({\sqrt {T}+{}\frac {1}{\varepsilon }}\right)$ matching the known lower bound for private linear bandits, while the best previously known algorithm achieves $\tilde {O}\left({{}\frac {1}{\varepsilon }\sqrt {T}}\right)$ .

Training Generative Models From Privatized Data via Entropic Optimal Transport

Submitted by admin on Mon, 06/10/2024 - 05:00

Local differential privacy is a powerful method for privacy-preserving data collection. In this paper, we develop a framework for training Generative Adversarial Networks (GANs) on differentially privatized data. We show that entropic regularization of optimal transport – a popular regularization method in the literature that has often been leveraged for its computational benefits – enables the generator to learn the raw (unprivatized) data distribution even though it only has access to privatized samples.

Improving Group Testing via Gradient Descent

Submitted by admin on Mon, 06/10/2024 - 05:00

We study the problem of group testing with non-identical, independent priors. So far, the pooling strategies that have been proposed in the literature take the following approach: a hand-crafted test design along with a decoding strategy is proposed, and guarantees are provided on how many tests are sufficient in order to identify all infections in a population. In this paper, we take a different, yet perhaps more practical, approach: we fix the decoder and the number of tests, and we ask, given these, what is the best test design one could use?

Optimal Binary Differential Privacy via Graphs

Submitted by admin on Mon, 06/10/2024 - 05:00

We present the notion of reasonable utility for binary mechanisms, which applies to all utility functions in the literature. This notion induces a partial ordering on the performance of all binary differentially private (DP) mechanisms. DP mechanisms that are maximal elements of this ordering are optimal DP mechanisms for every reasonable utility. By looking at differential privacy as a randomized graph coloring, we characterize these optimal DP in terms of their behavior on a certain subset of the boundary datasets we call a boundary hitting set.

Total Variation Meets Differential Privacy

Submitted by admin on Mon, 06/10/2024 - 05:00

The framework of approximate differential privacy is considered, and augmented by leveraging the notion of “the total variation of a (privacy-preserving) mechanism” (denoted by $\eta $ -TV). With this refinement, an exact composition result is derived, and shown to be significantly tighter than the optimal bounds for differential privacy (which do not consider the total variation). Furthermore, it is shown that $(\varepsilon ,\delta )$ -DP with $\eta $ -TV is closed under subsampling. The induced total variation of commonly used mechanisms are computed.

Iterative Sketching for Secure Coded Regression

Submitted by admin on Mon, 06/10/2024 - 05:00

Linear regression is a fundamental and primitive problem in supervised machine learning, with applications ranging from epidemiology to finance. In this work, we propose methods for speeding up distributed linear regression. We do so by leveraging randomized techniques, while also ensuring security and straggler resiliency in asynchronous distributed computing systems. Specifically, we randomly rotate the basis of the system of equations and then subsample blocks, to simultaneously secure the information and reduce the dimension of the regression problem.

The Worst-Case Data-Generating Probability Measure in Statistical Learning

Submitted by admin on Mon, 06/10/2024 - 05:00

The worst-case data-generating (WCDG) probability measure is introduced as a tool for characterizing the generalization capabilities of machine learning algorithms.

On the Computation of the Gaussian Rate–Distortion–Perception Function

Submitted by admin on Mon, 06/10/2024 - 05:00

In this paper, we study the computation of the rate-distortion-perception function (RDPF) for a multivariate Gaussian source assuming jointly Gaussian reconstruction under mean squared error (MSE) distortion and, respectively, Kullback–Leibler divergence, geometric Jensen-Shannon divergence, squared Hellinger distance, and squared Wasserstein-2 distance perception metrics.

Subscribe to Our Mailing List