Abstract
The increasing reliance on machine learning models has prompted growing concern about the privacy of the sensitive information used to train them. Differential privacy has consequently become a widely adopted paradigm for achieving strong, formal privacy guarantees without sacrificing model utility. This study surveys and summarizes the principal approaches to differential privacy in machine learning. The first class of approaches adds calibrated noise at various points in the machine learning pipeline: Laplace or Gaussian noise is deliberately injected into training data, model parameters, or predictions so that individual records cannot be inferred from the output. Input perturbation techniques such as randomized response give each individual plausible deniability about their own data while still allowing accurate aggregate statistics to be recovered. Privacy-preserving aggregation techniques such as Secure Multi-Party Computation (SMPC) enable collaborative model training without exposing any party's raw data. Differentially Private Stochastic Gradient Descent (DP-SGD) provides guarantees during the optimization stage by clipping per-example gradients and adding noise to them during training. Combining differential privacy with federated learning enables decentralized model training across devices while keeping sensitive data local. Advanced cryptographic approaches such as homomorphic encryption and secure aggregation protocols add a further layer of protection by allowing computation on encrypted data or the safe aggregation of model updates. Taken together, these methods form a comprehensive framework for differential privacy in machine learning that balances the need to protect individual privacy against the drive to build accurate models.
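
To make the first class of approaches concrete, here is a minimal Python sketch of the Laplace mechanism for a numeric query. The function name, the counting-query example, and the choice of ε are illustrative assumptions, not details taken from the study itself.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release true_value with Laplace noise calibrated to satisfy
    epsilon-differential privacy for a query with the given L1 sensitivity."""
    scale = sensitivity / epsilon  # Laplace scale b = sensitivity / epsilon
    return true_value + np.random.laplace(loc=0.0, scale=scale)

# Illustrative use: a counting query has L1 sensitivity 1, so releasing
# a noisy count with epsilon = 0.5 adds Laplace(0, 2) noise.
noisy_count = laplace_mechanism(true_value=42.0, sensitivity=1.0, epsilon=0.5)
print(noisy_count)
```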
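
The randomized response mechanism mentioned above can likewise be sketched in a few lines. This is the classic two-coin variant, which satisfies ln(3)-differential privacy with fair coins; the sample size and the debiasing step at the end are illustrative choices.

```python
import random

def randomized_response(true_answer: bool) -> bool:
    """Classic two-coin randomized response: answer truthfully on the first
    heads, otherwise answer according to a second fair coin flip."""
    if random.random() < 0.5:       # first coin: heads -> tell the truth
        return true_answer
    return random.random() < 0.5    # tails -> answer uniformly at random

# An analyst can still recover the aggregate rate: since
# P(yes) = 0.5 * true_rate + 0.25, the true rate is about 2 * observed - 0.5.
answers = [randomized_response(True) for _ in range(10_000)]
observed = sum(answers) / len(answers)
estimated_true_rate = 2 * observed - 0.5
print(estimated_true_rate)  # close to 1.0, yet any single answer is deniable
```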
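
DP-SGD itself reduces to two operations per update: clip each per-example gradient to a fixed norm, then add Gaussian noise scaled to that clipping bound. The following NumPy sketch shows a single update step; the learning rate, clipping norm, and noise multiplier are placeholder hyperparameters, not values from the study.

```python
import numpy as np

def dp_sgd_step(params: np.ndarray,
                per_example_grads: np.ndarray,
                lr: float = 0.1,
                clip_norm: float = 1.0,
                noise_multiplier: float = 1.1) -> np.ndarray:
    """One DP-SGD update: clip each example's gradient to clip_norm,
    average the clipped gradients, then add Gaussian noise whose scale
    is tied to the clipping bound (the source of the privacy guarantee)."""
    batch_size = per_example_grads.shape[0]
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    avg_grad = clipped.mean(axis=0)
    noise = np.random.normal(0.0, noise_multiplier * clip_norm / batch_size,
                             size=avg_grad.shape)
    return params - lr * (avg_grad + noise)

# Illustrative call with random data standing in for real per-example gradients.
params = np.zeros(4)
grads = np.random.randn(32, 4)   # 32 per-example gradients of dimension 4
params = dp_sgd_step(params, grads)
```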
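
Finally, the secure aggregation idea behind SMPC-style protocols can be illustrated with pairwise additive masking: each pair of clients shares a random mask that one adds and the other subtracts, so every mask cancels when the server sums the masked updates. The integer seed derived from the client IDs below stands in for a real pairwise key agreement (e.g. Diffie-Hellman) and is purely an assumption of this sketch.

```python
import random

def pairwise_mask(i: int, j: int, dim: int) -> list[float]:
    """Mask shared by clients i and j. The seed is derived from the client
    IDs only for illustration; a real protocol would derive it from a
    pairwise key agreement so outsiders cannot reconstruct it."""
    rng = random.Random(min(i, j) * 1_000_003 + max(i, j))
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def masked_update(client_id: int, all_ids: list[int], update: list[float]) -> list[float]:
    """Add (+) or subtract (-) each pairwise mask so the masks cancel
    across the cohort when the server sums the masked vectors."""
    out = list(update)
    for other in all_ids:
        if other == client_id:
            continue
        mask = pairwise_mask(client_id, other, len(update))
        sign = 1.0 if client_id < other else -1.0
        out = [v + sign * m for v, m in zip(out, mask)]
    return out

# The server sees only masked vectors, yet their sum equals the raw sum.
ids = [1, 2, 3]
updates = {1: [0.1, 0.2], 2: [0.3, -0.1], 3: [-0.2, 0.4]}
masked = [masked_update(i, ids, updates[i]) for i in ids]
total = [round(sum(vals), 6) for vals in zip(*masked)]
print(total)  # [0.2, 0.5] -- the elementwise sum of the unmasked updates
```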