Part 5: Machine Learning Methods to Process Datasets With QI Values

Differential Privacy (DP): This mathematical framework makes it possible to control how much the model ‘remembers’ and ‘forgets’ about potentially sensitive data, which is its big advantage. The most popular DP technique is ‘noisy counting’, which draws samples from the Laplace distribution and uses them to make ... Read More
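As a rough illustration of noisy counting, here is a minimal sketch in Python. It assumes a simple counting query, whose answer changes by at most 1 when one record is added or removed, so the Laplace noise uses a scale of 1/epsilon. The names `laplace_noise`, `noisy_count`, and `epsilon`, as well as the sample ages, are hypothetical and are not taken from the original post.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution centred at zero.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng=None):
    """Return the number of matching records plus Laplace noise.

    Assumption: a counting query has sensitivity 1, so the noise scale
    is 1 / epsilon. Smaller epsilon means more noise and more privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: how many people are 40 or older?
ages = [23, 35, 41, 29, 52, 61]
answer = noisy_count(ages, lambda age: age >= 40, epsilon=0.5)
print(f"noisy count: {answer:.2f}")
```

Each call returns a different value. Averaged over many calls the answers centre on the true count, while any single answer reveals little about whether a particular person is in the data.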

Part 4: Standard Ways to Process Datasets with QI Values

K-anonymity: This approach is quite different from the one that I described earlier. With k-anonymity, we’re not aiming to ‘hide’ any data, but rather to softly ‘mask’ the QI values. The most popular techniques used in k-anonymity are purging and generalization. Purging simply replaces QI values with random strings like ... Read More
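To make generalization more concrete, here is a small sketch in Python. It assumes two quasi-identifiers, age and ZIP code, and coarsens them until every combination appears at least k times. The helper names, the bucket width, and the sample records are all hypothetical and are not taken from the original post.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code, keep=3):
    """Keep the first `keep` characters and mask the rest with '*'."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def is_k_anonymous(rows, k):
    """True if every combination of QI values occurs at least k times."""
    counts = Counter(rows)
    return all(count >= k for count in counts.values())


# Hypothetical records: (age, ZIP code)
records = [
    (34, "10001"), (36, "10002"), (31, "10003"),
    (47, "10453"), (42, "10458"), (45, "10451"),
]

generalized = [(generalize_age(a), generalize_zip(z)) for a, z in records]

print(is_k_anonymous(records, 2))      # raw rows are all unique
print(is_k_anonymous(generalized, 3))  # two groups of three after generalization
```

The values are still present in a coarser form, which matches the idea of masking rather than hiding: each person now blends into a group of at least k rows.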

Part 3: Machine Learning Ways to De-Identify Personal Data (Homomorphic Encryption)

Homomorphic Encryption: The main idea behind homomorphic encryption is that the inferences we make based on computations on encrypted data should be as accurate as if we had used decrypted data. Homomorphic encryption is an evolving field, and at this point in time it has certain limitations. For example, only polynomial ... Read More

Part 2: Standard Ways to De-Identify Personal Data

Usually, maintainers of a database try to eliminate all channels that could help an attacker leverage queries to gain personal or sensitive information about a specific person. Here are a few examples: Pseudonymization: This method of processing personal data is based on replacing the values that contain personal information with ... Read More

Part 1: Introduction and Resources of the Data Breach

Terms like ‘sensitive data’ and ‘personal data’ have been in the air ever since GDPR, CCPA, and similar privacy acts were introduced to companies across the globe. One challenge with them is that the complexity of the federal laws and the quite complicated terminology used to identify ... Read More

Secure Guardrails