
How Data Breaches Empower Malicious AI: The AT&T Case Study
Recent data breaches continue to underscore the vulnerabilities inherent in our increasingly interconnected world. One of the most alarming incidents involves AT&T, where call and text metadata for roughly 100 million subscribers, essentially the company’s entire customer base, was exfiltrated.
What is Metadata?
Metadata is often described as “data about data”: information that summarizes or describes other data, making it easier to find, use, and manage.
The stolen records, covering periods in 2022 and 2023, included detailed information such as the numbers each customer called and texted, the frequency and duration of those interactions, and even cell tower locations.
By analyzing the contacted numbers and communication patterns, one can map out the social and professional networks of the individuals. Frequency and duration of calls and texts can reveal habits, routines, and significant relationships, while cell tower data can trace the physical movements of individuals, showing where they live, work, and travel.
Put simply, in many cases you don’t need to see the actual communications to infer what was likely discussed and what kind of relationship exists between two numbers, provided at least one of them can be tied to a specific individual or business.
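To make that concrete, here is a minimal Python sketch that aggregates a handful of invented call records into a rough relationship map. The phone numbers, durations, tower IDs, and field layout are all hypothetical, made up for illustration, and are not the actual AT&T schema.

```python
from collections import defaultdict

# Hypothetical call-detail records: (caller, callee, duration_seconds, tower_id).
# The field layout and values are invented for illustration only.
records = [
    ("555-0101", "555-0202", 420, "TWR-17"),
    ("555-0101", "555-0202", 610, "TWR-17"),
    ("555-0101", "555-0303", 35,  "TWR-09"),
    ("555-0202", "555-0404", 120, "TWR-31"),
    ("555-0101", "555-0202", 890, "TWR-17"),
]

# Aggregate per pair of numbers: how many calls and how much total talk time.
edges = defaultdict(lambda: {"calls": 0, "seconds": 0})
towers = defaultdict(set)

for caller, callee, duration, tower in records:
    pair = tuple(sorted((caller, callee)))
    edges[pair]["calls"] += 1
    edges[pair]["seconds"] += duration
    towers[caller].add(tower)  # rough location footprint per number

# Rank relationships by total time the two numbers spend talking.
ranked = sorted(edges.items(), key=lambda kv: kv[1]["seconds"], reverse=True)

for (a, b), stats in ranked:
    print(f"{a} <-> {b}: {stats['calls']} calls, {stats['seconds']}s total")

print("Towers seen per number:", dict(towers))
```

Even this toy aggregation surfaces who talks to whom, how often and for how long, and roughly where each number spends its time, all without a single word of message content.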
The Data Breach Impact
The exfiltration of AT&T’s subscriber metadata is not just an isolated incident; it hands cybercriminals a treasure trove of information.
When metadata from this breach is combined with data from other breaches, it can be used to build even more comprehensive profiles of individuals, increasing the risk of identity theft, fraud, and more sophisticated cyber attacks.
Training Malicious AI Models
With a rich data set, malicious AI models can be trained more effectively and efficiently. In the context of supervised learning, the detailed records from AT&T can help AI models learn specific patterns of communication and movement.
By analyzing communication patterns, AI can craft highly personalized phishing messages that are more likely to succeed, especially if you can identify the parties involved and the nature of the relationship.
Unsupervised models can detect unusual patterns and anomalies within the data. For instance, they can flag infrequent contacts that, depending on who the contact is, may point to significant life events or stressors. Those same individuals can become prime targets for fraud, opening the door to more targeted attacks.
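As a minimal sketch of what that kind of unsupervised flagging can look like, the snippet below runs scikit-learn’s IsolationForest over a few invented per-contact summaries (calls per month and average call duration). The contact labels, feature values, and contamination setting are all hypothetical, chosen purely for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical per-contact summaries derived from call metadata:
# [calls_per_month, average_duration_seconds]. Values are invented.
contacts = ["partner", "coworker", "parent", "clinic", "law office"]
features = np.array([
    [120, 300],   # frequent, medium-length calls
    [80,  180],   # frequent, shorter calls
    [30,  600],   # weekly, long calls
    [2,   900],   # rare, very long calls
    [1,  1200],   # a single, very long call
])

# Unsupervised outlier detection; fit_predict returns -1 for anomalies.
detector = IsolationForest(contamination=0.4, random_state=0)
labels = detector.fit_predict(features)

for name, label in zip(contacts, labels):
    print(f"{name:10s} -> {'ANOMALY' if label == -1 else 'normal'}")
```

On a toy data set the result is trivial, but the same approach applied across millions of real records is what makes rare, out-of-pattern contacts easy to surface, for defenders and attackers alike.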
Additionally, iterative training opens the door to further adaptive attacks, where malicious AI adapts in real time to evade detection and increase the effectiveness of its attacks. By using reinforcement learning to continuously improve its strategies based on the success or failure of past attempts, an AI can learn on the fly to outsmart its target.
Deepfake Attacks
When combined with other breached data, the capabilities of malicious AI expand even further. With access to detailed communication patterns and potentially voice samples, AI can create highly convincing deepfakes. These can be used for social engineering attacks, such as impersonating a trusted contact to obtain sensitive information or bypassing voice verification to reach a target’s banking accounts.
AI can already generate synthetic identities that mimic real people, making it extremely difficult to differentiate between legitimate and fraudulent activities. Imagine what maliciously trained AI can do with information from a target, sourced from multiple breaches.
Case Study: AT&T Breach
The AT&T breach is shocking in its scope and magnitude. The exfiltrated data, covering pretty much the entire subscriber base (100 of 127 million subscribers), spans a solid six-month period from May to October of 2022, plus additional days in early 2023.
What’s worse is that AT&T did not even realize the breach had occurred until April of 2024. Given the length of time between the incident itself and its discovery, there is no telling who has obtained copies of the data since then.
Regardless, AT&T seems to have determined that it was worth trying to have the data removed, and paid the equivalent of a $370k ransom to do so.
Ransomware attacks often involve multiple layers and rounds of extortion, so this may not be the last place this data set pops up.
Conclusion: AI Won’t Forget
Once a model has been trained on a data set, deleting that data set, as AT&T has paid to have done, does not erase the knowledge derived from it. Training adjusts the model’s parameters based on the data it has seen, so even after the original data is deleted, the insights and patterns the model learned remain embedded in its parameters.
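A tiny, self-contained sketch makes the point: the data and labels below are made-up stand-ins for any breached data set, and deleting them after training changes nothing about what the fitted model can do.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up training data standing in for any breached data set.
X = np.array([[1.0, 0.2], [0.9, 0.1], [0.1, 0.9], [0.2, 1.0]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X, y)

# "Delete" the data, which is all a ransom payment can ever accomplish.
del X, y

# The learned parameters, and everything inferred from the data, remain,
# and the model keeps making predictions from them.
print(model.coef_, model.intercept_)
print(model.predict(np.array([[0.95, 0.15], [0.15, 0.95]])))
```

The same holds, at vastly larger scale, for any model that was trained on the stolen records before they were removed.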
Malicious AIs won’t be forgetting about these 100 Million subscribers any time soon.
*** This is a Security Bloggers Network syndicated blog from Berry Networks authored by David Michael Berry. Read the original post at: https://berry-networks.com/2024/07/16/how-data-breaches-empower-malicious-ai-the-att-case-study/