Capital One Breach: A Crime Board & A Case of Speculative Sleuthing
Capital One is not only one of the most well-respected financial institutions in the world for its business success; it has also been a leader in driving software modernization in financial services.
Circa 2015, Capital One unveiled its cloud strategy on the main stage of AWS re:Invent, showing plans for a massive consolidation of data center real estate. Its strategy centered on operating more like a technology company, weaving in next-generation application development with agile and DevOps. As an early customer, Capital One helped shape AWS security, and it cited security as a key reason for its public cloud strategy: a cloud-native company could update its security technology far faster than a traditional, data-center-centric financial services company. Such practices can be advanced, secure and agile, but they are not infallible.
Hence, some have been quick to criticize their cloud strategy as moving too fast and others can hardly conceal their schadenfreude. That’s not what this article is about.
Capital One should be praised for their bold leadership. What (we think) happened to them could happen to any of us, as the complexity of software development grows while the state of application security largely stagnates. Hence the importance of diagnosing what happened and how we can all get better.
Capital One has acknowledged that an attacker accessed 140,000 Social Security numbers and 80,000 bank account numbers. No credit card account numbers or log-in credentials are believed to have been compromised. The bank said the bulk of the compromised data consisted of information supplied by consumers and small businesses who applied for credit cards between 2005 and early 2019.
Capital One’s stock tumbled nearly 6% Tuesday last week, the largest single-day decline since January 2019.
The protagonist in this unfolding incident is Paige A. Thompson, who uses the online handle “erratic” and was charged with a single count of computer fraud and abuse in U.S. District Court in Seattle. Thompson is a former employee of Amazon Web Services (AWS), Amazon’s cloud computing division, where she was a systems engineer between 2015 and 2016, about three years before the breach took place.
Upon sifting through the indictment, one particular section (Investigation, # 13) becomes our point of interest in this exercise. This section unravels a series of events that eventually led to exfiltration of sensitive data.
Capital One’s engineering, operations and security teams have been leaders in cloud computing, and what happened to them could have happened to anyone's infrastructure. Why?
Too Many Buttons and Controls!
Configuration problems are not only prevalent, but also severely impair the security of today’s system and infrastructure software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters (“knobs”). With tens or even hundreds of knobs, configuring application software and software-defined infrastructure (AWS/Azure/GCP) to ensure high reliability and performance becomes a daunting, error-prone task.
“Too many controls” also prevents users from understanding every parameter thoroughly and examining its settings carefully. Without depth of knowledge, many users rely on default values for parameters that truly need to be set based on the runtime environment.
- Developer A : Since we are not sure what is a good choice, how about making it configurable?
- Developer B : We should add a configuration option for it. Even if it’s unlikely to change, if someone does want to change it they’ll thank us that they don’t have to change the code/recompile to do so
For example, the MySQL 5.6 database server has 461 configuration parameters; similarly, the Apache HTTP server 2.4 has more than 550 parameters across all its modules. Moreover, many of these parameters have dependencies and correlations, which further worsens the situation. Such a high level of complexity makes system configuration a daunting, error-prone task.
One could argue that the core problem is a lack of in-depth knowledge. But in a competitive market, slowing down innovation to build that knowledge is the kiss of death. The true problem is that complexity has pushed security beyond human scale. This problem isn’t going away, and it's likely to get worse before it gets better.
The consequence of poorly understood complexity is more frequent misconfiguration of policies and resources, which can permit an unauthorized user to elevate his/her privileges and exfiltrate sensitive data.
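To make the "defaults problem" concrete, here is a minimal sketch of an audit that flags security-sensitive knobs nobody has consciously set. The parameter names, default values and the list of sensitive knobs are all made up for illustration; they do not describe MySQL, Apache or any other real product.

```python
from typing import Dict, List

# Hypothetical defaults for an imaginary server (illustration only).
DEFAULTS: Dict[str, object] = {
    "bind_address": "0.0.0.0",
    "require_auth": False,
    "tls_enabled": False,
    "max_connections": 151,
}

# Assumed list of knobs that should be chosen deliberately per environment.
SECURITY_SENSITIVE = {"bind_address", "require_auth", "tls_enabled"}


def unreviewed_defaults(config: Dict[str, object]) -> List[str]:
    """Return security-sensitive knobs that are unset or still at their default."""
    flagged = []
    for name in sorted(SECURITY_SENSITIVE):
        if config.get(name, DEFAULTS[name]) == DEFAULTS[name]:
            flagged.append(name)
    return flagged


# A deployment that tuned performance and networking but never touched auth or TLS.
deployed = {"bind_address": "10.0.0.5", "max_connections": 500}
print(unreviewed_defaults(deployed))  # ['require_auth', 'tls_enabled']
```

With hundreds of real knobs, a check like this only helps if someone maintains the list of sensitive parameters, which is exactly the human-scale problem described above.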
Threading the Needle — Evidence Markers From Indictment
1. Slack chat log excerpt
2. Gaining access to an EC2 instance with a misconfigured firewall (and an attached IAM role: *****-WAF-WebRole)
3. Querying the EC2 instance metadata service (http://169.254.169.254/latest/meta-data/iam/security-credentials/) to acquire the role name (refer to the Slack chat log where <erratic> calls out *****-WAF-WebRole)
4. Using the role name to (speculatively) extract the rotating temporary credentials from the metadata service (http://169.254.169.254/latest/meta-data/iam/security-credentials/*****-WAF-WebRole). These credentials give the instance access to other AWS services (S3, EBS, etc.) based on the permission policy defined in the instance’s IAM role.
5. Using these role credentials, the attacker accessed the S3 service and executed the list-buckets command, as allowed by the role's permission policy
6. Using the AWS CLI S3 “sync” command several times to recursively copy data from the buckets accessible to “*****-WAF-WebRole” to the attacker's local instance.
A Speculative Recount of Events
First, let me be clear that I have no insider knowledge.
This is my best guess at what occurred, based on publicly available information.
Step 1 : Discovery
The attacker used TOR (an anonymity network) and the iPredator VPN in combination to conceal her identity. With her identity concealed, she discovered a misconfigured EC2 instance in Capital One’s cloud infrastructure, which is hosted by AWS (Amazon Web Services).
Details of how she accessed one of their misconfigured EC2 instances are scarce. All we know is that it was through a “misconfigured firewall.”
Launched by programmer John Matherly in 2009, Shodan is a search engine that enables users to scour the web for webcams, routers, exposed cloud resources and other connectable smart products.
A simple search conducted by researcher Giovanni Collazo yielded 2,284 etcd servers open to the web in that their authentication mechanism was disabled by default. That meant each server’s stored credentials were publicly viewable.
The misconfigured EC2 instance that the attacker used could, speculatively, have been a misconfigured WAF (Web Application Firewall) discovered and indexed on Shodan (perhaps during a maintenance window).
What was the misconfigured instance hosting?
The indictment does not make clear whether the instance in question was hosting a WAF (ModSecurity based) or a vulnerable application with integration via webhooks, etc.
Given that the IAM role name contains “WAF”, it is speculated that the exploit targeted a WAF (ModSecurity based) module.
Refer to common SSRF based bypass patterns by Swissky https://github.com/swisskyrepo/PayloadsAllTheThings/tree/master/Server%20Side%20Request%20Forgery
Step 2: Extracting credentials using SSRF by pivoting from exposed host to Cloud Management Plane
A modern event-based technology stack uses several third-party SaaS services such as SendGrid (for email), Stripe (for payments), Slack (for triage notifications) and Segment (for analytics and insights). All of these services provide a feature called webhooks that enables them to call our services back (at a specified URI endpoint) after the event workflow is fulfilled.
So, what exactly is a webhook?
A webhook (also called a web callback, reverse API or HTTP push API) is a way for an application to provide other applications with real-time information. A webhook delivers data to other applications as it happens, meaning you get data immediately, unlike typical APIs, where you would need to poll for data very frequently in order to get it in real time. This makes webhooks much more efficient for both provider and consumer.
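The push model can be sketched in a few lines. This is a toy, in-process stand-in for a webhook provider, not the API of SendGrid, Stripe, Slack or Segment; the class name, method names, event type and URL are all hypothetical, and the HTTP POST is replaced by an injected function so the sketch runs without a network.

```python
from typing import Callable, Dict, List

# A sender takes (url, payload). A real provider would issue an HTTP POST here;
# in this sketch it is injected so nothing touches the network.
Sender = Callable[[str, dict], None]


class EventProvider:
    """Toy stand-in for a SaaS provider that supports webhooks (hypothetical)."""

    def __init__(self, send: Sender) -> None:
        self._send = send
        self._callbacks: Dict[str, List[str]] = {}

    def register_webhook(self, event_type: str, callback_url: str) -> None:
        # The callback URL is supplied by the consumer, so it is untrusted input.
        self._callbacks.setdefault(event_type, []).append(callback_url)

    def emit(self, event_type: str, payload: dict) -> int:
        """Push the event to every registered URL; return the number of deliveries."""
        urls = self._callbacks.get(event_type, [])
        for url in urls:
            self._send(url, payload)
        return len(urls)


delivered = []
provider = EventProvider(send=lambda url, payload: delivered.append((url, payload)))
provider.register_webhook("payment.succeeded", "https://shop.example.com/hooks/payment")
provider.emit("payment.succeeded", {"order_id": 42})
print(delivered)
```

Note what the provider does in `emit`: it makes an outbound request to a URL that someone else chose. That property is what makes webhook-style features interesting from an SSRF point of view.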
What is SSRF?
SSRF (Server-Side Request Forgery) is a type of vulnerability that allows an attacker to force an application to issue requests on behalf of the attacker, to unintended resources.
SSRF is an example of the “confused deputy” problem, in which the deputy is the business logic of the web application itself.
Conceptually, the attack “encapsulates” a malicious request in an original request by manipulating the vulnerable feature or business logic.
These features will probably have some of the following characteristics:
- Linking to or embedding external resources
- Checking the status of a set of resources (monitoring and management functions)
- Forwarding requests to other services that use a webhook feature to call back the original service (email, notification/tracking, analytics)
How can an attacker leverage a webhook to initiate SSRF?
An SSRF vulnerability (CWE-918) can enable attackers to do the following:
- Perform port scans to enumerate other connected resources
- Scan internal services (corporate network, cloud services, metadata service, etc.) behind the exposed service
- Gain an additional layer of anonymity, as all requests are proxied through the vulnerable service
- Reach protocols other than HTTP (using URI schemes such as “ssh://”, “ftp://” and “file://”)
In each case, the attacker's goal was to gain credentials to the AWS management plane and then leverage existing privileges or escalate privileges (based on the role’s permission boundary).
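One way to see why the metadata service is reachable through SSRF is to check where a user-supplied URL actually points. The sketch below uses Python's standard `ipaddress` module to flag URLs whose host is a literal IP in a non-public range; the function name is ours, and the sketch deliberately leaves out DNS resolution, which a real check would also need (a hostname can resolve to an internal address).

```python
import ipaddress
from urllib.parse import urlsplit


def targets_internal_address(url: str) -> bool:
    """Return True if the URL's host is a literal IP in a non-public range.

    Hostnames are not resolved here; that is a known gap in this sketch.
    """
    try:
        host = urlsplit(url).hostname
    except ValueError:
        return False  # malformed URL
    if host is None:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # not a literal IP address
    return (
        address.is_link_local   # 169.254.0.0/16, where the metadata service lives
        or address.is_private   # e.g. 10.0.0.0/8, 192.168.0.0/16
        or address.is_loopback  # 127.0.0.0/8
    )


print(targets_internal_address("http://169.254.169.254/latest/meta-data/"))  # True
print(targets_internal_address("https://example.com/hook"))                  # False
```

A request to 169.254.169.254 is meaningless from the public internet, but when the vulnerable server issues it on the attacker's behalf, it is answered from inside the instance.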
Privilege escalation based on IAM role
IAM roles are an AWS-specific construct. You can attach them to compute resources within AWS (such as an EC2 instance, an ECS task or a Lambda function), and that resource can then make AWS API calls without stored credentials.
The two components of an IAM role are:
- Trust policy: who may assume this role? For example, the EC2 service
- Permission policy: what does this role allow, and on which resources? For example, list, add or remove objects in S3
Once a principal permitted by the trust policy has assumed the role, the permission policy associated with the role governs its access to other EC2 instances, containers, Lambda functions and every other value-add service (S3, Redshift, DynamoDB, etc.) within AWS.
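To illustrate the two halves of a role, here is a sketch with two hypothetical policy documents written as Python dictionaries in the standard IAM JSON layout: a trust policy letting the EC2 service assume the role, and a deliberately broad permission policy. These are not Capital One's actual policies, which are not public. The helper function is our own and simply flags "Allow" statements that use wildcards.

```python
from typing import List

# Hypothetical trust policy: who may assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Hypothetical permission policy: what the role may do, and on which resources.
PERMISSION_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListAllMyBuckets", "s3:ListBucket", "s3:GetObject"],
            "Resource": "*",  # every bucket in the account
        }
    ],
}


def _as_list(value) -> list:
    """IAM allows a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def broad_allow_statements(policy: dict) -> List[dict]:
    """Return Allow statements with a wildcard action or an all-resources scope."""
    flagged = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        if any("*" in action for action in actions) or "*" in resources:
            flagged.append(statement)
    return flagged


print(len(broad_allow_statements(PERMISSION_POLICY)))  # 1
```

A role for a firewall-like component arguably needs access to very few buckets, if any; a `Resource` of `"*"` is the kind of finding a least-privilege review would question.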
Leveraging the AWS metadata service (http://169.254.169.254/latest/meta-data/iam/security-credentials/) encapsulated in the original vulnerable request, the attacker was able to extract the IAM role name “*****-WAF-WebRole”.
Using this role name, the attacker was then able to repeat the SSRF attack to extract the role's temporary security credentials (http://169.254.169.254/latest/meta-data/iam/security-credentials/*****-WAF-WebRole).
Refer to Cloud MetaData dictionary for SSRF Testing by Jason Haddix https://gist.github.com/jhaddix/78cece26c91c6263653f31ba453e273b
Step 3: Exfiltration of Sensitive Data
IAM Roles use temporary security credentials that auto-expire and auto-renew, so you don’t have to worry about access key rotation — AWS does it for you.
Given their temporary nature, the attacker had to act fast, through several iterations, before the credentials expired.
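From a defender's point of view, the expiry time bounds how long a stolen set of role credentials stays usable. The credential document returned by the metadata service is JSON and includes an `Expiration` timestamp; the sketch below computes the remaining window from it. All values in the sample document are made up, and the function name is ours.

```python
import json
from datetime import datetime, timedelta, timezone

# Sample credential document with made-up values (field names follow the
# format the metadata service returns for role credentials).
SAMPLE_RESPONSE = json.dumps({
    "Code": "Success",
    "Type": "AWS-HMAC",
    "AccessKeyId": "ASIAEXAMPLEEXAMPLE",
    "SecretAccessKey": "example-secret",
    "Token": "example-session-token",
    "Expiration": "2019-03-23T06:00:00Z",
})


def remaining_validity(credential_json: str, now: datetime) -> timedelta:
    """Return how long the credentials remain valid, never less than zero."""
    document = json.loads(credential_json)
    expires = datetime.strptime(
        document["Expiration"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return max(expires - now, timedelta(0))


observed_at = datetime(2019, 3, 23, 1, 30, tzinfo=timezone.utc)
print(remaining_validity(SAMPLE_RESPONSE, observed_at))  # 4:30:00
```

Because the instance keeps receiving fresh credentials, an attacker who can repeat the SSRF request can also renew the stolen set, which is consistent with the "several iterations" described above.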
Stepping back behind iPredator and the TOR network, the attacker executed the S3 “ListBuckets” operation (using the AWS CLI)
$ aws s3 ls
The indictment mentions that this command was run several times and that the results yielded access to around 700 buckets (based on the permission boundary of the IAM role).
Thereafter, the AWS CLI “s3 sync” command was leveraged to exfiltrate the data to her local instance
# Loop over every bucket name in the listing (third column of "aws s3 ls")
for bucket in $(aws s3 ls | awk '{print $3}'); do
  aws s3 sync "s3://${bucket}" .
done
Refer to last blurb in the chat log screenshot above.
Attackers think in Graphs, Defenders think in lists
As illustrated above, the attacker discovered an exposed and vulnerable endpoint in an application, pivoted from the application to the cloud management plane by exploiting the (SSRF-based) vulnerability, and thereafter moved laterally to a high-value asset (S3).
Defenders need to think in Graphs too!
It is evident that the exploit perpetrated by the attacker originated in the application and then pivoted to exfiltrate the high-value asset (customer data).
Rest assured that this was neither an insider attack nor the discovery of a leaky S3 bucket.
Besides modeling the network of connected resources as a graph, one also has to reason about how an application is itself, internally, a connected graph.
An application written in any programming language comprises syntax trees, control flows and information flows, all of which are represented as graphs.
An attacker is often motivated to craft a gadget that violates the control flow integrity of an application’s graph in order to pivot from within the application to the high-value asset outside of it.
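Thinking in graphs can be as simple as asking whether a path exists from the outside world to the asset you care about. The sketch below models the speculative chain from this article as a directed graph and finds the shortest path with a breadth-first search. The node names and edges are our own simplification, not a real inventory of any environment.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical graph: an edge A -> B means "from A, an attacker can reach B".
ATTACK_GRAPH: Dict[str, List[str]] = {
    "internet": ["waf_instance"],
    "waf_instance": ["metadata_service"],
    "metadata_service": ["iam_role_credentials"],
    "iam_role_credentials": ["s3_buckets"],
    "bastion_host": ["waf_instance"],
    "s3_buckets": [],
}


def shortest_attack_path(
    graph: Dict[str, List[str]], source: str, target: str
) -> Optional[List[str]]:
    """Breadth-first search; returns the shortest path or None if unreachable."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(shortest_attack_path(ATTACK_GRAPH, "internet", "s3_buckets"))
```

A list-based review would check each node in isolation ("is the WAF patched?", "are the buckets private?"). The graph view shows that removing any single edge, for example the instance's ability to hand out broadly scoped role credentials, breaks the whole path.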
Using code graphs to assess and mitigate SSRF
How can we address the impact of SSRF to minimize the potential for damage from an attacker?
Listed below are a set of discrete and connected application-centric steps that need to be investigated in order to assess the risk of SSRF (Server-Side Request Forgery):
- Extract the application’s attack surface
- From attack surface, list all exposed API endpoints (inbound and outbound)
- List all business functions supporting these exposed API endpoints (with webhooks, resource embedding, management & monitoring)
- List all public API interfaces that accept an untrusted input (taint) that can propagate through business function flows to trigger a sensitive function.
- Is the application supporting an eventing model using webhooks (to Slack, SendGrid, GitHub, etc.) with a callback interface?
- Are these paths (from untrusted source to business function) rate limiting requests? Legitimate users do not need hundreds of requests per minute, as an attacker carrying out port scanning or a denial-of-service attack would.
- What protocol URI schemes are supported by the application in question?
(configure a whitelist that restricts accepted URI schemes to HTTP/HTTPS)
data:// — Data (RFC 2397)
glob:// — Find pathnames matching pattern
ssh2:// — Secure Shell 2
rar:// — Archive
ogg:// — Audio streams
expect:// — Process Interaction Streams
file:// — Accessing local filesystem
http:// — Accessing HTTP(s) URLs
ftp:// — Accessing FTP(s) URLs
zlib:// — Compression Streams
- Is your application designed, by intent, to act as a proxy or management interface to monitor services?
- Verify whether the application communicates with services such as Kibana, Redis, Elasticsearch, MongoDB and Memcached that are running with default configuration in production (port and authentication scheme)
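Two of the checks in the list above, restricting URI schemes and rate limiting, are small enough to sketch. The code below is a minimal illustration using only the standard library; the function and class names, the allowlist contents and the limits are assumptions for the example, not recommendations for any particular application.

```python
from collections import deque
from typing import Deque, Dict
from urllib.parse import urlsplit

# Assumption: the application only ever needs to call out over HTTP(S).
ALLOWED_SCHEMES = frozenset({"http", "https"})


def has_allowed_scheme(url: str) -> bool:
    """Accept a URL only if its scheme is on the allowlist."""
    try:
        return urlsplit(url).scheme.lower() in ALLOWED_SCHEMES
    except ValueError:
        return False  # malformed URL


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = {}

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits.setdefault(client_id, deque())
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


print(has_allowed_scheme("https://example.com/hook"))  # True
print(has_allowed_scheme("file:///etc/passwd"))        # False
```

An allowlist is preferable to a blocklist here because the set of exotic schemes (such as those listed above) is open-ended, while the set an application legitimately needs is usually tiny. Neither check stops a plain `http://` request to an internal address, so they complement rather than replace destination validation.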
Listed below are a set of infrastructure-centric steps that need to be investigated in order to assess the risk of SSRF (Server-Side Request Forgery):
- Check for misconfigured trust and permission policies associated with the IAM role in question. Regular policy audits guided by the principle of least privilege would be a good first step.
- Continuously monitor S3 access (read/write) and IAM API calls using AWS CloudTrail
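As a sketch of what such monitoring could look for: role credentials issued to an EC2 instance should normally be used from that instance's network, so S3 calls made with an assumed role from an unexpected source address deserve attention. The code below filters CloudTrail-style records on that basis. The record fields used (`eventSource`, `eventName`, `sourceIPAddress`, `userIdentity.type`) follow the CloudTrail record format, but the sample records, account ID, role name and expected network range are made up.

```python
import ipaddress
from typing import Iterable, List

# Assumption: the address ranges from which this role is legitimately used.
EXPECTED_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def suspicious_s3_calls(records: Iterable[dict], expected_networks=EXPECTED_NETWORKS) -> List[dict]:
    """Return S3 calls made with assumed-role credentials from unexpected IPs."""
    flagged = []
    for record in records:
        if record.get("eventSource") != "s3.amazonaws.com":
            continue
        if record.get("userIdentity", {}).get("type") != "AssumedRole":
            continue
        try:
            source = ipaddress.ip_address(record.get("sourceIPAddress", ""))
        except ValueError:
            continue  # calls made by AWS services carry a hostname, not an IP
        if not any(source in network for network in expected_networks):
            flagged.append(record)
    return flagged


# Made-up sample records for illustration.
ROLE_ARN = "arn:aws:sts::123456789012:assumed-role/example-web-role/i-0abc"
SAMPLE_RECORDS = [
    {"eventSource": "s3.amazonaws.com", "eventName": "ListBuckets",
     "sourceIPAddress": "203.0.113.7",
     "userIdentity": {"type": "AssumedRole", "arn": ROLE_ARN}},
    {"eventSource": "s3.amazonaws.com", "eventName": "ListBuckets",
     "sourceIPAddress": "10.1.2.3",
     "userIdentity": {"type": "AssumedRole", "arn": ROLE_ARN}},
]

for hit in suspicious_s3_calls(SAMPLE_RECORDS):
    print(hit["eventName"], hit["sourceIPAddress"])  # ListBuckets 203.0.113.7
```

Keep in mind that object-level S3 operations are only logged when data events are enabled on the trail, so a filter like this sees bucket listings by default but not necessarily the reads that follow.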
ShiftLeft’s Ocular is an application security platform built over the foundational Code Property Graph that is uniquely positioned to deliver a specification model to query for vulnerable conditions, business logic flaws and insider attacks that might exist in your application’s code base.
Capital One breach crime board — case of speculative sleuthing was originally published in ShiftLeft Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.
*** This is a Security Bloggers Network syndicated blog from ShiftLeft Blog - Medium authored by Chetan Conikee. Read the original post at: https://blog.shiftleft.io/capital-one-breach-crime-board-case-of-speculative-sleuthing-e18fa937fa21?source=rss----86a4f941c7da---4