Data in the browser is data at risk

Many third party web applications share sensitive data with parties other than the website owner. This sharing can be intentional or inadvertent, but to prevent breaches and manage risk, website owners should adopt a data-centric approach to securing their web applications.


The average website user doesn’t realise that the website owner is rarely in complete control of the web applications they access: roughly two thirds of the code on a typical website comes from third parties. 

In practical terms, this means any sensitive data entered could be handled by third parties, which may in turn load code from fourth parties, which may load from fifth parties… and on it goes. That’s a lot more than the average website user – and often the website owner – is bargaining for when they enter sensitive log-in details for online banking or payment card details for shopping. 

It’s not just customer data that’s at risk, either: remote employees accessing or updating sensitive information via web applications over SSL VPNs are also affected. Processing sensitive information over an encrypted channel isn’t automatically secure: once the information is within a browser, it’s just as vulnerable to formjacking and other client-side web attacks, and endpoint DLP products have no visibility into what’s going on inside the browser. 

From a cybercriminal’s perspective, it’s an expanded attack surface offering a rich vein of high-value data: a single compromise of this ‘supply chain’ framework enables the theft of information from many thousands of users. One effort to mug the whole crowd, as it were.


Time to shift security focus

Most enterprises have invested heavily in securing backend databases and filesystems but overlook one critical aspect: what happens within the browser, which is often the point of origination for sensitive data such as credit card details or driver’s license numbers. What if this sensitive information is hijacked at the point of origination and never reaches a backend database or log at all?

In this landscape, the security focus has to shift to the data. Not to the exclusion of traditional approaches but, as this very insightful graph originally produced by the Burton Group (now part of Gartner) illustrates, data-centric controls are the most effective, for the longest period of time.


Why data-centric controls matter

When you look at protection, it’s not just about having a WAF or a Windows firewall. At some level, application controls are effective but, for today’s web architecture and the ecosystem of JavaScript-based integrations that powers it, data-centric controls are critical. When you consider what JavaScript delivered by first and third-party sites but running in the browser can do, it’s easy to understand why:

  • First and third-party code running on your site can collect and read data. 
  • While it’s loading, JavaScript can make network requests that carry data off the page. That data can be complete information, such as an email address in the clear. Sometimes it’s obfuscated, and sometimes it’s disaggregated: a first and last name together constitute PII, but if they’re exfiltrated separately, to separate destinations, they’re technically not PII and may not be flagged as such.
  • It can leave a footprint on disk via cookies, local storage, or IndexedDB – potentially persisting on the file system.
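The disaggregation point is worth dwelling on. A minimal, hypothetical sketch in Python (the rule and field names are invented for illustration) shows how a naive DLP-style matcher that looks for full names in a single request misses the same data split across two requests:

```python
import re

# Hypothetical, naive DLP-style rule: flag an outbound payload only when a
# full "First Last" name appears together in one request.
FULL_NAME = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")

def naive_dlp_flags(payload: str) -> bool:
    """Return True if the payload looks like it contains a full name."""
    return bool(FULL_NAME.search(payload))

# One request carrying the combined PII is caught...
print(naive_dlp_flags("name=Jane Doe&plan=gold"))   # True

# ...but the same data, exfiltrated as two requests to two destinations,
# slips past the rule entirely.
print(naive_dlp_flags("fn=Jane"))                   # False
print(naive_dlp_flags("ln=Doe"))                    # False
```

This is why per-request pattern matching alone isn’t enough: the detection has to correlate what leaves the page, and where it goes, across requests.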

Cookies and storage are a widely overlooked aspect of web data security. From local shared objects (LSOs, or ‘Flash’ cookies) capable of surviving deletion to stateful and stateless cookies, customer data collected on websites can be a source of unintentional data leakage. 

When it comes to the portion of your apps that handle data, every website owner should be asking:

  • How well are we scrutinizing these apps?
  • Who has access to what? Who can read it? Who actually captures/reads it? And who actually leaks it – and how?
  • What rules or controls do we have in place to block potential exfiltration or data theft?
  • Who enforces those controls? The browser? Or something else that’s potentially in conflict with another piece of JavaScript? 

Finally, when actual incidents of exfiltration or data leakage occur, can you detect them and protect your data as they happen?


Solving the data leakage problem for web applications

To prevent data leakage, you need to be able to detect it. And the most rigorous way to do that is through information flow analysis (tracking how information flows through your application) and taint analysis (which identifies every source of user data, including things like form inputs and headers, marks everything that touches that data as “contaminated” or tainted, and follows it all the way through your system so it can be sanitized before it can do any harm). 
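As an illustration only (this is a toy sketch, not Tala’s implementation or any production engine), taint tracking can be reduced to three moving parts: wrap data from untrusted sources in a marker type, propagate the marker through any operation that touches it, and refuse to let marked data reach a sink until it has been sanitized:

```python
import html

class Tainted(str):
    """A string marked as originating from untrusted user input."""

def taint(value: str) -> "Tainted":
    # Source: anything read from a form input, header, etc. gets marked.
    return Tainted(value)

def concat(a: str, b: str) -> str:
    # Propagation rule: any value derived from tainted data is tainted.
    result = a + b
    if isinstance(a, Tainted) or isinstance(b, Tainted):
        return Tainted(result)
    return result

def sanitize(value: str) -> str:
    # Sanitization (here, HTML-escaping) returns a plain str: taint cleared.
    return html.escape(str(value))

def render_sink(value: str) -> str:
    # Sink: writing into the page (or a network request) rejects tainted data.
    if isinstance(value, Tainted):
        raise ValueError("tainted data reached a sink without sanitization")
    return value

user_input = taint("<script>steal()</script>")
page = concat("Hello, ", user_input)      # still tainted after propagation
safe_page = render_sink(sanitize(page))   # escaped, so the sink accepts it
```

Real engines track far more than string concatenation, of course; the point is that every path from source to sink is accounted for, which is exactly what makes the technique thorough and expensive.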

Sounds great, right? And it is. The problem for regular business owners is that these are largely academic, heavyweight techniques that require complex tools, manuals and experts to drive them and to identify possible leakage channels and vectors within an application. 


PII Exposure Scanning and Leakage Mapping

Tala’s latest innovation is built on these foundations, but with easier implementation and management for businesses. To build our new PII features, we developed lightweight analysis: as we crawl and analyse an application, using techniques such as synthetic transactions, we can automatically identify the specific libraries or JavaScript leaking information, even when that information is obfuscated using encoding techniques like Base64, hashed or even encrypted.
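To make the Base64 case concrete, here is a hypothetical sketch (not Tala’s analysis engine) of how a scanner can catch email addresses hidden behind a single layer of Base64 in an outbound payload; hashed or encrypted values can’t be recovered this way and require behavioral analysis instead:

```python
import base64
import binascii
import re

EMAIL = re.compile(rb"[\w.+-]+@[\w-]+\.[\w.]+")

def find_leaked_emails(payload: bytes) -> list:
    """Scan an outbound payload for email addresses, including addresses
    hidden behind one layer of Base64 encoding."""
    hits = EMAIL.findall(payload)
    # Try decoding each Base64-looking token and re-scan the result.
    for token in re.findall(rb"[A-Za-z0-9+/=]{8,}", payload):
        try:
            decoded = base64.b64decode(token, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid Base64; ignore
        hits.extend(EMAIL.findall(decoded))
    return [h.decode() for h in hits]

# A plain scan sees nothing, but decoding the blob reveals the address.
blob = base64.b64encode(b"user=jane@example.com")
print(find_leaked_emails(b'{"d":"' + blob + b'"}'))
```

The destinations, field names and patterns here are invented; the takeaway is only that a single encoding layer is cheap for an attacker to add and equally cheap for analysis to strip.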

Tala’s PII Exposure Scanning and PII Leakage Mapping work with our patented analytics engine to enable the fine-tuning of policies that prevent sensitive data exfiltration from trusted applications. We synthetically monitor data flows to identify sensitive data leakage, without customers having to install or instrument anything on their web apps. Tala’s analytics engine dynamically scans for this specific data leakage. Users can create customized data patterns and types to define data destination policies, similar to DLP for web applications (but going well beyond simple keyword matching). An alert is generated when any sensitive data violates policy-defined permissions or is identified as suspicious by analytics. 
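The idea of a data destination policy can be illustrated with a simplified, hypothetical check (the pattern names, allowlisted host and URLs below are all invented for the example): a custom data pattern for payment cards, validated with a Luhn check to cut false positives, raises an alert whenever a match is sent to a host the policy doesn’t allow:

```python
import re
from urllib.parse import urlparse

# Hypothetical custom data patterns and their allowed destinations.
PATTERNS = {
    "payment_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}
ALLOWED_DESTINATIONS = {
    "payment_card": {"checkout.example.com"},  # assumed first-party processor
}

def luhn_valid(number: str) -> bool:
    """Luhn checksum: filters out digit runs that aren't real card numbers."""
    digits = [int(d) for d in re.sub(r"\D", "", number)]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def check_request(url: str, body: str) -> list:
    """Return (pattern, host) alerts for sensitive data sent off-policy."""
    host = urlparse(url).netloc
    alerts = []
    for name, pattern in PATTERNS.items():
        for match in pattern.findall(body):
            if name == "payment_card" and not luhn_valid(match):
                continue  # looked like a card number, but fails the checksum
            if host not in ALLOWED_DESTINATIONS.get(name, set()):
                alerts.append((name, host))
    return alerts
```

A card number posted to `checkout.example.com` passes silently, while the same body sent to an unrecognised analytics host produces a `("payment_card", host)` alert, which is the keyword-matching-plus-destination logic the paragraph above describes in miniature.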

This kind of insight into application behavior is critical, because simply knowing what third-party services are capable of collecting data is not the same as knowing when they are doing this. Monitoring, together with dynamic prevention, significantly reduces the risk of applications stealing or accessing data. 



*** This is a Security Bloggers Network syndicated blog from Tala Blog authored by Sanjay Sawhney, Co-Founder and VP of Engineering. Read the original post at:
