GraphQL - Security Overview and Testing Tips

With the increasing popularity of GraphQL technology we are summarizing some documentation and tips about common security mistakes.

What is GraphQL?

GraphQL is a data query language developed by Facebook and publicly released in 2015. It is an alternative to REST API.

Even if you don’t see any GraphQL out there, it is likely you’re already using it since it’s running on some big tech giants like Facebook, GitHub, Pinterest, Twitter, HackerOne and a lot more.

A few key points on this technology

  • GraphQL provides a complete and understandable description of the data in the API and gives clients the power to ask for exactly what they need. Queries always return predictable results.

  • While typical REST APIs require loading from multiple URLs, GraphQL APIs get all the data your app needs in a single request.

  • GraphQL APIs are organized in terms of types and fields, not endpoints. You can access the full capabilities of all your data from a single endpoint.

  • GraphQL is strongly typed to ensure that application only ask for what’s possible and provide clear and helpful errors.

  • New fields and types can be added to the GraphQL API without impacting existing queries. Aging fields can be deprecated and hidden from tools.

Before we start diving into the GraphQL security landscape, here is a brief recap on how it works. The official documentation is well written and was really helpful.

A GraphQL query looks like this:

Basic GraphQL Query

query{
	user{
		id
		email
		firstName
		lastName
	}
}

While the response is JSON:

Basic GraphQL Response

{
	"data": {
		"user": {
			"id": "1",
			"email": "paolo@doyensec.com",
			"firstName": "Paolo",
			"lastName": "Stagno"
		}
	}
}

Security Testing Tips

Since Burp Suite does not understand GraphQL syntax well, I recommend using the graphql-ide, an Electron based app that allows you to edit and send requests to a GraphQL endpoint; I also wrote a small python script GraphQL_Introspection.py that enumerates a GraphQL endpoint (with introspection) in order to pull out documentation. The script is useful for examining the GraphQL schema looking for information leakage, hidden data and fields that are not intended to be accessible.

The tool will generate a HTML report similar to the following:

Python Script pulling data from a GraphQL endpoint

Introspection is used to ask for a GraphQL schema for information about what queries, types and so on it supports.

As a pentester, I would recommend to look for requests issued to “/graphql” or “/graphql.php” since those are usual GraphQL endpoint names; you should also search for “/graphiql”, ”graphql/console/”, online GraphQL IDEs to interact with the backend, and “/graphql.php?debug=1” (debugging mode with additional error reporting) since they may be left open by developers.

When testing an application, verify whether requests can be issued without the usual authorization token header:

GraphQL Bearer Authorization Header Example

Since the GraphQL framework does not provide any means for securing your data, developers are in charge of implementing access control as stated in the documentation:

“However, for a production codebase, delegate authorization logic to the business logic layer”.

Things may go wrong, thus it is important to verify whether a user without proper authentication and/or authorization can request the whole underlying database from the server.

When building an application with GraphQL, developers have to map data to queries in their chosen database technology. This is where security vulnerabilities can be easily introduced, leading to Broken Access Controls, Insecure Direct Object References and even SQL/NoSQL Injections.

As an example of a broken implementation, the following request/response demonstrates that we can fetch data for any users of the platform (cycling through the ID parameter), while simultaneously dumping password hashes:

Query

query{
	user(id: 165274){
		id
		email
		firstName
		lastName
		password
	}
}

Response

{
	"data": {
		"user": {
			"id": "165274",
			"email": "johndoe@mail.com",
			"firstName": "John",
			"lastName": "Doe"
			"password": "5F4DCC3B5AA765D61D8327DEB882CF99"
		}
	}
}

Another thing that you will have to check is related to information disclosure when trying to perform illegal queries:

Information Disclosure

{
	"errors": [
		{
			"message": "Invalid ID.",
			"locations": [
				{
					"line": 2,
					"column": 12
				}
				"Stack": "Error: invalid ID\n at (/var/www/examples/04-bank/graphql.php)\n"
      			]
		}
	]
}

Even though GraphQL is strongly typed, SQL/NoSQL Injections are still possible since GraphQL is just a layer between client apps and the database. The problem may reside in the layer developed to fetch variables from GraphQL queries in order to interrogate the database; variables that are not properly sanitized lead to old simple SQL Injection. In case of Mongodb, NoSQL injection may not be that simple since we cannot “juggle” types (e.g. turning a string into an array. See PHP MongoDB Injection).

GraphQL SQL Injection

mutation search($filters Filters!){
	authors(filter: $filters)
	viewer{
		id
		email
		firstName
		lastName
	} 
}

{
	"filters":{
		"username":"paolo' or 1=1--"
		"minstories":0
	}
}

Beware of nested queries! They can allow a malicious client to perform a DoS (Denial of Service) attack via overly complex queries that will consume all the resources of the server:

Nested Query

query {
 stories{
  title
  body
  comments{
   comment
   author{
    comments{
     author{
      comments{
       comment
       author{
        comments{
         comment
         author{
          comments{
           comment
           author{
            name
           }
          }
         }
        }
       }
      }
     }
    }
   }
  }
 }
}

An easy remediation against DoS could be setting a timeout, a maximum depth or a query complexity threshold value.

Keep in mind that in the PHP GraphQL implementation:

  • Complexity analysis is disabled by default

  • Limiting Query Depth is disabled by default

  • Introspection is enabled by default. It means that anybody can get a full description of your schema by sending a special query containing meta fields type and schema

Outro

GraphQL is a new interesting technology, which can be used to build secure applications. Since developers are in charge of implementing access control, applications are prone to classical web application vulnerabilites like Broken Access Controls, Insecure Direct Object References, Cross Site Scripting (XSS) and Classic Injection Bugs. As any technology, GraphQL-based applications may be prone to development implementation errors like this real-life example:

“By using a script, an entire country’s (I tested with the US, the UK and Canada) possible number combinations can be run through these URLs, and if a number is associated with a Facebook account, it can then be associated with a name and further details (images, and so on).”

@voidsec

Resources:

We're hiring - Join Doyensec!

At Doyensec, we believe that quality is the natural product of passion and care. We love what we do and we routinely take on difficult engineering challenges to help our customers build with security.

We are a small highly focused team. We concentrate on application security and do fewer things better. We don’t care about your education, background and certifications. If you are really good and passionate at building and breaking complex software, you’re the right candidate.

Open Positions

:: Full-stack Security Automation Engineer (Six Months Collaboration, Remote Work) ::

We are looking for a full-stack senior software engineer that can help us build security automation tools. If you’ve ever built a fuzzer, played with static analysis and enhanced a web scanner engine, you probably have the right skillset for the job.

We offer a well-paid six-months collaboration, combined with an additional bonus upon successful completion of the project.

Responsibilities:

  • Full-stack development (front-end, back-end components) of web security testing tools
  • Solve technical challenges at the edge of web security R&D, together with Doyensec’s founders

Requirements:

  • Experience developing multi-tiered software applications or products. We generally use Node.js and Java, and require proficiency in those languages
  • Ability to work with standard dev tools and techniques (IDE, git, …)
  • You’re passionate about building great software and can have fun while doing it
  • Interested in web security, with good understanding of common software vulnerabilities
  • You’re self-driven and can focus on a project to make it happen
  • Eager to learn, adapt and perfect your work

Contact us at info@doyensec.com

:: Application Security Engineer (Full-time, Remote Work - Europe) ::

We are looking for an experienced security engineer to join our consulting team. We perform graybox security testing on complex web and mobile applications. We need someone who can hit the ground running. If you’re good at “crawling around in the ventilation ducts of the world’s most popular and important applications”, you probably have the right skillset for the job.

We offer a competitive salary in a supportive and dynamic environment that rewards hard work and talent. We are dedicated to providing research-driven application security and therefore invest 25% of your time exclusively to research where we build security testing tools, discover new attack techniques, and develop countermeasures.

Responsibilities:

  • Security testing of web, mobile (iOS, Android) applications
  • Vulnerability research activities, coordinated and executed with Doyensec’s founders
  • Partner with customers to ensure project’s objectives are achieved 

Requirements:

  • Ability to discover, document and fix security bugs
  • You’re passionate about understanding complex systems and can have fun while doing it
  • Top-notch in web security. Show us public research, code, advisories, etc.
  • Eager to learn, adapt, and perfect your work

Contact us at info@doyensec.com


Staring into the Spotlight

Spotlight is the all pervasive seeing eye of the OSX userland. It drinks from a spout of file events sprayed out of the kernel and neatly indexes such things for later use. It is an amalgamation of binaries and libraries, all neatly fitted together just to give a user oversight of their box. It presents interesting attack surface and this blog post is an explanation of how some of it works.

One day, we found some interesting looking crashes recorded in /Users/<name>/Library/Logs/DiagnosticReports

Yet the crashes weren’t from the target. In OSX, whenever a file is created, a filesystem event is generated and sent down from the kernel. Spotlight listens for this event and others to immediately parse the created file for metadata. While fuzzing a native file parser these Spotlight crashes began to appear from mdworker processes. Spotlight was attempting to index each of the mutated input samples, intending to include them in search results later.

fsevents

The Spotlight system is overseen by mds. It opens and reads from /dev/fsevents, which streams down file system event information from the kernel. Instead of dumping the events to disk, like fseventsd, it dumps the events into worker processes to be parsed on behalf of Spotlight. Mds is responsible for delegating work and managing mdworker processes with whom it communicates through mach messaging. It creates, monitors, and kills mdworkers based on some light rules. The kernel does not block and the volume of events streaming through the fsevents device can be quite a lot. Mds will spawn more mdworker processes when handling a higher event magnitude but there is no guarantee it can see and capture every single event.

The kernel filters which root level processes can read from this device. fsevents filter

Each of the mdworker processes get spawned, parse some files, write the meta info, and die. Mdworker shares a lot of code with mdimport, its command line equivalent. The mdimport binary is used to debug and test Spotlight importers and therefore makes a great target for auditing and fuzzing. Much of what we talk about in regards to mdimport also applies to mdworker.

Importers

You can see what mdworkers are up to with the following: sudo fs_usage -w -f filesys mdworker

Importers are found in /Library/Spotlight, /System/Library/Spotlight, or in an application’s bundle within “/Contents/Library/Spotlight”. If the latter is chosen, the app typically runs a post install script with mdimport -r <importer> and/or lsregister. The following command shows the list of importers present on my laptop. It shows some third party apps have installed their own importers.

$ mdimport -L
2017-07-30 00:36:15.518 mdimport[40541:1884333] Paths: id(501) (
    "/Library/Spotlight/iBooksAuthor.mdimporter",
    "/Library/Spotlight/iWork.mdimporter",
    "/Library/Spotlight/Microsoft Office.mdimporter",
    "/System/Library/Spotlight/Application.mdimporter",
...
    "/System/Library/Spotlight/SystemPrefs.mdimporter",
    "/System/Library/Spotlight/vCard.mdimporter",
    "/Applications/Xcode.app/Contents/Applications/Application Loader.app/Contents/Library/Spotlight/MZSpotlight.mdimporter",
    "/Applications/LibreOffice.app/Contents/Library/Spotlight/OOoSpotlightImporter.mdimporter",
    "/Applications/OmniGraffle.app/Contents/Library/Spotlight/OmniGraffle.mdimporter",
    "/Applications/GarageBand.app/Contents/Library/Spotlight/LogicX_MDImport.mdimporter",
    "/Applications/Xcode.app/Contents/Library/Spotlight/uuid.mdimporter"
)

These .mdimporter files are actually just packages holding a binary. These binaries are what we are attacking.

Using mdimport is simple - mdimport <file>. Spotlight will only index metadata for filetypes having an associated importer. File types are identified through magic. For example, mdimport reads from the MAGIC environment variable or uses the “/usr/share/file/magic” directory which contains both the compiled .mgc file and the actual magic patterns. The format of magic files is discussed at https://developer.apple.com/legacy/library/documentation/Darwin/Reference/ManPages/man5/magic.5.html.

Crash File

crash logging

One thing to notice is that the crash log will contain some helpful information about the cause. The following message gets logged by both mdworker and mdimport, which share much of the same code:

Application Specific Information:
import fstype:hfs fsflag:480D000 flags:40000007E diag:0 isXCode:0 uti:com.apple.truetype-datafork-suitcase-font plugin:/Library/Spotlight/Font.mdimporter - find suspect file using: sudo mdutil -t 2682437

The 2682437 is the iNode reference number for the file in question on disk. The -t argument to mdutil will ask it to lookup the file based on volume ID and iNode and spit out the string. It performs an open and fcntl on the pseudo directory /.vol/<Volume ID>/<File iNode>. You can see this info with the stat syscall on a file.

$ stat /etc
16777220 418395 lrwxr-xr-x 1 root wheel 0 11 "Dec 10 05:13:41 2016" "Dec 10 05:13:41 2016" "Dec 10 05:15:47 2016" "Dec 10 05:13:41 2016" 4096 8 0x88000 /etc

$ ls /.vol/16777220/418395
afpovertcp.cfg    fstab.hd            networks          protocols
aliases           ftpd.conf           newsyslog.conf    racoon
aliases.db        ftpd.conf.default   newsyslog.d       rc.common

The UTI registered by the importer is also shown “com.apple.truetype-datafork-suitcase-font”. In this case, the crash is caused by a malformed Datafork TrueType suitcase (.dfont) file.

When we find a bug, we can study it under lldb. Launch mdimport under the debugger with the crash file as an argument. In this particular bug it breaks with an exception in the /System/Library/Spotlight/Font.mdimporter importer.

crash logging

The screenshot below shows the problem procedure with the crashing instruction highlighted for this particular bug.

The rsi register points into the memory mapped font file. A value is read out and stored in rax which is then used as an offset from rcx which points to the text segment of the executable in memory. A lookup is done on a hardcoded table and parsing proceeds from there. The integer read out of the font file is never validated.

When writing or reversing a Spotlight importer, the main symbol to first look at will be GetMetadataForFile or GetMetadataForURL. This function receives a path to parse and is expected to return the metadata as a CFDictionary.

We can see, from the stacktrace, how and where mdimport jumps into the GetMetadataForFile function in the Font importer. Fuzzing mdimport is straightforward, crashes and signals are easily caught.

The variety of importers present on OSX are sometimes patched alongside the framework libraries, as code is shared. However, a lot of code is unique to these binaries and represents a nice attack surface. The Spotlight system is extensive, including its own query language and makes a great target where more research is needed.

When fuzzing in general on OSX, disable Spotlight oversight of the folder where you generate and remove your input samples. The folder can be added in System Preferences->Spotlight->Privacy. You can’t fuzz mdimport from this folder, instead disable Spotlight with “mdutil -i off” and run your fuzzer from a different folder.

@day6reak