Reversing Pickles with r2pickledec

R2pickledec is the first pickle decompiler to support all instructions up to protocol 5 (the current version). In this post we will go over what Python pickles are, how they work, and how to reverse them with Radare2 and r2pickledec. An upcoming blog post will go even deeper into pickles and share some advanced obfuscation techniques.

What are pickles?

Pickles are the built-in serialization algorithm in Python. They can turn any Python object into a byte stream so it may be stored on disk or sent over a network. Pickles are notoriously dangerous. You should never unpickle data from an untrusted source. Doing so will likely result in remote code execution. Please refer to the documentation for more details.
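Both halves of that claim are easy to demonstrate. The sketch below shows a harmless round trip, then the classic `__reduce__` trick that turns unpickling attacker data into code execution (the `Evil` class is a made-up example):

```python
import os
import pickle

# Harmless round trip: any Python object to bytes and back.
data = {"user": "admin", "active": True}
blob = pickle.dumps(data, protocol=5)
assert pickle.loads(blob) == data

# The danger: __reduce__ lets an object name an arbitrary callable
# for the unpickler to invoke, so loading attacker data runs attacker code.
class Evil:
    def __reduce__(self):
        return (os.system, ("id",))  # executed at load time

payload = pickle.dumps(Evil())
# pickle.loads(payload)  # would run `id` on the loading machine
```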

Pickle Basics

Pickles are implemented as a very simple assembly language. There are only 68 instructions and they mostly operate on a stack. The instruction names are pretty easy to understand. For example, the instruction empty_dict will push an empty dictionary onto the stack.
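You can see these opcodes without any external tooling: the standard library's pickletools module prints a disassembly much like r2's. EMPTY_DICT below is pickletools' name for the same opcode r2 calls empty_dict (byte 0x7d, ASCII "}"):

```python
import io
import pickle
import pickletools

# Disassemble a pickle of an empty dict and capture the listing.
out = io.StringIO()
pickletools.dis(pickle.dumps({}, protocol=4), out=out)
print(out.getvalue())
```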

The stack only allows access to the top item (or the top few items, for some instructions). If you want to grab something else, you must use the memo. The memo is implemented as a dictionary with non-negative integer indexes. You will often see memoize instructions. Roughly speaking, the memoize instruction copies the item at the top of the stack into the next free index in the memo. Then, if that item is needed later, a binget n can be used to get the object at index n.
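The memo is easy to observe by pickling the same object twice. In the disassembly of the list below, the first occurrence of the string is memoized and the second reference becomes a binget of that memo index:

```python
import io
import pickle
import pickletools

# A list holding the same string object twice.
s = "shared"
out = io.StringIO()
pickletools.dis(pickle.dumps([s, s], protocol=4), out=out)
asm = out.getvalue()
assert "MEMOIZE" in asm   # first occurrence stored in the memo
assert "BINGET" in asm    # second occurrence fetched back from it
```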

To learn more about pickles, I recommend playing with some pickles. Enable descriptions in Radare2 with e asm.describe = true to get short descriptions of each instruction. Decompile simple pickles that you build yourself, and see if you can understand the instructions.

Installing Radare2 and r2pickledec

For reversing pickles, our tool of choice is Radare2 (r2 for short). Package managers tend to ship really old r2 versions. In this case that's probably fine, since I added the pickle arch to r2 a long time ago, but if you run into any bugs I suggest installing from source.

In this blog post, we will primarily be using our R2pickledec decompiler plugin. I purposely wrote this plugin to rely only on r2 libraries, so if r2 works on your system, r2pickledec should work too. You should be able to install it with r2pm.

$ r2pm -U             # update package db
$ r2pm -ci pickledec  # clean install

You can verify everything worked with the following command. You should see the r2pickledec help menu.

$ r2 -a pickle -qqc 'pdP?' -
Usage: pdP[j]  Decompile python pickle
| pdP   Decompile python pickle until STOP, eof or bad opcode
| pdPj  JSON output
| pdPf  Decompile and set pick.* flags from decompiled var names

Reversing a Real pickle with Radare2 and r2pickledec

Let’s reverse a real pickle. One never reverses without some context, so let’s imagine you just broke into a webserver. The webserver is intended to allow employees of the company to perform privileged actions on client accounts. While poking around, you find a pickle file that is used by the server to restore state. What interesting things might we find in the pickle?

The pickle appears below base64 encoded. Feel free to grab it and play along at home.

$ base64 -i /tmp/blog2.pickle -b 64
gASVDQYAAAAAAACMCF9fbWFpbl9flIwDQXBplJOUKYGUfZQojAdzZXNzaW9ulIwR
cmVxdWVzdHMuc2Vzc2lvbnOUjAdTZXNzaW9ulJOUKYGUfZQojAdoZWFkZXJzlIwT
cmVxdWVzdHMuc3RydWN0dXJlc5SME0Nhc2VJbnNlbnNpdGl2ZURpY3SUk5QpgZR9
lIwGX3N0b3JllIwLY29sbGVjdGlvbnOUjAtPcmRlcmVkRGljdJSTlClSlCiMCnVz
ZXItYWdlbnSUjApVc2VyLUFnZW50lIwWcHl0aG9uLXJlcXVlc3RzLzIuMjguMpSG
lIwPYWNjZXB0LWVuY29kaW5nlIwPQWNjZXB0LUVuY29kaW5nlIwNZ3ppcCwgZGVm
bGF0ZZSGlIwGYWNjZXB0lIwGQWNjZXB0lIwDKi8qlIaUjApjb25uZWN0aW9ulIwK
Q29ubmVjdGlvbpSMCmtlZXAtYWxpdmWUhpR1c2KMB2Nvb2tpZXOUjBByZXF1ZXN0
cy5jb29raWVzlIwRUmVxdWVzdHNDb29raWVKYXKUk5QpgZR9lCiMB19wb2xpY3mU
jA5odHRwLmNvb2tpZWphcpSME0RlZmF1bHRDb29raWVQb2xpY3mUk5QpgZR9lCiM
CG5ldHNjYXBllIiMB3JmYzI5NjWUiYwTcmZjMjEwOV9hc19uZXRzY2FwZZROjAxo
aWRlX2Nvb2tpZTKUiYwNc3RyaWN0X2RvbWFpbpSJjBtzdHJpY3RfcmZjMjk2NV91
bnZlcmlmaWFibGWUiIwWc3RyaWN0X25zX3VudmVyaWZpYWJsZZSJjBBzdHJpY3Rf
bnNfZG9tYWlulEsAjBxzdHJpY3RfbnNfc2V0X2luaXRpYWxfZG9sbGFylImMEnN0
cmljdF9uc19zZXRfcGF0aJSJjBBzZWN1cmVfcHJvdG9jb2xzlIwFaHR0cHOUjAN3
c3OUhpSMEF9ibG9ja2VkX2RvbWFpbnOUKYwQX2FsbG93ZWRfZG9tYWluc5ROdWKM
CF9jb29raWVzlH2UdWKMBGF1dGiUjAVhZG1pbpSMD1BpY2tsZXMgYXJlIGZ1bpSG
lIwHcHJveGllc5R9lIwFaG9va3OUfZSMCHJlc3BvbnNllF2Uc4wGcGFyYW1zlH2U
jAZ2ZXJpZnmUiIwEY2VydJROjAhhZGFwdGVyc5RoFClSlCiMCGh0dHBzOi8vlIwR
cmVxdWVzdHMuYWRhcHRlcnOUjAtIVFRQQWRhcHRlcpSTlCmBlH2UKIwLbWF4X3Jl
dHJpZXOUjBJ1cmxsaWIzLnV0aWwucmV0cnmUjAVSZXRyeZSTlCmBlH2UKIwFdG90
YWyUSwCMB2Nvbm5lY3SUTowEcmVhZJSJjAZzdGF0dXOUTowFb3RoZXKUTowIcmVk
aXJlY3SUTowQc3RhdHVzX2ZvcmNlbGlzdJSPlIwPYWxsb3dlZF9tZXRob2RzlCiM
BVRSQUNFlIwGREVMRVRFlIwDUFVUlIwDR0VUlIwESEVBRJSMB09QVElPTlOUkZSM
DmJhY2tvZmZfZmFjdG9ylEsAjBFyYWlzZV9vbl9yZWRpcmVjdJSIjA9yYWlzZV9v
bl9zdGF0dXOUiIwHaGlzdG9yeZQpjBpyZXNwZWN0X3JldHJ5X2FmdGVyX2hlYWRl
cpSIjBpyZW1vdmVfaGVhZGVyc19vbl9yZWRpcmVjdJQojA1hdXRob3JpemF0aW9u
lJGUdWKMBmNvbmZpZ5R9lIwRX3Bvb2xfY29ubmVjdGlvbnOUSwqMDV9wb29sX21h
eHNpemWUSwqMC19wb29sX2Jsb2NrlIl1YowHaHR0cDovL5RoVymBlH2UKGhaaF0p
gZR9lChoYEsAaGFOaGKJaGNOaGROaGVOaGaPlGhoaG9ocEsAaHGIaHKIaHMpaHSI
aHUojA1hdXRob3JpemF0aW9ulJGUdWJoeH2UaHpLCmh7SwpofIl1YnWMBnN0cmVh
bZSJjAl0cnVzdF9lbnaUiIwNbWF4X3JlZGlyZWN0c5RLHnVijAdiYXNldXJslIwU
aHR0cHM6Ly9leGFtcGxlLmNvbS+UdWIu

We decode the pickle and put it in a file; let's call it test.pickle. We then open the file with r2. We also run x to see some hex and pd to print disassembly. If you ever want to know what an r2 command does, just run the command with a ? appended to the end to get a help menu (e.g., pd?).

$ r2 -a pickle test.pickle
 -- .-. .- -.. .- .-. . ..---
[0x00000000]> x
- offset -   0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0123456789ABCDEF
0x00000000  8004 95bf 0500 0000 0000 008c 1172 6571  .............req
0x00000010  7565 7374 732e 7365 7373 696f 6e73 948c  uests.sessions..
0x00000020  0753 6573 7369 6f6e 9493 9429 8194 7d94  .Session...)..}.
0x00000030  288c 0768 6561 6465 7273 948c 1372 6571  (..headers...req
0x00000040  7565 7374 732e 7374 7275 6374 7572 6573  uests.structures
0x00000050  948c 1343 6173 6549 6e73 656e 7369 7469  ...CaseInsensiti
0x00000060  7665 4469 6374 9493 9429 8194 7d94 8c06  veDict...)..}...
0x00000070  5f73 746f 7265 948c 0b63 6f6c 6c65 6374  _store...collect
0x00000080  696f 6e73 948c 0b4f 7264 6572 6564 4469  ions...OrderedDi
0x00000090  6374 9493 9429 5294 288c 0a75 7365 722d  ct...)R.(..user-
0x000000a0  6167 656e 7494 8c0a 5573 6572 2d41 6765  agent...User-Age
0x000000b0  6e74 948c 1670 7974 686f 6e2d 7265 7175  nt...python-requ
0x000000c0  6573 7473 2f32 2e32 382e 3294 8694 8c0f  ests/2.28.2.....
0x000000d0  6163 6365 7074 2d65 6e63 6f64 696e 6794  accept-encoding.
0x000000e0  8c0f 4163 6365 7074 2d45 6e63 6f64 696e  ..Accept-Encodin
0x000000f0  6794 8c0d 677a 6970 2c20 6465 666c 6174  g...gzip, deflat
[0x00000000]> pd
            0x00000000      8004           proto 0x4
            0x00000002      95bf05000000.  frame 0x5bf
            0x0000000b      8c1172657175.  short_binunicode "requests.sessions" ; 0xd
            0x0000001e      94             memoize
            0x0000001f      8c0753657373.  short_binunicode "Session"  ; 0x21 ; 2'!'
            0x00000028      94             memoize
            0x00000029      93             stack_global
            0x0000002a      94             memoize
            0x0000002b      29             empty_tuple
            0x0000002c      81             newobj
            0x0000002d      94             memoize
            0x0000002e      7d             empty_dict
            0x0000002f      94             memoize
            0x00000030      28             mark
            0x00000031      8c0768656164.  short_binunicode "headers"  ; 0x33 ; 2'3'
            0x0000003a      94             memoize
            0x0000003b      8c1372657175.  short_binunicode "requests.structures" ; 0x3d ; 2'='
            0x00000050      94             memoize
            0x00000051      8c1343617365.  short_binunicode "CaseInsensitiveDict" ; 0x53 ; 2'S'
            0x00000066      94             memoize
            0x00000067      93             stack_global

From the above assembly it appears this file is indeed a pickle. We also see requests.sessions and Session as strings. This pickle likely imports requests and uses sessions. Let's decompile it. We will run the command pdPf @0 ~.. which takes some explaining, since it uses a couple of r2's features.

  • pdPf - R2pickledec uses the pdP command (see pdP?). Adding an f causes the decompiler to set r2 flags for every variable name. This will make renaming variables and jumping to interesting locations easier.

  • @0 - This tells r2 to run the command at offset 0 instead of the current seek address. This does not matter now because our current offset defaults to 0, but I make this a habit in general to prevent mistakes when I am seeking around to patch something.
  • ~.. - This is the r2 version of |less. It uses r2's built-in pager. If you like the real less better, you can just use |less. R2 commands can be piped to any command line program.

Once we execute the command, we will see a Python-like source representation of the pickle. The code is seen below, but snipped. All comments below were added by the decompiler.

## VM stack start, len 1
## VM[0] TOP
str_xb = "__main__"
str_x16 = "Api"
g_Api_x1c = _find_class(str_xb, str_x16)
str_x24 = "session"
str_x2e = "requests.sessions"
str_x42 = "Session"
g_Session_x4c = _find_class(str_x2e, str_x42)
str_x54 = "headers"
str_x5e = "requests.structures"
str_x74 = "CaseInsensitiveDict"
g_CaseInsensitiveDict_x8a = _find_class(str_x5e, str_x74)
str_x91 = "_store"
str_x9a = "collections"
str_xa8 = "OrderedDict"
g_OrderedDict_xb6 = _find_class(str_x9a, str_xa8)
str_xbc = "user-agent"
str_xc9 = "User-Agent"
str_xd6 = "python-requests/2.28.2"
tup_xef = (str_xc9, str_xd6)
str_xf1 = "accept-encoding"
...
str_x5c9 = "stream"
str_x5d3 = "trust_env"
str_x5e0 = "max_redirects"
dict_x51 = {
        str_x54: what_x16c,
        str_x16d: what_x30d,
        str_x30e: tup_x32f,
        str_x331: dict_x33b,
        str_x33d: dict_x345,
        str_x355: dict_x35e,
        str_x360: True,
        str_x36a: None,
        str_x372: what_x5c8,
        str_x5c9: False,
        str_x5d3: True,
        str_x5e0: 30
}
what_x5f3 = g_Session_x4c.__new__(g_Session_x4c, *())
what_x5f3.__setstate__(dict_x51)
str_x5f4 = "baseurl"
str_x5fe = "https://example.com/"
dict_x21 = {str_x24: what_x5f3, str_x5f4: str_x5fe}
what_x616 = g_Api_x1c.__new__(g_Api_x1c, *())
what_x616.__setstate__(dict_x21)
return what_x616

It’s usually best to start reversing at the end with the return line. That is what is being returned from the pickle. Hit G to go to the end of the file. You will see the following code.

str_x5f4 = "baseurl"
str_x5fe = "https://example.com/"
dict_x21 = {str_x24: what_x5f3, str_x5f4: str_x5fe}
what_x616 = g_Api_x1c.__new__(g_Api_x1c, *())
what_x616.__setstate__(dict_x21)
return what_x616

The what_x616 variable is getting returned. The what prefix indicates that the decompiler does not know what type of object this is, because what_x616 is the result of a g_Api_x1c.__new__ call. On the other hand, g_Api_x1c gets a g_ prefix. The decompiler knows this is a global, since it comes from an import. It even adds the Api part to hint at what the import is. The x1c and x616 suffixes indicate the offsets in the pickle where the objects were created. We will use that later to patch the pickle.

Since we used flags, we can easily rename variables by renaming the flags. It might be helpful to rename g_Api_x1c to make it easier to search for. Rename the flag with fr pick.g_Api_x1c pick.api. Notice that the flag name will tab complete. List all flags with the f command, and see f? for help.

Now run pdP @0 ~.. again. Instead of g_Api_x1c you will see api. If you search for its first use, you will find the code below.

str_xb = "__main__"
str_x16 = "Api"
api = _find_class(str_xb, str_x16)
str_x24 = "session"
str_x2e = "requests.sessions"
str_x42 = "Session"
g_Session_x4c = _find_class(str_x2e, str_x42)

Roughly speaking, _find_class(module, name) is equivalent to _getattribute(sys.modules[module], name)[0]. We can see the module is __main__ and the name is Api. So the api variable is just __main__.Api.
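If that equivalence is hard to picture, here is a rough sketch of the behavior in plain Python (a simplified, hypothetical helper; real unpicklers also apply class-restriction hooks and protocol compatibility mappings):

```python
import importlib

def find_class(module, name):
    # Import the module, then resolve the (possibly dotted) attribute
    # path off it, as protocol 4+ allows names like "Outer.Inner".
    obj = importlib.import_module(module)
    for attr in name.split("."):
        obj = getattr(obj, attr)
    return obj
```

For example, find_class("collections", "OrderedDict") returns the OrderedDict class itself, ready to be called or instantiated by later opcodes.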

In this snippet of code, we see the request session being imported. You may have noticed the baseurl field in the previous snippet of code. Looks like this object contains a session for making backend API requests. Can we steal something good from it? Googling for “requests session basic authentication” turns up the auth attribute. Let’s look for “auth” in our pickle.

str_x30e = "auth"
str_x315 = "admin"
str_x31d = "Pickles are fun"
tup_x32f = (str_x315, str_x31d)
str_x331 = "proxies"
dict_x33b = {}
...
dict_x51 = {
        str_x54: what_x16c,
        str_x16d: what_x30d,
        str_x30e: tup_x32f,
        str_x331: dict_x33b,
        str_x33d: dict_x345,
        str_x355: dict_x35e,
        str_x360: True,
        str_x36a: None,
        str_x372: what_x5c8,
        str_x5c9: False,
        str_x5d3: True,
        str_x5e0: 30
}

It might be helpful to rename variables for understanding, or run pdP > /tmp/pickle_source.py to get a .py file to open in your favorite text editor. In short though, the above code sets up the dictionary dict_x51 where the auth element is set to the tuple ("admin", "Pickles are fun").

We just stole the admin credentials!

Patching

Now, I don't recommend doing this on a real pentest, but let's take things further. We can patch the pickle to use our own malicious webserver. We first need to find the current URL, so we search for “https” and find the following code.

str_x5f4 = "baseurl"
str_x5fe = "https://example.com/"
dict_x21 = {str_x24: what_x5f3, str_x5f4: str_x5fe}
what_x616 = api.__new__(api, *())

So the baseurl of the API is being set to https://example.com/. To patch this, we seek to where the URL string is created. We can use the x5fe in the variable name to know where the variable was created, or we can just seek to the pick.str_x5fe flag. When seeking to a flag in r2, you can tab complete the flag name. Notice that the prompt changes its location number after the seek command.

[0x00000000]> s pick.str_x5fe
[0x000005fe]> pd 1
            ;-- pick.str_x5fe:
            0x000005fe      8c1468747470.  short_binunicode "https://example.com/" ; 0x600

Let’s overwrite this URL with https://doyensec.com/. The below Radare2 commands are commented so you can understand what they are doing.

[0x000005fe]> oo+ # reopen file in read/write mode
[0x000005fe]> pd 3 # double check what next instructions should be
            ;-- pick.str_x5fe:
            0x000005fe      8c1468747470.  short_binunicode "https://example.com/" ; 0x600
            0x00000614      94             memoize
            0x00000615      75             setitems
[0x000005fe]> r+ 1 # add one extra byte to the file, since our new URL is slightly longer
[0x000005fe]> wa short_binunicode "https://doyensec.com/"
INFO: Written 23 byte(s) (short_binunicode "https://doyensec.com/") = wx 8c1568747470733a2f2f646f79656e7365632e636f6d2f @ 0x000005fe
[0x000005fe]> pd 3     # double check we did not clobber an instruction
            ;-- pick.str_x5fe:
            0x000005fe      8c1568747470.  short_binunicode "https://doyensec.com/" ; 0x600
            0x00000615      94             memoize
            ;-- pick.what_x616:
            0x00000616      75             setitems
[0x000005fe]> pdP @0 |tail      # check that the patch worked
        str_x5e0: 30
}
what_x5f3 = g_Session_x4c.__new__(g_Session_x4c, *())
what_x5f3.__setstate__(dict_x51)
str_x5f4 = "baseurl"
str_x5fe = "https://doyensec.com/"
dict_x21 = {str_x24: what_x5f3, str_x5f4: str_x5fe}
what_x617 = g_Api_x1c.__new__(g_Api_x1c, *())
what_x617.__setstate__(dict_x21)
return what_x617
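The same kind of patch can be reproduced in pure Python. Below is a sketch on a freshly built toy pickle (the baseurl dict stands in for the real file): it rewrites a short_binunicode payload and, since the string grows by one byte, also fixes the frame length that protocol 4 records at the start of the stream:

```python
import pickle
import struct

# A toy protocol 4 pickle standing in for the blog's file.
blob = bytearray(pickle.dumps({"baseurl": "https://example.com/"}, protocol=4))

old = b"https://example.com/"
new = b"https://doyensec.com/"

# short_binunicode is opcode 0x8c followed by a one-byte length.
i = blob.find(bytes([0x8c, len(old)]) + old)
blob[i:i + 2 + len(old)] = bytes([0x8c, len(new)]) + new

# Protocol 4 wraps the body in a frame (opcode 0x95 + 8-byte little-endian
# length); grow the recorded length so the patched pickle stays consistent.
assert blob[2] == 0x95
frame_len = struct.unpack_from("<Q", blob, 3)[0]
struct.pack_into("<Q", blob, 3, frame_len + len(new) - len(old))

assert pickle.loads(bytes(blob))["baseurl"] == "https://doyensec.com/"
```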

JSON and Automation

Imagine this is just the first of 100 files and you want to patch them all. Radare2 is easy to script with r2pipe. Most commands in r2 have a JSON variant by adding a j to the end. In this case, pdPj will produce an AST in JSON. This is complete with offsets. Using this you can write a parser that will automatically find the baseurl element of the returned api object, get the offset and patch it.
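A small sketch of that parser: the walker below assumes the node shape pdPj emits in this post (dicts carrying "offset", "type" and "value" keys) and collects every node of a given type; the r2pipe driving code is shown as comments since it needs r2 installed:

```python
def find_nodes(node, wanted_type):
    """Recursively collect AST nodes whose "type" field matches."""
    found = []
    if isinstance(node, dict):
        if node.get("type") == wanted_type:
            found.append(node)
        for child in node.values():
            found.extend(find_nodes(child, wanted_type))
    elif isinstance(node, list):
        for child in node:
            found.extend(find_nodes(child, wanted_type))
    return found

# Driving r2 itself would look roughly like this (pip install r2pipe):
#   import r2pipe
#   r2 = r2pipe.open("test.pickle", flags=["-a", "pickle"])
#   ast = r2.cmdj("pdPj @0")
#   for node in find_nodes(ast, "PY_STR"):
#       print(node["offset"], node["value"])
```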

JSON can also be helpful without r2pipe. This is because r2 has a bunch of built-in features for dealing with JSON. For example, we can pretty print JSON with ~{}, but for this pickle it would produce 1492 lines of JSON. So better yet, use r2’s internal gron output with ~{=} and grep for what you want.

[0x000005fe]> pdPj @0 ~{=}https
json.stack[0].value[1].args[0].value[0][1].value[1].args[0].value[1][1].value[1].args[0].value[0][1].value[1].args[0].value[10][1].value[0].value = "https";
json.stack[0].value[1].args[0].value[0][1].value[1].args[0].value[8][1].value[1].args[0].value = "https://";
json.stack[0].value[1].args[0].value[1][1].value = "https://doyensec.com/";

Now we can use the JSON path provided by the gron output to find the offset of the doyensec.com URL.

[0x00000000]> pdPj @0 ~{stack[0].value[1].args[0].value[1][1].value}
https://doyensec.com/
[0x00000000]> pdPj @0 ~{stack[0].value[1].args[0].value[1][1]}
{"offset":1534,"type":"PY_STR","value":"https://doyensec.com/"}
[0x00000000]> pdPj @0 ~{stack[0].value[1].args[0].value[1][1].offset}
1534
[0x00000000]> s `pdPj @0 ~{stack[0].value[1].args[0].value[1][1].offset}` ## seek to address using subcomand
[0x000005fe]> pd 1
            ;-- pick.str_x5fe:
            0x000005fe      8c1568747470.  short_binunicode "https://doyensec.com/" ; 0x600

Don't forget you can pipe to external commands. For example, pdPj |jq can be used to search the AST for arbitrary patterns, such as returning all objects whose type is PY_GLOBAL.
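A possible jq query for that: the recursive descent operator (..) visits every node and select() keeps objects whose "type" field is PY_GLOBAL. A trimmed, hand-written sample of the node shape pdPj emits stands in for a real dump here; against the real file you would pipe pdPj output instead:

```shell
# against the real file: r2 -a pickle -qqc 'pdPj @0' test.pickle | jq ...
echo '{"stack":[{"offset":41,"type":"PY_GLOBAL","value":"requests.sessions Session"}]}' \
  | jq '[.. | objects | select(.type? == "PY_GLOBAL")]'
```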

Conclusion

The r2pickledec plugin simplifies reversing pickles. Because it is an r2 plugin, you get all the features of r2 for free. We barely scratched the surface of what r2 can do. If you'd like to learn more, check out the r2 book. Be sure to keep an eye out for my next post, where I will go into Python pickle obfuscation techniques.


Testing Zero Touch Production Platforms and Safe Proxies

As more companies develop in-house services and tools to moderate access to production environments, the importance of understanding and testing these Zero Touch Production (ZTP) platforms grows [1] [2]. This blog post aims to provide an overview of ZTP tools and services, explore their security role in DevSecOps, and outline common pitfalls to watch out for when testing them.

SRE? ZTP?

“Every change in production must be either made by automation, prevalidated by software or made via audited break-glass mechanism.” – Seth Hettich, Former Production TL, Google

This terminology was popularized by Google's DevOps teams and remains the gold standard to this day. In this model, SREs are a select group of engineers who retain SSH production access so they can act when something breaks. But that access introduces reliability and security risks if they make a mistake or their accounts are compromised. To balance this risk, companies should automate the majority of production operations while providing routes for manual changes when necessary. This is the basic reasoning behind the “Zero Touch Production” pattern.

The way we all feel about SRE

Safe Proxies In Production

The “Safe Proxy” model refers to the tools that allow authorized persons to access or modify the state of physical servers, virtual machines, or particular applications. From the original definition:

At Google, we enforce this behavior by restricting the target system to accept only calls from the proxy through a configuration. This configuration specifies which application-layer remote procedure calls (RPCs) can be executed by which client roles through access control lists (ACLs). After checking the access permissions, the proxy sends the request to be executed via the RPC to the target systems. Typically, each target system has an application-layer program that receives the request and executes it directly on the system. The proxy logs all requests and commands issued by the systems it interacts with.

The safety & security roles of Safe Proxies

There are various outage scenarios prevented by ZTP (e.g., typos, cut/paste errors, wrong terminals, underestimating the blast radius of impacted machines, etc.). On paper, it's a great way to protect production from human errors affecting availability, but it can also help prevent some forms of malicious access. A typical scenario involves an SRE who is compromised or malicious and tries to do what an attacker would do with those privileges. This could include bringing down or attacking other machines, compromising secrets, or scraping user data programmatically. This is why testing these services will become more and more important, as attackers will find them valuable and target them.

Generic scheme about Safe Proxies

What does ZTP look like today

Many companies nowadays need these safe proxy tools to realize their vision, but they are all trying to reinvent the wheel in one way or another, because this is an immature market with no off-the-shelf solutions. During development, the security team is often included in the steering committee but may lack the domain-specific knowledge needed to build similar solutions. Another issue is that since the main driver is usually the DevOps team wanting operational safety, availability and integrity are prioritized at the expense of confidentiality. In reality, the ZTP framework development team should collaborate with SRE and security teams throughout the design and implementation phases, ensuring that security and reliability best practices are woven into the fabric of the framework and not just bolted on at the end.

Last but not least, these solutions still suffer from low adoption rates and are subject to lax interpretations (to the point where developers are the ones using these systems to access what they're allowed to touch in production). These services are particularly juicy for both pentesters and attackers. It's no exaggeration to say that every actor compromising a box in a corporate environment should first look at these services to escalate their access.

What to look for when auditing ZTP tools/services

We compiled some of the most common issues we’ve encountered while testing ZTP implementations below:

A. Web Attack Surface

ZTP services often expose a web-based frontend for various purposes such as monitoring, proposing commands or jobs, and checking command output. These frontends are prime targets for classic web security vulnerabilities like Cross-Site Request Forgery (CSRF), Server-Side Request Forgery (SSRF), Insecure Direct Object References (IDORs), XML External Entity (XXE) attacks, and Cross-Origin Resource Sharing (CORS) misconfigurations. If the frontend is also used for command moderation, it presents an even more interesting attack surface.

B. Hooks

Webhooks are widely used in ZTP platforms due to their interaction with team members and on-call engineers. These hooks are crucial for the command approval flow ceremony and for monitoring. Attackers may try to manipulate or suppress any Pagerduty, Slack, or Microsoft Teams bot/hook notifications. Issues to look for include content spoofing, webhook authentication weaknesses, and replay attacks.
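To make the last two issues concrete, a sound approval hook verifies a signature over the exact payload and rejects stale timestamps. Below is a minimal sketch in the style of Slack's v0 request signing (the secret and max_age values are illustrative):

```python
import hashlib
import hmac
import time

def verify_webhook(secret: bytes, timestamp: str, body: str,
                   signature: str, max_age: int = 300) -> bool:
    # Replay defense: refuse requests whose timestamp is too old.
    if abs(time.time() - int(timestamp)) > max_age:
        return False
    # Slack-style base string: version, timestamp, and the raw body.
    base = f"v0:{timestamp}:{body}".encode()
    expected = "v0=" + hmac.new(secret, base, hashlib.sha256).hexdigest()
    # Constant-time compare prevents byte-by-byte timing oracles.
    return hmac.compare_digest(expected, signature)
```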

C. Safe Centralization

Safety checks in ZTP platforms are usually evaluated centrally. A portion of the solution is often hosted independently for availability, to evaluate the rules set by the SRE team. It’s essential to assess the security of the core service, as exploiting or polluting its visibility can affect the entire infrastructure’s availability (what if the service is down? who can access this service?).

In a hypothetical attack scenario, if a rule is set to only allow reboots of a certain percentage of the fleet, can an attacker pollute the fleet status and make the hosts look alive? This can be achieved with ping reply spoofing or via MITM in the case of plain HTTP health endpoints. Under these premises, network communications must be Zero Trust too in order to defend against this.

D. Insecure Default Templates

The templates for the policy configuration managing the access control for services are usually provided to service owners. These can be a source of errors themselves. Users should be guided to make the right choices by providing templates or automatically generating settings that are secure by default. For a full list of the design strategies presented, see the “Building Secure and Reliable Systems” bible [3].

E. Logging

Inconsistent or excessive logging retention of command outputs can be hazardous. Attackers might abuse discrepancies in logging retention to access user data or secrets logged in a given command or its results.

F. Rate-limiting

Proper rate-limiting configuration is essential to ensure an attacker cannot change all production “at once” by themselves. The rate limiting configuration should be agreed upon with the team responsible for the mediated services.

G. ACL Ownership

Another pitfall is found in what provides the ownership or permission logic for the services. If SREs can edit membership data via the same ZTP service or via other means, an attacker can do the same and bypass the solution entirely.


H. Command Safeguards

Strict allowlists of parameters and configurations should be defined for commands or jobs that can be run. Similar to “living off the land binaries” (lolbins), if arguments to these commands are not properly vetted, there’s an increased risk of abuse.

I. Traceability and Scoping

A reason for the pushed command must always be requested by the user (who, when, what, WHY). Ensuring traceability and scoping in the ZTP platform helps maintain a clear understanding of actions taken and their justifications.

J. Scoped Access

The ZTP platform should have rules in place to detect not only if the user is authorized to access user data, but also which kind and at what scale. Lack of fine-grained authorization or scoping rules for querying user data increases the risk of abuse.

K. Different Interfaces, Different Requirements

ZTP platforms usually have two types of proxy interfaces: Remote Procedure Call (RPC) and Command Line Interface (CLI). The RPC proxy is used to run CLI on behalf of the user/service in production in a controlled way. Since the implementation varies between the two interfaces, looking for discrepancies in the access requirements or logic is crucial.

L. Service vs Global rules

The rule evaluation priority (Global over Service-specific) is another area of concern. In general, service rules should not be able to override global rules but only set stricter requirements.

M. Command Parsing

If an allowlist is enforced, inspect how the command is parsed when an allowlist is created (abstract syntax tree (AST), regex, binary match, etc.).
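A small hypothetical allowlist check shows why the parsing strategy matters. The regex version below looks anchored but re.match only anchors at the start of the string, so a command with trailing shell metacharacters slips through; comparing argv tokens does not (the command names are made up for illustration):

```python
import re
import shlex

# Naive regex allowlist: re.match accepts any string that merely
# STARTS with the pattern, so this is a prefix check, not a full match.
ALLOWED = re.compile(r"systemctl restart [\w.-]+")

def regex_check(cmd):
    return ALLOWED.match(cmd) is not None

assert regex_check("systemctl restart nginx; rm -rf /")  # accepted!

# Safer: tokenize into argv and compare each field exactly.
def argv_check(cmd):
    argv = shlex.split(cmd)
    return (len(argv) == 3 and argv[0] == "systemctl"
            and argv[1] == "restart"
            and re.fullmatch(r"[\w.-]+", argv[2]) is not None)

assert not argv_check("systemctl restart nginx; rm -rf /")
assert argv_check("systemctl restart nginx")
```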

N. Race Conditions

All operations should be queued, and a global queue for the commands should be respected. There should be no chance of race conditions if two concurrent operations are issued.

O. Break-glass

In the ZTP pattern, a break-glass mechanism is always available for emergency response. Auditing this mode is essential: entering it must be loud and justified, must alert security, and must be heavily logged. As an additional security measure, the break-glass mechanism for zero trust networking should be available only from specific locations: the organization's panic rooms, specific locations with additional physical access controls to offset the increased trust placed in their connectivity.

Conclusions

As more companies develop and adopt Zero Touch Production platforms, it is crucial to understand and test these services for security vulnerabilities. With an increase in vendors and solutions for Zero Touch Production in the coming years, researching and staying informed about these platforms’ security issues is an excellent opportunity for security professionals.

References

  1. Michał Czapiński and Rainer Wolafka (Google Switzerland), “Zero Touch Prod: Towards Safer and More Secure Production Environments”. USENIX (2019). Link / Talk 

  2. Ward, Rory, and Betsy Beyer. “BeyondCorp: A New Approach to Enterprise Security”, (2014). Link 

  3. Adkins, Heather, et al. “Building Secure and Reliable Systems: Best Practices for Designing, Implementing, and Maintaining Systems”. O'Reilly Media, (2020). Link 


The Case For Improving Crypto Wallet Security

Anatomy Of A Modern Day Crypto Scam

A large number of today’s crypto scams involve some sort of phishing attack, where the user is tricked into visiting a shady/malicious web site and connecting their wallet to it. The main goal is to trick the user into signing a transaction which will ultimately give the attacker control over the user’s tokens.

Usually, it all starts with a tweet or a post on some Telegram group or Slack channel, where a link is sent advertising either a new yield farming protocol boasting large APYs, or a new NFT project which just started minting. In order to interact with the web site, the user would need to connect their wallet and perform some confirmation or authorization steps.

Let’s take a look at the common NFT approve scam. The user is led to the malicious NFT site, advertising a limited pre-mint of a new NFT collection. The user is then prompted to connect their wallet and sign a transaction confirming the mint. However, for some reason, the transaction fails. The same happens on the next attempt. With each failed attempt, the user becomes more and more frustrated, fearing the failures will cause them to miss out on the mint. Their focus shifts from scrutinizing the transactions to worrying about missing a great opportunity.

At this point, the phishing is in full swing. A few more failed attempts, and the victim bites.

Wallet Phishing

(Image borrowed from How scammers manipulate Smart Contracts to steal and how to avoid it)

The final transaction, instead of the mint function, calls the setApprovalForAll, which essentially will give the malicious actor control over the user’s tokens. The user by this point is in a state where they blindly confirm transactions, hoping that the minting will not close.

Unfortunately, the last transaction is the one that goes through. Game over for the victim. All the attacker has to do now is act quickly and transfer the tokens away from the user’s wallet before the victim realizes what happened.

These types of attacks are really common today. A user stumbles on a link to a project offering new opportunities for profits, they connect their wallet, and mistakenly hand over their tokens to malicious actors. While a case can be made for user education, responsibility, and researching a project before interacting with it, we believe that software also has a big part to play.

The Case For Improving Crypto Wallet Security

Nobody can deny that the introduction of blockchain-based technologies and Web3 has had a massive impact on the world. Many of the platforms built on them offer the following common set of features:

  • transfer of funds
  • permission-less currency exchange
  • decentralized governance
  • digital collectibles

Regardless of the tech-stack used to build these platforms, it’s ultimately the users who make the platform. This means that users need a way to interact with their platform of choice. Today, the most user-friendly way of interacting with blockchain-based platforms is by using a crypto wallet. In simple terms, a crypto wallet is a piece of software which facilitates signing of blockchain transactions using the user’s private key. There are multiple types of wallets including software, hardware, custodial, and non-custodial. For the purposes of this post, we will focus on software based wallets.

Before continuing, let’s take a short detour to Web2. In that world, we can say that platforms (also called services, portals or servers) are primarily built using TCP/IP based technologies. In order for users to be able to interact with them, they use a user-agent, also known as a web browser. With that said, we can make the following parallel to Web3:

Technology   Communication Protocol   User-Agent
Web2         HTTP/TLS                 Web Browser
Web3         Blockchain JSON RPC      Crypto Wallet

Web browsers are arguably much, much more complex pieces of software compared to crypto wallets - and with good reason. As the Internet developed, people figured out how to put different media on it and web pages allowed for dynamic and scriptable content. Over time, advancements in HTML and CSS technologies changed what and how content could be shown on a single page. The Internet became a place where people went to socialize, find entertainment, and make purchases. Browsers needed to evolve, to support new technological advancements, which in turn increased complexity. As with all software, complexity is the enemy, and complexity is where bugs and vulnerabilities are born. Browsers needed to implement controls to help mitigate web-based vulnerabilities such as spoofing, XSS, and DNS rebinding while still helping to facilitate secure communication via encrypted TLS connections.

Next, let's see what a typical crypto wallet interaction for a normal user might look like.

The Current State Of Things In The Web3 World

Using a Web3 platform today usually means that a user is interacting with a web application (Dapp), which contains code to interact with the user’s wallet and smart contracts belonging to the platform. The steps in that communication flow generally look like:

1. Open the Dapp

In most cases, the user will navigate their web browser to a URL where the Dapp is hosted (ex. Uniswap). This will load the web page containing the Dapp’s code. Once loaded, the Dapp will try to connect to the user’s wallet.

Dapp and User Wallet

2. Authorizing The Dapp

Crypto wallets implement several protections, including requiring authorization before a Dapp can access the user's accounts and requiring that transactions be explicitly signed. This was not the case before EIP-1102. Implementing these features helped keep users anonymous, stop Dapp spam, and provide a way for the user to manage trusted and untrusted Dapp domains.

Authorizing The Dapp 1

If all the previous steps were completed successfully, the user can start using the Dapp.

When the user decides to perform an action (make a transaction, buy an NFT, stake their tokens, etc.), the user’s wallet will display a popup, asking whether the user confirms the action. The transaction parameters are generated by the Dapp and forwarded to the wallet. If confirmed, the transaction will be signed and published to the blockchain, awaiting confirmation.

Authorizing The Dapp 2

Besides the authorization popup when initially connecting to the Dapp, the user is not shown much additional information about the application or the platform. This ultimately burdens the user with verifying the legitimacy and trustworthiness of the Dapp, which unfortunately requires a degree of technical knowledge often out of reach for the majority of common users. While doing your own research, a common mantra of the Web3 world, is recommended, one misstep can lead to a significant loss of funds.

That being said, let’s now take another detour to Web2 world, and see what a similar interaction looks like.

How Does The Web2 World Handle Similar Situations?

Like the previous example, we’ll look at what happens when a user wants to use a Web2 application. Let’s say that the user wants to check their email inbox. They’ll start by navigating their browser to the email domain (ex. Gmail). In the background, the browser performs a TLS handshake, trying to establish a secure connection to Gmail’s servers. This will enable an encrypted channel between the user’s browser and Gmail’s servers, eliminating the possibility of any eavesdropping. If the handshake is successful, an encrypted connection is established and communicated to the user through the browser’s UI.

Browser Lock

The secure connection is based on certificates issued for the domain the user is trying to access. A certificate contains a public key used to establish the encrypted connection. Additionally, certificates must be signed by a trusted third-party called a Certificate Authority (CA), giving the issued certificate legitimacy and guaranteeing that it belongs to the domain being accessed.

But, what happens if that is not the case? What happens when the certificate is for some reason rejected by the browser? Well, in that case a massive red warning is shown, explaining what happened to the user.

Browser Certificate Error

Such warnings will be shown when a secure connection could not be established, the certificate of the host is not trusted or if the certificate is expired. The browser also tries to show, in a human-readable manner, as much useful information about the error as possible. At this point, it’s the choice of the user whether they trust the site and want to continue interacting with it. The task of the browser is to inform the user of potential issues.

What Can Be Done?

Crypto wallets should show the user as much information about the action being performed as possible. The user should see information about the domain/Dapp they are interacting with. Details about the actual transaction’s content, such as what function is being invoked and its parameters should be displayed in a user-readable fashion.

Comparing the two previous examples, we can notice a lack of verification and information displayed in crypto wallets today. This poses the question: what can be done? There are a number of publicly available indicators of a project's health and legitimacy. We believe communicating these to the user may be a good step forward in addressing this issue. Let's quickly go through them.

Proof Of Smart Contract Ownership

It is important to prove that a domain has ownership of the smart contracts with which it interacts. Currently, such a mechanism doesn't seem to exist. However, we think we have a good solution. Similarly to how Apple performs merchant domain verification, a simple JSON file, or dapp_file, can be used to verify ownership. The file can be stored at the root of the Dapp's domain, at the path .well-known/dapp_file. The JSON file can contain the following information:

  • address of the smart contract the Dapp is interacting with
  • timestamp showing when the file was generated
  • signature of the content, verifying the validity of the file

At this point, a reader might say: “How does this show ownership of the contract?”. The key to that is the signature. Namely, the signature is generated by using the private key of the account which deployed the contract. The transparency of the blockchain can be used to get the deployer address, which can then be used to verify the signature (similarly to how Ethereum smart contracts verify signatures on-chain).

This mechanism enables creating an explicit association between a smart contract and the Dapp. The association can later be used to perform additional verification.
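
As an illustrative sketch of this flow (not part of the original proposal's wire format), signing and verification could look like the following, where an HMAC keyed on a deployer secret stands in for the real secp256k1 ECDSA sign-and-recover scheme used on Ethereum:

```python
import hashlib
import hmac
import json

def canonical_payload(dapp_file: dict) -> bytes:
    # Serialize every field except the signature in a deterministic way
    body = {k: v for k, v in dapp_file.items() if k != "signature"}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()

def sign_dapp_file(dapp_file: dict, deployer_key: bytes) -> dict:
    # Stand-in for signing with the contract deployer's private key
    sig = hmac.new(deployer_key, canonical_payload(dapp_file), hashlib.sha256).hexdigest()
    return {**dapp_file, "signature": sig}

def verify_dapp_file(dapp_file: dict, deployer_key: bytes) -> bool:
    # Stand-in for recovering the signer from the signature and comparing
    # it to the deployer address obtained from the blockchain
    expected = hmac.new(deployer_key, canonical_payload(dapp_file), hashlib.sha256).hexdigest()
    return hmac.compare_digest(dapp_file.get("signature", ""), expected)
```

Any tampering with the address or timestamp invalidates the signature, which is exactly the property the dapp_file relies on.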

Domain Registration Records

When a new domain is purchased or registered, a public record is created with a public registrar, indicating the domain is owned by someone and is no longer available for purchase. The domain name is used by the Domain Name System, or DNS, which translates it (ex. www.doyensec.com) to a machine-friendly IP address (ex. 34.210.62.107).

The creation date of a DNS record shows when the Dapp’s domain was initially purchased. So, if a user is trying to interact with an already long established project and runs into a domain which claims to be that project with a recently created domain registration record, it may be a signal of possible fraudulent activities.
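
A wallet could implement this heuristic with a simple age check; the 180-day threshold below is an arbitrary value chosen for illustration, not a recommendation:

```python
from datetime import datetime, timedelta

def domain_age_warning(created: str, now: str, min_age_days: int = 180) -> bool:
    """Return True when the DNS registration record is suspiciously recent."""
    created_dt = datetime.fromisoformat(created.replace("Z", "+00:00"))
    now_dt = datetime.fromisoformat(now.replace("Z", "+00:00"))
    return (now_dt - created_dt) < timedelta(days=min_age_days)
```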

TLS Certificates

Creation and expiration dates of TLS certificates can be viewed in a similar fashion as DNS records. However, due to the short duration of certificates issued by services such as Let’s Encrypt, there is a strong chance that visitors of the Dapp will be shown a relatively new certificate.

TLS certificates, however, can be viewed as a way of verifying a correct web site setup where the owner took additional steps to allow secure communication between the user and their application.

Smart Contract Source Code Verification Status

Published and verified source code allows for audits of the smart contract’s functionality and can allow quick identification of malicious activity.

Smart Contract Deployment Date

The smart contract’s deployment date can provide additional information about the project. For example, if attackers set up a fake Uniswap web site, the likelihood of the malicious smart contract being recently deployed is high. If interacting with an already established, legitimate project, such a discrepancy should alarm the user of potential malicious activity.

Smart Contract Interactions

The trustworthiness of a project can be seen as a function of the number of interactions with that project's smart contracts. A healthy project with a large user base will likely have a large number of unique interactions with the project's contracts. A small number of interactions, unique or not, suggests the opposite. While typical of a new project, it can also be an indicator of smart contracts set up to impersonate a legitimate project. Such smart contracts will not have the large user base of the original project, and thus the number of interactions with them will be low.

Overall, a large number of unique interactions over a long period of time with a smart contract may be a powerful indicator of a project’s health and the health of its ecosystem.
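
As a toy illustration of this idea (the thresholds are invented for the example, not taken from any real wallet):

```python
def interaction_score(total_tx: int, unique_tx: int) -> str:
    # Invented thresholds, purely for illustration: a large base of
    # unique interacting addresses signals an established project.
    if unique_tx >= 1000 and total_tx >= 10000:
        return "established"
    if unique_tx >= 100:
        return "growing"
    return "low-activity"
```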

Our Suggestion

While there are authorization steps implemented when a wallet is connecting to an unknown domain, we think there is space for improvement. The connection and transaction signing process can be further updated to show user-readable information about the domain/Dapp being accessed.

As a proof-of-concept, we implemented a simple web service https://github.com/doyensec/wallet-info. The service utilizes public information, such as domain registration records, TLS certificate information and data available via Etherscan’s API. The data is retrieved, validated, parsed and returned to the caller.

The service provides access to the following endpoints:

  • /host?url=<url>
  • /contract?address=<address>

The data these endpoints return can be integrated in crypto wallets at two points in the user’s interaction.

Initial Dapp Access

The /host endpoint can be used when the user is initially connecting to a Dapp. The Dapp’s URL should be passed as a parameter to the endpoint. The service will use the supplied URL to gather information about the web site and its configuration. Additionally, the service will check for the presence of the dapp_file on the site’s root and verify its signature. Once processing is finished, the service will respond with:

{
  "name": "Example Dapp",
  "timestamp": "2000-01-01T00:00:00Z",
  "domain": "app.example.com",
  "tls": true,
  "tls_issued_on": "2022-01-01T00:00:00Z",
  "tls_expires_on": "2022-01-01T00:00:00Z",
  "dns_record_created": "2012-01-01T00:00:00Z",
  "dns_record_updated": "2022-08-14T00:01:31Z",
  "dns_record_expires": "2023-08-13T00:00:00Z",
  "dapp_file": true,
  "valid_signature": true
}

This information can be shown to the user in a dialog UI element, such as:

Wallet Dialog Safe

As a concrete example, let's take a look at a fake Uniswap site that was active during the writing of this post. If a user tried to connect their wallet to the Dapp running on the site, the following information would be returned to the user:

{
  "name": null,
  "timestamp": null,
  "domain": "apply-uniswap.com",
  "tls": true,
  "tls_issued_on": "2023-02-06T22:37:19Z",
  "tls_expires_on": "2023-05-07T22:37:18Z",
  "dns_record_created": "2023-02-06T23:31:09Z",
  "dns_record_updated": "2023-02-06T23:31:10Z",
  "dns_record_expires": "2024-02-06T23:31:09Z",
  "dapp_file": false,
  "valid_signature": false
}

The missing information in the response reflects the fact that the dapp_file was not found on this domain. This will then be surfaced in the UI, informing the user of potential issues with the Dapp:

Wallet Dialog Domain Unsafe

At this point, the user can review the information and decide whether they feel comfortable giving the Dapp access to their wallet. Once the Dapp is authorized, this information doesn't need to be shown anymore, though it would be beneficial to occasionally re-display it, so that any changes in the Dapp or its domain are communicated to the user.
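
A wallet consuming the /host response could derive its warnings with logic along these lines (a sketch; the field names follow the example responses above):

```python
def host_warnings(info: dict) -> list:
    """Turn a /host response into a list of user-facing warnings."""
    warnings = []
    if not info.get("tls"):
        warnings.append("no TLS on the Dapp's domain")
    if not info.get("dapp_file"):
        warnings.append("dapp_file not found - contract ownership unverified")
    elif not info.get("valid_signature"):
        warnings.append("dapp_file signature is invalid")
    return warnings
```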

Making A Transaction

Transactions can be split in two groups: transactions that transfer native tokens and transactions which are smart contract function calls. Based on the type of transaction being performed, the /contract endpoint can be used to retrieve information about the recipient of the transferred assets.

For our case, smart contract function calls are the more interesting group of transactions. The wallet can retrieve information about both the smart contract on which the function will be called and the function parameter representing the recipient - for example, the spender parameter in the approve(address spender, uint256 amount) function call. This information can be retrieved on a case-by-case basis, depending on the function call being performed.

Signatures of widely used functions are publicly available and can be implemented in the wallet as a type of allow or safe list. If a signature is unknown, the user should be informed about it.
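
For instance, a wallet could keep a table keyed on the 4-byte function selector (the first four bytes of the keccak256 hash of the canonical signature) and flag anything it does not recognize. The selectors below are the well-known ones for these ERC-20/ERC-721 functions:

```python
KNOWN_SELECTORS = {
    "095ea7b3": "approve(address,uint256)",
    "a9059cbb": "transfer(address,uint256)",
    "a22cb465": "setApprovalForAll(address,bool)",
}

def describe_calldata(calldata: str) -> str:
    # The first 4 bytes of the calldata identify the function being called
    selector = calldata.lower().removeprefix("0x")[:8]
    return KNOWN_SELECTORS.get(selector, "UNKNOWN - warn the user")
```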

Verifying the recipient gives users confidence they are transferring tokens, or allowing access to their tokens for known, legitimate addresses.

An example response for a given address will look something like:

{
  "is_contract": true,
  "contract_address": "0xF4134146AF2d511Dd5EA8cDB1C4AC88C57D60404",
  "contract_deployer": "0x002362c343061fef2b99d1a8f7c6aeafe54061af",
  "contract_deployed_on": "2023-01-01T00:00:00Z",
  "contract_tx_count": 10,
  "contract_unique_tx": 5,
  "valid_signature": true,
  "verified_source": false
}

In the background, the web service will gather information about the type of address (EOA or smart contract), code verification status, address interaction information etc. All of that should be shown to the user as part of the transaction confirmation step.

Wallet Dialog Unsafe

Links to the smart contract and any additional information can be provided here, helping users perform additional verification if they so wish.

In the case of native token transfers, the majority of verification consists of typing in the correct to address. This is not a task well suited for automatic verification. For this use case, wallets provide "address book" functionality, which should be utilized to minimize user errors when initiating a transaction.

Conclusion

The point of this post is to highlight the shortcomings of today’s crypto wallet implementations, to present ideas, and make suggestions for how they can be improved. This field is actively being worked on. Recently, MetaMask updated their confirmation UI to display additional information, informing users of potential setApprovalForAll scams. This is a step in the right direction, but there is still a long way to go. Features like these can be built upon and augmented, to a point where users can make transactions and know, to a high level of certainty, that they are not making a mistake or being scammed.

There are also third-party groups like WalletGuard and ZenGo who have implemented verifications similar to those described in this post. These features should be standard and required for every crypto wallet, not just an additional piece of software that needs to be installed.

Like the user-agent of Web2, the web browser, user-agents of Web3 should do as much as possible to inform and protect their users.

Our implementation of the wallet-info web service is just an example of how public information can be pooled together. That information, combined with a good UI/UX design, will greatly improve the security of crypto wallets and, in turn, the security of the entire Web3 ecosystem.

Does Dapp verification completely solve the phishing/scam problem? Unfortunately, the answer is no. The proposed changes can help users in distinguishing between legitimate projects and potential scams, and guide them to make the right decision. Dedicated attackers, given enough time and funds, will always be able to produce a smart contract, Dapp or web site, which will look harmless using the indicators described above. This is true for both the Web2 and Web3 world.

Ultimately, it is up to the user to decide if they feel comfortable giving their login credentials to a web site, or access to their crypto wallet to a Dapp. All software can do is point them in the right direction.


Windows Installer EOP (CVE-2023-21800)

TL;DR: This blog post describes the details and methodology of our research targeting the Windows Installer (MSI) installation technology. If you're only interested in the vulnerability itself, then jump right there.

Introduction

Recently, I decided to research a single common aspect of many popular Windows applications - their MSI installer packages.

Not every application is distributed this way. Some applications implement custom bootstrapping mechanisms, some are just meant to be dropped on the disk. However, in a typical enterprise environment, some form of control over the installed packages is often desired. Using the MSI packages simplifies the installation process for any number of systems and also provides additional benefits such as automatic repair, easy patching, and compatibility with GPO. A good example is Google Chrome, which is typically distributed as a standalone executable, but an enterprise package is offered on a dedicated domain.

Another interesting aspect of enterprise environments is a need for strict control over employee accounts. In particular, in a well-secured Windows environment, the rule of least privileges ensures no administrative rights are given unless there’s a really good reason. This is bad news for malware or malicious attackers who would benefit from having additional privileges at hand.

During my research, I wanted to take a look at the security of popular MSI packages and understand whether they could be used by an attacker for any malicious purposes and, in particular, to elevate local privileges.

Typical installation

It’s very common for an MSI package to require administrative rights. As a result, running a malicious installer is a straightforward game-over. I wanted to look at legitimate, properly signed MSI packages instead. Asking someone to type in an admin password and then somehow opening an elevated cmd is also an option, but one I chose not to address in this blog post.

Let’s quickly look at how the installer files are generated. Turns out, there are several options to generate an MSI package. Some of the most popular ones are WiX Toolset, InstallShield, and Advanced Installer. The first one is free and open-source, but requires you to write dedicated XML files. The other two offer various sets of features, rich GUI interfaces, and customer support, but require an additional license. One could look for generic vulnerabilities in those products, however, it’s really hard to address all possible variations of offered features. On the other hand, it’s exactly where the actual bugs in the installation process might be introduced.

During the installation process, new files will be created. Some existing files might also be renamed or deleted. The access rights to various securable objects may be changed. The interesting question is what would happen if unexpected access rights are present. Would the installer fail or would it attempt to edit the permission lists? Most installers will also modify Windows registry keys, drop some shortcuts here and there, and finally log certain actions in the event log, database, or plain files.

The list of actions isn’t fixed. MSI packages may implement so-called custom actions, which are implemented in a dedicated DLL. If this is the case, it’s very reasonable to look for interesting bugs there.

Once we have an installer package ready and installed, we can often observe a new copy being cached in the C:\Windows\Installer directory. This is a hidden system directory where unprivileged users cannot write. The copies of the MSI packages are renamed to random names matching the following regular expression: ^[0-9a-z]{7}\.msi$. The name will be unique for every machine and even every new installation. To identify a specific package, we can look at file properties (but it’s up to the MSI creator to decide which properties are configured), search the Windows registry, or ask WMI:

$ Get-WmiObject -class Win32_Product | ? { $_.Name -like "*Chrome*" } | select IdentifyingNumber,Name

IdentifyingNumber                      Name
-----------------                      ----
{B460110D-ACBF-34F1-883C-CC985072AF9E} Google Chrome

Referring to the package via its GUID is our safest bet. However, different versions of the same product may still have different identifiers.

Assuming we’re using an unprivileged user account, is there anything interesting we can do with that knowledge?

Repair process

The built-in Windows tool msiexec.exe, located in the System32 and SysWOW64 directories, is used to manage MSI packages. The tool is a core component of Windows with a long history of vulnerabilities. As a side note, I also happen to have found one such issue in the past (CVE-2021-26415). The documented list of its options can be found on the MSDN page, although some additional undocumented switches are also implemented.

The flags worth highlighting are:

  • /l*vx to log any additional details and search for interesting events
  • /qn to hide any UI interactions. This is extremely useful when attempting to develop an automated exploit. On the other hand, potential errors will result in new message boxes. Until the message is accepted, the process does not continue and can be frozen in an unexpected state. We might be able to modify some existing files before the original access rights are reintroduced.

The repair options section lists flags we could use to trigger the repair actions. These actions would ensure the bad files are removed, and good files are reinstalled instead. The definition of bad is something we control, i.e., we can force the reinstallation of all files, all registry entries, or, say, only those with an invalid checksum.

Parameter Description
/fp Repairs the package if a file is missing.
/fo Repairs the package if a file is missing, or if an older version is installed.
/fe Repairs the package if a file is missing, or if an equal or older version is installed.
/fd Repairs the package if a file is missing, or if a different version is installed.
/fc Repairs the package if a file is missing, or if the checksum does not match the calculated value.
/fa Forces all files to be reinstalled.
/fu Repairs all the required user-specific registry entries.
/fm Repairs all the required computer-specific registry entries.
/fs Repairs all existing shortcuts.
/fv Runs from source and re-caches the local package.

Most of the msiexec actions require elevation. We cannot install or uninstall arbitrary packages (unless, of course, the system is badly misconfigured). However, the repair option might be an interesting exception! Might, because not every package will work like this, but it’s not hard to find one that will. For those packages, msiexec will auto-elevate to perform the necessary actions as the SYSTEM user. Interestingly enough, some actions will still be performed using our unprivileged account, making the case even more noteworthy.
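
To keep repair experiments reproducible, it helps to build the command line programmatically. A small, hypothetical helper that just combines the flags listed above:

```python
def msiexec_repair_cmd(product_guid, mode="/fa", log=None, quiet=False):
    """Build an msiexec repair invocation for a cached package GUID."""
    cmd = ["msiexec.exe", mode, product_guid]
    if quiet:
        cmd.append("/qn")        # suppress UI; handy for automation
    if log:
        cmd += ["/l*vx", log]    # verbose log of the repair run
    return cmd
```

The resulting list can be handed to subprocess.run on a Windows test machine.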

The impersonation of our account happens for various security reasons. Only some actions can be impersonated, though. If you see a file renamed by the SYSTEM user, it’s always going to be a fully privileged action. On the other hand, when analyzing who exactly writes to a given file, we need to look at how the file handle was opened in the first place.

We can use tools such as Process Monitor to observe all these events. To filter out the noise, I would recommend using the settings shown below. It’s possible to miss something interesting, e.g., a child process’s actions, but it’s unrealistic to dig into every single event at once. Also, I’m intentionally disabling registry activity tracking, but occasionally it’s worth reenabling it to see if certain actions aren’t controlled by editable registry keys.

Procmon filter settings

Another trick I’d recommend is to highlight the distinction between impersonated and non-impersonated operations. I prefer to highlight anything that isn’t explicitly impersonated, but you may prefer to reverse the logic.

Procmon highlighting settings

Then, to start analyzing the events of the aforementioned Google Chrome installer, one could run the following command:

msiexec.exe /fa '{B460110D-ACBF-34F1-883C-CC985072AF9E}'

The stream of events should be captured by ProcMon but to look for issues, we need to understand what can be considered an issue. In short, any action on a securable object that we can somehow modify is interesting. SYSTEM writes a file we control? That’s our target.

Typically, we cannot directly control the affected path. However, we can replace the original file with a symlink. Regular symlinks are likely not available for unprivileged users, but we may use some tricks and tools to reinvent the functionality on Windows.

Windows EoP primitives

Although we’re not trying to pop a shell out of every located vulnerability, it’s interesting to educate the readers on what would be possible given some of the Elevation of Privilege primitives.

With an arbitrary file creation vulnerability we could attack the system by creating a DLL that one of the system processes would load. It’s slightly harder, but not impossible, to locate a Windows process that loads our planted DLL without rebooting the entire system.

Having an arbitrary file creation vulnerability but with no control over the content, our chances to pop a shell are drastically reduced. We can still make Windows inoperable, though.

With an arbitrary file delete vulnerability we can at least break the operating system. Often though, we can also turn this into an arbitrary folder delete and use the sophisticated method discovered by Abdelhamid Naceri to actually pop a shell.

The list of possible primitives is long and fascinating. A single EoP primitive should be treated as a serious security issue, nevertheless.

One vulnerability to rule them all (CVE-2023-21800)

I’ve observed the same interesting behavior in numerous tested MSI packages. The packages were created by different MSI creators using different types of resources and basically had nothing in common. Yet, they were all following the same pattern. Namely, the environment variables set by the unprivileged user were also used in the context of the SYSTEM user invoked by the repair operation.

Although I initially thought that the applications were incorrectly trusting some environment variables, it turned out that the Windows Installer’s rollback mechanism was responsible for the insecure actions.

7-zip

7-Zip provides dedicated Windows Installers which are published on the project page. The following file was tested:

Filename Version
7z2201-x64.msi 22.01

To better understand the problem, we can study the source code of the application. The installer, defined in the DOC/7zip.wxs file, refers to the ProgramMenuFolder identifier.

     <Directory Id="ProgramMenuFolder" Name="PMenu" LongName="Programs">
        <Directory Id="PMenu" Name="7zip" LongName="7-Zip" />
      </Directory>
      ...
     <Component Id="Help" Guid="$(var.CompHelp)">
        <File Id="_7zip.chm" Name="7-zip.chm" DiskId="1" >
            <Shortcut Id="startmenuHelpShortcut" Directory="PMenu" Name="7zipHelp" LongName="7-Zip Help" />
        </File>
     </Component>

The ProgramMenuFolder is later used to store some components, such as a shortcut to the 7-zip.chm file.

As stated on the MSDN page:

The installer sets the ProgramMenuFolder property to the full path of the Program Menu folder for the current user. If an “All Users” profile exists and the ALLUSERS property is set, then this property is set to the folder in the “All Users” profile.

In other words, the property will either point to the directory controlled by the current user (in %APPDATA% as in the previous example), or to the directory associated with the “All Users” profile.

While the first configuration does not require additional explanation, the second configuration is tricky. The C:\ProgramData\Microsoft\Windows\Start Menu\Programs path is typically used, while C:\ProgramData is writable even by unprivileged users. The C:\ProgramData\Microsoft path is properly locked down. This leaves us with a secure default.

However, the user invoking the repair process may intentionally modify (i.e., poison) the PROGRAMDATA environment variable and thus redirect the “All Users” profile to an arbitrary location writable by the user. The setx command can be used for that. It modifies variables associated with the current user, but it’s important to emphasize that only future sessions are affected. A completely new cmd.exe instance should be started to inherit the new settings.
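
The effect of the poisoned variable can be sketched with Python's ntpath module, which expands %VAR% references the same way Windows does, using the 7-Zip shortcut path as an example:

```python
import ntpath
import os

# Once PROGRAMDATA points at a user-writable directory, any path built
# from the "All Users" profile resolves under the attacker's directory.
os.environ["PROGRAMDATA"] = r"C:\FakeProgramData"
shortcut = ntpath.expandvars(
    r"%PROGRAMDATA%\Microsoft\Windows\Start Menu\Programs\7-zip\7-Zip Help.lnk"
)
```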

Instead of placing legitimate files, a symlink to an arbitrary file can be placed in the %PROGRAMDATA%\Microsoft\Windows\Start Menu\Programs\7-zip\ directory as one of the expected files. As a result, the repair operation will:

  • Remove the arbitrary file (using the SYSTEM privileges)
  • Attempt to restore the original file (using an unprivileged user account)

The second action will fail, resulting in an Arbitrary File Delete primitive. This can be observed in the following capture, assuming we’re targeting the previously created C:\Windows\System32\__doyensec.txt file. We intentionally created a symlink to the targeted file at the C:\FakeProgramData\Microsoft\Windows\Start Menu\Programs\7-zip\7-Zip Help.lnk path.

The operation results in REPARSE

Firstly, we can see the actions resulting in the REPARSE status. The file is briefly processed (or rather, its attributes are), and SetRenameInformationFile is called on it. The rename part is slightly misleading: what is actually happening is that the file is moved to a different location. This is how the Windows Installer creates rollback instructions in case something goes wrong. As stated before, SetRenameInformationFile doesn’t work on the file handle level and cannot be impersonated. This action runs with full SYSTEM privileges.

Later on, we can spot attempts to restore the original file, but using an impersonated token. These actions result in ACCESS DENIED errors, therefore the targeted file remains deleted.

The operation results in REPARSE

The same sequence was observed in numerous other installers. For instance, I worked with PuTTY’s maintainer on a possible workaround, which was introduced in the 0.78 version. In that version, the elevated repair is allowed only if administrator credentials are provided. However, this isn’t functionally equivalent and has introduced some other issues. The 0.79 release should restore the old WiX configuration.

Redirection Guard

The issue was reported directly to Microsoft with all the above information and a dedicated exploit. Microsoft assigned the CVE-2023-21800 identifier to it.

It was reproducible on the latest versions of Windows 10 and Windows 11. However, it was not bounty-eligible, as the attack was already mitigated on the Windows 11 Developer Preview. The same mitigation has since been enabled with the 2023-02-14 update.

In October 2022, Microsoft shipped a new feature called Redirection Guard on Windows 10 and Windows 11. The update introduced a new type of mitigation called ProcessRedirectionTrustPolicy and the corresponding PROCESS_MITIGATION_REDIRECTION_TRUST_POLICY structure. If the mitigation is enabled for a given process, all processed junctions are additionally verified. The verification first checks whether the filesystem junction was created by a non-admin user and, if so, whether the policy prevents following it. If the operation is prevented, the error 0xC00004BC is returned. Junctions created by admin users are explicitly allowed, as they carry a higher trust-level label.

In the initial round, Redirection Guard was enabled for the print service. The 2023-02-14 update enabled the same mitigation on the msiexec process.

This can be observed in the following ProcMon capture:

The 0xC00004BC error returned by the new mitigation

The msiexec process is one of the few applications that have this mitigation enforced by default. To check for yourself, use the following not-so-great code:

#include <windows.h>
#include <TlHelp32.h>
#include <cstdio>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>
#include <memory>

using AutoHandle = std::unique_ptr<std::remove_pointer_t<HANDLE>, decltype(&CloseHandle)>;
using Proc = std::pair<std::wstring, AutoHandle>;

std::vector<Proc> getRunningProcesses() {
    std::vector<Proc> processes;

    AutoHandle snapshot(CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0), &CloseHandle);
    if (snapshot.get() == INVALID_HANDLE_VALUE) {
        return processes;
    }

    PROCESSENTRY32W pe32;
    pe32.dwSize = sizeof(pe32);
    if (!Process32FirstW(snapshot.get(), &pe32)) {
        return processes;
    }

    do {
        // PROCESS_QUERY_INFORMATION is required by GetProcessMitigationPolicy
        HANDLE h = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, pe32.th32ProcessID);
        if (h) {
            processes.emplace_back(std::wstring(pe32.szExeFile), AutoHandle(h, &CloseHandle));
        }
    } while (Process32NextW(snapshot.get(), &pe32));

    return processes;
}

int main() {
    auto runningProcesses = getRunningProcesses();

    PROCESS_MITIGATION_REDIRECTION_TRUST_POLICY policy = {};

    for (auto& process : runningProcesses) {
        auto result = GetProcessMitigationPolicy(process.second.get(), ProcessRedirectionTrustPolicy, &policy, sizeof(policy));

        // Report processes with any redirection-trust bit set (audit or enforce)
        if (result && (policy.AuditRedirectionTrust || policy.EnforceRedirectionTrust || policy.Flags)) {
            printf("%ws:\n", process.first.c_str());
            printf("\tAuditRedirectionTrust: %d\n\tEnforceRedirectionTrust: %d\n\tFlags: %d\n",
                   policy.AuditRedirectionTrust, policy.EnforceRedirectionTrust, policy.Flags);
        }
    }
}

Redirection Guard should prevent an entire class of junction attacks and might significantly complicate local privilege escalation attacks. Beyond the issue described here, it also covers other types of installer bugs, such as a privileged installer moving files out of user-controlled directories.

Microsoft Disclosure Timeline

Status                               Date
Vulnerability reported to Microsoft  9 Oct 2022
Vulnerability accepted               4 Nov 2022
Patch developed                      10 Jan 2023
Patch released                       14 Feb 2023