Saturday, January 31, 2015

Patching, Emulating, and Debugging a Netgear Embedded Web Server

Previously I posted about running and remotely debugging a Netgear UPnP daemon using QEMU and IDA Pro. This time we’ll take on the challenge of running the built-in web server from the Netgear R6200 in emulation.

The httpd daemon is responsible for so much more than the web interface. This daemon is responsible for a silly amount of system management, including configuring firewall rules, managing the samba and ftp file servers, managing attached USB storage, and many other things. And it does all of this management as part of its initialization, which means lots of opportunities to fail or crash when running in emulation as a standalone service.

Running this device’s web server in emulation involves substantially more work, but it is still doable. First we need to figure out how to invoke the httpd program. Below is a script I use to start up httpd in emulation.


# run with DEBUG=1 to attach gdbserver


if [ "x1" = "x$DEBUG" ];

rm ./tmp/shm_id
rm ./var/run/
ipcrm -S 0x0001e240
for ipc in $(ipcs -m | grep 0x | cut -d " " -f 2); do ipcrm -m $ipc; done

chroot $ROOTFS /bin/sh -c "LD_PRELOAD=/ $DEBUGGER /usr/sbin/httpd -S -E /usr/sbin/ca.pem /usr/sbin/httpsd.pem"

I partly based this on the command line arguments that httpd is invoked with on the actual device. The script chroots into the router's filesystem and then runs httpd. Further, if you set DEBUG=1 on the command line, the script will use gdbserver to execute the daemon and wait for a debugger connection on port 1234.

As with the UPnP daemon, an early challenge is the fact that QEMU doesn’t provide NVRAM for configuration parameters, so calls to nvram_get() will fail. We can work around this with my project nvram-faker. Nvram-faker is loaded using LD_PRELOAD and hooks calls to nvram_get(). It reads a configuration from a text file and prints the results of nvram queries to standard error. Queries for unknown parameters are printed in red, helping to diagnose what parameters are needed that you haven’t yet provided. If you don’t have an instance of the hardware, this is an exercise in guesswork and trial and error. You need to intuit sane values for the queried parameters and iteratively fill in missing parameters as they are queried. If you do have the actual gear, and you can get a shell[1] on the device, you can extract the NVRAM configuration from flash and convert it into an INI file for nvram-faker. I’ll post the nvram configuration I ended up using at the end.

When attempting to run httpd, I found it kept crashing with SIGBUS. It turns out that on startup the daemon wants to open a shared memory segment for IPC. It looks for the file /tmp/shm_id, which may have been created by another process. If it finds that file it attempts to attach to the shared memory segment associated with the ID in the file. If the file doesn’t exist, then httpd opens a new shared memory segment and then writes the ID to that file. The problem is there is no check to see if shmat() failed, returning negative one (cast to a void *). In any case, the program attempts to deference the return value of shmat() at 0x00419308. In the case of failure, 0xffffffff is dereferenced, crashing the program. The crash is SIGBUS rather than SIGSEGV due to the misaligned memory access.

shmat fail

If the http server has previously run and not exited cleanly, then /tmp/shm_id will hang around. The solution is easy: have our script delete /tmp/shm_id before starting the server.

Another problem is that the http daemon…well…daemonizes. As far as I know, there’s no way to have IDA and gdbserver follow fork()s, so this is a problem. If we could get the daemon to run reliably, we could start it, let it daemonize, and then attach to the forked process. However, we’re going to need to do a fair amount of debugging just to get this program running, so letting it start and daemonize isn’t an option.

jalr to daemon()

In the above screenshot we see the jump to daemon() at 0x004183fc. The easiest approach is to patch out the call to daemon().

Just a few thoughts about binary patching: It’s important to keep in mind that if you change the binary you’re analyzing, you’re no longer analyzing the same program. Instead you’re analyzing some other, slightly different program, with different behaviors. Hopefully nothing you change will make a material difference, but it’s hard to know. For example, it may be difficult to tell if a function that you patched out would have initialized some global structures that will now result in a crash or a slightly different code path. Just be aware of the ramifications, and patch only when necessary.

The return value of daemon() is checked and a value of 0 indicates success. A relatively nonintrusive way of replacing a call to daemon() and simulating success is to xor the $v0 register (which contains a function’s return value) with itself. The assembled bytes for the instruction:

xor $v0,$v0


00 42 10 26 

The target system is little endian, so these bytes must be swapped:

26 10 42 00 

I’ll leave the actual method[2] of assembling a single instruction and deriving its corresponding sequence of bytes as an exercise for the reader. If you have a cross compiler set up, one way is to create a source file with your instruction, assemble it with the gnu assembler, and then disassemble it with objdump. Another option is to use this lightweight python module, mips-assembler.

To patch the program using IDA, click on the instruction you want to patch, in this case the jalr to daemon() at 0x004183fc. Then switch to IDA’s hex view. The corresponding bytes will be highlighted. Right click and select "edit".

Hex editing in IDA

The hex view changes to overwrite mode. Change the selected bytes to the ones corresponding to your patch. Right click again and choose “apply.” When you switch back to disassembly, you should see the patch.

patch out call to daemon()

Note that IDA hasn’t actually changed the original binary. In fact, one of IDA’s features is that once you disassemble a file, not only does it not touch the original file again, you don’t even need it. All you need from that point forward is the .idb file. However, IDA does have the ability to apply the patch to the original file. Select the Edit menu, “Patch Program,” then “Apply patches to input file.” IDA will prompt you to name the patch file and whether to make backup copy. This is a relatively new feature, so if you’re new to IDA, know that it wasn’t always this easy. Send Ilfak an email thanking him.

Now you should be able run the patched httpd without it daemonizing. Set a breakpoint somewhere past the original call to daemon() and verify that IDA stays attached.

With the daemon running, you can start the iterative process of building up an NVRAM configuration that will satisfy the many initialization steps. The goal is for execution to reach the select() at 0x00415564 in the http_d() function.

During this process, there was one initialization function that I wasn’t able to get past. The call to fwPtRulesInit() at 0x00419640 always hung in an endless loop. Since this has to do with firewall policy configuration, I decided it was worth the risk to patch it out with a nop instruction.

patch out call to fwPtRulesInit()

Here is the configuration I ended up with. I ended up using the complete configuration copied from the hardware's NVRAM. I tweaked a few settings such as LAN IP address, and the HTTP admin's password, but this is mostly a stock configuration[3]. (Apologies to mobile users. This is supposed to be a 500px iframe that shows about 20 lines and scrolls, but for some reason in my mobile browser, the entire 1500+ line configuration is rendered.)

Once I had the trouble spots patched out and had a working configuration, I was able to get the web server running and responding to requests from a web browser. Mostly.

webserver in emulation

As you can see, the web interface chrome appears to be working, but the text is mostly missing. It turns out there was a bug in libnvram-faker. As the library handles NVRAM queries, it prints to the console the parameters and their values. The problem is it was printing to standard output. The web server executes a number of shell commands using system(). Some of those commands redirect their standard output to a file, which the web server then uses. In particular, at 0x004B5C40, a string table gets generated by a shell command, then read in, and then immediately deleted. Since the file only exists for a moment, it's not obvious this is happening.

Below, we see the function CreateHeader() getting called with two arguments. These are a string table in /www, and a compressed copy of the same file in /tmp.

creating string table

Then in CreateHeader() we see a shell command being generated via sprintf() and then executed via system(). That shell command is bzip2 -c somefile > some_other_file. The resulting file is a redirection of bzip's standard output.

bzip2 stdout

Once I noticed this, I set a breakpoint just after the bzip command, and was able to make a copy of the file. When I decompressed it, I saw that it had all of libnvram-faker's output mixed in with the strings. This had the result of breaking the templating system that generates the web interface's HTML and javascript. Once I fixed that in libnvram-faker, I was able to get the web server working.

httpd working in emulation

This should get you started emulating and debugging some more challenging binaries. With enough work you can get fairly complicated programs from an embedded device running in emulation. Sometimes this is convenient, so you don't have to carry actual gear around when you're doing research. Other times, it's necessary; you may not be able to get interactive access to the hardware in order to debug its processes. In that case, emulation may be your only choice.

[1] Many devices have a UART connection which will let you connect via minicom or other serial terminal in order to get console access. Further, nearly every consumer Netgear device has a telnet backdoor listening on the local network.

[2] I use a tool that Craig Heffner wrote called “shellgasm.” It’s a nifty python program that calls gcc and objdump from your cross compiler toolchain. It will also turn asm code into a C-style or Python-style byte array that can be used for payloads in buffer overflows and such. Unfortunately it’s not available anywhere. Maybe if you pester Craig, he’ll post it on Github.

[3] This was actually more work than it sounds like. There was a bug in libnvram-faker. It didn't allocate enough memory to accommodate all the lines of the INI file. This resulted in a crash with large configuration files. The crash was difficult to debug, so I mostly worked around it by iteratively uncommenting parts of the file until the web server worked. Just before finishing this post, I finally tracked down the bug, so now I can use the entire 1500+ lines of the default configuration.

Saturday, January 03, 2015

Remote Debugging with QEMU and IDA Pro

It's often the case, when analyzing an embedded device's firmware, that static analysis isn't enough. You need to actually execute a binary you're analyzing in order to see how it behaves. In the world of embedded Linux devices, it's often fairly easy to put a debugger on the target hardware for debugging. However it's a lot more convenient if you can run the binary right on your own system and not have to drag hardware around to do your analysis. Enter emulation with QEMU.

An upcoming series of posts will focus on reverse engineering the UPnP daemon for one of Netgear's more popular wireless routers. This post will describe how to run that daemon in system emulation so that it can analyzed in a debugger.


First, I'd recommend reading the description I posted of my workspace and tools that I use. Here's a link.

You'll need an emulated MIPS Linux environment. For that, I'll refer readers to my previous post on setting up QEMU.

You'll also need a MIPS Linux cross compiler. I won't go into the details of setting this up because cross compilers are kind of a mess. Sometimes you need an older toolchain, and other times you need a newer toolchain. A good starting point is to build both big endian and little endian MIPS Linux toolchains using the uClibc buildroot project. In addition to that, whenever I find other cross compiling toolchains, I save them. A good source of older toolchains is the GPL release tarballs that vendors like D-Link and Netgear make available.

Once you have a cross compiling toolchain for your target architecture, you'll need to build GDB for that target. At the very least, you'll need gdbserver statically compiled for the target. If you want to remotely debug using GDB, you'll need gdb compiled to run on your local architecture (e.g., x86-64) and to debug your target architecture (e.g., mips or mipsel). Again, I won't go into building these tools, but if you have your toolchains set up, it shouldn't be too bad.

I use IDA Pro, so that's how I'll describe remote debugging. However,  if you want to use gdb check out my MIPS gdbinit file:

Emulating a Simple Binary

Assuming you've gotten the tools described above set up and working properly, you should now be able to SSH into your emulated MIPS system. As described in my Debian MIPS QEMU post, I like to bridge QEMU's interface to VMWare's NAT interface so I can SSH in from my Mac, without first shelling into my Ubuntu VM. This also allows me to mount my Mac's workspace right in the QEMU system via NFS. That way whether I'm working in the host environment, in Ubuntu, or in QEMU, I'm working with the same workspace.

zach@malastare:~ (130) $ ssh root@
root@'s password:
Linux debian-mipsel 2.6.32-5-4kc-malta #1 Wed Jan 12 06:13:27 UTC 2011 mips

root@debian-mipsel:~# mount
/dev/sda1 on / type ext3 (rw,errors=remount-ro)
malastare:/Users/share/code on /root/code type nfs (rw,addr=
root@debian-mipsel:~# cd code

Once shelled into your emulated system, cd into the extracted file system from your device's firmware. You should be able to chroot into the firmware's root file system. You need to use chroot since the target binary is linked against the firmware's libraries and likely won't work with Debian's shared libraries.

root@debian-mipsel:~# cd code/wifi-reversing/netgear/r6200/extracted-
root@debian-mipsel:~/code/wifi-reversing/netgear/r6200/extracted- file ./bin/ls
./bin/ls: symbolic link to `busybox'
root@debian-mipsel:~/code/wifi-reversing/netgear/r6200/extracted- file ./bin/busybox
./bin/busybox: ELF 32-bit LSB executable, MIPS, MIPS32 version 1 (SYSV), dynamically linked (uses shared libs), stripped
root@debian-mipsel:~/code/wifi-reversing/netgear/r6200/extracted- chroot . /bin/ls -l /bin/busybox
-rwxr-xr-x    1 10001    80         276413 Sep 20  2012 /bin/busybox

In the above example, I have changed into the root directory of the extracted file system. Then using the file command I show that busybox is a little endian MIPS executable. Then I chrooted into the extracted root directory and ran bin/ls, which is a symlink to busybox.

If you attempt to simply start a chrooted shell with "chroot .", it won't work. Your user's default shell is bash, and most embedded devices don't have bash.

root@debian-mipsel:~/code/wifi-reversing/netgear/r6200/extracted- chroot .
chroot: failed to run command `/bin/bash': No such file or directory

Instead you can chroot and execute bin/sh:

root@debian-mipsel:~/code/wifi-reversing/netgear/r6200/extracted- chroot . /bin/sh

BusyBox v1.7.2 (2012-09-20 10:26:08 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.

# exit

Hardware Workarounds

Even with the necessary tools and emulation environment set up and working properly, you can still run into roadblocks. Although QEMU does a pretty good job of emulating the core chipset, including the CPU, there is often hardware the binary you're trying to run is expecting that QEMU can't provide. If you try to emulate something simple like /bin/ls, that will usually work fine. But something more complicated such as the UPnP daemon will almost certainly have particular hardware dependencies that QEMU isn't going to satisfy. This is especially true for programs whose job it is to manage the embedded system's hardware, such as turning wireless adapters on or off.

The most common problem you will run into when running system services such as the web server or UPnP daemon is the lack of NVRAM. Non-volatile RAM is usually a partition of the device's flash storage that contains configuration parameters. When a daemon starts up, it will usually attempt to query NVRAM for its run-time configuration. Sometimes a daemon will query NVRAM for tens or even hundreds of parameters.

To work around the lack of NVRAM in emulation, I wrote a library called nvram-faker. The nvram-faker library should be preloaded using LD_PRELOAD when you run your binary. It will intercept calls to nvram_get(), normally provided by Rather than attempting to query NVRAM, nvram-faker will query an INI-style configuration file that you provide.

The included README provides a more complete description. Here's a link to the project:

Even with NVRAM solved, the program may make assumptions about what hardware is present. If that hardware isn't present, the program may not run or, if it does run, it may behave differently than it would on the target hardware. In this case, you may need to patch the binary. The specifics of binary patching vary from one situation to another. It really depends on what hardware is expected, and what the behavior is when it is absent. You may need to patch out a conditional branch that is taken if hardware is missing. You may need to patch out an ioctl() to a special device if you're trying to substitute a regular file for reading and writing. I won't cover patching in detail here, but I did discuss it briefly in my BT HomeHub paper and the corresponding talk I gave at 44CON. Here is a link to those resources:

Attaching the Debugger

Once you've got your binary running in QEMU, it's time to attach a debugger. For this, you'll need gdbserver. Again, this tool should be statically compiled for your target architecture because you'll be running it in a chroot. You'll need to copy it into the root directory of the extracted filesystem.

# ./gdbserver
Usage: gdbserver [OPTIONS] COMM PROG [ARGS ...]
 gdbserver [OPTIONS] --attach COMM PID
 gdbserver [OPTIONS] --multi COMM

COMM may either be a tty device (for serial debugging), or
HOST:PORT to listen for a TCP connection.

  --debug               Enable general debugging output.
  --remote-debug        Enable remote protocol debugging output.
  --version             Display version information and exit.
  --wrapper WRAPPER --  Run WRAPPER to start new programs.
  --once                Exit after the first connection has closed.

You can either attach gdbserver to a running process, or use it to execute your binary directly. If you need to debug initialization routines that only happen once, you'll want to do the latter.

On the other hand, you may want to wait until the daemon forks. As far as I know there's no way to have IDA follow forked processes. You need to attach to them separately. If you do it this way, you can attach to the already running process from outside the chroot.

The following shell script will execute upnpd in a chroot. If DEBUG is set to 1, it will attach to upnpd and pause for a remote debugging session on port 1234.

chroot $ROOTFS /bin/sh -c "LD_PRELOAD=/ /usr/sbin/upnpd"

#Give upnpd a bit to initialize and fork into the background.
sleep 3;

if [ "x1" = "x$DEBUG" ];

 $ROOTFS/gdbserver --attach $(pgrep upnpd)

You can create a breakpoint right before the call to recvfrom() and then verify the debugger breaks when you send upnpd an M-SEARCH packet.

break before recvfrom()

Then, in IDA, go to Process Options under the Debugger menu. Set "hostname" to the IP address of your QEMU system, and set the port to the port you have gdbserver listening on. I use 1234.

Debug application setup: gdb

Accept the settings, then attach to the remote debugging session with IDA's ctrl+8 hotkey. Hit ctrl+8 again to resume execution. You should be able to send an M-SEARCH packet[1] and see the debugger hit the breakpoint.

debugger hits breakpoint in upnp_main()

There is obviously a lot more to explore, and there are lots of situations that can come up that aren't addressed here, but hopefully this gets you started.

[1] I recommend Craig Heffner's miranda tool for UPnP analysis: