TL;DR

In this post I explain a quick-and-dirty way to test a GDB plugin (written in Python) that targets the ARM architecture as part of a continuous integration configuration for travis-ci.org. Ingredients include qemu-system-arm, a CodeSourcery compiler, jimdb, and some spaghetti (tied tight to keep it all together, of course).

I realize that to many this project may be Esoteric, Ephemeral, and Excruciating, but if you are trying to emulate an ARM system in a lightweight fashion, test a cross-platform GDB plugin, or just get similar stuff done in the travis-ci environment, this post might help you.

Background: Merge now, pay later

I was happy to hear that the gracious security folks at BlackBerry were not only using the exploitable GDB plugin but had added support for ARM, QNX, and Google AddressSanitizer AND were motivated to contribute the code back to the project. I hastily accepted their pull request (which I very much appreciate), but discovered later that the code had not been tested.

The pull request motivated me to re-design the plugin to support more platforms, something that has been on my todo list since I was working with VDB a while back. In fact, I had stubbed out the re-design before, but tabled it (along with some other enhancements) as I was pulled to other projects.

So I got to work. As the story often goes, after spending many hours untangling code, I decided I had to complete another item on my todo list: add open source continuous integration testing to the project master branch to ease future collaboration. One crazy merge is usually enough.

Disclaimer: I don’t know a whole lot about the ARM architecture. Or rather: I know a lot more about x86, x86_64, and even MIPS and PowerPC. I hope to learn more in the future, but in the meantime the world keeps turning, and code gets contributed. So some day I may view this as a massive hack that I wish I could re-work. So it goes.

‘exploitable’ primer

The exploitable plugin is kind of like !exploitable but for Linux/GDB. The plugin looks at the state of the inferior (the process being debugged) in GDB, generally after the process has crashed, and tries to determine if an attacker might have been able to take advantage of the underlying bug to hijack the process. The idea is to help developers prioritize bugs in cases where they don’t have time to examine them all. This can happen when fuzz testing a program with a tool like BFF or Peach, or when working with data from a crash collection system like ABRT or Apport.

It works like this:

(gdb) exploitable
Description: Access violation on destination operand
Short description: DestAv (7/21)
Hash: 056f8e491910886253b42506ac8d7fa0.056f8e491910886253b42506ac8d7fa0
Exploitability Classification: EXPLOITABLE
Explanation: The target crashed on an access violation at an address matching the destination operand of the instruction. This likely indicates a write access violation, which means the attacker may control the write address and/or value.
Other tags: AccessViolation (20/21)

The project also comes with a proof-of-concept wrapper script that runs a set of test cases and prints a summary report, ordered by severity. The invocation looks something like this:

$ python triage.py \$sub `find ./exploitable/tests/bin -type f`

The command above triages all of the test cases that are supplied with the project. In a nutshell, this is how it works:

For each supplied shell command (a program invocation), 
   Run the command under GDB,
   If the program crashes,
      Load the plugin
      Run the `exploitable` command
      Store the result to a file
Finally, aggregate the results and print (or store) a report 
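
If you just want the gist without reading triage.py, here is a minimal bash sketch of that loop. It is an approximation, not the real script: the output directory is hypothetical, and triage.py itself adds option parsing, error handling, and report aggregation.

    # minimal sketch of the triage loop (hypothetical output directory)
    mkdir -p /tmp/triage
    for bin in ./exploitable/tests/bin/*; do
        # run the target under GDB; if it crashes, GDB stops at the fault
        # and the remaining batch commands classify it
        gdb --batch \
            -ex "run" \
            -ex "source exploitable/exploitable.py" \
            -ex "exploitable" \
            --args "$bin" > "/tmp/triage/$(basename "$bin").txt" 2>&1
    done
    # crude "report": pull the classification line out of each log
    grep "Exploitability Classification" /tmp/triage/*.txt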

You can read more about the plugin here.

The Plan

In order to test the GDB plugin’s ARM functionality on travis-ci, I had to get some ducks in a row in the test environment. Specifically, these ducks:

  • Install an ARM emulator
  • Download or build an ARM VM to run in the emulator
  • Compile my test cases for ARM
  • Get test cases and a gdbserver (built for ARM) into the VM
  • Get the VM to run the gdbserver
  • Download or build a version of GDB that supports ARM targets AND Python scripting

This stuff is baked into the steps below.

Testing the ARM GDB plugin: 5 steps

You can find the final product in the .travis.yml file and test directory in the exploitable repo on GitHub.

Step 1: Get dependencies

In order to test on ARM, I needed an ARM system emulator, an ARM VM, and an ARM compiler that builds for the ARM VM.

The emulator

travis-ci uses an Ubuntu x86_64 environment, so qemu-system-arm is the obvious choice for emulating ARM. Note that if you just need to run ARM apps (rather than emulate a whole OS), you can use qemu-arm, which is a lot simpler. See Hello world for qemu-arm in 6 commands for more info on that.
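
On the Travis worker, getting the emulator boils down to an apt-get install. A minimal sketch, with the caveat that package names have shifted across Ubuntu releases:

    # package names vary by Ubuntu release; on some releases
    # qemu-system-arm ships inside a larger qemu-system package
    sudo apt-get update -qq
    sudo apt-get install -y qemu-system-arm || sudo apt-get install -y qemu-system
    qemu-system-arm -version  # sanity check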

The ARM VM

After messing with Linaro images and the like for a bit, I found the QEMU Test Image for ARM. It is a super slim ARM VM that happily boots in qemu-system-arm and is available on the web at a (seemingly) stable URI. No need to build a custom VM or host an image somewhere. Sometimes being lazy is cool.
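
Fetching it is a one-liner. A sketch, assuming the tarball is still hosted at the QEMU wiki’s download area:

    # unpacks to arm-test/, containing zImage.integrator (kernel) and
    # arm_root.img (initrd), both referenced in the steps below
    wget http://wiki.qemu.org/download/arm-test-0.2.tar.gz
    tar -xzf arm-test-0.2.tar.gz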

The ARM compiler

Of course, I needed a compiler that would target that specific ARM processor class. Letting laziness guide me once again, I noticed this line in the VM details:

(thanx to Paul Brook)

I googled this fellow’s name and found that he works at CodeSourcery –err– Mentor Graphics. Sure enough, the gcc included with CodeBench Lite Edition, which is free, does the trick.

Unfortunately, Mentor does not supply a public download link for the free version (or any version) of CodeBench (you have to supply an email addy). Since I needed to completely automate everything for travis-ci, I uploaded a copy of the free version to a private bucket on Amazon S3 and used travis-ci’s encryption features to securely hand the test environment a key that can access the bucket. IIRC, these were the steps I took:

  1. Delete extra files from the CodeBench tarball (I am paying for S3, after all : )
  2. Create an AWS bucket: this is where the compiler is stored
  3. Create an AWS IAM user: this is a “user account” that will allow the travis-ci worker to access the S3 bucket
  4. Assign a policy to the IAM user that allows the worker to access the bucket (and nothing else):

     {
         "Statement": [
             {
                 "Effect": "Allow",
                 "Action": ["s3:ListBucket" ],
                 "Resource": [ "arn:aws:s3:::redacted"]
             },
             {
                 "Effect": "Allow",
                 "Action": [ "s3:GetObject"],
                 "Resource": [ "arn:aws:s3:::redacted/*"]
             }
         ]
     }
    
  5. Download the AWS keys for the user and encrypt them using travis encrypt. See here for more info. Note that it is the conventional AWS environment variable assignment (like AWS_ACCESS_KEY_ID=whatever) that gets encrypted, not just the key itself. (A sketch of the invocation follows after this list.)
  6. Add corresponding logic to the dependency-getting code (lives here in exploitable):

     set +x # don't log keys!
     : ${AWS_ACCESS_KEY_ID:?"Need to set AWS_ACCESS_KEY_ID non-empty"}
     : ${AWS_SECRET_ACCESS_KEY:?"Need to set AWS_SECRET_ACCESS_KEY non-empty"}
     set -x 
     python -c 'import boto, os; boto.connect_s3(os.environ["AWS_ACCESS_KEY_ID"], os.environ["AWS_SECRET_ACCESS_KEY"]).get_bucket("exploitable").get_key("arm-toolchain-slim.tar.bz2").get_contents_to_filename("arm-toolchain.tar.bz2")'
     tar -xjf arm-toolchain.tar.bz2 # dir is arm-2013.11
     export PATH=$PATH:${BUILD_DIR}/arm-2013.11/bin
    

WARNING: be sure to set +x if (like above) your test logic is in a bash script that you are running with set -x (so that evaluated commands are printed to stdout before the command executes). Otherwise your keys will be in your logs. Kinda defeats the purpose of encrypting them :).
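
For reference, the encryption in step 5 is done with the travis command-line client. A minimal sketch, with placeholder key values:

    # the travis gem encrypts VAR=value pairs against the repo's
    # public key; --add writes the result into .travis.yml
    gem install travis
    travis encrypt AWS_ACCESS_KEY_ID=AKIA... --add env.global
    travis encrypt AWS_SECRET_ACCESS_KEY=wJalr... --add env.global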

Unfortunately, the prebuilt CodeSourcery gcc is a 32-bit binary, which at the time of this writing can’t be executed as-is in the travis-ci test environment (which is kind of odd). Fortunately, I was able to work around this by installing gcc-multilib, which pulls in support for executing 32-bit binaries.
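
The workaround is a one-line install. A sketch, assuming a sudo-enabled Travis environment:

    # gcc-multilib drags in the 32-bit runtime libraries needed to
    # execute the prebuilt 32-bit CodeSourcery binaries
    sudo apt-get install -y gcc-multilib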

Getting a version of GDB that can handle ARM targets and supports Python scripting

Most versions of GDB that come with ARM toolkits aren’t built to support Python scripting. Note that support for Python scripting was only added in GDB 7.0 (and the API was a bit shaky back then). In addition, it is a bit tricky to build GDB to support both ARM targets and Python scripting. IIRC, dependency hell happens.

Luckily there is no need to do this: the Mozilla Fennec team has forked GDB into jimdb, a really cool debugger that does just what I need. Since (in this case) we aren’t working with Fennec, we don’t need to mess with any of the ‘git pull’ or FenInit stuff referenced in the wiki page, so this will suffice:

    # get python-equipped, ARM-compatible GDB (see https://wiki.mozilla.org/Mobile/Fennec/Android/GDB)
    wget http://people.mozilla.org/~nchen/jimdb/jimdb-arm-linux_x64.tar.bz2 
    tar -xjf jimdb-arm-linux_x64.tar.bz2 # directory is jimdb-arm
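
A quick way to confirm the download did what you want is to ask the bundled GDB to execute a line of Python. A minimal sanity check:

    # prints the embedded Python version if scripting support is baked in
    ${BUILD_DIR}/jimdb-arm/bin/gdb --batch -ex 'python import sys; print(sys.version)'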

Step 2: Build the test cases

Building the test cases is relatively straightforward. For exploitable, the build script looks like this:

    # build ARM test cases
    pushd ${PROJECT_DIR}/exploitable/tests
    cpath=$BUILD_DIR/arm-2013.11/bin/arm-none-linux-gnueabi-gcc
    echo "Compiler path is $cpath"
    file $cpath  # sanity check: the CodeSourcery gcc is a 32-bit x86 binary
    CC=$cpath make -f Makefile.arm
    popd
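
Before shipping the binaries into the VM, it is worth confirming that they really are ARM executables. A quick check:

    # expect "ELF 32-bit LSB executable, ARM ..." for each test case
    file ${PROJECT_DIR}/exploitable/tests/bin/* | head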

The more interesting part is getting them into the VM…

Step 3: Get the ARM test cases into the ARM VM

Initially I was copying the ARM test cases into the VM using tftp (which is included in the slim QEMU ARM test image), but in order to do that in an automatic fashion I had to edit the initrd for the VM, so I figured it would be cleaner just to do all the work up front:

   # patch VM initrd to run GDB server on startup
   mkdir ${BUILD_DIR}/initrd 
   pushd ${BUILD_DIR}/initrd 
   gunzip -c ${BUILD_DIR}/arm-test/arm_root.img | cpio -i -d -H newc
   cp ${PROJECT_DIR}/exploitable/tests/bin/* ${BUILD_DIR}/initrd/root/ # all test binaries
   cp ${BUILD_DIR}/arm-2013.11/bench/gdbserver ${BUILD_DIR}/initrd/root # gdbserver 
   chmod +x ${BUILD_DIR}/initrd/root/*
   echo "
   cd /root
   /root/gdbserver --multi 10.0.2.14:1234
   " >> ${BUILD_DIR}/initrd/etc/init.d/rcS
   rm ${BUILD_DIR}/arm-test/arm_root.img
   find . | cpio -o -H newc | gzip -9 > ${BUILD_DIR}/arm-test/arm_root.img
   popd

Here is what the code above does:

  1. Extract the initrd using the conventional gunzip and cpio approach
  2. Copy in the test binaries (from the project source code) and the gdbserver (from CodeSourcery), both compiled for the ARM target; make them executable
  3. Append code to run the gdbserver to the end of the init.d rcS script (so it will run on boot)
  4. Delete the old initrd image and compress the modified contents, again using cpio and gzip
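
Before booting, a quick listing of the repacked image can confirm that the payload made it in. A hedged spot-check:

    # list the repacked initrd and look for the injected files
    gunzip -c ${BUILD_DIR}/arm-test/arm_root.img | cpio -t | grep 'root/' | head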

Step 4: Boot the VM

Here is the code that boots the VM and watches the output for a string that indicates the GDB server is ready:

    # start the VM and wait for the GDB server to start
    qemu-system-arm -kernel ${BUILD_DIR}/arm-test/zImage.integrator -initrd ${BUILD_DIR}/arm-test/arm_root.img -nographic -append "console=ttyAMA0" -net nic -net user,tftp=exploitable,host=10.0.2.33 -redir tcp:1234::1234 </dev/null &> ${BUILD_DIR}/log-qemu.txt &
    until grep "Listening on port" ${BUILD_DIR}/log-qemu.txt
    do
      echo "Waiting for GDB server to start..."
      cat ${BUILD_DIR}/log-qemu.txt
      sleep 1
    done
    echo "GDB server started"

Step 5: Run the tests

At this point in the test script, the emulator is running in the background with a gdbserver waiting for connections. The remainder of the script is kind of specific to the way I’ve hacked exploitable to support automated testing, but it may be useful to others since it contains gdbinit logic:

    # run triage; we pass a bash script that will create a per-file remote-debug GDB script to the "step-script" argument for triage.py
    pushd ${PROJECT_DIR}
    
    cmd="#!/bin/bash
    
    template=\"set solib-absolute-prefix nonexistantpath
    set solib-search-path ${BUILD_DIR}/arm-2013.11/arm-none-linux-gnueabi/libc/lib
    file dirname/filename
    target extended-remote localhost:1234
    set remote exec-file /root/filename
    run
    source ${PROJECT_DIR}/exploitable/exploitable.py
    exploitable -p /tmp/triage.pkl\"
    d=\`dirname \$1\`
    f=\`basename \$1\`
    sub=\${template//filename/\$f}
    sub=\${sub//dirname/\$d}
    echo \"\$sub\" > ${BUILD_DIR}/gdb_init"

    echo "$cmd" > ${BUILD_DIR}/pre_run.sh
    chmod +x ${BUILD_DIR}/pre_run.sh
    python triage.py -o ${BUILD_DIR}/result.json -vs ${BUILD_DIR}/pre_run.sh -g "${BUILD_DIR}/jimdb-arm/bin/gdb --batch -x ${BUILD_DIR}/gdb_init --args " \$sub `find exploitable/tests/bin -type f` 
    rm ${BUILD_DIR}/pre_run.sh ${BUILD_DIR}/gdb_init
    popd
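
To make the template substitution concrete: for a hypothetical test binary at exploitable/tests/bin/crash, the generated gdb_init would look roughly like this (with ${BUILD_DIR} and ${PROJECT_DIR} already expanded by the outer script, shown here as /path/to placeholders):

    set solib-absolute-prefix nonexistantpath
    set solib-search-path /path/to/arm-2013.11/arm-none-linux-gnueabi/libc/lib
    file exploitable/tests/bin/crash
    target extended-remote localhost:1234
    set remote exec-file /root/crash
    run
    source /path/to/exploitable/exploitable.py
    exploitable -p /tmp/triage.pkl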

The rest of the story

For the complete, working code, check out the exploitable repo on GitHub. The .travis.yml file is a good starting point. If you have any questions, comments, etc., feel free to drop me a line. Thanks for reading.