Fuzzing

Find vulnerabilities using fuzzing

Excellent playlist:

Coverage-Guided Fuzzing

Takes a corpus of inputs. Then it picks an input and does a random mutation on it. See if this mutated test covers new code; if so, store it in the corpus. Repeat the process in a loop.

API fuzzing:

When should I use coverage guided fuzzing?

Coverage guided fuzzing is recommended as it is generally the most effective. This works best when:

  • The target is self-contained.

  • The target is deterministic.

  • The target can execute dozens or more times per second (ideally hundreds or more).

Coverage

What should the coverage look like?

  • You want to cover as many lines as possible,

  • You want to cover those lines as uniformly as possible (not only cover some line once)

  • You don’t need to instrument things you don’t care about (well-tested 3rd party libraries, etc

How do I deal with coverage gaps?

  • Edit target code to take a branch/generate structured data (least amount of work)

  • In API testing, you can edit the harness to make code reachable or more likely to be reached. (medium amount of work, but doesn’t affect the tested API)

  • Add a test case to reach it directly (most amount of work)

  • Nop out/avoid some large sections of code that are just noise (if you’re fuzzing ffmpeg, only fuzz one format)

I keep hitting DoS bugs

You can just patch around uninteresting bugs until you find an RCE.

Reporting

You can put the main function in a try-catch block, because you often don’t care about the exceptions the application knows to throw. You care about things that create a crash, like segfaults and stuff.

Sanitization

Just detecting crashes isn’t good enough, since not all memory corruptions produce a crash (like if a single byte of memory is corrupted). Instead, you should use something like ASan (Address Sanitizer) to detect memory corruption.

However, it can be a good idea to run LibFuzzer without sanitization first, as that way you’ll find the more critical bugs first. Otherwise, it’ll trip on a smaller issue (off-by-one heap overflow for example) and it won’t get to the bigger issue until you patch it

Note: LeakSanitizer for memory leakage issues?

Variant Analysis

Also known as variant fuzzing, differential fuzzing, and bounded analysis.

This sort of fuzzing doesn’t look for crashes or infinite loops. Instead, it checks if the output of a function exceeds expected bounds. Basically, it finds logic bugs

For example, let’s say there’s a language analysis function, which rates a Youtube comment based on its offensiveness, on a scale of 0-9. If we do variant analysis on that function, then we would run LibFuzzer for example, and check the function’s output. If the output is less than 0 or more than 9, then we have hit a logic bug.

What is it useful for?

  • Finding bugs in memory-safe languages (golang, javascript) for certain functions

  • Fuzzing Ethereum Solidity smart contracts

    • There have been a few times where a coin operator has created a contract with bugs in it.

    • You can use this fuzzing method to find logical errors, where you can give yourself too much money.

AFL

Benefits over LibFuzzer:

  • Easier to set up

  • Can be more reliable or work in more difficult situations

Negatives:

  • Might catche less bugs

    • Doesn’t have memory leak detection instrumentation, for example.

    • It does have address sanitation, though, which is good.

  • With LibFuzzer you can easily target individual functions because you have to write an LLVMTestOneInput entry function.

Setting Up

Example of setting up fuzzing for duktape javascript engine

For normal compilation, you can just run make -f Makefile.cmdline

Create a directory in w`ith a corpus of a few good examples (of javascript examples, in this case). I added 2 examples in files 000 and 001:

  1. 1 + 2

  2. var num = 2; num * 4

Also make an out dir.

Then, in the makefile, you’ll need to specify the afl compiler as the compiler of the program, and compile an AFL version of the program. There’s a segment that explains this in the AFL github page.

And all we have to change in the Makefile.cmdline is to change:

CC = gcc

To:

CC = afl-clang

Then, make the file:

Make -f Makefile.cmdline

And run the fuzzer (./duk is the executable):

afl-fuzz -i in/ -o out/ -- ./duk

AFL Workshop

Use AFL++, it's maintained. Original isn't.

Useful documentation is in AFL++ documentation:

  • docs/life_pro_tips.txt

  • docs/README.md

  • llvm_mode/README.md

Quickstart

You have an inputs corpus in a folder (and make sure to add a valid input).

mkdir inputs

Compile the program with AFL:

CC=afl-clang-fast AFL_HARDEN=1 make

AFL_HARDEN adds code hardening options

Then fuzz it:

afl-fuzz -i inputs -o out ./vulnerable

Test Harnesses

Harness requirements:

  • Runs in the foreground

  • Processes input from stdin or from a specified file

  • Exits cleanly

  • Skips checksums and cryptographic integrity tests (or use a postprocessor), because the fuzzer will never get through those

Recommended:

  • If you're going beyond simple program crashes, then the test harness should crash if the program is in an invalid state

  • Use AFL's deferred forkserver + persistent mode for speed

Simple Harness

This harness only tests one of the two functions

Important points:

  • Define the size of the input buffer to limit the fuzzer

  • Initialize the input buffer with zeroes to make results reproducible!

#include <unistd.h>
#include <string.h>
#include <stdio.h>

#include "library.h"

// fixed size buffer based on assumptions about the maximum size that is likely necessary to exercise all aspects of the target function
#define SIZE 50

int main() {
  // make sure buffer is initialized to eliminate variable behaviour that isn't dependent on the input.
  char input[SIZE] = {0};

  ssize_t length;
  length = read(STDIN_FILENO, input, SIZE);

  lib_echo(input, length);
}

Compile and run like this:

AFL_HARDEN=1 afl-clang-fast harness.c library.c -o harness
afl-fuzz -i in -o out ./harness

Integer Inputs

This harness uses two 8-byte integer inputs if you specify the "mul" argument:

#include <unistd.h>
#include <string.h>
#include <stdio.h>

#include "library.h"

// fixed size buffer based on assumptions about the maximum size that is likely necessary to exercise all aspects of the target function
#define SIZE 100

int main(int argc, char* argv[]) {
    if((argc == 2) && strcmp(argv[1], "echo") == 0) {
        // make sure buffer is initialized to eliminate variable behaviour that isn't dependent on the input.
        char input[SIZE] = {0};

        ssize_t length;
        length = read(STDIN_FILENO, input, SIZE);

        lib_echo(input, length);
    } else if ((argc == 2) && strcmp(argv[1], "mul") == 0) {
        int a,b = 0;
        read(STDIN_FILENO, &a, 4);
        read(STDIN_FILENO, &b, 4);
        printf("%d\n", lib_mul(a,b));
    } else {
        printf("Usage: %s mul|echo\n", argv[0]);
    }
}

Run it like this:

afl-fuzz -i in -o out ./harness mul

Reading from files

The "@@" symbol specifies that the input is a filename.

afl-fuzz -i in -o out ./harness @@

LLVM Mode

AFL comes with the afl-gcc and afl-clang compilers by default. Where possible, you should prefer to use the afl-clang-fast compiler, because then you can use LLVM mode. However note that trying to compile a program with clang instead of gcc can sometimes fail.

LLVM mode allows you to significantly speed up your execution for most targets because of multiple reasons:

  • Deferred forkserver

    • Tell AFL where to start each new process

    • Is especially useful for processes with an expensive setup phase that isn’t dependent on input

  • Persistent mode

    • Don’t fork for each run, just loop around this bit of code

    • Requires that the program’s state can be completely reset so that multiple calls can be performed without resource leaks

Note: This documentation page says that afl-clang-lto is the best compiler to use because the instrumentation doesn't have collisions, and therefore gets better coverage. However, it's a work in progress and you might run into issues during compilation.

If there's a memory corruption that starts making it so that the program's state isn't completely reset and you're not using ASan to detect it, then you can use first persistent mode to fuzz to a certain extent to get a good corpus, and then fuzz in normal mode for consistency.

Running in persistent mode might cause stability to lower, this is normal. As long as stability is over 90 percent, everything is fine. The user manual goes into a bit more depth on this.

Seed Corpus

Tips:

  • Find real inputs that exercise as much of the target as possible.

  • Keep your seed files small (under 1kB is ideal).

  • Don’t put in redundant files.

Use afl-tmin and afl-cmin to minimize your seed corpus. Also when your fuzzing job has been running for a while (after the first cycle of the master fuzzer for example), you can also take your test corpus, minimize that and fuzz some more. You can use afl-ptmin to parallellize this process.

Sanitizers

Sanitizers turn tiny issues into crashes which AFL can detect.

Sanitizers are mutually incompatible, so run multiple fuzzing instances with different sanitizers

  • UBSan

  • MSan

  • ASan

  • TSan

Use AFL_HARDEN=1 to harden your program (fortify_source and stack protection) and therefore catch more issues

ASan

ASan can require a lot of memory – so much that using it on a 64-bit system can be infeasible if the program isn't robust and doesn't enforce reasonable memory limits.

Read the documentation for workarounds to this. Here are some options:

  • Compile as 32-bit

  • Gauge memory limits using recidivm

  • Use cgroups to limit available memory

Note: I haven't been able to get the cgroups script to work in the afl++ docker container.

/AFLplusplus/utils/asan_cgroups/limit_memory.sh -u username_here afl-fuzz -i in -o out ./target

Parallel Fuzzing

AFL runs on one core, so it makes sense to run one per core. However it’s not trivial.

Multicore:

  • One main instance. -M

  • One secondary instance per core. -S

  • All share the same output directory. -o

Multi Machine:

  • Lots of secondary instances

  • You need to sync state directories periodically

Tip: Run fuzzers with differing behaviours (sanitizers; fuzzing algorithms)

For large-scale serious fuzzing, it makes sense to use a fuzzing service that allows you to scale up massively. That way you don’t have to hack together multi-machine fuzzing yourself.

  • ClusterFuzz

  • OneFuzz

Dictionaries

Very useful for helping the fuzzer get through difficult parts where certain values are expected. Specifies interesting tokens the fuzzer can use to access paths it otherwise might not stumble upon.

Sources:

  • “dictionaries” folder

  • Automatically generated with afl-clang-lto (llvm11+)

    • Statically finds string comparisons while compiling

    • This feature is called autodictionary

  • Automatically generated with libtokencap

    • At runtime, looks at calls to strcmp and memcmp and remembers tokens

  • Source code review

Other similar useful features:

  • Iaf-intel

    • Works by splitting larger comparisons down to smaller ones so the fuzzer can traverse paths and get further.

    • Use export AFL_LLVM_LAF_ALL=1 to enable all Iaf-intel passes

Beyond memory corruption

You can add checks to detect if bad things have happened, e.g:

  • Send input into two different math/crypto libraries and asserts that their output is identical

  • assert that result of BN_sqr(x) == BN_mul(x,x) - CVE-2014-3570 in OpenSSL

  • assert if any filesystem changes have occurred – CVE-2014-6271 in Bash (shellshock)

If something bad has happened, then just crash the program manually using assert.

Miscellaneous Tips

Network targets:

  • Write a harness around the target function

  • Use inetd mode if your target supports it

  • Intercept network system calls, e.g using preenv

The queue directory contains a corpus of test cases that exercise your program

  • You can run it through CoverageSanitizer or gcov and see what isn’t hit

Resume a fuzzing run with -i-

  • You can recompile the binary and then do this

Triage Tips

  • ASan traces can be helpful

  • Use afl-tmin to minimise and simplify test cases while retaining the original control flow

  • “Unique” bugs aren’t unique. They might be the same bug, just through a different control flow. Fix them as you go along and carry on fuzzing.

  • Memory limits can give false positives

Use crashwalk to get a head start on triaging crashes and finding which ones are useful for exploitation. However, note that if you're using ASan, then this probably isn't always helpful, since ASan just does SigAbort when it finds something, and crashwalk doesn't know what to do with it. If you turn off ASan, then the program might not crash at all.

When to Stop

Options:

  • Never - Fuzz as part of CI

  • When the cycles counter is green

    • Last new path was found many cycles ago

    • Pending paths is zero

If you want to stop earlier:

  • Cycles counter is blue

  • Been running for a while

  • Look at the output of afl-plot

Remember to check code coverage.

  • See which parts of the program are hit, and whether all the important functions are hit.

You can check for code coverage using afl-cov.

Examples

LibXML (Optimization)

We're compiling using afl-clang-fast here, but note that lto is usually better. We're using ASAN as the sanitizer. We're fuzzing the xmlReadMemory function.

Instrumenting the library with ASAN:

CC=afl-clang-fast ./autogen.sh
AFL_USE_ASAN=1 make -j 4

Compiling the harness:

AFL_USE_ASAN=1 afl-clang-fast ./harness.c -I libxml2/include libxml2/.libs/libxml2.a -lz -lm -o fuzzer

Seed corpus:

echo "<hi></hi>" > in/hi.xml

Fuzzing with an XML dictionary:

afl-fuzz -i in -o out -x ~/AFLplusplus/dictionaries/xml.dict ./fuzzer

Basic harness, no optimizations, 250 execs/sec, BAD:

#include "libxml/parser.h"
#include "libxml/tree.h"
#include <unistd.h>

#define SIZE 5000
int main(int argc, char **argv) {
        char buf[SIZE] = {0};
        int len = read(STDIN_FILENO, buf, SIZE);

        xmlInitParser();
        xmlDocPtr doc = xmlReadMemory((char *)buf, len, "https://mykter.com", NULL, 0);
        if (doc != NULL) {
                        xmlFreeDoc(doc);
        }
        xmlCleanupParser();

        return(0);
}

Important: Note how the buffer is filled with zeros in main. This is good, because otherwise it would contain random data and the fuzzing might not be reproducible if there is a bug (stability will suffer).

Let's start adding persistent mode optimizations.

Deferred initialization gives a negligible speedup (from 250 to 270 executions per second). Note that AFL does deferred initialization to the main function by default, but you can defer it even more (after xmlInitParser() in this case):

#include "libxml/parser.h"
#include "libxml/tree.h"
#include <unistd.h>

#define SIZE 5000
int main(int argc, char **argv) {
  xmlInitParser();

  #ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();
  #endif

  char buf[SIZE] = {0};
  int len = read(STDIN_FILENO, buf, SIZE);

  xmlDocPtr doc = xmlReadMemory((char *)buf, len, "https://mykter.com", NULL, 0);
  if (doc != NULL) { 
      xmlFreeDoc(doc);
  }                 
  xmlCleanupParser();

  return(0);
}

Adding persistent mode via AFL_LOOP gives ~10x speedup to ~2500 execs/sec:

#include "libxml/parser.h"
#include "libxml/tree.h"
#include <unistd.h>

#define SIZE 5000
int main(int argc, char **argv) {
  xmlInitParser();

  #ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();
  #endif

  while (__AFL_LOOP(1000)) {
    char buf[SIZE] = {0};
    int len = read(STDIN_FILENO, buf, SIZE);

    xmlDocPtr doc = xmlReadMemory((char *)buf, len, "https://mykter.com", NULL, 0);
    if (doc != NULL) {
        xmlFreeDoc(doc);
    }
    xmlCleanupParser();
  }
  return(0);
}

When we added shared memory fuzzing, we further increased the speed by ~50%. In this case, we got rid of the memory limitation, but you can easily just program one in by returning when the "len" variable is higher than what you want.

#include "libxml/parser.h"
#include "libxml/tree.h"
#include <unistd.h>

__AFL_FUZZ_INIT();

int main(int argc, char **argv) {
  xmlInitParser();

  #ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();
  #endif
  unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF; 
  while (__AFL_LOOP(1000)) {
    int len = __AFL_FUZZ_TESTCASE_LEN;

    xmlDocPtr doc = xmlReadMemory((char *)buf, len, "https://mykter.com", NULL, 0);
    if (doc != NULL) { 
        xmlFreeDoc(doc);
    }
    xmlCleanupParser();
  }
  return(0);
}

We further increased speed using parallellization. We ran 4 fuzzers in parallel and the fuzz speed increased by 4x, since each fuzzer runs on 1 thread.

mkdir sync_dir
afl-fuzz -i in -o sync_dir -M fuzzer01 -x ~/AFLplusplus/dictionaries/xml.dict ./fuzzer
afl-fuzz -i in -o sync_dir -S fuzzer02 -x ~/AFLplusplus/dictionaries/xml.dict ./fuzzer
afl-fuzz -i in -o sync_dir -S fuzzer03 -x ~/AFLplusplus/dictionaries/xml.dict ./fuzzer
afl-fuzz -i in -o sync_dir -S fuzzer04 -x ~/AFLplusplus/dictionaries/xml.dict ./fuzzer

When doing parallel fuzzing, you can check up on your progress using the afl-whatsup tool:

afl-whatsup sync_dir/

After 44 minutes of total running time (11 minutes per core), we have 48 locally unique crashes

LibFuzzer

It’s said to be recommended to use LibFuzzer over AFL and honggfuzz (or use multiple in conjunction), as it will find the most bugs. But at the same time, it’s a bit more difficult to set up. Unlike AFL, LibFuzzer doesn’t do forking + feeding data to stdin.

  • It fuzzes in-process persistently by default (though AFL added support for this)

  • State can accumulate (sometimes good, sometimes bad)

You have to be careful about state accumulation.

Similar fuzzers:

  • Honggfuzz

The above tutorial contains instructions about:

  • Setting up libfuzzer

  • Running it on multiple cores

  • Minimizing the corpus

  • Finding RCEs which are blocked by DoS vulns

  • Coverage visualization

  • And more

Running It

This demonstrates how to run LibFuzzer on a very simple fuzzme, which already includes the LLVMTestOneInput function.

Compile it with ASan and fuzzer options:

clang++ -g -fsanitize=address,fuzzer ./fuzz_me.cc

Then run it, and it will find a heap buffer overflow.

Honggfuzz

Similar to LibFuzzer. May find some bugs that libfuzzer won’t and may miss a lot of bugs that LibFuzzer won’t. It’s a bit of a mix of libfuzzer and AFL in its usage.

First you have to compile it with instrumentation:

hfuzz-clang++ -o fuzz_me.hfuzz -g fuzz_me.cc

Then, you have to run it with honggfuzz:

honggfuzz -P --output ./out/ -i ./in/  -- ./hfuzz.out
  • -P is for persistent mode

  • --output saves the generated corpus in the specified directory

  • -i specifies where the input corpus should be taken from

Black Box Fuzzing

A blackbox fuzzer generates inputs for a target program without knowledge of its internal behaviour or implementation.

A blackbox fuzzer may generate inputs from scratch, or rely on a static corpus of valid input files to base mutations on. Unlike coverage guided fuzzing, the corpus does not grow here.

Also, if you don’t have coverage guiding or sanitization (bug detection), then it will probably miss a lot of bugs.

Radamsa

Pretty easy and beginner-friendly tool. Radamsa accepts input and tries to generate interesting test cases from that.

You’ll probably want to write a script to continuously use radamsa on a program and detect segfaults and such. Here’s an example of using it on Exploit Exercises Phoenix Format 5:

#!/bin/bash
while true; do
  inp="$(echo "hello" | radamsa)"
  echo "$inp" | /opt/phoenix/i486/stack-five > /dev/null 2>&1
  if [ $? -gt 127 ]; then
    echo "$inp" > crash-`date +s%.%N`.txt
    echo "Crash found!"
  fi
done

Fuzzing Closed-Source Software

This section covers difficulties in fuzzing programs, which do not include the source code, and ways to overcome that issue.

Approaches:

  • Black box fuzzing - doesn’t find a lot of bugs

  • Adding instrumentation (coverage tracking, etc) by dynamically rewriting - simple to implement, but 10-100x slower

    • AFL can do this by running executables under QEMU, and QEMU provides tracing and everything for you. You can even use it to fuzz ARM binaries from an x86 system.

  • Adding instrumentation by statically rewriting - more complex analysis, but much more performant. Needs to fully model program control flow without executing each path, so can be buggy (for example with stripped binaries).

Symbolic Execution

Symbolic execution executes a program, but uses symbolic values (formulae) instead of normal variable names. When encountering a path, it takes both paths (and adjusts the possible values accordingly for each path). Thus it can theoretically traverse every single branch of a program and find everything. Practically, it has a number of practical limitations, however:

  • Large computing power demand

  • State explosion

  • Difficult to set up

https://nebelwelt.net/files/13CCC-presentation.pdf

Last updated