Skip to main content

A coverage-guided fuzzer for Python and Python extensions.

Project description

Atheris: A Coverage-Guided, Native Python Fuzzer

Atheris is a coverage-guided Python fuzzing engine. It supports fuzzing of Python code, but also native extensions written for CPython. Atheris is based off of libFuzzer. When fuzzing native code, Atheris can be used in combination with Address Sanitizer or Undefined Behavior Sanitizer to catch extra bugs.

Installation Instructions

Atheris supports Linux (32- and 64-bit) and Mac OS X.

Linux

Atheris relies on libFuzzer, which is distributed with Clang. If you have a sufficiently new version of clang on your path, installation is as simple as:

pip install atheris

If you don't have clang installed or it's too old, you'll need to download and build the latest version of LLVM. Follow the instructions in Installing Against New LLVM below.

Mac

Atheris relies on libFuzzer, which is distributed with Clang. However, Apple Clang doesn't come with libFuzzer, so you'll need to install a new version of LLVM from head. Follow the instructions in Installing Against New LLVM below.

Installing Against New LLVM

# Building LLVM
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build
cd build
cmake -DLLVM_ENABLE_PROJECTS='clang;compiler-rt' -G "Unix Makefiles" ../llvm
make -j 10  # This step is very slow

# Installing Atheris
CLANG_BIN="$(pwd)/bin/clang" pip3 install atheris

Using Atheris

Example:

import atheris
import sys

def TestOneInput(data):
  if data == b"bad":
    raise RuntimeError("Badness!")

atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()

Atheris supports fuzzing Python code, and uses Python code coverage information for this purpose.

Fuzzing Python Code

While Atheris supports Python 2.7 and Python 3.3+, its Python code coverage support is significantly better when used with Python 3.8+, as it supports opcode-by-opcode coverage. If fuzzing Python code, we strongly recommend using Python 3.8+ where possible.

When fuzzing Python, Atheris will report a failure if the Python code under test throws an uncaught exception.

Be sure to pass enable_python_coverage=True as an argument to Setup(). You can additionally pass enable_python_opcode_coverage=[True/False] to turn on and off opcode coverage. Opcode coverage is typically beneficial, but may provide more performance impact than benefit on large Python projects. This option defaults to True on Python 3.8+, or False otherwise.

Opcode coverage must be enabled to support features like intelligent string comparison fuzzing for Python code.

Fuzzing Native Extensions

In order for native fuzzing to be effective, such native extensions must be built with Clang, using the argument -fsanitize=fuzzer-no-link. They should be built with the same clang as was used when building Atheris.

The mechanics of building with Clang depend on your native extension. However, if your library is built with setuptools (e.g. pip and setup.py), the following is often sufficient:

CC="/usr/bin/clang" CFLAGS="-fsanitize=fuzzer-no-link" CXX="/usr/bin/clang++" CXXFLAGS="-fsanitize=fuzzer-no-link" pip install .

Using Sanitizers

When fuzzing a native extension, we strongly recommend you use a sanitizer, such as Address Sanitizer or Undefined Behavior Sanitizer. However, there are complexities involved in doing this; see using_sanitizers.md for details.

Integration with OSS-Fuzz

Atheris is fully supported by OSS-Fuzz, Google's continuous fuzzing service for open source projects. For integrating with OSS-Fuzz, please see https://google.github.io/oss-fuzz/getting-started/new-project-guide/python-lang.

API

Main Interface

The atheris module provides two key functions: Setup() and Fuzz().

In your source file, define a fuzzer entry point function, and pass it to atheris.Setup(), along with the fuzzer's arguments (typically sys.argv). Finally, call atheris.Fuzz() to start fuzzing. Here's an example:

def Setup(args, callback, enable_python_coverage=True, enable_python_opcode_coverage=True):

Configure the Atheris Python Fuzzer. You must call atheris.Setup() before atheris.Fuzz().

Args:

  • args: A list of strings: the process arguments to pass to the fuzzer, typically sys.argv. This argument list may be modified in-place, to remove arguments consumed by the fuzzer. See the LibFuzzer docs for a list of such options.
  • test_one_input: your fuzzer's entry point. Must take a single bytes argument (str in Python 2). This will be repeatedly invoked with a single bytes container.

Optional Args:

  • enable_python_coverage: boolean. Controls whether to collect coverage information on Python code. Defaults to True. If fuzzing a native extension with minimal Python code, set to False for a performance increase.
  • enable_python_opcode_coverage: boolean. Controls whether to collect Python opcode trace events. You typically want this enabled. Defaults to True on Python 3.8+, and False otherwise. Ignored if enable_python_coverage=False, or if using a version of Python prior to 3.8.
def Fuzz():

This starts the fuzzer. You must have called Setup() before calling this function. This function does not return.

In many cases Setup() and Fuzz() could be combined into a single function, but they are separated because you may want the fuzzer to consume the command-line arguments it handles before passing any remaining arguments to another setup function.

def TraceThisThread(enable_python_opcode_coverage=True):

While we don't recommend using threads during fuzzing if you can avoid it, Atheris does support it.

This function enables the collection of coverage information for the current thread. Python coverage collection must be enabled in Setup() or this has no effect. (Thread coverage still works if this function is called before Setup(), and Setup() is subsequently called with enable_python_coverage=True).

Optional Args:

  • enable_python_opcode_coverage: boolean. Controls whether to collect Python opcode trace events for this thread. You typically want this enabled. Defaults to True ; ignored and unsupported if using a version of Python prior to 3.8.

FuzzedDataProvider

Often, a bytes object is not convenient input to your code being fuzzed. Similar to libFuzzer, we provide a FuzzedDataProvider to translate these bytes into other input forms. Alternatively, you can use Hypothesis as described below.

You can construct the FuzzedDataProvider with:

fdp = atheris.FuzzedDataProvider(input_bytes)

The FuzzedDataProvider then supports the following functions:

def ConsumeBytes(count: int)

Consume count bytes.

def ConsumeUnicode(count: int)

Consume unicode characters. Might contain surrogate pair characters, which according to the specification are invalid in this situation. However, many core software tools (e.g. Windows file paths) support them, so other software often needs to too.

def ConsumeUnicodeNoSurrogates(count: int)

Consume unicode characters, but never generate surrogate pair characters.

def ConsumeString(count: int)

Alias for ConsumeBytes in Python 2, or ConsumeUnicode in Python 3.

def ConsumeInt(int: bytes)

Consume a signed integer of the specified size (when written in two's complement notation).

def ConsumeUInt(int: bytes)

Consume an unsigned integer of the specified size.

def ConsumeIntInRange(min: int, max: int)

Consume an integer in the range [min, max].

def ConsumeIntList(count: int, bytes: int)

Consume a list of count integers of size bytes.

def ConsumeIntListInRange(count: int, min: int, max: int)

Consume a list of count integers in the range [min, max].

def ConsumeFloat()

Consume an arbitrary floating-point value. Might produce weird values like NaN and Inf.

def ConsumeRegularFloat()

Consume an arbitrary numeric floating-point value; never produces a special type like NaN or Inf.

def ConsumeProbability()

Consume a floating-point value in the range [0, 1].

def ConsumeFloatInRange(min: float, max: float)

Consume a floating-point value in the range [min, max].

def ConsumeFloatList(count: int)

Consume a list of count arbitrary floating-point values. Might produce weird values like NaN and Inf.

def ConsumeRegularFloatList(count: int)

Consume a list of count arbitrary numeric floating-point values; never produces special types like NaN or Inf.

def ConsumeProbabilityList(count: int)

Consume a list of count floats in the range [0, 1].

def ConsumeFloatListInRange(count: int, min: float, max: float)

Consume a list of count floats in the range [min, max]

def PickValueInList(l: list)

Given a list, pick a random value

def ConsumeBool()

Consume either True or False.

Use with Hypothesis

The Hypothesis library for property-based testing is also useful for writing fuzz harnesses. As well as a great library of "strategies" which describe the inputs to generate, using Hypothesis makes it trivial to reproduce failures found by the fuzzer - including automatically finding a minimal reproducing input. For example:

import atheris
from hypothesis import given, strategies as st

@given(st.from_regex(r"\w+!?", fullmatch=True))
def test(string):
  assert string != "bad"

atheris.Setup(sys.argv, test.hypothesis.fuzz_one_input)
atheris.Fuzz()

See here for more details, or here for what you can generate.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

atheris-1.0.9.tar.gz (42.8 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

atheris-1.0.9-cp39-cp39-manylinux2014_x86_64.whl (26.1 MB view details)

Uploaded CPython 3.9

atheris-1.0.9-cp38-cp38-manylinux2014_x86_64.whl (26.2 MB view details)

Uploaded CPython 3.8

atheris-1.0.9-cp37-cp37m-manylinux2014_x86_64.whl (26.2 MB view details)

Uploaded CPython 3.7m

atheris-1.0.9-cp36-cp36m-manylinux2014_x86_64.whl (26.2 MB view details)

Uploaded CPython 3.6m

atheris-1.0.9-cp35-cp35m-manylinux2014_x86_64.whl (26.2 MB view details)

Uploaded CPython 3.5m

File details

Details for the file atheris-1.0.9.tar.gz.

File metadata

  • Download URL: atheris-1.0.9.tar.gz
  • Upload date:
  • Size: 42.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.52.0 CPython/3.8.6

File hashes

Hashes for atheris-1.0.9.tar.gz
Algorithm Hash digest
SHA256 2e5d8c58d6fcca4da79f5666eeda5c3e831aee52c83531ef3b9844c982f133cc
MD5 9cf79e6fed354db072f9d1a8743a3d44
BLAKE2b-256 a0e7b0e3fe407dd3a66fec124254c9ffcbda67e5da48bc0182a198d9b45639ee

See more details on using hashes here.

File details

Details for the file atheris-1.0.9-cp39-cp39-manylinux2014_x86_64.whl.

File metadata

  • Download URL: atheris-1.0.9-cp39-cp39-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 26.1 MB
  • Tags: CPython 3.9
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.52.0 CPython/3.8.6

File hashes

Hashes for atheris-1.0.9-cp39-cp39-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 45a47a26114b210f130338b9b0548e9156eb9ba70b9bd8178456892ac5e9d83d
MD5 6f9ddcf73fda952297b546be3a84ff2a
BLAKE2b-256 297ae71f74f9c36196f2edf0075a88ca397cefbd4e244c5126273eaddbf63649

See more details on using hashes here.

File details

Details for the file atheris-1.0.9-cp38-cp38-manylinux2014_x86_64.whl.

File metadata

  • Download URL: atheris-1.0.9-cp38-cp38-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 26.2 MB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.52.0 CPython/3.8.6

File hashes

Hashes for atheris-1.0.9-cp38-cp38-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a6f5a32f4c9d4e22b06852b3a028259be49be8aeb0d703359c0767947615b923
MD5 6974639deebf3b6183249849052ef390
BLAKE2b-256 2f31f6005dacd55f3850024656561416af3ff67bfefd1edd1d4e8c55178c21f5

See more details on using hashes here.

File details

Details for the file atheris-1.0.9-cp37-cp37m-manylinux2014_x86_64.whl.

File metadata

  • Download URL: atheris-1.0.9-cp37-cp37m-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 26.2 MB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.52.0 CPython/3.8.6

File hashes

Hashes for atheris-1.0.9-cp37-cp37m-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 cd8bf8e7223250e30d1235e0760d28f94d8feb45dcbb43e4109527932dba32ef
MD5 1b5b561f5d130b8b465b0629a5542093
BLAKE2b-256 f308329dbd4a5bb233b600faf4390b1f0849224e1504dd93076b8caa583e2c08

See more details on using hashes here.

File details

Details for the file atheris-1.0.9-cp36-cp36m-manylinux2014_x86_64.whl.

File metadata

  • Download URL: atheris-1.0.9-cp36-cp36m-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 26.2 MB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.52.0 CPython/3.8.6

File hashes

Hashes for atheris-1.0.9-cp36-cp36m-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 9aae94b858579b0642252f0400deca8a630544611f67aa48b6ee7122a2a7c596
MD5 e93ad9e23705aede9514e42c34365999
BLAKE2b-256 ba9e9a968c03a4fb1348e1112ad3ccb0bb1060b270b6c4b81e61ef4d8a9561a8

See more details on using hashes here.

File details

Details for the file atheris-1.0.9-cp35-cp35m-manylinux2014_x86_64.whl.

File metadata

  • Download URL: atheris-1.0.9-cp35-cp35m-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 26.2 MB
  • Tags: CPython 3.5m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.52.0 CPython/3.8.6

File hashes

Hashes for atheris-1.0.9-cp35-cp35m-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d3d4a49bd3e7efd70e119a774aecfcddab1fddd33d5045d3204ca248a19dcadb
MD5 5654ffb3f207de00841dc2b764e6a2ee
BLAKE2b-256 a2a8bbe070f0ee271dcaf7bfccce5e828b3737d9d207a4c38c8c3b87f46c75e4

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page