BAP Graph Library


Although BAP is well-documented, its vast interface may be a little hard to navigate when looking for specific features. This is the first in a series of posts which introduces some of BAP's features by way of "usage patterns". The intention is to provide you with small snippets of code that encapsulate uses of BAP features.

At the bottom of this post you can find a template file and example source for the binary used in the examples.

In this post we focus on BAP's Graphlib.

Graphlib

Dot Output

How do I output my program's callgraph in dot format?

let callgraph = Program.to_graph @@ Project.program project in
let string_of_node n =
  sprintf "%S" @@ Tid.name @@ Graphlib.Callgraph.Node.label n in
Graphlib.to_dot (module Graphlib.Callgraph)
  ~string_of_node ~filename:"callgraph.dot" callgraph;

Notes:

  • BAP's graphlib module has a dedicated submodule for callgraphs: Graphlib.Callgraph

The output file callgraph.dot for the example binary appears as follows:


How do I output the CFG of of "main" in dot format?

let program = Project.program project in

let left_justify =
  String.concat_map ~f:(fun c ->
      if c = '\n' then "\\l" else Char.to_string c) in

(match Term.find sub_t program Tid.(!"@main") with
 | Some main_sub ->
   let node_attrs _ =
     [`Shape `Box] in
   let string_of_node node = sprintf "\"\\%s\""
     @@ Blk.to_string @@ Graphlib.Ir.Node.label node |> left_justify in
   Graphlib.to_dot (module Graphlib.Ir) ~string_of_node ~node_attrs
     ~filename:"main.dot" @@ Sub.to_cfg main_sub
 | None -> ());

The ouptut file main.dot appears as follows:

Notes:

  • Tid.(!"@main") looks for a function called main. Your program needs to be compiled with debugging symbols, or you need to use the --use-ida option if you want to use this notation. Alternatively, use Tid.(!"@sub_400440") where 400440 corresponds to the address (in hex) of your function (for example, entry point).

  • Here we use Sub.to_cfg, which returns a graph corresponding to Graphlib.Ir. Nodes of Graphlib.Ir are blks.


What if I don't want all of the IR in my CFG, but rather nodes and edges labeled with identifiers?

let program = Project.program project in
Option.(Term.find sub_t program Tid.(!"@main") >>= fun main_sub ->
        let string_of_node n = sprintf "\"\\%s\""
            @@ Tid.name @@ Graphlib.Tid.Tid.Node.label n in
        let string_of_edge e = Tid.name @@ Graphlib.Tid.Tid.Edge.label e in
            Graphlib.to_dot ~string_of_node ~string_of_edge
              ~filename:"main_with_tids.dot" (module Graphlib.Tid.Tid) @@
          Sub.to_graph main_sub; Some main_sub)
|> Pervasives.ignore;

Notes:

  • Here we use Sub.to_graph, and the appropriate types for labels: tid, as opposed to blk in the previous example. See bap documentation for why you might want this instead.


Strongly Connected Components

How do I find strongly connected components in my program?

let callgraph = Program.to_graph @@ Project.program project in
let scc_partition = Graphlib.strong_components
    (module Graphlib.Callgraph) callgraph in
Format.printf "%d components found:\n"
@@ Partition.number_of_groups scc_partition;

Seq.iter (Partition.groups scc_partition) ~f:(fun group ->
    Group.enum group |> Seq.iter ~f:(fun x ->
        Format.printf "%s " @@ Tid.to_string x);
    Format.printf "\n");
Some scc_partition
|> Pervasives.ignore;

Output:

7 components found:
@sub_4003e0
@sub_400410
@sub_400430
@h @g
@f
@main
@__libc_csu_init

Notes:

  • Notice that the two functions @h and @g are printed on the same line, indicating that they belong to the same group (i.e., they form a strongly connected component).


Graph Construction

How can I construct arbitrary graphs?

let module G = Graphlib.Int.Unit in
let g = Graphlib.create (module G) ~edges:[0,1,();1,1,()] () in
Graphlib.to_dot (module G) ~filename:"graph.dot" g;

Notes:

  • BAP's Graphlib is in fact generic, and graphs over arbitrary types may be constructed.


Wrap-up

This really just scratches the surface of Graphlib. There are a number of other interesting functions in the library that can be referred to. Notable ones include:

Template

All of the snippets above may be substituted into the following template.

#use "topfind";;
#require "bap.top";;
open Core_kernel.Std
open Bap.Std
open Or_error

let main () =
  Project.from_file Sys.argv.(1) >>= fun project ->

    (* Place snippet here *)

  return ()

let () =
  try main ()
      |> function
      | Ok _ -> ()
      | Error e -> Format.printf "BAP error: %s\n" @@ Error.to_string_hum e
  with
  | Invalid_argument _ ->
    Format.printf "Please specify a file on the command line\n"

It can then be run on the commandline as follows:

ocaml template.ml example

For the example binary example, you may compile the following:

#include <stdio.h>

int g(int i);

int h(int i) {
  i += 1;
  return g(i);
}

int g(int i) {
  if (i > 10) {
    return i;
  }
  i += 1;
  return h(i);
}

int f(int i) {
  i += 1;
  return g(i);
}

int main() {
  int i = 0;
  int result;

  result = f(i);
  printf("Res: %d\n", result);
}
gcc -o example example.c