Module Extension.Type

Data types for parameters.

A data type defines the validation, parsing, and textual representation of data used for command and configuration parameters.

Data types are not declarative, but structural, i.e., they define a set of rules that describe the set of possible values and the mapping of those values to their OCaml representation.

Given that the textual representation of data could itself be non-structural, e.g., filenames do not define themselves but act as references to some other data, it is also important to correctly define the equality operator. We use the digest function, which computes an md5 hash of the datum and thereby describes how it should be compared to other data of the same type. These digests are approximations: they guarantee that data with equal digests are equal (modulo the probability of an md5 hash collision), but not always vice versa (since it is not always possible or feasible to compute the complete digest, cf. the digest of the /dev folder).
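As a rough illustration of the idea, the OCaml standard library's Digest module computes MD5 hashes. The sketch below is not taken from this module's implementation; digest_string is a hypothetical helper that only demonstrates the stated guarantee, namely that equal digests imply equal data (up to MD5 collisions).

```ocaml
(* A stand-alone sketch using only the standard library.
   [digest_string] is a hypothetical helper, not part of Extension.Type. *)
let digest_string (s : string) : string =
  Digest.to_hex (Digest.string s)

let () =
  (* Equal data always produce equal digests. *)
  assert (digest_string "main.c" = digest_string "main.c");
  (* Different data produce different digests (barring an MD5 collision). *)
  assert (digest_string "main.c" <> digest_string "main.o");
  (* An MD5 digest printed in hexadecimal is 32 characters long. *)
  assert (String.length (digest_string "main.c") = 32)
```

For a filename, hashing the characters of the name, as done here, says nothing about the file it refers to, which is why a type may need a digest function of its own.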

type 'a t = 'a typ
val define : ?name:string -> ?digest:('a -> string) -> parse:(string -> 'a) -> print:('a -> string) -> 'a -> 'a t

define ~parse ~print default defines a data type.

print x is the textual representation of the value x. For all x in the defined type, x = parse (print x).

The parse function may raise the Invalid_arg exception to indicate that the provided datum doesn't represent a valid element of the type. It may raise any other exception to indicate other errors. In either case, any exception raised by the parse, print, or digest functions is caught and causes Bap_main.init to terminate abnormally with a corresponding error condition.

@parameter digest if provided, then digest x should evaluate to an md5 hash of x such that, for all y, if digest x = digest y then x = y. I.e., if the digests are equal (modulo md5 collisions) then x and y are also equal. The converse is not guaranteed, although most data types do provide it.

@parameter name is the variable name used to refer to elements of the type t (defaults to "VAL").
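The contract of define can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not the module's implementation: the record typ and its fields are hypothetical, the fallback digest (an MD5 hash of the printed value) is an assumption, and the standard Invalid_argument exception stands in for the Invalid_arg exception named above.

```ocaml
(* A toy model of a data type; the record and its field names are
   hypothetical and do not mirror the real implementation. *)
type 'a typ = {
  name : string;
  digest : 'a -> string;
  parse : string -> 'a;
  print : 'a -> string;
  default : 'a;
}

(* Mirrors the shape of [define]: [name] falls back to "VAL"; the
   fallback for [digest] is an assumption made for this sketch. *)
let define ?(name = "VAL") ?digest ~parse ~print default =
  let digest = match digest with
    | Some f -> f
    | None -> fun x -> Digest.to_hex (Digest.string (print x)) in
  { name; digest; parse; print; default }

(* A percentage in the range 0..100. [Invalid_argument] stands in for
   the Invalid_arg exception mentioned in the text. *)
let percent : int typ =
  let parse s = match int_of_string_opt s with
    | Some n when n >= 0 && n <= 100 -> n
    | _ -> invalid_arg ("not a percentage: " ^ s) in
  define ~name:"PERCENT" ~parse ~print:string_of_int 50

let () =
  (* The round-trip law: x = parse (print x). *)
  assert (percent.parse (percent.print 42) = 42);
  assert (percent.default = 50);
  assert (percent.name = "PERCENT")
```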

val refine : 'a t -> ('a -> unit) -> 'a t

refine t valid narrows the set of values of t to those that are valid. The valid function shall raise the Invalid_arg exception for all values that are not members of the newly defined data type.
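One way to picture refinement is as a wrapper around the parser. The following is a minimal model, assuming a hypothetical record typ and using the standard Invalid_argument exception in place of Invalid_arg; the real implementation may differ.

```ocaml
(* Toy model with only the fields needed to show refinement.
   The record is hypothetical. *)
type 'a typ = { parse : string -> 'a; print : 'a -> string; default : 'a }

(* [refine t valid] parses with [t], then runs [valid] on the result;
   an exception raised by [valid] rejects the value. *)
let refine t valid =
  { t with parse = (fun s -> let x = t.parse s in valid x; x) }

let int_t = { parse = int_of_string; print = string_of_int; default = 0 }

(* Accept only non-negative integers. *)
let non_negative x =
  if x < 0 then invalid_arg "negative value"

let nat = refine int_t non_negative

let () =
  assert (nat.parse "7" = 7);
  let rejected =
    try ignore (nat.parse "-7"); false
    with Invalid_argument _ -> true in
  assert rejected
```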

val rename : 'a t -> string -> 'a t

rename t var denotes elements of t with the new name var.

val digest : 'a t -> 'a -> string

digest t x is the digest of x.

val (=?) : 'a t -> 'a -> 'a t

t =? x defines a new type with a different default.

The new type has the same definition as t except the default value is x.

val (|=) : 'a t -> ('a -> unit) -> 'a t

t |= guard is refine t guard.

val (%:) : string -> 'a t -> 'a t

name %: t is rename t name.
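The three operators can be modeled in a few lines, which also shows how they chain without parentheses. This is a sketch over a hypothetical record typ; int_t, is_port, and port are illustrative names, and Invalid_argument stands in for Invalid_arg.

```ocaml
(* Hypothetical model of a data type, sufficient for the operators. *)
type 'a typ = {
  name : string;
  parse : string -> 'a;
  print : 'a -> string;
  default : 'a;
}

let refine t valid =
  { t with parse = (fun s -> let x = t.parse s in valid x; x) }
let rename t name = { t with name }

let ( =? ) t x = { t with default = x }   (* new default *)
let ( |= ) t guard = refine t guard       (* narrower set of values *)
let ( %: ) name t = rename t name         (* new variable name *)

let int_t =
  { name = "VAL"; parse = int_of_string; print = string_of_int; default = 0 }

let is_port p =
  if p < 1 || p > 65535 then invalid_arg "port out of range"

(* [%:] binds tighter than [=?] and [|=], which associate to the left,
   so this reads as (("PORT" %: int_t) =? 8080) |= is_port. *)
let port = "PORT" %: int_t =? 8080 |= is_port

let () =
  assert (port.name = "PORT");
  assert (port.default = 8080);
  assert (port.parse "443" = 443)
```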

Note that the operators (=?), (|=), and (%:) are designed to be used together for easy definitions of new types, e.g.,

let arch = Type.("code" %: arch_t =? `x86 |= only_x86)

val print : 'a t -> 'a -> string

print t x is the textual representation of x.

val parse : 'a t -> string -> 'a

parse t s is the OCaml value representing s.

Raises the Invalid_arg exception if s is not a valid representation of an element of t.

val name : 'a t -> string

name t is the name of the variable that ranges over t.

val default : 'a t -> 'a

default t is the default value of t.

Predefined data types

val bool : bool t

bool is "true" | "false"

val char : char t

char is a single character.

val int : int t

int is a sequence of digits.

Common OCaml syntax is supported, with binary, decimal, and hexadecimal literals.
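For reference, the standard library's int_of_string accepts the same literal forms. Whether the int type delegates to that function is an assumption here; the snippet only shows what OCaml syntax for binary, decimal, and hexadecimal literals looks like.

```ocaml
let () =
  (* Decimal, binary (0b prefix), and hexadecimal (0x prefix) literals. *)
  assert (int_of_string "42" = 42);
  assert (int_of_string "0b101010" = 42);
  assert (int_of_string "0x2A" = 42);
  assert (int_of_string "0x2a" = 42);
  (* Anything else is rejected. *)
  assert (int_of_string_opt "forty-two" = None)
```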

val nativeint : nativeint t

nativeint is a sequence of digits.

This type uses the processor-native integer as its OCaml representation, so it is one bit wider than the int type.
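The one-bit difference can be checked with the standard library alone. The snippet assumes a native or bytecode OCaml runtime, where int reserves one bit for the runtime's tag.

```ocaml
let () =
  (* Typically prints 63 and 64 on a 64-bit platform. *)
  Printf.printf "int: %d bits, nativeint: %d bits\n"
    Sys.int_size Nativeint.size;
  assert (Nativeint.size = Sys.int_size + 1);
  (* The largest nativeint does not fit into an int. *)
  assert (Nativeint.compare Nativeint.max_int (Nativeint.of_int max_int) > 0)
```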

val int32 : int32 t

int32 is a sequence of digits.

val int64 : int64 t

int64 is a sequence of digits.

val float : float t

float is a floating point number.

val string : string t

string is a sequence of bytes.

When the sequence contains whitespace, delimit the whole sequence with double or single quotes.

val some : 'a t -> 'a option t

some t extends t with an empty string.

val enum : (string * 'a) list -> 'a t

enum repr defines a type from the given representation.

Defines a type whose print and parse functions satisfy, for each pair (s,v) in repr, print v = s and parse s = v.

It is a configuration error if repr is empty.

If repr has repeated keys, i.e., for the same textual representation there are different values, then the result is undefined.
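The print and parse laws of enum can be modeled with an association list. This is a sketch, not the module's implementation: the record typ, the level type, and level_t are hypothetical, and Invalid_argument stands in for both the configuration error and Invalid_arg.

```ocaml
(* Hypothetical model holding only the two functions of interest. *)
type 'a typ = { parse : string -> 'a; print : 'a -> string }

let enum (repr : (string * 'a) list) : 'a typ =
  if repr = [] then invalid_arg "enum: empty representation";
  let parse s =
    match List.assoc_opt s repr with
    | Some v -> v
    | None -> invalid_arg ("unknown value: " ^ s) in
  (* Reverse lookup: find the text paired with the value. *)
  let print v = fst (List.find (fun (_, v') -> v' = v) repr) in
  { parse; print }

type level = Debug | Info | Error

let level_t = enum ["debug", Debug; "info", Info; "error", Error]

let () =
  (* For each pair (s,v): print v = s and parse s = v. *)
  assert (level_t.parse "info" = Info);
  assert (level_t.print Error = "error")
```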

val path : string t

path denotes a file path.

The path is suitable for denoting output paths, and its digest is the digest of the characters that constitute the path.

val file : string t

file is the name of an input file or directory.

The file denoted by the name must exist and must be accessible.

Digesting paths

The following rules describe how the digest of the path is computed. It is assumed that the file type denotes an input file, therefore the content referenced by the path is approximately digested. For output destinations the path type is more suitable.

1. If the name is a symbolic link then the digest of the link destination is computed.

2. If the name refers to a regular file then the digest of the file is the digest of its contents and the name itself doesn't affect the digest value.

3. If the name refers to a directory, then a recursive digest is computed, such that:

  • if the directory contains a small number of regular files and directories (less than 4k), then a cumulative digest of its content, built from all constituent path names and modification times, is computed;
  • otherwise (if the directory is too large or contains non regular files, e.g., sockets, fifo, devices), then a fresh new random digest is created from the directory name and the current time.
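Rule 2 can be sketched with the standard library's Digest.file. The helpers digest_regular_file and write_temp are hypothetical names introduced for illustration; symbolic links and directories (rules 1 and 3) are outside the scope of this sketch.

```ocaml
(* Hypothetical helper: the digest of a regular file's contents. *)
let digest_regular_file (path : string) : string =
  Digest.to_hex (Digest.file path)

(* Writes [contents] to a fresh temporary file and returns its path. *)
let write_temp (contents : string) : string =
  let path = Filename.temp_file "digest_demo" ".txt" in
  let oc = open_out_bin path in
  output_string oc contents;
  close_out oc;
  path

let () =
  let a = write_temp "same bytes" in
  let b = write_temp "same bytes" in
  let c = write_temp "other bytes" in
  (* Different names, same contents: equal digests. *)
  assert (a <> b);
  assert (digest_regular_file a = digest_regular_file b);
  (* Different contents: different digests (barring a collision). *)
  assert (digest_regular_file a <> digest_regular_file c);
  List.iter Sys.remove [a; b; c]
```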

val dir : string t

dir denotes a file which must be a directory.

The directory denoted by the name must exist. See the file type for more information about computing the digest.

val non_dir_file : string t

non_dir_file denotes a file which must not be a directory.

The file denoted by the name must exist. See the file type for more information about computing the digest.

val list : ?sep:char -> 'a t -> 'a list t

list ~sep t is a list of t elements, separated with sep.

val array : ?sep:char -> 'a t -> 'a array t

array ~sep t is an array of t elements, separated with sep.

@parameter sep defaults to ','.
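The textual form of a sequence can be modeled with String.split_on_char. The functions parse_list and print_list are hypothetical; mapping the empty string to the empty list, and list sharing the ',' default documented for array, are assumptions of this sketch.

```ocaml
(* Hypothetical model: parse a separated sequence, given the element parser. *)
let parse_list ?(sep = ',') (parse : string -> 'a) (s : string) : 'a list =
  if s = "" then [] else List.map parse (String.split_on_char sep s)

(* The inverse: print each element and join with the separator. *)
let print_list ?(sep = ',') (print : 'a -> string) (xs : 'a list) : string =
  String.concat (String.make 1 sep) (List.map print xs)

let () =
  assert (parse_list int_of_string "1,2,3" = [1; 2; 3]);
  assert (parse_list ~sep:':' int_of_string "1:2:3" = [1; 2; 3]);
  assert (print_list string_of_int [1; 2; 3] = "1,2,3");
  (* An array differs only in the OCaml representation. *)
  assert (Array.of_list (parse_list int_of_string "4,5") = [|4; 5|])
```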

val pair : ?sep:char -> 'a t -> 'b t -> ('a * 'b) t

pair ~sep t1 t2 is a pair of t1 and t2, separated with sep.

@parameter sep defaults to ','.
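A pair can be modeled by splitting the input at the separator and parsing each half with its own parser. The function parse_pair is hypothetical, and splitting at the first occurrence of the separator is an assumption of this sketch.

```ocaml
(* Hypothetical model: split at the first [sep], parse each side. *)
let parse_pair ?(sep = ',') parse_fst parse_snd s =
  match String.index_opt s sep with
  | None -> invalid_arg ("missing separator in: " ^ s)
  | Some i ->
    let left = String.sub s 0 i in
    let right = String.sub s (i + 1) (String.length s - i - 1) in
    (parse_fst left, parse_snd right)

let () =
  assert (parse_pair int_of_string float_of_string "8080,0.5" = (8080, 0.5));
  assert (parse_pair ~sep:':' (fun s -> s) int_of_string "localhost:22"
          = ("localhost", 22))
```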

val t2 : ?sep:char -> 'a t -> 'b t -> ('a * 'b) t

t2 ~sep t1 t2 is a pair of t1 and t2, separated with sep.

@parameter sep defaults to ','.

val t3 : ?sep:char -> 'a t -> 'b t -> 'c t -> ('a * 'b * 'c) t

t3 ~sep t1 t2 t3 is (t1,t2,t3), separated with sep.

@parameter sep defaults to ','.

val t4 : ?sep:char -> 'a t -> 'b t -> 'c t -> 'd t -> ('a * 'b * 'c * 'd) t

t4 ~sep t1 t2 t3 t4 is (t1,t2,t3,t4), separated with sep.

@parameter sep defaults to ','.