Next: General Index, Previous: Historical, Up: Top [Contents][Index]
This chapter contains information about the file format of Daikon’s input files. It is of most information to those who wish to write a front end, also known as an instrumenter (see Front ends and instrumentation in Daikon User Manual). A new front end enables Daikon to detect invariants in another programming language.
Daikon’s input is conventionally one or more .dtrace data trace files. (Another, optional type of input file for Daikon is a splitter info file; see Splitter info file format in Daikon User Manual.) A trace file is a text file that consists of newline-separated records. There are two basic types of records that can appear in Daikon’s input: program point declarations, and trace records. The declarations describe the structure of the trace records. The trace records contain the data on which Daikon operates — the run-time values of variables in your program.
Each declaration names an instrumented program point and lists the variables at that program point. A program point is a location in the program, such as a specific line number, or a specific procedure’s entry or exit. An instrumented program point is a place where the instrumenter may emit a trace record. A program point declaration may be repeated, so long as the declarations match exactly (any declarations after the first one have no effect).
A data trace record (also known as a sample) represents one execution of a program point. The record specifies the program point and gives the run-time values of each variable. The list of variables in the data trace record must be identical to that in the corresponding declaration. For a given program point, the declaration must precede the first data trace record for the program point. It is not required that all the program point declarations appear before any of the data trace records.
There exist some other declaration-related records; see Declaration-related records.
• Declarations in a separate file | ||
• Conventions | ||
• Declarations | ||
• Data trace records | ||
• Example files | ||
• Version 1 Declarations |
Next: Conventions, Up: File formats [Contents][Index]
Instead of placing both declarations and data trace records in a single
file, it is permitted to place the declarations in one or more
.decls declaration files while leaving the data trace records
in the .dtrace file. This can be convenient for tools that
perform a separate instrumentation step, such as
dfepl
(see dfepl in Daikon User Manual).
Such a tool takes as
input a target program to be analyzed, and produces two outputs: a
.decls file and an instrumented program. Executing the
instrumented program produces a .dtrace file containing data
trace records for all the program points that appear in the
.decls file. This approach works fine and is easier to
implement in certain situations, but has a few disadvantages. It
requires the user to perform at least two steps — instrumentation and
execution — and the existence of two versions of the program
(instrumented and uninstrumented) can lead to confusion or extra work.
It is also more convenient to have a single file that contains all
information about a program, rather than multiple .decls files
that must be associated with the .dtrace file.
It is also permitted for a declaration to appear more than once in Daikon’s input. The declaration must be identical in all occurrences. This is useful when running Daikon with multiple .dtrace files, each of which contains its own declarations.
Next: Declarations, Previous: Declarations in a separate file, Up: File formats [Contents][Index]
Daikon files are textual, to permit easier viewing and editing by humans. Each record is separated by one or more blank lines. To permit easier parsing by programs, each piece of information in a record appears on a separate line.
Outside a record, any line starting with a pound sign (#) or double slashes (//) is ignored as a comment. Comments are not permitted inside a record.
Next: Data trace records, Previous: Conventions, Up: File formats [Contents][Index]
The trace file (or declaration file) first states the declaration file format version number (see Declaration version). It may also specify some other information about the file (see Declaration-related records). Then, it defines each program point and its variables.
Indentation is ignored, so it may be used to aid readability. Fields with defaults can be omitted.
As a rule, each line of the declaration file is of the form
<field-name> <field-value>
.
Some additional details about the declaration file format appear in file
daikon/doc/decl_format.txt
in the Daikon source code.
• Declaration-related records | ||
• Program point declarations | ||
• Variable declarations |
Next: Program point declarations, Up: Declarations [Contents][Index]
• Declaration version | ||
• Input-language declaration | ||
• Variable comparability | ||
• ListImplementors declaration |
Next: Input-language declaration, Up: Declaration-related records [Contents][Index]
The declaration version record must be the first record in the file.
The declaration version record is as follows:
decl-version <version>
The current version is 2.0.
Previous versions (see Version 1 Declarations) did not include a version field and are identified by the lack of this field.
Next: Variable comparability, Previous: Declaration version, Up: Declaration-related records [Contents][Index]
You can specify the language in which the program was written with a record of the form
input-language <language>
The language string is arbitrary and is not used.
Next: ListImplementors declaration, Previous: Input-language declaration, Up: Declaration-related records [Contents][Index]
The Variable comparability record indicates how the comparability field of a variable declaration should be interpreted.
Its format is:
var-comparability <comparability-type>
The possible values for comparability-type are implicit
and
none
.
implicit
means ordinary comparability as described in
Variable declarations. (The name implicit
is retained
for historical reasons.)
This record is optional. The implicit
type is the default.
Previous: Variable comparability, Up: Declaration-related records [Contents][Index]
This declaration indicates classes that implement the
java.util.List
interface, and should be treated as sequences
for the purposes of invariant detection. The syntax is as follows:
ListImplementors <classname1> <classname2> ...
Each classname
is in Java format (for example, java.util.LinkedList).
The --list_type command-line option to Daikon can also be used to specify classes that implement lists; see Options to control invariant detection in Daikon User Manual.
Next: Variable declarations, Previous: Declaration-related records, Up: Declarations [Contents][Index]
The format of a program point declaration is:
ppt <ppt-name> <ppt-info> <ppt-info> ... <variable-declaration> <variable-declaration> ...
The program point name can include any character. In the declaration
file,
blanks must be replaced by \_
, and backslashes must be escaped as \\
.
Program point names must be distinct.
While Daikon does not infer program point relationships from ppt-name
s,
it does require these names to conform to a set syntax.
The following patterns are for the enter
, subexit
,
class
and object
ppt-type
s, respectively.
<fully qualified class name>.<method/function name>(<argument types>):::ENTER <fully qualified class name>.<method/function name>(<argument types>):::EXIT<id> <fully qualified class name>:::CLASS <fully qualified class name>:::OBJECT
Since in most languages a method or function may have multiple exit points,
the ppt-name
for a subexit
ppt-type
must be appended
with a unique id. Typically, this is the corresponding source line number.
The following information about the program point (ppt-info
) can be specified:
ppt-type <type>
Specifies the type of the program point. Possible program point
types are point
, class
, object
, enter
,
exit
, subexit
. Except for point
all of these
types are related to the program point hierarchy (see Dataflow hierarchy).
A point
program point is one that is not involved in a
program point hierarchy. This is normally used when the input is not
from a programming language or when there is no dataflow hierarchy.
flags <flags>
Specifies one or more flags for this ppt
. The possible flags are:
static
, enter
, exit
, private
, return
.
It should be noted that neither Daikon nor Kvasir currently use or
output this field.
parent <relation-type> <parent-ppt-name> <relation-id>
Specifies the program point hierarchy (Dataflow hierarchy).
In particular, each parent
field names one parent of this program
point. A parent program point
is a point whose samples should include all of the samples at this
program point. For example, an object program point is a parent of
each of the method program points in that object.
The relation-type is the type of parent-child relationship in
the hierarchy. There are a few relationship types used internally
by Daikon, but the only one output to the data files is parent
.
A parent
relationship is one where the program points themselves
are explicitly related, such as an enter and an exit point. All of the
variables at one of the points exists at the other. A user
relation is one where a class is used at another point, such as at an
enter point. For example, if a reference to class A were passed to
routine ‘r1’, the values found at enter and exit of ‘r1’ could
be applied to
the class/object program point for A. By default user
relations
are not used because they can be recursive.
The relation-id is a unique integer that identifies this parent relation. They are used when defining the specific parent relations for variables.
Multiple parent fields can be specified.
Previous: Program point declarations, Up: Declarations [Contents][Index]
The format of a variable declaration is:
variable <name> <variable-info> <variable-info> ...
The variable name is arbitrary, but for clarity, it should match what is
used in the programming language. All characters are legal in a name,
but blanks must be represented as \_
and backslashes as
\\
.
If the variable is an array, ..
marks the location of
array indices within the variable name. Some examples of names are:
this.theArray this.theArray[..] this.stack.getClass()
The following information about the variable (variable-info
)
can be specified:
var-kind <kind> [<relative-name>]
Specifies the variable kind. Possible values are: field
,
function
,
array
, variable
, return
. If field
or
function
are specified, the relative name of the field or function must be
specified. For example, if the variable is this.theArray
, the
relative name is theArray
. Pointers to arrays are of type
field
. The arrays themselves (a sequence of values) are of
type array
. A var-kind entry is required in each variable block.
enclosing-var <enclosing-var-name>
The variable that contains this variable. Required for fields and arrays. Required for functions that represent an instance method. Forbidden for functions that specify a static method or a function in a non-object-oriented language. A variable is specified by its name. The named variable must be defined at the same program point. If a variable is omitted (e.g., by the omit-var switch), any variable for which it is the enclosing variable must be omitted as well.
For example, if the variable is this.theArray
, the
enclosing variable is this
.
reference-type pointer|offset
Specifies the kind of reference for variables which are structures or
classes. The possible values are pointer
or offset
. In
C, pointer
is used if the variable is a pointer, offset
is used when the structure is placed inline. Pointer would be used
for all references to Java objects. Defaults to pointer.
array <dim>
The number of array dimensions inherited or declared by this variable. The valid values are 0 or 1. This should be specified for any variable that has multiple values. If not specified it defaults to 0. Future versions of Daikon may support more levels of arrays.
dec-type <language-declaration>
This is what the programmer used in the declaration of the variable.
Names for standard types should use Java’s names (e.g., int
,
boolean
, java.lang.String
, etc.),
but names for user-defined or language-specific
types can be arbitrary strings. A dec-type
entry is required in each
variable block.
rep-type <daikon-type>
This describes what will appear in the data
trace file. For instance, the declared type might be char[..]
but
the representation type might be java.lang.String
. Or, the declared
type might be Object
but the representation type might be
hashcode
, if the address of the object is written to the data trace
file. A rep-type entry is required in each
variable block.
The representation type should be one of boolean
, int
,
hashcode
, double
, or java.lang.String
; or an
array of one of those (indicated by a [..]
suffix).
hashcode
is intended for unique object identifiers like memory
addresses (pointers) or the return value of Java’s
Object.hashCode
method. hashcode
is treated like
int
, except that the hashcode values are considered uninteresting
for the purposes of output. For example, Daikon will print
‘var has only one value’ instead of ‘var == 0x38E8A’.
flags <flags>
One or more flags may optionally be specified. Possible values are:
is_param
Indicates that a given variable is a parameter to a procedure. Some
procedures reassign parameters — essentially using them as local
variables. Such uses are not relevant to the procedure’s external
specification. The is_param
flag causes Daikon not to print
certain invariants, if the variable has been reassigned.
p
in its post-state form are not
printed.
p
(such as p.x
)
are printed only if p
has not changed.
p
is changed, but then, p
would no longer be interesting.)
no_dups
Indicates that a collection can not contain duplicates. If it cannot, Daikon does not check for some invariants that only have meaning for collections that can contain duplicate elements.
not_ordered
Indicates that the order of a collection does not have meaning. In this case, Daikon does not check for element-wise comparisons between it and other collections.
nomod
Indicates that the variable can never be modified. For example, a
variable declared static final
in Java.
synthetic
Indicates that the variable was added by the front end and is not manifest in the input program.
classname
Indicates that the variable indicates the classname
of its
enclosing variable.
to_string
Indicates that the variable is the string representation of its enclosing variable.
non_null
Indicates that the variable can’t take on a null value. In this case, Daikon will not check for the ‘NonZero’ invariant.
is_property
Indicates that the variable is a C# property (set by the Celeriac front end). It is used in output formatting to remove the parentheses from a method call.
is_enum
Indicates that the variable is an enumerated type (set by the Celeriac front end). Daikon uses this information to suppress obvious invariants.
is_readonly
Indicates that the variable is read only (set by the Celeriac front end). Daikon uses this information to suppress obvious invariants.
comparability <comparability-key>
The comparability-key indicates which other variables are comparable to this one. The information specified here might have been obtained dynamically, or via type-inference-based static analysis, or in some other manner.
A comparability for a non-array type is a signed integer. Two variables at the same program point are considered comparable if both integers are the same, or if either integer is negative (that is, a negative number means “comparable to every other variable”). A comparability for an array type contains an integer for each index and for the contents; for instance, ‘8[5]’ means that the array elements have comparability ‘8’ and the array indices have comparability ‘5’. Similarly, ‘5[22][17]’ is a comparability for a two-dimensional array. An array comparison succeeds if comparisons over each component succeed.
Variables at different program points are never compared to one another. Use of the same number at different program points does not indicate any relationship between the variables, and a given variable may have a different comparability integer at different program points.
As an example, in the following code:
int sum(int len, int[] a) { int sum=0; for (int i=0; i++; i<len) sum += a[i]; return sum; }
variables i
and len
are comparable to one another (and
to indices of array a
). Furthermore, the result is comparable
to the elements of array a
. The comparability keys for these
variables might look like
len - comparability 5 a - comparability 8[5] return - comparability 8
A comparability entry is required in each variable block.
parent <parent-ppt> <relation-id> [<parent-variable>]
Optionally specifies the parent variable of this variable in the program point/variable hierarchy. The parent-ppt is the name of the parent program point. The relation-id must be one of the relationship ids specified for this program point. The parent-variable is the name of this variable’s parent in the parent program point. If the names are the same, it can be omitted.
constant <value>
Optionally specifies a constant value for this variable. If the variable has a compile-time constant value that is specified in the declaration, then the variable must be omitted from the data trace records.
function-args <arg1> <arg2> ...
If this variable is computed as a function of some other variable,
specifies the arguments to the function; otherwise, the
function-args
line must be omitted.
The arguments are specified by their
external variable name. Multiple arguments are
blank separated. For example
function-args a.b this.f1
specifies that the function takes two arguments which are a.b
and
this.f1
. As with enclosing variables, each of the arguments must
be defined as variables.
min-value <v>
Optionally specifies the minimum possible value for this variable
(for instance, due to language-specific restrictions on the type). This
enables Daikon to suppress invariants that would be “obvious” in that
case, such as var >= v
.
max-value <v>
Optionally specifies the maximum possible value for this variable
(for instance, due to language-specific restrictions on the type). This
enables Daikon to suppress invariants that would be “obvious” in that
case, such as var <= v
.
min-length <v>
Optionally specifies the minimum possible length for this variable
(for instance, due to language-specific restrictions on the type). This
enables Daikon to suppress invariants that would be “obvious” in that
case, such as size(var) >= v
.
max-length <v>
Optionally specifies the maximum possible length for this variable
(for instance, due to language-specific restrictions on the type). This
enables Daikon to suppress invariants that would be “obvious” in that
case, such as size(var) <= v
.
valid-values [<v1> <v2> ... <vN>]
Optionally specifies all the possible values for this
variable (for instance, due to language-specific restrictions on the
type). This enables Daikon to suppress invariants that would be
“obvious” in that case, such as v is one of {v1, v2, ..., vN}
.
Next: Example files, Previous: Declarations, Up: File formats [Contents][Index]
A data trace record (also known as a sample) contains run-time value information. Its format is:
<program-point-name> this_invocation_nonce <nonce-string> <varname-1> <var-value-1> <var-modified-1> <varname2> <var-value-2> <var-modified-2> ...
In other words, the sample record contains:
:::ENTER
) with
procedure exits (whose names conventionally end with :::EXIT
).
This is necessary in concurrent systems because there may
be several invocations of a procedure active at once and they do not
necessarily follow a stack discipline, being exited in the reverse order of
entry. For non-concurrent systems, this nonce is not necessary, and
both the line this_invocation_nonce
and the nonce value may be
omitted.
null
.
[
), elements separated by
spaces, close bracket (]
). (Also, the array name
should end in ‘[..]’; use ‘a[..]’ for array contents,
but ‘a’ for the identity of the array itself.)
The value representation may also be the string nonsensical
;
see Nonsensical values.
A string or array value is never null
. A reference to a
string or array may be null
, in which case the string or array
value is printed as nonsensical
.
The special value 2 should be used only (and always) when the value
field is nonsensical
.
The variables should appear in the same order as they did in the declaration of the program point, without omissions or additions.
• Nonsensical values | ||
• Variables that do not appear in trace records |
Some trace variables and derived variables may not have a value because
the expression that computes it cannot be evaluated. In such a
circumstance, the value is said to be nonsensical, it is written in the
trace file as nonsensical
, and its modified field must be 2.
Examples include
x
when x
is uninitialized or deallocated,
x.y
when x
is null (or uninitialized or deallocated)
a[i]
when i
is outside the bounds of a
(or
uninitialized or deallocated, or a
is null, uninitialized, or
deallocated)
For trace variables, it is the responsibility of the front end to perform a check at run time whenever a variable’s value is about to be output to the trace, and to output the value ‘nonsensical’ (see Nonsensical values) rather than crashing the program or outputting an uninitialized or meaningless value. (Determining when an expression’s value is meaningless is the most challenging part of writing an instrumenter for a language like C, since it requires tracking memory allocation and initialization.) For derived variables created by Daikon, Daikon does the same thing, setting values to ‘nonsensical’ when appropriate. For controlling Daikon’s output in the presence of nonsensical values, see the daikon.Daikon.guardNulls configuration option (see General configuration options in Daikon User Manual).
Previous: Nonsensical values, Up: Data trace records [Contents][Index]
A trace record should contain exactly the same variables as in the corresponding declaration. There is one exception: for efficiency, compile-time constants (e.g., static final variables in Java) are omitted from the trace record, since they would have the same value every time.
Furthermore, neither the declarations nor the trace records contain derived variables (see Variable names in Daikon User Manual).
Next: Version 1 Declarations, Previous: Data trace records, Up: File formats [Contents][Index]
Here are portions of two files StackArTester.decls and StackArTester.dtrace, for a Java class that implements a stack of integers using an array as the underlying data structure. You can see many more examples by simply running an existing front end on some Java, C, or Perl programs and viewing the resulting files.
• Example declaration file | ||
• Example data trace file |
Next: Example data trace file, Up: Example files [Contents][Index]
This is part of the file StackArTester.decls, a declaration file for the StackAr.java program (see StackAr example in Daikon User Manual).
ppt DataStructures.StackAr.push(java.lang.Object):::ENTER ppt-type enter parent parent DataStructures.StackAr:::OBJECT 1 variable this var-kind variable dec-type DataStructures.StackAr rep-type hashcode flags is_param nomod comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable this.theArray var-kind field theArray enclosing-var this dec-type java.lang.Object[] rep-type hashcode flags nomod comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable this.theArray.getClass().getName() var-kind function getClass().getName() enclosing-var this.theArray dec-type java.lang.Class rep-type java.lang.String function-args this.theArray flags nomod synthetic classname comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable this.theArray[..] var-kind array enclosing-var this.theArray array 1 dec-type java.lang.Object[] rep-type hashcode[] flags nomod comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable this.theArray[..].getClass().getName() var-kind function getClass().getName() enclosing-var this.theArray[..] array 1 dec-type java.lang.Class[] rep-type java.lang.String[] function-args this.theArray[] flags nomod synthetic classname comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable this.topOfStack var-kind field topOfStack enclosing-var this dec-type int rep-type int flags nomod comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable DataStructures.StackAr.DEFAULT_CAPACITY var-kind variable dec-type int rep-type int constant 10 flags nomod comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable x var-kind variable dec-type java.lang.Object rep-type hashcode flags is_param nomod comparability 22 variable x.getClass().getName() var-kind function getClass().getName() enclosing-var x dec-type java.lang.Class rep-type java.lang.String function-args x flags nomod synthetic classname comparability 22 ppt DataStructures.StackAr.push(java.lang.Object):::EXIT103 ppt-type subexit parent parent DataStructures.StackAr:::OBJECT 1 variable this var-kind variable dec-type DataStructures.StackAr rep-type hashcode flags is_param nomod comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable this.theArray var-kind field theArray enclosing-var this dec-type java.lang.Object[] rep-type hashcode flags nomod comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable this.theArray.getClass().getName() var-kind function getClass().getName() enclosing-var this.theArray dec-type java.lang.Class rep-type java.lang.String function-args this.theArray flags nomod synthetic classname comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable this.theArray[..] var-kind array enclosing-var this.theArray array 1 dec-type java.lang.Object[] rep-type hashcode[] flags nomod comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable this.theArray[..].getClass().getName() var-kind function getClass().getName() enclosing-var this.theArray[..] array 1 dec-type java.lang.Class[] rep-type java.lang.String[] function-args this.theArray[] flags nomod synthetic classname comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable this.topOfStack var-kind field topOfStack enclosing-var this dec-type int rep-type int flags nomod comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable DataStructures.StackAr.DEFAULT_CAPACITY var-kind variable dec-type int rep-type int constant 10 flags nomod comparability 22 parent DataStructures.StackAr:::OBJECT 1 variable x var-kind variable dec-type java.lang.Object rep-type hashcode flags is_param nomod comparability 22 variable x.getClass().getName() var-kind function getClass().getName() enclosing-var x dec-type java.lang.Class rep-type java.lang.String function-args x flags nomod synthetic classname comparability 22 ppt DataStructures.StackAr:::CLASS ppt-type class variable DataStructures.StackAr.DEFAULT_CAPACITY var-kind variable dec-type int rep-type int constant 10 flags nomod comparability 22 ppt DataStructures.StackAr:::OBJECT ppt-type object parent parent DataStructures.StackAr:::CLASS 1 variable this var-kind variable dec-type DataStructures.StackAr rep-type hashcode flags is_param nomod comparability 22 variable this.theArray var-kind field theArray enclosing-var this dec-type java.lang.Object[] rep-type hashcode flags nomod comparability 22 variable this.theArray.getClass().getName() var-kind function getClass().getName() enclosing-var this.theArray dec-type java.lang.Class rep-type java.lang.String function-args this.theArray flags nomod synthetic classname comparability 22 variable this.theArray[..] var-kind array enclosing-var this.theArray array 1 dec-type java.lang.Object[] rep-type hashcode[] flags nomod comparability 22 variable this.theArray[..].getClass().getName() var-kind function getClass().getName() enclosing-var this.theArray[..] array 1 dec-type java.lang.Class[] rep-type java.lang.String[] function-args this.theArray[] flags nomod synthetic classname comparability 22 variable this.topOfStack var-kind field topOfStack enclosing-var this dec-type int rep-type int flags nomod comparability 22 variable DataStructures.StackAr.DEFAULT_CAPACITY var-kind variable dec-type int rep-type int constant 10 flags nomod comparability 22 parent DataStructures.StackAr:::CLASS 1
Previous: Example declaration file, Up: Example files [Contents][Index]
This is part of file StackArTester.dtrace, which you can create by
running the instrumented StackAr.java program
(see StackAr example in Daikon User Manual).
This excerpt contains only the
first two calls to
push
and the first return from push
, along with the
associated object program point records; omitted records are indicated
by ellipses.
... DataStructures.StackAr.push(java.lang.Object):::ENTER this_invocation_nonce 104 this 812272602 1 this.theArray 312077835 1 this.theArray.getClass().getName() "java.lang.Object[]" 1 this.theArray[..] [null] 1 this.theArray[..].getClass().getName() [null] 1 this.topOfStack -1 1 x 1367164551 1 x.getClass().getName() "DataStructures.MyInteger" 1 ... DataStructures.StackAr.push(java.lang.Object):::EXIT103 this_invocation_nonce 104 this 812272602 1 this.theArray 312077835 1 this.theArray.getClass().getName() "java.lang.Object[]" 1 this.theArray[..] [1367164551] 1 this.theArray[..].getClass().getName() ["DataStructures.MyInteger"] 1 this.topOfStack 0 1 x 1367164551 1 x.getClass().getName() "DataStructures.MyInteger" 1 ... DataStructures.StackAr.push(java.lang.Object):::ENTER this_invocation_nonce 159 this 2007069404 1 this.theArray 142345952 1 this.theArray.getClass().getName() "java.lang.Object[]" 1 this.theArray[..] [null] 1 this.theArray[..].getClass().getName() [null] 1 this.topOfStack -1 1 x 111632506 1 x.getClass().getName() "DataStructures.MyInteger" 1 ... DataStructures.StackAr.push(java.lang.Object):::EXIT103 this_invocation_nonce 159 this 2007069404 1 this.theArray 142345952 1 this.theArray.getClass().getName() "java.lang.Object[]" 1 this.theArray[..] [111632506] 1 this.theArray[..].getClass().getName() ["DataStructures.MyInteger"] 1 this.topOfStack 0 1 x 111632506 1 x.getClass().getName() "DataStructures.MyInteger" 1 ...
Previous: Example files, Up: File formats [Contents][Index]
This section describes the original version (1.0) of declaration records. These are now obsolete and should not be used.
A declarations file can contain program point declarations,
VarComparability
declarations, and ListImplementors declarations
.
• V1 Program point declarations | ||
• V1 pptname format | ||
• V1 VarComparability declaration | ||
• V1 ListImplementors declaration |
Next: V1 pptname format, Up: Version 1 Declarations [Contents][Index]
The format of a program point declaration is:
DECLARE program-point-name varname1 declared-type1 [# auxiliary-information1] representation-type1 [= constant-value1] comparable1 varname2 declared-type2 [# auxiliary-information2] representation-type2 [= constant-value2] comparable2 ...
Program point information includes:
int
, boolean
, etc.), but names for
user-defined or language-specific types can be arbitrary strings.
hasDuplicates
Whether a collection can contain duplicates. If it cannot, Daikon does not check for some invariants that only have meaning for collections that can contain duplicate elements.
hasOrder
Whether order has meaning for a collection. If order does not have meaning in a collection, then Daikon does not check for element-wise comparisons between it and other collections.
hasNull
Whether zero has the special meaning null for the variable or collection. If it does, then Daikon checks for whether a value or the elements in a collection are null.
nullTerminated
Whether a collection has a value (usually null) that ends its representation. If it does, then Daikon looks at the collection’s size and at the collection’s size-1 as “interesting” values. If it does not, then Daikon only looks at the collection’s size.
isParam
Whether a given variable is a parameter to a method. If a
variable is a parameter, Daikon avoids printing some information that
would be considered uninteresting for parameters. First, invariants
that use the parameter variable p
in its post-state form are not
printed. Second, invariants that use fields of p
(such as p.x
)
are printed only if p
has not changed. Lastly, some immutable
characteristics, such as the size of arrays and data types are not
printed (both can be changed if p
is changed, but then, p
would no longer be interesting).
char[]
but
the representation type might be java.lang.String
. Or, the declared
type might be Object
but the representation type might be
hashcode
, if the address of the object is written to the data trace
file.
The representation type should be one of boolean
, int
,
hashcode
, double
, or java.lang.String
; or an
array of one of those (indicated by a []
suffix, as in Java).
Hashcodes are treated like integers, except that their actual values
are considered uninteresting for the purposes of output; they are
intended for unique object identifiers like memory addresses or the
return value of Java’s Object.hashCode
method.
The representation type may optionally be followed by an equals sign and a value; in that case, the variable is known to have a compile-time constant value and should be omitted from the data trace file.
The point of comparability is that Daikon should not compare unrelated quantities. For example, each person’s height in centimeters may always be less than their birth year, but it is not helpful for Daikon to output ‘height < birthyear’, because the two variables are measuring incomparable quantities. (In this case, the variables use different units of measurement.)
Variable comparability information helps Daikon to avoid computing information over unrelated variables. This saves time and (more importantly) improves the quality of Daikon’s output. For more details, see the paper “Quickly detecting relevant program invariants”.
Variable comparability information may be obtained dynamically (see Dynamic abstract type inference (DynComp) in Daikon User Manual), via type-inference based analysis, or in some other manner. In any event, Daikon reads it from the variable declarations.
A comparability for a non-array type is a signed integer. Two variables at the same program point are considered comparable if both integers are the same, or if either integer is negative. A comparability for an array type must contain an integer for each index and for the contents; for instance, ‘5[22][17]’ for a two-dimensional array. Comparisons succeed if comparisons over each component succeed.
Regardless of comparability, variables at different program points are never compared to one another. Use of the same comparability integer at different program points does not indicate any relationship between the variables, and a given variable may have a different comparability integer at different program points.
As an example, in the following code:
int sum(int len, int[] a) { int sum=0; for (int i=0; i++; i<len) sum += a[i]; return sum; }
variables i
and len
are comparable to one another (and
to indices of array a
). Furthermore, the result is comparable
to the elements of array a
. A declaration file for these
variables might look like
len int int 5 a int[] int[] 8[5] return int int 8
Next: V1 VarComparability declaration, Previous: V1 Program point declarations, Up: Version 1 Declarations [Contents][Index]
Instrumenting code creates a .decls file that contains program point names such as:
DataStructures.StackAr.push(java.lang.Object):::ENTER DataStructures.StackAr.push(java.lang.Object):::EXIT99 PolyCalc.RatNum.RatNum(int, int):::ENTER PolyCalc.RatNum.RatNum(int, int):::EXIT55 PolyCalc.RatNum.RatNum(int, int):::EXIT67
This section describes the format of these program point names. Someone writing an instrumenter for a new language must be sure to follow this format specification.
A program point name is a string with no tabs or newlines in it. The basic
format is ‘topLevel.bottomLevel:::pptInfo’.
For the first example given above, the top level of the hierarchy would
be DataStructures.StackAr
, the bottom level would be
push(java.lang.Object)
, and the
program point information would be ENTER
.
topLevel may contain any number of periods (‘.’). bottomLevel and pptInfo may not contain any periods. The string ‘:::’ may only appear once.
topLevel and pptInfo are required (i.e., they must be non-empty), as are the period to the right of topLevel and the colons to the left of pptInfo. However, bottomLevel is optional.
By convention, for Java topLevel consists of the class name, and bottomLevel consists of the method name and method signature.
For C, topLevel consists of a filename (or a single period for global functions), and bottomLevel could consist of a function name and signature. More precisely, names of C program points follow these conventions:
For an Input Output Automaton, topLevel consists of an Automaton name and bottomLevel consists of information for a transition state.
By convention, the entry and exit points for a function have names of
a special form so that they can be associated with one another.
(Currently, those names end with :::ENTER
and :::EXIT
.) This
convention permits Daikon to generate pre-state variables
(see Variable names in Daikon User Manual) automatically at
procedure exit points, so
front ends need not output them explicitly. When there
are multiple exit points, then each one should be suffixed by a number
(such as a line number, for example, foo::EXIT22
). Daikon produces
the main (non-numbered) :::EXIT
point automatically. All the
numbered exits should contain the same set of variables; in general,
this means that local variables are not included at exit points.
Daikon requires that declarations for :::ENTER
program
points appear before any declarations for matching :::EXIT
program
points.
Another convention is to have another program point whose
bottomLevel is empty and whose pptInfo is OBJECT
:
for example, StackAr:::OBJECT
. This contains the
representation invariant (sometimes called the object invariant) of a
class. This program point is created automatically by Daikon; it need
not appear in a trace file.
Next: V1 ListImplementors declaration, Previous: V1 pptname format, Up: Version 1 Declarations [Contents][Index]
There is a special VarComparability
declaration that controls how
the comparability field in program point declarations is interpreted.
The default VarComparability
is implicit
, which means
ordinary comparability as described in Program point declarations.
(The name implicit
is retained for historical reasons.)
You can override it as
follows:
VarComparability none
As with all records in Daikon input files, a blank line is required between this record and the next one.
Previous: V1 VarComparability declaration, Up: Version 1 Declarations [Contents][Index]
This declaration indicates classes that implement the
java.util.List
interface, and should be treated as sequences
for the purposes of invariant detection. The syntax is as follows:
ListImplementors <classname1> <classname2> ...
Each classname
is in Java format (for example, java.util.LinkedList).
The --list_type command-line option to Daikon can also be used to specify classes that implement lists; see Options to control invariant detection in Daikon User Manual.
Previous: V1 VarComparability declaration, Up: Version 1 Declarations [Contents][Index]