Fjalar is built as a tool on top of Valgrind, and it also contains a sizable amount of code from Memcheck, another Valgrind tool. When new versions of Valgrind and Memcheck are released, Fjalar should be updated to incorporate the Valgrind/Memcheck changes. This document tells you how to do so.
Conceptually, this is two separate updates: (1) updating the underlying copy of Valgrind to the newer version of Valgrind, and (2) updating the memcheck code contained in Fjalar to the newer version of Memcheck. Because the memcheck code depends very tightly on Valgrind, the two updates should be done at the same time.
The update of the underlying copy of Valgrind can usually be done almost automatically. Updating the memcheck code often involves manual work, because the memcheck code incorporated in Fjalar has been substantially modified. The more frequent the merges are, the easier they will be to do. Merging at least monthly is recommended.
The instructions assume you use bash or sh as your shell. If you use a different shell, adjust the instructions as necessary.
You need to obtain 2 copies of Valgrind: the most recent, and the one the current version of Fjalar is based on.
Note: As of 9/13/2009, the tests pass on a CSAIL PAG machine (Debian), but do not run on a UW CSE machine (Fedora) nor on Ubuntu 9.04.
First, check out Daikon and Fjalar from CVS. You may skip this if you already have a copy of invariants/valgrind-3 checked out.
(From a PAG machine)
cvs -d $pag/projects/invariants/.CVS co invariants
cd invariants
export INV=`pwd`
cvs -d $pag/projects/invariants/.CVS co valgrind-3
First, update your checkout.
cd $INV
cvs -q update
cd valgrind-3
cvs -q update
Ensure there are no local changes.
cd $INV
cvs -q diff --brief
cd valgrind-3
cvs -q diff --brief
The diff commands should produce no output. If there are any differences, commit them, then start over from the beginning.
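The "no output" check can be automated. The following is a minimal sketch, assuming a hypothetical helper named require_no_output (it is not part of Fjalar). It runs any command and fails if that command prints anything on standard output. The demonstration uses stand-in commands so that it runs without a CVS checkout.

```shell
#!/bin/sh
# Hypothetical helper (not part of Fjalar): run a command and succeed only
# if it prints nothing on standard output.
require_no_output() {
    output=$("$@" 2>/dev/null)
    if [ -n "$output" ]; then
        echo "Unexpected output from: $*" >&2
        echo "$output" >&2
        return 1
    fi
    return 0
}

# Intended use on a real checkout (commented out; needs cvs and $INV):
#   (cd "$INV" && require_no_output cvs -q diff --brief) &&
#   (cd "$INV/valgrind-3" && require_no_output cvs -q diff --brief)

# Self-contained demonstration with stand-in commands.
require_no_output true && echo "clean"
require_no_output echo "M some/file.c" 2>/dev/null || echo "local changes present"
```

The helper judges by output rather than exit status because the instructions above define a clean checkout as one for which the diff commands print nothing.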
Re-compile Fjalar, then run the tests. (The tests take about 20 minutes.)
cd $INV/valgrind-3
./auto-everything.sh
cd $INV/tests/kvasir-tests
make nightly-summary-w-daikon
The tests should pass. If any tests fail, fix them, then start over from the beginning.
cd $INV
svn co svn://svn.valgrind.org/valgrind/trunk valgrind-new
Additional information about working with the Valgrind repository can be found at: Valgrind: Code Repository
cd $INV
source valgrind-3/valgrind/REVISION
svn co -r $VALGRIND_REVISION svn://svn.valgrind.org/valgrind/trunk valgrind-old
cd valgrind-old
svn update -r $VEX_REVISION VEX
The VEX update is necessary because the VEX instrumentation library that Valgrind is built on is stored in a separate SVN repository. A Valgrind checkout always checks out the most recent version of the VEX source. That is fine for the checkout of the current Valgrind source (valgrind-new), but the copy that will be diffed (valgrind-old) must contain the same VEX version that the Fjalar repository is based on.
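Because the REVISION file is sourced by the shell, it presumably consists of plain variable assignments. The following is a minimal sketch under that assumption. The file layout, the helper name load_revisions, and the revision numbers are all made up for illustration. It refuses to continue if either revision is missing, which would otherwise make svn silently check out the wrong version.

```shell
#!/bin/sh
# Sketch only: the file layout and the revision numbers are assumptions
# for illustration, not values taken from the Fjalar repository.
workdir=$(mktemp -d)
cat > "$workdir/REVISION" <<'EOF'
VALGRIND_REVISION=10000
VEX_REVISION=1900
EOF

# Hypothetical helper: source a REVISION file and check that it defines
# both VALGRIND_REVISION and VEX_REVISION.
load_revisions() {
    unset VALGRIND_REVISION VEX_REVISION
    . "$1" || return 1
    if [ -z "$VALGRIND_REVISION" ] || [ -z "$VEX_REVISION" ]; then
        echo "REVISION file $1 is missing a revision" >&2
        return 1
    fi
}

load_revisions "$workdir/REVISION" &&
    echo "Valgrind r$VALGRIND_REVISION, VEX r$VEX_REVISION"

rm -rf "$workdir"
```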
Create diffs showing what the Valgrind maintainers have changed since the last time Valgrind code was merged into Fjalar. Two separate diffs will be created: one for Memcheck and one for everything else (which we'll call "Coregrind", and which includes VEX).
cd $INV
\rm -f coregrind.patch memcheck.patch
diff -ur --unidirectional-new-file -x 'Makefile.in' -x '.cvsignore' -x inst -x '.svn' -x CVS \
  valgrind-old valgrind-new > coregrind.patch
diff -ur --unidirectional-new-file -x 'Makefile.in' -x '.cvsignore' -x '.svn' -x CVS -x docs -x tests -x perf \
  valgrind-old/memcheck valgrind-new/memcheck > memcheck.patch
Generate diffs containing the PAG changes to Coregrind and Memcheck. These diffs will not be used for any automated patching, so more irrelevant files can be excluded. The PAG repository also contains many of the automake/autoconf generated files, to simplify things for end users; these need to be omitted as well for a cleaner diff. The commands below should create diffs containing the relevant PAG changes to the Valgrind source code.
cd $INV
\rm -f coregrind-PAG.diff memcheck-PAG.diff
diff -ur --unidirectional-new-file -x 'Makefile.in' -x '.cvsignore' -x inst -x '.svn' -x CVS -x fjalar \
  valgrind-old valgrind-3/valgrind/ > coregrind-PAG.diff
diff -ur --unidirectional-new-file -x 'Makefile.in' -x '.cvsignore' -x '.svn' -x CVS -x docs -x tests -x perf \
  valgrind-old/memcheck valgrind-3/valgrind/fjalar > memcheck-PAG.diff
Now we can merge the changes from the created diffs. The coregrind patch should apply with very few conflicts.
It can be difficult to undo a patch operation, so you should first attempt a dry run.
cd $INV/valgrind-3/valgrind
patch -p1 < $INV/coregrind.patch --dry-run
If the patch fails, it might be indicative of problems in the above diffing.
When you are ready to apply the patch, run:
cd $INV/valgrind-3/valgrind
patch -p1 < $INV/coregrind.patch
Handle any conflicts. They are listed in the patch output, or run this command:
find -name '*.rej'
For every change in the file you will need to examine both the changed code as well as our original code and determine if it needs to be hand merged or if the change is not relevant. It is useful to refer to coregrind-PAG.diff during this process. Remove the .rej and .orig files as you go, so that finally the find command produces no output.
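Progress through the conflicts can be tracked with a small wrapper around the find command. The following is a minimal sketch. The helper name list_leftovers is hypothetical, and the scratch tree exists only so that the example runs without a Valgrind checkout.

```shell
#!/bin/sh
# Hypothetical helper: list leftover reject and backup files under a directory.
list_leftovers() {
    find "$1" \( -name '*.rej' -o -name '*.orig' \) -type f | sort
}

# Self-contained demonstration on a scratch tree.
tree=$(mktemp -d)
mkdir -p "$tree/coregrind"
touch "$tree/coregrind/m_main.c" "$tree/coregrind/m_main.c.rej" \
      "$tree/coregrind/m_main.c.orig"

# Before merging by hand: both leftover files are listed.
list_leftovers "$tree"

# Once a conflict has been merged by hand, its .rej and .orig are removed.
rm -f "$tree/coregrind/m_main.c.rej" "$tree/coregrind/m_main.c.orig"
if [ -z "$(list_leftovers "$tree")" ]; then
    echo "no conflicts left"
fi

rm -rf "$tree"
```

On a real checkout the helper would be called as list_leftovers $INV/valgrind-3/valgrind.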
cd $INV/valgrind-3/valgrind/fjalar
patch -p2 < $INV/memcheck.patch --dry-run
If the patch output looks sane, continue with the actual merge:
cd $INV/valgrind-3/valgrind/fjalar
patch -p2 < $INV/memcheck.patch
Handle any conflicts. They are listed in the patch output, or run this command:
find -name '*.rej'
For every change in the file you will need to examine both the changed code as well as our original code and determine if it needs to be hand merged or if the change is not relevant.
Our changes to Memcheck are much more substantial than our changes to Coregrind.
The largest modification to memcheck is in mc_translate.c, and special care should be taken to ensure it is present and up to date. MC_(instrument) handles the instrumentation of calls for Dyncomp. It is primarily contained in a switch statement; each case handles one VEX instruction type, and the body contains both the original code for memcheck and the code for dyncomp. After the update, double-check that each clause contains corresponding code: that any changes to the memcheck versions are reflected in the dyncomp versions, and that any new clause has a dyncomp version.
Additionally Fjalar/Kvasir need to be updated to properly handle any changes in Valgrind API/functionality. For the most part Valgrind maintains a relatively stable interface to its tools. Any tool-visible changes should be noted in the change logs.
A somewhat more problematic area is changes to the VEX IR. Dyncomp makes heavy use of the VEX IR, so any changes to it need to be reflected in Dyncomp. Most of the changes to the public VEX IR can be discovered by running the following command:
cd $INV/valgrind-new
svn log -r ${VEX_REVISION}:HEAD VEX/pub/libvex_ir.h
The files with all the VEX IR code for Dyncomp are dyncomp_translate.[ch]. dyncomp_translate.c is structured primarily into functions of the form expr2tags_[EXPRESSION_TYPE](). Each of these functions contains a switch over the VEX IR instructions corresponding to the expression type, with some call to a dyncomp tag function. Most often, VEX IR changes are syntactic in nature and only involve changing the names of instructions. If the log indicates that new VEX IR instructions have been added, they will need to be explicitly supported by Dyncomp. Please see Appendix C for guidelines on supporting new VEX instructions.
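A rough textual comparison can point out operations that Memcheck handles but Dyncomp does not yet mention. The following is a heuristic sketch, not an authoritative check. It assumes that both files refer to operations by their Iop_ names, the helper name missing_ops is hypothetical, and the two toy input files stand in for the real mc_translate.c and dyncomp_translate.c.

```shell
#!/bin/sh
# Hypothetical helper: print Iop_ names that occur in file $1 but not in $2.
missing_ops() {
    a=$(mktemp) && b=$(mktemp) || return 1
    grep -o 'Iop_[A-Za-z0-9_]*' "$1" | sort -u > "$a"
    grep -o 'Iop_[A-Za-z0-9_]*' "$2" | sort -u > "$b"
    # comm -23 keeps lines that appear only in the first sorted list.
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Toy inputs standing in for the real source files.
scratch=$(mktemp -d)
cat > "$scratch/mc_toy.c" <<'EOF'
case Iop_Add32: case Iop_Sub32: /* ... */ break;
case Iop_CmpLE32S: /* ... */ break;
EOF
cat > "$scratch/dyncomp_toy.c" <<'EOF'
case Iop_Add32: case Iop_Sub32: /* ... */ break;
EOF

# Prints Iop_CmpLE32S, the only name missing from the second file.
missing_ops "$scratch/mc_toy.c" "$scratch/dyncomp_toy.c"

rm -rf "$scratch"
```

A name in the output is only a hint to look at the code by hand; an operation may be handled without being named explicitly, for example by a default clause.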
We must now ensure that the merged code compiles correctly and passes the regression test suite.
cd $INV/valgrind-3
./auto-everything.sh
The auto-everything shell script should autogen the config and Makefiles and compile Fjalar. Fix any compilation errors.
The regression suite is located in the tests/kvasir-tests directory. It can be run using the following commands, which take about 20 minutes:
cd $INV/tests/kvasir-tests
make nightly-summary-w-daikon
The test suite should pass without modification.
Any user-visible changes should be documented in $INV/docs/CHANGES. Additionally, valgrind-3/valgrind/REVISION should be updated with the Valgrind and VEX revisions that were used for this merge.
The revision for Valgrind can be obtained by:
export VALGRIND_REVISION_NEW=`(cd $INV/valgrind-new; svn info | grep "Last Changed Rev: " | cut -d " " --fields=4)`
echo $VALGRIND_REVISION_NEW
The revision for VEX can be obtained by:
export VEX_REVISION_NEW=`(cd $INV/valgrind-new/VEX; svn info | grep "Last Changed Rev: " | cut -d " " --fields=4)`
echo $VEX_REVISION_NEW
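The extraction step can be tried without an SVN checkout by feeding canned text to the parser. The following is a minimal sketch. The helper name last_changed_rev is hypothetical, and the sample text and its revision numbers are made up to resemble the relevant part of svn info output.

```shell
#!/bin/sh
# Hypothetical helper: read "svn info" text on standard input and print the
# number from its "Last Changed Rev:" line.
last_changed_rev() {
    sed -n 's/^Last Changed Rev: \([0-9][0-9]*\)$/\1/p'
}

# Canned sample so the sketch runs without svn; the numbers are made up.
sample='Path: .
Revision: 10105
Last Changed Rev: 10098
Last Changed Date: 2009-09-01'

# Prints 10098 (the last changed revision, not the checkout revision).
printf '%s\n' "$sample" | last_changed_rev

# On a real checkout this would be used as, for example:
#   VALGRIND_REVISION_NEW=$(cd "$INV/valgrind-new" && svn info | last_changed_rev)
```

Matching the whole line with sed avoids depending on the number of space-separated fields, and it prints nothing at all when the line is absent, which is easy to test for.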
If the test suite passes with no errors and all changes are documented, changes should be committed to the CVS repository.
Tell CVS about files that were created/deleted by the patch:
cd $INV/valgrind-3
grep -q '^\-\-\-.*1969\-12\-31' $INV/coregrind.patch
if [ "$?" = "0" ]; then
  cvs add `grep '^\-\-\-.*1969\-12\-31' $INV/coregrind.patch | cut --fields=1 | cut -d ' ' --fields=2 | perl -p -e 's/^valgrind-old/valgrind/g'`
fi
grep -q '^\+\+\+.*1969\-12\-31' $INV/coregrind.patch
if [ "$?" = "0" ]; then
  cvs remove `grep '^\+\+\+.*1969\-12\-31' $INV/coregrind.patch | cut --fields=1 | cut -d ' ' --fields=2 | perl -p -e 's/^valgrind-new/valgrind/g'`
fi
grep -q '^\-\-\-.*1969\-12\-31' $INV/memcheck.patch
if [ "$?" = "0" ]; then
  cvs add `grep '^\-\-\-.*1969\-12\-31' $INV/memcheck.patch | cut --fields=1 | cut -d ' ' --fields=2 | perl -p -e 's/^valgrind-old/valgrind/g'`
fi
grep -q '^\+\+\+.*1969\-12\-31' $INV/memcheck.patch
if [ "$?" = "0" ]; then
  cvs remove `grep '^\+\+\+.*1969\-12\-31' $INV/memcheck.patch | cut --fields=1 | cut -d ' ' --fields=2 | perl -p -e 's/^valgrind-new/valgrind/g'`
fi
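The commands above recognize a created file by the Unix epoch timestamp that diff prints for a file that does not exist. The epoch appears as 1969-12-31 only in time zones west of UTC; at or east of UTC it appears as 1970-01-01. The following is a minimal sketch that accepts both dates. The helper name new_files_in_patch is hypothetical, and the patch headers are hand-written so that the example runs without a real patch file.

```shell
#!/bin/sh
# Hypothetical helper: print the paths of files that a unified diff creates,
# i.e. "---" header lines whose timestamp is the Unix epoch. The leading
# valgrind-old directory is rewritten to valgrind, as in the commands above.
new_files_in_patch() {
    grep -E '^--- .*(1969-12-31|1970-01-01)' "$1" |
        cut -f1 |
        sed -e 's/^--- //' -e 's/^valgrind-old/valgrind/'
}

# Hand-written headers: file name and timestamp are separated by a tab.
patchfile=$(mktemp)
{
    printf '%s\t%s\n' '--- valgrind-old/coregrind/new_file.c' '1969-12-31 16:00:00.000000000 -0800'
    printf '%s\t%s\n' '+++ valgrind-new/coregrind/new_file.c' '2009-09-01 10:00:00.000000000 -0700'
    printf '%s\t%s\n' '--- valgrind-old/coregrind/old_file.c' '2009-06-01 10:00:00.000000000 -0700'
    printf '%s\t%s\n' '+++ valgrind-new/coregrind/old_file.c' '2009-09-01 10:00:00.000000000 -0700'
} > "$patchfile"

# Prints valgrind/coregrind/new_file.c, the only file with an epoch timestamp.
new_files_in_patch "$patchfile"

rm -f "$patchfile"
```

Like the original commands, this is a textual heuristic: a removed source line that itself begins with "-- " and contains one of the two dates would be misread as a header.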
cd $INV
# Double-check the diffs before committing.
cvs -q diff -b --brief -N
# If necessary: cvs diff
# Add any other relevant notes to the commit message below.
cvs commit -m "Valgrind merge from revision ${VALGRIND_REVISION} to ${VALGRIND_REVISION_NEW}. VEX IR merge from ${VEX_REVISION} to ${VEX_REVISION_NEW}."
\rm -f coregrind.patch memcheck.patch coregrind-PAG.diff memcheck-PAG.diff
To help determine the appropriate measures to take when merging conflicted files, this section lists the files we have modified and briefly explains the changes.
The most important set of changes is the addition of extra shadow state in Coregrind and VEX. The shadow area is an area of memory that Valgrind provides for tools to use. Unfortunately, it is of a fixed size, and memcheck uses all of it. We have had to increase the size of the shadow area.
Additionally, we had to modify some of the VEX architecture files to return information specific to the x86 platform.
Finally, we had to extend Valgrind's implementation of the C library with a few extra functions.
The modifications made to memcheck are more organizational in nature. A few functions from mc_main.c and mc_translate.c have been made non-static. An extra header has also been created and filled with their signatures for use by Fjalar.
Other modifications include:
It is recommended that you acquaint yourself with the VEX IR by reading through:
In addition to being the primary headers for the VEX library, the above 2 files represent the majority of the public documentation on VEX. Valgrind's translation pipeline consists of the following:
Native assembly -> Pre-instrumented VEX IR -> Post-Instrumented VEX IR -> Final assembly
Valgrind begins by translating the entirety of an assembly basic block into VEX IR. Valgrind then allows tools to examine the translated basic block and insert their own instrumentation. Valgrind finishes by translating the instrumented IR back into the machine's native assembly.
In order to keep track of comparabilities, Dyncomp instruments almost every VEX IR instruction type. Any added instructions will likely need to be supported by Dyncomp. In general Dyncomp's functionality parallels Memcheck's, so the best starting point for implementing support for a new instruction would be to mirror Memcheck's implementation. Dyncomp's layout is very similar to Memcheck's, so mirroring functionality should be fairly straightforward. It is, however, very unlikely that a new instruction type will be added as the VEX IR is a relatively mature instruction set and has been in use for almost 9 years at the time of this writing.
Another type of addition that will need to be supported is a new "IR Expression." Most VEX IR instructions are built from a set of IR Expressions. Take the following IR instructions, for example:
t5 = Add32(t12,0x8:I32)
t10 = CmpLE32S(t2,0x21:I32)
The above 2 IR instructions are of the type WrTmp (an assignment to a temporary; a PUT, by contrast, writes to the guest state): each stores the result of an IR Expression into a temporary. Such an instruction consists of the destination temporary and an IR Expression, which conceptually is the operation to be performed together with its arguments. Most IR instructions have a similar format. Dyncomp is particularly interested in analyzing all possible IR Expressions.
$INV/valgrind-3/valgrind/kvasir/fjalar/kvasir/dyncomp_translate.c
contains the following set of functions for processing IRExpressions.
If a new IR Expression is added, it will need to be handled by one of the above functions. The easiest way to implement it is to base it on the implementation of an existing expression. Alternatively, it should be straightforward to mimic Memcheck's handling of the expression.