Where to get started with utrace: http://redhat.com/~roland/utrace/

Please do not be shy about sending me your work in progress.  As soon as
you have any patches even kind of compiling, please send them along so I
can give you feedback.  I can fix up small issues while folding things in.

Now that you have the code in hand, the good news is you don't really need
to grok much about utrace to do the arch work.  You can look at
Documentation/utrace.txt if you want to understand it, but I think you can
just skim it or skip it entirely and proceed with the steps below to start
doing your arch work without worrying too much about what it's all for.


			Native Porting
			------ -------

1. Create your asm-foo/tracehook.h file and write its tracehook_* entry
   points.  Comments at the top of linux/tracehook.h describe the entry
   points you need to provide.  Look at the existing ports for the model.
   The single_step (and block_step, if supported) calls will likely just be
   renaming some existing functions you already have in ptrace.c or else
   moving a simple line or two from the bowels of ptrace.c into new inlines.
   (See note below about single-step when there is no true hardware support.)
   The syscall_trace calls are often the same code existing arches use.

   Make sure that your existing arch/foo/kernel/ptrace.c code for syscall
   tracing is cleaned up and calls tracehook_report_syscall.  Make sure
   signal handling calls tracehook_report_handle_signal.  (See existing
   ports for how it's used.)

   tracehook_syscall_callno, tracehook_syscall_retval, and
   tracehook_abort_syscall depend on the assembly code (often in entry.S)
   that calls the C function (often called something like do_syscall_trace)
   containing the call to tracehook_report_syscall.
   tracehook_abort_syscall can only be used on the user-mode register state
   that is sitting at the syscall-entry tracing call.  The other two can
   only be used either there or at the syscall-exit tracing call.

   This should be enough to compile with CONFIG_UTRACE enabled and
   CONFIG_PTRACE disabled.  You may also have to dike out some old code in
   arch/foo/kernel/ptrace.c with #if 0 for the moment (later we'll clean it
   out).  Please do test that the kernel builds with CONFIG_UTRACE
   disabled, and with CONFIG_UTRACE enabled but CONFIG_PTRACE disabled.
   Each of those builds should produce a kernel that works fine in general,
   but anything that uses ptrace will get ENOSYS.  (It's easy to verify
   that with "strace /bin/true" and seeing the error.)

2. regset support.  This is where you rip arch/foo/kernel/ptrace.c into
   little shreds and leave behind just the core arch-specific stuff we
   really need.  Look at the utrace_regset data structure in linux/tracehook.h.
   The regsets will encompass all the per-thread machine-specific CPU state
   that you can access via ptrace.  Your ptrace.c has existing code to
   access it for PTRACE_GETREGS/SETREGS, PTRACE_PEEKUSR/POKEUSR, etc.

   For biarch machines where you support different kinds of user threads,
   I would recommend starting with just the native (64-bit) support and
   testing that first.  (That's how I did the x86_64 and powerpc work.)
   The Biarch section below will talk about doing the 32-bit compatibility.

   Here is where the architecture knowledge comes in.  It is probably
   fairly obvious what the natural regset flavors are for your machine,
   and more so to you than to me.  So far on every machine, regset 0 has
   the same contents and layout as elf_gregset_t, which appears in core
   dumps inside the NT_PRSTATUS note, and regset 1 matches elf_fpregset_t,
   which appears as the whole NT_PRFPREG note in core dumps.
   Please make your regset 0 and regset 1 follow this rule.
   If there is a really strong reason to differ, please talk to me about it.

   Other regsets are defined for whatever CPU state there is that is
   accessible via ptrace.  If there was an existing ptrace call a la
   PTRACE_GETFPXREGS to get some particular info, then it makes most sense
   to make that existing layout the layout of a distinct regset.  For
   example the powerpc altivec regset is this way.  If there is other info
   that was accessible only via PTRACE_PEEKUSR, your knowledge of the
   architecture will tell you what the natural divisions of this are, and
   the natural layout for the info.  An individual regset definition should
   be a topically distinct item rather than a collection of unrelated
   things that were all in struct user.  There are some weird cases, like
   on non-ancient i386 hardware there is regset 1 (fpregs) and regset 2
   (fpxregs) which actually describe the same CPU state in different
   formats, one a superset of the other.

   For extra credit, take note of other kinds of per-thread
   machine-specific CPU state the kernel has that was not accessible via
   ptrace.  Put at least a comment mentioning new things that might be
   worth making available.  For example, the x86 has the ioperm bitmap and
   the LDT state that should be made new regsets in the future.

   Sketch out what you think your regset definitions will be in an array of
   struct utrace_regset, and then define a struct utrace_regset_view
   pointing to that and a utrace_native_view function that returns it.
   Take the definitions in existing ports' arch/foo/kernel/ptrace.c as a model.

   Now that you have your regset layouts set in mind, you can write their
   get and set hooks.  This just means taking the existing code that
   implemented either PEEKUSR/POKEUSR or GET*REGS/SET*REGS and putting it
   into new entry points with the signatures in struct utrace_regset.  You
   have to adapt it slightly because those calls allow arbitrary access to
   some portion in the middle of the regset, not just one word at a time or
   the whole thing.  However, you don't have to worry about being called
   with pos and count arguments that aren't aligned to the ->align value
   you chose for the regset, i.e. whole registers.  So it's pretty easy.

   The linux/tracehook.h header defines some helper inlines to simplify
   your regset code: utrace_regset_copyout and utrace_regset_copyin, and
   utrace_regset_copyout_zero and utrace_regset_copyin_ignore.  These
   take care of the flexible calling details of the regset get/set
   functions for you.  A short sequence of calls to these will usually
   cover mapping a regset's layout into the one or several places the data
   is stored inside struct pt_regs or thread_struct.
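
   As a rough illustration of how such a sequence of calls behaves, here
   is a userland model in plain C.  It is not the kernel code: modela_regs,
   modela_copyout and modela_gregs_get are hypothetical names invented for
   this sketch, and the real helpers in linux/tracehook.h also deal with
   user versus kernel buffers, which the model leaves out.  The point is
   only that each call handles the slice of the request overlapping one
   piece of the layout and then advances pos and count.

```c
#include <assert.h>
#include <string.h>

/* Userland model only; every name here is hypothetical. */
struct modela_regs {
	unsigned long gpr[4];	/* stand-in for registers kept in pt_regs */
	unsigned long pc;	/* stand-in for a register stored elsewhere */
};

/*
 * Copy the part of the request [*pos, *pos + *count) that falls inside
 * the regset byte range [start, end) out of DATA, then advance the cursor.
 */
static void modela_copyout(unsigned int *pos, unsigned int *count,
			   unsigned char **buf, const void *data,
			   unsigned int start, unsigned int end)
{
	unsigned int n;

	if (*count == 0 || *pos >= end)
		return;
	assert(*pos >= start);	/* pieces must be walked in layout order */

	n = end - *pos;
	if (n > *count)
		n = *count;
	memcpy(*buf, (const unsigned char *)data + (*pos - start), n);
	*buf += n;
	*pos += n;
	*count -= n;
}

/* A "get" hook for a regset laid out as gpr[0..3] followed by pc. */
static int modela_gregs_get(const struct modela_regs *regs,
			    unsigned int pos, unsigned int count, void *out)
{
	unsigned char *buf = out;
	const unsigned int gpr_end = sizeof(regs->gpr);
	const unsigned int pc_end = gpr_end + sizeof(regs->pc);

	modela_copyout(&pos, &count, &buf, regs->gpr, 0, gpr_end);
	modela_copyout(&pos, &count, &buf, &regs->pc, gpr_end, pc_end);
	return count ? -1 : 0;	/* leftover bytes: request ran past the end */
}
```

   A request that starts in the middle of gpr[] and runs through pc is
   served by the same two calls as a request for the whole regset; the
   hook itself needs no special cases for partial access.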

   If hardware supported at compile time doesn't actually exist on the
   machine at boot time, the regset hooks can all return -ENODEV.

   If some particular state is lazily allocated and you can cheaply
   distinguish "never used" from just having default values, then
   define a utrace_regset.active hook.  For example, lazy-allocated FPU
   state that has never been used in this thread.  (There's no point in
   defining one if you have nothing better to do than compare the whole
   state to the initial state to decide it's "inactive".)
   This is not crucial.

   It's ok if the set of regsets in the regset_view varies with the
   compile-time configuration.  For example, you can build a powerpc kernel
   without altivec support and then that regset doesn't show up at all (as
   opposed to an altivec-supporting kernel on a machine with no altivec
   hardware, where the regset shows up but its calls return -ENODEV).
   Just keep the order consistent.
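
   The difference between "not compiled in" and "compiled in but absent
   on this machine" can be modeled in userland as follows.  This is only
   a sketch under made-up assumptions: MODELB_CONFIG_VECTOR, the modelb_*
   names and the argument-free get hooks are all hypothetical, and the
   real hooks have the signatures given in struct utrace_regset.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userland model only; every name here is hypothetical. */
#define MODELB_CONFIG_VECTOR 1	/* stand-in for a compile-time config option */

struct modelb_regset {
	const char *name;
	int (*get)(void);	/* real hooks also take target, pos, count, buffers */
};

static int modelb_cpu_has_vector;	/* stand-in for a boot-time CPU feature test */

static int modelb_gregs_get(void) { return 0; }
static int modelb_fpregs_get(void) { return 0; }

#if MODELB_CONFIG_VECTOR
static int modelb_vector_get(void)
{
	/* Support is compiled in, but this machine may lack the hardware. */
	return modelb_cpu_has_vector ? 0 : -ENODEV;
}
#endif

/* The relative order of the entries is the same in every configuration. */
static const struct modelb_regset modelb_regsets[] = {
	{ "gregs", modelb_gregs_get },
	{ "fpregs", modelb_fpregs_get },
#if MODELB_CONFIG_VECTOR
	{ "vector", modelb_vector_get },
#endif
};

#define MODELB_NREGSETS (sizeof(modelb_regsets) / sizeof(modelb_regsets[0]))

/* Call the get hook of regset I in the model view. */
static int modelb_get(size_t i)
{
	assert(i < MODELB_NREGSETS);
	return modelb_regsets[i].get();
}
```

   With MODELB_CONFIG_VECTOR set to 0 the third entry disappears from the
   table entirely; with it set to 1 the entry is always present and only
   its hook's return value depends on the hardware found at boot.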

   At this point, when you have something that builds and doesn't break
   normal functioning, please make sure I have your latest patches ASAP.
   I'd like to see your work and give you feedback, even if the code is not
   yet ready to be merged in.

3. ptrace compatibility support.
   (On biarch platforms, see below about 32-bit ptrace compatibility.)

   All you have to do here is define arch_ptrace, which is called
   differently than the old arch_ptrace, and doesn't need to do as much
   work.  This function handles any machine-specific ptrace request codes
   traditionally supported, that are not already covered in the generic
   kernel/ptrace.c code.  Here you only need to handle the arch-specific
   ptrace requests and the ones with arch-specific layout (PTRACE_PEEKUSR
   et al).  Don't try to extend anything, just cover exact compatibility
   with the ptrace requests that were available before.
   linux/ptrace.h declares several helper functions to use in arch_ptrace.
   When in doubt, please ask me for advice about the specific case.  (For
   the moment, just leave out requests that are specially for getting at a
   32-bit process from a 64-bit one or vice versa.)

   If you're lucky, PTRACE_GETREGS and the like used the same layout as
   in core dumps to begin with, so those layouts match your regset definitions.
   For these you can use ptrace_whole_regset.  For example, i386 has:

	case PTRACE_GETREGS:
		return ptrace_whole_regset(child, engine, data, 0, 0);
	case PTRACE_SETREGS:
		return ptrace_whole_regset(child, engine, data, 0, 1);
	case PTRACE_GETFPREGS:
		return ptrace_whole_regset(child, engine, data, 1, 0);
	case PTRACE_SETFPREGS:
		return ptrace_whole_regset(child, engine, data, 1, 1);

   Here PTRACE_GETREGS accesses all of regset 0, and PTRACE_GETFPREGS
   accesses all of regset 1.

   If ptrace formats are different from regset formats, then you'll need to
   use the ptrace_layout_access function to map ptrace-compatible formats
   to the regsets you've defined.  The ia64 and sparc64 ports have examples
   of this.

   On all machines, the "struct user" layout supported by PTRACE_PEEKUSR
   and PTRACE_POKEUSR differs from any regset layout you want.  This format
   is handled the same way; the ptrace_peekusr and ptrace_pokeusr inlines
   are simplified front ends to ptrace_layout_access.  You just need to
   define an array of struct ptrace_layout_segment to map the "struct user"
   layout to regset data, such as in arch/i386/kernel/ptrace.c:

	static const struct ptrace_layout_segment i386_uarea[] = {
		{0, FRAME_SIZE*4, 0, 0},
		{offsetof(struct user, u_debugreg[0]),
		 offsetof(struct user, u_debugreg[8]), 4, 0},
		{0, 0, -1, 0}
	};


   See linux/ptrace.h for comments explaining the ptrace_layout_segment fields.
   Then in arch_ptrace use:

	case PTRACE_PEEKUSR:
		return ptrace_peekusr(child, engine, i386_uarea, addr, data);
	case PTRACE_POKEUSR:
		return ptrace_pokeusr(child, engine, i386_uarea, addr, data);
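
   To make the idea of such a table concrete, here is a userland model of
   the lookup it implies.  Everything in it is an assumption made for
   illustration: the modelc_* names are hypothetical, the field order is
   guessed from the initializers shown above, the offsets in the sample
   table are invented, and ending the table at an entry with end == 0 is
   this model's own convention.  The comments in linux/ptrace.h remain the
   authority on the real struct ptrace_layout_segment.

```c
#include <assert.h>
#include <stddef.h>

/* Userland model only; every name here is hypothetical. */
struct modelc_segment {
	unsigned int start, end;	/* byte range in the "struct user" layout */
	int regset;			/* index of the regset backing that range */
	unsigned int offset;		/* position of START within that regset */
};

/* Invented layout: 17 four-byte registers, a gap, 8 debug registers at 252. */
static const struct modelc_segment modelc_uarea[] = {
	{ 0, 17 * 4, 0, 0 },
	{ 252, 252 + 8 * 4, 4, 0 },
	{ 0, 0, -1, 0 }		/* end == 0 terminates the table in this model */
};

/*
 * Translate a "struct user" byte offset into a regset index and a byte
 * position inside that regset.  Returns 0 on success, or -1 if no
 * segment of the table covers ADDR.
 */
static int modelc_lookup(const struct modelc_segment *seg, unsigned int addr,
			 int *regset, unsigned int *pos)
{
	assert(seg != NULL);
	for (; seg->end != 0; seg++) {
		if (addr >= seg->start && addr < seg->end) {
			*regset = seg->regset;
			*pos = addr - seg->start + seg->offset;
			return 0;
		}
	}
	return -1;
}
```

   In this model, offset 264 lands in regset 4 at position 12, while
   offset 100 falls in the gap between the two segments and is rejected.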

   It's now possible to test ptrace (for the native arch, i.e. 64-bit).
   The first simple test I usually do is "strace /bin/true" and look if it
   works and any of it reads as sane.  Then you can test with ntrace, or
   for the real pain, with gdb.  I'll write more about testing below.

   Before you start sinking time into the testing and debugging,
   send me your latest patches.

4. regset caveats: ptrace-induced.

   The struct utrace_regset definition is a nice idea of a clean way to
   talk about register info, i.e. natural sizes and alignments.
   Traditional practice in the ptrace interfaces is not always so clean.

   I think it's clean in a worthwhile way that the utrace_regset hooks are
   *the* way to get at that thread data, and that the ptrace code is
   layered on top of that well-defined architecture-independent interface.
   I like the fact that you can't break the clean utrace accessors used by
   future facilities, without breaking the old ptrace interface.  If you
   test ptrace, which many people use all the time via existing
   applications today, then you're testing the arch utrace_regset code so
   that future portable utrace-based work will have a good chance of
   working out of the box across architectures.

   However, this approach means that the regset accessors have to allow the
   access that ptrace did.  So if PTRACE_PEEKUSR allows 4-byte aligned
   access to some register data that naturally comes in bigger units, the
   regset get/set hooks need to support that.  But you can still set .size
   and .align to their natural values.  I am open to different opinions on
   how this stuff should be.

5. regset caveats: utrace API

   I said before that the regset hooks only need to do exactly what the
   old ptrace code did to get at the machine-specific thread state.  That
   is not entirely true, because the utrace API allows a subtle complication.

   With ptrace, the task you are dealing with is always stopped in
   TASK_TRACED, has completed its context switch off the CPU, and won't
   run again until after you're done examining it.  (The only exception is
   that it can wake up for SIGKILL and run through to exit and have
   release_task called concurrently with your looking at it, but the
   task_struct will remain live until your call returns.  It doesn't
   matter if you can read scrambled data while racing with the thread
   dying by SIGKILL, but you mustn't crash the kernel because of it.)
   This is always true with old ptrace, and is also true with the new
   ptrace layered on top of utrace.  So you will not observe any problem
   in testing using ptrace.

   However, the definition of the utrace API allows another case: the
   regset hooks can be called by the current thread on itself.  (We allow
   this in the specification because it can enable much more efficient new
   interfaces built on utrace, which send register data somewhere
   immediately without the notification-request roundtrips and context
   switches always entailed in the ptrace way of doing things.)

   For the user-mode general registers, this is usually no different
   because they are already saved in the same place on the kernel stack of
   the current thread as they are when a thread is stopped.

   Other regset state may normally stay live in the CPU and only be
   stored into the thread_struct on context switch.  FPU state is usually
   like this.  The regset accessor functions may need to do some special
   machine-specific call when the target is the current thread to
   synchronize the thread_struct data with the live CPU data.
   For example, on x86:

		if (target == current)
			unlazy_fpu(target);

   The existing ptrace code for your arch won't have covered this case,
   because in the old ptrace world, the thread of interest was always some
   different thread that had completed a context switch so its state is
   not live in any CPU.  You need to carefully consider each regset with
   your knowledge of the arch and how/when/what CPU state it saves at
   context switches.  Make sure your regset hooks correctly get and set
   the current user-mode CPU state when called on current.
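
   The hazard can be pictured with a small userland model.  It is only a
   sketch: modeld_cpu_fpu stands in for the live CPU registers,
   modeld_current for "current", and modeld_unlazy_fpu for whatever
   machine-specific call your arch uses to flush live state; none of
   these names exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Userland model only; every name here is hypothetical. */
struct modeld_fpu {
	double reg[4];
};

struct modeld_task {
	struct modeld_fpu saved;	/* stand-in for FPU state in thread_struct */
	int fpu_live;			/* nonzero while the CPU holds newer state */
};

static struct modeld_fpu modeld_cpu_fpu;	/* stand-in for live CPU registers */
static struct modeld_task *modeld_current;	/* stand-in for "current" */

/* Flush live CPU state into the saved copy, as a context switch would. */
static void modeld_unlazy_fpu(struct modeld_task *t)
{
	if (t->fpu_live) {
		t->saved = modeld_cpu_fpu;
		t->fpu_live = 0;
	}
}

/* A "get" hook that is safe to call on the current thread as well. */
static void modeld_fpregs_get(struct modeld_task *target,
			      struct modeld_fpu *out)
{
	assert(target != NULL && out != NULL);
	if (target == modeld_current)
		modeld_unlazy_fpu(target);	/* saved copy may be stale */
	*out = target->saved;
}
```

   Without the check, a hook called on the current thread would return
   whatever was last written to the saved copy rather than the values the
   thread is actually computing with.  A set hook has the mirror-image
   problem: the new values must reach the CPU, not just the saved copy.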

6. Single-step caveats: utrace API

   The issue described in #5 above also applies to the tracehook_*
   functions for single-step (and block-step, when supported).  In ptrace
   these things were always done by one thread acting on another that is
   not running (modulo SIGKILL, see above).  Now, these calls are in fact
   made on the current thread shortly before it returns to user mode.
   tracehook_enable_single_step et al must cope with being called on
   current (though for modularity's sake they should not assume they will
   only be called on current).  If you need to modify some CPU state that
   is normally set in the CPU itself by context switch and not by returning
   to user mode, then you'll need to change the CPU state directly, whereas
   in the old ptrace world you could just change thread_struct bits that
   would be copied into the CPU on context switch.

7. Software single-step.

   A few machines do not have any hardware single-step support, but provide
   PTRACE_SINGLESTEP by doing memory breakpoint insertion.  If your machine
   does this, do not define tracehook_enable_single_step et al.  The
   tracehook single-step/block-step functions are intended for true
   hardware support, or forms of software support that truly work as well
   as hardware support does.  Simply changing memory has a lot of problems,
   notably its incompatibility with multi-threaded debugging.

   For ptrace compatibility, just handle PTRACE_SINGLESTEP in your
   arch_ptrace function using your existing code.  If arch_ptrace needs
   to do something that should be undone when ptrace cleans up,
   asm/ptrace.h can #define HAVE_ARCH_PTRACE_DETACH and it will
   call void arch_ptrace_detach(struct task_struct *) before detaching.

   In future, the utrace world will have facilities to do things like
   per-thread breakpoints while mitigating the bad side effects of
   breakpoint insertion.  Then single-stepping will be emulated using
   those.  Until we have that, your old PTRACE_SINGLESTEP support code is
   fine for ptrace, but new utrace-based users will expect not to see side
   effects like memory-writing breakpoint insertion and are better off not
   falsely thinking there is proper single-step support.

8. Weirdo arch complications.

   If there are other CPU features that don't seem to fit into the
   utrace_regset model, please talk to me.  Of imminent concern is
   anything old ptrace did support on your machine.  But I'm also curious
   about anything else you think might make sense to add later on.

9. regset definitions: arch consensus

   The choice of regsets to define and their layouts and parameters won't
   be set in stone right away.  You shouldn't hesitate to send me patches
   using choices that might not be final.  Certainly more regsets can be
   added later for new kinds of per-thread machine data.  But eventually,
   the set of regsets and each one's layout once each has been defined
   will become a more or less stable part of the linux arch definition.
   It's most often obvious to everyone what the natural layouts for the
   arch are, and probably not too much deep thought will be required.
   But talk to your arch maintainers and interested hackers for your arch,
   and make sure everyone agrees on what makes sense for how to export the
   machine-specific state on your arch.

10. Sweep out the detritus.

    arch/foo/kernel/ptrace.c files have long been gathering places for
    cruft.  Some arch maintainers claim their own ptrace.c harbors
    dangerous dragons with which they've been loath to meddle.  One of
    the goals of utrace is to isolate the arch support requirements
    into clean and well-defined interfaces dealing with just the
    machine-specific state and leaving arch maintainers free from
    worrying about arcane minutiae and strange entanglements of the
    ptrace implementation.  The code doesn't need to be big and scary.
    In the future, this support should be straightforward and natural
    for a new arch maintainer to write concisely to enable good
    debugging facilities, with just a small effort and the arch knowledge.

    Once you've gotten your regset and tracehook support functions
    working, there may be some old dead code in ptrace.c you can get
    rid of.  Likewise anything obsolete in asm-foo/ptrace.h that's not
    part of the userland interface (i.e. anything in #ifdef __KERNEL__)
    can go.  It's a fine opportunity to reorganize an old tangled mess
    of silly subfunctions into just the cleanest straightforward
    concise implementation of the regset accessor hooks and so forth.
    To be exceedingly tidy, you can make it all conditionally compiled
    on CONFIG_UTRACE and test building without it--who knows, someday
    someone might want to save a little code space on a stripped-down
    embedded configuration with no way to use debugging facilities.

    That said, reusing existing internal code as undisturbed as
    possible also has its benefits.  The less you change things, the
    easier it is to be sure you are still doing it right.  Moreover,
    the easier it is to read your changes in patch form and be
    confident that the important low-level chunks were preserved.
    The transition to utrace can seem drastic enough as it is.
    Your arch maintainers might very understandably prefer to take an
    incremental approach to cleaning up the internals of the arch code.
    Use your judgment and consult with your arch kernel community.
    The most important thing is that all the arch maintainers are
    comfortable with folding in the utrace support changes.
    More cleanups can always happen later on.


			Biarch (or multi-arch) Support
			------ --- ----------- -------

I recommend doing the native support first and getting it working before
worrying about multi-arch support.  Move on to the Testing section below
and iron out kinks in your native support first, then come back here.

If your platform is biarch, supporting both native 64-bit processes and
also 32-bit processes, you need to do some more work to support debugging
32-bit processes with utrace, and to support 32-bit processes that use
ptrace to debug other 32-bit processes.  If an arch supports more than two
alternative kinds of processes, you just need to repeat the work once for
each additional flavor after the main "native" flavor of the arch.

The existing ports for powerpc, s390x, x86_64, ia64, and sparc64 all
include biarch support.  You can look at their code for examples.


1. biarch tracehook support.

   The tracehook_abort_syscall, tracehook_syscall_callno, and
   tracehook_syscall_retval functions need to do the right thing with the
   struct pt_regs from either a 32-bit or a 64-bit task.  If the registers
   are not used the same way for the syscall number and return value
   between different kinds of user threads, they need to check for the
   flavor.  Let me know if you need more than just the struct pt_regs for
   this; I can change the signatures to pass the task_struct too if need be.

2. biarch regset support.

   If there are different kinds of user threads, like 32-bit and 64-bit, then
   each one is described by a different struct utrace_regset_view.

   This is like #2 above all over again, but matching the regset layouts that
   a native 32-bit kernel would present as the single regset_view available.
   There is existing code in arch/foo/kernel/ptrace.c under CONFIG_COMPAT, or
   another appropriate place (e.g. arch/x86_64/ia32/ptrace32.c), that you'll
   start with, just as you used the old native arch ptrace code for the
   native regset support.

   Define and populate a second utrace_regset_view someplace.  There is
   probably some commonality in the hooks, like the FPU state is often the
   same between 32 and 64, so it may make sense to define them all together
   in arch/foo/kernel/ptrace.c with some parts inside #ifdef CONFIG_COMPAT.
   Make the utrace_native_view function return the struct utrace_regset_view
   corresponding to the mode that its argument task_struct is running in.
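
   In outline, that selection amounts to something like the userland
   model below.  The names are hypothetical (modele_thread, is_32bit and
   modele_native_view are invented for the sketch); how your arch really
   tells a 32-bit task from a 64-bit one, and the real struct
   utrace_regset_view, are for you to take from your arch code and
   linux/tracehook.h.

```c
#include <assert.h>
#include <stddef.h>

/* Userland model only; every name here is hypothetical. */
struct modele_view {
	const char *name;
	unsigned int nregsets;
};

struct modele_thread {
	int is_32bit;	/* stand-in for whatever flag marks a compat task */
};

static const struct modele_view modele_view_64 = { "native64", 3 };
static const struct modele_view modele_view_32 = { "compat32", 4 };

/* Return the view that matches the mode the thread is running in. */
static const struct modele_view *
modele_native_view(const struct modele_thread *t)
{
	assert(t != NULL);
	return t->is_32bit ? &modele_view_32 : &modele_view_64;
}
```

   The choice is made per task, not per kernel, so a 64-bit debugger and
   a 32-bit debugger looking at the same 32-bit task both get the 32-bit
   view from this function.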

   That's it.  There is very little useful testing you can do of the regset
   support before the ptrace part is done.  So once this compiles and hasn't
   made life regress in general, you should probably just send off your patch
   to me and then get on with the next step.

3. biarch ptrace compatibility.
   The main work here is to be sure you got the biarch regset support right.

   Now go back to arch/foo/kernel/ptrace.c, or wherever is appropriate, and
   add an arch_compat_ptrace function (inside #ifdef CONFIG_COMPAT or
   whatever is appropriate).  This is just like the arch_ptrace function, but
   needs to match whatever ptrace requests should be supported for 32-bit
   processes.

   At this point, you can test 32-bit debuggers using ptrace on 32-bit
   processes on the 64-bit kernel.

   Finally, go back to your 64-bit arch_ptrace function and fill in any
   special requests having to do with examining 32-bit processes.
   If there are special calls for a 32-bit ptrace call to access 64-bit
   information about a 64-bit process, those go in arch_compat_ptrace.

   Now the ptrace support should be complete.  As soon as you've got it
   building and think it's all there, send me the new patch.
   You can now test the full complement of arch-calling-ptrace on
   arch-of-thread-examined combinations that are supported in the old
   ptrace implementation for your machine.


			Testing
			-------

Now, about testing.  As ptrace is now layered on top of utrace and the new
utrace arch support interfaces, the chief practical means of testing is to
test ptrace.

strace is good for a simple non-demanding user of ptrace.  It's the first
thing to run as a sanity check that things are working at all.  If it
shows problems (i.e. behaves differently than it did on the older
pre-utrace kernel), it's relatively easy to figure out what ptrace calls
it does and debug the scenario.

Simple uses of gdb are the next stage of testing.  Run something, set a
breakpoint, do some single-stepping (display/i $pc; stepi) and see that it
steps one instruction at a time, etc.

The only real torture test I've been using is to run the gdb test
suite.  My primary metric for success has been parity on the gdb test
suite; that is, the same behavior it sees on the old kernel.  I'll go
into painful detail below about setting this up.

At my aforementioned web site, there is a tarball called "ntrace".  That
will eventually be a package of fancy new stuff.  But what it contains
right now of interest is a small framework and test suite for utrace.
There are only a few tests so far and they are not very demanding.  (That
makes it somewhat easier to debug issues and pass the tests than e.g. the
gdb suite.)  As you and others find problems in the utrace core code,
I'll try to add regression tests to this suite.

The ntrace tarball just needs the usual "configure; make; make check" for
basic tests.  But it won't compile at all on your arch without at least a
touch of arch support code.  You may or may not want to deal with this.
It will give you a little bit of testing that's easier to deal with than
the gdb suite if you need to debug kernel problems.  But if you just try
some strace and gdb and things look OK, there is no big reason to play
with my little test suite.  If you're not interested, skip ahead to the
gdb test suite instructions below.

The ntrace package can build tests for the new kernel code, but it first
builds a userland test harness based on ptrace calls that approximates the
utrace kernel API for the purpose of some test code.  This is a good way
to verify that the test setup is sane, it does some degree of testing on
ptrace, and it will be useful to me later on.  It's that userland test
harness that needs to be ported to your machine.  As soon as you send me a
draft patch with regset layouts, I can whip up the CPU support in this
test harness.  Even less than that is required to compile it and do all
the utrace tests except for the ones accessing registers: with just some
stub code in new per-CPU files, "make -k; make -k check" should be able
to run and pass most of the tests.

To start with you can build ntrace using configure --without-kmod.  Make
sure that works right on the vanilla pre-utrace kernel.  Once the test
setup works ok in the userland harness, you can enable the real kernel
tests.  configure with no options will look in the standard (RH) place
/lib/modules/`uname -r`/build for the kernel config against which to build
kernel modules.  Or use configure --with-kmod=/my/dir to point it at your
own kernel build directory.  It needs the kernel you're running, a version
that exports the new utrace symbols.  Now "make" should build some kernel
modules as well.  Now "make check" (or just "make check-kmod") will run
tests that use insmod to install test modules and examine their output in
the dmesg log.  It uses "sudo /sbin/insmod" and "sudo /sbin/rmmod", so
make sure you can do that without a password from your shell before you
run the tests.  The tests are not very demanding, but if they don't work
it should not be too hard (as these things go) to debug the kernel using
them.  Please don't hesitate to ask me for detailed advice if you start
trying to debug problems.


The big scary test is running the gdb test suite.  Start with some gdb
sources.  What I've used is the trunk code from sourceware.org via cvs
checkout.  You can use the latest gdb release tarball, whatever works.
configure and make all-gdb.  (Always use a separate build directory;
you're going to be doing two builds.)  You might want --disable-gdbtk or
something like that to cut down some of the cruft we don't care about in
the build and have fewer prerequisite libraries installed to build gdb.
There may be some other special prerequisites for building gdb on your arch.
I expect you know the sort of nonsense likely entailed.  If not, find the
person who builds gdb for your arch and get them to show you.
I'll assume the build worked OK.

You need dejagnu and expect installed (yum install dejagnu).
Before running the test suite, do "ulimit -c unlimited"; make sure /sbin
and /usr/sbin are in PATH (don't really remember why any more).  Make sure
prelink is installed (yum install prelink); it doesn't matter if you've
run prelink on your system, just have it around for the test suite to find.

Running the test suite can take a really long time.  Some individual tests
may be very slow and a long time can pass before you get another PASS/FAIL
line out.  On the other hand, hitting the bugs you're looking for can
easily mean that part of the test suite wedges, or spins, or hangs the
machine.  So keep an eye on it with ps or whatnot and check for wedges,
but be aware that it might sometimes take a long time apparently doing not
much and this can be normal.

1. Get the baseline for comparison.  Run an existing kernel you think is
   stable and reliable.  When I did this testing a while back, I used FC5
   and rawhide kernels, and also my own build of an unmodified current
   Linus tree at the time.

   Run the test suite.  From your top-level build directory, you can do
   "make check-gdb" or "cd gdb; make check".  At least after the first
   time it's run, this boils down to just "cd gdb/testsuite; runtest" and
   I usually just use "runtest" directly.  See above re: "really long time".

   runtest writes all the interesting output into "gdb.sum" and "gdb.log".
   To keep track of the runs on different kernels, I usually do:
	mkdir results-`uname -r`
	runtest; mv gdb.sum gdb.log results-`uname -r`
   There is nothing in the terminal output from running the suite that's
   not saved in those files, so you don't need to redirect or save that
   output.

   I always look first at the executive summary from "tail gdb.sum".
   Don't be alarmed at seeing some number of failures.  On x86 using the
   gdb trunk from a month back there were something like 150 "unexpected
   failures".  We're not interested in gdb bugs here.  We're just looking
   at gdb's results as indicators of the kernel ptrace behavior and
   comparing one kernel to another.  (If there are thousands of failures
   and no successes on the current stable kernel, then that is a problem
   with the gdb build and that's beyond the scope of my endeavors.  I may
   be able to help you with some of that, but really you should hound the
   gdb people or more directly your arch's person who worries about gdb.
   If the kernel's status quo for ptrace on your arch is royally broken,
   then I've got your compatibility covered right here, buddy.)

2. Test your new kernel with utrace-based ptrace.  On the new kernel,
   follow the same procedure to run the test suite in your gdb build.
   If you survive, look at the results.

   Again, "tail gdb.sum" is the first thing that will tell you how much
   coffee you're going to need this evening.  Then what I do is:
	diff -u 2.6.stable/gdb.sum 2.6.flakey-utrace/gdb.sum | less
   Go straight to the end and see how your "unexpected failures" line
   compares.  If it's +/- a few from the baseline stable kernel's number,
   life is good.  (If it's way worse, then take a deep breath and just
   worry about one of them at a time. :-)  Now look through the diff to
   see which particular tests show -PASS +FAIL or some other unexplained
   difference.  In gdb.sum files, each test script is announced by name
   (gdb.base/foo.exp or suchlike), followed by PASS/FAIL/XFAIL etc. lines
   logging the individual items in that script.  The gdb.log file contains
   that same output, interspersed with all the detailed logs of the gdb
   session the test ran.  You can look for a "FAIL: some message" line in
   gdb.log and try to tell what happened that the script wasn't expecting.
   (Compare to the baseline run's gdb.log for that same test to see how
   things normally work out.)
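
   If paging through the whole diff gets tedious, something like the
   following can pull out just the -PASS +FAIL cases.  This is only a
   sketch: list_regressions is a made-up name, not anything gdb provides,
   and it assumes result lines have the form "PASS: name" and "FAIL: name"
   with identical name text in both runs.

```shell
# list_regressions BASELINE.sum NEW.sum  (hypothetical helper, not part of gdb)
# Print the name of every test that PASSed in the baseline run but
# FAILs in the new run.  Assumes result lines look like "PASS: name".
list_regressions() {
	awk '
		# First file: remember every test that passed.
		NR == FNR { if (sub(/^PASS: /, "")) passed[$0] = 1; next }
		# Second file: print failures among those tests.
		sub(/^FAIL: /, "") && ($0 in passed)
	' "$1" "$2"
}

# Example (substitute your own saved baseline and utrace gdb.sum files):
#	list_regressions BASELINE/gdb.sum UTRACE/gdb.sum | less
```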

   To reproduce a particular test failure, you can do:
	runtest gdb.base/foo.exp
   using whatever the name of the test script is as shown in the log.
   (Note this overwrites gdb.sum and gdb.log in the gdb/testsuite dir, so
   be sure you've copied them aside if you're looking at full-suite results.)
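
   When several scripts fail, it can help to collect their names from a
   saved gdb.sum and rerun them one at a time.  A rough sketch, assuming
   FAIL lines carry the script name as a prefix ("FAIL: gdb.base/foo.exp:
   some message"); failing_scripts and the directory names in the example
   are made up for illustration.

```shell
# failing_scripts SUMFILE  (hypothetical helper, not part of gdb)
# Print each test script that has at least one FAIL line, once each.
# Assumes FAIL lines look like "FAIL: gdb.base/foo.exp: some message".
failing_scripts() {
	sed -n 's/^FAIL: \([^ :]*\.exp\):.*/\1/p' "$1" | sort -u
}

# Example: rerun every failing script from a saved full-suite run,
# keeping the output of each script in its own directory.
#	for t in `failing_scripts SAVED/gdb.sum`; do
#		d=rerun-`echo $t | tr / -`
#		mkdir -p $d; runtest $t; mv gdb.sum gdb.log $d
#	done
```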

   You may also be able to figure out from reading the transcripts in gdb.log
   what kinds of uses are failing, and use gdb by hand to provoke a similar
   failure in a way that's easier to debug.  If you have the FPU register
   data inside out or whatever, it may be discernible from the weirdness you
   see in gdb how it's gone wrong in the kernel implementation.

   Please feel free to ask for specific help and advice from me when
   figuring out what's going on in the test suite and any divergences
   from the baseline kernel's behavior.

3. Biarch gdb testing.  As far as I'm aware, gdb does not do a very good
   job of testing its cross-arch support out of the box.  The following
   is what I did for 64/32 testing on x86_64 and powerpc.

   Get a 32-bit gdb build, preferably from the same sources you used for
   the native build above.  Use something like:
	CC='gcc -m32' ..../gdb-src/configure i386-linux # your 32-bit cpu
	make
	make check
   You can just do this when booted on a 32-bit kernel if you want, and
   then use that build for runs on a 64-bit kernel.  However, running
   the test suite will compile some little programs, so you need to make
   sure it uses a $CC with -m32 to compile 32-bit test programs.
   If your biarch devel system is set up correctly, you will be able to
   do the CC='gcc -m32' build fine on a 64-bit system.

   Follow all the instructions above to run the test suite on your
   baseline kernel.  If you have a native 32-bit kernel you're
   supporting, then test on the stable 32-bit native kernel first.
   We're testing 32-bit gdb debugging 32-bit processes, on the 32-bit kernel.

	cd gdb/testsuite
	mkdir results-32on32on32-`uname -r`
	runtest; mv gdb.sum gdb.log results-32on32on32-`uname -r`

   Next reboot and run that same build's test suite on the 64-bit kernel.

	mkdir results-32on32on64-`uname -r`
	runtest; mv gdb.sum gdb.log results-32on32on64-`uname -r`

   Compare the 32 vs 64 baseline kernels and don't be too surprised if
   they don't match perfectly already.  Biarch is a pain to get right.
   Any regressions of 32on32on64 vs 32on32on32 deserve to be noted and
   fixed in the stable kernel branch, but that is not our concern here.

   Finally, take the third biarch baseline.  Here I backdoor the 64-bit
   gdb binary into the testsuite setup for the 32-bit build.

	cd gdb/testsuite
	cp .../build-64bit-gdb/gdb/gdb ../gdb64
	mv ../gdb ../gdb32
	ln ../gdb64 ../gdb
	mkdir results-64on32on64-`uname -r`
	runtest; mv gdb.sum gdb.log results-64on32on64-`uname -r`

   I'm not sure gdb supports a 64-bit debugger for 32-bit processes on
   more than a couple of platforms, but if it works at all, however well
   it works is a baseline for comparison of the vanilla kernel and the
   utrace kernel.  I'm not at all sure that any 32-bit build of gdb
   works to debug 64-bit processes, though there is ptrace support for
   that on some machines.  If that does exist for your arch, then do the
   analogous procedure for 32on64on64.

   So now you have two or three or four sets of biarch results in
   addition to the native results from step #1.  For each of the one to
   five variants of life in your world, repeat the testing procedure for
   the baseline kernel and for the new utrace kernel and compare the
   results for the same variation between old and new kernels.
   When everything works as well on new kernels as on old, it's Miller time.
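
   With this many result directories around, a quick side-by-side of the
   "unexpected failures" counts can show which variant to look at first.
   The following is a sketch under assumptions: the helper names are made
   up, it expects the results-* directory naming used above, and it
   assumes each gdb.sum has a "# of unexpected failures" summary line
   (a missing line shows up as 0).

```shell
# failures SUMFILE  (hypothetical helper)
# Print the count from the "# of unexpected failures" summary line.
failures() {
	awk '/^# of unexpected failures/ { n = $NF } END { print n + 0 }' "$1"
}

# compare_results OLDVER NEWVER  (hypothetical helper)
# For every results-*OLDVER directory in the current directory, print its
# unexpected-failure count next to that of the matching results-*NEWVER
# directory, e.g. "results-32on32on64-OLDVER: 150 -> 152".
compare_results() {
	for old in results-*"$1"; do
		[ -f "$old/gdb.sum" ] || continue
		new=${old%"$1"}$2
		if [ -f "$new/gdb.sum" ]; then
			echo "$old: $(failures "$old/gdb.sum") -> $(failures "$new/gdb.sum")"
		else
			echo "$old: no matching $new"
		fi
	done
}
```

   Run it from the directory holding the results-* directories, giving
   the `uname -r` strings of the baseline and utrace kernels.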


Please harangue me early and often for more assistance and
clarification on all this.


	-- Roland McGrath <roland@redhat.com>