NAME

       stap - systemtap script translator/driver

SYNOPSIS

       stap [ OPTIONS ] FILENAME [ ARGUMENTS ]
       stap [ OPTIONS ] - [ ARGUMENTS ]
       stap [ OPTIONS ] -e SCRIPT [ ARGUMENTS ]
       stap [ OPTIONS ] -l PROBE [ ARGUMENTS ]
       stap [ OPTIONS ] -L PROBE [ ARGUMENTS ]

DESCRIPTION

       The  stap  program  is the front-end to the Systemtap tool.  It accepts
       probing  instructions  (written  in  a  simple   scripting   language),
       translates  those  instructions  into C code, compiles this C code, and
       loads the resulting kernel  module  into  a  running  Linux  kernel  to
       perform the requested system trace/probe functions.  You can supply the
       script in a named file, from standard input, or from the command  line.
       The  program runs until it is interrupted by the user, until the script
       voluntarily invokes the exit() function, or until a  sufficient  number
       of soft errors has occurred.

       The language, which is described in a later section, is strictly typed,
       declaration free, procedural, and inspired by awk.   It  allows  source
       code  points  or  events  in the kernel to be associated with handlers,
       which are subroutines that are executed synchronously.  It is  somewhat
       similar conceptually to "breakpoint command lists" in the gdb debugger.
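
       As a minimal illustration (a sketch, not taken from the systemtap
       distribution), the following one-probe script could be saved in a
       file with a hypothetical name such as hello.stp and run with
       "stap hello.stp".  It associates the begin event with a handler
       that prints a message and then ends the session:
              probe begin {
                println("hello world")
                exit()
              }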

OPTIONS

       The  systemtap  translator  supports  the following options.  Any other
       option prints a list of supported options.

       -h     Show help message.

       -V     Show version message.

       -p NUM Stop after pass  NUM.   The  passes  are  numbered  1-5:  parse,
              elaborate,  translate, compile, run.  See the PROCESSING section
              for details.

       -v     Increase verbosity for all passes.  Produce a larger  volume  of
              informative output each time the option is repeated.

       --vp ABCDE
              Increase verbosity on a per-pass basis.  For example, "--vp 002"
              adds 2 units of verbosity  to  pass  3  only.   The  combination
              "-v --vp 00004"  adds  1 unit of verbosity for all passes, and 4
              more for pass 5.

       -k     Keep the temporary directory after all processing.  This may  be
              useful in order to examine the generated C code, or to reuse the
              compiled kernel object.

       -g     Guru mode.  Enable parsing  of  unsafe  expert-level  constructs
              like embedded C.

       -P     Prologue-searching  mode.   Activate  heuristics  to work around
              incorrect debugging information for $target variables.

       -u     Unoptimized  mode.    Disable   unused   code   elision   during
              elaboration.

       -w     Suppressed warnings mode.  Disables all warning messages.

       -b     Use bulk mode (percpu files) for kernel-to-user data transfer.

       -t     Collect timing information on the number of times each probe
              executes and the average amount of time spent in each probe
              point.  Also show the derivation for each probe point.

       -sNUM  Use NUM megabyte buffers for kernel-to-user data transfer.  On a
              multiprocessor in bulk mode, this is a per-processor amount.

       -I DIR Add the given directory to the tapset search directory.  See the
              description of pass 2 for details.

       -D NAME=VALUE
              Add  the  given C preprocessor directive to the module Makefile.
              These can be used to override limit parameters described below.

       -B NAME=VALUE
              Add the given make directive to the kernel module  build's  make
              invocation.   These  can  be  used  to  add  or override kconfig
              options.

       -G NAME=VALUE
              Sets the value of global variable NAME to VALUE when staprun  is
              invoked.   This  applies  to scalar variables declared global in
              the script/tapset.

       -R DIR Look for the systemtap runtime sources in the given directory.

       -r /DIR
              Build for kernel in given build tree. Can also be set  with  the
              SYSTEMTAP_RELEASE environment variable.

       -r RELEASE
              Build  for kernel in build tree /lib/modules/RELEASE/build.  Can
              also be set with the SYSTEMTAP_RELEASE environment variable.

       -m MODULE
              Use the given name  for  the  generated  kernel  object  module,
              instead  of  a  unique  randomized  name.   The generated kernel
              object module is copied to the current directory.

       -d MODULE
              Add symbol/unwind information for  the  given  module  into  the
              kernel  object module.  This may enable symbolic tracebacks from
              those modules/programs, even if they do  not  have  an  explicit
              probe placed into them.

       --ldd  Add symbol/unwind information for all shared libraries suspected
              by ldd to be necessary for user-space binaries being probed  or
              listed  with  the  -d  option.  Caution: this can make the probe
              modules considerably larger.

       --all-modules
              Equivalent to specifying "-dkernel" and a "-d" for  each  kernel
              module  that  is  currently  loaded.  Caution: this can make the
              probe modules considerably larger.

       -o FILE
              Send standard output to named file. In bulk mode,  percpu  files
              will  start  with  FILE_  (FILE_cpu with -F) followed by the cpu
              number.  This supports strftime(3) formats for FILE.

       -c CMD Start the probes, run CMD, and exit when CMD finishes.

       -x PID Sets target() to PID. This allows scripts  to  be  written  that
              filter on a specific process.

       -l PROBE
              Instead of running a probe script, just list all available probe
              points matching the given single probe point.  The  pattern  may
              include  wildcards and aliases, but not comma-separated multiple
              probe points.  The process result code will indicate failure  if
              there are no matches.

       -L PROBE
              Similar  to  "-l",  but list probe points and script-level local
              variables.

       -F     Without -o option, load module and  start  probes,  then  detach
              from the module leaving the probes running.  With -o option, run
              staprun in background as a daemon and show its pid.

       -S size[,N]
              Sets the maximum size of an output file and the maximum number
              of output files.  If an output file would exceed size, systemtap
              switches output to the next file.  If the number of output files
              would exceed N, systemtap removes the oldest output file.  The
              second argument may be omitted.

       --skip-badvars
              Ignore out of context variables and substitute with literal 0.

       --compatible VERSION
              Suppress recent script language  or  tapset  changes  which  are
              incompatible with given older version of systemtap.  This may be
              useful if a much older systemtap script fails to run.   See  the
              DEPRECATION section for more details.

       --check-version
              This  option  is  used  to  check  if  the active script has any
              constructs that may be systemtap  version  specific.   See  the
              DEPRECATION section for more details.

       --use-server [HOSTNAME[:PORT] | IP_ADDRESS[:PORT] | CERT_SERIAL]
              Specify  compile-server(s)  to be used for compilation and/or in
              conjunction with --list-servers and --trust-servers (see below).
              If  no  argument  is  supplied, then the default in unprivileged
              mode (see --unprivileged) is to select compatible servers  which
              are  trusted  as  SSL  peers and as module signers and currently
              online. Otherwise the default is to  select  compatible  servers
              which   are   trusted   as   SSL  peers  and  currently  online.
              --use-server may be specified more than once, in  which  case  a
              list  of  servers is accumulated in the order specified. Servers
              may be specified by host name, IP  address,  or  by  certificate
              serial number (obtained using --list-servers).  The serial
              number form is most commonly used when revoking trust in a
              server (see --trust-servers below).  If a server is specified by
              host name or IP address, then an optional port number may be
              specified.  This is useful for accessing servers which are not
              on the local network or to specify a particular server.

       --list-servers [SERVERS]
              Display the status of the requested SERVERS, where SERVERS is  a
              comma-separated   list   of   server  attributes.  The  list  of
              attributes is combined to filter the list of servers  displayed.
              Supported attributes are:

              all    specifies  all  known servers (trusted SSL peers, trusted
                     module signers, online servers).

              specified
                     specifies servers specified using --use-server.

              online filters the output by retaining information about servers
                     which are currently online.

              trusted
                     filters the output by retaining information about servers
                     which are trusted as SSL peers.

              signer filters the output by retaining information about servers
                     which are trusted as module signers (see --unprivileged).

              compatible
                     filters the output by retaining information about servers
                     which are compatible with the current kernel release  and
                     architecture.

              If no argument is provided, then the default attribute is
              specified.  If no servers were specified using --use-server,
              then the default servers for --use-server are listed.

       --trust-servers [TRUST_SPEC]
              Grant or revoke trust in the compile-servers specified using
              --use-server, as described by TRUST_SPEC, where TRUST_SPEC is a
              comma-separated list specifying the trust which is to be granted
              or revoked.  Supported elements are:

              ssl    trust the specified servers as SSL peers.

              signer trust  the  specified  servers  as  module  signers  (see
                     --unprivileged).  Only root can specify signer.

              all-users
                     grant  trust  as  an  ssl peer for all users on the local
                     host. The default is to grant trust as an  ssl  peer  for
                     the current user only. Trust as a module signer is always
                     granted for all users. Only root can specify all-users.

              revoke revoke the specified trust. The default is to grant it.

              no-prompt
                     do not prompt the user for confirmation  before  carrying
                     out  the  requested  action. The default is to prompt the
                     user for confirmation.

              If no argument is provided, then the  default  is  ssl.   If  no
              servers were specified using --use-server, then no trust will be
              granted or revoked.

              Unless no-prompt has been specified, the user will  be  prompted
              to  confirm  the  trust  to  be  granted  or  revoked before the
              operation is performed.

       --remote [USER@]HOSTNAME
              Set the execution target to the specified ssh  host,  optionally
              using  a  username  not  matching  your own.  This option may be
              repeated to target multiple execution targets.  Passes  1-4  are
              completed locally as normal to build the script, and then pass 5
              will copy the module to the target and run it.  (EXPERIMENTAL)
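
       The following sketch shows how a script might cooperate with the -x
       and -G options.  The file name trace.stp, the PID 1234, and the
       global name verbose are hypothetical, and the syscall.read probe
       point is assumed to be provided by the installed tapset library.
              # possible invocation:  stap -x 1234 -G verbose=1 trace.stp
              global verbose = 0

              probe syscall.read {
                if (pid() != target()) next   # only the process given by -x
                if (verbose) printf("%s: read\n", execname())
              }
       Without -G, verbose keeps its initial value of 0 and the handler
       prints nothing.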

ARGUMENTS

       Any additional arguments on the command line are passed to  the  script
       parser for substitution.  See GENERAL SYNTAX below.

SCRIPT LANGUAGE

       The  systemtap  script  language  resembles  awk.   There  are two main
       outermost constructs: probes and functions.  Within  these,  statements
       and expressions use C-like operator syntax and precedence.

   GENERAL SYNTAX
       Whitespace is ignored.  Three forms of comments are supported:
              # ... shell style, to the end of line, except for $# and @#
              // ... C++ style, to the end of line
              /* ... C style ... */
       Literals  are either strings enclosed in double-quotes (passing through
       the usual C escape codes with backslashes), or  integers  (in  decimal,
       hexadecimal,  or  octal, using the same notation as in C).  All strings
       are limited in length to some reasonable value (a few  hundred  bytes).
       Integers are 64-bit signed quantities, although the parser also accepts
       (and wraps around) values above positive 2**63.

       In addition, script arguments given at the end of the command line  may
       be inserted.  Use $1 ... $<NN> for insertion unquoted, @1 ... @<NN> for
       insertion as a string literal.  The number of arguments may be accessed
       through  $# (as an unquoted number) or through @# (as a quoted number).
       These may be used at any place a token may begin, including within  the
       preprocessing  stage.   Reference to an argument number beyond what was
       actually given is an error.
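
       For example, assuming a hypothetical file args.stp invoked as
       "stap args.stp 5 hello", the following sketch would have 5
       substituted for $1 and the string literal "hello" for @2:
              probe begin {
                printf("%d arguments: %d and %s\n", $#, $1, @2)
                exit()
              }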

   PREPROCESSING
       A simple conditional preprocessing stage is run as a part  of  parsing.
       The general form is similar to the cond ? exp1 : exp2 ternary operator:
              %( CONDITION %? TRUE-TOKENS %)
              %( CONDITION %? TRUE-TOKENS %: FALSE-TOKENS %)
       The CONDITION is either an expression whose format is determined by its
       first keyword, or a comparison of two string literals or two numeric
       literals.  Several such CONDITIONs may also be combined using ||
       (alternative) and && (conjunction).  Parentheses are not supported
       yet, so keep in mind that conjunction takes precedence over
       alternative.

       If  the  first part is the identifier kernel_vr or kernel_v to refer to
       the kernel  version  number,  with  ("2.6.13-1.322FC3smp")  or  without
       ("2.6.13")  the release code suffix, then the second part is one of the
       six standard numeric comparison operators <, <=, ==, !=, >, and >=, and
       the  third part is a string literal that contains an RPM-style version-
       release value.  The condition is deemed satisfied if the version of the
       target kernel (as optionally overridden by the -r option) compares to
       the given version string as the operator specifies.  The comparison
       is performed by the glibc
       function  strverscmp.  As a special case, if the operator is for simple
       equality (==), or inequality (!=), and  the  third  part  contains  any
       wildcard  characters (* or ? or [), then the expression is treated as a
       wildcard (mis)match as evaluated by fnmatch.

       If, on the other hand, the first part is the identifier arch  to  refer
       to  the  processor  architecture  (as  named by the kernel build system
       ARCH/SUBARCH), then the second part is one of the two string comparison
       operators == or !=, and the third part is a string literal for matching
       it.  This comparison is a wildcard (mis)match.

       Similarly, if the first part is an identifier like CONFIG_something  to
       refer  to  a kernel configuration option, then the second part is == or
       !=, and the third part is a  string  literal  for  matching  the  value
       (commonly  "y"  or  "m").   Nonexistent  or  unset kernel configuration
       options are represented by the empty string.  This comparison is also a
       wildcard (mis)match.

       If the first part is the identifier systemtap_v, the test refers to the
       systemtap compatibility  version,  which  may  be  overridden  for  old
       scripts  with  the --compatible flag.  The comparison operator is as is
       for kernel_v and the right operand is a version string.  See  also  the
       DEPRECATION section below.

       Otherwise,  the  CONDITION  is  expected to be a comparison between two
       string literals or two numeric literals.  In this case,  the  arguments
       are the only variables usable.

       The TRUE-TOKENS and FALSE-TOKENS are zero or more general parser tokens
       (possibly including nested preprocessor conditionals), and  are  passed
       into  the input stream if the condition is true or false.  For example,
       the following code induces a  parse  error  unless  the  target  kernel
       version is newer than 2.6.5:
              %( kernel_v <= "2.6.5" %? **ERROR** %) # invalid token sequence
       The following code might adapt to hypothetical kernel version drift:
              probe kernel.function (
                %( kernel_v <= "2.6.12" %? "__mm_do_fault" %:
                   %( kernel_vr == "2.6.13*smp" %? "do_page_fault" %:
                      UNSUPPORTED %) %)
              ) { /* ... */ }

              %( arch == "ia64" %?
                 probe syscall.vliw = kernel.function("vliw_widget") {}
              %)
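
       As a further sketch (the greeting text and the choice of CONFIG_SMP
       are only illustrative), conditions may test kernel configuration
       options or the script arguments, and may be combined with &&:
              probe begin {
                name = %( $# >= 1 %? @1 %: "nobody" %)
                %( CONFIG_SMP == "y" && arch == "x86_64" %?
                   printf("hello %s, from an SMP x86_64 kernel\n", name)
                %:
                   printf("hello %s\n", name)
                %)
                exit()
              }
       Only one of the two printf statements reaches the parser proper; the
       other is discarded during preprocessing.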

   VARIABLES
       Identifiers  for  variables and functions are an alphanumeric sequence,
       and may include "_" and "$" characters.  They  may  not  start  with  a
       plain  digit,  as in C.  Each variable is by default local to the probe
       or function statement block within which it is mentioned, and therefore
       its scope and lifetime are limited to a particular probe  or  function
       invocation.

       Scalar variables are implicitly typed  as  either  string  or  integer.
       Associative  arrays also have a string or integer value, and a tuple of
       strings and/or integers serving  as  a  key.   Here  are  a  few  basic
       expressions.
              var1 = 5
              var2 = "bar"
              array1 [pid()] = "name"     # single numeric key
              array2 ["foo",4,i++] += 5   # vector of string/num/num keys
              if (["hello",5,4] in array2) println ("yes")  # membership test

       The  translator  performs  type inference on all identifiers, including
       array indexes and function parameters.  Inconsistent  type-related  use
       of identifiers signals an error.

       Variables  may  be declared global, so that they are shared amongst all
       probes and live as long as the entire systemtap session.  There is  one
       namespace  for  all  global  variables, regardless of which script file
       they are found within.  A global declaration  may  be  written  at  the
       outermost level anywhere, not within a block of code.  Global variables
       which are written but never read will  be  displayed  automatically  at
       session  shutdown.   The translator will infer for each its value type,
       and if it is used as an  array,  its  key  types.   Optionally,  scalar
       globals  may  be  initialized  with  a  string  or number literal.  The
       following declaration marks variables as global.
              global var1, var2, var3=4

       Global variables can also be set as module options.  One can do this
       either by using the -G option, or by first compiling the module with
       stap -p4 and then setting the global variables on the command line
       when calling staprun on the generated module.  See staprun(8) for
       more information.

       Arrays are limited in size by the MAXMAPENTRIES  variable  --  see  the
       SAFETY AND SECURITY section for details.  Optionally, global arrays may
       be declared with a maximum size in brackets,  overriding  MAXMAPENTRIES
       for  that array only.  Note that this doesn't indicate the type of keys
       for the array, just the size.
              global tiny_array[10], normal_array, big_array[50000]
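
       Putting these pieces together, a small sketch (all names here are
       made up for illustration) might read:
              global total          # scalar; inferred to be an integer
              global seen[32]       # array limited to 32 entries

              probe begin {
                key = "x"           # local string variable
                seen[key, 1] = 5    # string/integer key tuple, integer value
                total = seen[key, 1] + 1
                if (["x", 1] in seen) printf("total=%d\n", total)
                exit()
              }
       Using seen with a differently typed key tuple elsewhere in the script,
       such as seen[1, "x"], would be reported as a type inconsistency.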

   STATEMENTS
       Statements enable procedural  control  flow.   They  may  occur  within
       functions  and probe handlers.  The total number of statements executed
       in response to any single probe event is limited to some number defined
       by  a  macro  in  the translated C code, and is in the neighbourhood of
       1000.

       EXP    Execute the string- or integer-valued expression and throw  away
              the value.

       { STMT1 STMT2 ... }
              Execute  each  statement  in  sequence in this block.  Note that
              separators or terminators are generally  not  necessary  between
              statements.

       ;      Null  statement,  do  nothing.   It  is  useful  as  an optional
              separator between statements to improve  syntax-error  detection
              and to handle certain grammar ambiguities.

       if (EXP) STMT1 [ else STMT2 ]
              Compare  integer-valued  EXP  to  zero.  Execute the first (non-
              zero) or second STMT (zero).

       while (EXP) STMT
              While integer-valued EXP evaluates to non-zero, execute STMT.

       for (EXP1; EXP2; EXP3) STMT
              Execute EXP1 as initialization.  While EXP2 is non-zero, execute
              STMT, then the iteration expression EXP3.

       foreach (VAR in ARRAY [ limit EXP ]) STMT
              Loop  over  each  element  of  the named global array, assigning
              current key to VAR.  The array may not be  modified  within  the
              statement.   By adding a single + or - operator after the VAR or
              the ARRAY identifier, the iteration will  proceed  in  a  sorted
              order,  by  ascending  or  descending index or value.  Using the
              optional limit keyword limits the number of loop  iterations  to
              EXP times.  EXP is evaluated once at the beginning of the loop.

       foreach ([VAR1, VAR2, ...] in ARRAY [ limit EXP ]) STMT
              Same  as  above,  used when the array is indexed with a tuple of
              keys.  A sorting suffix may be used on at most one VAR or  ARRAY
              identifier.

       foreach (VALUE = VAR in ARRAY [ limit EXP ]) STMT
              This  variant  of foreach saves current value into VALUE on each
              iteration, so it is the same as  ARRAY[VAR].   This  also  works
              with  a  tuple of keys.  Sorting suffixes on VALUE have the same
              effect as on ARRAY.

       break, continue
              Exit or iterate the innermost nesting  loop  (while  or  for  or
              foreach) statement.

       return EXP
              Return  EXP  value  from  enclosing function.  If the function's
              value is not taken anywhere, then  a  return  statement  is  not
              needed, and the function will have a special "unknown" type with
              no return value.

       next   Return now from enclosing probe  handler.   This  is  especially
              useful in probe aliases that apply event filtering predicates.

       try { STMT1 } catch { STMT2 }
              Run  the  statements  in  the  first  block.   Upon any run-time
              errors, abort STMT1 and start executing STMT2.   Any  errors  in
              STMT2 will propagate to outer try/catch blocks, if any.

       try { STMT1 } catch(VAR) { STMT2 }
              Same  as  above,  plus  assign  the  error message to the string
              scalar variable VAR.

       delete ARRAY[INDEX1, INDEX2, ...]
              Remove from ARRAY the element specified by the index tuple.  The
              value  will  no  longer  be available, and subsequent iterations
              will not report the element.  It is not an error  to  delete  an
              element that does not exist.

       delete ARRAY
              Remove all elements from ARRAY.

       delete SCALAR
              Removes  the  value of SCALAR.  Integers and strings are cleared
              to 0 and "" respectively, while  statistics  are  reset  to  the
              initial empty state.
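
       The following sketch combines several of these statements.  The
       array name hits and its contents are invented for illustration.
              global hits

              probe begin {
                hits["a"] = 3; hits["b"] = 7; hits["c"] = 5
                # descending by value, at most two iterations: b, then c
                foreach (key in hits- limit 2)
                  printf("%s=%d\n", key, hits[key])
                try {
                  # hits["a"] - 3 is zero, so this division fails at run time
                  printf("%d\n", 100 / (hits["a"] - 3))
                } catch (msg) {
                  println("caught: " . msg)
                }
                delete hits           # remove all elements
                exit()
              }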

   EXPRESSIONS
       Systemtap  supports  a  number  of operators that have the same general
       syntax, semantics, and precedence as  in  C  and  awk.   Arithmetic  is
       performed as per typical C rules for signed integers.  Division by zero
       or overflow is detected and results in an error.

       binary numeric operators
              * / % + - >> << & ^ | && ||

       binary string operators
              .  (string concatenation)

       numeric assignment operators
              = *= /= %= += -= >>= <<= &= ^= |=

       string assignment operators
              = .=

       unary numeric operators
              + - ! ~ ++ --

       binary numeric or string comparison operators
              < > <= >= == !=

       ternary operator
              cond ? exp1 : exp2

       grouping operator
              ( exp )

       function call
              fn ([ arg1, arg2, ... ])

       array membership check
              exp in array
              [exp1, exp2, ...] in array
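
       As a brief sketch of how these operators combine (the variable names
       are arbitrary):
              probe begin {
                n = 7
                s = "n is " . ((n % 2) ? "odd" : "even")   # ternary, concatenation
                s .= "!"                                   # string assignment
                println(s)
                exit()
              }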

   PROBES
       The main construct in the scripting language identifies probes.  Probes
       associate abstract events with a statement block ("probe handler") that
       is to be executed when any of those events occur.  The  general  syntax
       is as follows:
              probe PROBEPOINT [, PROBEPOINT] { [STMT ...] }

       Events  are specified in a special syntax called "probe points".  There
       are several varieties of probe points defined by  the  translator,  and
       tapset scripts may define further ones using aliases.  These are listed
       in the stapprobes(3stap) manual pages.

       The probe handler is interpreted relative to the context of each event.
       For  events  associated  with  kernel  code,  this  context may include
       variables defined in the source  code  at  that  spot.   These  "target
       variables"  are  presented  to  the script as variables whose names are
       prefixed with "$".  They may be accessed only if the kernel's  compiler
       preserved  them despite optimization.  This is the same constraint that
       a debugger user faces when working with  optimized  code.   Some  other
       events  have  very little context.  See the stapprobes(3stap) man pages
       to see the kinds of context variables available at each kind  of  probe
       point.
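
       For instance, a single handler may be attached to more than one probe
       point.  This sketch assumes that the target kernel has functions
       named sys_read and sys_write and that debugging information for them
       is available:
              probe kernel.function("sys_read"), kernel.function("sys_write") {
                printf("%s (pid %d)\n", execname(), pid())
              }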

       New  probe  points may be defined using "aliases".  A probe point alias
       looks similar to a probe definition, but instead of activating a probe
       at the given point, it just defines a new probe point name as an alias
       to an existing one.  There are two types of alias, the prologue style
       and the epilogue style, which are identified by "=" and "+="
       respectively.

       For a prologue style alias, the statement block that follows the alias
       definition is implicitly added as a prologue to any probe that refers
       to the alias.  For an epilogue style alias, the statement block is
       instead implicitly added as an epilogue to any probe that refers to
       the alias.  For example:

              probe syscall.read = kernel.function("sys_read") {
                fildes = $fd
                if (execname() == "init") next  # skip rest of probe
              }
       defines  a   new   probe   point   syscall.read,   which   expands   to
       kernel.function("sys_read"),  with  the  given statement as a prologue,
       which is useful to predefine some variables for the alias  user  and/or
       to skip probe processing entirely based on some conditions.  And
              probe syscall.read += kernel.function("sys_read") {
                if (tracethis) println ($fd)
              }
       defines  a  new  probe  point  with the given statement as an epilogue,
       which is useful to take actions based upon variables set or  left  over
       by the alias user.

       An alias is used just like a built-in probe type.
              probe syscall.read {
                printf("reading fd=%d\n", fildes)
                if (fildes > 10) tracethis = 1
              }

   FUNCTIONS
       Systemtap  scripts  may  define  subroutines to factor out common work.
       Functions take any number of scalar (integer or string) arguments,  and
       must  return  a single scalar (integer or string).  An example function
       declaration looks like this:
              function thisfn (arg1, arg2) {
                 return arg1 + arg2
              }
       Note the general  absence  of  type  declarations,  which  are  instead
       inferred by the translator.  However, if desired, a function definition
       may include explicit type declarations for its return value and/or  its
       arguments.   This  is  especially helpful for embedded-C functions.  In
       the following example, the type inference engine need only  infer  the
       type of arg2 (a string).
              function thatfn:string (arg1:long, arg2) {
                 return sprint(arg1) . arg2
              }
       Functions  may  call  others  or  themselves recursively, up to a fixed
       nesting limit.  This limit is defined by a macro in  the  translated  C
       code and is in the neighbourhood of 10.
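
       A recursive function might be sketched as follows.  The function name
       fact is hypothetical, and the argument is kept small so that the
       recursion stays well within the nesting limit:
              function fact:long (n:long) {
                if (n <= 1) return 1
                return n * fact(n - 1)
              }

              probe begin {
                printf("%d\n", fact(5))   # 5 * 4 * 3 * 2 * 1
                exit()
              }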

   PRINTING
       There is a set of function names that are specially  treated  by  the
       translator.  They format values for printing to the standard  systemtap
       output  stream  in  a more convenient way.  The sprint* variants return
       the formatted string instead of printing it.

       print, sprint
              Print one or more values  of  any  type,  concatenated  directly
              together.

       println, sprintln
              Print values like print and sprint, but also append a newline.

       printd, sprintd
              Take  a string delimiter and two or more values of any type, and
              print the values with the delimiter interposed.   The  delimiter
              must be a literal string constant.

       printdln, sprintdln
              Print  values with a delimiter like printd and sprintd, but also
              append a newline.

       printf, sprintf
              Take a formatting string and a number of values of corresponding
              types,  and print them all.  The format must be a literal string
              constant.

       The printf formatting directives are similar to those of C, except that
       they are fully type-checked by the translator:

              %b     Writes a binary blob of the value given, instead of ASCII
                     text.  The width specifier determines the number of bytes
                     to  write;  valid  specifiers  are  %b  %1b  %2b %4b %8b.
                     Default (%b) is 8 bytes.

              %c     Character.

              %d,%i  Signed decimal.

              %m     Safely reads kernel memory at the given address,  outputs
                     its  content.   The  precision  specifier  determines the
                     number of bytes to read.  Default is 1 byte.

              %M     Same as %m, but outputs in hexadecimal.  The minimal size
                     of output is double the precision specifier.

              %o     Unsigned octal.

              %p     Unsigned pointer address.

              %s     String.

              %u     Unsigned decimal.

              %x     Unsigned hex value, in all lower-case.

              %X     Unsigned hex value, in all upper-case.

              %%     Writes a %.

       Examples:
                   a = "alice", b = "bob", p = 0x1234abcd, i = 123, j = -1, id[a] = 1234, id[b] = 4567
                   print("hello")
                        Prints: hello
                   println(b)
                        Prints: bob\n
                   println(a . " is " . sprint(16))
                        Prints: alice is 16\n
                   foreach (name in id)  printdln("|", strlen(name), name, id[name])
                        Prints: 5|alice|1234\n3|bob|4567\n
                   printf("%c is %s; %x or %X or %p; %d or %u\n",97,a,p,p,p,j,j)
                        Prints: a is alice; 1234abcd or 1234ABCD or 0x1234abcd; -1 or 18446744073709551615\n
                   printf("2 bytes of kernel buffer at address %p: %2m", p, p)
                        Prints: 2 bytes of kernel buffer at address 0x1234abcd: <binary data>
                   printf("%4b", p)
                        Prints (these values as binary data): 0x1234abcd
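
       As a further sketch (not part of the original examples), the script
       below contrasts the printing and string-returning variants.  The
       values are made up for illustration.
              probe begin {
                # sprintf returns the formatted string instead of printing it
                line = sprintf("%s has uid %d", "alice", 1000)
                println(line)
                # printdln joins its values with the delimiter, then a newline
                printdln(",", "alice", 1000, strlen(line))
                exit()
              }
       This should print "alice has uid 1000" and then "alice,1000,18", each
       followed by a newline.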

   STATISTICS
       It is often desirable to collect statistics in a way that avoids the
       penalty of repeatedly taking exclusive locks on the global variables
       that hold those numbers.  Systemtap provides a solution using a
       special operator to accumulate values, and several pseudo-functions to
       extract the statistical aggregates.

       The  aggregation operator is <<<, and resembles an assignment, or a C++
       output-streaming operation.  The left operand  specifies  a  scalar  or
       array-index  lvalue,  which must be declared global.  The right operand
       is a numeric expression.  The  meaning  is  intuitive:  add  the  given
       number  to the pile of numbers to compute statistics of.  (The specific
       list of statistics to gather is given  separately,  by  the  extraction
       functions.)
                  foo <<< 1
                  stats[pid()] <<< memsize

       The  extraction  functions  are also special.  For each appearance of a
       distinct extraction function  operating  on  a  given  identifier,  the
       translator  arranges  to  compute  a set of statistics that satisfy it.
       The statistics system is thereby "on-demand".   Each  execution  of  an
       extraction  function  causes  the  aggregation  to be computed for that
       moment across all processors.

       Here is the set of extractor functions.  The first argument of each  is
       the  same  style of lvalue used on the left hand side of the accumulate
       operation.  The @count(v), @sum(v), @min(v), @max(v), @avg(v) extractor
       functions   compute  the  number/total/minimum/maximum/average  of  all
       accumulated values.  The resulting values are all simple integers.

       Histograms are also available, but are more  complicated  because  they
       have       a       vector      rather      than      scalar      value.
       @hist_linear(v,start,stop,interval) represents a linear histogram  from
       "start"  to  "stop"  by increments of "interval".  The interval must be
       positive.  Similarly,  @hist_log(v)  represents  a  base-2  logarithmic
       histogram.  Printing  a  histogram  with  the print family of functions
       renders a histogram object as a tabular "ASCII art" bar chart.
              global x
              probe foo {
                x <<< $value
              }
              probe end {
                printf ("avg %d = sum %d / count %d\n",
                        @avg(x), @sum(x), @count(x))
                print (@hist_log(x))
              }
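
       A linear histogram can be requested in the same manner.  The following
       is a minimal sketch that assumes the timer probes and the randint()
       tapset function are available; the variable name and bucket bounds are
       arbitrary.
              global lat
              probe timer.ms(100) {
                # accumulate a pseudo-random sample in the range 0..49
                lat <<< randint(50)
              }
              probe timer.s(5) {
                printf ("n %d, min %d, max %d\n",
                        @count(lat), @min(lat), @max(lat))
                # buckets of width 10 covering 0 through 50
                print (@hist_linear(lat, 0, 50, 10))
                exit()
              }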

   TYPECASTING
       Once a pointer has been saved  into  a  script  integer  variable,  the
       translator  loses the type information necessary to access members from
       that pointer.  Using the @cast() operator tells the translator  how  to
       read a pointer.
              @cast(p, "type_name"[, "module"])->member

       This  will  interpret  p as a pointer to a struct/union named type_name
       and dereference the member value.  Further ->subfield  expressions  may
       be  appended to dereference more levels.   NOTE: the same dereferencing
       operator -> is used to refer to  both  direct  containment  or  pointer
       indirection.   Systemtap  automatically determines which.  The optional
       module tells the translator where to look for  information  about  that
       type.   Multiple  modules may be specified as a list with : separators.
       If the module is not specified, it will default either to the probe
       module for dwarf probes, or to "kernel" for functions and all other
       probe types.

       The translator can create its own module with type information  from  a
       header  surrounded  by  angle brackets, in case normal debuginfo is not
       available.  For kernel headers, prefix it with "kernel" to use the
       appropriate build system.  All other headers are built with default GCC
       parameters into a user module.  Multiple headers may be specified in
       sequence to resolve a codependency.
              @cast(tv, "timeval", "<sys/time.h>")->tv_sec
              @cast(task, "task_struct", "kernel<linux/sched.h>")->tgid
              @cast(task, "task_struct",
                    "kernel<linux/sched.h><linux/fs_struct.h>")->fs->umask
       Values acquired by @cast may be pretty-printed by the "$" and "$$"
       suffix operators, the same way as described in the CONTEXT VARIABLES
       section of the stapprobes(3stap) manual page.

       When in guru mode, the translator will also allow scripts to assign new
       values to members of typecasted pointers.

       Typecasting is also useful in the case of void* members whose type  may
       be determinable at runtime.
              probe foo {
                if ($var->type == 1) {
                  value = @cast($var->data, "type1")->bar
                } else {
                  value = @cast($var->data, "type2")->baz
                }
                print(value)
              }
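
       As another sketch, a pointer returned by a tapset function can be
       interpreted the same way.  This assumes that kernel debuginfo is
       installed and that the task_current() tapset function is available; no
       module is named, so the "kernel" default applies.
              probe begin {
                t = task_current()
                printf ("pid %d, tgid %d\n",
                        @cast(t, "task_struct")->pid,
                        @cast(t, "task_struct")->tgid)
                exit()
              }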

   EMBEDDED C
       When  in guru mode, the translator accepts embedded code in the script.
       Such code is enclosed between %{ and %}  markers,  and  is  transcribed
       verbatim,  without  analysis,  in  some  sequence, into the generated C
       code.  At the outermost level, this  may  be  useful  to  add  #include
       instructions,  and  any auxiliary definitions for use by other embedded
       code.

       Another place where embedded code is permitted is as a  function  body.
       In  this case, the script language body is replaced entirely by a piece
       of C code enclosed again between %{ and %} markers.  This C code may do
       anything  reasonable  and safe.  There are a number of undocumented but
       complex  safety  constraints  on   atomicity,   concurrency,   resource
       consumption, and run time limits, so this is an advanced technique.

       The  memory  locations  set  aside for input and output values are made
       available to it using a macro THIS.  Here are some examples:
              function add_one (val) %{
                THIS->__retvalue = THIS->val + 1;
              %}
              function add_one_str (val) %{
                strlcpy (THIS->__retvalue, THIS->val, MAXSTRINGLEN);
                strlcat (THIS->__retvalue, "one", MAXSTRINGLEN);
              %}
       The function argument and return value types have to be inferred by the
       translator  from  the  call  sites in order for this to work.  The user
       should examine C code generated for ordinary script-language  functions
       in order to write compatible embedded-C ones.

       The  last  place  where  embedded code is permitted is as an expression
       rvalue.  In this case, the C code enclosed between %{ and %} markers is
       interpreted  as  an  ordinary  expression value.  It is assumed to be a
       normal 64-bit  signed  number,  unless  the  marker  /*  string  */  is
       included, in which case it's treated as a string.
              function add_one (val) {
                return val + %{ 1 %}
              }
              function add_string_two (val) {
                return val . %{ /* string */ "two" %}
              }

       The  embedded-C  code  may  contain  markers to assert optimization and
       safety properties.

       /* pure */
              means that the C code has no side  effects  and  may  be  elided
              entirely if its value is not used by script code.

       /* unprivileged */
              means  that  the  C code is so safe that even unprivileged users
              are permitted to use it.

       /* myproc-unprivileged */
              means that the C code is so safe that  even  unprivileged  users
              are permitted to use it, provided that the target of the current
              probe is within the user's own process.

       /* guru */
              means that the C code is so unsafe that a  systemtap  user  must
              specify -g (guru mode) to use this.

       /* string */
              in  embedded-C  expressions  only, means that the expression has
              const char * type and should  be  treated  as  a  string  value,
              instead of the default long numeric.
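
       For illustration only, here is how the first add_one example might
       carry a marker.  The explicit :long type declarations are optional and
       are shown as an assumption about the intended types; guru mode is still
       needed to run it.
              function add_one:long (val:long) %{ /* pure */
                THIS->__retvalue = THIS->val + 1;
              %}
              probe begin {
                println (add_one (41))
                exit()
              }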

   BUILT-INS
       A set of builtin functions and probe point aliases is provided by the
       scripts installed in the directory specified in the stappaths(7)
       manual page.  The functions are described in the stapfuncs(3stap) and
       stapprobes(3stap) manual pages.

PROCESSING

       The translator begins pass 1 by parsing the given input script, and all
       scripts   (files  named  *.stp)  found  in  a  tapset  directory.   The
       directories listed with -I are processed in sequence, each processed in
       "guru  mode".   For each directory, a number of subdirectories are also
       searched.  These subdirectories are derived from the selected kernel
       version (the -r option), in order to allow more kernel-version-specific
       scripts to override less specific ones.   For  example,  for  a  kernel
       version  2.6.12-23.FC3  the  following  patterns  would be searched, in
       sequence: 2.6.12-23.FC3/*.stp, 2.6.12/*.stp, 2.6/*.stp, and finally
       *.stp.  Stopping the translator after pass 1 causes it to print the
       parse trees.

       In pass 2, the translator analyzes the input script to resolve  symbols
       and  types.  References to variables, functions, and probe aliases that
       are unresolved internally are satisfied by searching through the parsed
       tapset scripts.  If any tapset script is selected because it defines an
       unresolved symbol, then the entirety of that script  is  added  to  the
       translator's resolution queue.  This process iterates until all symbols
       are resolved and a subset of tapset scripts is selected.

       Next, all probe point  descriptions  are  validated  against  the  wide
       variety  supported  by the translator.  Probe points that refer to code
       locations ("synchronous probe points") require the  appropriate  kernel
       debugging  information  to  be  installed.   In  the  associated  probe
       handlers, target-side variables (whose names begin with "$") are  found
       and have their run-time locations decoded.

       Next,   all   probes   and  functions  are  analyzed  for  optimization
       opportunities, in order to remove variables, expressions, and functions
       that have no useful value and no side-effect.  Embedded-C functions are
       assumed to have side-effects  unless  they  include  the  magic  string
       /* pure */.   Since  this optimization can hide latent code errors such
       as type mismatches or invalid $target variables, it  sometimes  may  be
       useful to disable the optimizations with the -u option.

       Finally,  all variable, function, parameter, array, and index types are
       inferred  from  context  (literals  and   operators).    Stopping   the
       translator  after  pass  2 causes it to list all the probes, functions,
       and variables, along with all  inferred  types.   Any  inconsistent  or
       unresolved types cause an error.

       In  pass 3, the translator writes C code that represents the actions of
       all selected script files, and creates a Makefile to build that into  a
       kernel  object.   These  files  are  placed into a temporary directory.
       Stopping the translator at this point causes it to print  the  contents
       of the C file.

       In  pass  4,  the  translator  invokes the Linux kernel build system to
       create the actual kernel object file.  This involves  running  make  in
       the  temporary  directory,  and  requires  a kernel module build system
       (headers, config and Makefiles) to  be  installed  in  the  usual  spot
       /lib/modules/VERSION/build.   Stopping  the  translator after pass 4 is
       the last chance before running the kernel object.  This may  be  useful
       if you want to archive the file.

       In pass 5, the translator invokes the systemtap auxiliary program
       staprun for the given kernel object.  This program arranges to load
       the module, then communicates with it, copying trace data from the
       kernel into temporary files, until the user sends an interrupt signal.
       Any run-time error encountered by the probe handlers, such as running
       out of memory, division by zero, or exceeding nesting or runtime
       limits, results in a soft error indication.  Soft errors in excess of
       MAXERRORS block all subsequent probes (except error-handling probes),
       and terminate the session.  Finally, staprun unloads the module and
       cleans up.
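
       As a hedged illustration of stopping early, the comments in the
       following trivial script show how it might be run through selected
       passes.  The file name hello.stp is hypothetical.
              # stap -p 2 hello.stp   stop after elaboration; lists the
              #                       probes, functions and inferred types
              # stap -p 4 hello.stp   stop after compiling the kernel object
              # stap hello.stp        run all five passes
              probe begin {
                println ("hello")
                exit()
              }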

   ABNORMAL TERMINATION
       One should avoid killing the stap process forcibly,  for  example  with
       SIGKILL,  because  the  stapio  process  (a  child  process of the stap
       process) and the loaded module may be left running on the  system.   If
       this happens, send SIGTERM or SIGINT to any remaining stapio processes,
       then use rmmod to unload the systemtap module.

EXAMPLES

       See the stapex(3stap) manual page for a collection of samples.

CACHING

       The systemtap translator caches the pass  3  output  (the  generated  C
       code)  and  the  pass  4  output (the compiled kernel module) if pass 4
       completes successfully.  This cached  output  is  reused  if  the  same
       script  is  translated  again  assuming the same conditions exist (same
       kernel version, same systemtap version, etc.).  Cached files are stored
       in  the  $SYSTEMTAP_DIR/cache  directory.  The  cache can be limited by
       having the file cache_mb_limit placed in  the  cache  directory  (shown
       above)  containing  only an ASCII integer representing how many MiB the
       cache should not exceed. Note that this is a 'soft' limit in  that  the
       cache  will  be  cleaned after a new entry is added, so the total cache
       size may temporarily exceed this limit. In the absence of this file,  a
       default will be created with the limit set to 64MiB.

SAFETY AND SECURITY

       Systemtap  is  an administrative tool.  It exposes kernel internal data
       structures and potentially private user information.

       To actually run the kernel objects it builds, a user must be one of the
       following:

       ·   the root user;

       ·   a member of the stapdev and stapusr groups; or

       ·   a member of the stapusr group.

       The root user or a user who is a member of both the stapdev and stapusr
       groups can build and run any systemtap script.  Members of the  stapusr
       group can only use pre-built modules under the following conditions:

       ·   The   module   is  located  in  the  /lib/modules/VERSION/systemtap
           directory.  This directory must be owned by root and not  be  world
           writable.

       ·   The module has been signed by a trusted signer.  Trusted signers
           are normally systemtap compile-servers which sign modules when the
           --unprivileged option is specified by the client.  See the
           stap-server(8) manual page for more information.

       The kernel modules generated by the stap program are run by the staprun
       program.   The  latter is a part of the Systemtap package, dedicated to
       module loading and unloading (but only in the white zone), and  kernel-
       to-user  data  transfer.  Since staprun does not perform any additional
       security checks on the kernel objects it is given, it would  be  unwise
       for  a  system  administrator  to add untrusted users to the stapdev or
       stapusr groups.

       The translator asserts certain safety constraints.  It aims  to  ensure
       that no handler routine can run for very long, allocate memory, perform
       unsafe operations, or unintentionally interfere with the kernel.
       Use  of  script  global variables is suitably locked to protect against
       manipulation by concurrent probe handlers.  Use of guru mode constructs
       such  as  embedded  C  can violate these constraints, leading to kernel
       crash or data corruption.

       The resource use limits are set by macros  in  the  generated  C  code.
       These  may  be overridden with the -D flag.  A selection of these is as
       follows:

       MAXNESTING
              Maximum number of nested function calls.  Default determined  by
              script  analysis,  with  a  bonus  10  slots added for recursive
              scripts.

       MAXSTRINGLEN
              Maximum length of strings, default 128.

       MAXTRYLOCK
              Maximum number  of  iterations  to  wait  for  locks  on  global
              variables  before  declaring  possible deadlock and skipping the
              probe, default 1000.

       MAXACTION
              Maximum number of statements to execute during any single  probe
              hit (with interrupts disabled), default 1000.

       MAXACTION_INTERRUPTIBLE
              Maximum  number of statements to execute during any single probe
              hit which is executed with interrupts enabled (such as begin/end
              probes), default (MAXACTION * 10).

       MAXMAPENTRIES
              Maximum number of rows in any single global array, default 2048.

       MAXERRORS
              Maximum  number  of  soft  errors  before  an exit is triggered,
              default 0, which means  that  the  first  error  will  exit  the
              script.

       MAXSKIPPED
              Maximum  number  of  skipped probes before an exit is triggered,
              default 100.  Running systemtap with -t (timing) mode gives more
              details    about    skipped    probes.     With    the   default
              -DINTERRUPTIBLE=1 setting, probes skipped due to reentrancy  are
              not accumulated against this limit.

       MINSTACKSPACE
              Minimum  number  of free kernel stack bytes required in order to
              run a probe handler, default 1024.  This number should be  large
              enough for the probe handler's own needs, plus a safety margin.

       MAXUPROBES
              Maximum   number   of   concurrently   armed  user-space  probes
              (uprobes), default somewhat larger than the number of user-space
              probe  points  named  in  the  script.   This  pool  needs to be
              potentially large because individual uprobe objects (about 64
              bytes  each)  are  allocated  for each process for each matching
              script-level probe.

       STP_MAXMEMORY
              Maximum amount of  memory  (in  kilobytes)  that  the  systemtap
              module  should use, default unlimited.  The memory size includes
              the size of the module itself, plus any additional  allocations.
              This  only  tracks  direct allocations by the systemtap runtime.
              This  does  not  track  indirect   allocations   (as   done   by
              kprobes/uprobes/etc. internals).

       TASK_FINDER_VMA_ENTRY_ITEMS
              Maximum  number  of  VMA  pages that will be tracked at runtime.
              This might get  exhausted  for  system  wide  probes  inspecting
              shared  library  variables  and/or  user backtraces. Defaults to
              1536.

       STP_PROCFS_BUFSIZE
              Size of procfs probe  read  buffers  (in  bytes).   Defaults  to
              MAXSTRINGLEN.  This value can be overridden on a per-procfs file
              basis using the procfs read probe .maxsize(MAXSIZE) parameter.
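
       As a sketch of overriding one of these limits, the handler below
       executes many statements in a single probe hit and may exceed the
       default budget.  The file name loop.stp and the chosen value are
       assumptions for illustration.
              # stap -DMAXACTION=20000 loop.stp
              probe begin {
                for (i = 0; i < 5000; i++)
                  total += i
                println (total)
                exit()
              }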

       With scripts that contain probes on any interrupt path, it is  possible
       that those interrupts may occur in the middle of another probe handler.
       The probe in the interrupt handler would be skipped  in  this  case  to
       avoid  reentrance.   To  work  around this issue, execute stap with the
       option  -DINTERRUPTIBLE=0  to  mask  interrupts  throughout  the  probe
       handler.   This  does add some extra overhead to the probes, but it may
       prevent reentrance for common problem cases.  However,  probes  in  NMI
       handlers  and  in the callpath of the stap runtime may still be skipped
       due to reentrance.

       Multiple scripts can write data into a relay buffer concurrently.  A
       host script provides an interface for accessing its relay buffer to
       guest scripts.  Then, the output of the guests is merged into the
       output of the host.  To run a script as a host, execute stap with the
       -DRELAYHOST[=name] option.  The name identifies your host script among
       several hosts.  While running the host, execute stap with
       -DRELAYGUEST[=name] to add a guest script to the host.  Note that you
       must unload guests before unloading a host.  If any guests are still
       connected to the host, unloading the host will fail.

       In case something goes wrong with stap or staprun  after  a  probe  has
       already  started  running, one may safely kill both user processes, and
       remove the active probe kernel module with rmmod.   Any  pending  trace
       messages may be lost.

       In  addition to the methods outlined above, the generated kernel module
       also uses overload processing to make sure that probes  can't  run  for
       too   long.    If  more  than  STP_OVERLOAD_THRESHOLD  cycles  (default
       500000000) have been spent in all the probes on a single cpu during the
       last STP_OVERLOAD_INTERVAL cycles (default 1000000000), the probes have
       overloaded the system and an exit is triggered.

       By default, overload processing is turned on for all modules.   If  you
       would  like  to disable overload processing, define STP_NO_OVERLOAD (or
       its alias STAP_NO_OVERLOAD).

EXIT STATUS

       The systemtap translator generally returns with a success code of 0  if
       the  requested  script  was processed and executed successfully through
       the requested pass.  Otherwise, errors may be printed to stderr  and  a
       failure code is returned.  Use -v or --vp N to increase (global or
       per-pass) verbosity to identify the source of the trouble.

       In listing mode (-l and -L), error messages are normally suppressed.
       A  success  code  of  0  is returned if at least one matching probe was
       found.

       A script executing in pass 5 that is interrupted with ^C  /  SIGINT  is
       considered to be successful.

DEPRECATION

       Over  time, some features of the script language and the tapset library
       may undergo incompatible changes, so that a script written  against  an
       old  version  of  systemtap  may no longer run.  In these cases, it may
       help to run systemtap with the --compatible  VERSION  flag,  specifying
       the  last  known  working version of systemtap.  Running systemtap with
       the --check-version flag will output a warning if any possibly
       incompatible elements have been parsed.  Below is a table of recently
       deprecated tapset functions and syntax elements that require the  given
       --compatible flag to use:

       --compatible=1.2
              (none yet)

       --compatible=1.3
              The  tapset  alias  'syscall.compat_pselect7a' was misnamed.  It
              should have been 'syscall.compat_pselect7' (without the trailing
              'a').  Starting in release 1.4, the old name will be deprecated.

       --compatible=1.4
              In the 'syscall.add_key' probe, the 'description_auddr' variable
              has been deprecated in  favor  of  the  new  'description_uaddr'
              variable.

              In the 'syscall.fgetxattr', 'syscall.fsetxattr',
              'syscall.getxattr', 'syscall.lgetxattr', 'syscall.lremovexattr',
              'nd_syscall.fgetxattr', 'nd_syscall.fremovexattr',
              'nd_syscall.fsetxattr', 'nd_syscall.getxattr', and
              'nd_syscall.lremovexattr' probes, the 'name2' variable has been
              deprecated in favor of the new 'name_str' variable.

              In the 'nd_syscall.accept' probe  the  'flag_str'  variable  has
              been deprecated in favor of the new 'flags_str' variable.

              In  the  'nd_syscall.dup'  probe  the 'old_fd' variable has been
              deprecated in favor of the new 'oldfd' variable.

              The tapset alias 'nd_syscall.compat_pselect7a' was misnamed.  It
              should   have  been  'nd_syscall.compat_pselect7'  (without  the
              trailing 'a').

              The tapset function 'cpuid' is deprecated in favor of the better
              known 'cpu'.

              In the i386 'syscall.sigaltstack' probe, the 'ussp' variable has
              been deprecated in favor of the new 'uss_uaddr' variable.

              In the ia64 'syscall.sigaltstack' probe, the 'ss_uaddr' and
              'oss_uaddr' variables have been deprecated in favor of the new
              'uss_uaddr' and 'uoss_uaddr' variables.

              The powerpc tapset alias 'syscall.compat_sysctl' was  deprecated
              and renamed 'syscall.sysctl32'.

              In  the  x86_64  'syscall.sigaltstack'  probe,  the 'regs_uaddr'
              variable  has  been  deprecated  in  favor  of  the  new  'regs'
              variable.

FILES

       Important files and their corresponding paths can be located in the
       stappaths(7) manual page.

SEE ALSO

       stapprobes(3stap),    stapfuncs(3stap),    stappaths(7),    staprun(8),
       stapvars(3stap), stapex(3stap), stap-server(8), awk(1), gdb(1)

BUGS

       Use the Bugzilla link of the project web page or our mailing list.
       http://sources.redhat.com/systemtap/, <systemtap@sources.redhat.com>.

                                                                       STAP(1)