perf-mem(1) — Linux manual page

PERF-MEM(1)                    perf Manual                    PERF-MEM(1)

NAME

       perf-mem - Profile memory accesses

SYNOPSIS

       perf mem [<options>] (record [<command>] | report)

DESCRIPTION

       "perf mem record" runs a command and gathers memory operation data
       from it, into perf.data. Perf record options are accepted and are
       passed through.

       "perf mem report" displays the result. It invokes perf report with
       the right set of options to display a memory access profile. By
       default, loads and stores are sampled. Use the -t option to limit
       to loads or stores.
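
       The record and report steps above can be sketched as a short
       shell session. This is an illustration only, not part of the
       upstream page: the workload (sleep 1) and the run helper are
       placeholders, and the helper only prints each command unless
       RUN=1 is set and perf is installed.

```shell
# Hypothetical helper: print the command, and execute it only when
# RUN=1 is set and a perf binary is found on PATH.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ] && command -v perf >/dev/null 2>&1; then
        "$@"
    fi
}

# Sample loads only while the workload runs; samples go to ./perf.data.
run perf mem -t load record -- sleep 1

# Display the memory access profile read from ./perf.data.
run perf mem -t load report
```

       Dropping -t load from both commands samples loads and stores,
       which is the default.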

       Note that on Intel systems the memory latency reported is the
       use-latency, not the pure load (or store) latency. Use-latency
       includes any pipeline queuing delays in addition to the memory
       subsystem latency.

       On Arm64 this uses SPE to sample load and store operations,
       therefore hardware and kernel support is required. See
       perf-arm-spe(1) for a setup guide. Due to the statistical nature
       of SPE sampling, not every memory operation will be sampled.

COMMON OPTIONS

       -f, --force
           Don’t do ownership validation

       -t, --type=<type>
           Select the memory operation type: load or store (default:
           load,store)

       -v, --verbose
           Be more verbose (show counter open errors, etc)

       -p, --phys-data
           Record/Report sample physical addresses

       --data-page-size
           Record/Report sample data address page size

RECORD OPTIONS

       <command>...
           Any command you can specify in a shell.

       -e, --event <event>
           Event selector. Use perf mem record -e list to list available
           events.

       -K, --all-kernel
           Configure all used events to run in kernel space.

       -U, --all-user
           Configure all used events to run in user space.

       --ldlat <n>
           Specify the desired latency for the load event. Supported on
           Intel and Arm64 processors only; ignored on other
           architectures.
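
       As a rough sketch of how --ldlat combines with -t, the helper
       below builds a record command line for load sampling. The
       helper name, the value 30, and the workload (sleep 1) are
       assumptions made for illustration; the helper prints the
       command rather than running it.

```shell
# Hypothetical helper: print a "perf mem record" command line that samples
# loads with the given --ldlat value. Non-numeric values are rejected.
ldlat_record_cmd() {
    if [ $# -lt 2 ]; then
        echo "usage: ldlat_record_cmd <n> <command>..." >&2
        return 1
    fi
    ldlat=$1
    shift
    case $ldlat in
        ''|*[!0-9]*)
            echo "ldlat must be a non-negative integer" >&2
            return 1
            ;;
    esac
    echo "perf mem -t load record --ldlat $ldlat -- $*"
}

ldlat_record_cmd 30 sleep 1
# prints: perf mem -t load record --ldlat 30 -- sleep 1
```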

REPORT OPTIONS

       -i, --input=<file>
           Input file name.

       -C, --cpu=<cpu>
           Monitor only the CPUs in the provided list. Multiple CPUs can
           be given as a comma-separated list with no spaces, such as
           0,1. Ranges of CPUs are specified with -, such as 0-2. The
           default is to monitor all CPUs.

       -D, --dump-raw-samples
           Dump the raw decoded samples on the screen in a format that is
           easy to parse with one sample per line.

       -s, --sort=<key>
           Group results by the given key(s). Multiple keys can be
           specified in CSV format. The keys specific to memory samples
           are: symbol_daddr, symbol_iaddr, dso_daddr, locked, tlb, mem,
           snoop, dcacheline, phys_daddr, data_page_size, blocked.

           •   symbol_daddr: name of the data symbol being accessed at
               the time of the sample

           •   symbol_iaddr: name of the code symbol being executed at
               the time of the sample

           •   dso_daddr: name of the library or module containing the
               data being accessed at the time of the sample

           •   locked: whether the bus was locked at the time of the
               sample

           •   tlb: type of tlb access for the data at the time of the
               sample

           •   mem: type of memory access for the data at the time of the
               sample

           •   snoop: type of snoop (if any) for the data at the time of
               the sample

           •   dcacheline: the cacheline the data address is on at the
               time of the sample

           •   phys_daddr: physical address of the data being accessed
               at the time of the sample

           •   data_page_size: the page size of the data being accessed
               at the time of the sample

           •   blocked: reason for the blocked load access for the data
               at the time of the sample

           The default sort keys are changed to local_weight, mem, sym,
           dso, symbol_daddr, dso_daddr, snoop, tlb, locked, blocked,
           local_ins_lat.
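
       The CSV form taken by --sort can be sketched with a small
       helper that joins keys with commas. The helper name is
       hypothetical, and it checks only the memory-specific keys
       listed above; other perf report sort keys (such as sym or dso)
       are outside its scope and are rejected. It prints the command
       line rather than running perf.

```shell
# Memory-specific sort keys, copied from the list above.
MEM_SORT_KEYS="symbol_daddr symbol_iaddr dso_daddr locked tlb mem snoop dcacheline phys_daddr data_page_size blocked"

# Hypothetical helper: validate each key, then print them comma-separated.
mem_sort_arg() {
    out=
    for key in "$@"; do
        valid=0
        case $key in
            ''|*[!a-z_]*) ;;                      # empty or odd characters
            *) case " $MEM_SORT_KEYS " in
                   *" $key "*) valid=1 ;;
               esac ;;
        esac
        if [ "$valid" = 0 ]; then
            echo "unknown memory sort key: $key" >&2
            return 1
        fi
        out="${out:+$out,}$key"
    done
    echo "$out"
}

echo "perf mem report --sort=$(mem_sort_arg mem snoop dcacheline)"
# prints: perf mem report --sort=mem,snoop,dcacheline
```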

       -T, --type-profile
           Show data-type profile result instead of code symbols. This
           requires the debug information and it will change the default
           sort keys to: mem, snoop, tlb, type.

       -U, --hide-unresolved
           Only display entries resolved to a symbol.

       -x, --field-separator=<separator>
           Specify the field separator used when dumping raw samples (-D
           option). By default, the separator is the space character.
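
       A possible way to combine -D and -x is to dump comma-separated
       samples into a file for later processing. This sketch assumes
       a perf build whose option layout matches this page (both
       options given after report); the output file name samples.csv
       is a placeholder. The command is executed only if perf and a
       readable perf.data are present; otherwise it is just printed.

```shell
# One sample per line, fields separated by commas instead of spaces.
DUMP_CMD="perf mem report -D -x ,"

if command -v perf >/dev/null 2>&1 && [ -r perf.data ]; then
    # Unquoted on purpose so the string splits into command and arguments.
    $DUMP_CMD > samples.csv
else
    echo "would run: $DUMP_CMD > samples.csv"
fi
```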

       In addition, all perf report options are valid for report, and
       all perf record options are valid for record.

SEE ALSO

       perf-record(1), perf-report(1), perf-arm-spe(1)

COLOPHON

       This page is part of the perf (Performance analysis tools for
       Linux (in Linux source tree)) project.  Information about the
       project can be found at 
       ⟨https://perf.wiki.kernel.org/index.php/Main_Page⟩.  If you have a
       bug report for this manual page, send it to
       linux-kernel@vger.kernel.org.  This page was obtained from the
       project's upstream Git repository
       ⟨http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git⟩
       on 2025-02-02.  (At that time, the date of the most recent commit
       that was found in the repository was 2025-02-01.)  If you discover
       any rendering problems in this HTML version of the page, or you
       believe there is a better or more up-to-date source for the page,
       or you have corrections or improvements to the information in this
       COLOPHON (which is not part of the original manual page), send a
       mail to man-pages@man7.org

perf                            2024-08-05                    PERF-MEM(1)
