Provided by: linux-perf_6.14.0-24.24_amd64 bug

NAME

       perf-mem - Profile memory accesses

SYNOPSIS

       perf mem [<options>] (record [<command>] | report)

DESCRIPTION

       "perf mem record" runs a command and gathers memory operation data from it, into perf.data. Perf record
       options are accepted and are passed through.

       "perf mem report" displays the result. It invokes perf report with the right set of options to display a
       memory access profile. By default, loads and stores are sampled. Use the -t option to limit to loads or
       stores.

       Note that on Intel systems the memory latency reported is the use-latency, not the pure load (or store
       latency). Use latency includes any pipeline queuing delays in addition to the memory subsystem latency.

       On Arm64 this uses SPE to sample load and store operations, therefore hardware and kernel support is
       required. See perf-arm-spe(1) for a setup guide. Due to the statistical nature of SPE sampling, not every
       memory operation will be sampled.

COMMON OPTIONS

       -f, --force
           Don’t do ownership validation

       -t, --type=<type>
           Select the memory operation type: load or store (default: load,store)

       -v, --verbose
           Be more verbose (show counter open errors, etc)

       -p, --phys-data
           Record/Report sample physical addresses

       --data-page-size
           Record/Report sample data address page size

RECORD OPTIONS

       <command>...
           Any command you can specify in a shell.

       -e, --event <event>
           Event selector. Use perf mem record -e list to list available events.

       -K, --all-kernel
           Configure all used events to run in kernel space.

       -U, --all-user
           Configure all used events to run in user space.

       --ldlat <n>
           Specify desired latency for loads event. Supported on Intel and Arm64 processors only. Ignored on
           other archs.

REPORT OPTIONS

       -i, --input=<file>
           Input file name.

       -C, --cpu=<cpu>
           Monitor only on the list of CPUs provided. Multiple CPUs can be provided as a comma-separated list
           with no space: 0,1. Ranges of CPUs are specified with - like 0-2. Default is to monitor all CPUS.

       -D, --dump-raw-samples
           Dump the raw decoded samples on the screen in a format that is easy to parse with one sample per
           line.

       -s, --sort=<key>
           Group result by given key(s) - multiple keys can be specified in CSV format. The keys are specific to
           memory samples are: symbol_daddr, symbol_iaddr, dso_daddr, locked, tlb, mem, snoop, dcacheline,
           phys_daddr, data_page_size, blocked.

           •   symbol_daddr: name of data symbol being executed on at the time of sample

           •   symbol_iaddr: name of code symbol being executed on at the time of sample

           •   dso_daddr: name of library or module containing the data being executed on at the time of the
               sample

           •   locked: whether the bus was locked at the time of the sample

           •   tlb: type of tlb access for the data at the time of the sample

           •   mem: type of memory access for the data at the time of the sample

           •   snoop: type of snoop (if any) for the data at the time of the sample

           •   dcacheline: the cacheline the data address is on at the time of the sample

           •   phys_daddr: physical address of data being executed on at the time of sample

           •   data_page_size: the data page size of data being executed on at the time of sample

           •   blocked: reason of blocked load access for the data at the time of the sample

                   And the default sort keys are changed to local_weight, mem, sym, dso,
                   symbol_daddr, dso_daddr, snoop, tlb, locked, blocked, local_ins_lat.

       -T, --type-profile
           Show data-type profile result instead of code symbols. This requires the debug information and it
           will change the default sort keys to: mem, snoop, tlb, type.

       -U, --hide-unresolved
           Only display entries resolved to a symbol.

       -x, --field-separator=<separator>
           Specify the field separator used when dump raw samples (-D option). By default, The separator is the
           space character.

       In addition, for report all perf report options are valid, and for record all perf record options.

SEE ALSO

       perf-record(1), perf-report(1), perf-arm-spe(1)

perf                                               06/15/2025                                        PERF-MEM(1)