diff src/fftw-3.3.8/README-perfcnt.md @ 167:bd3cc4d1df30
Add FFTW 3.3.8 source, and a Linux build
author | Chris Cannam <cannam@all-day-breakfast.com> |
---|---|
date | Tue, 19 Nov 2019 14:52:55 +0000 |
parents | |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/src/fftw-3.3.8/README-perfcnt.md Tue Nov 19 14:52:55 2019 +0000
@@ -0,0 +1,93 @@
+Performance Counters
+====================
+
+FFTW measures execution time in the planning stage, optionally taking advantage
+of hardware performance counters. This document describes the supported
+counters and the additional steps needed to enable each on different architectures.
+
+See `./configure --help` for the flags that enable each supported counter.
+See [kernel/cycle.h](kernel/cycle.h) for the code that accesses the counters.
+
+ARMv7-A (armv7a)
+================
+
+`CNTVCT`: Virtual Count Register in VMSA
+--------------------------------------
+
+A 64-bit counter, part of the Virtual Memory System Architecture (VMSA).
+Section B4.1.34 in the ARM Architecture Reference Manual ARMv7-A/ARMv7-R.
+
+User-mode access requires `CNTKCTL.PL0VCTEN == 1`, which must
+be set in kernel mode on each CPU:
+
+    #define CNTKCTL_PL0VCTEN 0x2 /* B4.1.26 in ARM Architecture Reference Manual */
+    uint32_t r;
+    asm volatile("mrc p15, 0, %0, c14, c1, 0" : "=r"(r)); /* read */
+    r |= CNTKCTL_PL0VCTEN;
+    asm volatile("mcr p15, 0, %0, c14, c1, 0" :: "r"(r)); /* write */
+
+Kernel module source *which can be patched with the above code* is available at:
+https://github.com/thoughtpolice/enable_arm_pmu
+
+`PMCCNTR`: Performance Monitors Cycle Count Register in VMSA
+----------------------------------------------------------
+
+A 32-bit counter, part of the Virtual Memory System Architecture (VMSA).
+Section B4.1.113 in the ARM Architecture Reference Manual ARMv7-A/ARMv7-R.
+
+User-mode access requires user-mode access to the PMU to be enabled
+(`PMUSERENR.EN == 1`), which must be done from kernel mode on each CPU:
+
+    #define PERF_DEF_OPTS (1 | 16)
+    /* enable user-mode access to counters */
+    asm volatile("mcr p15, 0, %0, c9, c14, 0" :: "r"(1));
+    /* Program PMU and enable all counters */
+    asm volatile("mcr p15, 0, %0, c9, c12, 0" :: "r"(PERF_DEF_OPTS));
+    asm volatile("mcr p15, 0, %0, c9, c12, 1" :: "r"(0x8000000f));
+
+Kernel module source with the above code is available at:
+[GitHub thoughtpolice/enable\_arm\_pmu](https://github.com/thoughtpolice/enable_arm_pmu)
+
+More information:
+http://neocontra.blogspot.com/2013/05/user-mode-performance-counters-for.html
+
+ARMv8-A (aarch64)
+=================
+
+`CNTVCT_EL0`: Counter-timer Virtual Count Register
+------------------------------------------------
+
+A 64-bit counter, part of the Generic Timer registers.
+Section D8.5.17 in the ARM Architecture Reference Manual ARMv8-A.
+
+User-mode access requires `CNTKCTL_EL1.EL0VCTEN == 1`, which
+must be set from kernel mode on each CPU:
+
+    #define CNTKCTL_EL0VCTEN 0x2
+    uint32_t r;
+    asm volatile("mrs %0, CNTKCTL_EL1" : "=r"(r)); /* read */
+    r |= CNTKCTL_EL0VCTEN;
+    asm volatile("msr CNTKCTL_EL1, %0" :: "r"(r)); /* write */
+
+*WARNING*: The above code has not been tested.
+
+`PMCCNTR_EL0`: Performance Monitors Cycle Count Register
+------------------------------------------------------
+
+A 64-bit counter, part of the Performance Monitors.
+Section D8.4.2 in the ARM Architecture Reference Manual ARMv8-A.
+
+User-mode access requires user-mode access to the PMU to be enabled
+(`PMUSERENR_EL0.EN == 1`), which must be set from kernel mode on each CPU:
+
+    #define PERF_DEF_OPTS (1 | 16)
+    /* enable user-mode access to counters */
+    asm volatile("msr PMUSERENR_EL0, %0" :: "r"(1));
+    /* Program PMU and enable all counters */
+    asm volatile("msr PMCR_EL0, %0" :: "r"(PERF_DEF_OPTS));
+    asm volatile("msr PMCNTENSET_EL0, %0" :: "r"(0x8000000f));
+    asm volatile("msr PMCCFILTR_EL0, %0" :: "r"(0));
+
+Kernel module source with the above code is available at:
+[GitHub rdolbeau/enable\_arm\_pmu](https://github.com/rdolbeau/enable_arm_pmu)
+or in [Pull Request #2 at thoughtpolice/enable\_arm\_pmu](https://github.com/thoughtpolice/enable_arm_pmu/pull/2)
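
The PMU snippets in the README write bare literals (`(1 | 16)` and `0x8000000f`) into the control registers. As a reading aid, the sketch below spells those literals out as named bits. It is not part of FFTW or of the linked kernel modules: the macro and function names are hypothetical, and the bit meanings (PMCR bit 0 = enable, bit 4 = export; PMCNTENSET bit 31 = cycle counter; CNTKCTL bit 1 = virtual counter access) are taken from my reading of the ARM Architecture Reference Manual. The literal `0xf` is assumed to mean "event counters 0 to 3".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the bits behind the README's literals. */
#define PMCR_E        (1u << 0)   /* PMCR.E: enable all counters            */
#define PMCR_X        (1u << 4)   /* PMCR.X: export events                  */
#define PMCNTENSET_C  (1u << 31)  /* enable the cycle counter (PMCCNTR)     */
#define CNTKCTL_VCTEN (1u << 1)   /* PL0VCTEN / EL0VCTEN: user CNTVCT reads */

/* Mask selecting event counters 0..n-1; valid for 0 <= n <= 31. */
static inline uint32_t event_counter_mask(unsigned n)
{
    return n == 0 ? 0u : (UINT32_MAX >> (32u - n));
}

/* Value the README calls PERF_DEF_OPTS. */
static inline uint32_t perf_def_opts(void)
{
    return PMCR_E | PMCR_X;
}

/* Value written to PMCNTENSET: cycle counter plus n event counters. */
static inline uint32_t counter_enable_mask(unsigned n_event_counters)
{
    return PMCNTENSET_C | event_counter_mask(n_event_counters);
}

/* Self-check: the named bits reproduce the literals used in the README. */
static inline void check_readme_constants(void)
{
    assert(perf_def_opts() == (1u | 16u));
    assert(counter_enable_mask(4) == 0x8000000fu);
    assert(CNTKCTL_VCTEN == 0x2u);
}
```

Calling `check_readme_constants()` from any test program confirms that the named bits and the README's literals agree. Only bit 31 matters for FFTW's use of `PMCCNTR`; the low four bits enable event counters that the cycle-count path does not read.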