| Commit message | Author | Age | Files | Lines |
This picks up the accelerated string functions written by
strajabot@.
Event: Google Summer of Code 2024
MFC after: 1 month
MFC to: stable/15
See also: 79e01e7e643c9337d8d6046b6db7df674475a099
Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D53248
|
Scalar implementation of strchrnul() in RISC-V assembly and changes to the
corresponding manpage.
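For reference, strchrnul() differs from strchr() only in the not-found case: it returns a pointer to the terminating NUL instead of NULL. A minimal portable C sketch of that contract follows. ref_strchrnul is a hypothetical name used for illustration; this is not the assembly added by the commit.

```c
#include <stddef.h>

/*
 * Hypothetical reference for the strchrnul() contract (not the committed
 * RISC-V assembly): return a pointer to the first occurrence of c in s,
 * or to the terminating NUL if c does not occur.
 */
char *
ref_strchrnul(const char *s, int c)
{
	const char ch = (char)c;

	while (*s != '\0' && *s != ch)
		s++;
	return ((char *)s);
}
```

Because the result is never NULL, callers can compute a length as ref_strchrnul(s, c) - s without a separate not-found check.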
Performance was benchmarked on a HiFive Unmatched (SiFive HF105-001) board
using: https://github.com/clausecker/strperf
os: FreeBSD
arch: riscv
            │ strchrnul_baseline │          strchrnul_scalar           │
            │       sec/op       │    sec/op     vs base               │
Short               680.2µ ± 5%     435.3µ ± 0%  -36.01% (p=0.000 n=20)
Mid                 314.7µ ± 3%     221.4µ ± 0%  -29.63% (p=0.000 n=20)
Long                152.3µ ± 0%     138.5µ ± 0%   -9.08% (p=0.000 n=20)
geomean             319.5µ          237.2µ       -25.75%

            │ strchrnul_baseline │          strchrnul_scalar           │
            │        MiB/s       │     MiB/s     vs base               │
Short                183.8 ± 5%      287.2 ± 0%  +56.27% (p=0.000 n=20)
Mid                  397.3 ± 3%      564.6 ± 0%  +42.12% (p=0.000 n=20)
Long                 820.5 ± 0%      902.5 ± 0%   +9.99% (p=0.000 n=20)
geomean              391.3           527.0       +34.68%
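The sec/op and MiB/s tables are two views of the same runs: throughput is inversely proportional to time per operation, so a time change of -36.01% corresponds to a throughput change of 1/(1 - 0.3601) - 1, or about +56.27%. The small C helper below sketches that conversion; throughput_delta_pct is a hypothetical name and is not part of strperf or benchstat.

```c
/*
 * Convert a relative change in time per operation (in percent, for example
 * -36.01) into the matching relative change in throughput (in percent).
 * Throughput is proportional to 1/time, so the two ratios are reciprocal.
 */
double
throughput_delta_pct(double time_delta_pct)
{
	double time_ratio = 1.0 + time_delta_pct / 100.0;

	return ((1.0 / time_ratio - 1.0) * 100.0);
}
```

Feeding in the rounded sec/op deltas reproduces the MiB/s column to within a few hundredths of a percentage point; the remaining difference comes from the tables being computed on unrounded measurements.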
MFC after: 1 month
MFC to: stable/15
Approved by: markj (mentor)
Reviewed by: fuz
Sponsored by: Google LLC (GSoC 2024)
Differential Revision: https://reviews.freebsd.org/D46047
|
Optimized implementation of strnlen() in RISC-V assembly.
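As a reminder of what strnlen() has to compute, it can be expressed in terms of memchr(): search at most maxlen bytes for a NUL and report either its offset or maxlen. The portable C sketch below shows that equivalence; ref_strnlen is a hypothetical name, and nothing here is taken from the committed assembly.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical reference for the strnlen() contract: the length of s,
 * but never more than maxlen, and without reading past s[maxlen - 1].
 */
size_t
ref_strnlen(const char *s, size_t maxlen)
{
	const char *end = memchr(s, '\0', maxlen);

	return (end != NULL ? (size_t)(end - s) : maxlen);
}
```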
Performance was measured using strperf on a HiFive Unmatched (SiFive HF105-001) board.
os: FreeBSD
arch: riscv
            │  strnlen_baseline  │           strnlen_scalar            │
            │       sec/op       │    sec/op     vs base               │
Short               787.0µ ± 0%     430.9µ ± 1%  -45.24% (p=0.000 n=20)
Mid                 621.6µ ± 0%     195.1µ ± 1%  -68.61% (p=0.000 n=20)
Long                569.4µ ± 1%     100.6µ ± 0%  -82.34% (p=0.000 n=20)
geomean             653.1µ          203.7µ       -68.81%

            │  strnlen_baseline  │           strnlen_scalar            │
            │        MiB/s       │     MiB/s     vs base               │
Short                158.8 ± 0%      290.1 ± 1%  +82.62% (p=0.000 n=20)
Mid                  201.1 ± 0%      640.6 ± 1% +218.59% (p=0.000 n=20)
Long                 219.5 ± 1%     1242.9 ± 0% +466.19% (p=0.000 n=20)
geomean              191.4           613.5      +220.57%
MFC after: 1 month
MFC to: stable/15
Approved by: mhorne, markj (mentor)
Reviewed by: fuz, Jari Sihvola <jsihv@gmx.com>
Sponsored by: Google LLC (GSoC 2024)
Differential Revision: https://reviews.freebsd.org/D46230
|
Optimized assembly implementation of memcpy() for the RISC-V architecture.
The implementation has two paths:
- An aligned path, taken when (dst - src) % 8 == 0, which runs faster
- An unaligned path, taken when (dst - src) % 8 != 0, which runs slower
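The path selection above depends only on the relative alignment of the two pointers: when (dst - src) % 8 == 0, advancing dst to an 8-byte boundary aligns src as well, so whole 64-bit words can be moved. The C sketch below illustrates the idea under stated assumptions. sketch_memcpy and mutually_aligned are hypothetical names, the byte loop used on the unaligned path is a placeholder rather than what the assembly does, and the fixed-size memcpy() calls stand in for single 64-bit loads and stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: true when dst and src share the same offset mod 8. */
int
mutually_aligned(const void *dst, const void *src)
{
	return (((uintptr_t)dst - (uintptr_t)src) % 8 == 0);
}

/* Illustrative copy routine; not the committed RISC-V assembly. */
void *
sketch_memcpy(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if (mutually_aligned(d, s)) {
		/* Head: copy bytes until d (and therefore s) is 8-byte aligned. */
		while (len > 0 && (uintptr_t)d % 8 != 0) {
			*d++ = *s++;
			len--;
		}
		/* Body: move one 64-bit word per iteration. */
		while (len >= 8) {
			uint64_t w;

			memcpy(&w, s, sizeof(w));	/* stands in for a 64-bit load */
			memcpy(d, &w, sizeof(w));	/* stands in for a 64-bit store */
			d += 8;
			s += 8;
			len -= 8;
		}
	}
	/* Tail of the aligned path, or the whole copy on the unaligned path. */
	while (len > 0) {
		*d++ = *s++;
		len--;
	}
	return (dst);
}
```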
os: FreeBSD
arch: riscv
            │  memcpy_baseline   │            memcpy_scalar            │
            │       sec/op       │    sec/op     vs base               │
64Align8            851.6µ ± 1%     488.9µ ± 1%  -42.59% (p=0.000 n=12)
4kAlign8            681.5µ ± 1%     255.1µ ± 2%  -62.57% (p=0.000 n=12)
256kAlign8          273.0µ ± 2%     230.7µ ± 2%  -15.50% (p=0.000 n=12)
16mAlign8           98.07m ± 0%     95.29m ± 0%   -2.84% (p=0.000 n=12)
64UAlign            887.5µ ± 1%     531.6µ ± 1%  -40.10% (p=0.000 n=12)
4kUAlign            725.6µ ± 1%     262.2µ ± 1%  -63.87% (p=0.000 n=12)
256kUAlign          844.1µ ± 2%     322.8µ ± 0%  -61.76% (p=0.000 n=12)
16mUAlign           134.9m ± 0%     101.2m ± 0%  -24.97% (p=0.000 n=20)
geomean             2.410m          1.371m       -43.12%

            │  memcpy_baseline   │            memcpy_scalar            │
            │        MiB/s       │     MiB/s     vs base               │
64Align8             293.6 ± 1%      511.3 ± 1%  +74.18% (p=0.000 n=12)
4kAlign8             366.8 ± 1%      980.0 ± 2% +167.15% (p=0.000 n=12)
256kAlign8           915.8 ± 2%     1083.7 ± 2%  +18.34% (p=0.000 n=12)
16mAlign8            163.1 ± 0%      167.9 ± 0%   +2.92% (p=0.000 n=12)
64UAlign             281.7 ± 1%      470.3 ± 1%  +66.94% (p=0.000 n=12)
4kUAlign             344.5 ± 1%      953.6 ± 1% +176.77% (p=0.000 n=12)
256kUAlign           296.2 ± 2%      774.5 ± 0% +161.49% (p=0.000 n=12)
16mUAlign            118.6 ± 0%      158.1 ± 0%  +33.28% (p=0.000 n=20)
geomean              293.4           515.8       +75.81%
MFC after: 1 month
MFC to: stable/15
Approved by: mhorne, markj (mentor)
Reviewed by: fuz
Sponsored by: Google LLC (GSoC 2024)
Differential Revision: https://reviews.freebsd.org/D46139
|
Includes a scalar implementation of strlen() for the RISC-V
architecture and changes to the corresponding manpage.
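The commit text does not describe the algorithm. A common way to write a scalar (non-vector) strlen() is to test eight bytes at a time for a zero byte; the C sketch below shows that test on a 64-bit value, assuming little-endian byte order. Whether the committed assembly uses this exact test is an assumption, and zero_byte_mask and first_zero_byte are hypothetical names.

```c
#include <stdint.h>

#define	ONES	UINT64_C(0x0101010101010101)
#define	HIGHS	UINT64_C(0x8080808080808080)

/*
 * Nonzero if and only if some byte of w is 0x00. The lowest flagged byte
 * is always a real zero byte; higher flags can be borrow artifacts.
 */
uint64_t
zero_byte_mask(uint64_t w)
{
	return ((w - ONES) & ~w & HIGHS);
}

/* Index (0..7) of the first zero byte in little-endian order, or 8 if none. */
unsigned
first_zero_byte(uint64_t w)
{
	uint64_t m = zero_byte_mask(w);
	unsigned i = 0;

	if (m == 0)
		return (8);
	while ((m & 0x80) == 0) {
		m >>= 8;
		i++;
	}
	return (i);
}
```

A word-at-a-time strlen() would load aligned 64-bit words, skip those for which zero_byte_mask() is 0, and add first_zero_byte() of the final word to the running count.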
Performance was benchmarked before and after using:
https://github.com/clausecker/strperf
os: FreeBSD
arch: riscv
            │  strlen_baseline   │             strlen_scalar              │
            │       sec/op       │    sec/op     vs base                  │
Short               541.2µ ± 17%    401.6µ ± 0%  -25.78% (p=0.000 n=21+20)
Mid                 249.6µ ± 3%     191.9µ ± 0%  -23.13% (p=0.000 n=21+20)
Long                124.6µ ± 0%     110.7µ ± 0%  -11.13% (p=0.000 n=21+20)
geomean             256.3µ          204.3µ       -20.26%

            │  strlen_baseline   │             strlen_scalar              │
            │         B/s        │      B/s      vs base                  │
Short              220.3Mi ± 14%   296.8Mi ± 0%  +34.74% (p=0.000 n=21+20)
Mid                477.6Mi ± 3%    621.3Mi ± 0%  +30.09% (p=0.000 n=21+20)
Long               956.9Mi ± 0%   1076.7Mi ± 0%  +12.52% (p=0.000 n=21+20)
geomean            465.2Mi         583.4Mi       +25.40%
MFC after: 1 month
MFC to: stable/15
Approved by: mhorne, markj (mentor)
Reviewed by: fuz
Sponsored by: Google LLC (GSoC 2024)
Differential Revision: https://reviews.freebsd.org/D45693
|
Adds a scalar implementation of memset() for RISC-V
and updates the relevant manpage.
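The commit text does not say how the scalar memset works. One common approach is to replicate the fill byte into all eight lanes of a 64-bit word and then store whole words; the C sketch below assumes that approach purely for illustration. broadcast_byte and sketch_memset are hypothetical names, and the fixed-size memcpy() stands in for a single 64-bit store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate the low 8 bits of c into every byte of a 64-bit word. */
uint64_t
broadcast_byte(int c)
{
	return ((uint64_t)(unsigned char)c * UINT64_C(0x0101010101010101));
}

/* Illustrative fill routine; not the committed RISC-V assembly. */
void *
sketch_memset(void *dst, int c, size_t len)
{
	unsigned char *d = dst;
	const uint64_t word = broadcast_byte(c);

	/* Head: byte stores until d is 8-byte aligned. */
	while (len > 0 && (uintptr_t)d % 8 != 0) {
		*d++ = (unsigned char)c;
		len--;
	}
	/* Body: one 64-bit store per iteration. */
	for (; len >= 8; d += 8, len -= 8)
		memcpy(d, &word, sizeof(word));
	/* Tail: remaining 0..7 bytes. */
	while (len > 0) {
		*d++ = (unsigned char)c;
		len--;
	}
	return (dst);
}
```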
os: FreeBSD
arch: riscv
            │ ./results/memset/memset_baseline │    ./results/memset/memset_scalar    │
            │              sec/op              │      sec/op    vs base               │
40                         527.5µ ± 1%             479.4µ ± 1%   -9.12% (p=0.000 n=20)
168                        254.5µ ± 1%             216.7µ ± 1%  -14.86% (p=0.000 n=20)
2k                         169.5µ ± 1%             128.4µ ± 0%  -24.24% (p=0.000 n=20)
256k                       161.2µ ± 1%             118.6µ ± 1%  -26.42% (p=0.000 n=20)
16m                        56.58m ± 0%             53.91m ± 0%   -4.72% (p=0.000 n=20)
geomean                    730.2µ                  611.2µ       -16.29%

            │ ./results/memset/memset_baseline │    ./results/memset/memset_scalar    │
            │               B/s                │       B/s      vs base               │
40                        452.0Mi ± 1%            497.3Mi ± 1%  +10.04% (p=0.000 n=20)
168                       936.9Mi ± 1%           1100.4Mi ± 1%  +17.45% (p=0.000 n=20)
2k                        1.373Gi ± 1%            1.813Gi ± 0%  +32.00% (p=0.000 n=20)
256k                      1.444Gi ± 1%            1.962Gi ± 1%  +35.91% (p=0.000 n=20)
16m                       269.7Mi ± 0%            283.1Mi ± 0%   +4.96% (p=0.000 n=20)
geomean                   750.1Mi                 896.1Mi       +19.47%
MFC after: 1 month
MFC to: stable/15
Approved by: mhorne, markj (mentor)
Reviewed by: fuz
Sponsored by: Google LLC (GSoC 2024)
Differential Revision: https://reviews.freebsd.org/D45730
|
Added an optimized memchr() implementation in RISC-V assembly and updated
the relevant manpage.
            │  memchr_baseline   │            memchr_scalar            │
            │       sec/op       │    sec/op     vs base               │
Short               636.6µ ± 1%     495.9µ ± 1%  -22.10% (p=0.000 n=20)
Mid                 279.7µ ± 1%     224.1µ ± 1%  -19.87% (p=0.000 n=20)
Long                138.8µ ± 0%     124.9µ ± 0%  -10.00% (p=0.000 n=20)
geomean             291.3µ          240.3µ       -17.48%

            │  memchr_baseline   │            memchr_scalar            │
            │         B/s        │      B/s      vs base               │
Short              187.3Mi ± 1%    240.4Mi ± 1%  +28.37% (p=0.000 n=20)
Mid                426.2Mi ± 1%    531.9Mi ± 1%  +24.79% (p=0.000 n=20)
Long               859.0Mi ± 0%    954.4Mi ± 0%  +11.11% (p=0.000 n=20)
geomean            409.3Mi         496.0Mi       +21.19%
MFC after: 1 month
MFC to: stable/15
Approved by: mhorne, markj (mentor)
Reviewed by: fuz
Sponsored by: Google LLC (GSoC 2024)
Differential Revision: https://reviews.freebsd.org/D46023
|
MFC after: 1 month
MFC to: stable/15
Approved by: mhorne, markj (mentor)
Sponsored by: Google LLC (GSoC 2024)
Differential Revision: https://reviews.freebsd.org/D47275
|
Implements strrchr() in RISC-V assembly, leading to the following
improvements (performance measured on SiFive HF105-001):
os: FreeBSD
arch: riscv
            │  strrchr_baseline  │             strrchr_scalar             │
            │       sec/op       │    sec/op     vs base                  │
Short               837.2µ ± 1%     574.6µ ± 1%  -31.37% (p=0.000 n=20+21)
Mid                 639.7µ ± 0%     269.7µ ± 0%  -57.84% (p=0.000 n=20+21)
Long                589.1µ ± 0%     176.7µ ± 0%  -70.01% (p=0.000 n=20+21)
geomean             680.8µ          301.4µ       -55.73%

            │  strrchr_baseline  │             strrchr_scalar             │
            │        MiB/s       │     MiB/s     vs base                  │
Short                149.3 ± 1%      217.6 ± 1%  +45.71% (p=0.000 n=20+21)
Mid                  195.4 ± 0%      463.6 ± 0% +137.22% (p=0.000 n=20+21)
Long                 212.2 ± 0%      707.4 ± 0% +233.40% (p=0.000 n=20+21)
geomean              183.6           414.7      +125.88%
MFC after: 1 month
MFC to: stable/15
Approved by: mhorne, markj (mentor)
Sponsored by: Google LLC (GSoC 2024)
Differential Revision: https://reviews.freebsd.org/D47275
|