~stepbrobd/go

reflect: add more tests for Type.{CanSeq,CanSeq2}

For #71874.

Change-Id: I3850edfb3104305f3bf4847a73cdd826cc99837f
GitHub-Last-Rev: 574c1edb7a6152c71891fab011ac0aaeca955fc8
GitHub-Pull-Request: golang/go#71890
Reviewed-on: https://go-review.googlesource.com/c/go/+/651775
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
1dafccaa — Michael Anthony Knyszek 29 days ago
runtime: increase timeout in TestSpuriousWakeupsNeverHangSemasleep

This change tries increasing the timeout in
TestSpuriousWakeupsNeverHangSemasleep. I'm not entirely sure of the
mechanism, but GODEBUG=gcstoptheworld=2 and GODEBUG=gccheckmark=1 can
cause this test to fail at it's regular timeout. It does not seem to
indicate a deadlock, because bumping the timeout 10x make the problem
go away. I suspect the problem is due to the long STW times these two
modes can induce, plus the fact this test runs in parallel with others.

Let's just bump the timeout. The test is fundamentally sound, and it's
unclear to me how else to test for a deadlock here.

Fixes #71691.
Fixes #71548.

Change-Id: I649531eeec8a8408ba90823ce5223f3a17863124
Reviewed-on: https://go-review.googlesource.com/c/go/+/652756
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
6e8d7a11 — Filippo Valsorda 2 months ago
crypto/x509: avoid crypto/rand.Int to generate serial number

It's probabyl safe enough, but just reading bytes from rand and then
using SetBytes is simpler, and doesn't require allowing calls from
crypto into math/big's Lsh, Sub, and Cmp.

Change-Id: I6a6a4656761f7073f9e149f288c48e97048ab13c
Reviewed-on: https://go-review.googlesource.com/c/go/+/643278
Auto-Submit: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Daniel McCarney <daniel@binaryparadox.net>
Reviewed-by: Roland Shoemaker <roland@golang.org>
55597473 — KangJi 28 days ago
cmd/cgo: update generated headers for compatibility with latest MSVC C++ standards

Updates #71921

Change-Id: Idfbb72e259b169121c8ced6d89ee2f13d6254d0d
GitHub-Last-Rev: fcf12e5a221621f749841055df1d2c2ada3bf844
GitHub-Pull-Request: golang/go#72004
Reviewed-on: https://go-review.googlesource.com/c/go/+/653141
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
b3e36364 — Jes Cok 29 days ago
flag: replace interface{} -> any for textValue.Get method

Make it literally match the Getter interface.

Change-Id: I73f03780ba1d3fd2230e0e5e2343d40530d9e6d8
GitHub-Last-Rev: 398b90b2fb04fdd401a1d719bf3ce19152a4cf6a
GitHub-Pull-Request: golang/go#71975
Reviewed-on: https://go-review.googlesource.com/c/go/+/652795
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
d31c8055 — Damien Neil 29 days ago
net/http: reject newlines in chunk-size lines

Unlike request headers, where we are allowed to leniently accept
a bare LF in place of a CRLF, chunked bodies must always use CRLF
line terminators. We were already enforcing this for chunk-data lines;
do so for chunk-size lines as well. Also reject bare CRs anywhere
other than as part of the CRLF terminator.

Fixes CVE-2025-22871
Fixes #71988

Change-Id: Ib0e21af5a8ba28c2a1ca52b72af8e2265ec79e4a
Reviewed-on: https://go-review.googlesource.com/c/go/+/652998
Reviewed-by: Jonathan Amsterdam <jba@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
01351209 — Lin Lin 29 days ago
cmd/go: update c document

Fixes: #11875

Change-Id: I0ea2c3e94d7d1647c2aaa3d488ac3c1f5fb6cb18
GitHub-Last-Rev: 7512b33f055aa225d365d6c949a53778834e8dcd
GitHub-Pull-Request: golang/go#71966
Reviewed-on: https://go-review.googlesource.com/c/go/+/652675
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Matloob <matloob@golang.org>
2b737073 — Russ Cox a month ago
math/big: add tests for allocation during multiply

Test that big.Int.Mul reusing the same target is not allocating
temporary garbage during its computation. That code is going
to be modified in an upcoming CL.

Change-Id: I3ed55c06da030282233c29cd7af2a04f395dc7a2
Reviewed-on: https://go-review.googlesource.com/c/go/+/652056
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
0ab2253c — Russ Cox 2 months ago
math/big: move multiplication to natmul.go

No code changes.

This CL moves the multiplication (and squaring) code into natmul.go,
in preparation for cleaning up Karatsuba and then adding Toom-Cook
and FFT-based multiplication.

Change-Id: I7f84328284cc4e1ca4da0ebb9f666a5535e8d7f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/652055
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Alan Donovan <adonovan@google.com>
1f7e28ac — Russ Cox a month ago
math/big: optimize atoi of base 2, 4, 16

Avoid multiplies when converting base 2, 4, 16 inputs,
reducing conversion time from O(N²) to O(N).

The Base8 and Base10 code paths should be unmodified,
but the base-2,4,16 changes tickle the compiler to generate
better (amd64) or worse (arm64) when really it should not.
This is described in detail in #71868 and should be ignored
for the purposes of this CL.

goos: linux
goarch: amd64
pkg: math/big
cpu: Intel(R) Xeon(R) CPU @ 3.10GHz
                      │     old      │                 new                 │
                      │    sec/op    │   sec/op     vs base                │
Scan/10/Base2-16         324.4n ± 0%   258.7n ± 0%  -20.25% (p=0.000 n=15)
Scan/100/Base2-16        2.376µ ± 0%   1.968µ ± 0%  -17.17% (p=0.000 n=15)
Scan/1000/Base2-16       23.89µ ± 0%   19.16µ ± 0%  -19.80% (p=0.000 n=15)
Scan/10000/Base2-16      311.5µ ± 0%   190.4µ ± 0%  -38.86% (p=0.000 n=15)
Scan/100000/Base2-16    10.508m ± 0%   1.904m ± 0%  -81.88% (p=0.000 n=15)
Scan/10/Base8-16         138.3n ± 0%   127.9n ± 0%   -7.52% (p=0.000 n=15)
Scan/100/Base8-16        886.1n ± 0%   790.2n ± 0%  -10.82% (p=0.000 n=15)
Scan/1000/Base8-16       9.227µ ± 0%   8.234µ ± 0%  -10.76% (p=0.000 n=15)
Scan/10000/Base8-16      165.8µ ± 0%   155.6µ ± 0%   -6.19% (p=0.000 n=15)
Scan/100000/Base8-16     9.044m ± 0%   8.935m ± 0%   -1.20% (p=0.000 n=15)
Scan/10/Base10-16        129.9n ± 0%   120.0n ± 0%   -7.62% (p=0.000 n=15)
Scan/100/Base10-16       816.3n ± 0%   730.0n ± 0%  -10.57% (p=0.000 n=15)
Scan/1000/Base10-16      8.518µ ± 0%   7.628µ ± 0%  -10.45% (p=0.000 n=15)
Scan/10000/Base10-16     158.6µ ± 0%   149.4µ ± 0%   -5.80% (p=0.000 n=15)
Scan/100000/Base10-16    8.962m ± 0%   8.855m ± 0%   -1.20% (p=0.000 n=15)
Scan/10/Base16-16        114.5n ± 0%   108.6n ± 0%   -5.15% (p=0.000 n=15)
Scan/100/Base16-16       648.3n ± 0%   525.0n ± 0%  -19.02% (p=0.000 n=15)
Scan/1000/Base16-16      7.375µ ± 0%   5.636µ ± 0%  -23.58% (p=0.000 n=15)
Scan/10000/Base16-16    171.18µ ± 0%   66.99µ ± 0%  -60.87% (p=0.000 n=15)
Scan/100000/Base16-16   9490.9µ ± 0%   682.8µ ± 0%  -92.81% (p=0.000 n=15)
geomean                  20.11µ        13.69µ       -31.94%

goos: linux
goarch: amd64
pkg: math/big
cpu: Intel(R) Xeon(R) Platinum 8481C CPU @ 2.70GHz
                      │      old      │                 new                 │
                      │    sec/op     │   sec/op     vs base                │
Scan/10/Base2-88          275.4n ± 0%   215.0n ± 0%  -21.93% (p=0.000 n=15)
Scan/100/Base2-88         1.869µ ± 0%   1.629µ ± 0%  -12.84% (p=0.000 n=15)
Scan/1000/Base2-88        18.56µ ± 0%   15.81µ ± 0%  -14.82% (p=0.000 n=15)
Scan/10000/Base2-88       270.0µ ± 0%   157.2µ ± 0%  -41.77% (p=0.000 n=15)
Scan/100000/Base2-88     11.518m ± 0%   1.571m ± 0%  -86.36% (p=0.000 n=15)
Scan/10/Base8-88          108.9n ± 0%   106.0n ± 0%   -2.66% (p=0.000 n=15)
Scan/100/Base8-88         655.2n ± 0%   594.9n ± 0%   -9.20% (p=0.000 n=15)
Scan/1000/Base8-88        6.467µ ± 0%   5.966µ ± 0%   -7.75% (p=0.000 n=15)
Scan/10000/Base8-88       151.2µ ± 0%   147.4µ ± 0%   -2.53% (p=0.000 n=15)
Scan/100000/Base8-88      10.33m ± 0%   10.30m ± 0%   -0.25% (p=0.000 n=15)
Scan/10/Base10-88        100.20n ± 0%   98.53n ± 0%   -1.67% (p=0.000 n=15)
Scan/100/Base10-88        596.9n ± 0%   543.3n ± 0%   -8.98% (p=0.000 n=15)
Scan/1000/Base10-88       5.904µ ± 0%   5.485µ ± 0%   -7.10% (p=0.000 n=15)
Scan/10000/Base10-88      145.7µ ± 0%   142.0µ ± 0%   -2.55% (p=0.000 n=15)
Scan/100000/Base10-88     10.26m ± 0%   10.24m ± 0%   -0.18% (p=0.000 n=15)
Scan/10/Base16-88         90.33n ± 0%   87.60n ± 0%   -3.02% (p=0.000 n=15)
Scan/100/Base16-88        506.4n ± 0%   437.7n ± 0%  -13.57% (p=0.000 n=15)
Scan/1000/Base16-88       5.056µ ± 0%   4.007µ ± 0%  -20.75% (p=0.000 n=15)
Scan/10000/Base16-88     163.35µ ± 0%   65.37µ ± 0%  -59.98% (p=0.000 n=15)
Scan/100000/Base16-88   11027.2µ ± 0%   735.1µ ± 0%  -93.33% (p=0.000 n=15)
geomean                   17.13µ        11.74µ       -31.46%

goos: linux
goarch: arm64
pkg: math/big
                      │     old      │                 new                  │
                      │    sec/op    │    sec/op     vs base                │
Scan/10/Base2-16         324.7n ± 0%    348.4n ± 0%   +7.30% (p=0.000 n=15)
Scan/100/Base2-16        2.604µ ± 0%    3.031µ ± 0%  +16.40% (p=0.000 n=15)
Scan/1000/Base2-16       26.15µ ± 0%    29.94µ ± 0%  +14.52% (p=0.000 n=15)
Scan/10000/Base2-16      334.3µ ± 0%    298.8µ ± 0%  -10.64% (p=0.000 n=15)
Scan/100000/Base2-16    10.664m ± 0%    2.991m ± 0%  -71.95% (p=0.000 n=15)
Scan/10/Base8-16         144.4n ± 1%    162.2n ± 1%  +12.33% (p=0.000 n=15)
Scan/100/Base8-16        917.2n ± 0%   1084.0n ± 0%  +18.19% (p=0.000 n=15)
Scan/1000/Base8-16       9.367µ ± 0%   10.901µ ± 0%  +16.38% (p=0.000 n=15)
Scan/10000/Base8-16      164.2µ ± 0%    181.2µ ± 0%  +10.34% (p=0.000 n=15)
Scan/100000/Base8-16     8.871m ± 1%    9.140m ± 0%   +3.04% (p=0.000 n=15)
Scan/10/Base10-16        134.6n ± 1%    148.3n ± 1%  +10.18% (p=0.000 n=15)
Scan/100/Base10-16       837.1n ± 0%    986.6n ± 0%  +17.86% (p=0.000 n=15)
Scan/1000/Base10-16      8.563µ ± 0%    9.936µ ± 0%  +16.03% (p=0.000 n=15)
Scan/10000/Base10-16     156.5µ ± 1%    171.3µ ± 0%   +9.41% (p=0.000 n=15)
Scan/100000/Base10-16    8.863m ± 1%    9.011m ± 0%   +1.66% (p=0.000 n=15)
Scan/10/Base16-16        115.7n ± 2%    129.1n ± 1%  +11.58% (p=0.000 n=15)
Scan/100/Base16-16       708.6n ± 0%    796.8n ± 0%  +12.45% (p=0.000 n=15)
Scan/1000/Base16-16      7.314µ ± 0%    7.554µ ± 0%   +3.28% (p=0.000 n=15)
Scan/10000/Base16-16    149.05µ ± 0%    74.60µ ± 0%  -49.95% (p=0.000 n=15)
Scan/100000/Base16-16   9091.6µ ± 0%    741.5µ ± 0%  -91.84% (p=0.000 n=15)
geomean                  20.39µ         17.65µ       -13.44%

goos: darwin
goarch: arm64
pkg: math/big
cpu: Apple M3 Pro
                      │     old      │                 new                 │
                      │    sec/op    │   sec/op     vs base                │
Scan/10/Base2-12         193.8n ± 2%   157.3n ± 1%  -18.83% (p=0.000 n=15)
Scan/100/Base2-12        1.445µ ± 2%   1.362µ ± 1%   -5.74% (p=0.000 n=15)
Scan/1000/Base2-12       14.28µ ± 0%   13.51µ ± 0%   -5.42% (p=0.000 n=15)
Scan/10000/Base2-12      177.1µ ± 0%   134.6µ ± 0%  -24.04% (p=0.000 n=15)
Scan/100000/Base2-12     5.429m ± 1%   1.333m ± 0%  -75.45% (p=0.000 n=15)
Scan/10/Base8-12         75.52n ± 2%   76.09n ± 1%        ~ (p=0.010 n=15)
Scan/100/Base8-12        528.4n ± 1%   532.1n ± 1%        ~ (p=0.003 n=15)
Scan/1000/Base8-12       5.423µ ± 1%   5.427µ ± 0%        ~ (p=0.183 n=15)
Scan/10000/Base8-12      89.26µ ± 1%   89.37µ ± 0%        ~ (p=0.237 n=15)
Scan/100000/Base8-12     4.543m ± 2%   4.560m ± 1%        ~ (p=0.595 n=15)
Scan/10/Base10-12        69.87n ± 1%   70.51n ± 0%        ~ (p=0.002 n=15)
Scan/100/Base10-12       488.4n ± 1%   491.2n ± 0%        ~ (p=0.060 n=15)
Scan/1000/Base10-12      5.014µ ± 1%   5.008µ ± 0%        ~ (p=0.783 n=15)
Scan/10000/Base10-12     84.90µ ± 0%   85.10µ ± 0%        ~ (p=0.109 n=15)
Scan/100000/Base10-12    4.516m ± 1%   4.521m ± 1%        ~ (p=0.713 n=15)
Scan/10/Base16-12        59.21n ± 1%   57.70n ± 1%   -2.55% (p=0.000 n=15)
Scan/100/Base16-12       380.0n ± 1%   360.7n ± 1%   -5.08% (p=0.000 n=15)
Scan/1000/Base16-12      3.775µ ± 0%   3.421µ ± 0%   -9.38% (p=0.000 n=15)
Scan/10000/Base16-12     80.62µ ± 0%   34.44µ ± 1%  -57.28% (p=0.000 n=15)
Scan/100000/Base16-12   4826.4µ ± 2%   450.9µ ± 2%  -90.66% (p=0.000 n=15)
geomean                  11.05µ        8.448µ       -23.52%

Change-Id: Ifdb2049545f34072aa75cdbb72bed4cf465f0ad7
Reviewed-on: https://go-review.googlesource.com/c/go/+/650640
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Robert Griesemer <gri@google.com>
f8f08bfd — Russ Cox 2 months ago
math/big: improve scan test and benchmark

Add a few more test cases for scanning (integer conversion),
which were helpful in debugging some upcoming changes.

BenchmarkScan currently times converting the value 10**N
represented in base B back into []Word form.
When B = 10, the text is 1 followed by many zeros, which
could hit a "multiply by zero" special case when processing
many digit chunks, misrepresenting the actual time required
depending on whether that case is optimized.

Change the benchmark to use 9**N, which is about as big and
will not cause runs of zeros in any of the tested bases.

The benchmark comparison below is not showing faster code,
since of course the code is not changing at all here. Instead,
it is showing that the new benchmark work is roughly the same
size as the old benchmark work.

goos: darwin
goarch: arm64
pkg: math/big
cpu: Apple M3 Pro
                      │     old     │                new                 │
                      │   sec/op    │   sec/op     vs base               │
ScanPi-12               43.35µ ± 1%   43.59µ ± 1%       ~ (p=0.069 n=15)
Scan/10/Base2-12        202.3n ± 2%   193.7n ± 1%  -4.25% (p=0.000 n=15)
Scan/100/Base2-12       1.512µ ± 3%   1.447µ ± 1%  -4.30% (p=0.000 n=15)
Scan/1000/Base2-12      15.06µ ± 2%   14.33µ ± 0%  -4.83% (p=0.000 n=15)
Scan/10000/Base2-12     188.0µ ± 5%   177.3µ ± 1%  -5.65% (p=0.000 n=15)
Scan/100000/Base2-12    5.814m ± 3%   5.382m ± 1%  -7.43% (p=0.000 n=15)
Scan/10/Base8-12        78.57n ± 2%   75.02n ± 1%  -4.52% (p=0.000 n=15)
Scan/100/Base8-12       548.2n ± 2%   526.8n ± 1%  -3.90% (p=0.000 n=15)
Scan/1000/Base8-12      5.674µ ± 2%   5.421µ ± 0%  -4.46% (p=0.000 n=15)
Scan/10000/Base8-12     94.42µ ± 1%   88.61µ ± 1%  -6.15% (p=0.000 n=15)
Scan/100000/Base8-12    4.906m ± 2%   4.498m ± 3%  -8.31% (p=0.000 n=15)
Scan/10/Base10-12       73.42n ± 1%   69.56n ± 0%  -5.26% (p=0.000 n=15)
Scan/100/Base10-12      511.9n ± 1%   488.2n ± 0%  -4.63% (p=0.000 n=15)
Scan/1000/Base10-12     5.254µ ± 2%   5.009µ ± 0%  -4.66% (p=0.000 n=15)
Scan/10000/Base10-12    90.22µ ± 2%   84.52µ ± 0%  -6.32% (p=0.000 n=15)
Scan/100000/Base10-12   4.842m ± 3%   4.471m ± 3%  -7.65% (p=0.000 n=15)
Scan/10/Base16-12       62.28n ± 1%   58.70n ± 1%  -5.75% (p=0.000 n=15)
Scan/100/Base16-12      398.6n ± 0%   377.9n ± 1%  -5.19% (p=0.000 n=15)
Scan/1000/Base16-12     4.108µ ± 1%   3.782µ ± 0%  -7.94% (p=0.000 n=15)
Scan/10000/Base16-12    83.78µ ± 2%   80.51µ ± 1%  -3.90% (p=0.000 n=15)
Scan/100000/Base16-12   5.080m ± 3%   4.698m ± 3%  -7.53% (p=0.000 n=15)
geomean                 12.41µ        11.74µ       -5.36%

Change-Id: If3ce290ecc7f38672f11b42fd811afb53dee665d
Reviewed-on: https://go-review.googlesource.com/c/go/+/650639
Reviewed-by: Alan Donovan <adonovan@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
872496da — Russ Cox 2 months ago
math/big: replace nat pool with Word stack

In the early days of math/big, algorithms that needed more space
grew the result larger than it needed to be and then used the
high words as extra space. This made results their own temporary
space caches, at the cost that saving a result in a data structure
might hold significantly more memory than necessary.
Specifically, new(big.Int).Mul(x, y) returned a big.Int with a
backing slice 3X as big as it strictly needed to be.
If you are storing many multiplication results, or even a single
large result, the 3X overhead can add up.

This approach to storage for temporaries also requires being able
to analyze the algorithms to predict the exact amount they need,
which can be difficult.

For both these reasons, the implementation of recursive long division,
which came later, introduced a “nat pool” where temporaries could be
stored and reused, or reclaimed by the GC when no longer used.
This avoids the storage and bookkeeping overheads but introduces a
per-temporary sync.Pool overhead. divRecursiveStep takes an array
of cached temporaries to remove some of that overhead.
The nat pool was better but is still not quite right.

This CL introduces something even better than the nat pool
(still probably not quite right, but the best I can see for now):
a sync.Pool holding stacks for allocating temporaries.
Now an operation can get one stack out of the pool and then
allocate as many temporaries as it needs during the operation,
eventually returning the stack back to the pool. The sync.Pool
operations are now per-exported-operation (like big.Int.Mul),
not per-temporary.

This CL converts both the pre-allocation in nat.mul and the
uses of the nat pool to use stack pools instead. This simplifies
some code and sets us up better for more complex algorithms
(such as Toom-Cook or FFT-based multiplication) that need
more temporaries. It is also a little bit faster.

goos: linux
goarch: amd64
pkg: math/big
cpu: Intel(R) Xeon(R) CPU @ 3.10GHz
                         │     old     │                 new                 │
                         │   sec/op    │   sec/op     vs base                │
Div/20/10-16               23.68n ± 0%   22.21n ± 0%   -6.21% (p=0.000 n=15)
Div/40/20-16               23.68n ± 0%   22.21n ± 0%   -6.21% (p=0.000 n=15)
Div/100/50-16              56.65n ± 0%   55.53n ± 0%   -1.98% (p=0.000 n=15)
Div/200/100-16             194.6n ± 1%   172.8n ± 0%  -11.20% (p=0.000 n=15)
Div/400/200-16             232.1n ± 0%   206.7n ± 0%  -10.94% (p=0.000 n=15)
Div/1000/500-16            405.3n ± 1%   383.8n ± 0%   -5.30% (p=0.000 n=15)
Div/2000/1000-16           810.4n ± 1%   795.2n ± 0%   -1.88% (p=0.000 n=15)
Div/20000/10000-16         25.88µ ± 0%   25.39µ ± 0%   -1.89% (p=0.000 n=15)
Div/200000/100000-16       931.5µ ± 0%   924.3µ ± 0%   -0.77% (p=0.000 n=15)
Div/2000000/1000000-16     37.77m ± 0%   37.75m ± 0%        ~ (p=0.098 n=15)
Div/20000000/10000000-16    1.367 ± 0%    1.377 ± 0%   +0.72% (p=0.003 n=15)
NatMul/10-16               168.5n ± 3%   164.0n ± 4%        ~ (p=0.751 n=15)
NatMul/100-16              6.086µ ± 3%   5.380µ ± 3%  -11.60% (p=0.000 n=15)
NatMul/1000-16             238.1µ ± 3%   228.3µ ± 1%   -4.12% (p=0.000 n=15)
NatMul/10000-16            8.721m ± 2%   8.518m ± 1%   -2.33% (p=0.000 n=15)
NatMul/100000-16           369.6m ± 0%   371.1m ± 0%   +0.42% (p=0.000 n=15)
geomean                    19.57µ        18.74µ        -4.21%

                 │     old      │                  new                   │
                 │     B/op     │     B/op      vs base                  │
NatMul/10-16         192.0 ± 0%     192.0 ± 0%        ~ (p=1.000 n=15) ¹
NatMul/100-16      4.750Ki ± 0%   1.751Ki ± 0%  -63.14% (p=0.000 n=15)
NatMul/1000-16     48.16Ki ± 0%   16.02Ki ± 0%  -66.73% (p=0.000 n=15)
NatMul/10000-16    482.9Ki ± 1%   165.4Ki ± 3%  -65.75% (p=0.000 n=15)
NatMul/100000-16   5.747Mi ± 7%   4.197Mi ± 0%  -26.97% (p=0.000 n=15)
geomean            41.42Ki        20.63Ki       -50.18%
¹ all samples are equal

                 │     old     │                 new                  │
                 │  allocs/op  │  allocs/op   vs base                 │
NatMul/10-16       1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/100-16      1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/1000-16     1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/10000-16    1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/100000-16   7.000 ± 14%   7.000 ± 14%       ~ (p=0.668 n=15)
geomean            1.476         1.476        +0.00%
¹ all samples are equal

goos: linux
goarch: amd64
pkg: math/big
cpu: Intel(R) Xeon(R) Platinum 8481C CPU @ 2.70GHz
                         │     old     │                 new                 │
                         │   sec/op    │   sec/op     vs base                │
Div/20/10-88               15.84n ± 1%   13.12n ± 0%  -17.17% (p=0.000 n=15)
Div/40/20-88               15.88n ± 1%   13.12n ± 0%  -17.38% (p=0.000 n=15)
Div/100/50-88              26.42n ± 0%   25.47n ± 0%   -3.60% (p=0.000 n=15)
Div/200/100-88             132.4n ± 0%   114.9n ± 0%  -13.22% (p=0.000 n=15)
Div/400/200-88             150.1n ± 0%   135.6n ± 0%   -9.66% (p=0.000 n=15)
Div/1000/500-88            275.5n ± 0%   264.1n ± 0%   -4.14% (p=0.000 n=15)
Div/2000/1000-88           586.5n ± 0%   581.1n ± 0%   -0.92% (p=0.000 n=15)
Div/20000/10000-88         25.87µ ± 0%   25.72µ ± 0%   -0.59% (p=0.000 n=15)
Div/200000/100000-88       772.2µ ± 0%   779.0µ ± 0%   +0.88% (p=0.000 n=15)
Div/2000000/1000000-88     33.36m ± 0%   33.63m ± 0%   +0.80% (p=0.000 n=15)
Div/20000000/10000000-88    1.307 ± 0%    1.320 ± 0%   +1.03% (p=0.000 n=15)
NatMul/10-88               140.4n ± 0%   148.8n ± 4%   +5.98% (p=0.000 n=15)
NatMul/100-88              4.663µ ± 1%   4.388µ ± 1%   -5.90% (p=0.000 n=15)
NatMul/1000-88             207.7µ ± 0%   205.8µ ± 0%   -0.89% (p=0.000 n=15)
NatMul/10000-88            8.456m ± 0%   8.468m ± 0%   +0.14% (p=0.021 n=15)
NatMul/100000-88           295.1m ± 0%   297.9m ± 0%   +0.94% (p=0.000 n=15)
geomean                    14.96µ        14.33µ        -4.23%

                 │     old      │                   new                   │
                 │     B/op     │     B/op       vs base                  │
NatMul/10-88         192.0 ± 0%     192.0 ±  0%        ~ (p=1.000 n=15) ¹
NatMul/100-88      4.750Ki ± 0%   1.758Ki ±  0%  -62.99% (p=0.000 n=15)
NatMul/1000-88     48.44Ki ± 0%   16.08Ki ±  0%  -66.80% (p=0.000 n=15)
NatMul/10000-88    489.7Ki ± 1%   166.1Ki ±  3%  -66.08% (p=0.000 n=15)
NatMul/100000-88   5.546Mi ± 0%   3.819Mi ± 60%  -31.15% (p=0.000 n=15)
geomean            41.29Ki        20.30Ki        -50.85%
¹ all samples are equal

                 │     old     │                 new                  │
                 │  allocs/op  │  allocs/op   vs base                 │
NatMul/10-88       1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/100-88      1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/1000-88     1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/10000-88    1.000 ±  0%   1.000 ±  0%       ~ (p=1.000 n=15) ¹
NatMul/100000-88   5.000 ± 20%   6.000 ± 67%       ~ (p=0.672 n=15)
geomean            1.380         1.431        +3.71%
¹ all samples are equal

goos: linux
goarch: arm64
pkg: math/big
                         │     old     │                 new                 │
                         │   sec/op    │   sec/op     vs base                │
Div/20/10-16               15.85n ± 0%   15.23n ± 0%   -3.91% (p=0.000 n=15)
Div/40/20-16               15.88n ± 0%   15.22n ± 0%   -4.16% (p=0.000 n=15)
Div/100/50-16              29.69n ± 0%   26.39n ± 0%  -11.11% (p=0.000 n=15)
Div/200/100-16             149.2n ± 0%   123.3n ± 0%  -17.36% (p=0.000 n=15)
Div/400/200-16             160.3n ± 0%   139.2n ± 0%  -13.16% (p=0.000 n=15)
Div/1000/500-16            271.0n ± 0%   256.1n ± 0%   -5.50% (p=0.000 n=15)
Div/2000/1000-16           545.3n ± 0%   527.0n ± 0%   -3.36% (p=0.000 n=15)
Div/20000/10000-16         22.60µ ± 0%   22.20µ ± 0%   -1.77% (p=0.000 n=15)
Div/200000/100000-16       889.0µ ± 0%   892.2µ ± 0%   +0.35% (p=0.000 n=15)
Div/2000000/1000000-16     38.01m ± 0%   38.12m ± 0%   +0.30% (p=0.000 n=15)
Div/20000000/10000000-16    1.437 ± 0%    1.444 ± 0%   +0.50% (p=0.000 n=15)
NatMul/10-16               166.4n ± 2%   169.5n ± 1%   +1.86% (p=0.000 n=15)
NatMul/100-16              5.733µ ± 1%   5.570µ ± 1%   -2.84% (p=0.000 n=15)
NatMul/1000-16             232.6µ ± 1%   229.8µ ± 0%   -1.22% (p=0.000 n=15)
NatMul/10000-16            9.039m ± 1%   8.969m ± 0%   -0.77% (p=0.000 n=15)
NatMul/100000-16           367.0m ± 0%   368.8m ± 0%   +0.48% (p=0.000 n=15)
geomean                    16.15µ        15.50µ        -4.01%

                 │     old      │                  new                   │
                 │     B/op     │     B/op      vs base                  │
NatMul/10-16         192.0 ± 0%     192.0 ± 0%        ~ (p=1.000 n=15) ¹
NatMul/100-16      4.750Ki ± 0%   1.751Ki ± 0%  -63.14% (p=0.000 n=15)
NatMul/1000-16     48.33Ki ± 0%   16.02Ki ± 0%  -66.85% (p=0.000 n=15)
NatMul/10000-16    536.5Ki ± 1%   165.7Ki ± 3%  -69.12% (p=0.000 n=15)
NatMul/100000-16   6.078Mi ± 6%   4.197Mi ± 0%  -30.94% (p=0.000 n=15)
geomean            42.81Ki        20.64Ki       -51.78%
¹ all samples are equal

                 │     old     │                  new                  │
                 │  allocs/op  │  allocs/op   vs base                  │
NatMul/10-16       1.000 ±  0%   1.000 ±  0%        ~ (p=1.000 n=15) ¹
NatMul/100-16      1.000 ±  0%   1.000 ±  0%        ~ (p=1.000 n=15) ¹
NatMul/1000-16     1.000 ±  0%   1.000 ±  0%        ~ (p=1.000 n=15) ¹
NatMul/10000-16    2.000 ± 50%   1.000 ±  0%  -50.00% (p=0.001 n=15)
NatMul/100000-16   9.000 ± 11%   8.000 ± 12%  -11.11% (p=0.001 n=15)
geomean            1.783         1.516        -14.97%
¹ all samples are equal

goos: darwin
goarch: arm64
pkg: math/big
cpu: Apple M3 Pro
                         │     old      │                new                 │
                         │    sec/op    │   sec/op     vs base               │
Div/20/10-12                9.850n ± 1%   9.405n ± 1%  -4.52% (p=0.000 n=15)
Div/40/20-12                9.858n ± 0%   9.403n ± 1%  -4.62% (p=0.000 n=15)
Div/100/50-12               16.40n ± 1%   14.81n ± 0%  -9.70% (p=0.000 n=15)
Div/200/100-12              88.48n ± 2%   80.88n ± 0%  -8.59% (p=0.000 n=15)
Div/400/200-12             107.90n ± 1%   99.28n ± 1%  -7.99% (p=0.000 n=15)
Div/1000/500-12             188.8n ± 1%   178.6n ± 1%  -5.40% (p=0.000 n=15)
Div/2000/1000-12            399.9n ± 0%   389.1n ± 0%  -2.70% (p=0.000 n=15)
Div/20000/10000-12          13.94µ ± 2%   13.81µ ± 1%       ~ (p=0.574 n=15)
Div/200000/100000-12        523.8µ ± 0%   521.7µ ± 0%  -0.40% (p=0.000 n=15)
Div/2000000/1000000-12      21.46m ± 0%   21.48m ± 0%       ~ (p=0.067 n=15)
Div/20000000/10000000-12    812.5m ± 0%   812.9m ± 0%       ~ (p=0.061 n=15)
NatMul/10-12                77.14n ± 0%   78.35n ± 1%  +1.57% (p=0.000 n=15)
NatMul/100-12               2.999µ ± 0%   2.871µ ± 1%  -4.27% (p=0.000 n=15)
NatMul/1000-12              126.2µ ± 0%   126.8µ ± 0%  +0.51% (p=0.011 n=15)
NatMul/10000-12             5.099m ± 0%   5.125m ± 0%  +0.51% (p=0.000 n=15)
NatMul/100000-12            206.7m ± 0%   208.4m ± 0%  +0.80% (p=0.000 n=15)
geomean                     9.512µ        9.236µ       -2.91%

                 │     old      │                   new                    │
                 │     B/op     │      B/op       vs base                  │
NatMul/10-12         192.0 ± 0%     192.0 ±   0%        ~ (p=1.000 n=15) ¹
NatMul/100-12      4.750Ki ± 0%   1.750Ki ±   0%  -63.16% (p=0.000 n=15)
NatMul/1000-12     48.13Ki ± 0%   16.01Ki ±   0%  -66.73% (p=0.000 n=15)
NatMul/10000-12    483.5Ki ± 1%   163.2Ki ±   2%  -66.24% (p=0.000 n=15)
NatMul/100000-12   5.480Mi ± 4%   1.532Mi ± 104%  -72.05% (p=0.000 n=15)
geomean            41.03Ki        16.82Ki         -59.01%
¹ all samples are equal

                 │    old     │                  new                   │
                 │ allocs/op  │  allocs/op    vs base                  │
NatMul/10-12       1.000 ± 0%   1.000 ±   0%        ~ (p=1.000 n=15) ¹
NatMul/100-12      1.000 ± 0%   1.000 ±   0%        ~ (p=1.000 n=15) ¹
NatMul/1000-12     1.000 ± 0%   1.000 ±   0%        ~ (p=1.000 n=15) ¹
NatMul/10000-12    1.000 ± 0%   1.000 ±   0%        ~ (p=1.000 n=15) ¹
NatMul/100000-12   5.000 ± 0%   1.000 ± 400%  -80.00% (p=0.007 n=15)
geomean            1.380        1.000         -27.52%
¹ all samples are equal

Change-Id: I7efa6fe37971ed26ae120a32250fcb47ece0a011
Reviewed-on: https://go-review.googlesource.com/c/go/+/650638
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
763766e9 — Russ Cox a month ago
math/big: report allocs in BenchmarkNatMul, BenchmarkNatSqr

Change-Id: I112f55c0e3ee3b75e615a06b27552de164565c04
Reviewed-on: https://go-review.googlesource.com/c/go/+/650637
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
b66d83cc — Russ Cox 2 months ago
math/big: clean up GCD a little

The GCD code was setting one *Int to the value of another
by smashing one struct on top of the other, instead of using Set.
That was safe in this one case, but it's not idiomatic in math/big
nor safe in general, so rewrite the code not to do that.
(In one case, by swapping variables around; in another, by calling Set.)

The added Set call does slow down GCDs by a small amount,
since the answer has to be copied out. To compensate for that,
optimize a bit: remove the s, t temporaries entirely and handle
vector x word multiplication directly. The net result is that almost
all GCDs are faster, except for small ones, which are a few
nanoseconds slower.

goos: darwin
goarch: arm64
pkg: math/big
cpu: Apple M3 Pro
                              │ bench.before │             bench.after             │
                              │    sec/op    │   sec/op     vs base                │
GCD10x10/WithoutXY-12            23.80n ± 1%   31.71n ± 1%  +33.24% (p=0.000 n=10)
GCD10x10/WithXY-12              100.40n ± 0%   92.14n ± 1%   -8.22% (p=0.000 n=10)
GCD10x100/WithoutXY-12           63.70n ± 0%   70.73n ± 0%  +11.05% (p=0.000 n=10)
GCD10x100/WithXY-12              278.6n ± 0%   233.1n ± 1%  -16.35% (p=0.000 n=10)
GCD10x1000/WithoutXY-12          153.4n ± 0%   162.2n ± 1%   +5.74% (p=0.000 n=10)
GCD10x1000/WithXY-12             456.0n ± 0%   411.8n ± 1%   -9.69% (p=0.000 n=10)
GCD10x10000/WithoutXY-12         1.002µ ± 1%   1.036µ ± 0%   +3.39% (p=0.000 n=10)
GCD10x10000/WithXY-12            2.330µ ± 1%   2.210µ ± 0%   -5.13% (p=0.000 n=10)
GCD10x100000/WithoutXY-12        8.894µ ± 0%   8.889µ ± 1%        ~ (p=0.754 n=10)
GCD10x100000/WithXY-12           20.84µ ± 0%   20.24µ ± 0%   -2.84% (p=0.000 n=10)
GCD100x100/WithoutXY-12          373.3n ± 3%   314.4n ± 0%  -15.76% (p=0.000 n=10)
GCD100x100/WithXY-12             662.5n ± 0%   572.4n ± 1%  -13.59% (p=0.000 n=10)
GCD100x1000/WithoutXY-12         641.8n ± 0%   598.1n ± 1%   -6.81% (p=0.000 n=10)
GCD100x1000/WithXY-12            1.123µ ± 0%   1.019µ ± 1%   -9.26% (p=0.000 n=10)
GCD100x10000/WithoutXY-12        2.870µ ± 0%   2.831µ ± 0%   -1.38% (p=0.000 n=10)
GCD100x10000/WithXY-12           4.930µ ± 1%   4.675µ ± 0%   -5.16% (p=0.000 n=10)
GCD100x100000/WithoutXY-12       24.08µ ± 0%   23.97µ ± 0%   -0.48% (p=0.007 n=10)
GCD100x100000/WithXY-12          43.66µ ± 0%   42.52µ ± 0%   -2.61% (p=0.001 n=10)
GCD1000x1000/WithoutXY-12        3.999µ ± 0%   3.569µ ± 1%  -10.75% (p=0.000 n=10)
GCD1000x1000/WithXY-12           6.397µ ± 0%   5.534µ ± 0%  -13.49% (p=0.000 n=10)
GCD1000x10000/WithoutXY-12       6.875µ ± 0%   6.450µ ± 0%   -6.18% (p=0.000 n=10)
GCD1000x10000/WithXY-12          20.75µ ± 1%   19.17µ ± 1%   -7.64% (p=0.000 n=10)
GCD1000x100000/WithoutXY-12      36.38µ ± 0%   35.60µ ± 1%   -2.13% (p=0.000 n=10)
GCD1000x100000/WithXY-12         172.1µ ± 0%   174.4µ ± 3%        ~ (p=0.052 n=10)
GCD10000x10000/WithoutXY-12      79.89µ ± 1%   75.16µ ± 2%   -5.92% (p=0.000 n=10)
GCD10000x10000/WithXY-12         160.1µ ± 0%   150.0µ ± 0%   -6.33% (p=0.000 n=10)
GCD10000x100000/WithoutXY-12     213.2µ ± 1%   209.0µ ± 1%   -1.98% (p=0.000 n=10)
GCD10000x100000/WithXY-12        1.399m ± 0%   1.342m ± 3%   -4.08% (p=0.002 n=10)
GCD100000x100000/WithoutXY-12    5.463m ± 1%   5.504m ± 2%        ~ (p=0.190 n=10)
GCD100000x100000/WithXY-12       11.36m ± 0%   11.46m ± 1%   +0.86% (p=0.000 n=10)
geomean                          6.953µ        6.695µ        -3.71%

goos: linux
goarch: amd64
pkg: math/big
cpu: AMD Ryzen 9 7950X 16-Core Processor
                              │ bench.before │             bench.after             │
                              │    sec/op    │   sec/op     vs base                │
GCD10x10/WithoutXY-32           39.66n ±  4%   44.34n ± 4%  +11.77% (p=0.000 n=10)
GCD10x10/WithXY-32              156.7n ± 12%   130.8n ± 2%  -16.53% (p=0.000 n=10)
GCD10x100/WithoutXY-32          115.8n ±  5%   120.2n ± 2%   +3.89% (p=0.000 n=10)
GCD10x100/WithXY-32             465.3n ±  3%   368.1n ± 2%  -20.91% (p=0.000 n=10)
GCD10x1000/WithoutXY-32         201.1n ±  1%   210.8n ± 2%   +4.82% (p=0.000 n=10)
GCD10x1000/WithXY-32            652.9n ±  4%   605.0n ± 1%   -7.32% (p=0.002 n=10)
GCD10x10000/WithoutXY-32        1.046µ ±  2%   1.143µ ± 1%   +9.33% (p=0.000 n=10)
GCD10x10000/WithXY-32           3.360µ ±  1%   3.258µ ± 1%   -3.04% (p=0.000 n=10)
GCD10x100000/WithoutXY-32       9.391µ ±  3%   9.997µ ± 1%   +6.46% (p=0.000 n=10)
GCD10x100000/WithXY-32          27.92µ ±  1%   28.21µ ± 0%   +1.04% (p=0.043 n=10)
GCD100x100/WithoutXY-32         443.7n ±  5%   320.0n ± 2%  -27.88% (p=0.000 n=10)
GCD100x100/WithXY-32            789.9n ±  2%   690.4n ± 1%  -12.60% (p=0.000 n=10)
GCD100x1000/WithoutXY-32        718.4n ±  3%   600.0n ± 1%  -16.48% (p=0.000 n=10)
GCD100x1000/WithXY-32           1.388µ ±  4%   1.175µ ± 1%  -15.28% (p=0.000 n=10)
GCD100x10000/WithoutXY-32       2.750µ ±  1%   2.668µ ± 1%   -2.96% (p=0.000 n=10)
GCD100x10000/WithXY-32          6.016µ ±  1%   5.590µ ± 1%   -7.09% (p=0.000 n=10)
GCD100x100000/WithoutXY-32      21.40µ ±  1%   22.30µ ± 1%   +4.21% (p=0.000 n=10)
GCD100x100000/WithXY-32         47.02µ ±  4%   48.80µ ± 0%   +3.78% (p=0.015 n=10)
GCD1000x1000/WithoutXY-32       3.417µ ±  4%   3.020µ ± 1%  -11.65% (p=0.000 n=10)
GCD1000x1000/WithXY-32          5.752µ ±  0%   5.418µ ± 2%   -5.81% (p=0.000 n=10)
GCD1000x10000/WithoutXY-32      6.150µ ±  0%   6.246µ ± 1%   +1.55% (p=0.000 n=10)
GCD1000x10000/WithXY-32         24.68µ ±  3%   25.07µ ± 1%        ~ (p=0.051 n=10)
GCD1000x100000/WithoutXY-32     34.60µ ±  2%   36.85µ ± 1%   +6.51% (p=0.000 n=10)
GCD1000x100000/WithXY-32        209.5µ ±  4%   227.4µ ± 0%   +8.56% (p=0.000 n=10)
GCD10000x10000/WithoutXY-32     90.69µ ±  0%   88.48µ ± 0%   -2.44% (p=0.000 n=10)
GCD10000x10000/WithXY-32        197.1µ ±  0%   200.5µ ± 0%   +1.73% (p=0.000 n=10)
GCD10000x100000/WithoutXY-32    239.1µ ±  0%   242.5µ ± 0%   +1.42% (p=0.000 n=10)
GCD10000x100000/WithXY-32       1.963m ±  3%   2.028m ± 0%   +3.28% (p=0.000 n=10)
GCD100000x100000/WithoutXY-32   7.466m ±  0%   7.412m ± 0%   -0.71% (p=0.000 n=10)
GCD100000x100000/WithXY-32      16.10m ±  2%   16.47m ± 0%   +2.25% (p=0.000 n=10)
geomean                         8.388µ         8.127µ        -3.12%

Change-Id: I161dc409bad11bcc553bc8116449905ae5b06742
Reviewed-on: https://go-review.googlesource.com/c/go/+/650636
Reviewed-by: Robert Griesemer <gri@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Alan Donovan <adonovan@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
37e9c5ea — Joel Sing 9 months ago
cmd/internal/obj/riscv: implement vector load/store instructions

Implement vector unit stride, vector strided, vector indexed and
vector whole register load and store instructions.

The vector unit stride instructions take an optional vector mask
register, which if specified must be register V0. If only two
operands are given, the instruction is encoded as unmasked.

The vector strided and vector indexed instructions also take an
optional vector mask register, which if specified must be register
V0. If only three operands are given, the instruction is encoded as
unmasked.

Cq-Include-Trybots: luci.golang.try:gotip-linux-riscv64
Change-Id: I35e43bb8f1cf6ae8826fbeec384b95ac945da50f
Reviewed-on: https://go-review.googlesource.com/c/go/+/631937
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mark Ryan <markdryan@rivosinc.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Meng Zhuo <mengzhuo1203@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Pengcheng Wang <wangpengcheng.pp@bytedance.com>
927fdb78 — Joel Sing a month ago
cmd/compile: simplify intrinsification of TrailingZeros16 and TrailingZeros8

Decompose Ctz16 and Ctz8 within the SSA rules for LOONG64, MIPS, PPC64
and S390X, rather than having a custom intrinsic. Note that for PPC64 this
actually allows the existing Ctz16 and Ctz8 rules to be used.

Change-Id: I27a5e978f852b9d75396d2a80f5d7dfcb5ef7dd4
Reviewed-on: https://go-review.googlesource.com/c/go/+/651816
Reviewed-by: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
01ba8bfe — Guoqi Chen a month ago
runtime/cgo: use standard ABI call setg_gcc in crosscall1 on loong64

Change-Id: Ie38583d667d579751d643b2da2aa56390b69904c
Reviewed-on: https://go-review.googlesource.com/c/go/+/652255
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
11c847e5 — Guoqi Chen 3 months ago
runtime: use correct memory barrier in exitThread function on loong64

In the runtime.exitThread function, a storeRelease barrier
is required instead of a full barrier.

Change-Id: I2815ddb03e4984c891d71811ccf650a82325e10d
Reviewed-on: https://go-review.googlesource.com/c/go/+/631915
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: sophie zhao <zhaoxiaolin@loongson.cn>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
c594762a — Keith Randall 30 days ago
runtime: remove ret field from gobuf

It's not used for anything.

Change-Id: I031b3cdfe52b6b1cff4b3cb6713ffe588084542f
Reviewed-on: https://go-review.googlesource.com/c/go/+/652276
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2e71ae33 — Ian Lance Taylor 29 days ago
runtime/cgo: avoid errors from -Wdeclaration-after-statement

CL 652181 accidentally missed this iPhone only code.

For #71961

Change-Id: I567f8bb38958907442e69494da330d5199d11f54
Reviewed-on: https://go-review.googlesource.com/c/go/+/653135
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Commit-Queue: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Next
Do not follow this link