Skip to content

Commit

Permalink
index/suffixarray: add 32-bit implementation
Browse files Browse the repository at this point in the history
The original index/suffixarray used 32-bit ints on 64-bit machines,
because that's what 'int' meant in Go at the time. When we changed
the meaning of int, that doubled the space overhead of suffix arrays
for all uses, even though the vast majority of them describe less
than 2 GB of text.

The space overhead of a suffix array compared to the text is not
insignificant: there's a big difference for many uses between 4X and 8X.

This CL adjusts names in qsufsort.go so that a global search and
replace s/32/64/g produces a working 64-bit implementation,
and then it modifies suffixarray.go to choose between the 32-bit
and 64-bit implementation as appropriate depending on the input size.
The 64-bit implementation is generated by 'go generate'.

This CL also restructures the benchmarks, to test different
input sizes, different input texts, and 32-bit vs 64-bit.

The serialized form uses varint-encoded numbers and is unchanged,
so on-disk suffix arrays written by older versions of Go will be
readable by this version, and vice versa.

The 32-bit version runs a up to 17% faster than the 64-bit version
on real inputs, but more importantly it uses 50% less memory.

I have a followup CL that also implements a faster algorithm
on top of these improvements, but these are a good first step.

name                                  64-bit speed   32-bit speed    delta
New/text=opticks/size=100K/bits=*-12  4.44MB/s ± 0%  4.64MB/s ± 0%   +4.41%  (p=0.008 n=5+5)
New/text=opticks/size=500K/bits=*-12  3.70MB/s ± 1%  3.82MB/s ± 0%   +3.30%  (p=0.008 n=5+5)
New/text=go/size=100K/bits=*-12       4.40MB/s ± 0%  4.61MB/s ± 0%   +4.82%  (p=0.008 n=5+5)
New/text=go/size=500K/bits=*-12       3.66MB/s ± 0%  3.77MB/s ± 0%   +3.01%  (p=0.016 n=4+5)
New/text=go/size=1M/bits=*-12         3.29MB/s ± 0%  3.55MB/s ± 0%   +7.90%  (p=0.016 n=5+4)
New/text=go/size=5M/bits=*-12         2.25MB/s ± 1%  2.65MB/s ± 0%  +17.81%  (p=0.008 n=5+5)
New/text=go/size=10M/bits=*-12        1.82MB/s ± 0%  2.09MB/s ± 1%  +14.36%  (p=0.008 n=5+5)
New/text=go/size=50M/bits=*-12        1.35MB/s ± 0%  1.51MB/s ± 1%  +12.33%  (p=0.008 n=5+5)
New/text=zero/size=100K/bits=*-12     3.42MB/s ± 0%  3.32MB/s ± 0%   -2.74%  (p=0.000 n=5+4)
New/text=zero/size=500K/bits=*-12     3.00MB/s ± 1%  2.97MB/s ± 0%   -1.13%  (p=0.016 n=5+4)
New/text=zero/size=1M/bits=*-12       2.81MB/s ± 0%  2.78MB/s ± 2%     ~     (p=0.167 n=5+5)
New/text=zero/size=5M/bits=*-12       2.46MB/s ± 0%  2.53MB/s ± 0%   +3.18%  (p=0.008 n=5+5)
New/text=zero/size=10M/bits=*-12      2.35MB/s ± 0%  2.42MB/s ± 0%   +2.98%  (p=0.016 n=4+5)
New/text=zero/size=50M/bits=*-12      2.12MB/s ± 0%  2.18MB/s ± 0%   +3.02%  (p=0.008 n=5+5)
New/text=rand/size=100K/bits=*-12     6.98MB/s ± 0%  7.22MB/s ± 0%   +3.38%  (p=0.016 n=4+5)
New/text=rand/size=500K/bits=*-12     5.53MB/s ± 0%  5.64MB/s ± 0%   +1.92%  (p=0.008 n=5+5)
New/text=rand/size=1M/bits=*-12       4.62MB/s ± 1%  5.06MB/s ± 0%   +9.61%  (p=0.008 n=5+5)
New/text=rand/size=5M/bits=*-12       3.09MB/s ± 0%  3.43MB/s ± 0%  +10.94%  (p=0.016 n=4+5)
New/text=rand/size=10M/bits=*-12      2.68MB/s ± 0%  2.95MB/s ± 0%  +10.39%  (p=0.008 n=5+5)
New/text=rand/size=50M/bits=*-12      1.92MB/s ± 0%  2.06MB/s ± 1%   +7.41%  (p=0.008 n=5+5)
SaveRestore/bits=*-12                  243MB/s ± 1%   259MB/s ± 0%   +6.68%  (p=0.000 n=9+10)

name                               64-bit alloc/op  32-bit alloc/op  delta
New/text=opticks/size=100K/bits=*-12    1.62MB ± 0%    0.81MB ± 0%  -50.00%  (p=0.000 n=5+4)
New/text=opticks/size=500K/bits=*-12    8.07MB ± 0%    4.04MB ± 0%  -49.89%  (p=0.008 n=5+5)
New/text=go/size=100K/bits=*-12         1.62MB ± 0%    0.81MB ± 0%  -50.00%  (p=0.008 n=5+5)
New/text=go/size=500K/bits=*-12         8.07MB ± 0%    4.04MB ± 0%  -49.89%  (p=0.029 n=4+4)
New/text=go/size=1M/bits=*-12           16.1MB ± 0%     8.1MB ± 0%  -49.95%  (p=0.008 n=5+5)
New/text=go/size=5M/bits=*-12           80.3MB ± 0%    40.2MB ± 0%     ~     (p=0.079 n=4+5)
New/text=go/size=10M/bits=*-12           160MB ± 0%      80MB ± 0%  -50.00%  (p=0.008 n=5+5)
New/text=go/size=50M/bits=*-12           805MB ± 0%     402MB ± 0%  -50.06%  (p=0.029 n=4+4)
New/text=zero/size=100K/bits=*-12       3.02MB ± 0%    1.46MB ± 0%     ~     (p=0.079 n=4+5)
New/text=zero/size=500K/bits=*-12       19.7MB ± 0%     8.7MB ± 0%  -55.98%  (p=0.008 n=5+5)
New/text=zero/size=1M/bits=*-12         39.0MB ± 0%    19.7MB ± 0%  -49.60%  (p=0.000 n=5+4)
New/text=zero/size=5M/bits=*-12          169MB ± 0%      85MB ± 0%  -49.46%  (p=0.029 n=4+4)
New/text=zero/size=10M/bits=*-12         333MB ± 0%     169MB ± 0%  -49.43%  (p=0.000 n=5+4)
New/text=zero/size=50M/bits=*-12        1.63GB ± 0%    0.74GB ± 0%  -54.61%  (p=0.008 n=5+5)
New/text=rand/size=100K/bits=*-12       1.61MB ± 0%    0.81MB ± 0%  -50.00%  (p=0.000 n=5+4)
New/text=rand/size=500K/bits=*-12       8.07MB ± 0%    4.04MB ± 0%  -49.89%  (p=0.000 n=5+4)
New/text=rand/size=1M/bits=*-12         16.1MB ± 0%     8.1MB ± 0%  -49.95%  (p=0.029 n=4+4)
New/text=rand/size=5M/bits=*-12         80.7MB ± 0%    40.3MB ± 0%  -50.06%  (p=0.008 n=5+5)
New/text=rand/size=10M/bits=*-12         161MB ± 0%      81MB ± 0%  -50.03%  (p=0.008 n=5+5)
New/text=rand/size=50M/bits=*-12         806MB ± 0%     403MB ± 0%  -50.00%  (p=0.016 n=4+5)
SaveRestore/bits=*-12                   9.47MB ± 0%    5.28MB ± 0%  -44.29%  (p=0.000 n=9+8)

https://perf.golang.org/search?q=upload:20190126.1+|+bits:64+vs+bits:32

Fixes #6816.

Change-Id: Ied2fbea519a202ecc43719debcd233344ce38847
Reviewed-on: https://go-review.googlesource.com/c/go/+/174097
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
  • Loading branch information
rsc committed May 1, 2019
1 parent b098c0f commit 45be353
Show file tree
Hide file tree
Showing 5 changed files with 525 additions and 84 deletions.
31 changes: 31 additions & 0 deletions src/index/suffixarray/gen64.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
// Copyright 2019 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// +build ignore

// Gen64 generates qsufsort64.go from qsufsort.go by s/32/64/g.
package main

import (
"bytes"
"io/ioutil"
"log"
)

func main() {
log.SetPrefix("gen64: ")
log.SetFlags(0)

data, err := ioutil.ReadFile("qsufsort.go")
if err != nil {
log.Fatal(err)
}

data = bytes.Replace(data, []byte("\n\n"), []byte("\n\n// Code generated by gen64.go; DO NOT EDIT.\n//go:generate go run gen64.go\n\n"), 1)
data = bytes.Replace(data, []byte("32"), []byte("64"), -1)

if err := ioutil.WriteFile("qsufsort64.go", data, 0666); err != nil {
log.Fatal(err)
}
}
66 changes: 34 additions & 32 deletions src/index/suffixarray/qsufsort.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,26 +24,28 @@

package suffixarray

import "sort"
import (
"sort"
)

func qsufsort(data []byte) []int {
func qsufsort32(data []byte) []int32 {
// initial sorting by first byte of suffix
sa := sortedByFirstByte(data)
sa := sortedByFirstByte32(data)
if len(sa) < 2 {
return sa
}
// initialize the group lookup table
// this becomes the inverse of the suffix array when all groups are sorted
inv := initGroups(sa, data)
inv := initGroups32(sa, data)

// the index starts 1-ordered
sufSortable := &suffixSortable{sa: sa, inv: inv, h: 1}
sufSortable := &suffixSortable32{sa: sa, inv: inv, h: 1}

for sa[0] > -len(sa) { // until all suffixes are one big sorted group
for sa[0] > -int32(len(sa)) { // until all suffixes are one big sorted group
// The suffixes are h-ordered, make them 2*h-ordered
pi := 0 // pi is first position of first group
sl := 0 // sl is negated length of sorted groups
for pi < len(sa) {
pi := int32(0) // pi is first position of first group
sl := int32(0) // sl is negated length of sorted groups
for pi < int32(len(sa)) {
if s := sa[pi]; s < 0 { // if pi starts sorted group
pi -= s // skip over sorted group
sl += s // add negated length to sl
Expand All @@ -67,12 +69,12 @@ func qsufsort(data []byte) []int {
}

for i := range sa { // reconstruct suffix array from inverse
sa[inv[i]] = i
sa[inv[i]] = int32(i)
}
return sa
}

func sortedByFirstByte(data []byte) []int {
func sortedByFirstByte32(data []byte) []int32 {
// total byte counts
var count [256]int
for _, b := range data {
Expand All @@ -84,20 +86,20 @@ func sortedByFirstByte(data []byte) []int {
count[b], sum = sum, count[b]+sum
}
// iterate through bytes, placing index into the correct spot in sa
sa := make([]int, len(data))
sa := make([]int32, len(data))
for i, b := range data {
sa[count[b]] = i
sa[count[b]] = int32(i)
count[b]++
}
return sa
}

func initGroups(sa []int, data []byte) []int {
func initGroups32(sa []int32, data []byte) []int32 {
// label contiguous same-letter groups with the same group number
inv := make([]int, len(data))
prevGroup := len(sa) - 1
inv := make([]int32, len(data))
prevGroup := int32(len(sa)) - 1
groupByte := data[sa[prevGroup]]
for i := len(sa) - 1; i >= 0; i-- {
for i := int32(len(sa)) - 1; i >= 0; i-- {
if b := data[sa[i]]; b < groupByte {
if prevGroup == i+1 {
sa[i+1] = -1
Expand All @@ -114,13 +116,13 @@ func initGroups(sa []int, data []byte) []int {
// This is necessary to ensure the suffix "a" is before "aba"
// when using a potentially unstable sort.
lastByte := data[len(data)-1]
s := -1
s := int32(-1)
for i := range sa {
if sa[i] >= 0 {
if data[sa[i]] == lastByte && s == -1 {
s = i
s = int32(i)
}
if sa[i] == len(sa)-1 {
if sa[i] == int32(len(sa))-1 {
sa[i], sa[s] = sa[s], sa[i]
inv[sa[s]] = s
sa[s] = -1 // mark it as an isolated sorted group
Expand All @@ -131,31 +133,31 @@ func initGroups(sa []int, data []byte) []int {
return inv
}

type suffixSortable struct {
sa []int
inv []int
h int
buf []int // common scratch space
type suffixSortable32 struct {
sa []int32
inv []int32
h int32
buf []int32 // common scratch space
}

func (x *suffixSortable) Len() int { return len(x.sa) }
func (x *suffixSortable) Less(i, j int) bool { return x.inv[x.sa[i]+x.h] < x.inv[x.sa[j]+x.h] }
func (x *suffixSortable) Swap(i, j int) { x.sa[i], x.sa[j] = x.sa[j], x.sa[i] }
func (x *suffixSortable32) Len() int { return len(x.sa) }
func (x *suffixSortable32) Less(i, j int) bool { return x.inv[x.sa[i]+x.h] < x.inv[x.sa[j]+x.h] }
func (x *suffixSortable32) Swap(i, j int) { x.sa[i], x.sa[j] = x.sa[j], x.sa[i] }

func (x *suffixSortable) updateGroups(offset int) {
func (x *suffixSortable32) updateGroups(offset int32) {
bounds := x.buf[0:0]
group := x.inv[x.sa[0]+x.h]
for i := 1; i < len(x.sa); i++ {
if g := x.inv[x.sa[i]+x.h]; g > group {
bounds = append(bounds, i)
bounds = append(bounds, int32(i))
group = g
}
}
bounds = append(bounds, len(x.sa))
bounds = append(bounds, int32(len(x.sa)))
x.buf = bounds

// update the group numberings after all new groups are determined
prev := 0
prev := int32(0)
for _, b := range bounds {
for i := prev; i < b; i++ {
x.inv[x.sa[i]] = offset + b - 1
Expand Down
173 changes: 173 additions & 0 deletions src/index/suffixarray/qsufsort64.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading

0 comments on commit 45be353

Please sign in to comment.