Array: optimize shift and unshift #10081

asterite · 2020-12-15T19:04:53Z

Revival of #8036 and more!

How Array#shift used to work

Before this PR, an Array consisted of a buffer, a size and a capacity. The capacity is the number of elements the buffer can hold. As elements are pushed into it the size increases. Once the size reaches the capacity, the buffer is doubled in size. To do this, we call realloc on the buffer.

When calling unshift, because we must remember the exact pointer where the buffer was allocated in order to do a realloc, the only thing we could do was to call realloc (if the buffer was full), then move all the elements "to the right". Calling shift had the mirror-image cost: after removing the first element, all the remaining elements were moved "to the left".

If the array is:

  [1, 2, 3]

and we unshift:

  4

we realloc:

  [1, 2, 3, _, _, _]

then we move the elements to the right:

  [_, 1, 2, 3, _, _]

then we put the new value:

  [4, 1, 2, 3, _, _]

The next time unshift is called we must do the same thing again, except that, well, we don't need a realloc for a while, but we still need to move the memory to the right. And every shift moves the remaining elements back to the left.
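The old behavior can be modelled roughly as follows. This is only a sketch: `NaiveArray`, its `moved` counter and its method bodies are hypothetical and written for illustration, not copied from the standard library. It assumes the buffer pointer must always be the allocation start, so any change at the front moves every element.

```crystal
# Toy model of the pre-PR layout (hypothetical, not the stdlib code).
class NaiveArray
  getter size : Int32
  getter moved : Int32 # how many elements have been moved so far
  @buffer : Pointer(Int32)

  def initialize(@capacity : Int32 = 3)
    @size = 0
    @moved = 0
    @buffer = Pointer(Int32).malloc(@capacity)
  end

  def push(value : Int32) : Nil
    grow if @size == @capacity
    @buffer[@size] = value
    @size += 1
  end

  def unshift(value : Int32) : Nil
    grow if @size == @capacity
    # Slide every element one slot to the right, then write at index 0.
    (@buffer + 1).move_from(@buffer, @size)
    @moved += @size
    @buffer[0] = value
    @size += 1
  end

  def shift : Int32
    raise IndexError.new if @size == 0
    value = @buffer[0]
    @size -= 1
    # Slide the remaining elements one slot to the left.
    @buffer.move_from(@buffer + 1, @size)
    @moved += @size
    value
  end

  def to_a : Array(Int32)
    Array(Int32).new(@size) { |i| @buffer[i] }
  end

  private def grow : Nil
    @capacity *= 2
    @buffer = @buffer.realloc(@capacity)
  end
end

a = NaiveArray.new
[1, 2, 3].each { |v| a.push(v) }
a.unshift(4)
p a.to_a  # => [4, 1, 2, 3]
p a.moved # => 3
```

Every front operation costs one move per element already stored, which is why the loops in the benchmarks further down are quadratic before this PR.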

How Array#shift is optimized

With this PR, Array now also tracks an offset to where the buffer starts. So for example:

If the array is:
 
  [1, 2, 3]
   ^
   buffer
   offset = 0

Initially, the buffer points to "1" and the offset to the buffer is zero.

If we shift, the first element (1) is removed and returned.

Then we can move the buffer (well, the pointer to it) to the right and remember that we have an offset of 1:
 
  [_, 2, 3]
      ^
      buffer
      offset = 1

This is very efficient because all we need to do is increment a couple of variables: no memory is moved.

Furthermore, if we later unshift a value while the offset is greater than zero, we can move the buffer to the left and decrement the offset. Again, no memory is moved.
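A minimal sketch of the offset idea, under stated assumptions: `OffsetArray`, `offset` and `root_buffer` are hypothetical names chosen for this example, and only the cheap paths are modelled (no growth, no clearing of vacated slots).

```crystal
# Toy model of an array that tracks an offset to its buffer (hypothetical).
class OffsetArray
  getter size : Int32
  getter offset : Int32
  @buffer : Pointer(Int32)

  def initialize(values : Array(Int32))
    @size = values.size
    @offset = 0
    @buffer = Pointer(Int32).malloc(@size)
    values.each_with_index { |v, i| @buffer[i] = v }
  end

  def shift : Int32
    raise IndexError.new if @size == 0
    value = @buffer.value
    @buffer += 1 # the pointer now sits one slot past the allocation start
    @offset += 1
    @size -= 1
    value
  end

  def unshift(value : Int32) : Nil
    # Only the cheap path is modelled: a free slot must exist on the left.
    raise "no room on the left" if @offset == 0
    @buffer -= 1
    @offset -= 1
    @size += 1
    @buffer.value = value
  end

  # The allocation start can always be recovered, which realloc needs.
  def root_buffer : Pointer(Int32)
    @buffer - @offset
  end

  def to_a : Array(Int32)
    Array(Int32).new(@size) { |i| @buffer[i] }
  end
end

a = OffsetArray.new([1, 2, 3])
p a.shift  # => 1
p a.to_a   # => [2, 3]
p a.offset # => 1
a.unshift(4)
p a.to_a   # => [4, 2, 3]
p a.offset # => 0
```

Both operations only touch the pointer, the offset and the size, so their cost no longer depends on how many elements the array holds.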

As explained in #8036, adding this offset to Array doesn't change its memory consumption, at least not on 64-bit platforms. On 32-bit platforms it's just 4 more bytes per Array, which isn't terrible.
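One way to check the memory claim is to compare `instance_sizeof` for classes with the same field shapes. The three classes below are hypothetical stand-ins (they are not Array, and the field names are assumptions); the sketch also assumes that fields are laid out in declaration order after a 4-byte type id, which is what the ordering remark later in this thread suggests.

```crystal
# Hypothetical stand-ins for Array's instance variables; only the layout matters.
class WithoutOffset
  @size : Int32 = 0
  @capacity : Int32 = 0
  @buffer : Pointer(Int32) = Pointer(Int32).null
end

class WithOffset
  @size : Int32 = 0
  @capacity : Int32 = 0
  @offset : Int32 = 0
  @buffer : Pointer(Int32) = Pointer(Int32).null
end

# Same fields as WithOffset, but with the pointer declared first.
class BufferFirst
  @buffer : Pointer(Int32) = Pointer(Int32).null
  @size : Int32 = 0
  @capacity : Int32 = 0
  @offset : Int32 = 0
end

puts instance_sizeof(WithoutOffset)
puts instance_sizeof(WithOffset)
puts instance_sizeof(BufferFirst)
```

On a 64-bit target the first two sizes should match, because the extra 32-bit field fills what would otherwise be alignment padding before the 8-byte pointer. The third should be larger, since the pointer then needs padding in front of it.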

How Array#unshift is optimized

This is something that wasn't done in #8036.

The idea is that when we unshift an element and we have no more space left, we grow the array but leave the pointer to the buffer pointing to the middle of the buffer:

If the array is:
 
  [1, 2, 3]
   ^ buffer

and we unshift:

  4

Then we resize the array:
 
  [1, 2, 3, _, _, _]
   ^ buffer

We move the elements to the right and also the pointer to the buffer:

  [_, _, _, 1, 2, 3]
            ^
            buffer

In this way, if another unshift comes, we just need to move the buffer to the left. Only after 3 unshifts (counting this one) will we need to reallocate and move the memory again, and after that only after another 6.

Also note how this is similar to push but in the other direction: when you push and there's no more space, the buffer grows and some extra space remains on the right in case more pushes come later on. This is exactly the same but on the left.
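The growth step can be sketched like this. It is a simplified model under assumptions: `FrontGrowArray` and `grow_front` are hypothetical names, only unshift is modelled (so the buffer is always full when the offset reaches zero), and the capacity is simply doubled, which may differ from the heuristics in the actual implementation.

```crystal
# Toy model: when there is no room on the left, grow and re-position the
# elements at the right end of the new buffer (hypothetical, not stdlib code).
class FrontGrowArray
  getter size : Int32
  getter offset : Int32
  getter reallocs : Int32
  @buffer : Pointer(Int32)
  @capacity : Int32

  def initialize(values : Array(Int32))
    @size = values.size
    @capacity = @size
    @offset = 0
    @reallocs = 0
    @buffer = Pointer(Int32).malloc(@capacity)
    values.each_with_index { |v, i| @buffer[i] = v }
  end

  def unshift(value : Int32) : Nil
    grow_front if @offset == 0
    # Cheap path: step the pointer one slot to the left and write there.
    @buffer -= 1
    @offset -= 1
    @size += 1
    @buffer.value = value
  end

  def to_a : Array(Int32)
    Array(Int32).new(@size) { |i| @buffer[i] }
  end

  private def grow_front : Nil
    new_capacity = Math.max(@capacity * 2, 4)
    # The offset is 0 here, so @buffer is the allocation start.
    root = @buffer.realloc(new_capacity)
    gap = new_capacity - @size
    # Move the elements to the right, leaving `gap` free slots on the left.
    (root + gap).move_from(root, @size)
    @buffer = root + gap
    @offset = gap
    @capacity = new_capacity
    @reallocs += 1
  end
end

a = FrontGrowArray.new([1, 2, 3])
a.unshift(4)
p a.to_a     # => [4, 1, 2, 3]
p a.offset   # => 2
p a.reallocs # => 1
```

With doubling, the expensive move happens after 3 unshifts, then 6, then 12, so the cost per unshift is amortized in the same way as for push.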

Why is this important?

I consider this change extremely important. With this, Array can be used as a list, as a queue, as a deque... whatever! Well, previously it could be used like that too, but it wasn't efficient in all cases. So with this, Array becomes a universal, efficient data structure. Just like in Ruby. No more fear of calling shift or unshift and thinking "Ugh, why do I have to pay this penalty?".

Benchmarks!

Shift

require "benchmark"

Benchmark.ips do |x|
  x.report("shift") do
    array = Array.new(10_000, &.itself)
    while array.shift?
    end
  end
end

I know this is allocating an array, but still, look at the times:

before  495.08 (  2.02ms) (± 3.56%)  39.1kB/op
after   40.19k ( 24.88µs) (± 0.98%)  39.1kB/op

Unshift

require "benchmark"

Benchmark.ips do |x|
  x.report("unshift") do
    array = [] of Int32
    10_000.times do |i|
      array.unshift(i)
    end
  end
end

before  535.09 (  1.87ms) (± 3.75%)  96.8kB/op  fastest
after   35.95k ( 27.81µs) (± 1.63%)  96.8kB/op  fastest

In both operations, this change makes things close to 2 orders of magnitude faster (about 81x for shift and about 67x for unshift in the runs above).

Please

Can we have this in 1.0 pretty please? I don't mind undoing this if we later decide to go with a thread-safe Array. But until then everyone can enjoy a more efficient Array.

bcardiff · 2020-12-15T19:35:10Z

I believe that in the CrystalArraySyntheticProvider, the self.buffer initialization should change to self.buffer = self.valobj.child[3].

asterite · 2020-12-15T19:48:10Z

@bcardiff Where do I change that?

bcardiff · 2020-12-15T19:59:39Z

At https://github.com/crystal-lang/crystal/blob/master/etc/lldb/crystal_formatters.py#L14

Unfortunately, the manual specs that can be triggered by https://github.com/crystal-lang/crystal/blob/master/spec/debug/test.sh do not include a spec for Array.

bcardiff · 2020-12-15T20:00:19Z

(or you can change the ivars order 🤷‍♂️ )

asterite · 2020-12-15T20:05:57Z

@bcardiff Good catch, done!

We can't change the ivars order because then the Array will occupy more space.

bcardiff · 2020-12-15T20:27:06Z

FYI, the linux_32 failure seems to be because of a different value of __pad1 : UShort, which should not matter and probably holds random data. But default struct equality compares all fields.

asterite · 2020-12-15T21:00:45Z

Thanks. We can fix that in a separate PR.

RespiteSage · 2020-12-15T22:00:49Z

This is exciting! However, how does Deque compare now to Array? Is there still a performance benefit, or does its usefulness become limited to a more focused API?

mwlang · 2020-12-16T02:34:15Z

This is a big deal. I would love to see this particular optimization make it into the merge. I am constantly having to refactor code that uses Arrays to find different solutions than what comes most naturally. Perhaps I'm just spoiled by Ruby's implementation of Arrays and I need to be less lazy, but old habits die hard and one great reason to use Crystal is precisely because so many Ruby concepts carry over.

asterite · 2020-12-17T18:33:23Z

> However, how does Deque compare now to Array?

Good question. You can try doing some benchmarks on both data structures and see.

My guess is that Deque will be more memory efficient when used exclusively as a deque, compared to Array, but I don't know. For example rotate is more efficient in Deque. I wouldn't remove Deque from the standard library.
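A benchmark along those lines could look like the sketch below. The helper names `churn_array` and `churn_deque` and the workload (fill from the front, then drain from the front) are assumptions made for this example; other access patterns, such as rotate, may give a different picture, and no results are claimed here.

```crystal
require "benchmark"

# Fill an Array from the front, then drain it from the front.
def churn_array(n : Int32) : Int32
  array = [] of Int32
  n.times { |i| array.unshift(i) }
  sum = 0
  while value = array.shift?
    sum += value
  end
  sum
end

# The same workload on a Deque.
def churn_deque(n : Int32) : Int32
  deque = Deque(Int32).new
  n.times { |i| deque.unshift(i) }
  sum = 0
  while value = deque.shift?
    sum += value
  end
  sum
end

Benchmark.ips do |x|
  x.report("Array") { churn_array(10_000) }
  x.report("Deque") { churn_deque(10_000) }
end
```

Returning the sum keeps the loops from being optimized away and gives a quick sanity check that both collections processed the same elements. As with the benchmarks above, build with `--release` for meaningful numbers.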

asterite added 2 commits December 15, 2020 14:44

Array: optimize shift

d13f01d

Array: optimize unshift

0bb29c3

asterite added performance topic:stdlib:collection labels Dec 15, 2020

asterite mentioned this pull request Dec 15, 2020

Overdosing on GC #10079

Closed

Fix Array debug info

8b6a60f

straight-shoota approved these changes Dec 18, 2020

asterite added this to the 1.0.0 milestone Dec 18, 2020

asterite merged commit cbb33db into master Dec 18, 2020

asterite deleted the opt/array-shift branch December 18, 2020 10:10

Sija mentioned this pull request Jan 9, 2021

Fix selected issues found by Ameba #10230

Closed

HertzDevil mentioned this pull request Feb 10, 2021

each_cons should be able to use deques when reusing the buffer. #7189

Closed

straight-shoota mentioned this pull request Mar 24, 2021

Function return type does not match operand type of return inst! #10544

Closed
