
WIP: Introduce PacketBatch with Growable buffer #29859

Closed
wants to merge 3 commits

Conversation


@steviez steviez commented Jan 24, 2023

WIP: Still pulling things in, so this PR will see some heavy additions over the next couple of days. In any case, I'm being intentional about splitting things up per commit. The "scaffolding" changes are currently left out for the sake of reducing the diff, and I have already identified a piece or two that can be carved out into their own PRs. Working on that to keep the final diff reasonable.

Problem

We want to support transactions that exceed the current 1232 byte limit. In order to do so, we need to use a container that is capable of holding transactions that are larger in serialized size. Another PR (#29055) took a blanket approach of adjusting the buffer for all Packets along the Tx processing path, but that approach uses additional memory for all transactions, even those that fit well within the original 1232 byte buffer.

Summary of Changes

Introduce a data structure, TxPacketBatch (leaning towards renaming it GrowablePacketBatch). This data structure is meant to be a drop-in replacement for the existing PacketBatch, with the following callouts:

  • The buffer length is the same for all packets in the batch, but can be grown at runtime (i.e., if a caller wants more space than the default 1232 bytes)
  • The buffers are laid out contiguously in memory (as opposed to a Vec<Vec<u8>>) for the sake of memory locality (performance, plus the ability to sigverify on GPU)
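The two callouts above can be sketched as follows. This is a hypothetical, simplified illustration (not the PR's actual implementation): a plain `Vec<u8>` stands in for the real `PinnedVec<u8>`, and `Meta` alignment handling is omitted.

```rust
// Simplified sketch of a contiguous, growable packet batch. All packet slots
// share one allocation and one per-packet buffer length that can be widened
// at runtime while preserving existing payloads.
struct GrowablePacketBatch {
    data: Vec<u8>,     // one contiguous allocation holding every packet's payload
    buffer_len: usize, // bytes reserved per packet (same for every packet)
    len: usize,        // number of packets currently in the batch
}

impl GrowablePacketBatch {
    fn new(buffer_len: usize, capacity: usize) -> Self {
        Self {
            data: vec![0u8; buffer_len * capacity],
            buffer_len,
            len: 0,
        }
    }

    /// Mutable view of one packet's payload slot within the shared buffer.
    fn packet_mut(&mut self, index: usize) -> &mut [u8] {
        let start = index * self.buffer_len;
        &mut self.data[start..start + self.buffer_len]
    }

    /// Widen every packet's slot to `new_buffer_len` bytes, copying the
    /// existing payloads into the new, larger allocation.
    fn grow(&mut self, new_buffer_len: usize) {
        assert!(new_buffer_len >= self.buffer_len);
        let capacity = self.data.len() / self.buffer_len;
        let mut new_data = vec![0u8; new_buffer_len * capacity];
        for i in 0..self.len {
            let src = &self.data[i * self.buffer_len..(i + 1) * self.buffer_len];
            new_data[i * new_buffer_len..i * new_buffer_len + self.buffer_len]
                .copy_from_slice(src);
        }
        self.data = new_data;
        self.buffer_len = new_buffer_len;
    }
}

fn main() {
    let mut batch = GrowablePacketBatch::new(1232, 4);
    batch.len = 1;
    batch.packet_mut(0)[0] = 42;
    batch.grow(2464); // e.g. a caller needs more than the default 1232 bytes
    assert_eq!(batch.buffer_len, 2464);
    assert_eq!(batch.packet_mut(0)[0], 42); // payload survives the growth
}
```

The growth path is the expensive case (a full reallocation plus copy), which is exactly why only batches on the Tx path that actually need more than 1232 bytes should pay for it.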

Fixes #

@steviez steviez added the noCI Suppress CI on this Pull Request label Jan 24, 2023
full_packet_batches_count.fetch_add(1, Ordering::Relaxed);
}

let packet_batch = TxPacketBatch::from_packet_batch(packet_batch);
Contributor Author:

This, along with recv_loop2() and receiver2(), is a bit of a hack I made for the moment while deferring a design decision to review.

The question is whether we want to use the new data structure TxPacketBatch throughout the codebase (Txs, shreds, gossip), or keep its use limited to the Tx path. As coded now, I keep the change limited to the Tx processing path, and this is a shim to minimize code duplication (I can probably get something cleaner in the end result).

So, the design question I'd like feedback on is: should we keep the existing PacketBatch data structure around for uses where we will continue to use UDP (and thus keep the 1232 byte restriction), or do we want to go all in and use the new data structure everywhere?

Comment on lines 39 to 46
/// Current capacity of batch (packets)
capacity: usize,
/// Current length of batch (packets)
len: usize,
/// Length of batch packet's payload buffer (bytes)
buffer_len: usize,
/// Offset to ensure Meta references from buffer are properly aligned
alignment_offset: usize,
Contributor:

I wonder if it would be useful to use wrapper types to encode, in the type system, the kind of value each of these usize fields counts.
The (packets) and (bytes) clarifications in the field comments might be pointing in that direction.

Here is a relevant discussion: https://www.reddit.com/r/rust/comments/hp6dx4/idiomatic_custom_number_types/

So something like

use derive_more::{Add, Display, From, Into};

#[derive(PartialEq, Eq, Debug, From, Add)]
struct PacketCount(usize);

#[derive(PartialEq, Eq, Debug, From, Add)]
struct BufferSize(usize);

pub struct TxPacketBatch {
    /// Shared buffer for all packets in batch
    // TODO: change to u64 so we get guaranteed same alignment as Packet (8)
    data: PinnedVec<u8>,
    /// Current capacity of batch (packets)
    capacity: PacketCount,
    /// Current length of batch (packets)
    len: PacketCount,
    /// Length of batch packet's payload buffer (bytes)
    buffer_len: BufferSize,
    /// Offset to ensure Meta references from buffer are properly aligned
    // TODO Not sure what unit is the right one to use here.  `ByteOffset`?
    alignment_offset: usize,
}

I have not seen much use of wrapper types in other cases.
So, this might be out of scope for this change.
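To make the benefit of the suggestion concrete, here is a stdlib-only sketch (no derive_more needed for the core safety win; the derives above just add arithmetic conveniences). `total_bytes` is a hypothetical helper, not code from this PR:

```rust
// Newtype wrappers turn "packet count" and "byte size" into distinct types,
// so accidentally swapping them becomes a compile error instead of a bug.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PacketCount(usize);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct BufferSize(usize);

// A function taking both can no longer have its arguments silently transposed.
fn total_bytes(len: PacketCount, buffer_len: BufferSize) -> usize {
    len.0 * buffer_len.0
}

fn main() {
    let len = PacketCount(64);
    let buffer_len = BufferSize(1232);
    assert_eq!(total_bytes(len, buffer_len), 78848);
    // total_bytes(buffer_len, len) would not compile: mismatched types.
}
```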

Comment on lines +96 to +126
pub fn from_packet_batch(old_batch: PacketBatch) -> Self {
// This implementation adapted from an example provided in the
// std::mem::transmute documentation
// https://doc.rust-lang.org/std/mem/fn.transmute.html
let mut old_batch = ManuallyDrop::new(old_batch);
Contributor:

I think it would be good to explain the reasoning behind the transformations that happen here.
A reference to transmute is useful, but it is better to be explicit.
My guess is that we are trying to reinterpret the PacketBatch contents directly as a sequence of bytes.
But you might have additional reasoning, such as performance expectations.

Comment on lines 110 to 142
let data = unsafe {
PinnedVec::from_vec(Vec::from_raw_parts(
ptr as *mut u8,
len * size_of::<Packet>(),
capacity * size_of::<Packet>(),
))
};
Contributor:

This transformation looks pretty scary.
It is only correct if Packet is Copy.
I wonder if there is a way to enforce that.

At the same time, my understanding is that elsewhere, when we send data to the network or receive it from the network, we do not treat memory contents directly as bytes.
We use bincode and other serialization libraries.
Maybe instead of reinterpreting bytes directly here, there is a safer way that provides the same performance?
For example, serializing PacketBatch into a byte buffer that will later be sent over the network?
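As a rough illustration of the serialize-instead-of-transmute alternative: real code would likely use bincode, but this stdlib-only sketch (with a stand-in `Packet` type, not the real solana one) shows the shape of the idea with a simple length-prefixed encoding.

```rust
// Instead of reinterpreting the Vec<Packet> allocation in place via
// transmute, explicitly copy each packet's payload into a fresh byte buffer.
// Encoding: u32 little-endian length prefix, then the payload bytes.
struct Packet {
    data: Vec<u8>, // stand-in for the real packet payload
}

fn serialize_batch(packets: &[Packet]) -> Vec<u8> {
    let mut out = Vec::new();
    for packet in packets {
        out.extend_from_slice(&(packet.data.len() as u32).to_le_bytes());
        out.extend_from_slice(&packet.data);
    }
    out
}

fn deserialize_batch(mut bytes: &[u8]) -> Vec<Packet> {
    let mut packets = Vec::new();
    while bytes.len() >= 4 {
        let len = u32::from_le_bytes(bytes[..4].try_into().unwrap()) as usize;
        packets.push(Packet {
            data: bytes[4..4 + len].to_vec(),
        });
        bytes = &bytes[4 + len..];
    }
    packets
}

fn main() {
    let batch = vec![Packet { data: vec![1, 2, 3] }, Packet { data: vec![9] }];
    let bytes = serialize_batch(&batch);
    let round_trip = deserialize_batch(&bytes);
    assert_eq!(round_trip.len(), 2);
    assert_eq!(round_trip[0].data, vec![1, 2, 3]);
    assert_eq!(round_trip[1].data, vec![9]);
}
```

This trades one extra copy for no unsafe code and no dependence on Packet's in-memory layout; whether that copy is acceptable is exactly the performance question raised above.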

Comment on lines +179 to +205
let new_buffer_len = TxPacketSize::Double;
let new_buffer_len = new_buffer_len.into();
Contributor:

minor

Maybe just

        let new_buffer_len = TxPacketSize::Double.into();

@@ -216,27 +202,11 @@ impl SigVerifierStats {
}
}

impl SigVerifier for DisabledSigVerifier {
Contributor:

Maybe add a note to the commit description as to why you are removing this trait?
It looks like it had only one use case: being part of the API of TransactionSigVerifier.
Was it an unnecessary abstraction?

Contributor Author:

Was it an unnecessary abstraction?

Yes, it made things a little easier with this change. I intentionally have that change broken out into its own commit, and should we proceed with this PR, I will open a separate PR for just that commit and rebase this PR on top.

@@ -85,14 +83,14 @@ impl Deduper {

pub fn dedup_packets_and_count_discards(
&self,
batches: &mut [PacketBatch],
mut process_received_packet: impl FnMut(&mut Packet, bool, bool),
batches: &mut Vec<TxPacketBatch>,
Contributor:

A change from &mut [T] to &mut Vec<T> reduces generality.
As the rest of the changes in this patch do not need it, maybe keep &mut [T]?
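To illustrate the generality point: a `&mut [T]` parameter accepts a Vec, an array, or any sub-slice, whereas `&mut Vec<T>` accepts only a Vec. The function name below is a hypothetical stand-in for dedup_packets_and_count_discards:

```rust
// Taking &mut [T] keeps the function callable with any mutable contiguous
// storage; here it just counts zero entries as a placeholder for dedup logic.
fn dedup_batches(batches: &mut [u64]) -> usize {
    batches.iter().filter(|&&b| b == 0).count()
}

fn main() {
    let mut from_vec = vec![0u64, 1, 0];
    let mut from_array = [0u64, 2];
    assert_eq!(dedup_batches(&mut from_vec), 2); // a Vec coerces to a slice
    assert_eq!(dedup_batches(&mut from_array), 1); // so does an array
    assert_eq!(dedup_batches(&mut from_vec[..1]), 1); // and a sub-slice
}
```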

#[inline]
/// Returns a mutable pointer to the beginning of the data buffer.
pub fn as_mut_ptr(&mut self) -> *mut u8 {
unsafe { self.data.as_mut_ptr().cast::<u8>().add(0) }
Contributor Author:

Oops, I'll clean this up
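For what it's worth, the cleanup is presumably just dropping the no-op `.cast::<u8>().add(0)`, since `Vec::as_mut_ptr` already returns the right pointer. A sketch with a plain `Vec<u8>` standing in for the real `PinnedVec<u8>`:

```rust
// Cleaned-up version: no unsafe block and no pointer arithmetic needed to
// return a pointer to the start of the data buffer.
pub struct TxPacketBatch {
    data: Vec<u8>,
}

impl TxPacketBatch {
    /// Returns a mutable pointer to the beginning of the data buffer.
    #[inline]
    pub fn as_mut_ptr(&mut self) -> *mut u8 {
        self.data.as_mut_ptr()
    }
}

fn main() {
    let mut batch = TxPacketBatch { data: vec![7u8, 8, 9] };
    let ptr = batch.as_mut_ptr();
    // Safe here: ptr points at the first of three initialized bytes.
    unsafe { assert_eq!(*ptr, 7) };
}
```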

@steviez commented Jan 27, 2023

Had a good conversation with @ilya-bobyr - this approach is temporarily on hold in favor of another.

@github-actions github-actions bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Feb 10, 2023
@github-actions github-actions bot closed this Feb 20, 2023
@steviez steviez deleted the XxTx branch May 25, 2023 04:49