When optimizing the write performance of GreptimeDB v0.7, we discovered through flame graphs that parsing Prometheus write requests accounted for about 12% of total CPU time. By comparison, VictoriaMetrics, which is implemented in Go, spends only around 5% of its CPU time on protocol parsing. This prompted us to look into reducing the overhead of the protocol conversion layer.
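As an aside, if you want to reproduce this kind of profile yourself, one way to get a flame graph of a Rust benchmark is the `cargo-flamegraph` tool. The snippet below is a sketch; it assumes the Criterion benchmark target `prom_decode` that is introduced later in this post.

```bash
# Install the flamegraph cargo subcommand (uses perf on Linux, dtrace on macOS).
cargo install flamegraph
# Profile the Criterion benchmark target; the trailing `--bench` is forwarded
# to the benchmark binary so Criterion runs in benchmark mode.
cargo flamegraph --bench prom_decode -- --bench
```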
To simplify the discussion, all the test code is stored in the GitHub repository https://github.com/v0y4g3r/prom-write-request-bench.
```bash
git clone https://github.com/v0y4g3r/prom-write-request-bench
cd prom-write-request-bench
export PROJECT_ROOT=$(pwd)
```
## Optimizing the overhead of the protocol conversion layer

### Step 1: Reproduce the case
First, let's establish the baseline with a minimal reproducible benchmark. The corresponding branch is:
```bash
git checkout step1/reproduce
```
The Rust benchmark code (`benches/prom_decode.rs`):
```rust
use bytes::Bytes;
use criterion::{criterion_group, criterion_main, Criterion};
use prost::Message;

// `WriteRequest` is the crate's generated Prometheus remote-write protobuf type.

fn bench_decode_prom_request(c: &mut Criterion) {
    // Load a captured Prometheus remote-write payload as the benchmark fixture.
    let mut d = std::path::PathBuf::from(env!("CARGO_MANIFEST_DIR"));
    d.push("assets");
    d.push("1709380533560664458.data");
    let data = Bytes::from(std::fs::read(d).unwrap());

    // A request intended to be reused across iterations
    // (see the pooled variant sketched below).
    let mut request_pooled = WriteRequest::default();

    c.benchmark_group("decode")
        .bench_function("write_request", |b| {
            b.iter(|| {
                // Decode into a fresh WriteRequest on every iteration.
                let mut request = WriteRequest::default();
                let data = data.clone();
                request.merge(data).unwrap();
            });
        });
}

criterion_group!(benches, bench_decode_prom_request);
criterion_main!(benches);
```
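The otherwise-unused `request_pooled` above belongs to the other side of the comparison: reusing one decoded request across iterations instead of allocating a fresh one each time. Here is a minimal sketch of that pooled variant, assuming `WriteRequest` implements `prost::Message` and therefore has `clear()`; the function name is an assumption for illustration, not code from the repository:

```rust
// Hypothetical pooled variant: one WriteRequest is reused for the whole run,
// so per-iteration allocation cost drops out of the measurement.
fn bench_decode_prom_request_pooled(c: &mut Criterion) {
    let mut d = std::path::PathBuf::from(env!("CARGO_MANIFEST_DIR"));
    d.push("assets");
    d.push("1709380533560664458.data");
    let data = Bytes::from(std::fs::read(d).unwrap());

    let mut request_pooled = WriteRequest::default();
    c.benchmark_group("decode")
        .bench_function("pooled_write_request", |b| {
            b.iter(|| {
                let data = data.clone();
                // prost::Message::clear() resets all fields in place, after
                // which the same message can be merged into again.
                request_pooled.clear();
                request_pooled.merge(data).unwrap();
            });
        });
}
```

To run it, add the function to the `criterion_group!` invocation alongside `bench_decode_prom_request`, and both benchmarks will appear under the `decode` group.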
Run the benchmark command multiple times:
```bash
cargo bench -- decode/write_request
```
After a few runs, the baseline stabilizes at:
```text
decode/write_request
                        time:   [7.3174 ms 7.3274 ms 7.3380 ms]
                        change: [+128.55% +129.11% +129.65%] (p = 0.00 < 0.05)
```
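Note that the `change:` line is Criterion comparing this run against the results it saved from the previous one, which is why a fresh run can show a large swing. To compare against a fixed reference point instead, Criterion's named baselines can be used (the baseline name below is arbitrary):

```bash
# Save the current numbers under an explicit baseline name...
cargo bench -- decode/write_request -- --save-baseline step1
# ...and have later runs compare against that saved baseline.
cargo bench -- decode/write_request -- --baseline step1
```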
Then clone the VictoriaMetrics source into the current directory to set up a Go benchmarking environment for comparison:
```bash
git clone https://github.com/VictoriaMetrics/VictoriaMetrics
cd VictoriaMetrics
cat <<EOF > ./lib/prompb/prom_decode_bench_test.go
package prompb

import (
	"io/ioutil"
	"te