Thank you for your response. I retract my concerns. The protocol seems adequately performant with the new information you have provided.
The benchmark is clearly bottlenecked artificially by I/O limits (which the vendor does not disclose), and it is given excess compute for stability and "target deployment" reasons, so it is not indicative of the protocol's actual compute bottleneck.
It might be beneficial to include these details in the documentation, so that your benchmarks do not appear to show much worse performance than the protocol can actually deliver to a casual reader who does not know the internal structure of the benchmarked system. Alternatively, present a benchmark that is not artificially bottlenecked, or show the compute load of the bottlenecked implementation, to demonstrate the actual performance limits of the protocol.
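To illustrate what I mean by showing compute load, here is a minimal sketch, not your actual harness. It reports CPU time next to wall-clock time for a run, so a reader can see when the run is mostly waiting rather than computing. The names `cpu_utilization`, `io_bound_stand_in` and `compute_bound_stand_in` are hypothetical, and the two workloads are stand-ins I made up; they are not the benchmarked system.

```python
import time


def cpu_utilization(workload):
    """Run workload() once and return (wall_seconds, cpu_seconds, cpu_fraction).

    cpu_fraction is process CPU time divided by wall-clock time. A value well
    below 1.0 suggests the run spent most of its time waiting (for example on
    I/O) rather than computing.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    workload()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    fraction = cpu_seconds / wall_seconds if wall_seconds > 0 else 0.0
    return wall_seconds, cpu_seconds, fraction


def io_bound_stand_in():
    # Hypothetical stand-in for a rate-limited I/O path: the process only waits.
    time.sleep(0.2)


def compute_bound_stand_in():
    # Hypothetical stand-in for protocol compute: a plain busy loop.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


if __name__ == "__main__":
    for name, fn in [("io-bound", io_bound_stand_in),
                     ("compute-bound", compute_bound_stand_in)]:
        wall, cpu, frac = cpu_utilization(fn)
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s cpu_fraction={frac:.2f}")
```

Publishing a figure like `cpu_fraction` alongside the throughput numbers would make it obvious that the headline result reflects the I/O cap and not the protocol's compute cost.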