A string with a local buffer does have some benefits, but in some cases it can be worse than the regular implementation. It uses more memory, and move semantics, swapping, copying, default construction, etc. take a bit more work than usual. And that takes its toll.
There are definitely things that could be improved, and I might look into it further. I'm pretty sure the quick-and-dirty implementation I did is far from perfect.
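To make the trade-off concrete, here is a minimal sketch of a string with a local buffer. It is not the String2 implementation benchmarked below: the class name `SmallString`, its members, and the 23-character capacity are all hypothetical, chosen only to show where the extra memory and the extra work on a move come from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical small-buffer string for illustration; NOT Urho3D::String2.
class SmallString {
public:
    // Assumed capacity in characters, excluding the terminating NUL.
    static constexpr std::size_t kLocalCapacity = 23;

    SmallString() noexcept : data_(local_), length_(0) { local_[0] = '\0'; }

    explicit SmallString(const char* s) : SmallString() {
        assert(s != nullptr);
        length_ = std::strlen(s);
        if (length_ > kLocalCapacity)
            data_ = new char[length_ + 1];  // too long: fall back to the heap
        std::memcpy(data_, s, length_ + 1);
    }

    // A heap string is moved by stealing its pointer. A local string has to
    // copy its characters, because the buffer lives inside the source object.
    SmallString(SmallString&& other) noexcept : SmallString() {
        if (other.IsLocal()) {
            std::memcpy(local_, other.local_, other.length_ + 1);
        } else {
            data_ = other.data_;
            other.data_ = other.local_;
        }
        length_ = other.length_;
        other.length_ = 0;
        other.local_[0] = '\0';
    }

    // Copying and assignment are left out to keep the sketch short.
    SmallString(const SmallString&) = delete;
    SmallString& operator=(const SmallString&) = delete;
    SmallString& operator=(SmallString&&) = delete;

    ~SmallString() {
        if (!IsLocal())
            delete[] data_;
    }

    bool IsLocal() const noexcept { return data_ == local_; }
    std::size_t Length() const noexcept { return length_; }
    const char* CString() const noexcept { return data_; }

private:
    char* data_;                      // points at local_ or at a heap block
    std::size_t length_;
    char local_[kLocalCapacity + 1];  // the extra memory every instance pays for
};
```

With this layout, `SmallString("abcd")` never touches the heap, while a 24-character literal falls back to `new`. Moving a short string copies its characters, which is the extra work a pointer-only string avoids.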
For now, though, I'm starting to think that a string view coupled with the regular string implementation might work better. I'll have to try it and see.
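As a rough illustration of the string view idea, the sketch below uses `std::string_view` (C++17) as a stand-in. A view type for Urho3D would still have to be written, so `Stem` and `StemLength` are hypothetical helpers. The point is only that a view carries a pointer and a length, so taking a substring allocates nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Number of characters before the first '.', or the whole length if there
// is no '.'. Works on the caller's data without copying it.
inline std::size_t StemLength(std::string_view path) {
    const std::size_t dot = path.find('.');
    return dot == std::string_view::npos ? path.size() : dot;
}

// Returns the part before the first '.' as a view into the caller's buffer.
// The result is only valid while the original characters stay alive.
inline std::string_view Stem(std::string_view path) {
    const std::size_t length = StemLength(path);
    assert(length <= path.size());
    return path.substr(0, length);
}
```

The catch is lifetime: the owning string still has to outlive every view into it, so this complements the regular heap string rather than replacing it.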
Default construction (empty):
static void Heap(benchmark::State& state) {
    for (auto _ : state) {
        Urho3D::String s; // default-constructed, empty
        benchmark::DoNotOptimize(s);
    }
}
BENCHMARK(Heap);
static void Local(benchmark::State& state) {
    for (auto _ : state) {
        Urho3D::String2 s; // default-constructed, empty
        benchmark::DoNotOptimize(s);
    }
}
BENCHMARK(Local);
Output:
--------------------------------------------------
Benchmark           Time           CPU Iterations
--------------------------------------------------
Heap                1 ns          1 ns 1000000000
Local               2 ns          2 ns  407922793
Construction from a C string of 4 chars:
static void Heap(benchmark::State& state) {
    for (auto _ : state) {
        Urho3D::String s("abcd");
        benchmark::DoNotOptimize(s);
    }
}
BENCHMARK(Heap);
static void Local(benchmark::State& state) {
    for (auto _ : state) {
        Urho3D::String2 s("abcd");
        benchmark::DoNotOptimize(s);
    }
}
BENCHMARK(Local);
Output:
--------------------------------------------------
Benchmark           Time           CPU Iterations
--------------------------------------------------
Heap               34 ns         34 ns   21367384
Local               5 ns          5 ns  100000000
Construction from a C string of 18 chars (19 bytes including the terminating NUL):
static void Heap(benchmark::State& state) {
    for (auto _ : state) {
        Urho3D::String s("abcdefghijklmnefgh");
        benchmark::DoNotOptimize(s);
    }
}
BENCHMARK(Heap);
static void Local(benchmark::State& state) {
    for (auto _ : state) {
        Urho3D::String2 s("abcdefghijklmnefgh");
        benchmark::DoNotOptimize(s);
    }
}
BENCHMARK(Local);
Output:
--------------------------------------------------
Benchmark           Time           CPU Iterations
--------------------------------------------------
Heap               33 ns         33 ns   22435754
Local               6 ns          6 ns  125840759
Construction from a C string of 24 chars, which exceeds the local buffer capacity:
static void Heap(benchmark::State& state) {
    for (auto _ : state) {
        Urho3D::String s("abcdefghijklmnefghijklmn");
        benchmark::DoNotOptimize(s);
    }
}
BENCHMARK(Heap);
static void Local(benchmark::State& state) {
    for (auto _ : state) {
        Urho3D::String2 s("abcdefghijklmnefghijklmn");
        benchmark::DoNotOptimize(s);
    }
}
BENCHMARK(Local);
Output:
--------------------------------------------------
Benchmark           Time           CPU Iterations
--------------------------------------------------
Heap               33 ns         33 ns   18643076
Local              37 ns         37 ns   18696461
Copying arrays of strings:
typedef Urho3D::Vector<Urho3D::String> StringVec;
typedef Urho3D::Vector<Urho3D::String2> String2Vec;
// Not used by the benchmarks shown here.
static const unsigned sizes[] = {
    0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,
    32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63
};
static const char* cstr = "abcdefghijklmnefghijklmnabcdefghijklmnefghijklmnabcdefghijklmnefghijklmn";
static void Heap(benchmark::State& state) {
    StringVec v1, v2;
    v1.Resize(64);
    v2.Resize(64);
    // Fill v1 with strings of length 0 to 63.
    for (unsigned i = 0; i < 64; ++i)
        v1[i].Append(cstr, i);
    for (auto _ : state) {
        // Note: this assigns v2 (64 empty strings) to v1, so after the first
        // iteration both vectors hold only empty strings. Use `v2 = v1;` to
        // copy the filled strings instead.
        v1 = v2;
    }
}
BENCHMARK(Heap);
static void Local(benchmark::State& state) {
    String2Vec v1, v2;
    v1.Resize(64);
    v2.Resize(64);
    // Fill v1 with strings of length 0 to 63.
    for (unsigned i = 0; i < 64; ++i)
        v1[i].Append(cstr, i);
    for (auto _ : state) {
        // Same direction as in Heap: v2 (empty strings) is copied into v1.
        v1 = v2;
    }
}
BENCHMARK(Local);
Output:
--------------------------------------------------
Benchmark           Time           CPU Iterations
--------------------------------------------------
Heap              308 ns        310 ns    2361658
Local             409 ns        407 ns    1725827
Integer conversion:
static void Heap(benchmark::State& state) {
    for (auto _ : state) {
        Urho3D::String s(242554);
        benchmark::DoNotOptimize(s);
    }
}
BENCHMARK(Heap);
static void Local(benchmark::State& state) {
    for (auto _ : state) {
        Urho3D::String2 s(242554);
        benchmark::DoNotOptimize(s);
    }
}
BENCHMARK(Local);
Output:
--------------------------------------------------
Benchmark           Time           CPU Iterations
--------------------------------------------------
Heap              111 ns        111 ns    5608938
Local              68 ns         68 ns    8974301
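One plausible explanation for the gap, not verified against either implementation, is that a formatted integer is always short: a 32-bit value needs at most 11 characters, so the result fits in a local buffer and the final string needs no allocation. The sketch below works out that bound. `MaxDecimalChars` and `FormatInt` are hypothetical helpers, not Urho3D functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <limits>

// Upper bound on the characters needed to print any value of integer type T
// in base 10: digits10 + 1 digits, plus one for a minus sign if T is signed.
template <typename T>
constexpr std::size_t MaxDecimalChars() {
    return static_cast<std::size_t>(std::numeric_limits<T>::digits10) + 1 +
           (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// Formats a 32-bit integer into a caller-provided buffer that is exactly
// large enough for the worst case plus the terminating NUL.
// Returns the number of characters written, excluding the NUL.
inline int FormatInt(std::int32_t value,
                     char (&buffer)[MaxDecimalChars<std::int32_t>() + 1]) {
    const int written =
        std::snprintf(buffer, sizeof(buffer), "%d", static_cast<int>(value));
    assert(written > 0 && static_cast<std::size_t>(written) < sizeof(buffer));
    return written;
}
```

The benchmarked value 242554 needs only 6 characters, well under the 18 that the local buffer held without allocating in the earlier test.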
Environment:
- Windows 7 x64
- MinGW 7.2.0 x64 POSIX SEH
- Ryzen 5 1600X, single-threaded, @ 3.7 GHz