I have some thoughts on how reference counting is implemented.
As of now, every object in Urho3D that inherits from RefCounted (so nearly everything in the engine and most of the objects in a user’s application) costs two calls to malloc: one to allocate the object itself and another to allocate its RefCount object: https://github.com/urho3d/Urho3D/blob/master/Source/Urho3D/Container/RefCounted.cpp#L33
The question here is: does calling malloc twice have a measurable impact on performance? Or are there other costs that outweigh the cost of allocating?
If this is a concern, then I have two ideas. The first is to inline refs_ and weakRefs_ into RefCounted and get rid of RefCount entirely. For weak references to keep working, the conditions for destroying and freeing the object would have to change: call the destructor when refs_ reaches zero, and free the memory only when weakRefs_ also reaches zero. The advantages are that there is only a single malloc call and that the refcounts sit right next to the object, which improves cache locality. The disadvantages are the additional delete logic, having to overload operator new, and the fact that the object's memory stays allocated for as long as any weak reference still points to it.
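To make the first idea more concrete, here is a rough sketch of what I have in mind. It is not Urho3D code: InlineRefCounted, CountHeader, Probe and destroyedCount are all made-up names. It also deviates slightly from what I described: the counters sit in a small header directly in front of the object (inside the same malloc block) rather than as data members, so nothing reads members of an object whose destructor has already run. It assumes single-threaded use and that the InlineRefCounted base sits at offset 0 of the allocated object (no multiple inheritance tricks).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter header, placed directly in front of the object.
// The alignas keeps the object that follows it suitably aligned.
struct alignas(std::max_align_t) CountHeader
{
    int refs = 0;
    int weakRefs = 0;
};

// Hypothetical replacement for RefCounted: one malloc for counters + object.
class InlineRefCounted
{
public:
    virtual ~InlineRefCounted() = default;

    static void* operator new(std::size_t size)
    {
        void* block = std::malloc(sizeof(CountHeader) + size);
        if (!block)
            throw std::bad_alloc();
        new (block) CountHeader(); // both counters start at zero
        return static_cast<char*>(block) + sizeof(CountHeader);
    }

    // Only meant for the "constructor threw" path; normal teardown goes
    // through ReleaseRef()/ReleaseWeak().
    static void operator delete(void* ptr)
    {
        std::free(static_cast<char*>(ptr) - sizeof(CountHeader));
    }

    // Assumption: this base subobject is at offset 0 of the allocation.
    CountHeader* Counts()
    {
        return reinterpret_cast<CountHeader*>(
            reinterpret_cast<char*>(this) - sizeof(CountHeader));
    }

    void AddRef() { ++Counts()->refs; }

    void ReleaseRef()
    {
        CountHeader* counts = Counts();
        assert(counts->refs > 0);
        if (--counts->refs == 0)
        {
            // Temporary weak ref keeps the block alive while the destructor runs.
            ++counts->weakRefs;
            this->~InlineRefCounted(); // virtual, so the derived destructor runs
            ReleaseWeak(counts);
        }
    }

    // A weak pointer would store the CountHeader* and call these two.
    static void AddWeak(CountHeader* counts) { ++counts->weakRefs; }

    static void ReleaseWeak(CountHeader* counts)
    {
        assert(counts->weakRefs > 0);
        // Free only once the object is destroyed AND no weak refs remain.
        if (--counts->weakRefs == 0 && counts->refs == 0)
            std::free(counts);
    }
};

// Tiny demo type that records when its destructor runs.
int destroyedCount = 0;

struct Probe : InlineRefCounted
{
    ~Probe() override { ++destroyedCount; }
};
```

A weak pointer checks counts->refs == 0 to see whether the object has expired. The downside I mentioned is visible here: the whole block, including the dead object's storage, stays allocated until the last ReleaseWeak.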
Another idea might be to have a memory pool for RefCount objects. This would improve allocation speed, but the refcounts would still be located far away from the object in memory, which is bad news for the cache whenever you modify them.
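For the pool idea, something along these lines is what I am picturing. Again just a sketch under my own assumptions: RefCountPool is a made-up class, the RefCount struct here is a stand-in with only the two counters, the chunk size of 256 is arbitrary, and there is no thread safety.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Stand-in for the engine's RefCount: just the two counters.
struct RefCount
{
    int refs_ = 0;
    int weakRefs_ = 0;
};

// Hypothetical fixed-size free-list pool for RefCount objects.
class RefCountPool
{
public:
    explicit RefCountPool(std::size_t slotsPerChunk = 256)
        : slotsPerChunk_(slotsPerChunk)
    {
        assert(slotsPerChunk_ > 0);
    }

    RefCountPool(const RefCountPool&) = delete;
    RefCountPool& operator=(const RefCountPool&) = delete;

    // Pops a slot off the free list and constructs a zeroed RefCount in it.
    RefCount* Allocate()
    {
        if (!freeList_)
            Grow();
        Slot* slot = freeList_;
        freeList_ = slot->next;
        return new (static_cast<void*>(slot)) RefCount();
    }

    // Returns the slot to the free list. RefCount is trivially destructible,
    // so there is no destructor to call.
    void Free(RefCount* rc)
    {
        freeList_ = new (static_cast<void*>(rc)) Slot{freeList_};
    }

private:
    // A slot is either a free-list link or storage for one RefCount.
    union Slot
    {
        Slot* next;
        alignas(RefCount) unsigned char storage[sizeof(RefCount)];
    };

    // One heap allocation per chunk instead of one per RefCount.
    void Grow()
    {
        chunks_.push_back(std::make_unique<Slot[]>(slotsPerChunk_));
        Slot* chunk = chunks_.back().get();
        for (std::size_t i = 0; i < slotsPerChunk_; ++i)
        {
            chunk[i].next = freeList_;
            freeList_ = &chunk[i];
        }
    }

    std::size_t slotsPerChunk_;
    Slot* freeList_ = nullptr;
    std::vector<std::unique_ptr<Slot[]>> chunks_;
};
```

The RefCounted constructor would call Allocate() instead of new, and the last reference going away would call Free(). Allocation becomes a couple of pointer operations, but the counters still end up in a different cache line than the object they belong to.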
I’d like to hear your thoughts. Maybe this whole thing is also not an issue.