virtual function performance issue



  • @tampere2021
    It's hard to compare virtual vs. non-virtual function calls, mainly because a non-virtual function call can often be inlined, which means the call disappears completely. This also often leads to more optimization opportunities, e.g. by constant propagation.
    (In theory, virtual calls can also be inlined in some situations, but in practice you will not see this very often. Inlining of non-virtual functions, on the other hand, is very common.)

    Assuming that the non-virtual function call is really a call (no inlining), then a non-virtual function call - on most platforms - is just a direct call. Which is typically very fast.

    A virtual call on the other hand requires at least one pointer to be fetched from the object + an indirect function call. In a typical implementation, it's even more because there's typically at least one additional indirection.

    So 1 load + one indirect call for virtual (I originally wrote 2 loads, see the EDIT below) vs. just a direct call for non-virtual. The main problem there is that the load of the call target depends on the first load. This can noticeably slow down the call, especially if the actual function that is called changes.

    As long as the called function is always the same, modern CPUs will often remember and predict the correct target address & speculatively execute that piece of code. At least when compiled without Spectre-mitigation 🙂

    EDIT: Seems I mis-remembered and the 2nd load isn't necessary. Hm. Strange. But the generated code doesn't lie 🙂
    (Of course there is a second load, because the indirect call is loading the target address from memory, but that's not what I meant.)
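
To make the "one load + one indirect call" description concrete, here is a minimal hand-written sketch of the mechanism. It is not what any particular compiler or ABI actually emits; `ShapeVTable`, `Square`, `Rect` and `call_area` are hypothetical names made up for illustration, and the layout (pointer to a per-class table as the first member) is only the typical one described above.

```cpp
#include <cassert>

// Hypothetical stand-in for a compiler-generated vtable: one slot per virtual function.
struct ShapeVTable {
    int (*area)(const void* self);
};

// Each "class" starts with the hidden table pointer a compiler would typically add.
struct Square {
    const ShapeVTable* vptr;
    int side;
};

struct Rect {
    const ShapeVTable* vptr;
    int w, h;
};

int square_area(const void* self) {
    const Square* s = static_cast<const Square*>(self);
    return s->side * s->side;
}

int rect_area(const void* self) {
    const Rect* r = static_cast<const Rect*>(self);
    return r->w * r->h;
}

// One table per class, shared by all objects of that class.
const ShapeVTable square_vtable = {&square_area};
const ShapeVTable rect_vtable = {&rect_area};

// Emulated virtual call: load the table pointer from the object,
// then call indirectly through the slot (which reads the target address).
int call_area(const void* obj) {
    // Both structs are standard-layout and begin with the table pointer.
    const ShapeVTable* vt = *static_cast<const ShapeVTable* const*>(obj);
    return vt->area(obj);
}

// Non-virtual counterpart: the target is known at compile time.
int direct_area(const Square& s) { return square_area(&s); }

// Small usage example.
void vtable_demo() {
    Square s{&square_vtable, 3};
    Rect r{&rect_vtable, 2, 5};
    assert(call_area(&s) == 9);   // dispatched through square_vtable
    assert(call_area(&r) == 10);  // dispatched through rect_vtable
    assert(direct_area(s) == 9);  // direct call, trivially inlinable
}
```

In `call_area` the compiler cannot see which function will run, so it cannot inline it; in `direct_area` it can, which is the difference the post stresses.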


  • Banned

    @hustbaer Hi hustbaer, do you know why virtual functions are slower compared to statically bound calls? Which of the reasons below do you agree with? Based on the discussion here, I feel it is option 3.
    1)The virtual function call has to search for the correct class binding
    2)The class must use memory to maintain a table of virtual function pointers
    3)The virtual function call requires the run time type of the owning class to be identified
    4)The virtual function call requires a table lookup at runtime before calling.



  • @tampere2021
    You're quite persistent in trying to get people to pick one of those options. Which means you want to know this for some kind of test/assignment/... Which means I won't answer. Also the choices are worded in a strange way, a bit ambiguous. Many of them could be interpreted in a way so that the answer is "yes".


  • Banned

    @hustbaer I have summarized these points based on the discussion so far. It's not a test or an assignment. The code base is huge, and I am trying to narrow down the root cause to something specific, so I need clarification on this.



  • @VLSI_Akiko said in virtual function performance issue:

    Yeah, the costly part is the one where the runtime has to figure out which instance of the virtual function needs to be called.

    I'm pretty sure you know what you're talking about, but you describe it in a strange way. I would never say the runtime needs to "figure that out". There is no "runtime" involved, just a few machine code instructions. Also there is nothing that I would call "figuring out". One load and one indirect jump and Bob's your uncle. Of course that's still enough to make things slow, especially when the call target changes frequently.


  • Banned

    @hustbaer It takes longer to invoke a virtual method, and extra memory is required to store the information needed for the lookup. Virtual function calls must be resolved at run time by performing a vtable lookup, whereas non-virtual function calls can be resolved at compile time. This can make virtual function calls slower than non-virtual calls. In reality, this overhead may be negligible, particularly if our function does non-trivial work or if it is not called frequently.



  • @tampere2021 said in virtual function performance issue:

    @hustbaer It takes longer to invoke a virtual method, and extra memory is required to store the information needed for the lookup.

    Well, yes, kind-of. Yes, it takes memory. But usually not much. Each object typically only grows by the size of one pointer. (More if the class uses multiple inheritance.)
    And then, one vtable is needed per class. If you have lots of classes with lots of virtual functions, this can also add up to a lot. But it's usually not a big problem.
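
The per-object cost can be checked with `sizeof`. The sketch below uses made-up types (`Plain`, `Poly`, `PolyMany`); the exact numbers depend on compiler, ABI and padding, so it only checks the relations that hold on typical implementations.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration only.
struct Plain {
    int value;
    int get() const { return value; }
};

struct Poly {
    int value;
    virtual ~Poly() = default;
    virtual int get() const { return value; }
};

// More virtual functions enlarge the per-class table, not each object.
struct PolyMany {
    int value;
    virtual ~PolyMany() = default;
    virtual int get() const { return value; }
    virtual int twice() const { return 2 * value; }
    virtual int negate() const { return -value; }
};

// Extra bytes per object caused by making the class polymorphic
// (table pointer plus any padding the implementation adds).
std::size_t per_object_overhead() { return sizeof(Poly) - sizeof(Plain); }

// Assumptions that hold on common implementations, not guaranteed by the standard.
void size_demo() {
    assert(sizeof(Poly) > sizeof(Plain));      // object grew
    assert(sizeof(Poly) >= sizeof(void*));     // room for one table pointer
    assert(sizeof(PolyMany) == sizeof(Poly));  // extra virtuals cost nothing per object
}
```

Printing `per_object_overhead()` on the actual target shows how much each object really grows there.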

    Virtual function calls must be resolved at run time by performing a vtable lookup, whereas non-virtual function calls can be resolved at compile time.

    Yes

    This can make virtual function calls slower than non-virtual calls. In reality, this overhead may be negligible, particularly if our function does non-trivial work or if it is not called frequently.

    Yes. I'd even go as far as to say: it's very often negligible.

    But again: this comparison only makes sense when comparing to non-inlined direct calls. Whenever you see a noticeable slowdown from making a function virtual, chances are that the reason isn't actually the overhead of the virtual function call, but the fact that it can no longer be inlined.



  • Just wanted to throw in a broader argument: If the problem you're trying to solve really requires a solution akin to dynamic polymorphism, C++ virtual functions are probably one of the most efficient ways to go about it. In that case, you will have to pay the additional cost either way, likely in any language - e.g. in C you would probably solve that problem with a table of function pointers, which is roughly equivalent.

    If you can somehow avoid that overhead, either via the compiler's devirtualization optimization or by restructuring your program around static types (static polymorphism like CRTP), chances are that you didn't really need dynamic polymorphism in the first place.

    So if you are worried about the performance impact of virtual function calls, try to establish whether they are really necessary or whether an alternative would also work. If not, I'd assume you cannot avoid that overhead anyway. My suggestion would be: don't use what you don't need, and in return C++ won't make you pay for what you don't use (mostly) 😉



  • @hustbaer said in virtual function performance issue:

    @VLSI_Akiko said in virtual function performance issue:

    Yeah, the costly part is the one where the runtime has to figure out which instance of the virtual function needs to be called.

    I'm pretty sure you know what you're talking about, but you describe it in a strange way. I would never say the runtime needs to "figure that out". There is no "runtime" involved, just a few machine code instructions. Also there is nothing that I would call "figuring out". One load and one indirect jump and Bob's your uncle. Of course that's still enough to make things slow, especially when the call target changes frequently.

    Yeah sorry, I'm used to talking to people who have never programmed in their life or only wrote a few lines of BASIC. From time to time I also teach C++20 to people who must learn it (to keep their jobs) but really don't want to learn it. You would be surprised how hard it is to explain pointers to them, or even better, what a pointer to an array of pointers is (you know, the char ** in main()). That can keep them busy for weeks. For some strange reason people understand it better if you put an entity behind things.

    Back to the topic: Is the code compiled in debug mode, or does it even run under some kind of analyzer (profiler, debugger, etc.)? Which OS and compiler are used?


  • Banned

    @VLSI_Akiko I used VS2019 (v142 compiler) on a 21H1 OS with a Release build.



  • @tampere2021 said in virtual function performance issue:

    @VLSI_Akiko I used VS2019(v142 compiler) on 21H1 OS with Release build.

    What would also be really useful is a minimal working example which exhibits the virtual function slowdown you observed. I bet there are quite a few people here who would have a go at analyzing and optimizing it. To be honest, I am not fully convinced that the slowdown is really caused by virtual function calls, as their effect is often so minuscule that it can be really hard to measure.
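
As a starting point for such a minimal example, a rough timing harness could look like the sketch below. All names (`Base`, `AddOne`, `run_virtual`, `run_direct`) are hypothetical and unrelated to the original poster's code, and the timings from a loop this simple are easily distorted by the optimizer, so treat the numbers as a hint at best and inspect the generated code.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

struct Base {
    virtual ~Base() = default;
    virtual std::uint64_t step(std::uint64_t x) const = 0;
};

struct AddOne : Base {
    std::uint64_t step(std::uint64_t x) const override { return x + 1; }
};

// Non-virtual counterpart doing the same trivial work.
inline std::uint64_t step_direct(std::uint64_t x) { return x + 1; }

struct Result {
    std::uint64_t sum;  // returned so the work cannot be discarded entirely
    double ms;          // elapsed wall-clock time in milliseconds
};

// Runs `body` once and measures it with a monotonic clock.
template <typename F>
Result time_it(F&& body) {
    const auto t0 = std::chrono::steady_clock::now();
    const std::uint64_t sum = body();
    const auto t1 = std::chrono::steady_clock::now();
    return {sum, std::chrono::duration<double, std::milli>(t1 - t0).count()};
}

Result run_virtual(const std::vector<std::unique_ptr<Base>>& objs, int rounds) {
    assert(rounds >= 0);
    return time_it([&]() -> std::uint64_t {
        std::uint64_t acc = 0;
        for (int r = 0; r < rounds; ++r)
            for (const auto& o : objs) acc = o->step(acc);  // virtual call
        return acc;
    });
}

Result run_direct(std::size_t n, int rounds) {
    assert(rounds >= 0);
    return time_it([&]() -> std::uint64_t {
        std::uint64_t acc = 0;
        for (int r = 0; r < rounds; ++r)
            for (std::size_t i = 0; i < n; ++i) acc = step_direct(acc);  // direct, inlinable
        return acc;
    });
}
```

Fill a `std::vector<std::unique_ptr<Base>>` with objects, call both functions with the same element count and number of rounds, and compare `ms` while checking that both `sum` values agree. Mixing several derived types in the vector would additionally exercise the "call target changes" case mentioned earlier in the thread.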



  • @Finnegan said in virtual function performance issue:

    To be honest, I am not fully convinced that the slowdown is really caused by virtual function calls, as their effect is often so minuscule that it can be really hard to measure.

    I'm not convinced there is any actual code. To date, @tampere2021 only asked questions that very much smell like exam/homework questions.

    If there really is actual code, and if it shows a considerable slowdown because of the use of virtual functions, I'd be happy to take a look at it as well.


  • Banned

    @hustbaer I can't share the complete code here, but I have done some optimizations in the code. I know that a few virtual calls aren't going to massively slow down a codebase. However, my code was a mass of tiny virtual calls for trivial operations. Parts of my code were iterating over every entity with at least three or four chained virtual function calls each and every time! In my new code, I cut out all the virtual methods completely, sticking to plain C++ method calls only. I inlined a lot of these and could see some difference. I also made the following change: I used member initializer lists to avoid default construction of contained objects. I saw that inlining removes the function call overhead and speeds up execution.
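
The initializer-list change mentioned above can be illustrated with a small sketch. The types (`Counted`, `AssignInBody`, `InitList`) are made up and not taken from the poster's code base; the counter only exists to make the extra default construction visible.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical member type that counts how often it is default-constructed.
struct Counted {
    static int default_ctor_calls;
    std::string name;
    Counted() { ++default_ctor_calls; }
    explicit Counted(std::string n) : name(std::move(n)) {}
};
int Counted::default_ctor_calls = 0;

// Assigning in the constructor body: `part` is default-constructed first,
// then overwritten by a temporary.
struct AssignInBody {
    Counted part;
    explicit AssignInBody(const std::string& n) { part = Counted(n); }
};

// Member initializer list: `part` is constructed once, directly from `n`.
struct InitList {
    Counted part;
    explicit InitList(const std::string& n) : part(n) {}
};

// Small usage example.
void init_list_demo() {
    Counted::default_ctor_calls = 0;
    AssignInBody a("x");
    assert(Counted::default_ctor_calls == 1);  // one wasted default construction
    InitList b("y");
    assert(Counted::default_ctor_calls == 1);  // none added by the initializer list
    assert(a.part.name == "x" && b.part.name == "y");
}
```

For cheap members the difference is small; it matters more when the contained object's default constructor allocates or does other real work.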

