About add-ons optimization
Hey, just thought I'd ask for some good tips for optimizing my add-ons. Any really obvious things I should do/ tools I should use?
So far what I've been doing is focusing on functions that run frequently (OnUpdate, UNIT_AURA, log events) and removing all calls to global functions/ variables, plus making sure no variables are created in that scope. The only tool I have for it is WowGlobalFinder, and, err, the game's default display of the 3 add-ons that take up the most memory. What other techniques and add-ons should I know about? I can only assume there are a bunch of tips and tricks for working with strings and tables that I'm not aware of... |
In descending order, I'd say these should be your areas of focus:
1. Make sure you are not leaking any globals, and reconsider the necessity of any intentional globals, including the names of frames and other objects. Your main addon object and/or main display frame are good candidates for global names; other objects generally don't need names unless you're using Blizzard templates that require them.
2. Upvalue any globals you're reading in frequently called functions, like OnUpdate, CLEU, UNIT_AURA, etc. Don't bother upvaluing globals you only read infrequently, such as in response to the user changing an option, the player leaving combat, etc.
3. Avoid using OnUpdate scripts for anything other than updating a visual display, such as a statusbar, and move as much of that work as you can to other places, such as event handlers. For example, if you're showing a timer bar for a buff on the player, don't call UnitAura each time your OnUpdate script runs -- store the info you need in variables, and update those variables when UNIT_AURA fires for the player unit instead. Depending on what you're updating, you may be able to offload some/all of the work into C code by using animations.
4. Avoid calling functions when you can, as it's really slow. Some common examples of unnecessary function calls include:
(a) using local x = select(2, ...) instead of local _, x = ...
(b) using string.format("raid%d", i) instead of "raid"..i
(c) using for i, v in ipairs(tbl) do instead of for i = 1, #tbl do local v = tbl[i]
5. Avoid creating new tables when you can reuse existing tables. In keeping with #4, don't use wipe unless you actually need to -- for example, if you have a table that's storing info about a buff obtained via UnitAura, you don't need to wipe it, since the keys don't change, and you just overwrite the values.
6. Keep variables limited to the narrowest scope necessary. For example, if a variable is only used inside a loop, define it locally inside the loop, not outside.
7. Call string functions directly, eg. gsub("banana", "%l", strupper, 1) instead of ("banana"):gsub("%l", strupper, 1) -- both incur the cost of a function call, but the second also incurs the costs of getting a metatable and performing a table lookup. If you upvalue gsub then the direct-call method is enormously faster, but even as a global lookup it's still slightly faster. The only time I'd recommend using the meta-method is when you're chaining multiple string operations; in that case, it's still the slower way to do it, but the increased readability of your code is usually worth it.
If you want more specifics, post your code. |
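To make tip 4 concrete, here is a minimal plain-Lua sketch of the three substitutions. The function names and the table contents are made up for illustration; each pair produces the same result, and the second form simply avoids a function call.

```lua
-- (a) Taking the second vararg: select() is a function call,
-- the multiple assignment is not.
local function secondViaSelect(...)
    local x = select(2, ...)
    return x
end

local function secondViaDiscard(...)
    local _, x = ...
    return x
end

-- (b) Building a unit token: concatenation instead of string.format.
local viaFormat = string.format("raid%d", 7)
local viaConcat = "raid" .. 7

-- (c) Walking an array: numeric for instead of ipairs.
local tbl = { 10, 20, 30 }
local sumIpairs, sumNumeric = 0, 0
for i, v in ipairs(tbl) do
    sumIpairs = sumIpairs + v
end
for i = 1, #tbl do
    local v = tbl[i]
    sumNumeric = sumNumeric + v
end
```

Note that the two loops in (c) are only interchangeable for arrays without holes, since ipairs stops at the first nil value.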
No, I don't think you want me to post the thousands of lines of all my add-ons :p
Good tips, I especially like 5 (because I haven't thought of it at all). However, 4 surprised me a bit. When I was doing my testing, defining a single number variable in OnUpdate had a very visible effect on my performance; calling a function (or twenty) didn't seem to have such an effect. |
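Since tip 5 came up, a small sketch of what reusing a table can look like. It runs in plain Lua; queryBuff is a hypothetical stand-in for a call like UnitAura and its return values are invented.

```lua
-- Hypothetical stand-in for an aura query: name, stack count, expiration.
local function queryBuff()
    return "Example Buff", 3, 120.5
end

-- Allocates a new table on every call; the previous one becomes garbage.
local function readIntoNewTable()
    local name, count, expires = queryBuff()
    return { name = name, count = count, expires = expires }
end

-- Reuses one table. The keys never change, so overwriting the values
-- is enough and no wipe is needed.
local buffInfo = {}
local function readIntoSharedTable()
    local info = buffInfo
    info.name, info.count, info.expires = queryBuff()
    return info
end

local first = readIntoSharedTable()
local second = readIntoSharedTable()
```

Both calls to readIntoSharedTable return the very same table, so nothing is left behind for the garbage collector between updates.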
Static memory usage is almost completely irrelevant, and I have nothing but contempt for whichever Blizzard developer thought it would be a good idea to put "how much memory your addons are using" into the default UI. Most users (and most addon developers) do not understand what those numbers mean, and having them displayed just leads to annoyed developers having to explain over and over that "lower memory usage" is not intrinsically good or a worthwhile goal.
The metrics you should be caring about -- the ones that actually matter -- are increasing memory usage (how quickly your addon's memory usage is growing) and CPU time. Those are the things that actually affect framerates. I use OptionHouse to measure them, but there are plenty of other options, including Broker plugins. Make sure you disable CPU profiling when you're done, as leaving it enabled will slow down everything across the board. This thread on WowAce from last year includes some benchmark tests I did showing the differences in speed (CPU time) for global lookups vs. upvalues, and different styles of calling string methods: http://forums.wowace.com/showthread....000#post322000 Also, you should just attach your files, or post a link to your addon download. It doesn't really matter whether it's 100 lines or 100,000 lines; any bad coding habits will be just as easy to scan through and see regardless of the size, and I have a lot of free time at work. :p |
Don't worry, I get that static memory bit. What I meant was, that display helped me see how SanityCheck was taking up rapidly increasing amounts of memory.
I wouldn't go as far as to flame the very idea of that display... my machine, for one, is old enough for me to care if I have several heavy add-ons loaded, even if they don't use much CPU. I doubt it was intended for this purpose; Blizzard, or any other sane organization, would not put that kind of tool on the main display of their game. Also, just to make sure we're on the same page: when you say "upvalue" you mean this sort of assignment, right? Lua Code:
Anyway, I will definitely get that OptionHouse and see about that old thread, thanks. And if you're truly bored enough, please have a look at this and this and maybe this. Do not under any circumstances look at the code of MooTrack, it was my first add-on over 50 lines and the code sucks immensely :p |
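For readers unsure about the term: the snippet from the post above is not preserved, so the following is only a sketch of what "upvaluing" usually means in this context, using standard Lua functions rather than WoW API calls. The handler name onUpdate is illustrative.

```lua
-- Copy the references once, at file scope.
local floor = math.floor   -- saves a global lookup plus a table index per call
local tostring = tostring  -- saves a global lookup per call

local lastText

-- Stands in for a frequently called handler such as an OnUpdate script.
local function onUpdate(remaining)
    -- Inside this function, floor and tostring are upvalues.
    lastText = tostring(floor(remaining))
end

onUpdate(12.7)
```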
Quote:
lua Code:
If we ignore, for the sake of the example, the fact that I need maxCPoints as an upper bound for the loop, wouldn't it still be better to define maxCPoints outside the for-loop, even though it is only needed inside? Wouldn't this spare me a global look-up for MAX_COMBO_POINTS and a garbage collection on every loop iteration? I don't know how GC works in loops, but if it collects on every iteration, would this be cheaper than having to move one scope up to find maxCPoints? The other question is whether it is better to use: Code:
cPoint:SetTexture(ns.colors.cpoints[i][1], ns.colors.cpoints[i][2], ns.colors.cpoints[i][3]) |
That's actually more specific of a situation, as your definition of maxCPoints is needed in all iterations in the for loop, it's best to keep it where it is instead of reinitializing the same value to the same local multiple times. This can impact performance to do so unnecessarily.
As a rule of thumb, if a variable is needed in all iterations of a loop instead of in each independent iteration, it's best to define it as an upvalue instead of inside the loop. As far as using unpack() versus manually indexing tables goes, it depends on how you're indexing the table for each return. At some point I would think unpack() starts to be more appealing, but I know of no data that shows at what point (if any) this happens. One can guess that unpack() does its indexing on the C side, but the other question is how that compares to manual indexing in a single assignment statement. To make the comparison fair, the manual indexing needs to be done on a single-dimensional table. For example, ns.colors.cpoints[i][1] uses 4 indexing operations for a single value, whereas storing the specific table in a local and then indexing that in your function call results in only 1 index operation per value. For example: Code:
local points=ns.colors.cpoints[i]; |
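A self-contained sketch of the two options being compared. The color data and the setTexture stub are invented so that it runs outside the game; only the shape of ns.colors.cpoints is taken from the post.

```lua
-- Hypothetical color data shaped like ns.colors.cpoints.
local ns = { colors = { cpoints = { { 0.9, 0.1, 0.1 }, { 0.1, 0.9, 0.1 } } } }

-- Stub that records its arguments; stands in for cPoint:SetTexture.
local calls = {}
local function setTexture(r, g, b)
    calls[#calls + 1] = { r, g, b }
end

local unpack = table.unpack or unpack -- Lua 5.2+ / Lua 5.1

local cpoints = ns.colors.cpoints     -- nested lookups done once
for i = 1, #cpoints do
    local color = cpoints[i]          -- one lookup per iteration
    setTexture(color[1], color[2], color[3]) -- one lookup per value
    setTexture(unpack(color))         -- same values via one function call
end
```

Which of the two calls is cheaper is exactly the open question in the post; the sketch only shows that they pass the same values.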
What is the difference between "needed in all iterations" and "needed in each independent iteration"? Please ignore the use of maxCPoints as the upper bound of the loop as this would be an obvious reason to define it locally in a higher scope as I also need it in each iteration. I thought that the point here would be that I assign a global (MAX_COMBO_POINTS) to maxCPoints, so that I would have a local variable definition, a global look-up and a garbage collect(??) for each loop iteration, where the global look-up is the most expensive part. So would the global look-up be reason enough to define maxCPoints outside the loop despite it being used only in the loop?
|
Quote:
A good rule of thumb is that the cost of defining a local vs indexing hits the break even point at 3 look-ups. So, if you need an indexed value 3+ times you are not hurting anything by creating a local reference and using it instead. That said, you do not need MAX_COMBO_POINTS within the loop at all. The calculation you are doing is static for each iteration so it only needs to be done once prior to the loop. Doing that also lets you get rid of the cPoint:GetWidth() index/call as you will already have that value. Code:
local AddComboPointsBar = function(self, width, height, spacing)
1. Moved the size calculations to outside the loop
2. Removed the names of the points to avoid clutter in _G
3. Changed the math a bit on sizing to make full use of the width passed
4. Used local references for indexes used in the loop (if you don't use the alpha color parameter remove the color[4] bit)
With that said, I can't imagine that function ever being called enough times to have ever warranted more than a quick pass for optimization. You only really need to scrutinize code for heavily called stuff like OnUpdate scripts and certain event handlers (CLEU, UNIT_AURA, etc...). |
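The rewritten function itself is not preserved here, so the following is not that code, only a plain-Lua sketch of the first point (doing the static size calculation once, before the loop). The helper name layoutSegments and its parameters are assumptions for the example.

```lua
-- Hypothetical layout helper: returns the width of each of `count`
-- segments that together fill `width`, separated by `spacing`,
-- plus the x offset of every segment.
local function layoutSegments(width, spacing, count)
    -- Loop-invariant values: computed once, before the loop.
    local segmentWidth = (width - spacing * (count - 1)) / count
    local step = segmentWidth + spacing

    local offsets = {}
    for i = 1, count do
        -- Only the part that depends on i stays inside the loop.
        offsets[i] = (i - 1) * step
    end
    return segmentWidth, offsets
end

local segmentWidth, offsets = layoutSegments(108, 2, 5)
```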
You are totally right, Vrul, thank you for your proposal, I'm going to make the changes. The loop iterates only 5 times and the function is called only once for the target frame in my oUF layout, so yes, it doesn't need that much of optimization. The global names are so I could debug frames and textures positioning with /fstack (had hard time with those because of wrong parenting and overuse of SetFrameLevel and stuff). So yes, it is overall a bad example for the need of optimization, I just wanted to show some real code as it raised questions for me and I could apply the answers elsewhere.
So, assuming I cannot move the calculation involving MAX_COMBO_POINTS, it's better to define it outside the loop so that I spare the global look-up on every loop iteration. If it were a local function call, a local table look-up or a literal, I should declare the variable for it in the loop. Is this the right way to sum it up? |
To illustrate:
Lua Code:
|
Quote:
Code:
local unit
Code:
for i = 1, 4 do
Code:
local x = 4 |
Phanx: Sorry, what I meant to say was not that I don't understand the statement, but that I don't understand the reasoning behind it, which appears counter-intuitive to me.
If we take your first and second examples, what do you gain by moving the variable into the loop? It seems like you just define 4 variables instead of just 1, which means more CPU time spent on creating the variable and more work for the GC. |
I won't comment too much on the performance aspect, but there's probably nothing to gain by doing it that way. It's possible that there's some minuscule performance loss, but it'll never ever be something that anyone will notice, much less worry about. Assigning the value to the variable is the significant portion of the code, not the declaration of the variable.
Anyway, the real reason is readability and consistency. If you have a variable that's only relevant in one iteration of a loop, it has no place outside of that loop. If you have a variable that doesn't change for the entire loop however, it makes sense to define that variable and assign the value only once. |
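A short sketch of that distinction in plain Lua. The unit names and the prefix are made up for the example.

```lua
local units = { "party1", "party2", "party3", "party4" }
local prefix = "unit:"        -- constant for the whole loop: defined once, outside

local labels = {}
for i = 1, #units do
    local unit = units[i]     -- only meaningful for this iteration
    labels[i] = prefix .. unit
end
-- `unit` is not visible here; it went out of scope with the loop body.
```

In the reference Lua implementation a local is a slot in the function's stack frame rather than a heap object, so declaring it inside the loop does not by itself create work for the garbage collector; whatever value gets assigned to it is what may or may not allocate.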
If an addon has A.lua (at the top of the toc file), B.lua
A.lua: Code:
local addonName, addon = ...
Code:
local addonName, addon = ... |
That's an interesting one, but is it really worth it? You add everything from _G you use in your addon to the addon namespace, even if you only access it once. An upvalue is still faster than the non-global environment and you lose the ability to access _G directly. Or am I wrong about it?
|
That looks like a giant cluster**** and I would not recommend doing anything like it. If you need functions from one of your addon's files in another, just put them in the namespace table and call them as methods.
Also, that's not really relevant to this thread. If you want to discuss using custom function environments, please take that discussion to its own thread. |
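For reference, a sketch of the namespace-table approach recommended above. In the game every file of an addon receives the same name and table through `...`; here two ordinary functions simulate the two files so the example runs in plain Lua. "MyAddon" and the Round helper are placeholders.

```lua
-- Simulates A.lua: publish a helper on the shared namespace table.
local function fileA(addonName, addon)
    function addon:Round(value)
        return math.floor(value + 0.5)
    end
end

-- Simulates B.lua: call the helper as a method of the namespace table.
local function fileB(addonName, addon)
    addon.result = addon:Round(2.6)
end

local namespace = {}
fileA("MyAddon", namespace)
fileB("MyAddon", namespace)
```

Nothing is written to the global table, and no function environment has to be changed.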
This is used to keep your variables from polluting _G, and to gain some code flexibility. Also, the upvalue is not as quick as you think.
Here is a test. The a() function contains a long calculation, and after it there are three call types: 1. Define the function as a global and call the global function directly. 2. Define upvalues, then use the upvalues to "speed up" the calculation. 3. Use the standalone environment to do the calculation. Lua Code:
The results may not be exact, because too many other things are using the CPU at the same time, but you can still see a lot from them. I ran this code three times in Lua 5.1.2 in a Mac shell: Quote:
Quote:
Quote:
In an addon, you don't need access to everything in _G; only what you need is saved to your addon. If you want to access something once, you can always do it like: print(_G.CAT_FORM) Just access it from the _G table, and CAT_FORM won't be saved in your addon. |
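The benchmark code and its results from the post above are not preserved. As a rough idea of how such a comparison can be set up, here is a minimal timing harness in plain Lua that covers only the global-versus-upvalue case; the function names and the iteration count are arbitrary, and the timings will differ between machines and runs.

```lua
local clock = os.clock

-- Runs fn `iterations` times and returns the elapsed CPU time in seconds.
local function benchmark(fn, iterations)
    local start = clock()
    for _ = 1, iterations do
        fn()
    end
    return clock() - start
end

-- Variant 1: each call looks up math in the global table, then sqrt in math.
local function viaGlobal()
    return math.sqrt(2)
end

-- Variant 2: sqrt is resolved once and captured as an upvalue.
local sqrt = math.sqrt
local function viaUpvalue()
    return sqrt(2)
end

local iterations = 1000000
local globalTime = benchmark(viaGlobal, iterations)
local upvalueTime = benchmark(viaUpvalue, iterations)
print(string.format("global: %.3fs  upvalue: %.3fs", globalTime, upvalueTime))
```

Running it several times and comparing the spread is more informative than any single pair of numbers.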
Oh, one more thing about memory usage: upvalues cost much more if you use the local vars in many functions in your add-on. Here is a little test:
Lua Code:
So, here is the result: Quote:
|
Quote:
The point of keeping variables in their relevant scope is akin to only allocating memory when you need to use it and proper cleanup when you don't in other languages. All this is done behind the scenes with Lua's garbage collector, but it's still a good practice to follow nonetheless. These so-called "good programming practices" make it easy and simple to program reliable code regardless of which language you're using. The code posted by Rainrider and Phanx are both correct, although the situations of the variables in question by each are completely different. |
Quote:
While it's obvious that, code-wise, a variable should be defined in the smallest scope possible, I think that that is not a tip that belongs in a discussion about optimization :p |
So is it worth upvaluing globals now, or not?
|
Quote:
If you're accessing the global in response to the user checking a box in your options panel, or in response to an event like PLAYER_LOGIN or PLAYER_REGEN_DISABLED that doesn't fire very often, then while you will technically see a speed improvement by upvaluing it, in practical terms there is no value in doing so, so I'd recommend you keep your code tidy and not clutter it up with a bunch of practically useless upvalues. |
Quote:
But, without seeing the actual code in question, it's really pointless to talk about it. |
Well, if you need a var in OnUpdate or in something that happens frequently, I suggest you try a coroutine. Here is a test example:
1. The first part uses a local var 'sum' to hold the sum result, and we call it 10000 times. 2. The second part uses a coroutine to keep everything, and we also call it 10000 times. Lua Code:
The result is : Quote:
Lua Code:
So, when your frame is visible, the thread will be called again and again; when it is not, the thread will be stopped until the frame is shown again. But if your handler code is tiny, just use an upvalue. In the previous example, if you change the code Lua Code:
Lua Code:
The result should be Quote:
|
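The test code from the post above is not preserved. The following is only a sketch, in plain Lua, of the two variants it describes: a running sum kept in an upvalue, and the same sum kept in a local inside a coroutine that is resumed once per value. It shows that both give the same result and makes no claim about which is faster.

```lua
-- Variant 1: the running total lives in an upvalue of a closure.
local sum = 0
local function addViaUpvalue(value)
    sum = sum + value
    return sum
end

-- Variant 2: the running total lives in a local inside a coroutine
-- that is resumed once per value.
local addViaCoroutine = coroutine.wrap(function(value)
    local total = 0
    while true do
        total = total + value
        value = coroutine.yield(total)
    end
end)

local a, b
for i = 1, 10000 do
    a = addViaUpvalue(i)
    b = addViaCoroutine(i)
end
```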
Please stop derailing this "basic tips for optimization" thread with your posts about coroutines and custom function environments. If you want to post tutorials on those subjects, please do it in a new thread.
|
I find his stuff very interesting to be honest.
1. Don't do premature optimization. 2. Don't sacrifice readability for negligible optimizations. That being said, here's a PDF on performance tips for Lua written by its lead architect. http://www.lua.org/gems/sample.pdf |
Quote:
I also take back the words 'the upvalue is the slowest'. I'm confused about the result too: I redid the test several times today, and only once was the upvalue slower than the others, and the difference between them is small. I prefer the custom environment just because, after some time, it will hold everything the addon needs; from that point on, the custom environment table will be stable compared to _G. |
On the subject of declaring variables in their most confining scope, the following should be noted:
Lua Code:
Code:
-- Compiles into Lua Code:
Code:
-- Compiles into Lua Code:
Code:
-- Compiles into
When doing a large number of variable declarations, the following should also be noted (I'll leave the compiled versions out since I don't want to pollute the thread): Lua Code:
If you really want to optimize your addon, you need to look at compiled code and understand how function calls/lua stack works. |
This is still deviating from the point Phanx and I were making. Phanx is stating as a general rule that locals should stay in the tightest scope possible. My posts state that like most rules, there are some exceptions over the argument whether locals should stay in or out of loops.
Among these is the point that when dealing with constants or CPU-intensive calculations that don't change in the loop, you're best left upvaluing them instead of having the loop reinitialize the variable with the same value multiple times. |
Yep, you guys are definitely missing the point. This is a thread about simple, general tips that don't require a lot of coding experience or deep knowledge of how Lua works internally -- it's not meant to cover every possible scenario, and it's not meant to delve into complicated schemes for extreme optimization of every single CPU cycle. I'm tired of asking, but if you want to provide lengthy benchmarking results and tutorials covering every possible exception to the rule, or extreme micro-optimization, please start your own threads for that stuff.
|
Quote:
p.s. If you are really curious, the test results with your suggestion is : here |
Quote:
Lua Code:
Tested it ingame: 7980 ms 8073 ms 8664 ms Also it's pretty fast with integers only, much slower with strings/other variables. ~8500 ms ~600 ms gain on a million calls. |
Quote:
I mean, in the example below (from a little unpublished addon of mine), the function AlkaT_SpORCo_update() is called only once every 0.25 seconds, and it is in this function that I call two global functions. If I understood correctly, upvaluing these two functions would bring no benefit at all, right? Lua Code:
|
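The snippet from the post above is not preserved. As an illustration of the throttling pattern it describes (doing the real work only once every 0.25 seconds), here is a sketch that runs in plain Lua; doUpdate is a placeholder for AlkaT_SpORCo_update, and the frame loop at the end merely simulates the game calling the script.

```lua
local UPDATE_INTERVAL = 0.25
local timeSinceLastUpdate = 0
local updates = 0

-- Placeholder for the expensive work.
local function doUpdate()
    updates = updates + 1
end

-- Same signature as a WoW OnUpdate script: (frame, elapsed seconds).
local function onUpdate(self, elapsed)
    timeSinceLastUpdate = timeSinceLastUpdate + elapsed
    if timeSinceLastUpdate < UPDATE_INTERVAL then
        return -- the cheap path, taken on most frames
    end
    timeSinceLastUpdate = 0
    doUpdate()
end

-- Simulate one second at 16 frames per second.
for _ = 1, 16 do
    onUpdate(nil, 0.0625)
end
```

With this structure the global functions inside the update routine are reached four times per second, not once per frame, which is why upvaluing them changes very little.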
The point of upvaluing global functions called frequently is to reduce the amount of time it takes to get a return from those functions. A local is much faster to access than a global is.
Your script calls your AlkaT_SpORCo_update function and has to wait for it to finish before moving on to the next line in the script (AlkaT_timeSinceLastUpdate = 0). The faster it is completed, the faster it can move on. Your AlkaT_SpORCo_update function has to wait on the global functions that it calls before moving on in its code. And so on. Lua is single-threaded. |
Thank you for your reply, Seerah.
I was however already aware of what you said. Funny though, it made the thoughts clearer in my mind - I don't know how you did that, but thank you! :cool: So, I guess my real question should rather be: What would be the cons, if any, of upvaluing those global functions? I now recall that, on the thread which brought me to this one, there was a link to another thread, specifically on upvaluing, so I guess I'll have a look at that when I get some time. If I still have questions about this snippet afterwards, I'll post them back here. ;) P.s.: Your comment about the single-threaded nature of Lua wasn't exactly new to me, but, honestly, I had presumed it rather than properly looked into it. I'm not in the ITC industry (never was) but coding has been a hobby of mine since I was about 15 (back then mainly in plain C), over 25 years now. And to be honest, the multi-threading technology was never something I fully understood (nor dived much into), coding-wise. The way I see it, even something like Lua coroutines is only a sequential pause-this-thread, then jump-to-another, then back-to-the-first (in the simplest case of only two "coroutines"). I can understand event-driven programming and have a shallow understanding of CPU IRQs as well, so I have a vague idea of how CPUs with two (or more) cores work together. But again, it's only a vague understanding of it, and as far as I remember, I've never actually done any programming that wasn't, in my mind, "single-threaded". I guess I could have started a new topic about this but I fear I would quickly lose the "thread" of it in the ensuing discussion, so I'm just dropping this here now. :D |
This is a huge oversimplification, and is not 100% accurate, but it gets the point across:
Lua Code:
If a language supported up to the new 36 threads from AMD and Intel, you could send 36 commands at the same time. As to your question about cons, the two biggest about upvaluing are:
|
(Someone with 8GB of RAM would be fine ;) )
|
Upvaluing globals is a micro-optimization that you shouldn't bother actively thinking about when you're coding; looking up a variable is very fast and any addon that would see an appreciable difference from storing a local copy to access is probably not going to be saved by this.
Write your program to do what you want it to do, then if something is impacting performance, clean it up. Anything that isn't an obvious performance gain should be deferred. Premature optimization can introduce unexpected behavior that can end up wasting a lot of your time tracking down and debugging. Creating a local copy of a variable comes with the unintended (or intentional) consequence of preventing it from being replaced or hooked by other addons later in the loading process. For example, let's say you want to monitor when another addon modifies a cvar so you know not to touch that cvar in the future, so you hook the global function SetCVar. When an addon calls SetCVar it also runs your function and everyone is happy. But what if this addon creates a local reference to SetCVar before you hook it? Now when it calls SetCVar your function hook doesn't run, and everyone is sad because your addon ends up overwriting their addon's settings because it didn't see the change it made. Additionally, Lua actually has an internal limit of 60 upvalues per function. This isn't something you normally have to worry about, but you could conceivably run into it by redeclaring a large number of globals as local variables in your addon, and then trying to access them all in a function. |
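The hooking problem can be reproduced outside the game. In this sketch SetCVar is an empty stand-in defined by the example itself, the hook is a plain wrapper that replaces the global (not the game's own hooking facility), and the cvar names are invented.

```lua
-- Stand-in for a global API function; the real one lives in the game client.
function SetCVar(name, value) end

-- Addon A loads first and keeps a local reference.
local earlySetCVar = SetCVar
local function addonA_apply()
    earlySetCVar("exampleCVar", "1")
end

-- Addon B loads later and replaces the global with a wrapper.
local seen = {}
local original = SetCVar
function SetCVar(name, value)
    seen[#seen + 1] = name
    return original(name, value)
end

addonA_apply()             -- uses the old reference, so the wrapper never runs
SetCVar("otherCVar", "0")  -- goes through the wrapper
```

Only the second call is recorded in seen, which is the "everyone is sad" case described above.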
Quote:
And since both calls end up in the same C function anyway, the upvalued call would only be a minuscule bit faster, at the cost of the memory for one more function pointer. |
Thanks for all your replies.
@ myrroddin Very nice explanation of single-threaded vs multi-threaded languages. That was pretty much what I thought; it's the "huge oversimplification, and is not 100% accurate" part that I'm sure would lose me... :D On the cons of upvaluing, again, nice explanation, and it makes perfect sense. But I was a bit surprised that 8GB would give so little leeway, having WoW running (mostly) alone. Seerah seems to have a different view on this, so... :confused:
@ aemiar Those are very interesting points. Regarding that little addon of mine (it prints player coordinates and speed on BattlefieldMinimap), it's pretty much done. I actually picked it as an example for its simplicity. But thanks for the hint, it is a good coding principle, I think. And you pretty much answered my original question: upvaluing, in my case, where the globals are called only once per 0.25 seconds, isn't worth it; it might be if they were called every time OnUpdate was called (which presently only checks how long it has been since the last call). But, as it is, no point. Quote:
@ Resike Quote:
I thought it was (NOT actual code): Lua Code:
A) Lua Code:
or... B) Lua Code:
Which way does it work like? |
A, I guess. The function never changes, you just get a new reference to it.
|
Quote:
Imagine that a is a reference to the function you want to upvalue. By accessing the function through a variable, you're essentially just retrieving the pointer to the memory where the function is stored (b). When you upvalue a function, you're copying the contents of a, that is to say the memory address to your function, but not the actual function. The function you end up calling is the same, whether global or local in your scope. |
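The same idea as a few lines of plain Lua; the function greet is made up for the example.

```lua
local function greet()
    return "hello"
end

local a = greet   -- a holds a reference to the function
local b = a       -- copies the reference, not the function itself

-- Rebinding a afterwards does not affect b.
a = function() return "replaced" end
```

After this, b still refers to the original function, which is also why a local copy made before a global is replaced keeps calling the old version.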
Quote:
Global call: Pointer to the global table _G -> lookup for the subtable _G.func -> get the table's pointer value -> call the C function from the returned pointer (Since Lua does not support multithreading, calling _G and accessing its subtable's value will take 2 cycles!) Upvalued call: Pointer to the upvalued func function -> get the table's pointer value -> call the C function from the returned pointer |
Quote:
You can try it yourself: Lua Code:
You can do this in the other way around, it won't change a thing: Lua Code:
|
Quote:
This does not print anything: Code:
local SetCVar = SetCVar |
Quote:
Lua Code:
So globalIndexName is a _G[index] value. You can abuse this for most work that runs alongside the default WoW interface. However, of course, it only works for functions and most frame functions that are global. You can find some more info on WoWProgramming. |