ActionScript and function calls April 11, 2011
Posted by viswaperiyanan in Flex.
Recently I was investigating some performance issues that showed up after we migrated our application from Flex 3.6 to Flex 4. After a few attempts, I narrowed it down to our rather large style sheet with a lot of universal selectors. When I went through the Flex source code that selects the appropriate styles to apply to each component on screen (matchStyleDeclarations in StyleProtoChain), I found nothing wrong.
The screen I was looking at had about 1500 components (several grids). For each of the 1500 UIComponents, the CSSDeclarations were examined for potential matches. Our style sheet had 400 of those. The selection process is straightforward: examine all 400 CSSDeclarations to find matches. Several other objects, CSSCondition and CSSSelector, are involved in finding matches. This ended up being a double loop: the outer loop ran 1500 times and the inner loop 400 times. The function calls are several levels deep and span multiple objects. According to the profiler, the system was spending close to 2 seconds finding matching styles.
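To get a feel for the scale, the double loop can be sketched as below. This is a rough sketch in Java (the other language used in this post), not the Flex code: `MatchCount`, `pairChecks`, `totalCalls` and `callsPerCheck` are hypothetical names, and the figure of 5 calls per check is only an assumed stand-in for "several levels deep", not a measured value.

```java
// Hypothetical sketch: counts how many declaration checks the double loop performs.
public class MatchCount {
    // One check per (component, declaration) pair.
    public static long pairChecks(int components, int declarations) {
        long checks = 0;
        for (int c = 0; c < components; c++) {
            for (int d = 0; d < declarations; d++) {
                checks++;
            }
        }
        return checks;
    }

    // Total function calls if every check costs a fixed number of calls (an assumption).
    public static long totalCalls(int components, int declarations, int callsPerCheck) {
        return pairChecks(components, declarations) * callsPerCheck;
    }

    public static void main(String[] args) {
        System.out.println(pairChecks(1500, 400));     // 600000
        System.out.println(totalCalls(1500, 400, 5));  // 3000000
    }
}
```

Even a small per-call cost adds up when it is paid hundreds of thousands of times per screen.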
It turns out that function calls are really costly in ActionScript, and the cost becomes very noticeable when the calls are made in a loop. Here is an example:
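// requires: import flash.utils.getTimer;
var t1:int = getTimer();
for (var i:int = 0; i < 1500; i++) {
    for (var j:int = 0; j < 1500; j++) {
        // Four inline additions per iteration, no function calls
        var k:Number = j + k;
        var k1:Number = j + k;
        var k2:Number = j + k;
        var k3:Number = j + k;
    }
}
var t2:int = getTimer();
// Elapsed time in milliseconds: t2 - t1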
This takes 35 milliseconds to execute, which is fair enough. Now let us change the code to do those additions in a function:
private function add(n1:int, n2:int):int {
    return n1 + n2;
}
and modify the loop to use the function:
var t1:int = getTimer();
for (var i:int = 0; i < 1500; i++) {
    for (var j:int = 0; j < 1500; j++) {
        var k:int = add(j, k);
        var k1:int = add(j, k);
        var k2:int = add(j, k);
        var k3:int = add(j, k);
    }
}
var t2:int = getTimer();
This takes 1.08 seconds to execute, which is way too long!
Solution
Given this finding, I went back to StyleProtoChain and modified the code:
- Eliminated all the getter methods by making variables public in CSSSelector, CSSDeclaration and CSSCondition.
- Inlined all possible functions directly into matchStyleDeclarations (and named the new function fastMatchStyleDeclarations).
fastMatchStyleDeclarations is large and not so nice to read, but it eliminated most of the performance issues with this loop. We are also working on reducing the number of universal selectors in our CSS file.
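The two changes above amount to a rewrite of roughly the following shape. This is a minimal sketch in Java with hypothetical names (`InlineSketch`, `Selector`, `countViaCalls`, `countInlined`); the real Flex classes carry more fields and conditions. The JVM may well inline such small methods on its own, so the sketch shows the form of the change rather than reproducing the ActionScript speedup.

```java
import java.util.Arrays;
import java.util.List;

public class InlineSketch {
    // Hypothetical stand-in for a selector; the field is public so the
    // inlined loop can read it without going through a getter.
    public static class Selector {
        public final String subject;

        public Selector(String subject) { this.subject = subject; }

        public String getSubject() { return subject; }

        public boolean matches(String componentType) {
            return "*".equals(getSubject()) || getSubject().equals(componentType);
        }
    }

    // Before: one matches() call plus getter calls for every selector.
    public static int countViaCalls(List<Selector> selectors, String componentType) {
        int n = 0;
        for (Selector s : selectors) {
            if (s.matches(componentType)) n++;
        }
        return n;
    }

    // After: the same logic with the getter and matches() written out in the loop.
    public static int countInlined(List<Selector> selectors, String componentType) {
        int n = 0;
        for (Selector s : selectors) {
            String subject = s.subject;
            if ("*".equals(subject) || subject.equals(componentType)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<Selector> selectors = Arrays.asList(
            new Selector("*"), new Selector("Button"), new Selector("Label"));
        System.out.println(countViaCalls(selectors, "Button")); // 2
        System.out.println(countInlined(selectors, "Button"));  // 2
    }
}
```

Both versions return the same count; the inlined one trades readability for fewer calls per iteration, which is the trade-off described above.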
Out of curiosity, I ran a similar program in Java:
private int add(int i, int j) {
    return i + j;
}
and
long l1 = System.currentTimeMillis();
for (int i = 0; i < 1500; i++) {
    for (int j = 0; j < 1500; j++) {
        int k = add(i, j);
        int k1 = add(i, j);
        int k2 = add(i, j);
        int k3 = add(i, j);
    }
}
long l2 = System.currentTimeMillis();
This takes 8 milliseconds to execute, roughly two orders of magnitude less than the 1.08 seconds above. The difference narrows if we use objects instead of primitives, but Java remains much faster.
And here is the patch to get around the performance issues when using a large style sheet.
Update – 04/14
James Ward found that he was not getting the same numbers when he ran the test. The numbers above were captured using the debug player (on a release build).
Here are the (new) numbers in the standard and debug players (10.2, IE 9):
| | Release Build | Debug Build |
| Standard Player | 20/187 | 18/190 |
| Debug Player | 40/1300 | 1014/4140 |
In each cell, the first number (blue in the original table) is the time spent executing the loop in the inline version, and the second (orange) is the time spent executing the function variation, both in milliseconds. So on the standard player the in-lined version is not 30 times faster, it is just about 10 times faster. I was very surprised to find that running the program on the standard player makes such a big difference.
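The ratios behind that statement can be checked with a few lines of arithmetic. The sketch below, in Java, only divides the numbers from the table; `SlowdownRatios` and `slowdown` are hypothetical names introduced for illustration.

```java
public class SlowdownRatios {
    // Ratio of function-call time to inline time, both in milliseconds.
    public static double slowdown(int inlineMs, int functionMs) {
        return (double) functionMs / inlineMs;
    }

    public static void main(String[] args) {
        // Numbers copied from the table (inline/function).
        System.out.printf("Standard player, release build: %.2f%n", slowdown(20, 187));
        System.out.printf("Standard player, debug build:   %.2f%n", slowdown(18, 190));
        System.out.printf("Debug player, release build:    %.2f%n", slowdown(40, 1300));
        System.out.printf("Debug player, debug build:      %.2f%n", slowdown(1014, 4140));
    }
}
```

That works out to roughly 9.4 and 10.6 on the standard player, and 32.5 and 4.1 on the debug player, so the "30 times" figure only holds for a release build running on the debug player.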
I’m not sure if it will change your result significantly, but in your first test you only iterated over ‘j’ 400 times instead of 1500 times.
Good catch, I was testing with various numbers and ended up copying the wrong code snippet. The 35 milliseconds is for a 1500×1500 loop. A 1500×400 loop takes less than 10 milliseconds to execute.
Java’s compiler will inline those function calls – not a fair comparison.
I don’t think that invalidates the results. It doesn’t matter whether it’s inefficiencies at the compiler level or inefficiencies in the runtime (or both).
In the patch file near the top of StyleProtoChain.as you have an unused variable declared that looks like it can be removed:
private static var universalDictionary:Dictionary = new Dictionary();
Yes, the unused variable can be removed; it is a remnant of a different approach I tried to take.