JavaScript Compiler Optimization Techniques — Only for Experts

How JS Engine works under the hood and performance tips!

Supratik Basu
codeburst

--

Lamborghini Huracan — Photo by Noah Boyer

The Problem

You can't buy faster servers to improve the performance of your client-side application. Well, you could buy all of your customers faster computers, I guess !! 🤑🤑🤑
A lot of time and energy is spent compressing assets, removing requests, and reducing latency, but what about once the application is running? Most of the time, parsing and compiling are the main culprits.

Performance profile for Facebook-Profile page

Here you can see that the thin blue sliver is the time taken to load the assets from the server, while the majority of the time is spent on scripting, i.e. executing your JavaScript code (yellow).

This was benchmarked on my MacBook Pro 2019 16-inch model, but not everyone accessing the page has a high-end device. Think of the mid-range mobile users with 4G connectivity (the vast majority of users, I would guess) who will visit these websites. JavaScript parsing on modern mid-range smartphones averages about 0.75MB/s.

Parse times for a 1MB bundle of JavaScript across desktop & mobile devices of differing classes.

A survey by websiteoptimization.com says that an average user would no longer wait more than 8–10 seconds for a webpage to load. And a broadband user expects it to load even faster.

Another major issue is the ever-increasing bundle size, which grows day by day as our applications grow. As engineers, we must think about how to reduce the scripting and rendering time of our JavaScript code.

Golden Rules of Performance

  • Doing less stuff takes less time. Every time we do less stuff, it finishes faster; doing nothing is even better.
  • If you can do it later, do it later. Doing stuff later is better than doing it now.

So if you can get away with not doing something, don't do it. And if you can push a task to later, do it later.

A relatable, everyday use case would be tracking. Don't bundle it with your main code and ship it all together. What you can do instead is ship your tracking code after your application has loaded, and then track the visitors in your application.
If your visitors don't wait for your application to load and go away, whom would you track anyway?

Let’s understand the JavaScript Engine

The JavaScript engine is a program or an interpreter that executes JavaScript code. A JavaScript engine can be implemented as a standard interpreter, or as a just-in-time compiler that compiles JavaScript to bytecode in some form.

Interpreter way

The interpreter uses a concept called REPL: read-eval-print-loop. The real advantages of this method are immediate output and ease of implementation, but they come at the cost of execution speed:
  • Eval is slow; it doesn't leverage the speed of machine code.
  • It cannot optimize code across the whole program.

So imagine you have a loop iterating N times in which you call a function F. The interpreter will end up re-executing (and re-translating) every line of function F on all N iterations, which means unnecessarily re-evaluating the same code N−1 times.

Compiler way

In contrast to the interpreter, a compiler translates all of the code to an executable at once. As a result, compilers can make optimizations like sharing machine code for repeated lines of code.

But this has a con. Any guesses?? It has a slow start. So which way should you choose if you were building your own engine?
Neither is really pleasing, right? Thank god we have a third kind of compiler that takes the best of both worlds. Yes, the JS gurus guessed it right!

JIT Compiler way 😎🤘

JIT stands for Just-In-Time. A JIT compiler starts off interpreting, but it keeps track of warm code, which runs a few times, and hot code, which runs many more times.

The warm code is sent to the compiler so that it can re-use the compiled code wherever possible, while the hot code is sent to a more aggressive optimizing compiler, which makes assumptions about the code and optimizes based on them. I will walk you through some examples later of how these optimizations are made. The important point to note here is that if those assumptions turn out to be invalid, the engine has to de-optimize and fall back to the less efficient bytecode compiled earlier, and this translation is a very costly operation.

The JIT compiler also comes with a memory cost during interpretation, because tracking your code requires bookkeeping. But hey, we have GBs of RAM now, so that is not a concern for a typical web application.

The V8 Engine (not that of a sports car!)

There are many JavaScript engines, different ones for different browsers and vendors. I am not going to touch on every engine in this article, but I may do a general dive into their architectures in upcoming articles (so watch out and follow me for updates). Do note that most of the optimizations we discuss here are more or less applicable to all the other vendors as well. Some of the other famous JavaScript engines are Rhino, SpiderMonkey, JavaScriptCore, Chakra and Nashorn, just to name a few.

Let's dive into the architecture of V8

High-level code journey through V8 Engine

So there is a thing called the cloud. When we do a webpack build, the output gets stored in the cloud, and our JS code gets served from there when a user requests it. How did it reach the cloud? That's a question for the backend folks 😜
The cloud sends us a JavaScript file, which is basically a chunk of text. To make sense of it, the file goes through a parser, which converts it into an AST (Abstract Syntax Tree). You can think of the AST as a data structure that represents what the code really means.
From there, the V8 compilers take care of the rest. The first step is the interpreter, which interprets the code, identifies the hotspots I mentioned earlier, and generates semi-optimized bytecode. Any code that can be optimized then goes to the optimizing compiler, which analyzes the code, makes assumptions to make it even faster, and generates highly optimized machine code. As we discussed, it sometimes has to de-optimize at runtime and fall back to the bytecode. Hats off to the V8 team's naming; they really know how to name their engines: the interpreter that generates the bytecode is called Ignition (yes, as in the ignition of a car, i.e. the start) and the optimizing compiler is called TurboFan (the turbo boost that speeds the car up).

How to optimize now?

As you have already guessed by now, to run faster we need to ship code that follows that green arrow (staying optimized), and we must avoid the red arrow (de-optimization), which is not good.

Three things the engine does to help us out

  • Speculative optimization
  • Hidden classes for dynamic lookups
  • Function inlining

Let's play with some code to understand the concepts!

Optimization Techniques

Work with types that are coherently compatible with operations

I have written a simple algorithm to calculate the kth prime number, and with this code I find the 30,000th prime.

kth-prime.js
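The embedded gist is not reproduced in this export, so here is a hedged sketch of what such a kth-prime routine could look like (all names and structure assumed, not the article's exact code). Note the loop condition in isPrimeDivisible:

```javascript
// Sketch of a kth-prime search by trial division against earlier primes.
function isPrimeDivisible(candidate, primes) {
  const len = primes.length;
  // BUG discussed below: `i <= len` reads one slot past the end of the
  // array, so the last check is `candidate % undefined`, which is NaN.
  for (let i = 0; i <= len; ++i) {
    if (candidate % primes[i] === 0) return true;
  }
  return false;
}

function kthPrime(k) {
  const primes = [];
  let candidate = 2;
  while (primes.length < k) {
    if (!isPrimeDivisible(candidate, primes)) primes.push(candidate);
    candidate++;
  }
  return primes[k - 1];
}

console.log(kthPrime(100));
```

Because primes[len] is undefined and candidate % undefined is NaN (never 0), the answer is still correct; the off-by-one only costs time, which is what makes it easy to miss.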

It took a little more than 3 seconds to calculate on my MacBook Pro.

Average time taken — 3.045sec

This code is not optimized from the JavaScript compiler's point of view. Can you find the culprit that caused this? If you look closely at the function isPrimeDivisible, you will find that in the for loop I am comparing up to and including the length of the array, instead of up to a number less than the length. Now change that line of code to:

for (let i = 0; i < len; ++i) {

Now let's benchmark this!

Average time taken — 1.173sec

Here is the jsbench link for all you lazy ones who won't try it on your own computers.

See the difference? Just iterating one time less makes this huge difference; our code is now ~160% faster! So what happened under the hood? The first point to note is that JavaScript doesn't complain about reading past the end of an array; it simply gives you undefined. So on the last iteration, the code always tries a modulo operation with undefined. The compiler had optimized our code to work with small integers (V8 calls them SMIs), but now it has to change track, fall back to de-optimized code (a time hit), and perform the modulo over undefined to produce NaN, which evaluates the condition to false. That is a non-mathematical operation and takes extra time down in C++. Hence the longer run time. Now you know what to do: avoid mathematical computation with types that are non-mathematical!

Grammar Optimization

Now let's take a look at another aspect of optimization. Guess which will execute faster: Line-1 or Line-3?
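The compared snippet is not reproduced in this export; an illustrative pair of the same shape (the config data here is made up) would be:

```javascript
// Line-1 style: the data is written as a JavaScript object literal and
// parsed by the full JavaScript grammar.
const configLiteral = { name: 'app', version: 1, debug: false };

// Line-3 style: the same data arrives as a string and goes through
// JSON.parse, whose grammar is far simpler than JavaScript's.
const configParsed = JSON.parse('{"name":"app","version":1,"debug":false}');

console.log(configLiteral.name, configParsed.name); // app app
```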

Now here's the thing: it actually depends. Logically, we could argue that Line-3 is faster. The reason is that { in JavaScript grammar can mean a lot of things: the start of a function, the start of an object, the start of a block, the start of a class, and more. So while parsing it, the compiler has to look ahead to determine what it actually is. In JSON.parse(), on the other hand, the input always has to be JSON, so the parser knows it is an object the moment it encounters a {. But everything in JavaScript is an object, so the engine is already highly optimized at recognizing object literals, and Line-1 is faster in the general case. Here is the catch, though: when the data grows large, around 10KB or more, JSON.parse() starts to overtake it in speed.

Speculative Optimization

Go through the code, ignoring the performance part, which is only there to measure the code for the output. NOTE: I am using node v13.9.0.
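The gist is not embedded in this export; a hedged reconstruction of speculative-optimization.js, minus the measurement boilerplate (names and iteration count assumed), might look like this:

```javascript
// `add` is called many times with two numbers, so TurboFan speculates
// on number + number and optimizes for it.
function add(a, b) {
  return a + b;
}

const numA = 7;
let total = 0;
for (let i = 0; i < 1e7; i++) {
  total = add(total, 1);
}

// add(numA, '5'); // line 15: uncommenting this forces a deopt of add()

console.log(total); // 10000000
```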

When you run the code, it gives output as-

It took only about 14.27 milliseconds to execute. But if you uncomment line 15 and re-run, you will be surprised by the results:

add(numA, '5') // add this line to code in line 15

Now it took 52.63 milliseconds. Let's play with some V8 flags to understand why. I am going to introduce two flags, --trace-opt and --trace-deopt, which trace the optimizations and deoptimizations done by the engine. Now re-run the program with these flags:
node --trace-opt --trace-deopt speculative-optimization.js
You will find a bunch of dumps in the console, but the method we are interested in is add, so do a grep add on the output:
node --trace-opt --trace-deopt speculative-optimization.js | grep add

with string addition

You can clearly see that the add function gets optimized at first and then gets deoptimized again. Thus our code takes a huge hit for the conversion from optimized code back to deoptimized code. If we remove the string addition and re-run with the flags, we see that it only optimizes.

without string addition

Now let's try removing every optimization V8 does for us and run the code again! I just added a handy V8 internal function call to the code, which now looks like:

speculative-optimization-never-optimize.js
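A hedged sketch of what that file could contain (the try/eval wrapper is my addition so the file also loads without the flag; it is not in the original gist):

```javascript
function add(a, b) {
  return a + b;
}

// %NeverOptimizeFunction is V8 natives syntax: it only parses when node is
// started with --allow-natives-syntax, so it is wrapped in eval here to
// keep the file loadable without the flag.
try {
  eval('%NeverOptimizeFunction(add)'); // line 15 in the article's gist
} catch (e) {
  // Without --allow-natives-syntax, natives syntax throws a SyntaxError.
}

let total = 0;
for (let i = 0; i < 1e7; i++) total = add(total, 1);
console.log(total); // 10000000
```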

Now run the code with the V8 internal flag --allow-natives-syntax:
node --allow-natives-syntax speculative-optimization-never-optimize.js

Whoa, it's now ~230ms! For those thinking it might be due to the extra function declaration, see the output after commenting out line 15:

// neverOptimizeFunction(add)

Now see the output difference

Just to prove to you that entering the optimizing compiler isn't free, I wrote this code:

no-free-optimization.js
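A possible shape for no-free-optimization.js (assumed, with the natives-syntax call shown commented out as in the article's line 15): with only a handful of calls, add() never gets hot, so --trace-opt prints nothing until the optimization is forced.

```javascript
function add(a, b) {
  return a + b;
}

// Called only a few times: never hot, never optimized on its own.
add(1, 2);
add(3, 4);

// line 15: forcing optimization (run with --allow-natives-syntax):
// %OptimizeFunctionOnNextCall(add);

console.log(add(5, 6)); // 11
```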

Now just run the code by
node --allow-natives-syntax --trace-opt no-free-optimization.js

There is only the performance output, no optimization message.

Now try uncommenting line 15 again (there is always something going on with line 15!):

optimizeFunctionOnNextCall(add) // add in line 15

We now get the optimization message.

The optimizing compiler sees that this function is only ever called with a single type, so it speculates on that hypothesis and optimizes for that particular type. When it encounters a different type (a string here), its prediction fails and it deoptimizes the code.

So what can we do? Pass values of a fixed type to a function whenever possible, or just use TypeScript. (Really, it's time to adopt TypeScript.)

Optimization for Hidden Class

Hidden classes are not a JavaScript feature; they are a V8 feature. Every object/primitive you define is mapped to a certain hidden class. As programmers, we should not change an object's structure, because a new hidden class is created every time a property is added or deleted. At the end of the day, all of our JavaScript code gets executed by C++ code, and C++ doesn't have JavaScript's notion of dynamic objects, so changing the shape of an object incurs a cost. Let's take an example:

hidden-class.js
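hidden-class.js is not reproduced in this export; a minimal stand-in (names assumed; the article's "line 15" corresponds to the commented-out delete here) could be:

```javascript
// Repeatedly stringify an object. Deleting a property forces the object
// off its original hidden class, slowing down every later property access.
function makeShape() {
  const shape = { width: 10, height: 20, color: 'red' };
  // delete shape.color; // line 15: deleting changes the hidden class
  return shape;
}

let out = '';
for (let i = 0; i < 1e5; i++) {
  out = JSON.stringify(makeShape());
}
console.log(out);
```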

On running it produces the output

without uncommenting

It took about 2.7 seconds to run the program. Now, yet again, uncomment line 15:

with uncommenting

It took a whopping 7 seconds. But shouldn't JSON.stringify() have less work to do now? So why the slowdown? It's because we are changing the hidden class of the shape object by deleting the property. Hidden classes also depend on the sequence in which the properties of an object are added; if the sequence is the same, both objects get mapped to the same hidden class. The operations performed on objects also depend on the morphism of the objects.
Morphism of objects can be of 3 types:
  • Monomorphic: all objects have the same shape (hidden class), and the compiler can perform very well.
  • Polymorphic: the compiler has seen a few shapes, identifies the shape from that list, and can still optimize for it.
  • Megamorphic: the compiler has seen many shapes and cannot specialize; no optimization will be done.

Now let me introduce a new V8 internal method:
%HaveSameMap(arg1, arg2)
As you have already guessed, this method tells you whether the two objects passed to it belong to the same hidden class. Now let's see some interesting results! Hopefully things will be a bit clearer now.

is-same-hc.js
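is-same-hc.js is not embedded in this export; a hedged stand-in (the haveSameMap wrapper and its eval guard are my additions, so the file also loads without the flag) might look like:

```javascript
// %HaveSameMap only parses under --allow-natives-syntax, so it is invoked
// via eval: without the flag, eval throws a SyntaxError and we report that
// instead of crashing at load time.
function haveSameMap(a, b) {
  try {
    return eval('%HaveSameMap(a, b)');
  } catch (e) {
    return 'run with --allow-natives-syntax';
  }
}

const p1 = { x: 1, y: 2 };
const p2 = { x: 3, y: 4 }; // same keys, same insertion order
const p3 = { y: 4, x: 3 }; // same keys, different insertion order

console.log(haveSameMap(p1, p2)); // true under the flag: one shared hidden class
console.log(haveSameMap(p1, p3)); // false under the flag: order matters
```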

If you want to experiment, change the values and run the code with the --allow-natives-syntax flag.

Hope this gave you a good insight into hidden classes; you can read more here.

Optimization in Scoping and Prototypes

We must be aware of how we scope our variables and use them. Let's jump directly into the code.

prototypes-scopes.js
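The gist is not reproduced here; a minimal sketch of such a benchmark, with the class declared inside the function under test (names assumed), could be:

```javascript
// The class lives *inside* test(), so every call creates a brand-new
// constructor/prototype pair and a fresh hidden class for its instances.
function test() {
  class Triangle {
    constructor(a, b, c) {
      this.a = a; this.b = b; this.c = c;
    }
    perimeter() { return this.a + this.b + this.c; }
  }
  return new Triangle(3, 4, 5).perimeter();
}

let result = 0;
for (let i = 0; i < 1e5; i++) result = test();
console.log(result); // 12
```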

This is a simple program to calculate the perimeter of a triangle, assuming the triangle is valid. Running it takes about ~3.87sec.

inside the scope of test

How much of a time difference do you think moving the Triangle class to the outer scope will make? Any guess? Okay, for sure it will make the code faster. But how much faster? 7%? 70%? 700%? 7000%? 70,000%?????

prototypes-scopes.js with outer scope
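A hedged sketch of the hoisted variant, with the class declared once in the outer scope (names assumed): one constructor, one prototype, one hidden class shared by every instance.

```javascript
class Triangle {
  constructor(a, b, c) {
    this.a = a; this.b = b; this.c = c;
  }
  perimeter() { return this.a + this.b + this.c; }
}

function test() {
  return new Triangle(3, 4, 5).perimeter();
}

let result = 0;
for (let i = 0; i < 1e5; i++) result = test();
console.log(result); // 12
```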

70,000% was a bit of an exaggeration 😅. But the truth is: 69,936% faster on my machine. It took just ~5.54ms. Don't believe me? Check on your own computer!

outside the scope as of test

Now many of you will argue that the slow version is creating a new class every time, but I don't think that alone brought us from 3880 milliseconds down to 5.5 milliseconds. Let me prove it to you!

%HaveSameMap() to my rescue!

Same Map test

It produces false as output. So what happens here is that every time, a fresh object is created with a new prototype; the objects don't share a reference to the same Triangle class, and hence V8 cannot optimize. If you still don't believe it, try creating a new class, say Point, inside the test method, and let the timings decide your hypothesis! The time does increase, but not nearly as much as you might think.

Function Inlining

The purpose of V8 is to optimize our code, and our purpose should be to write readable and maintainable code. Abstractions are therefore fine, as long as you understand how function inlining works. Here's a small example:
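The example gist is not embedded in this export; a stand-in of the same shape (helper name assumed, with comments mirroring the article's line 12/line 13) could be:

```javascript
// A tiny helper called in a hot loop (readable) vs. the arithmetic written
// inline by hand (less readable). TurboFan inlines square() at the call
// site, so both versions end up with similar timings.
function square(x) {
  return x * x;
}

let sum = 0;
for (let i = 0; i < 1e4; i++) {
  sum += square(i); // line 12: the readable, abstracted version
  // sum += i * i;  // line 13: the hand-inlined version
}
console.log(sum);
```

Swapping line 12 for line 13 changes nothing observable in the result and, thanks to inlining, very little in the timing.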

If you run this program, then comment out line 12, uncomment line 13 and run it again, you will see that both have similar timings. V8 figures out where the function is called and inlines it, as if the function didn't exist. If the same function is called from many different places, though, it may fail to inline it, which deepens the call stack.

With readable code (1st case)
With unreadable code (2nd case)

Hope this makes sense and encourages you to focus on readability.

Key Takeaways

  • The simplest way to reduce parse, compile and execution time is to ship less code.
  • Use the User Timing API (the one I used; even the browser has it!) to determine where the biggest damage is.
  • Consider using a type system, so that you don't have to remember all the stuff I just wrote about.

If you haven’t already seen my previous post on areas of optimization and when to optimize, do check it out here.

Here on Medium, I try to write a weekly article on web technologies and share my knowledge with people. If you want direct updates, consider following me. Do share with your friends if you liked the content; it keeps me motivated to write more.

You can find me on LinkedIn, Facebook or Instagram, or you can mail me at mail.supratikbasu@gmail.com.

Feedback and suggestions are welcome. See you next week!

[BONUS]

If you want to learn more about optimization in arrays in particular, then go ahead and read this article.

If you are wondering why there is a Lamborghini Huracán in the picture, let me tell you: it's a car with a serious performance factor. And also my favorite car!

Edits

[20 April, 2020]

Thanks Abhas Bhattacharya and Ashish Mishra for your suggestions, really appreciate it!

  • Language & grammar improvements
