The way computers talk...
In mainstream computer languages, source code in a high-level language is transformed into a low-level language (machine or assembly language) by being either compiled or interpreted. It is a very simple concept, but a fundamental one!
Compilers and Interpreters
Compilers produce an intermediate form called object code, which is like machine code but augmented with symbol tables so that blocks can be combined into executables (object files, library files). A linker is then used to combine them into the final executable.
Interpreters execute instructions without compiling them into machine language first. Instead, the instructions are translated into a lower-level intermediate representation, such as bytecode or an abstract syntax tree (AST), which is then interpreted by a virtual machine.
The truth is that things are generally mixed. For example, when you type an instruction in Python's REPL, the interpreter executes four steps: lexing (breaking the code into tokens), parsing (generating an AST from those tokens - the syntax analysis), compiling (converting the AST into code objects - which are attributes of function objects), and interpreting (executing the code objects).
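We can actually watch each of these four steps from inside Python. Here is a minimal sketch (Python 3; the source string and the names below are just for illustration):

```python
import ast
import dis

source = "x = 1 + 2"

# Lexing + parsing: the tokenizer and parser turn the source text into an AST.
tree = ast.parse(source)
print(ast.dump(tree))

# Compiling: the AST is lowered into a code object.
code = compile(tree, "<repl>", "exec")
dis.dis(code)             # peek at the bytecode inside the code object

# Interpreting: the virtual machine executes the code object.
namespace = {}
exec(code, namespace)
print(namespace["x"])     # 3
```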
In Python, byte-compiled code, in the form of .pyc files, is used to speed up the start-up time (load time) of short programs that use a lot of standard modules. And, by the way, bytecodes are attributes of the code object, so to see them you need to reach the function's code object (func_code in Python 2, __code__ in Python 3) and read its co_code attribute:

```python
>>> def some_function():
...     return 42
...
>>> some_function.__code__.co_code   # raw bytecode; exact bytes vary by version
b'd\x01S\x00'
```
So we see that when modern languages choose how they compile or interpret code, they are trading off start-up time against execution speed. Since browsers need to deliver content as fast as they can, this trade-off is fundamental.
Method JITs and Tracing JITs
To speed things up, instead of having the code parsed and then executed one instruction at a time, dynamic translators (just-in-time compilers, or JITs) can be used. JITs translate the intermediate representation into machine language at runtime. They achieve the efficiency of running native code at the cost of startup time plus increased memory use (when the bytecode or AST is first compiled).
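The trade-off can be sketched in Python, with CPython's bytecode compiler standing in for the JIT's translation step. This is not a real JIT (the target is bytecode, not machine code), but the shape of the cost is the same: pay once to translate, then reuse the result on every run:

```python
import timeit

source = "total = sum(i * i for i in range(100))"

# "Translate every time": pay the compilation cost on each execution.
recompile = timeit.timeit(
    lambda: exec(compile(source, "<s>", "exec")), number=10_000)

# "Translate once, run many times": the cached code object is reused.
code = compile(source, "<s>", "exec")
reuse = timeit.timeit(lambda: exec(code), number=10_000)

print(f"recompile each run: {recompile:.3f}s")
print(f"compile once:       {reuse:.3f}s")
```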
Engines have different policies on code generation, which can roughly be grouped into two types: method JITs and tracing JITs.
Method JITs emit native code for every block (method) of code and update references dynamically. They can implement an inline cache to rewrite type lookups at runtime. Tracing JITs, in contrast, record frequently executed paths (hot loops) and compile only those traces to native code.
So, for example, suppose an application has an object Point (borrowed from the official documentation):

```javascript
function Point(x, y) {
  this.x = x;
  this.y = y;
}
```
We can create several objects:

```javascript
var a = new Point(0, 1);
var b = new Point(2, 3);
```
And we can access the property x in these objects:

```javascript
a.x;
b.x;
```
Garbage Collection
Garbage collection is a form of automatic memory management: an attempt to reclaim the memory occupied by objects that are no longer being used (i.e., once an object loses all references to it, its memory can be reclaimed).
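CPython makes both sides of this visible: reference counting reclaims an object the moment its last reference disappears, and a tracing collector handles reference cycles. A small sketch:

```python
import gc
import weakref

class Node:
    pass

obj = Node()
ref = weakref.ref(obj)    # a weak reference does not keep obj alive
del obj                   # drop the last strong reference
print(ref() is None)      # True: CPython's refcounting reclaimed it at once

# Reference cycles keep refcounts above zero, so the tracing collector
# has to find them.
a, b = Node(), Node()
a.other, b.other = b, a   # a cycle: a -> b -> a
r = weakref.ref(a)
del a, b
gc.collect()              # run a full collection cycle
print(r() is None)        # True: the cycle detector reclaimed both
```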
The other possibility is manual memory management, which requires the developer to specify which objects need to be deallocated. However, manual memory management can result in bugs such as the following (sketched in code after the list):
Dangling pointers: when a piece of memory is freed while there are still pointers to it.
Double free bugs: when the program tries to free a region of memory that it had already freed.
Memory leaks: when the program fails to free memory occupied by an object that had become unreachable, leading to memory exhaustion.
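To make these bug categories concrete, here is a sketch of manual memory management driven from Python through ctypes. It assumes a POSIX system where libc is reachable via CDLL(None); the bug patterns themselves are left commented out, since each one is undefined behavior:

```python
import ctypes

# Assumption: POSIX, where CDLL(None) exposes libc's malloc/free.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

buf = libc.malloc(64)         # we are now responsible for this block
ctypes.memset(buf, 0, 64)     # use it while it is valid
libc.free(buf)                # correct: exactly one free per malloc

# The bug patterns, commented out on purpose:
# libc.free(buf)              # double free: the block was already freed
# ctypes.memset(buf, 0, 64)   # dangling pointer: writing through freed memory
# buf = libc.malloc(64)       # ...and never freeing it: a memory leak
```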
In terms of performance, besides direct compilation to native code, three main features in V8 are fundamental:
Fast property access through hidden classes.
Inline caching as an optimization technique.
An efficient memory management system (garbage collection).
Let's take a look at each of them.
V8's Hidden Class
In V8, as execution goes on, objects that end up with the same properties will share the same hidden class. This way, the engine applies dynamic optimizations.
Consider the Point example from before: we have two different objects, a and b. Instead of keeping them completely independent, V8 makes them share a hidden class. So instead of creating two objects, we have three. The hidden class records that both objects have the same properties, and an object changes its hidden class whenever a new property is added.
So, for our example, when a new Point object is created:
Initially, the Point object has no properties, so it refers to the initial hidden class C0.
When the property x is added, V8 follows the hidden class transition from C0 to C1 and writes the value of x at the offset specified by C1 (offset zero of the Point object).
When the property y is added, V8 follows the hidden class transition from C1 to C2 and writes the value of y at the offset specified by C2.
Instead of having a generic lookup for a property, V8 generates efficient machine code to find the property. The machine code generated for accessing x is something like this:
```
# ebx = the point object
cmp [ebx, <class offset>], <cached class>
jne <inline cache miss>
mov eax, [ebx, <cached x offset>]
```
Instead of a complicated property lookup, reading the property translates into three machine operations!
It might seem inefficient to create a new hidden class whenever a property is added. However, thanks to the class transitions, hidden classes can be reused many times. It turns out that most accesses to objects are within the same hidden class.
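The mechanism is easy to model. Below is a toy Python sketch of hidden classes (often called "shapes"); every name in it is invented for illustration, and V8's real implementation is far more involved:

```python
class Shape:
    """A toy hidden class: maps property names to storage offsets."""
    def __init__(self, parent=None, prop=None):
        self.offsets = dict(parent.offsets) if parent else {}
        if prop is not None:
            self.offsets[prop] = len(self.offsets)   # next free slot
        self.transitions = {}                        # property name -> next Shape

    def transition(self, prop):
        # Reusing transitions is what lets objects built the same way
        # end up sharing a shape.
        if prop not in self.transitions:
            self.transitions[prop] = Shape(self, prop)
        return self.transitions[prop]

C0 = Shape()   # the empty initial class from the walkthrough above

class TinyObject:
    def __init__(self):
        self.shape = C0
        self.slots = []

    def set(self, prop, value):
        if prop in self.shape.offsets:
            self.slots[self.shape.offsets[prop]] = value
        else:
            self.shape = self.shape.transition(prop)   # C0 -> C1 -> C2 ...
            self.slots.append(value)

    def get(self, prop):
        return self.slots[self.shape.offsets[prop]]

a, b = TinyObject(), TinyObject()
a.set("x", 0); a.set("y", 1)
b.set("x", 2); b.set("y", 3)
print(a.shape is b.shape)        # True: same property history, shared shape
print(a.get("x"), b.get("y"))    # 0 3
```

Because transitions are reused, a and b finish construction pointing at the very same shape object, which is exactly what makes the cached-offset trick in the machine code above valid.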
V8's Inline Caching
When the engine first runs the code, it does not yet know the hidden class of the objects it will see. V8 optimizes property access by predicting that the same hidden class will be used by all future objects accessed in the same section of code, and inserts that information into the inline cache code.
Inline caching is a class-based object-oriented optimization technique employed by some language runtimes. The concept of inline caching is based on the observation that the objects that occur at a particular call site are often of the same type. Therefore, performance can be increased by storing the result of a method lookup inline (at the call site).
If V8 has predicted the property's hidden class correctly, the property is read or written in a single operation. If the prediction is incorrect, V8 patches the code to remove the optimization. To facilitate this process, call sites are assigned one of four different states (modeled in code after this list):
Uninitialized: The initial state, for any object that has never been seen before.
Pre-monomorphic: Behaves like the uninitialized state, but does a one-time lookup and rewrites itself to the monomorphic state. It is good for code executed only once (such as initialization and setup).
Monomorphic: Very fast. Records the hidden class of the object already seen.
Megamorphic: Like the uninitialized stub (since it always does a runtime lookup), except that it never replaces itself.
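Continuing the TinyObject sketch from the hidden-class section, here is a toy monomorphic inline cache (again, all the names are illustrative):

```python
class PropertyLoadSite:
    """One property-access site, e.g. every execution of `p.x` in a loop."""
    def __init__(self, prop):
        self.prop = prop
        self.cached_shape = None     # uninitialized: nothing seen yet
        self.cached_offset = None

    def load(self, obj):
        if obj.shape is self.cached_shape:
            # Monomorphic hit: one comparison and one indexed read,
            # the same work as the three-instruction stub shown earlier.
            return obj.slots[self.cached_offset]
        # Miss: fall back to the generic lookup, then rewrite the cache.
        self.cached_shape = obj.shape
        self.cached_offset = obj.shape.offsets[self.prop]
        return obj.slots[self.cached_offset]

site = PropertyLoadSite("x")
print(site.load(a))   # miss: generic lookup fills the cache
print(site.load(b))   # hit: a and b share a shape, so the fast path runs
```

A real engine tracks more than one cached shape: once a site has seen too many different hidden classes, it goes megamorphic and always performs the generic lookup.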
V8's Efficient Garbage Collecting
In V8, precise garbage collection is used: the location of every pointer on the execution stack is known, so V8 can implement incremental garbage collection, migrating an object to another place and rewiring the pointers to it.
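V8 collects incrementally, touching only part of the heap in most cycles, and CPython's collector shows the same idea in miniature: it is generational, scanning young objects far more often than old ones (the numbers below are CPython defaults and may differ between versions):

```python
import gc

# Three generations: new objects start in generation 0 and survivors are
# promoted. Most collection cycles only scan the youngest generation.
print(gc.get_threshold())   # e.g. (700, 10, 10)
print(gc.get_count())       # allocations pending in each generation
gc.collect(0)               # collect only generation 0: a partial cycle
```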
In summary, V8's garbage collection:
stops program execution when performing a garbage collection cycle,
processes only part of the object heap in most collection cycles (minimizing the impact of stopping the application),
always knows exactly where all objects and pointers are in memory (avoiding falsely identifying objects as pointers).
When the Python interpreter is invoked with the -O flag, optimized code is generated and stored in .pyo files; the optimizer removes assert statements.