Documentation update
Updated section ECMA of JerryScript Internals documentation ending with subsection LCache.

JerryScript-DCO-1.0-Signed-off-by: István Kádár [email protected]
ktorpi authored and LaszloLango committed May 19, 2016
1 parent 76b8c95 commit 31a873a
Showing 6 changed files with 35 additions and 71 deletions.
104 changes: 34 additions & 70 deletions 04.internals.md
@@ -212,7 +212,7 @@ ECMA component of the engine is responsible for the following notions:

* Data representation
* Runtime representation
* Garbage collection (gc)
* Garbage collection (GC)

## Data representation

@@ -225,22 +225,22 @@ The major structure for data representation is `ECMA_value`. The lower two bits

![ECMA value representation]({{ site.baseurl }}/img/ecma_value.png){: class="thumbnail center-block img-responsive" }

In case of number, string and object the value contains an encoded pointer.
Simple value is a pre-defined constant which can be:
For numbers, strings and objects the value contains an encoded pointer, while a
simple value is a pre-defined constant, which can be:

* undefined
* null
* true
* false
* empty (uninitialized value)

For other value types the higher bits of `ECMA_value` structure contains compressed pointer to the real value.

### Compressed pointers

Compressed pointers were introduced to save heap space.
![Compressed Pointer]({{ site.baseurl }}/img/ecma_compressed.png){: class="thumbnail center-block img-responsive" }

These pointers are 8-byte aligned, 16-bit pointers that can address 512 KB of memory, which is also the maximum size of the JerryScript heap.

ECMA data elements are allocated in pools, which are themselves allocated on the heap.
The chunk size of a pool is 8 bytes, which reduces fragmentation.
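
The arithmetic behind the 512 KB limit can be illustrated with a minimal sketch: a 16-bit value that counts 8-byte units covers 2^16 * 8 bytes. All names below (`sketch_heap`, `sketch_compress`, `sketch_decompress`) are hypothetical and are not taken from the JerryScript sources. The sketch also assumes that the value 0 encodes NULL, so the first 8-byte unit of the heap is never handed out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants: 512 KB heap, 8-byte alignment (3 low bits are always zero). */
#define SKETCH_HEAP_SIZE (512u * 1024u)
#define SKETCH_ALIGN_LOG 3u
#define SKETCH_CP_NULL 0u

typedef uint16_t sketch_cpointer_t;

/* Backing storage declared as uint64_t so that it is 8-byte aligned. */
static uint64_t sketch_heap[SKETCH_HEAP_SIZE / 8u];

/* Turn a heap address into a 16-bit index of its 8-byte unit. */
static sketch_cpointer_t sketch_compress(const void *ptr) {
  if (ptr == NULL) return SKETCH_CP_NULL;
  size_t offset = (size_t)((const uint8_t *)ptr - (const uint8_t *)sketch_heap);
  /* Assumption: offset 0 is reserved so that 0 can mean NULL. */
  assert(offset != 0 && offset < SKETCH_HEAP_SIZE);
  assert((offset & 7u) == 0);
  return (sketch_cpointer_t)(offset >> SKETCH_ALIGN_LOG);
}

/* Turn a 16-bit unit index back into a heap address. */
static void *sketch_decompress(sketch_cpointer_t cp) {
  if (cp == SKETCH_CP_NULL) return NULL;
  return (uint8_t *)sketch_heap + ((size_t)cp << SKETCH_ALIGN_LOG);
}
```

The largest compressed value, `0xFFFF`, maps to byte offset 524280, the last 8-byte unit of a 512 KB heap.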

@@ -257,90 +257,54 @@ Several references to single allocated number are not supported. Each reference

### String

### Object / Lexical environment
Strings in JerryScript are not just character sequences; they can also hold numbers and so-called magic ids. For common character sequences there is a table in read-only memory that contains pairs of magic ids and character sequences. If a string is already in this table, its magic id is stored instead of the character sequence itself. Using numbers speeds up property access. These techniques save memory.
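
The magic id idea can be sketched roughly as follows. The table contents and the names `sketch_magic_strings`, `sketch_find_magic_id` and `sketch_magic_chars` are assumptions made for illustration only; the real table and lookup in JerryScript may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant table; the array index plays the role of the magic id. */
static const char *const sketch_magic_strings[] = {"length", "prototype", "undefined", "toString"};

#define SKETCH_MAGIC_COUNT (sizeof sketch_magic_strings / sizeof sketch_magic_strings[0])
#define SKETCH_NOT_MAGIC (-1)

/* Return the magic id of a character sequence, or SKETCH_NOT_MAGIC if it is not in the table. */
static int sketch_find_magic_id(const char *chars) {
  for (size_t i = 0; i < SKETCH_MAGIC_COUNT; i++) {
    if (strcmp(sketch_magic_strings[i], chars) == 0) return (int)i;
  }
  return SKETCH_NOT_MAGIC;
}

/* Recover the character sequence from a magic id. */
static const char *sketch_magic_chars(int id) {
  assert(id >= 0 && (size_t)id < SKETCH_MAGIC_COUNT);
  return sketch_magic_strings[id];
}
```

A string found in the table can then be stored as a small integer, and only strings that are not in the table need heap space for their characters.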

Object and lexical environment structures, 8 bytes each, have common (GC) header:
* Stack refs counter
* Next object/lexical environment in list of objects/lexical environments
* GC's visited flag
* is_lexenv flag
### Object / Lexical environment

Remaining fields of these structures are different and are shown on the figure below.
An object can be a conventional data object or a lexical environment object. Unlike other data types, an object can have references (called properties) to other data types. Because of circular references, reference counting is not always enough to identify dead objects. Hence a chain list is formed from all existing objects, which can be used to find unreferenced objects during garbage collection. The `gc-next` pointer of each object points to the next allocated object in the chain list.

![Object/Lexicat environment structures]({{ site.baseurl }}/img/ecma_object.jpg){: class="thumbnail center-block img-responsive" }
Lexical environments ([link](http://www.ecma-international.org/ecma-262/5.1/#sec-10.2)) are implemented as objects in JerryScript, since lexical environments contain key-value pairs (called bindings) like objects. This simplifies the implementation and reduces code size.

### Property of an object / description of a lexical environment variable
![Object/Lexicat environment structures]({{ site.baseurl }}/img/ecma_object.png){: class="thumbnail center-block img-responsive" }

While objects comprise of properties, lexical environments consist of variables. Both of these units are tied up into lists. Unit types could be different:
* named data (property or variable)
* named accessor (property)
* internal (implementation defined)
Objects are represented by the following structure:

All these units occupy 8 bytes and have common header:
* type - 2 bit
* next property/variable in the object/lexical environment (compressed pointer)
* Reference counter - number of hard (non-property) references
* Next object pointer for the garbage collector
* GC's visited flag
* type (function object, lexical environment, etc.)

The remaining parts are different:
![Object property/lexcial environment variable]({{ site.baseurl }}/img/ecma_object_property.jpg){: class="thumbnail center-block img-responsive" }
### Properties of objects
![Object properties]({{ site.baseurl }}/img/ecma_object_property.png){: class="thumbnail center-block img-responsive" }

### Collections
Objects have a linked list that contains their properties. To save memory, this list actually stores property pairs, for the following reason:
A property is 7 bits long and its type field is 2 bits long, 9 bits in total, which does not fit into 1 byte and therefore occupies 2 bytes. Placing two properties (14 bits) together with the 2-bit type field fits exactly into 2 bytes.

ECMA runtime utilizes collections for intermediate calculations. Collection consists of a header and a number of linked chunks, which hold collection values.

Header occupies 8 bytes and consists of:
* compressed pointer to the next chunk
* number of elements
* rest space, aligned down to byte, is for the first chunk of data in collection

Chunk's layout is following:
* compressed pointer to the next chunk
* rest space, aligned down to byte, is for data stored in corresponding part of the collection

### Internal properties:

* [[Class]] - class of the object (ECMA-defined)
* [[Prototype]] - is stored in object description
* [[Extensible]] - is stored in object description
* [[CScope]] - lexical environment (function's variable space)
* [[ParametersMap]] - arguments object -0 code of the function
* [[Code]] - where to find bytecode of the function
* native code - where to find code of native function
* native handle - some uintptr_t associated with the object
* [[FormalParameters]] - collection of pointers to ecma_string_t (the list of formal parameters of the function)
* [[PrimitiveValue]] for String - for String object
* [[PrimitiveValue]] for Number - for Number object
* [[PrimitiveValue]] for Boolean - for Boolean object
* built-in related:
* built-in id - id of built-in object
* built-in routine id - id of built-in routine
* "non-instantiated" mask - which built-in properties were not instantiated yet (lazy instantiation)
* extension object identifier
If the number of property pairs reaches a limit (currently defined as 16), the first element of the property pair list is a hashmap (called the property hashmap), which is used to find a property instead of a linear search.

### LCache
### Collections

LCache is a cache for property variable search requests.
Collections are array-like data structures that are optimized to save memory. A collection is actually a linked list whose elements are not single values, but arrays that can each hold multiple values.
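
A linked list of small arrays could look roughly like the sketch below. The chunk capacity of 3, the `uint16_t` element type, and all `sketch_` names are assumptions for illustration. Plain C pointers and `calloc` are used for brevity, whereas the engine itself would presumably use compressed pointers and its own allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical capacity: each chunk stores a few values next to its link. */
#define SKETCH_CHUNK_CAPACITY 3u

typedef struct sketch_chunk {
  struct sketch_chunk *next;
  uint16_t values[SKETCH_CHUNK_CAPACITY];
} sketch_chunk_t;

typedef struct {
  sketch_chunk_t *first;
  sketch_chunk_t *last;
  size_t count; /* total number of stored values */
} sketch_collection_t;

/* Append a value; a new chunk is allocated only when the last one is full.
 * Returns 1 on success and 0 if the allocation failed. */
static int sketch_collection_append(sketch_collection_t *c, uint16_t value) {
  size_t slot = c->count % SKETCH_CHUNK_CAPACITY;
  if (slot == 0) {
    sketch_chunk_t *chunk = calloc(1, sizeof *chunk);
    if (chunk == NULL) return 0;
    if (c->last != NULL) c->last->next = chunk; else c->first = chunk;
    c->last = chunk;
  }
  c->last->values[slot] = value;
  c->count++;
  return 1;
}

/* Read the value at a given index by skipping whole chunks first. */
static uint16_t sketch_collection_get(const sketch_collection_t *c, size_t index) {
  assert(index < c->count);
  const sketch_chunk_t *chunk = c->first;
  for (size_t skip = index / SKETCH_CHUNK_CAPACITY; skip > 0; skip--) chunk = chunk->next;
  return chunk->values[index % SKETCH_CHUNK_CAPACITY];
}
```

Compared with one list node per value, the link overhead is paid once per chunk, which is where the memory saving comes from.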

![LCache]({{ site.baseurl }}/img/ecma_lcache.png){: class="thumbnail center-block img-responsive"}
### Internal properties

The entries of LCache has the following layout:
* object (pointer to object)
* property name (pointer to string)
* property (pointer to property)
Internal properties are special properties that carry meta-information. They cannot be accessed by JavaScript code, but are important for the engine itself. Some examples of internal properties are listed below:

The layout above presents multiple times in row. The rows of LCache is indexed by property name hash. When a property access occurs, all row's entries are searched by comparing object pointer and property name according entry's fields, full comparison is used for property name.
* [[Class]] - class (type) of the object (ECMA-defined)
* [[Code]] - points where to find bytecode of the function
* native code - points where to find the code of a native function
* [[PrimitiveValue]] for Boolean - stores the boolean value of a Boolean object
* [[PrimitiveValue]] for Number - stores the numeric value of a Number object

If corresponding entry was found, its property pointer is returned (may be NULL - in case when there is no property with specified name in given object).
Otherwise, the property set of the considered object is iterated over and the corresponding record is registered in LCache (with property pointer if it was found or NULL otherwise).
### LCache

## Runtime
LCache is a hashmap for finding a property specified by an object and a property name. The object-name-property layout of the LCache is repeated multiple times in each row, as shown in the figure below.

ECMA-defined runtime operations are implemented mostly with routine having the following signature:
![LCache]({{ site.baseurl }}/img/ecma_lcache.png){: class="thumbnail center-block img-responsive"}

`ecma_completion_value_t ecma_op_* ([ecma_value_t arguments])`
or
`ecma_property_t * ecma_op_[find/get]*_property (objs, name string, ...)`
When a property access occurs, a hash value is extracted from the requested property name and then this hash is used to index the LCache. After that, the specified object and property name are searched for in the indexed row.

However, there could be some combinations.
Note that if the specified property is not found in the LCache, this does not mean that it does not exist. In that case, the property is searched for in the property list of the object, and if it is found there, it is placed into the LCache.
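
The lookup flow described above can be sketched as follows. The row count, row length, hash function, replacement policy and every `sketch_` name are assumptions chosen for illustration, not the engine's actual definitions. Properties are modelled as a plain singly linked list instead of property pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ROWS 8u    /* hypothetical number of rows */
#define SKETCH_ROW_LEN 2u /* hypothetical number of entries per row */

typedef struct sketch_property {
  const char *name;
  int value;
  struct sketch_property *next;
} sketch_property_t;

typedef struct { sketch_property_t *props; } sketch_object_t;

/* One object-name-property entry; an entry with obj == NULL is empty. */
typedef struct {
  const sketch_object_t *obj;
  const char *name;
  sketch_property_t *prop;
} sketch_lcache_entry_t;

static sketch_lcache_entry_t sketch_lcache[SKETCH_ROWS][SKETCH_ROW_LEN];
static unsigned sketch_slow_lookups; /* counts cache misses, for demonstration only */

/* Simple string hash reduced to a row index. */
static unsigned sketch_hash_name(const char *name) {
  unsigned h = 0;
  while (*name != '\0') h = h * 31u + (unsigned char)*name++;
  return h % SKETCH_ROWS;
}

static sketch_property_t *sketch_find_property(sketch_object_t *obj, const char *name) {
  assert(obj != NULL && name != NULL);
  sketch_lcache_entry_t *row = sketch_lcache[sketch_hash_name(name)];

  /* 1. Search the indexed row for a matching object and property name. */
  for (size_t i = 0; i < SKETCH_ROW_LEN; i++) {
    if (row[i].obj == obj && strcmp(row[i].name, name) == 0) return row[i].prop;
  }

  /* 2. Cache miss: fall back to the property list of the object. */
  sketch_slow_lookups++;
  for (sketch_property_t *p = obj->props; p != NULL; p = p->next) {
    if (strcmp(p->name, name) == 0) {
      /* 3. Found: shift the row right (dropping its last entry) and cache the hit in front. */
      memmove(&row[1], &row[0], (SKETCH_ROW_LEN - 1u) * sizeof row[0]);
      row[0] = (sketch_lcache_entry_t){obj, p->name, p};
      return p;
    }
  }
  return NULL; /* the property does not exist; nothing is cached */
}
```

In this sketch a second lookup of the same property is served from the row without walking the property list, while a lookup of a missing name always falls through to the list.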

### Completion value

2 changes: 1 addition & 1 deletion css/img.css
@@ -19,7 +19,7 @@ img[alt="byte-code layout"] {
}

img[alt="ECMA value representation"] {
max-width: 40%;
max-width: 50%;
display: block;
}

Binary file removed img/ecma_object.jpg
Binary file not shown.
Binary file added img/ecma_object.png
Binary file removed img/ecma_object_property.jpg
Binary file not shown.
Binary file added img/ecma_object_property.png