Use heap allocated buffer instead of char collection in ecma_strings #678
3d-raytrace.js fails for me. Please check your measurements.
egavrin@ubuntu:~/jr$ ./jr-native-str ../jsperf/sunspider-1.0.2/3d-raytrace.js
egavrin@ubuntu:~/jr$ echo $?
1
Measurements on Raspberry Pi 2:
$ clear; setarch linux32 -R ../tools/run-perf-test.sh ./jr-master ./jr-native-str 5 5000 ../jsperf/sunspider-1.0.2/
This patch greatly improves performance, nice work!
Heap state before allocation of string:
Current implementation:
PR's implementation:
a = []
for (i = 0; i < 3000; i++)
{
  a[i] = i + '.';
}
Master: OK
Mem stats for the following code:
a = []
for (i = 0; i < 1000; i++)
{
  a[i] = i + '.';
}
./jerry.master --mem-stats ./t.js ; echo $?
Heap stats:
Heap size = 258048 bytes
Chunk size = 64 bytes
Allocated chunks count = 0
Allocated = 0 bytes
Waste = 0 bytes
Peak allocated chunks count = 636
Peak allocated = 40632 bytes
Peak waste = 205 bytes
Pools stats:
Chunk size: 8
Pools: 0
Allocated chunks: 0
Free chunks: 0
Peak pools: 631
Peak allocated chunks: 5041
./jerry.new --mem-stats ./t.js ; echo $?
Heap stats:
Heap size = 258048 bytes
Chunk size = 64 bytes
Allocated chunks count = 0
Allocated = 0 bytes
Waste = 0 bytes
Peak allocated chunks count = 1386
Peak allocated = 33469 bytes
Peak waste = 55235 bytes
Pools stats:
Chunk size: 8
Pools: 0
Allocated chunks: 0
Free chunks: 0
Peak pools: 380
Peak allocated chunks: 3039
The most important difference is the following: with the new build, peak waste is about 1.65 times the peak allocated memory (master -> new):
Peak allocated chunks count = 636 -> 1386
Peak allocated = 40632 bytes -> 33469 bytes
Peak waste = 205 bytes -> 55235 bytes
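The figures for the new build can be cross-checked with a little arithmetic. The sketch below assumes that "waste" means the bytes reserved in 64-byte heap chunks minus the bytes actually requested; that definition is an assumption, although the numbers for the new build are consistent with it (1386 * 64 = 88704 = 33469 + 55235). The helper names are hypothetical and are not part of JerryScript.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes reserved on the heap for a given number of fixed-size chunks. */
static size_t reserved_bytes (size_t chunk_count, size_t chunk_size)
{
  return chunk_count * chunk_size;
}

/* Assumed definition: waste = reserved bytes that hold no requested data. */
static size_t waste_bytes (size_t chunk_count, size_t chunk_size, size_t allocated)
{
  size_t reserved = reserved_bytes (chunk_count, chunk_size);
  assert (allocated <= reserved);
  return reserved - allocated;
}
```

Calling `waste_bytes (1386, 64, 33469)` reproduces the reported peak waste of 55235 bytes. The master figures do not add up the same way (636 * 64 = 40704), possibly because its peak values were recorded at different moments.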
@dbatyai, could you please describe the causes of the performance and memory consumption changes, and also try to find ways to prevent fragmentation and reduce the amount of wasted space? Perhaps improving the "literal storage" and switching to it for storing strings could solve these issues. Could you share your opinion on this?
JerryScript-DCO-1.0-Signed-off-by: Dániel Bátyai [email protected]
Most of the performance gain comes from storing the length of each string, so it no longer has to be computed by iterating over the string every time it is needed. The large amount of wasted memory comes from the heap allocating blocks in 64-byte chunks: short strings don't need that much, so the rest of the chunk goes unused. Adding a special case for shorter strings, allocating a pool chunk when the string fits into one, could help in the test case mentioned by @egavrin, but there would still be other cases where the wasted space is large. What do you think?
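To make the chunk-rounding effect concrete, here is a minimal sketch of how the footprint of a short string could be estimated. The 64-byte heap chunk and 8-byte pool chunk sizes are taken from the mem-stats output in this thread; the function names and the "use a pool chunk if it fits" policy are hypothetical illustrations of the idea, not the PR's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_CHUNK_SIZE 64u /* heap chunk size reported by --mem-stats */
#define POOL_CHUNK_SIZE 8u  /* pool chunk size reported by --mem-stats */

/* Round a request up to a whole number of heap chunks. */
static size_t heap_footprint (size_t bytes)
{
  assert (bytes > 0);
  return ((bytes + HEAP_CHUNK_SIZE - 1) / HEAP_CHUNK_SIZE) * HEAP_CHUNK_SIZE;
}

/* Hypothetical policy: requests that fit in a pool chunk use one,
 * everything else falls back to whole heap chunks. */
static size_t footprint_with_pool_case (size_t bytes)
{
  return (bytes <= POOL_CHUNK_SIZE) ? POOL_CHUNK_SIZE : heap_footprint (bytes);
}
```

Under these assumptions a 5-byte request occupies 64 bytes on the heap but only 8 bytes with the special case, while a 9-byte request still occupies a full 64-byte chunk, which is the kind of remaining case where the waste stays large.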
304509d to 62d400d
The length of the string is in the collection header:
/**
 * Description of a collection's header.
 */
typedef struct
{
  /** Number of elements in the collection */
  ecma_length_t unit_number;
  /** Compressed pointer to first chunk with collection's data */
  mem_cpointer_t first_chunk_cp;
  /** Compressed pointer to last chunk with collection's data */
  mem_cpointer_t last_chunk_cp;
} ecma_collection_header_t;
@dbatyai, I see. Thanks. Could we add the length to the beginning of a string's storage?
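One way to picture storing the length at the beginning of a string's storage is a small header placed directly in front of the character data. The sketch below is only an illustration under stated assumptions: the type and function names are hypothetical, it uses plain `malloc` rather than JerryScript's heap allocator, and it handles ASCII input only, so byte size and character length coincide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header stored at the start of the string's buffer. */
typedef struct
{
  uint16_t size;   /* number of payload bytes that follow the header */
  uint16_t length; /* number of characters in the payload */
} str_header_t;

/* Allocate header and payload as one block; returns NULL on failure.
 * ASCII input is assumed, so size and length are equal here. */
static str_header_t *str_create (const char *ascii)
{
  size_t size = strlen (ascii);
  if (size > UINT16_MAX)
  {
    return NULL;
  }

  str_header_t *hdr = (str_header_t *) malloc (sizeof (str_header_t) + size);
  if (hdr == NULL)
  {
    return NULL;
  }

  hdr->size = (uint16_t) size;
  hdr->length = (uint16_t) size;
  memcpy ((char *) (hdr + 1), ascii, size); /* payload starts right after the header */
  return hdr;
}

/* Constant-time length lookup: no iteration over the characters. */
static uint16_t str_get_length (const str_header_t *hdr)
{
  assert (hdr != NULL);
  return hdr->length;
}
```

With this layout the length is read from the header in constant time, at the cost of a few extra bytes per string; the caller releases the whole block with a single `free`.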
I would say that is a big but necessary change, regardless of the string optimization.
Could you share the exact measurements for sunspider using both
@dbatyai, what about this PR? The initial results were really promising. Could you update your patch?
Closing, as this is no longer relevant.
Remove the char collection from ecma_strings and use a heap-allocated buffer instead, which also stores the string's size and length.
Results (measured on rpi2):
(+ is better)