In version 1.1-4, when ConcurrentHClientPool releases an HClient, if it is open it is returned to availableClientQueue.
If we use HThriftClient with a TTransport wrapped in TFramedTransport, the TMemoryInputTransport readBuffer_ keeps the data from operations performed by the HClient.
This retained data, multiplied by the number of connections, can quickly increase memory usage.
Why isn't readBuffer_ cleared when the HClient is released?
I have one additional question:
Why is the max active connection count divided by 3 to obtain the number of HClients per host?
Thanks
Why isn't readBuffer_ cleared when the HClient is released?
It keeps the data, but it is overwritten on the next use.
This is a "feature" of that version of Thrift. It keeps the underlying byte[] to avoid having to re-allocate and re-grow it. The problem, as you have discovered, is that the buffer will grow to accept a larger payload but will never shrink, all the way up to the max message length (15 MB by default).
Why is the max active connection count divided by 3 to obtain the number of HClients per host?
We did not need all MAX_CONNECTIONS clients allocated up front, and a third seemed a good number from empirical observation of adding a service into a running architecture. It appears to have been a good guess, as no one has yet had a big enough issue with it to want a MIN_CONNECTIONS setting or similar :)
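For reference, here is a hedged sketch of where that number comes from in user-facing configuration, assuming the Hector 1.x-era API (the host string and pool size below are illustrative):

```java
// Sketch under assumptions: Hector 1.x-era API. With maxActive = 30,
// each host's ConcurrentHClientPool would be seeded with roughly
// 30 / 3 = 10 HClients, per the divide-by-3 behavior discussed above.
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.factory.HFactory;

public class PoolSizingExample {
    public static void main(String[] args) {
        CassandraHostConfigurator config =
                new CassandraHostConfigurator("host1:9160,host2:9160");
        config.setMaxActive(30); // per-host cap; ~maxActive / 3 clients pooled
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", config);
    }
}
```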
So, the maximum (approximate) heap retained by the ConcurrentHClientPool when using HThriftClient with TFramedTransport follows this rule:
HOSTS_NUMBER * (MAX_ACTIVE_CONNECTIONS / 3) * MAX_MESSAGE_LENGTH ?
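For a sense of scale, here is a back-of-the-envelope worst case using that formula with illustrative numbers (3 hosts, maxActive of 30, the default 15 MB frame limit; all figures hypothetical):

```java
// Worst-case retained read buffers under the formula above,
// with made-up numbers: 3 hosts, maxActive = 30, 15 MB max frame.
public class RetainedHeapEstimate {
    public static void main(String[] args) {
        long hosts = 3;
        long clientsPerHost = 30 / 3;           // MAX_ACTIVE / 3
        long maxFrameBytes = 15L * 1024 * 1024; // default max message length
        long worstCase = hosts * clientsPerHost * maxFrameBytes;
        System.out.println(worstCase / (1024 * 1024) + " MB"); // 450 MB
    }
}
```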
This by itself may be a reason to incorporate the DataStax Java Driver for simple operations in your code as well, keeping a much smaller pool of Hector connections for large batch mutates or for easier access to dynamic columns (a sketch of this hybrid approach follows below).
Further, the binary protocol for CQL uses evented IO via Netty on both the client and the server, so it is significantly more efficient resource-wise.
That said, despite what you may read elsewhere, using raw thrift is more performant and flexible if (a really big "if" there) you understand the underlying storage model and its limits.
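Here is a minimal sketch of the hybrid approach mentioned above, assuming the DataStax Java Driver's 1.x-era API (shutdown() became close() in later versions); the contact point, keyspace, and query are all illustrative:

```java
// Hedged sketch: DataStax Java Driver for simple CQL reads, while a
// small Hector pool handles large batch mutates. All names illustrative.
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class SimpleCqlRead {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_keyspace");
        for (Row row : session.execute("SELECT id, value FROM simple_table")) {
            System.out.println(row.getString("id") + " -> " + row.getString("value"));
        }
        cluster.shutdown(); // close() in driver 2.0+
    }
}
```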