bug(snapshot): on journal write entry with opcode COMMAND and MULTI_C… #662
Conversation
…OMMAND Signed-off-by: adi_holden <[email protected]>
tests/dragonfly/utility.py
Outdated
+        command_count = 0
         while True:
             cmds = await queue.get()
             if cmds is None:
                 queue.task_done()
                 break

-            pipe = client.pipeline(transaction=False)
+            is_multi_transaction = False
+            if command_count == self.multi_transaction_ratio:
+                is_multi_transaction = True
+            pipe = client.pipeline(transaction=is_multi_transaction)
             for cmd in cmds:
                 pipe.execute_command(cmd)

             await pipe.execute()
             queue.task_done()
+            ++command_count
         await client.connection_pool.disconnect()
Adi, there is no pre-increment in Python 😄
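To make the pre-increment point concrete, here is a minimal standalone demonstration (not from the PR): in Python, `++x` parses as two unary plus operators, so it evaluates the value and discards it without changing the variable.

```python
# "++command_count" is parsed as +(+command_count): a no-op expression,
# not an increment as it would be in C/C++.
command_count = 0
++command_count            # evaluates to 0, changes nothing
assert command_count == 0

command_count += 1         # the idiomatic way to increment in Python
assert command_count == 1
```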
Those are batches generated by the generator; with a default size of 1000 there are usually just a few in the tests. I've seen you reduced the size, so there are more now, probably a few dozen. If you run the tests with -s, it prints how many batches were submitted.
That doesn't include the runs with a target number of batches, i.e. target_times. Let's rename it to target_ops then and make it calculate the number of batches. If you want, I can do it quickly.
As written, this would run only once, if command_count ever reached self.multi_transaction_ratio. Instead, we can make it probabilistic, i.e. run it if random.random() < transaction_prob.
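A minimal sketch of the probabilistic approach the reviewer suggests (the name `multi_transaction_probability` is taken from later in this thread; the rest is illustrative, not the PR's code): each batch independently becomes a MULTI transaction with the given probability, rather than a single batch being flagged when a counter hits a threshold.

```python
import random

multi_transaction_probability = 0.3

def pick_is_multi_transaction() -> bool:
    # Each batch decides independently, so on average the configured
    # fraction of batches run as MULTI transactions.
    return random.random() < multi_transaction_probability

# Sanity check: over many batches the observed ratio approaches the target.
random.seed(0)
flags = [pick_is_multi_transaction() for _ in range(10_000)]
ratio = sum(flags) / len(flags)
assert abs(ratio - multi_transaction_probability) < 0.05
```

The consumer would then call `client.pipeline(transaction=pick_is_multi_transaction())` per batch.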
Also, one more small nit: in the snapshot tests, set it to zero, so they don't spend time filling with transactions.
One more issue: let's encode this on the producer, inside _generator_task, so it can print in the file as well where one tx started and where it ended. Instead of sending just the commands, we'll send a tuple that indicates whether this is a tx or not.
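A hedged sketch of that producer/consumer shape (names like `producer` and `consumer` are illustrative; in the real code the producer is `_generator_task`): the producer decides per batch whether it is a transaction and sends an `(is_tx, cmds)` tuple, so the consumer can open the pipeline accordingly and the decision can also be logged to the debug file.

```python
import asyncio
import random

async def producer(queue: asyncio.Queue, batches: int, tx_prob: float) -> None:
    for i in range(batches):
        cmds = [f"SET key:{i}:{j} val" for j in range(3)]  # placeholder batch
        is_tx = random.random() < tx_prob
        # Send the flag together with the batch so the consumer (and any
        # file logging) knows where a tx starts and ends.
        await queue.put((is_tx, cmds))
    await queue.put(None)  # sentinel: no more batches

async def consumer(queue: asyncio.Queue, log: list) -> None:
    while True:
        item = await queue.get()
        if item is None:
            break
        is_tx, cmds = item
        # A real consumer would do: pipe = client.pipeline(transaction=is_tx)
        log.append((is_tx, len(cmds)))

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    log: list = []
    await asyncio.gather(producer(queue, 5, 0.5), consumer(queue, log))
    return log

log = asyncio.run(main())
assert len(log) == 5 and all(n == 3 for _, n in log)
```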
I don't understand the comment about setting it to 0 in the snapshot tests. What are "they" and "it" when you say "so they don't spend time on filling it with tx"?
Isn't it good to test the multi flow in the snapshot tests as well?
The snapshots don't have journal mode enabled. I suppose that the basic snapshot functionality works reliably enough regardless of execution type
Also, I did not understand this:
That doesn't include the runs with a target number of batches, i.e. target_times. Let's rename it to target_ops then and make it calculate the number of batches.
From what I understand, when calling run with target_times, we call client.pipeline target_times times. Each call will execute batch_size commands (default 100).
What I want to do now is, with probability multi_transaction_probability, make the call to client.pipeline with transaction=True. This will execute the batch_size commands in one multi transaction.
This is the logic in the PR; I just committed a fix. Are you suggesting something else?
Signed-off-by: adi_holden <[email protected]>
tests/dragonfly/utility.py
Outdated
@@ -408,13 +409,22 @@ def should_run():
         while should_run():
             start_time = time.time()
             blob, deviation = self.gen.generate()
             is_multi_transaction = False
             if random.random() < self.multi_transaction_probability:
This could've been done in just one line with is_tx = random() < prob
About target_times: this parameter indicates how much data to stream in the stable state tests, i.e. how many batches to submit. By making the default batch size smaller, you greatly reduced their load (you must have noticed they run much faster). We could just change the batch size for them, but it's better to have many batches to test the behavior with many transactions. Either multiply all the values of this parameter in all the tests by 10, or change this to be target_ops and calculate the number of batches as batches = ops / batch_size.
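The suggested target_ops calculation could look like this (a sketch, with assumed names; note that a plain integer division would under-submit when ops is not a multiple of batch_size, so ceiling division is used here):

```python
def batches_for(target_ops: int, batch_size: int) -> int:
    """Derive the number of batches to submit from a total operation count.

    Uses ceiling division so that at least target_ops operations run.
    """
    return -(-target_ops // batch_size)  # ceil(target_ops / batch_size)

assert batches_for(100_000, 100) == 1000
assert batches_for(1_001, 100) == 11   # rounds up, not down
```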
Signed-off-by: adi_holden <[email protected]>
@@ -325,8 +325,8 @@ OpStatus DflyCmd::StartFullSyncInThread(FlowInfo* flow, Context* cntx, EngineSha
     };

     // Shard can be null for io thread.
     CHECK(!sf_->journal()->OpenInThread(false, ""sv));  // can only happen in persistent mode.
I had a bug where the shard unique count was not calculated when the transaction ran on a non-shard thread: SetMultiUniqueShardCount is not called if ServerState::tlocal()->journal() is null.
Signed-off-by: adi_holden [email protected]