-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fine Grained Concurrency Control Standardisation #294
Comments
The development of the RWLock system here https://gist.github.com/CMCDragonkai/4de5c1526fc58dac259e321db8cf5331 may be usable by a number of domains to increase their concurrency. |
The RWLock system has been improved with the ability to do read-preferring or write-preferring. The async-init has been upgraded to use locking for their start, stop, destroy methods. This makes it alot better now, they are using write-preferring rwlock. There was some discussion about locking abstractions, the usage of a generic I'm just trying to find where I wrote that. |
The commentary about a general It came out of discussion about general resource management which includes locking. |
Attempting to implement However I also need to map & index into the tuple type before spreading. This is a bit more complicated, had to ask a SO question about this: https://stackoverflow.com/questions/70736753/how-to-map-index-into-tuple-types-that-is-generically-spread-in-typescript |
Initial attempt, however the types are not correct due to type Resource<T = void> = () => Promise<[
release: () => Promise<void>,
resource: T
]>;
async function withF<
Resources extends Array<Resource<unknown>>,
Result
>(
resources: Resources,
f: (...resources: [...Resources]) => Promise<Result>
): Promise<Result> {
const releases = [];
const items = [];
try {
for (const resource of resources) {
const [release, item] = await resource();
releases.push(release);
items.push(item);
}
return await f(...items);
} finally {
releases.reverse();
for (const release of releases) {
await release();
}
}
} |
It appears one of the problems is that mapped types don't seem to work correctly for tuple types. For example: type Api = {
createItem: (name: string) => Promise<any>;
getItem: (id: number) => Promise<any>;
updateItem: (item: any) => Promise<any>;
deleteItem: (id: number) => Promise<void>;
};
// type Api = [() => number, () => string];
type NewApi = {
[K in keyof Api]: ReturnType<Api[K]>
}; The above type checks, however if you switch to using the second |
Actually maybe the However we will need to also index into the |
Based on this answer: https://stackoverflow.com/a/60713409/582917, there appears to be a way to do this. We will need to dispense with Here's an example: type Call<R> = (...args) => R;
type FunctionReturns<T extends Record<number, Call<any>>> = {
[K in keyof T] : T[K] extends Call<infer R> ? R: never
}
type FunctionReturns2<T extends readonly Call<any>[]> = {
[K in keyof T] : T[K] extends Call<infer R> ? R: never
}
function getReturns<
T extends (readonly [Call<any>] | readonly Call<any>[])
>(fs: T): FunctionReturns2<T> {
// Escape hatch!
return fs.map(f => f()) as any;
}
getReturns([() => 123, () => 'abc', () => [1,2] as const])
// force inference as a tuple, and not as an array
const fs = [() => 123, () => 'abc'] as const;
getReturns(fs); The Notice I have 2 variants. The It is essential to understand that tuples are "readonly arrays". It has to be written like |
Note that the equivalent type ReturnType<T extends (...args: any) => any> = T extends (...args: any) => infer R ? R : any However it's better for us to have a separate |
Some solutions are incoming... type ResourceAcquire<T = void> = () => Promise<[() => Promise<void>, T?]>;
type Resources<T extends readonly ResourceAcquire<any>[]> = {
[K in keyof T] : T[K] extends ResourceAcquire<infer R> ? R: never
}
async function withF<
ResourceAcquires extends readonly [ResourceAcquire<unknown>] | readonly ResourceAcquire<unknown>[],
Result
>(
resourceAcquires: ResourceAcquires,
f: (...resources: Resources<ResourceAcquires> & unknown[]) => Promise<Result>
): Promise<Result> {
const releases: Array<() => Promise<void>> = [];
const resources: Array<unknown> = [];
try {
for (const resourceAcquire of resourceAcquires) {
const [release, resource] = await resourceAcquire();
releases.push(release);
resources.push(resource);
}
return await f(...resources as unknown as Resources<ResourceAcquires>);
} finally {
releases.reverse();
for (const release of releases) {
await release();
}
}
}
async function x() {
let count: number = 0;
const h: ResourceAcquire<number> = async () => {
++count;
return [async () => { --count; }, count];
};
await withF(
[
h,
async () => {
return [async () => { }];
}
],
async (c, cs) => {
console.log(c, cs);
return c;
}
);
}
x(); One of the strange things is when we make the resource array constant. Functions that don't have readonly applied if given a readonly array will fail. If we say the parameter is readonly, we're guaranteeing that we won't modify this array. So having a strictly readonly array is technically more flexible. So a few more tweaks on the above and things should work. |
Slight extension: type ResourceAcquire<T = void> = () => Promise<readonly [() => Promise<void>, T?]>;
type Resources<T extends readonly ResourceAcquire<any>[]> = {
[K in keyof T] : T[K] extends ResourceAcquire<infer R> ? R: never
}
async function withF<
ResourceAcquires extends readonly [ResourceAcquire<any>] | readonly ResourceAcquire<any>[],
Result
>(
resourceAcquires: ResourceAcquires,
f: (...resources: Resources<ResourceAcquires> & any[]) => Promise<Result>
): Promise<Result> {
const releases: Array<() => Promise<void>> = [];
const resources: Array<any> = [];
try {
for (const resourceAcquire of resourceAcquires) {
const [release, resource] = await resourceAcquire();
releases.push(release);
resources.push(resource);
}
return await f(...resources as unknown as Resources<ResourceAcquires>);
} finally {
releases.reverse();
for (const release of releases) {
await release();
}
}
}
async function x() {
let count: number = 0;
await withF(
[
async () => {
return [async () => { }];
}
] as const,
async (c) => {
console.log(c);
return c;
}
);
const h: ResourceAcquire<number> = async () => {
++count;
return [async () => { --count; }, count];
};
await withF(
[
h,
async () => {
return [async () => { } ];
}
] as const,
async (c, cs) => {
console.log(c, cs);
return c;
}
);
const arr = [
h
] as const;
await withF(arr, async (c) => {
console.log(c);
});
const y = async () => {
return [async () => {}] as const;
}
const arr2 = [
y
] as const;
await withF(arr2, async (c) => {
console.log(c);
});
}
x(); Notice the usage of |
Changed to type ResourceAcquire<T = void> = () => Promise<readonly [() => Promise<void>, T?]>;
type Resources<T extends readonly ResourceAcquire<any>[]> = {
[K in keyof T] : T[K] extends ResourceAcquire<infer R> ? R: never
}
async function withF<
ResourceAcquires extends readonly [ResourceAcquire<unknown>] | readonly ResourceAcquire<unknown>[],
Result
>(
resourceAcquires: ResourceAcquires,
f: (...resources: Resources<ResourceAcquires> & any[]) => Promise<Result>
): Promise<Result> {
const releases: Array<() => Promise<void>> = [];
const resources: Array<unknown> = [];
try {
for (const resourceAcquire of resourceAcquires) {
const [release, resource] = await resourceAcquire();
releases.push(release);
resources.push(resource);
}
return await f(...resources as unknown as Resources<ResourceAcquires>);
} finally {
releases.reverse();
for (const release of releases) {
await release();
}
}
} |
Attempting to do the async function* g1 () {
const x = yield 1;
yield x;
return 3;
}
async function* g2 () {
try {
return yield* g1();
} finally {
console.log('DONE!');
}
}
async function main () {
const g = g2();
console.log('first', await g.next());
console.log('second', await g.next(2));
console.log('third', await g.next());
for await (const v of g2()) {
console.log(v);
}
}
main(); The result is:
The Furthermore the |
The type ResourceAcquire<Resource = void> = () => Promise<
readonly [ResourceRelease, Resource?]
>;
type ResourceRelease = () => Promise<void>;
type Resources<T extends readonly ResourceAcquire<any>[]> = {
[K in keyof T]: T[K] extends ResourceAcquire<infer R> ? R : never;
};
/**
* Make sure to explicitly declare or cast `acquires` as a tuple using `[ResourceAcquire...]` or `as const`
*/
async function withF<
ResourceAcquires extends
| readonly [ResourceAcquire<unknown>]
| readonly ResourceAcquire<unknown>[],
T,
>(
acquires: ResourceAcquires,
f: (resources: Resources<ResourceAcquires>) => Promise<T>,
): Promise<T> {
const releases: Array<ResourceRelease> = [];
const resources: Array<unknown> = [];
try {
for (const acquire of acquires) {
const [release, resource] = await acquire();
releases.push(release);
resources.push(resource);
}
return await f(resources as unknown as Resources<ResourceAcquires>);
} finally {
releases.reverse();
for (const release of releases) {
await release();
}
}
}
/**
* Make sure to explicitly declare or cast `acquires` as a tuple using `[ResourceAcquire...]` or `as const`
*/
async function* withG<
ResourceAcquires extends
| readonly [ResourceAcquire<unknown>]
| readonly ResourceAcquire<unknown>[],
T = unknown,
TReturn = any,
TNext = unknown,
>(
acquires: ResourceAcquires,
g: (
resources: Resources<ResourceAcquires>,
) => AsyncGenerator<T, TReturn, TNext>,
): AsyncGenerator<T, TReturn, TNext> {
const releases: Array<ResourceRelease> = [];
const resources: Array<unknown> = [];
try {
for (const acquire of acquires) {
const [release, resource] = await acquire();
releases.push(release);
resources.push(resource);
}
return yield* g(resources as unknown as Resources<ResourceAcquires>);
} finally {
releases.reverse();
for (const release of releases) {
await release();
}
}
}
export { withF, withG };
export type { ResourceAcquire, ResourceRelease }; It's important to start to plan the usage of this for other domains like nodes and wherever there's a |
Partially resolved by these commits which have been cherry picked into master:
@tegefaulkes once you rebase on master, you can also start using the |
The node connections are now using |
While working on #326, I noticed that the |
Some things to note:
Further prototyping is required to realise the application of this to all domains that are using |
Right now the So the usage of it was to allow re-entrancy, however it's not able to track that these 2 independent calls. Consider this difference:
If the call stack was:
We would expect that However if the call stack was instead:
Where they are called simultaneously & concurrently, then now we want want So |
The quick solution is to extract the common functionality out of |
It seems the DBTransaction was the only way to maintain context in a call stack/graph. Basically methods of Then if This context is different from the instance context, since it only exists during the call stack. That would be the only way to track the lock ownership on a call-basis. If we were do this for However It is possible to create an acquisition function for the DB transaction: public acquireTransaction: ResourceAcquire<DBTransaction> = async () => {
const tran = new Transaction({ db: this.db });
return [
async (e?: Error) => {
if (e == null) {
try {
await tran.commit();
} catch (e) {
await tran.rollback();
throw e;
}
await tran.finalize();
} else {
await tran.rollback();
}
},
tran
];
}; To do so, a little change must be done on
It's backwards compatible, but basically means the resource releaser will now have a reference to the error in case it was thrown during the acquisition or the handler call. |
Applying the concept of Consider a "transaction" that takes place between withF([
acquireTransaction()
], async ([tran]) => {
await nodeGraph.doSomething(tran);
await acl.doSomething(tran);
}); The result is that any mutation here is atomic between At the same time it's a sledgehammer because it locks the entire GG and ACL together. There's no fine-grained concurrency control here. The usage of the transaction, allows one to be more finegrained, since the interactions operate entirely on the transaction context. Note that transactions are sort of limited due to iteration, but that may be resolved by MatrixAI/js-db#4. It is the combination of smaller locks, and sophisticated locks that |
What if We want to use Methods should take a transaction parameter, most likely optionally so that the method can still work within its own transaction if it doesn't need to be shared between any other higher-level methods. class NodeGraph {
public doSomething(tran?: DBTransaction) {
if (tran == null) {
return await withF([
acquireTransaction()
], async ([tran]) => {
return this.doSomething(tran);
});
}
this.acl.doSomething(tran);
// continue with mutations
}
} |
Also the passing of See https://www.measurethat.net/Benchmarks/Show/6274/4/access-to-proxy-vs-object, most important would be to avoid nesting proxies as per https://gomakethings.com/better-proxy-performance-in-vanilla-js/. Accessing proxy vs accessing object isn't that bad. It's just an extra pointer dereference. I could imagine something like:
This would be a "proxy" based API:
|
The proxy is just used so that it will maintain the transaction context across calls to the underlying domain. The callback is still needed to know when a context must needs to be started and ended. It naturally "scopes" the context. The proxy objects mean that you don't need to do explicit passing of This possibly avoids having to have a The proxy can actually be a "decorator" of sorts. But it's important to maintain type-safety, so that the proxy can still be used whenever some function requires the same domain type. The creation of the proxy is therefore the same as any other resource, and fits in the resource API.
This abstracts above the One can abstract But you'd need to figure out how to incorporate locks as well in their relevant domains too. Just a crazy idea, but the benefit is just to avoid having to pass the For future investigation. |
I feel like there's a way to extend the ES6 promise for one situation (or better a decorator around a promise, so that subsequent promises remain the same), and this would enable you to do something like:
And after Unfortunately I'm not sure how to do it. I tried adding a |
Can't you already await a promise multiple times to get the same result? wouldn't this be like caching the returned promise rather than augmenting the promise itself? |
Not sure what you mean, but I've abandoned that approach. I'm moving ahead with how it is done right now in EFS. Just need to abstract the locks collections structure from both DB and PK into |
During our meeting, we discussed that there can be a slight overhead with using the DB transaction with locks. Consider Now technically this doesn't require a lock, since a check can be made through 2 reads, even if those reads may be inconsistent. But this is a good example to illustrate the problem. Now if you don't pass a But if you do pass So the caller of It's going to be quite complex to always keep track of what keys needs to be locked. In SQL, this problem doesn't exist in this way, because locks are within the transaction context, not outside. Then if one were to call Now this would mean that you are attempting to acquire a lock while within the transaction. Is that necessarily a bad thing? I reckon However right now, nested acquisitions would result in a deadlock.
I believe this problem was mentioned already in #294 (comment), and the solution proposed was solution 2, where locks are also tracked as they are passed down the call graph with the So there are some further things to do:
|
Note that combined |
Some further reading of:
Tells me that our LevelDB transactions are quite similar to STM. In our case, the state is leveldb, and the transactions are all just software and in-memory in a single process. One major difference is that the transactions in STM are meant to be "optimistic", and they are designed to take a snapshot of the state before the transaction starts, and then when they commit they check if their initial state is still the same before committing, and if it is not, the whole transactional operation is retried at the beginning but with the new value as the initial state. The user code which is provided as a function that takes the initial state is then re-ran, but it can decide to abort the transaction if the initial state is unsatisfactory. This is a form of optimistic concurrency control. Our And during commits, it does not check if the initial state or snapshot state is consistent with the underlying database. There is a way to make our One major change is to only compare the values that have been touched. This means any The problem is that the while (true) {
const [tranRelease, tran] = await tranAcquire();
await tran.put('x', 'y');
try {
await tranRelease();
} catch (e) {
if (e instanceof DBCommitWriteConflict) {
// we have to restart the transaction entirely, because the `x` was a conflict
continue;p
}
throw e;
}
} Which means convenience methods would have to be used to deal with this situation in a nicer way, and they can just rerun the callback that is passed in. Seems like we have quite a lot of flexibility to manipulate our |
Further reading here https://www.sobyte.net/post/2022-01/rocksdb-tx/ shows that rocksdb actually supports both optimistic transactions and pessimistic transactions. The optimistic transaction follows what I wrote above. It also just throws an exception and expects the user to figure out what they want to do and that could mean a retry, or a loop goto with In the pessimistic case, it bundles up a lock manager into the transaction, and this lock manager is also detecting deadlocks, since by allowing locks to form within the transaction, deadlocks can occur. It also addresses re-entrant locking by allowing locks to be acquired if already acquired in the same transaction. But locks that are just deadlocked by other transactions due to different lock-ordering, the only solution is to use a timeout mechanism. Once timedout, it also can do some diagnostics to find out why the transaction failed. In this case, you can track what locks you were waiting on. And that this points to particular transactions with transaction IDs. And you can check what those transactions were waiting for, and if they were waiting for a lock held by your transaction ID, then you have a circular deadlock. |
This could mean that task 1 in #294 (comment) is just about bringing together a proper pessimistic DB transaction. However many places in PK, it can probably do well with just optimistic DB transaction. More prototyping needed here. |
The js-resources has been been updated so that ResourceAcquire functions will take an array of already allocated resources. This enables Note that I tried to do it a bit more elegantly with using a POJO to form a resource DAG (https://github.com/microsoft/p-graph), but this was difficult to do, and didn't allow users to form a strict order. And may require more machinery like memoization. So for now this is usable. However I couldn't get the types correct so explicit types are needed for the resource types. An example of this: withF([
iNodeMgr.inoAllocation(),
([rootIno]: [INodeIndex]) => iNodeMgr!.transaction(rootIno)()
], async ([rootIno, tran]) => {
// ... use rootIno and tran
}); The reason this was done is because |
I'm going to need to implement pessimistic transaction support in |
So most of the updates to the PK codebase should be doable, but the final As for deadlock detection, we would consider that a programmer error when discovered, so when deadlock is discovered, an exception is thrown as normal. Users can retry of course. The PK application should not crash in this instance however. |
Pessimistic shouldn't be needed. Working on optimistic instead. It's alot more flexible. |
Snapshot Isolation based OCC transactions should be backwards compatible with the existing usage of The main difference is the possibility of In EFS, it will continue to use locks because an OCC transaction can be converted to a PCC transaction as long as locks are used. In EFS, it's API demands PCC behaviour, therefore it will continue to use locks, even if it updates to the new SI transactions. Thus the SI implementation will be backwards compatible. Also it's usage of locking should mean that write-skews cannot occur. However if upgrading to the SI implementation results in EFS throwing One happy change is the expectation that the reads will now be consistent entirely with The situation is different in PK, as soon as it upgrades to SI transactions, it will basically drop all usages of locking with respect to the DB. However it may preserve locking in situations where write-skew may occur. Write skew occurs when one transaction writes, and another transaction reads, and some consistency rule is broken. See:
A variety of solutions are possible: materialize the conflict, use read locks... etc. In the long term, we may upgrade the transaction from SI to SSI (serializable snapshot isolation) which was a recent invention from 2008, and this will even prevent write-skews, and thus all DB locks can be completely dropped. Now when a conflict occurs, we may decide to auto-retry. However this is complicated by other side-effects that may occur. Only if at least one of these is true:
Can you do an auto-retry. Auto-retries should not be done when the action should be changed due a change in state. What "should" means depends on the usability of the program. So there's an "roadmap" for the transaction development:
The second and third phases do not block our testnet deployment. They just improve our software model, reduce future bugs, and reduce entropy in our code. |
With the new DB integration, there are 2 places where locking will be necessary to ensure serial counter updates:
To deal with these, we need to add locking before starting the transaction, and only do this as part of the public methods. If these 2 properties may be via a public method, then we should be using a The DB has no knowledge about these locks atm, so they occur outside the transaction as properties on the When acquiring locks for these transactions do it in this order: withF([this.lock.write(), this.db.transaction()], async ([, tran]) => {
// use tran
}); It is important to acquire the lock prior to the transaction to avoid building up resources to hold the transaction while waiting for the lock. Note that when the DB gains the ability to lock things internal to the transaction, that is PCC control, this can be replaced with just: withF([this.db.transaction()], async ([tran]) => {
await tran.lock('counter');
await tran.get('counter');
}); The API hasn't been fully worked out for PCC locking, it's possible locking can be integrated directly into Even when PCC is used, Still to do is to identify where write-skews may occur. |
Currently the |
Expanded the spec with details from the PR. |
The current staging of For EFS, the update should focus on dealing with:
For PK, the update should focus on dealing with:
We should also add benchmarks to identify slowness, I think the rocksdb is a bit slower in normal gets/puts, but the iterator and transaction implementation should be alot faster since it's using the native C++ without additional JS abstraction. The PR to js-polykey should also solve #244. |
Because lots of methods will now be transactional with an optional transaction, they will all need variants of something like this: public async pushTask(task, tran?: DBTransaction): Promise<void> {
if (tran == null) {
return this.db.withTransactionF(
(tran) => this.pushTask.apply(this, [...arguments, tran])
);
}
// Do the work
} Ideally we could abstract even the calling itself more... but |
Specification
Locking in js-polykey has gone through alot of iterations. The most recent iteration is in
js-db
usage of locks where fine grained locks provided byasync-mutex
is used, as well as theRWLock
abstraction insrc/utils.ts
.For most domains such as
ACL
, the locks are too coarse-grained, causing one to lock the entire domain itself. Many of these locks should be replaced with the usage of DB transactions as it is done injs-encryptedfs
.In some places, fine grained locks can replace the existing coarse grained locking or even replace the usage of condition variables. Currently the
src/network/Connection.ts
and derived classes makes use of a_composed
boolean which did double duty in terms of indicating when composition was done but as a way to prevented repeated concurrent calls ofcompose
. The first duty is fine, but the second duty should done with a fine grained lock shared between the all the calls that should be blocked when composition operation is occurring. This means all the methods that currently check_composed
and throw exceptions when it is not true.Transactions has been introduced to js-db. With this we can replace a lot of the existing locking with the use of the new db transactions. The general changes that need to be implemented are as follows.
withF
,withG
locking directly.ErrorDBTransactionConflict
Error should never be seen by the user. We should catch and override it with a more descriptive error for the context.LevelPath
andKeyPath
s instead.db.put
,db.get
anddb.del
should be using transactions viatran.put/get/del
This applies to all domains that make use of DB OR domains that depend on others that make use of DB. The goal here is to make any even starting from the handlers atomic.
There are limitations to this however. Since a transaction can fail if there is overlapping edits between transactions. We can't really include changes to the db that will commonly or guarantee conflict. Example of this are counters or commonly updated fields. So far this has been seen in;
NotificationsManager
. Makes use of a counter so any transactions that include Adding or removing a notification WILL conflict. Reads also update metadata so concurrently reading the same message WILL conflict.Some cases we will need to make use of locking along with a transaction. A good example of this is in the
NotificationManager
where we are locking the counter update. When this is the case we need to take extra care with the locking. Unless the lock wraps the whole transaction it is still possible to conflict on the transaction. we can't compose operations that rely on this locking with larger transactions.An example of this problem is.
This means that some operations or domains can't be composed with larger transactions. It has yet to be seen if this will cause an issue since more testing is required to confirm any problem. I suppose this means we can't mix pessimistic and optimistic transactions. So far it seems it will be a problem with the following domains.
Note that this has nothing to do with IPC locking as in #290.
Additional Context
NodeConnection
and network connections and having a race condition withthis._composed
being true, but the properties that should only be used when composition finishes being undefined.withF
andwithG
context for DB transactions and provide transactional iterators js-db#10 - EnablingwithF
andwithG
usage in DBTasks
withF
,withG
locking directly.ErrorDBTransactionConflict
Error should never be seen by the user. We should catch and override it with a more descriptive error for the context.LevelPath
andKeyPath
s instead.db.put
,db.get
anddb.del
should be using transactions viatran.put/get/del
The text was updated successfully, but these errors were encountered: