Stop writing duplicated blobs to storage for routerlicious driver. #592

Merged (4 commits) on Nov 13, 2019

@jatgarg jatgarg commented Nov 12, 2019

1.) Stop writing duplicated blobs to storage for routerlicious driver.
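
A minimal sketch of the deduplication idea, for readers without the diff at hand. It is not the driver's actual code: `DedupingBlobWriter`, `UploadFn`, and `fnv1a` are hypothetical names, the upload is synchronous only to keep the example short, and the FNV-1a hash is a stand-in (a real implementation would need a collision-resistant hash).

```typescript
// Hypothetical sketch: skip the storage write when identical content was already uploaded.
// A real driver call would be asynchronous; this one is synchronous for brevity.
type UploadFn = (content: string) => string; // returns the storage id of the new blob

// Placeholder 32-bit FNV-1a hash. NOT collision resistant; illustration only.
function fnv1a(text: string): string {
    let hash = 0x811c9dc5;
    for (let i = 0; i < text.length; i++) {
        hash ^= text.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193) >>> 0;
    }
    return hash.toString(16);
}

class DedupingBlobWriter {
    // Maps content hash -> id of the blob already written to storage.
    private readonly blobsCache = new Map<string, string>();

    constructor(
        private readonly upload: UploadFn,
        private readonly hash: (text: string) => string = fnv1a,
    ) {}

    public write(content: string): string {
        const key = this.hash(content);
        const cached = this.blobsCache.get(key);
        if (cached !== undefined) {
            return cached; // duplicate content: reuse the existing blob id
        }
        const id = this.upload(content);
        this.blobsCache.set(key, id);
        return id;
    }
}
```

Writing the same content twice through one `DedupingBlobWriter` calls `upload` once and returns the same id both times.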

@jatgarg jatgarg requested a review from vladsud November 12, 2019 21:35
@jatgarg jatgarg self-assigned this Nov 12, 2019
@jatgarg jatgarg changed the title Stop writing duplicated blobs to storage Stop writing duplicated blobs to storage for routerlicious driver. Nov 12, 2019
vladsud commented Nov 13, 2019

ODSP part is the most interesting and missing.
Could you please implement it?

@vladsud vladsud left a comment

:shipit:

jatgarg commented Nov 13, 2019

> ODSP part is the most interesting and missing.
> Could you please implement it?

Yes. I wanted to get this done first so that I have a better idea of the approach.


/**
* Document access to underlying storage for routerlicious driver.
*/
export class DocumentStorageService implements IDocumentStorageService {

private readonly blobsPathCache = new Map<string, string>();
The cache may currently include loose git objects for nacked summaries. In the normal summarize case, there is a tiny window in which they can be garbage collected before the server commits them and moves the pointer to that commit.
Here, we would be widening that window to the lifetime of the summarizer container. That is probably fine for now, but we might want to invalidate the cache on nacks, or change it so that entries are added only on ack.
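
One way the "enable on ack" option could look, as a rough sketch only. `AckGatedBlobCache` and its method names are hypothetical and do not come from the PR; the sketch assumes the caller is told when a summary is acked or nacked.

```typescript
// Hypothetical sketch: blobs written for a summary are only trusted for dedup
// once the server has acked that summary.
class AckGatedBlobCache {
    // Entries from acked summaries; safe to reuse.
    private readonly committed = new Map<string, string>();
    // Entries written for the summary that is still awaiting ack/nack.
    private pending = new Map<string, string>();

    // Only acked entries are returned, so a nacked (possibly collected) blob is never reused.
    public get(hash: string): string | undefined {
        return this.committed.get(hash);
    }

    // Record a blob written as part of the in-flight summary.
    public stage(hash: string, id: string): void {
        this.pending.set(hash, id);
    }

    // Summary acked: promote staged entries.
    public onAck(): void {
        this.pending.forEach((id, hash) => this.committed.set(hash, id));
        this.pending = new Map<string, string>();
    }

    // Summary nacked: drop staged entries, since their blobs may be garbage collected.
    public onNack(): void {
        this.pending = new Map<string, string>();
    }
}
```

The trade-off is that blobs repeated within a single in-flight summary are not deduplicated against each other in this variant.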

I've opened issue #598 to follow up here.
Note: there may be no work to do here if R11S promises never to GC blobs.


arinwt commented Nov 14, 2019

The cache should be initialized during load by populating it in the getSnapshotTree call. It can go through the resulting tree and add all blob ids to the cache.
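
A rough sketch of that suggestion, under stated assumptions. `SnapshotTreeSketch` is a simplified, hypothetical stand-in for the tree returned by `getSnapshotTree` (assumed here to map paths to blob ids and to nested trees), and `populateBlobCache` is an illustrative helper, not code from the PR.

```typescript
// Hypothetical, simplified tree shape: path -> blob id, and path -> nested tree.
interface SnapshotTreeSketch {
    blobs: { [path: string]: string };
    trees: { [path: string]: SnapshotTreeSketch };
}

// Walk the tree recursively and record every blob id with its full path.
function populateBlobCache(
    tree: SnapshotTreeSketch,
    cache: Map<string, string>,
    prefix: string = "",
): void {
    for (const path of Object.keys(tree.blobs)) {
        cache.set(tree.blobs[path], prefix + path);
    }
    for (const path of Object.keys(tree.trees)) {
        populateBlobCache(tree.trees[path], cache, prefix + path + "/");
    }
}
```

Calling this once on the loaded tree would let the first summary after load skip blobs that are already in storage.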

jatgarg commented Nov 14, 2019

I will follow up on initializing the cache.

arinwt commented Nov 16, 2019

Linking to issue #398

jatgarg added a commit that referenced this pull request Nov 23, 2019
* Stop writing duplicated blobs to storage for routerlicious driver. (#592)

* stop writing duplicated blobs to storage

* change hashing logic

* add assert

* populate cache with the summary blobs (#626)

* populate cache with the summary blobs

* add in cache while reading tree

* rename

* blob deduping for odsp driver (#639)

* blob deduping for odsp driver

* pr suggestions

* update map in get latest

* have 2 caches with latest and prev caching

* populate cache in blob read

* change comment

* make map local

* make local

* local
curtisman pushed a commit that referenced this pull request Nov 27, 2019