Skip to content
This repository has been archived by the owner on Dec 1, 2024. It is now read-only.

RocksDB version upgrade #143

Closed
wants to merge 5 commits into from
Closed

Conversation

lu4
Copy link
Contributor

@lu4 lu4 commented Sep 22, 2019

No description provided.

@@ -176,189 +179,505 @@
}]
]
, 'sources': [
'rocksdb/db/auto_roll_logger.cc'
# , 'rocksdb/cache/cache_bench.cc'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these files no longer exist, or why did you comment them out?

And could you provide some context on the changes in general?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, this must have been my git stash apply

Copy link
Contributor Author

@lu4 lu4 Sep 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aah pardon me, yes they're not, I've updated git submodule with new RocksDB sources, it's latest version now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange that github didn't actually picked the change, it should have been reflected in github somehow....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vweevers Anyway what I've basically did was:

  1. Updated to latest version every submodule in the project (namely only RocksDB)
> git submodule foreach git pull origin master
  1. Updated rocksdb.gyp with all the changes from official RocksDB repo.

The code builds fine, but fails at runtime, the issue doesn't seem critical or complex (just need to figure out what kind of Symbol is missing)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, I meant to ask what the notable changes in RocksDB are. What benefits do we get from upgrading? Are there breaking changes? For example, reduced platform support, or not being able to read databases that were created with our current version, etc.

@lu4
Copy link
Contributor Author

lu4 commented Sep 22, 2019

Ok, the error is finally reproduced, I have the same runtime error as one in Travis...

@lu4
Copy link
Contributor Author

lu4 commented Sep 22, 2019

@vweevers It seems that everything working, could you please validate?

P.S.
I couldn't pass --no-duplicate-basename-check into node-gyp-build (I see your are the maintainer of node-gyp-build maybe you have an idea how to deal with that) due to it I'm unable to build the following source files, they seem not to be used in this project but nevertheless it's better not accumulate those

      , 'rocksdb/utilities/cassandra/cassandra_compaction_filter.cc'
      , 'rocksdb/utilities/cassandra/format.cc'
      , 'rocksdb/utilities/cassandra/merge_operator.cc'

@vweevers
Copy link
Member

@lu4 Do we really need to include those cassandra/ files? Same question for test_util/, monitoring/, cuckoo/, etc. They increase build time (and possibly build size?).

(I'm a co-maintainer of node-gyp-build, btw)

@lu4
Copy link
Contributor Author

lu4 commented Sep 22, 2019

I'm afraid that in future it is possible that such issues may pileup and you would end up wasting half of the day searching for solution instead we just need to check whether it works now after half of the day was already spent on my behalf.

There's another thing to it that I wanted to suggest. I was thinking about rolling out yet another interface to RocksDB. It's just RocksDB has much richer functionality, there's no reason to hide it under LevelDB interface. So I was thinking to providing a parallel TypeScript based interface to RocksDB. Not sure whether it is reasonable to do within this repo, but I think it's possible to support both.

What's your thoughts on this?

@lu4
Copy link
Contributor Author

lu4 commented Sep 22, 2019

@vweevers as per including cassandra files maybe it's reasonable to at least make it working and then optimize build times?

@vweevers
Copy link
Member

You already got it working, right? Tests are passing, and we have good coverage. Nice job! It sounds like you had a long day. We're almost there, just need to tweak it. I'm mainly concerned about build size, because we bundle prebuilt binaries in the npm package and shouldn't increase its size by too much. I'm gonna take a break though; I'll come back to this in a few days to review. Thanks!

@lu4
Copy link
Contributor Author

lu4 commented Sep 23, 2019

@vweevers I was looking through binding.cc is there a reason why every operation to RocksDB is wrapped into Worker? As I understand it's not a web worker but a thing that rather places a call into queue on event loop?

@vweevers
Copy link
Member

I was thinking about rolling out yet another interface to RocksDB. It's just RocksDB has much richer functionality, there's no reason to hide it under LevelDB interface.

There is a reason: we follow the abstract-leveldown interface. We can add RocksDB-specific options to rocksdb as needed. New methods will meet more scrutiny, especially those that can't be replicated in other abstract-leveldown implementations. Please open new issues for specific features you'd like to see, we'll discuss case by case.

So I was thinking to providing a parallel TypeScript based interface to RocksDB. Not sure whether it is reasonable to do within this repo, but I think it's possible to support both.

The short answer is no, because the Level org is JavaScript-only. Typings are housed in DT, maintained by other folks. For background, see Level/community#16.

I was looking through binding.cc is there a reason why every operation to RocksDB is wrapped into Worker? As I understand it's not a web worker but a thing that rather places a call into queue on event loop?

It's used to perform work in a libuv thread, to avoid blocking the main Node.js thread.

@lu4
Copy link
Contributor Author

lu4 commented Oct 1, 2019

@vweevers as per your request I've updated rocksdb.gyp with changes that reduce the build times for the project, I've removed some monitoring, block based table support, blob_db, cassandra and many others, although due to severe time requirements necessary for such work I think the results are far from being complete.

Every removed file from rocksdb.gyp was checked using full project rebuild and run tests, for every missing symbol many files were uncommented. Also I can not guarantee that all the functionality that may be necessary for the project is present in the build. But the functionality which was covered by tests is working.

What I've also found that significant performance improvements are obtainable from using concurrent builds i.e. node-gyp -j 16 rebuild build option. So I suggest to look closer into your node-gyp-build child and check whether it is possible to use the above option for build perf improvement.

The provided work is my word of thank you for you personally, your help and the project you support.

However with intent to improve this project I must provide the other part of the feedback I gathered while working with the code from this repo:

  1. Wrapping every operation into async worker is slight overengineering, also there's no way to use alternative simpler blocking-calls concept. Passing every individual get / put / del operation into background thread is very much granular and introduces more overhead than performance improvement. Also it may result with significant growth of libuv queue size for IO heavy applications.

  2. Current version of NAPI supports promises, apart from being more favorable on the user's side they are drastically reducing the amount boilerplate code necessary to support callback life cycle.

  3. Current Iterator implementation is unacceptable

  4. General idea to fit RocksDB functionality into LevelDown interface is similar to an attempt to fit an elephant on a bedside table. Any potential benefit from using RocksDB as a faster (compared to LevelDB) database is diminished by LevelDown interface limitations. With all respect to LevelDown I must say that there's a lot room for improvement on LevelDown' interface side. However it's clear that it'll never happen due to backward compatibility concerns. Lastly the naming conventions used on LevelDown (such as AbstractLevelDown, ...) are not informative.

  5. The absence of strict type checks encourages developer to "guess" and extensively check for input value types that go into method. The worst thing about that any such attempt can never be successful there's no way to introduce same level of type safety through runtime checks compared to type safety of statically typed checks. Also statically typed checks cost nothing for end-user in terms of performance but run-time checks do.

  6. It would be much more simpler to handle garbage collection and object/callback lifecycle management on JS side rather than on C++ side.

  7. TypeScript is not a run-time language, like C++ it is being compiled into JavaScript before being packaged and uploaded into NPM. So there are no reasons to omit first-class support for this language as it can be used equivalently easy by both JS-only projects and TS-only projects.

So as per list of points above I must conclude that I don't agree with practices, architectural approaches and core views expressed on this project. So I decided to create a project spinoff based on some of the work used from this repository (mainly project infrastructure).

Just in case (with hope) if at some point the code below may become helpful for this project here are the the things I was able to pull together:

Here I have prepared a complete rewrite of binding.cc which provides only low-level interface for JS/TS RocksDB runtime

And here is JS/TS bindings file

In future I'm planning to cover full RocksDB interface with two versions of API (Low Level and High Level) and full async support.

Stay tuned :)

@lu4
Copy link
Contributor Author

lu4 commented Oct 1, 2019

Ok, it seems that not all of the fixes are in place, I will try to introduce fixes a bit later...

@vweevers
Copy link
Member

vweevers commented Oct 1, 2019

I'm not sure whether you want a response. If you do, please open new issue(s), to avoid yak shaving.

@vweevers
Copy link
Member

I cherry-picked and squashed the commits into #141 (and updated to RocksDB 6.17.3).

@vweevers vweevers closed this Apr 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants