Implement NVMe driver #1203
My "archeology" effort revealed that @imp implemented NVMe driver for OSv in 2015 but it looks like the branch and his fork -https://github.com/imp/osv/tree/nvme - are sadly gone :-( |
@wkozaczuk Wow! That work still lives in https://gitlab.com/osv/osv/-/commits/nvme/. Although I am not sure how applicable that code would be after all those years. |
Thanks a lot, @imp! I have very recently learned that one of the students has been working on adding NVMe support to OSv as part of his Bachelor thesis. So we may have 2 implementations soon! As you point out yours may be a little outdated but many ideas and concepts are not. It will for sure be useful to look at it for comparison at least. Please keep this branch around if you do not mind. |
I am currently using a NVMe driver on OSv that we implemented based on uNVMe. It works for my use case but it is not included in the OSv devfs nor in the vfs. Therefore, for now, I rely on calling the drivers functions directly from the application. |
@Meandres Thanks, I will take a look. |
BTW we also have another PR implementing nvme - #1284 |
This patch implements the NVMe block device driver. It is greatly based on the pull request submitted by Jan Braunwarth (see #1284), so most credit goes to Jan. As his PR explains, OSv can be started with an emulated NVMe disk on QEMU like so:

./scripts/run.py --nvme

Compared to Jan's PR, this patch differs in the following ways:
- removes all non-NVMe changes (various bug fixes and ioctl enhancements are part of separate PRs)
- replaces most of the heap allocations with stack allocations, which should reduce some contention
- tweaks the PRP-handling code to use a lock-less ring buffer, which should further reduce contention when allocating memory
- fixes a bug in the I/O queue CQ handling to correctly determine whether the SQ is not full
- assumes a single namespace - 1 (most of the logic to deal with more has been preserved)
- reduces the I/O queue size to 64 instead of 256
- makes the code a little more DRY

Please note that, as Jan points out, the block cache logic of splitting reads and writes into 512-byte requests causes very poor performance when stress testing at the devfs level. However, this behavior is not NVMe-specific and does not affect most applications, which go through the VFS and a filesystem driver (ZFS, EXT, ROFS) that uses the strategy() method and does not use the block cache.

Based on my tests, the NVMe read performance (IOPS and bytes/s) is 60-70% of virtio-blk on QEMU. I am not sure how much of that is due to this implementation of the NVMe driver and how much is because virtio-blk is by design much faster than anything emulated, including NVMe.

Closes #1203

Signed-off-by: Jan Braunwarth <[email protected]>
Signed-off-by: Waldemar Kozaczuk <[email protected]>
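For illustration only, here is a minimal C++ sketch of the kind of lock-less ring buffer the PRP-handling bullet above refers to: a pool of pre-allocated PRP list pages recycled through a single-producer/single-consumer ring, so that building a PRP list does not have to hit the heap on the I/O path. The names (prp_page_pool, POOL_SIZE) and the assumption of one submission and one completion context per I/O queue are mine, not taken from the patch.

```cpp
// Hypothetical sketch, not the code from this patch: a fixed pool of
// pre-allocated PRP list pages recycled through a lock-less SPSC ring.
// Assumes one submission context (pop) and one completion context (push)
// per I/O queue; pages live for the lifetime of the driver.
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdlib>

constexpr size_t PAGE_SIZE_4K = 4096;
constexpr size_t POOL_SIZE = 64;   // power of two keeps the modulo cheap

class prp_page_pool {
public:
    prp_page_pool() {
        // Allocate all PRP list pages once, up front.
        for (size_t i = 0; i < POOL_SIZE; i++) {
            _slots[i] = std::aligned_alloc(PAGE_SIZE_4K, PAGE_SIZE_4K);
        }
        _head.store(POOL_SIZE, std::memory_order_relaxed); // ring starts full
    }

    // Submission path: grab a page to build the PRP list for a request.
    void* pop() {
        size_t tail = _tail.load(std::memory_order_relaxed);
        if (tail == _head.load(std::memory_order_acquire)) {
            return nullptr;   // pool exhausted; caller can fall back to the heap
        }
        void* page = _slots[tail % POOL_SIZE];
        _tail.store(tail + 1, std::memory_order_release);
        return page;
    }

    // Completion path: return the page after the command has completed.
    bool push(void* page) {
        size_t head = _head.load(std::memory_order_relaxed);
        if (head - _tail.load(std::memory_order_acquire) == POOL_SIZE) {
            return false;     // ring already full
        }
        _slots[head % POOL_SIZE] = page;
        _head.store(head + 1, std::memory_order_release);
        return true;
    }

private:
    std::array<void*, POOL_SIZE> _slots{};
    std::atomic<size_t> _head{0}; // advanced only by the completion path
    std::atomic<size_t> _tail{0}; // advanced only by the submission path
};
```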
This is one of the blocks of work needed to allow running OSv on KVM-based EC2 instances, which expose their disks as NVMe devices.
Here are some helpful resources:
Testing this should not be too difficult, as QEMU supports NVMe emulation.
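For reference, a plain QEMU invocation that attaches an emulated NVMe disk looks roughly like the following (the image name, memory size, and serial number are placeholders; the ./scripts/run.py --nvme option mentioned above presumably wires up something equivalent for OSv):

```sh
qemu-system-x86_64 -enable-kvm -m 2G \
    -drive file=disk.img,if=none,id=nvme0 \
    -device nvme,drive=nvme0,serial=deadbeef
```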
Refs #924