usls
usls is a Rust library integrated with ONNX Runtime that provides a collection of state-of-the-art models for computer vision and vision-language tasks, including:
- YOLO Models: YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv9, YOLOv10, YOLOv11
- SAM Models: SAM, SAM2, MobileSAM, EdgeSAM, SAM-HQ, FastSAM
- Vision Models: RTDETR, RTMO, DB, SVTR, Depth-Anything-v1-v2, DINOv2, MODNet, Sapiens, DepthPro
- Vision-Language Models: CLIP, BLIP, GroundingDINO, YOLO-World, Florence2
Supported Models:
| Model | Task / Type | Example | CUDA f32 | CUDA f16 | TensorRT f32 | TensorRT f16 |
|---|---|---|---|---|---|---|
| YOLOv5 | Classification, Object Detection, Instance Segmentation | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv6 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv7 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv8 | Object Detection, Instance Segmentation, Classification, Oriented Object Detection, Keypoint Detection | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv9 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv10 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| YOLOv11 | Object Detection, Instance Segmentation, Classification, Oriented Object Detection, Keypoint Detection | demo | ✅ | ✅ | ✅ | ✅ |
| RTDETR | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| FastSAM | Instance Segmentation | demo | ✅ | ✅ | ✅ | ✅ |
| SAM | Segment Anything | demo | ✅ | ✅ | | |
| SAM2 | Segment Anything | demo | ✅ | ✅ | | |
| MobileSAM | Segment Anything | demo | ✅ | ✅ | | |
| EdgeSAM | Segment Anything | demo | ✅ | ✅ | | |
| SAM-HQ | Segment Anything | demo | ✅ | ✅ | | |
| YOLO-World | Object Detection | demo | ✅ | ✅ | ✅ | ✅ |
| DINOv2 | Vision-Self-Supervised | demo | ✅ | ✅ | ✅ | ✅ |
| CLIP | Vision-Language | demo | ✅ | ✅ | ✅ Visual, ❌ Textual | ✅ Visual, ❌ Textual |
| BLIP | Vision-Language | demo | ✅ | ✅ | ✅ Visual, ❌ Textual | ✅ Visual, ❌ Textual |
| DB | Text Detection | demo | ✅ | ✅ | ✅ | ✅ |
| SVTR | Text Recognition | demo | ✅ | ✅ | ✅ | ✅ |
| RTMO | Keypoint Detection | demo | ✅ | ✅ | ❌ | ❌ |
| YOLOPv2 | Panoptic Driving Perception | demo | ✅ | ✅ | ✅ | ✅ |
| Depth-Anything v1 & v2 | Monocular Depth Estimation | demo | ✅ | ✅ | ❌ | ❌ |
| MODNet | Image Matting | demo | ✅ | ✅ | ✅ | ✅ |
| GroundingDINO | Open-Set Detection With Language | demo | ✅ | ✅ | | |
| Sapiens | Body Part Segmentation | demo | ✅ | ✅ | | |
| Florence2 | A Variety of Vision Tasks | demo | ✅ | ✅ | | |
| DepthPro | Monocular Depth Estimation | demo | ✅ | ✅ | | |

You have two options to link the ONNX Runtime library:

- Option 1: link manually
  - For detailed setup instructions, refer to the ORT documentation.
  - Download the ONNX Runtime package from the Releases page.
  - Set up the library path by exporting the `ORT_DYLIB_PATH` environment variable:

        export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.19.0

- Option 2: download automatically
  - Just use `--features auto`:

        cargo run -r --example yolo --features auto
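When linking manually, a wrong or stale `ORT_DYLIB_PATH` is an easy mistake to make. The sketch below shows one way to sanity-check the variable from Rust before running a model. It uses only the standard library; `DylibCheck` and `check_dylib_path` are hypothetical names introduced for illustration and are not part of usls or ort. It only checks that the path exists as a file and does not attempt to load the library.

```rust
use std::env;
use std::path::PathBuf;

/// Outcome of inspecting an `ORT_DYLIB_PATH`-style value.
/// Hypothetical type for illustration; not part of usls or ort.
#[derive(Debug, PartialEq)]
enum DylibCheck {
    /// The variable is unset or contains only whitespace.
    Unset,
    /// The variable names a path that is not an existing file.
    Missing(PathBuf),
    /// The variable names an existing file.
    Found(PathBuf),
}

/// Classify a raw value as it would be read from the environment.
fn check_dylib_path(raw: Option<&str>) -> DylibCheck {
    match raw.map(str::trim) {
        None | Some("") => DylibCheck::Unset,
        Some(p) => {
            let path = PathBuf::from(p);
            if path.is_file() {
                DylibCheck::Found(path)
            } else {
                DylibCheck::Missing(path)
            }
        }
    }
}

fn main() {
    // Read the variable that the manual-linking option asks you to export.
    let raw = env::var("ORT_DYLIB_PATH").ok();
    match check_dylib_path(raw.as_deref()) {
        DylibCheck::Unset => println!("ORT_DYLIB_PATH is not set"),
        DylibCheck::Missing(p) => {
            println!("ORT_DYLIB_PATH points to a missing file: {}", p.display())
        }
        DylibCheck::Found(p) => println!("ORT_DYLIB_PATH looks usable: {}", p.display()),
    }
}
```

If the check reports a missing file, compare the version suffix in the exported path with the file name in the package you downloaded.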

To run a demo:

    cargo run -r --example yolo   # other examples: blip, clip, yolop, svtr, db, ...

- Add `usls` as a dependency of your project:

      cargo add usls

  Or use a specific commit:

      [dependencies]
      usls = { git = "https://github.com/jamjamjon/usls", rev = "commit-sha" }

- Then follow these steps:
  - Build a model with the provided `models` and `Options`
  - Load images, videos, and streams with `DataLoader`
  - Run inference
  - Retrieve inference results from `Vec<Y>`
  - Annotate inference results with `Annotator`
  - Display images and write them to video with `Viewer`

Example code:

    use usls::{
        models::YOLO, Annotator, DataLoader, Nms, Options, Viewer, Vision, YOLOTask, YOLOVersion,
    };

    fn main() -> anyhow::Result<()> {
        // Build model with Options
        let options = Options::new()
            .with_trt(0)
            .with_model("yolo/v8-m-dyn.onnx")?
            .with_yolo_version(YOLOVersion::V8) // YOLOVersion: V5, V6, V7, V8, V9, V10, RTDETR
            .with_yolo_task(YOLOTask::Detect) // YOLOTask: Classify, Detect, Pose, Segment, Obb
            .with_ixx(0, 0, (1, 2, 4).into())
            .with_ixx(0, 2, (0, 640, 640).into())
            .with_ixx(0, 3, (0, 640, 640).into())
            .with_confs(&[0.2]);
        let mut model = YOLO::new(options)?;

        // Build DataLoader to load image(s), video, stream
        let dl = DataLoader::new(
            // "./assets/bus.jpg", // local image
            // "images/bus.jpg", // remote image
            // "../images-folder", // local images (from folder)
            // "../demo.mp4", // local video
            // "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4", // online video
            "rtsp://admin:[email protected]:554/h264/ch1/", // stream
        )?
        .with_batch(2) // iterate with batch_size = 2
        .build()?;

        // Build annotator
        let annotator = Annotator::new()
            .with_bboxes_thickness(4)
            .with_saveout("YOLO-DataLoader");

        // Build viewer
        let mut viewer = Viewer::new().with_delay(10).with_scale(1.).resizable(true);

        // Run and annotate results
        for (xs, _) in dl {
            let ys = model.forward(&xs, false)?;
            // annotator.annotate(&xs, &ys);
            let images_plotted = annotator.plot(&xs, &ys, false)?;

            // Show image
            viewer.imshow(&images_plotted)?;

            // Stop when the window is closed or Escape is pressed
            if !viewer.is_open() || viewer.is_key_pressed(usls::Key::Escape) {
                break;
            }

            // Write video
            viewer.write_batch(&images_plotted)?;

            // Retrieve inference results
            for y in ys {
                // Bounding boxes
                if let Some(bboxes) = y.bboxes() {
                    for bbox in bboxes {
                        println!(
                            "Bbox: {}, {}, {}, {}, {}, {}",
                            bbox.xmin(),
                            bbox.ymin(),
                            bbox.xmax(),
                            bbox.ymax(),
                            bbox.confidence(),
                            bbox.id(),
                        );
                    }
                }
            }
        }

        // Finish video write
        viewer.finish_write()?;

        Ok(())
    }
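Two ideas in the example above, iterating in batches (`with_batch(2)`) and keeping only results above a confidence threshold (`with_confs(&[0.2])`), can be illustrated without any model. The following is a minimal plain-Rust sketch, not how usls implements either step: `Detection`, `batches`, and `filter_by_confidence` are hypothetical names, and `Detection` is only a stand-in for the real bounding-box type.

```rust
/// Hypothetical stand-in for a detection result; not the usls bounding-box type.
#[derive(Debug, Clone, PartialEq)]
struct Detection {
    class_id: usize,
    confidence: f32,
}

/// Split `items` into consecutive batches of at most `batch_size` elements.
/// The last batch may be shorter than the others.
fn batches<T>(items: &[T], batch_size: usize) -> Vec<&[T]> {
    assert!(batch_size > 0, "batch_size must be positive");
    items.chunks(batch_size).collect()
}

/// Keep only the detections whose confidence is at least `threshold`.
fn filter_by_confidence(dets: &[Detection], threshold: f32) -> Vec<Detection> {
    dets.iter()
        .filter(|d| d.confidence >= threshold)
        .cloned()
        .collect()
}

fn main() {
    // Five placeholder frames grouped with a batch size of 2.
    let frames = ["frame0", "frame1", "frame2", "frame3", "frame4"];
    for (i, batch) in batches(&frames, 2).iter().enumerate() {
        println!("batch {}: {:?}", i, batch);
    }

    // Made-up detections, filtered with the same 0.2 threshold as the example.
    let dets = vec![
        Detection { class_id: 0, confidence: 0.9 },
        Detection { class_id: 5, confidence: 0.1 },
        Detection { class_id: 2, confidence: 0.45 },
    ];
    let kept = filter_by_confidence(&dets, 0.2);
    println!("kept {} of {} detections", kept.len(), dets.len());
}
```

With a batch size of 2, five inputs yield three batches, the last holding a single item, so downstream code should not assume every batch is full.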
This project is licensed under LICENSE.