[DAR-5390][External] Add 'extract video-artifacts' command #980

Merged
17 changes: 17 additions & 0 deletions .github/workflows/JOB_tests.yml
@@ -48,6 +48,23 @@ jobs:
          pip install wheel && \
          pip install --upgrade setuptools && \
          pip install --editable '.[test,ml,medical,dev, ocv]'"

      - name: Install ffmpeg (Ubuntu)
        if: matrix.os == 'ubuntu-latest'
        shell: bash
        run: |
          sudo apt-get update
          sudo apt-get install -y ffmpeg

      - name: Install ffmpeg (macOS)
        if: matrix.os == 'macos-latest'
        shell: bash
        run: brew install ffmpeg

      - name: Install ffmpeg (Windows)
        if: matrix.os == 'windows-latest'
        shell: pwsh
        run: choco install ffmpeg -y

      - name: Run pytest
        shell: bash # Stops Windows hosts from using PowerShell
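
Reviewer note: the three steps above put `ffmpeg` on each runner's `PATH`. A test or a pre-flight check could confirm that before exercising the new command. The following is a minimal sketch only; `find_ffmpeg` is a hypothetical helper name and is not part of this PR.

```python
import shutil
from typing import Optional


def find_ffmpeg(binary: str = "ffmpeg") -> Optional[str]:
    """Return the absolute path of the binary if it is on PATH, else None.

    Hypothetical helper for illustration; darwin-py may detect FFmpeg differently.
    """
    return shutil.which(binary)


path = find_ffmpeg()
print(f"ffmpeg found at {path}" if path else "ffmpeg not found on PATH")
```

In a pytest suite, the same check could feed a `skipif` marker so that FFmpeg-dependent tests are skipped rather than failed on machines without the binary.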
1 change: 0 additions & 1 deletion .gitignore
@@ -13,7 +13,6 @@ darwin_py.egg-info/PKG-INFO
*.jpg
*.bpm
*.mov
*.mp4
*.txt

# from https://raw.githubusercontent.com/github/gitignore/master/Python.gitignore
1 change: 1 addition & 0 deletions Dockerfile
@@ -23,6 +23,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
    libssl-dev \
    python3-dev \
    curl \
    ffmpeg \
    && rm -rf /var/lib/apt/lists/*

# Install Poetry in a known location and add to PATH
21 changes: 12 additions & 9 deletions README.md
@@ -15,15 +15,16 @@ Darwin-py can both be used from the [command line](#usage-as-a-command-line-inte

Main functions are (but not limited to):

- Client authentication
- Listing local and remote datasets
- Create/remove datasets
- Upload/download data to/from remote datasets
- Direct integration with PyTorch dataloaders
- Client authentication
- Listing local and remote datasets
- Create/remove datasets
- Upload/download data to/from remote datasets
- Direct integration with PyTorch dataloaders
- Extracting video artifacts

Support tested for python 3.9 - 3.12

## 🏁 Installation
## 🏁 Installation

```
pip install darwin-py
@@ -43,11 +44,14 @@ If you wish to use video frame extraction, then you can use the `ocv` flag to in
pip install darwin-py[ocv]
```

If you wish to use video artifacts extraction, then you need to install [FFmpeg](https://www.ffmpeg.org/download.html)

To run test, first install the `test` extra package

```
pip install darwin-py[test]
```

### Development

See our development and QA environment installation recommendations [here](docs/DEV.md)
@@ -132,8 +136,8 @@ For videos, the frame rate extraction rate can be specified by adding `--fps <fr

Supported extensions:

- Video files: [`.mp4`, `.bpm`, `.mov` formats].
- Image files [`.jpg`, `.jpeg`, `.png` formats].
- Video files: [`.mp4`, `.bpm`, `.mov` formats].
- Image files [`.jpg`, `.jpeg`, `.png` formats].

```
$ darwin dataset push test /path/to/folder/with/images
@@ -174,7 +178,6 @@ A minimal example to download a dataset is provided below and a more extensive o

[./darwin_demo.py](https://github.com/v7labs/darwin-py/blob/master/darwin_demo.py).


```python
from darwin.client import Client

12 changes: 12 additions & 0 deletions darwin/cli.py
@@ -101,6 +101,18 @@ def _run(args: Namespace, parser: ArgumentParser) -> None:

    elif args.command == "convert":
        f.convert(args.format, args.files, args.output_dir)
    elif args.command == "extract":
        if args.extract_type == "video-artifacts":
            f.extract_video_artifacts(
                source_file=args.source_file,
                output_dir=args.output_dir,
                storage_key_prefix=args.storage_key_prefix,
                fps=args.fps,
                segment_length=args.segment_length,
                repair=args.repair,
            )
        else:
            parser.print_help()
    elif args.command == "dataset":
        if args.action == "remote":
            f.list_remote_datasets(args.all, args.team)
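
Reviewer note: the dispatch above reads `args.command`, `args.extract_type`, and six option attributes, but the parser definition that produces them is not part of this excerpt. The sketch below shows one way nested `argparse` sub-commands could yield those attribute names. Everything in it is an assumption made for illustration: the flag spellings, which arguments are positional or required, and the `build_parser` name. Only the `fps` and `segment_length` defaults are taken from the function signature in this PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; the real darwin-py option definitions may differ."""
    parser = argparse.ArgumentParser(prog="darwin")
    commands = parser.add_subparsers(dest="command")

    # `darwin extract ...`
    extract = commands.add_parser("extract")
    extract_types = extract.add_subparsers(dest="extract_type")

    # `darwin extract video-artifacts ...`
    artifacts = extract_types.add_parser("video-artifacts")
    artifacts.add_argument("source_file")
    artifacts.add_argument("-o", "--output-dir", required=True)
    artifacts.add_argument("--storage-key-prefix", required=True)
    artifacts.add_argument("--fps", type=float, default=0.0)
    artifacts.add_argument("--segment-length", type=int, default=2)
    artifacts.add_argument("--repair", action="store_true")
    return parser


args = build_parser().parse_args(
    ["extract", "video-artifacts", "clip.mp4", "-o", "out", "--storage-key-prefix", "videos/clip"]
)
print(args.command, args.extract_type, args.fps, args.segment_length, args.repair)
```

With a layout like this, running `extract` with no sub-command leaves `extract_type` as `None`, which is the case the `else: parser.print_help()` branch handles.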
41 changes: 40 additions & 1 deletion darwin/cli_functions.py
@@ -60,16 +60,17 @@
)
from darwin.exporter import ExporterNotFoundError, export_annotations, get_exporter
from darwin.exporter.formats import supported_formats as export_formats
from darwin.extractor import video
from darwin.importer import ImporterNotFoundError, get_importer, import_annotations
from darwin.importer.formats import supported_formats as import_formats
from darwin.item import DatasetItem
from darwin.utils import (
    BLOCKED_UPLOAD_ERROR_ALREADY_EXISTS,
    find_files,
    persist_client_configuration,
    prompt,
    secure_continue_request,
    validate_file_against_schema,
    BLOCKED_UPLOAD_ERROR_ALREADY_EXISTS,
)


@@ -1468,3 +1469,41 @@ def _console_theme() -> Theme:

def _has_valid_status(status: str) -> bool:
    return status in ["new", "annotate", "review", "complete", "archived"]


def extract_video_artifacts(
    source_file: str,
    output_dir: str,
    storage_key_prefix: str,
    *,
    fps: float = 0.0,
    segment_length: int = 2,
    repair: bool = False,
) -> None:
    """
    Generate video artifacts (segments, sections, thumbnail, frames manifest).

    Parameters
    ----------
    source_file : str
        Path to input video file
    output_dir : str
        Output directory for artifacts
    storage_key_prefix : str
        Storage key prefix for generated files
    fps : float, optional
        Desired output FPS (0.0 for native), by default 0.0
    segment_length : int, optional
        Length of each segment in seconds, by default 2
    repair : bool, optional
        Whether to attempt to repair video if errors are detected, by default False
    """

    video.extract_artifacts(
        source_file=source_file,
        output_dir=output_dir,
        storage_key_prefix=storage_key_prefix,
        fps=fps,
        segment_length=segment_length,
        repair=repair,
    )
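
Reviewer note: `extract_video_artifacts` forwards its arguments to `video.extract_artifacts` unchanged, and this excerpt does not show whether `fps` or `segment_length` are range-checked further down. If they are not, a caller might want a check along these lines. This is a sketch only: `check_extract_options` is a hypothetical name, and the accepted ranges are assumptions inferred from the docstring ("0.0 for native", "length of each segment in seconds").

```python
def check_extract_options(fps: float = 0.0, segment_length: int = 2) -> None:
    """Hypothetical pre-flight check; not part of this PR.

    Assumes fps must be non-negative (0.0 meaning the video's native rate)
    and segment_length must be a positive number of seconds.
    """
    if fps < 0:
        raise ValueError(f"fps must be >= 0 (0.0 means native), got {fps}")
    if segment_length <= 0:
        raise ValueError(
            f"segment_length must be a positive number of seconds, got {segment_length}"
        )


# The defaults from the signature above pass the check.
check_extract_options()
```

Failing early with a `ValueError` gives the user a clearer message than whatever the underlying FFmpeg invocation would report for an out-of-range value.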
1 change: 1 addition & 0 deletions darwin/extractor/__init__.py
@@ -0,0 +1 @@
"""Video extraction functionality for Darwin."""