A RESTful API for quickly searching images by their content using natural-language queries. It leverages CLIP, OpenAI's multimodal image and language model.
Test coverage is 98%.
- FastAPI
- PostgreSQL with pgvector as a vector index for storing and comparing embeddings.
- SQLAlchemy
- HuggingFace's instance of CLIP
- AWS S3 bucket for image storage
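To make the role of the vector index concrete, here is a minimal pure-Python sketch of the comparison step: ranking stored image embeddings by cosine similarity to a query embedding. The toy 3-dimensional vectors, the example URLs, and the function names are illustrative assumptions. In the running service the embeddings come from CLIP and the comparison happens inside PostgreSQL via pgvector; which distance metric the project uses is not stated here, so cosine similarity is assumed.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def rank_images(query: list[float], stored: dict[str, list[float]], limit: int = 3) -> list[str]:
    """Return up to `limit` image URLs, most similar to the query first."""
    scored = [(url, cosine_similarity(query, emb)) for url, emb in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [url for url, _ in scored[:limit]]


# Hypothetical toy data: real CLIP embeddings have hundreds of dimensions.
STORED = {
    "https://example.com/dog.jpg": [0.9, 0.1, 0.0],
    "https://example.com/cat.jpg": [0.1, 0.9, 0.0],
    "https://example.com/car.jpg": [0.0, 0.1, 0.9],
}
QUERY = [0.8, 0.2, 0.0]  # stands in for the embedding of a text query

TOP = rank_images(QUERY, STORED, limit=2)
```

A vector index such as pgvector performs the same kind of ranking in the database, so the application does not need to load every embedding into memory.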
You will need to:
- Set up an AWS S3 bucket with appropriate permissions for an external service to make API calls to it.
- Specify environment variables in a .env file (see below).
- Create a virtual environment in the root of the project: python3 -m venv venv
- Activate the virtual environment: source venv/bin/activate
- Install the dependencies: pip3 install -r requirements.txt
- Create a local relational database for storing image URLs, embeddings, and metadata. Update the value of SQLALCHEMY_DATABASE_URI in the .env file accordingly.
- Enable the pgvector extension in the database by connecting to it and running CREATE EXTENSION vector;
These are the environment variables you will need to specify in a .env file:
# Database
SQLALCHEMY_DATABASE_URI = <your-database-uri> # postgresql://postgres:password@db:5433/semantic_pic if using this project's compose.yaml values
# App auth
ADMIN_PW = <strong-admin-password> # For authenticating the protected POST and DELETE routes via Authorization header.
# AWS
AWS_ACCESS_KEY = <your aws access key>
AWS_SECRET_ACCESS_KEY = <your aws secret access key>
REGION = <your aws region>
BUCKET_NAME = <your aws s3 bucket name>
# CORS
ALLOWED_ORIGINS=[<list>, <of>, <allowed>, <origins>]
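Before starting the server, it can help to check that every variable listed above is present. The following is a minimal sketch using only the standard library; the helper names are hypothetical and not part of this project, and it assumes ALLOWED_ORIGINS is written as a JSON list of strings, which the project's own settings loader may or may not require.

```python
import json
from urllib.parse import urlsplit

# Variable names taken from the .env listing above.
REQUIRED = (
    "SQLALCHEMY_DATABASE_URI",
    "ADMIN_PW",
    "AWS_ACCESS_KEY",
    "AWS_SECRET_ACCESS_KEY",
    "REGION",
    "BUCKET_NAME",
    "ALLOWED_ORIGINS",
)


def missing_settings(env: dict[str, str]) -> list[str]:
    """Return the required variable names that are absent or blank in `env`."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]


def parse_allowed_origins(raw: str) -> list[str]:
    """Parse ALLOWED_ORIGINS, assuming it is a JSON list of strings."""
    origins = json.loads(raw)
    if not isinstance(origins, list) or not all(isinstance(o, str) for o in origins):
        raise ValueError("ALLOWED_ORIGINS must be a JSON list of strings")
    return origins


def describe_database_uri(uri: str) -> dict[str, object]:
    """Split a database URI into parts so typos in host or port are easy to spot."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


# Example with the compose.yaml URI mentioned in the listing above.
DB_INFO = describe_database_uri("postgresql://postgres:password@db:5433/semantic_pic")
```

Calling `missing_settings(dict(os.environ))` after loading the .env file would list anything still unset.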
- To run the dev server, run uvicorn app.main:app --reload
- Once the dev server is running, go to http://localhost:8000/docs to view the Swagger docs. You can use this to test out the API locally.
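Besides the interactive Swagger page, FastAPI serves the underlying OpenAPI schema as JSON, by default at /openapi.json. The sketch below lists the routes from that schema using only the standard library. It assumes the project has not changed FastAPI's default schema URL; the function names are hypothetical, and the route paths in the usage note are placeholders rather than this API's actual routes.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def fetch_openapi_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the OpenAPI schema from a running dev server."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


def list_routes(schema: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs found in an OpenAPI schema, sorted by path."""
    routes = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            # Path items can also hold non-method keys such as "parameters".
            if method in HTTP_METHODS:
                routes.append((method.upper(), path))
    return routes
```

With the dev server running, `list_routes(fetch_openapi_schema())` returns pairs such as `("GET", "/some/path")`, which is a quick way to confirm which routes exist before calling them.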
Follow the first two steps from "Running Locally with a Virtual Environment" above (set up the S3 bucket and create the .env file). Then:
- Make sure you have Docker installed.
- Run docker compose up --build in the root directory to build the images and run the containers.
- Once the containers are running, go to http://localhost:8000/docs to view the Swagger docs. You can use this to test out the API locally.
Tests are configured to run against a test database and test S3 bucket. To run the tests, you will need to:
- Configure a test S3 bucket in AWS with appropriate permissions for an external service to make API calls to it.
- Create a test relational database locally. Enable the pgvector extension in the database by connecting to it and running CREATE EXTENSION vector;
- Update the pytest.ini file in the root of the project. This contains environment variables that, during testing, will override those specified in the .env file:
  [pytest]
  env =
      BUCKET_NAME=<your-test-bucket-name>
      SQLALCHEMY_DATABASE_URI=<your-test-database-uri>
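The override behaviour can be illustrated with a short standard-library sketch: read the env entries from a pytest.ini, then layer them over the values from .env so the test values win. The function names and sample values are hypothetical. The layout of the env key resembles what a plugin such as pytest-env reads, but which plugin this project uses is an assumption.

```python
import configparser


def read_pytest_env(ini_text: str) -> dict[str, str]:
    """Extract KEY=VALUE lines from the `env` option of the [pytest] section."""
    # interpolation=None keeps literal "%" characters in URIs intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    raw = parser.get("pytest", "env", fallback="")
    overrides = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        overrides[key.strip()] = value.strip()
    return overrides


def effective_settings(dotenv: dict[str, str], overrides: dict[str, str]) -> dict[str, str]:
    """Combine settings so that pytest.ini values take precedence over .env values."""
    return {**dotenv, **overrides}


# Hypothetical sample values for illustration only.
SAMPLE_INI = """\
[pytest]
env =
    BUCKET_NAME=my-test-bucket
    SQLALCHEMY_DATABASE_URI=postgresql://postgres:password@localhost:5432/semantic_pic_test
"""
DOTENV = {"BUCKET_NAME": "my-real-bucket", "REGION": "us-east-1"}

EFFECTIVE = effective_settings(DOTENV, read_pytest_env(SAMPLE_INI))
```

Here the test run would use my-test-bucket while REGION, which pytest.ini does not override, keeps its .env value.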
To run all tests, run pytest in the root directory inside the virtual environment or running Docker container.
To run all tests and generate a coverage report, run pytest --cov --cov-report=html:coverage