This is the group repository for the COLX 523 project: building an annotated corpus from scratch, with a browser interface for non-experts.
- Collect a large corpus from the web
- Carry out reliable corpus annotation using external annotators
- Make the corpus searchable through an HTML/JavaScript frontend and a Python backend, distributed using Docker
- Document how the corpus was created