Article Annotator
rmzi edited this page Oct 9, 2014 · 11 revisions
The Article Annotator is a simple tool that lets users tag elements of HTML documents so the tagged documents can be added to our Training Dataset.
The flow is as follows:
- Import HTML documents from the server
  - AJAX call to the server for crawled data stored in MongoDB; probably pull a few at once?
  - Needs: API endpoints to pull Article Models w/ HTML attached
- Render HTML w/ Annotator Tool overlay (Paintbrushes)
  - Render our frame with an interface to cycle through articles, select different brushes, and submit annotations.
  - Use jQuery to append the HTML to our page once it has been fetched
- Select a tool to annotate with (Title, Author, Date, Article Body, etc.)
  - Cycle through brushes, each with its own color
  - Needs: Determine the different types of brushes
- Highlight the HTML element underneath the cursor
  - This will be the most challenging part. We will need to do some tinkering with highlighting the object itself. Perhaps we could add a transparent, colored overlay to the parent element of the text we've highlighted (i.e. the element containing "Lorem Ipsum" becomes the same element with the overlay applied).
- Click to apply the selected brush (i.e. clicking with Title_Brush and add …)
  - Question: What's the best way to edit the DOM in place? Do we need a separate representation of the DOM?
- Present the user with a list of meta tags and let them tag these as well.
- Export annotated HTML to Training Data MongoDB
  - Once annotation is complete, we'll add the final DOM to the original article document and save it to the Training Data MongoDB.
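On the open question of whether a separate representation of the DOM is needed, one possible approach is to leave the fetched HTML untouched and record each annotation as a (path, label) pair. The sketch below is only an illustration under that assumption: `pathTo`, `resolve`, and `annotate` are hypothetical names, and the code relies only on `parentElement` and `children`, which real DOM elements expose. Plain objects stand in for elements so the example runs outside a browser.

```javascript
// Path from `root` down to `target`, as a list of child indexes.
function pathTo(root, target) {
  const path = [];
  for (let node = target; node !== root; node = node.parentElement) {
    if (!node.parentElement) throw new Error("target is not inside root");
    const siblings = Array.from(node.parentElement.children);
    path.unshift(siblings.indexOf(node));
  }
  return path;
}

// Inverse of pathTo: walk down from `root` following the indexes.
function resolve(root, path) {
  return path.reduce((node, i) => node.children[i], root);
}

// Record a label for `target`, keyed by its path so re-tagging overwrites.
function annotate(annotations, root, target, label) {
  annotations[pathTo(root, target).join("/")] = label;
  return annotations;
}

// Tiny stand-in tree; in the browser these would be real DOM elements.
function makeNode(tagName, ...children) {
  const node = { tagName, parentElement: null, children };
  children.forEach((child) => { child.parentElement = node; });
  return node;
}

const heading = makeNode("H1");
const paragraph = makeNode("P");
const article = makeNode("ARTICLE", makeNode("HEADER", heading), paragraph);

const annotations = {};
annotate(annotations, article, heading, "title");
annotate(annotations, article, paragraph, "body");
```

Because the annotations live outside the markup, the original HTML stays byte-for-byte intact and the annotation set can be serialized with `JSON.stringify` alongside it. The trade-off is that the paths only remain valid as long as the stored HTML is parsed into the same tree.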
List of Possible Annotations
- Title
- Subtitle
- Section Title
- Author
- Date
- Location
- Image
- Image Caption
- Body
- Metadata (non-visible)
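The annotation types above map naturally onto a table of brushes. The following is a minimal sketch of such a table with a helper for cycling through it; the `label` strings, the colors, and the names `BRUSHES` and `nextBrush` are all illustrative assumptions, since the actual brush set is still to be determined.

```javascript
// Hypothetical brush table: one entry per annotation type in the list above.
// Labels and colors are placeholders, not decided values.
const BRUSHES = [
  { label: "title", color: "#e6194b" },
  { label: "subtitle", color: "#f58231" },
  { label: "section-title", color: "#ffe119" },
  { label: "author", color: "#3cb44b" },
  { label: "date", color: "#42d4f4" },
  { label: "location", color: "#4363d8" },
  { label: "image", color: "#911eb4" },
  { label: "image-caption", color: "#f032e6" },
  { label: "body", color: "#a9a9a9" },
  { label: "metadata", color: "#000000" },
];

// Return the brush that follows `currentLabel`, wrapping around at the end.
// An unknown or missing label starts the cycle at the first brush.
function nextBrush(currentLabel) {
  const index = BRUSHES.findIndex((brush) => brush.label === currentLabel);
  return BRUSHES[(index + 1) % BRUSHES.length];
}
```

Keeping the brushes in a single array means adding a new annotation type is a one-line change, and the interface can render its brush picker directly from the same data.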
ToDo:
- Set up MongoDB and a simple Node server to act as a gateway
- Use schema from @skillachie to model documents in MongoDB
- Set up AnnotatorFrame w/ interface for fetching/cycling through articles, selecting brushes, and submitting results
- Experiment with different highlighting methods
- Ensure annotated articles are saved correctly to the Training Data MongoDB
- Test everything
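As a starting point for the gateway item, here is a rough sketch using only Node's built-in `http` module. Everything specific in it is an assumption: the `/articles` and `/annotations` paths, the `limit` query parameter, and the in-memory arrays that stand in for the MongoDB collections. The routing logic is kept as a pure function so it can be exercised without opening a socket.

```javascript
const http = require("http");

// In-memory stand-ins for the crawled-article and training-data collections.
// A real gateway would query MongoDB here instead.
function createStore(articles) {
  return { articles, training: [] };
}

// Pure routing function: (method, url, body, store) -> { status, body }.
function route(method, url, body, store) {
  const { pathname, searchParams } = new URL(url, "http://localhost");
  if (method === "GET" && pathname === "/articles") {
    // Pull a few articles at once; default to 5 when no valid limit is given.
    const requested = Number(searchParams.get("limit"));
    const limit = requested > 0 ? requested : 5;
    return { status: 200, body: store.articles.slice(0, limit) };
  }
  if (method === "POST" && pathname === "/annotations") {
    store.training.push(body);
    return { status: 201, body: { saved: store.training.length } };
  }
  return { status: 404, body: { error: "not found" } };
}

// Thin HTTP wrapper around route(); nothing listens until .listen() is called.
function createGateway(store) {
  return http.createServer((req, res) => {
    let raw = "";
    req.on("data", (chunk) => { raw += chunk; });
    req.on("end", () => {
      let result;
      try {
        result = route(req.method, req.url, raw ? JSON.parse(raw) : null, store);
      } catch (err) {
        // Malformed JSON or an unparsable URL ends up here.
        result = { status: 400, body: { error: "bad request" } };
      }
      res.writeHead(result.status, { "Content-Type": "application/json" });
      res.end(JSON.stringify(result.body));
    });
  });
}
```

To try it locally, something like `createGateway(createStore([])).listen(3000)` would start the server on an arbitrary port. Swapping the in-memory store for Mongoose models should only require changing the two branches inside `route`.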
Technology:
- Node.js Server
- MongoDB
- Mongoose (MongoDB object modeling library for Node.js)
- jQuery