Using the Java API
The tutorial exercises below demonstrate some of the cool things you can do with the Wikipedia Miner API.
You can get an exhaustive list of the available classes and methods from the [Javadoc](../../doc).
### Building a command line thesaurus
Tutorial 1: Build an application that lets users type in terms, and receive synonyms, definitions, related topics, and other things you would expect to get from an interactive thesaurus.
Tutorial 2: Extend the thesaurus to resolve conflation issues like CasE vAriaTions, âćçëňŧș, and plural(s).
Tutorial 3: Extend the thesaurus to get better lists of related topics.
Tutorial 4: Extend the thesaurus to get cleaner definitions.
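
To give a feel for the conflation problem that Tutorial 2 addresses, here is a minimal sketch that uses only the Java standard library. `TermNormalizer` and `normalize` are hypothetical names chosen for illustration; they are not part of the Wikipedia Miner API, and the toolkit's own approach (covered in the tutorial) may differ. The plural handling is deliberately naive.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the Wikipedia Miner API.
final class TermNormalizer {

    // Maps a term to a simplified key so that variants of the same term compare equal.
    static String normalize(String term) {
        // Fold case so that "CasE" and "case" match.
        String s = term.trim().toLowerCase(Locale.ROOT);

        // Decompose accented characters (NFD), then drop the combining marks.
        // Letters with no canonical decomposition are left unchanged.
        s = Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");

        // Naive plural stripping: drop one trailing "s" from terms longer than
        // three characters, unless the term ends in "ss".
        if (s.length() > 3 && s.endsWith("s") && !s.endsWith("ss")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        for (String term : new String[] {"CasE vAriaTions", "Cafés", "Class"}) {
            System.out.println(term + " -> " + normalize(term));
        }
    }
}
```

With this sketch, "CasE vAriaTions" and "case variation" normalize to the same key, so a lookup keyed on the normalized form would treat them as one term.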
### Building a command line document annotator
Tutorial 5: Build an application that allows users to type in snippets of text, and returns that text annotated with links to the relevant Wikipedia topics.
Tutorial 6: Build a workbench for annotation experiments, including generating training and testing data, and evaluating different settings and classifiers.
### Doing stuff at scale
Tutorial 7: Learn how to parallelize a task that takes Wikipedia XML dumps as input.
Tutorial 8: Learn how to parallelize a task in which each node must talk to the toolkit, without creating a bottleneck.