Mac-Morpho is a corpus of Brazilian Portuguese texts annotated with part-of-speech tags. Its first version was released in 2003 [1], and since then, two revisions have been made in order to improve the quality of the resource [2, 3].
The corpus is available for download split into train, development and test sections. These account for 76%, 4% and 20% of the corpus, respectively (the numbers are unusual because the corpus was first split 80%/20% into train/test, and then 5% of the train section was set aside for development). This split was used in [3], and new POS tagging research with Mac-Morpho is encouraged to follow it so that results can be compared consistently.
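The arithmetic behind these percentages can be checked with a short sketch. The function name `split_sizes` and the figure of 10,000 sentences are hypothetical, chosen only for illustration; they are not the actual corpus size, and the code does not reproduce the official split, which should be taken from the downloadable files.

```python
def split_sizes(n_sentences, test_pct=20, dev_pct_of_train=5):
    """Return (train, dev, test) sizes for a two-stage split.

    Stage 1: hold out `test_pct` percent of everything as the test section.
    Stage 2: hold out `dev_pct_of_train` percent of the remaining
             train section as the development section.
    Integer arithmetic is used so the counts are exact.
    """
    n_test = n_sentences * test_pct // 100
    n_train_full = n_sentences - n_test
    n_dev = n_train_full * dev_pct_of_train // 100
    n_train = n_train_full - n_dev
    return n_train, n_dev, n_test


# Hypothetical corpus of 10,000 sentences (not the real Mac-Morpho size).
train, dev, test = split_sizes(10_000)
for name, size in (("train", train), ("dev", dev), ("test", test)):
    print(f"{name}: {size} sentences ({100 * size / 10_000:.0f}%)")
```

Because the development section is 5% of the 80% train portion, it amounts to 0.80 × 0.05 = 4% of the whole corpus, which leaves 0.80 × 0.95 = 76% for training.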
Download Mac-Morpho
Download annotation manual (in Portuguese)
NOTE: the manual was written for the original annotation, i.e., before the changes to the tagset were introduced. Therefore, it does not reflect the current state of the corpus.
Disclaimer: Mac-Morpho versions 1, 2 and 3 are licensed under a Creative Commons Attribution 4.0 International License. This means you can distribute, remix, tweak, and build upon any Mac-Morpho version, even commercially, as long as you give us credit for the original creation. Mac-Morpho License.