You probably want miso-belica's jusText, which is far better maintained than this repository.
This is a lightly patched copy of the original repository from Google Code.
The original author is Jan Pomikalek, and the original site is code.google.com/p/justext/. However, I had trouble passing unicode strings directly to his code (it seems to assume its input is never unicode), which did not work for my application.
So this version now checks whether the input is unicode.
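The check described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in this repository: the helper name `to_text` and the UTF-8 default encoding are assumptions, and the sketch is written for Python 3, where `str` is the unicode type and `bytes` is the non-unicode type.

```python
def to_text(html, encoding="utf-8"):
    """Return html as a text (unicode) string, decoding only when needed.

    Hypothetical helper: the name and the default encoding are assumptions.
    """
    if isinstance(html, str):
        # Already unicode: pass it through untouched.
        return html
    if isinstance(html, bytes):
        # Byte input: decode it, replacing undecodable bytes instead of failing.
        return html.decode(encoding, errors="replace")
    raise TypeError("expected str or bytes, got %s" % type(html).__name__)


if __name__ == "__main__":
    print(to_text(b"<p>caf\xc3\xa9</p>"))
    print(to_text("<p>café</p>"))
```

Normalizing the input once, at the entry point, means the rest of the code only ever has to handle one string type.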
Requirements:
- lxml (>=2.2.4)
To install the package, type:
python setup.py install
Or, to install it in development mode:
python setup.py develop
Or, to install it directly from GitHub with pip:
pip install git+https://github.com/chbrown/justext.git
To run it on an HTML file with the English stoplist:
justext -s English YourPage.html
For usage information, see:
justext --help