An intelligent spam filtering system built using a custom Naive Bayes classifier
This app is built directly on the work I did on https://github.com/prodicus/spammy
For more screenshots
Desktop view | Mobile View |
---|---|
desktop demo screens | mobile demo screens |
Yes, we do provide an API for our service!
General Syntax
$ curl -H "Content-Type: application/json" \
-X POST -d \
'{"email_text":"SAMPLE EMAIL TEXT"}' \
https://plino.herokuapp.com/api/v1/classify/
Show me an example
You thought I was lying!
$ curl -H "Content-Type: application/json" \
-X POST -d \
'{"email_text":"Dear Tasdik, I would like to immediately transfer 10000 thousand dollars to your account as my beloved husband has expired and I have nobody to ask for to transfer the money to your account. I come from the family of the royal prince of burkino fasa and I would be more than obliged to take your help on this matter. Would you care to share your bank account details with me in the next email conversation that we have? -regards -Liah herman"}' \
https://plino.herokuapp.com/api/v1/classify/
JSON response
{
"email_class": "spam",
"email_text": "Dear Tasdik, I would like to immediately transfer 10000 thousand dollars to your account as my beloved husband has expired and I have nobody to ask for to transfer the money to your account. I come from the family of the royal prince of burkino fasa and I would be more than obliged to take your help on this matter. Would you care to share your bank account details with me in the next email conversation that we have? -regards -Liah herman",
"status": 200
}
How can we forget our beloved requests module!
>>> import requests
>>> import json
>>> import pprint
>>>
>>> api_url = "https://plino.herokuapp.com/api/v1/classify/"
>>> payload = {
...     'email_text': 'Dear Tasdik, I would like to immediately transfer 10000 '
...                   'thousand dollars to your account as my beloved husband has '
...                   'expired and I have nobody to ask for to transfer the money '
...                   'to your account. I come from the family of the royal prince '
...                   'of burkino fasa and I would be more than obliged to take '
...                   'your help on this matter. Would you care to share your bank '
...                   'account details with me in the next email conversation that '
...                   'we have? -regards -Liah herman'
... }
>>>
>>> headers = {'content-type': 'application/json'}
>>> # query our API
>>> response = requests.post(api_url, data=json.dumps(payload), headers=headers)
>>> response.status_code
200
>>> pprint.pprint(response.json())
{
'email_class': 'spam',
'email_text': 'Dear Tasdik, I would like to immediately transfer 10000 '
'thousand dollars to your account as my beloved husband has '
'expired and I have nobody to ask for to transfer the money '
'to your account. I come from the family of the royal prince '
'of burkino fasa and I would be more than obliged to take '
'your help on this matter. Would you care to share your bank '
'account details with me in the next email conversation that '
'we have? -regards -Liah herman',
'status': 200
}
>>>
The requests module really makes our lives easy, and I use it all the time. But, sigh, there should be an example using the standard library too, so here it is:
>>> import urllib.request
>>> import json
>>> import pprint
>>>
>>> url = "https://plino.herokuapp.com/api/v1/classify/"
>>> req = urllib.request.Request(url)
>>> req.add_header(
...     'Content-Type',
...     'application/json; charset=utf-8'
... )
>>>
>>> body = {
...     'email_text': 'Dear Tasdik, I would like to immediately transfer 10000 '
...                   'thousand dollars to your account as my beloved husband has '
...                   'expired and I have nobody to ask for to transfer the money '
...                   'to your account. I come from the family of the royal prince '
...                   'of burkino fasa and I would be more than obliged to take '
...                   'your help on this matter. Would you care to share your bank '
...                   'account details with me in the next email conversation that '
...                   'we have? -regards -Liah herman'
... }
>>> json_data = json.dumps(body).encode('utf-8') # needs to be bytes
>>> req.add_header('Content-Length', len(json_data))
>>>
>>> with urllib.request.urlopen(req, json_data) as f:
... print(f.read().decode('utf-8'))
...
{
"email_class": "spam",
"email_text": "Dear Tasdik, I would like to immediately transfer 10000 thousand dollars to your account as my beloved husband has expired and I have nobody to ask for to transfer the money to your account. I come from the family of the royal prince of burkino fasa and I would be more than obliged to take your help on this matter. Would you care to share your bank account details with me in the next email conversation that we have? -regards -Liah herman",
"status": 200
}
>>>
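If you would rather call the API from a script than from the REPL, the standard library example above can be wrapped in a few small functions. This is only a sketch: the names build_request, parse_response and classify are made up for illustration and are not part of plino. It assumes the response has the shape shown in the sample JSON above. How the API reports errors is not covered here, so only the success case is handled.

```python
import json
import urllib.request

# Endpoint taken from the examples above.
API_URL = "https://plino.herokuapp.com/api/v1/classify/"


def build_request(email_text, url=API_URL):
    """Build a POST request carrying the JSON payload (hypothetical helper)."""
    body = json.dumps({"email_text": email_text}).encode("utf-8")  # must be bytes
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def parse_response(raw_body):
    """Return the 'email_class' field from a JSON response body (bytes or str)."""
    if isinstance(raw_body, bytes):
        raw_body = raw_body.decode("utf-8")
    return json.loads(raw_body)["email_class"]


def classify(email_text, timeout=10):
    """Send the text to the API and return its class. Performs a network call."""
    with urllib.request.urlopen(build_request(email_text), timeout=timeout) as f:
        return parse_response(f.read())
```

Keeping the request building and the response parsing separate from the network call means both can be checked offline, for example by feeding parse_response the sample JSON shown earlier.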
Built upon the giant shoulders of (in no particular order)
- Flask because I ♥ Flask more than Django
- Flask-Cache for caching
- nltk for text pre-processing
- gunicorn as the production server
- Jinja2 as the templating engine
- dill for deserializing complex Python objects
$ virtualenv env # Create virtual environment
$ source env/bin/activate # Activate it so python and pip use the virtual environment
(env)$ git clone https://github.com/prodicus/plino.git
(env)$ cd plino
(env)$ pip install -r requirements.txt
$ make run
Refer to CONTRIBUTING.md for details.
- Nitesh Sharma (sinscary) : UI dev
- Sahil Dua (sahildua2305): Test cases
This repo is built directly on the work I did on prodicus/spammy
The pickled classifier was trained on close to 33,000 emails picked from the publicly available Enron dataset. You can find the full_corpus directory, which holds the training emails, here
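For readers who are new to Naive Bayes, here is a toy version of the idea. It is not the classifier that plino or spammy ships. It is a minimal multinomial Naive Bayes with add-one smoothing, written with the standard library only, and the class name TinyNaiveBayes, the "ham" label and the four training sentences are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing. Illustrative only."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training docs
        self.vocabulary = set()

    def train(self, examples):
        """examples is an iterable of (text, label) pairs."""
        for text, label in examples:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def classify(self, text):
        """Return the label with the highest log posterior score."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        # Words never seen in training carry no evidence, so they are skipped.
        words = [w for w in tokenize(text) if w in self.vocabulary]
        scores = {}
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                # Smoothed log likelihood of the word given the label.
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return max(scores, key=scores.get)


# Made-up training data, far too small for real use.
clf = TinyNaiveBayes()
clf.train([
    ("transfer money to your bank account now", "spam"),
    ("win money claim your prize now", "spam"),
    ("meeting notes attached for tomorrow", "ham"),
    ("lunch tomorrow with the project team", "ham"),
])
print(clf.classify("please share your bank account details to transfer money"))
print(clf.classify("see you at the meeting tomorrow"))
```

Working in log space avoids multiplying many tiny probabilities together, and the add-one smoothing keeps a word that was seen under only one label from zeroing out the other label's score.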
I will leave that for you to decide. But for the question's sake, decent enough! 😄
- Deploying to Heroku
- Creating a REST API
- Improving the UI
- Writing tests
- Simple API authentication
Licensed under GNU GPLv3
plino: A spam filtering system
Copyright (C) 2016 Tasdik Rahman
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
You can find the full copy of the LICENSE here