This is a simple library to interface with the HN Search API (provided by Algolia).
👉 Note: As an example, I used this library to download ALL Hacker News posts and made them available as a public dataset on Kaggle.
$ pip install python-hn
Check out the Interactive Docs to try the library without installing it.
from hn import search_by_date
# Search everything (stories, comments, etc) containing the keyword 'python'
search_by_date('python')
# Search everything (stories, comments, etc) by author 'pg' containing the keyword 'lisp', created before 2018-01-01
search_by_date('lisp', author='pg', created_at__lt='2018-01-01')
# Search only stories
search_by_date('lisp', author='pg', stories=True, created_at__lt='2018-01-01')
# Search stories *or* comments
search_by_date(q='lisp', author='pg', stories=True, comments=True, created_at__lt='2018-01-01')
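For orientation, here is a rough sketch of the kind of request URL a call like the ones above could correspond to. It assumes the public `search_by_date` endpoint of the HN Search API and its `query`, `tags` and `numericFilters` parameters; `build_url` is a hypothetical helper written for this illustration and is not part of this library.

```python
from urllib.parse import urlencode

# Public HN Search API endpoint (sorted by date, most recent first).
BASE_URL = "https://hn.algolia.com/api/v1/search_by_date"


def build_url(query, tags=None, numeric_filters=None):
    """Hypothetical helper: assemble a search_by_date request URL."""
    params = {"query": query}
    if tags:
        params["tags"] = tags
    if numeric_filters:
        params["numericFilters"] = numeric_filters
    # urlencode percent-encodes reserved characters such as ',' and '>'.
    return f"{BASE_URL}?{urlencode(params)}"


print(build_url("python"))
# https://hn.algolia.com/api/v1/search_by_date?query=python
print(build_url("lisp", tags="story,author_pg"))
# https://hn.algolia.com/api/v1/search_by_date?query=lisp&tags=story%2Cauthor_pg
```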
Tags are part of the HN Search API provided by Algolia; you can read more in their docs. Tags can be combined to form complex queries, for example:
# All the comments in the story `6902129`
tags = PostType('comment') & StoryID('6902129')
The available tags are:
- PostType: with options story, comment, poll, pollopt, show_hn, ask_hn, front_page.
- Author: receives the username as param (Author('pg')).
- StoryID: receives the story id (StoryID('6902129')).
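To make the tag composition more concrete, here is a toy model of how `&` and `|` could be turned into the tag string the HN Search API expects, where comma-separated tags are ANDed and a parenthesized group is ORed. The `Tag` class and the `post_type`, `author` and `story_id` helpers are hypothetical stand-ins for illustration; the library's own classes may work differently.

```python
class Tag:
    """Toy tag expression (illustration only, not the library's class)."""

    def __init__(self, expr):
        self.expr = expr

    def __and__(self, other):
        # AND: the API treats comma-separated tags as a conjunction.
        return Tag(f"{self.expr},{other.expr}")

    def __or__(self, other):
        # OR: tags wrapped in parentheses are treated as alternatives.
        return Tag(f"({self.expr},{other.expr})")

    def __str__(self):
        return self.expr


def post_type(kind):
    return Tag(kind)


def author(username):
    return Tag(f"author_{username}")


def story_id(sid):
    return Tag(f"story_{sid}")


# All the comments in the story 6902129
print(post_type("comment") & story_id("6902129"))
# comment,story_6902129

# Stories *or* polls by the author pg (note the explicit parentheses)
print(author("pg") & (post_type("story") | post_type("poll")))
# author_pg,(story,poll)
```

Python gives `&` higher precedence than `|`, so the OR group needs its own parentheses, as in the second example.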
Filters can be applied to restrict the search by:
- Creation Date: created_at
- Points: points
- Number of comments: num_comments
They accept the >, <, >=, <= operators with a syntax similar to Django's:
- lt (<): Lower than. Example: points__lt=100
- lte (<=): Lower than or equal to. Example: points__lte=100
- gt (>): Greater than. Example: created_at__gt='2018' (created after 2018-01-01).
- gte (>=): Greater than or equal to. Example: num_comments__gte=50
Examples (See Algolia docs for more info):
# Created after October 1st, 2018
search_by_date(created_at__gt='2018-10')
# Created after October 1st, 2017 and before January 1st 2018
search_by_date(created_at__gt='2017-10', created_at__lt='2018')
# Stories with *exactly* 1000 points
search_by_date(tags=PostType('story'), points=1000)
# Comments with more than 50 points
search_by_date(tags=PostType('comment'), points__gt=50)
# Stories with 100 comments or more
search_by_date(tags=PostType('story'), num_comments__gte=100)
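As a rough picture of what the Django-style suffixes above stand for, the sketch below translates keyword filters into the `numericFilters` string of the HN Search API. It assumes dates are sent as Unix timestamps in the `created_at_i` field and that partial dates such as '2018' or '2018-10' are padded to the first day, in UTC. `to_timestamp` and `to_numeric_filters` are hypothetical helpers, not this library's implementation.

```python
from datetime import datetime, timezone

# Django-style suffixes mapped to the comparison operators listed above.
OPERATORS = {"lt": "<", "lte": "<=", "gt": ">", "gte": ">="}


def to_timestamp(text):
    """Accept 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD'; pad missing parts with 1."""
    pieces = [int(p) for p in text.split("-")]
    pieces += [1] * (3 - len(pieces))
    return int(datetime(*pieces, tzinfo=timezone.utc).timestamp())


def to_numeric_filters(**filters):
    """Hypothetical helper: build a numericFilters string from kwargs."""
    parts = []
    for key, value in filters.items():
        field, _, suffix = key.partition("__")
        # No suffix means an exact match, e.g. points=1000.
        op = OPERATORS[suffix] if suffix else "="
        if field == "created_at":
            # Assumption: dates are compared through the created_at_i field.
            field = "created_at_i"
            value = to_timestamp(value)
        parts.append(f"{field}{op}{value}")
    return ",".join(parts)


print(to_numeric_filters(points__gt=50, num_comments__gte=100))
# points>50,num_comments>=100
print(to_numeric_filters(created_at__gt="2018"))
# created_at_i>1514764800
```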
[TODO]
Current milestone: https://github.com/santiagobasulto/python-hacker-news/milestone/2
- V0.0.4: Other endpoints: /search, /users, /items (CURRENT)
- V0.0.3: Post type aliases, improved API
- V0.0.2: Functioning API
- V0.0.1: Initial Version