Custom Projects List Feature #14

Open
wants to merge 57 commits into
base: master
from
Changes from 20 commits
d26d6b1
Add fork:false to Github queries
mrthankyou Feb 11, 2021
2e58640
Initial work setting up custom LGTM project list curation
mrthankyou Feb 17, 2021
d4098d3
Clean up code and get basic cache parsing file setup
mrthankyou Feb 17, 2021
e69003a
Continued work on custom project lists feature
mrthankyou Feb 17, 2021
02f16a3
Fix misc issues
mrthankyou Feb 17, 2021
d3898cb
Add comment and ignore cache files
mrthankyou Feb 17, 2021
8e839a8
Refactor code
mrthankyou Feb 17, 2021
5c996fc
Reword text
mrthankyou Feb 17, 2021
7bd1bf0
Revert stars to accurate count
mrthankyou Feb 17, 2021
3f8d336
Remove comment
mrthankyou Feb 17, 2021
ca2bbc6
Update README.md
mrthankyou Feb 17, 2021
69543fb
Add custom project list feature to search term script
mrthankyou Feb 17, 2021
a687d35
Save only real projects to LGTM project lists
mrthankyou Feb 17, 2021
a4133cc
Remove unnecessary modules
mrthankyou Feb 17, 2021
b5ecc8a
Create cache folder if it already doesn't exist
mrthankyou Feb 18, 2021
c770a2f
Add draft for build in progress guard clause
mrthankyou Feb 18, 2021
b83bacf
Accept both proto and real projects
mrthankyou Feb 18, 2021
1b2982a
Add ProjectBuild and ProjectBuilds classes
mrthankyou Feb 19, 2021
0bdd4cc
Remove logs and add new request for proto projects
mrthankyou Feb 19, 2021
88e3793
Save more project data to cache files
mrthankyou Feb 19, 2021
f515563
Refactor how we move repos to LGTM lists
mrthankyou Feb 19, 2021
a853b98
Update README with LGTM build process info
mrthankyou Feb 19, 2021
01842f7
Add Python documentation for functions
mrthankyou Feb 21, 2021
2313cc3
Add comment
mrthankyou Feb 22, 2021
32d4fd9
Remove unnecessary comments
mrthankyou Feb 22, 2021
6c825f6
Add guard clauses and improved project filtering
mrthankyou Feb 22, 2021
3922973
Increase timer
mrthankyou Feb 22, 2021
a3cf8e2
Uncomment code
mrthankyou Feb 22, 2021
50fc91e
Remove unnecessary comment
mrthankyou Feb 22, 2021
2cd04b5
Add HTTP retries
mrthankyou Feb 22, 2021
c8e33ae
Remove unnecessary prints
mrthankyou Feb 22, 2021
58b4d1e
Fix various issues with moving repos to lists
mrthankyou Feb 23, 2021
08f1b7c
Add HTTP retries when retrieving a project
mrthankyou Feb 24, 2021
429c9ba
Add check for protoprojects
mrthankyou Feb 24, 2021
ba0e6f4
Handle exceptions from LGTM
mrthankyou Mar 3, 2021
85b368e
Delete test.py
mrthankyou Mar 3, 2021
cbe5fa5
Clarify API call to LGTM
mrthankyou Mar 3, 2021
1e40f13
Refactor how we build SimpleProjects
mrthankyou Mar 3, 2021
aa14305
Rename method
mrthankyou Mar 3, 2021
bedc587
Remove useless code
mrthankyou Mar 3, 2021
e362ef8
Rename ProjectBuild#name and refactor code
mrthankyou Mar 3, 2021
fda2a9f
Add SimpleProject#project_type method
mrthankyou Mar 3, 2021
b51fced
Continue refactoring how we determine LGTM project types
mrthankyou Mar 3, 2021
0b182b9
Rename ProjectBuild#id to #key
mrthankyou Mar 3, 2021
c6db487
Update comment on refactoring
mrthankyou Mar 3, 2021
9eb9c4a
Refactor SimpleProject to store the project type
mrthankyou Mar 3, 2021
0ba1576
Simplify logic in determining project state
mrthankyou Mar 3, 2021
476faeb
Add comments
mrthankyou Mar 3, 2021
2c0d44b
Refactor logic with guard clauses
mrthankyou Mar 3, 2021
574d0f6
Add unfollow_all_followed_projects.py script
mrthankyou Mar 3, 2021
803ebd7
Convert ProjectBuild to a subclass of SimpleProject
mrthankyou Mar 3, 2021
69b6614
Refactor simple project build to not raise error
mrthankyou Mar 3, 2021
89a25d7
Add checks confirming LGTM project is valid
mrthankyou Mar 3, 2021
f44b959
Fix misc errors
mrthankyou Mar 3, 2021
9ba28cf
Reword comment
mrthankyou Mar 4, 2021
2e1d595
Remove unnecessary code
mrthankyou Mar 4, 2021
5ea2b9a
Remove comments
mrthankyou Mar 4, 2021
5 changes: 4 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -67,6 +67,9 @@ python3 follow_repos_by_search_term.py <LANGUAGE> <SEARCH_TERM> <CUSTOM_LIST_NAM

# Finds top repositories that have a minimum 500 stars and use the provided programming language.
python3 follow_top_repos_by_star_count.py <LANGUAGE> <CUSTOM_LIST_NAME>(optional)

# Unfollows all projects you're currently following that are not in a custom list.
python3 unfollow_all_followed_projects.py
```

## The Custom Projects Lists Feature
@@ -104,7 +107,7 @@ LGTM can't move projects that are being processed into custom lists. To resolve

> The <CACHED_FILE_NAME> can't be processed at this time because a project build is still in progress.

If you receive this error, wait a few hours and run the script again.
If you receive this error, wait a few hours and run the script again.

## Legal

2 changes: 1 addition & 1 deletion auto_sort_projects.py
@@ -71,7 +71,7 @@
project_list_name = gh_org_to_project_list_name[org]
project_list_id = site.get_or_create_project_list(project_list_name)
for project in org_to_projects[org]:
if project.is_protoproject:
if project.is_protoproject():
print('Unable to add project to project list since it is a protoproject. %s' % project)
continue
site.load_into_project_list(project_list_id, [project.key])
22 changes: 8 additions & 14 deletions follow_repos_by_search_term.py
@@ -1,5 +1,5 @@
from typing import List
from lgtm import LGTMSite
from lgtm import LGTMSite, LGTMDataFilters

import utils.cacher
import utils.github_dates
@@ -28,7 +28,8 @@ def find_and_save_projects_to_lgtm(language: str, search_term: str) -> List[str]
for date_range in utils.github_dates.generate_dates():
repos = github.search_repositories(query=f'stars:>5 language:{language} fork:false created:{date_range} {search_term}')

# TODO: This occasionally returns requests.exceptions.ConnectionError which is annoying as hell. It would be nice if we built in exception handling.
# TODO: This occasionally returns requests.exceptions.ConnectionError which is annoying as hell.
# It would be nice if we built in exception handling.
for repo in repos:
# Github has rate limiting in place hence why we add a sleep here. More info can be found here:
# https://docs.github.com/rest/overview/resources-in-the-rest-api#rate-limiting
@@ -39,18 +40,11 @@ def find_and_save_projects_to_lgtm(language: str, search_term: str) -> List[str]

saved_project = save_project_to_lgtm(site, repo.full_name)

# TODO: This process is duplicated elsewhere and should be under one location
# We only save realProjects to the cache since those are the only
# ones we can actually process.
if "realProject" in saved_project:
saved_project_name = saved_project['realProject'][0]['displayName']
saved_project_id = saved_project['realProject'][0]['key']
saved_project_data.append(f'{saved_project_name},{saved_project_id},realProject')

if "protoproject" in saved_project:
saved_project_name = saved_project['protoproject']['displayName']
saved_project_id = saved_project['protoproject']['key']
saved_project_data.append(f'{saved_project_name},{saved_project_id},protoproject')
simple_project = LGTMDataFilters.build_simple_project(saved_project)

if simple_project.is_valid_project:
saved_data = f'{simple_project.display_name},{simple_project.key},{simple_project.project_type}'
saved_project_data.append(saved_data)

return saved_project_data
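
The TODO in this file asks for exception handling around the GitHub search loop. A minimal sketch of one way to do that, assuming a hypothetical `call_with_retries` helper that is not part of this PR: it retries a callable on a configurable exception type with exponential backoff. In the script the exception would presumably be `requests.exceptions.ConnectionError`; the built-in `ConnectionError` is the default here so the sketch has no third-party dependency.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    exceptions: Tuple[Type[BaseException], ...] = (ConnectionError,),
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: call func, retrying on the given exceptions."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable only so the backoff can be exercised without real delays; a caller would normally leave it at the default.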

16 changes: 5 additions & 11 deletions follow_top_repos_by_star_count.py
@@ -1,5 +1,5 @@
from typing import List
from lgtm import LGTMSite
from lgtm import LGTMSite, LGTMDataFilters

import utils.github_dates
import utils.github_api
@@ -37,17 +37,11 @@ def find_and_save_projects_to_lgtm(language: str) -> List[str]:
continue

saved_project = save_project_to_lgtm(site, repo.full_name)
simple_project = LGTMDataFilters.build_simple_project(saved_project)

# TODO: This process is duplicated elsewhere and should be under one location
if "realProject" in saved_project:
saved_project_name = saved_project['realProject'][0]['displayName']
saved_project_id = saved_project['realProject'][0]['key']
saved_project_data.append(f'{saved_project_name},{saved_project_id},realProject')

if "protoproject" in saved_project:
saved_project_name = saved_project['protoproject']['displayName']
saved_project_id = saved_project['protoproject']['key']
saved_project_data.append(f'{saved_project_name},{saved_project_id},protoproject')
if simple_project.is_valid_project:
saved_data = f'{simple_project.display_name},{simple_project.key},{simple_project.project_type}'
saved_project_data.append(saved_data)

return saved_project_data

101 changes: 62 additions & 39 deletions lgtm.py
@@ -38,7 +38,7 @@ def _make_lgtm_get(self, url: str) -> dict:

def get_my_projects(self) -> List[dict]:
'''
Returns a user's projects.
Returns a user's followed projects that are not in a custom list.

Returns:
data (List[dict]): Response data from LGTM
@@ -134,7 +134,7 @@ def force_rebuild_all_proto_projects(self):
org_to_projects = LGTMDataFilters.org_to_ids(self.get_my_projects())
for org in org_to_projects:
for project in org_to_projects[org]:
if not project.is_protoproject:
if not project.is_protoproject():
continue
self.force_rebuild_project(project)

@@ -186,15 +186,15 @@ def unfollow_proto_repository_by_id(self, project_id: str):
self._make_lgtm_post(url, data)

def unfollow_repository(self, simple_project: 'SimpleProject'):
url = "https://lgtm.com/internal_api/v0.2/unfollowProject" if not simple_project.is_protoproject \
url = "https://lgtm.com/internal_api/v0.2/unfollowProject" if not simple_project.is_protoproject() \
else "https://lgtm.com/internal_api/v0.2/unfollowProtoproject"
data = simple_project.make_post_data()
self._make_lgtm_post(url, data)

def unfollow_repository_by_org(self, org: str, include_protoproject: bool = False):
projects_under_org = self.get_my_projects_under_org(org)
for project in projects_under_org:
if not include_protoproject and project.is_protoproject:
if not include_protoproject and project.is_protoproject():
print("Not unfollowing project since it is a protoproject. %s" % project)
continue
print('Unfollowing project %s' % project.display_name)
@@ -280,17 +280,24 @@ def create_from_file() -> 'LGTMSite':


@dataclass
# TODO: this SimpleProject is no longer 'simple'. Some refactoring here could be nice.
class SimpleProject:
display_name: str
key: str
is_protoproject: bool
project_type: str
is_valid_project: bool
org: str
state: str

def make_post_data(self):
data_dict_key = 'protoproject_key' if self.is_protoproject else 'project_key'
data_dict_key = 'protoproject_key' if self.is_protoproject() else 'project_key'
return {
data_dict_key: self.key
}

def is_protoproject(self):
# The values for project_type should be hardcoded in one central location
return self.project_type == "protoproject"
Comment on lines +298 to +300
Owner

Are there any project types other than protoprojects and non-protoprojects?

Contributor Author

@mrthankyou mrthankyou Mar 10, 2021

As far as I know, no. But I will take a deeper look and get back to you on this.
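
One way to act on the code comment about keeping the `project_type` values in one central location is an enum. A minimal sketch, assuming only the two types discussed in this thread; `ProjectType` and the trimmed-down `SimpleProjectSketch` are hypothetical names and not part of the PR.

```python
from dataclasses import dataclass
from enum import Enum


class ProjectType(str, Enum):
    """Hypothetical central definition of the project types seen in this PR."""
    PROTOPROJECT = "protoproject"
    REAL_PROJECT = "realProject"


@dataclass
class SimpleProjectSketch:
    """Trimmed-down stand-in for SimpleProject, for illustration only."""
    key: str
    project_type: ProjectType

    def is_protoproject(self) -> bool:
        return self.project_type is ProjectType.PROTOPROJECT

    def make_post_data(self) -> dict:
        data_dict_key = 'protoproject_key' if self.is_protoproject() else 'project_key'
        return {data_dict_key: self.key}
```

Because the enum mixes in `str`, members compare equal to the raw strings already stored in the cache files, and `ProjectType("realProject")` converts a cached string back into a member. If a third project type turns up, it would only need to be added in this one place.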


class LGTMDataFilters:

@@ -302,43 +309,17 @@ def org_to_ids(projects: List[Dict]) -> Dict[str, List[SimpleProject]]:
"""
org_to_ids = {}
for project in projects:
org: str
display_name: str
key: str
is_protoproject: bool
if 'protoproject' in project:
the_project = project['protoproject']
if 'https://github.com/' not in the_project['cloneUrl']:
# Not really concerned with BitBucket right now
continue
display_name = the_project['displayName']
org = display_name.split('/')[0]
key = the_project['key']
is_protoproject = True
elif 'realProject' in project:

the_project = project['realProject'][0]
if the_project['repoProvider'] != 'github_apps':
# Not really concerned with BitBucket right now
continue
org = str(the_project['slug']).split('/')[1]
display_name = the_project['displayName']
key = the_project['key']
is_protoproject = False
else:
raise KeyError('\'realProject\' nor \'protoproject\' in %s' % str(project))
simple_project = LGTMDataFilters.build_simple_project(project)
if not simple_project.is_valid_project:
continue

ids_list: List[SimpleProject]
if org in org_to_ids:
ids_list = org_to_ids[org]
if simple_project.org in org_to_ids:
ids_list = org_to_ids[simple_project.org]
else:
ids_list = []
org_to_ids[org] = ids_list
ids_list.append(SimpleProject(
display_name=display_name,
key=key,
is_protoproject=is_protoproject
))
org_to_ids[simple_project.org] = ids_list
ids_list.append(simple_project)

return org_to_ids
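
The get-or-create list handling in `org_to_ids` is a standard group-by. A minimal sketch of the same grouping with `collections.defaultdict`, offered as a possible simplification rather than a required change; `Project` and `group_by_org` are hypothetical stand-ins that carry only the fields the grouping needs.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Project:
    """Hypothetical stand-in for SimpleProject."""
    display_name: str
    org: str
    is_valid_project: bool = True


def group_by_org(projects: Iterable[Project]) -> Dict[str, List[Project]]:
    grouped: Dict[str, List[Project]] = defaultdict(list)
    for project in projects:
        if not project.is_valid_project:
            continue  # same guard clause as in the PR
        grouped[project.org].append(project)
    # Return a plain dict so a lookup of a missing org does not silently create a key.
    return dict(grouped)
```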

@@ -348,3 +329,45 @@ def extract_project_under_org(org: str, projects_sorted: Dict[str, List[SimplePr
print('org %s not found in projects list' % org)
return []
return projects_sorted[org]

    @staticmethod
    def build_simple_project(project: dict) -> SimpleProject:
        # Defaults so that an unrecognized payload still produces an
        # (invalid) SimpleProject instead of failing on unassigned names.
        org: str = ""
        display_name: str = ""
        key: str = ""
        project_type: str = ""
        is_valid_project: bool = True
        state: str = ""

        if 'protoproject' in project:
            the_project = project['protoproject']
            if 'https://github.com/' not in the_project['cloneUrl']:
                # Not really concerned with BitBucket right now
                is_valid_project = False
            display_name = the_project['displayName']
            state = the_project['state']
            org = display_name.split('/')[0]
            key = the_project['key']
            project_type = 'protoproject'
        elif 'realProject' in project:
            the_project = project['realProject'][0]
            if the_project['repoProvider'] != 'github_apps':
                # Not really concerned with BitBucket right now
                is_valid_project = False
            org = str(the_project['slug']).split('/')[1]
            display_name = the_project['displayName']
            key = the_project['key']
            project_type = "realProject"
        else:
            # We can't interpret the data we got back from LGTM, so we
            # mark the project as invalid.
            is_valid_project = False

        return SimpleProject(
            display_name=display_name,
            key=key,
            project_type=project_type,
            is_valid_project=is_valid_project,
            org=org,
            state=state
        )
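
To illustrate the payload shapes that `build_simple_project` distinguishes, here is a standalone sketch of the same classification. The sample payloads are made up and contain only the fields the function reads, so real LGTM responses will carry more; `classify` is a hypothetical helper, not part of the PR.

```python
from typing import Optional, Tuple


def classify(project: dict) -> Optional[Tuple[str, str, str]]:
    """Return (project_type, org, key), or None when the payload is unusable."""
    if 'protoproject' in project:
        the_project = project['protoproject']
        if 'https://github.com/' not in the_project['cloneUrl']:
            return None  # non-GitHub hosts are skipped, as in the PR
        org = the_project['displayName'].split('/')[0]
        return 'protoproject', org, the_project['key']
    if 'realProject' in project:
        the_project = project['realProject'][0]
        if the_project['repoProvider'] != 'github_apps':
            return None
        org = str(the_project['slug']).split('/')[1]
        return 'realProject', org, the_project['key']
    return None  # neither key present


# Made-up sample payloads (assumed shapes, for illustration only)
proto = {'protoproject': {'displayName': 'octo/demo', 'key': 'p1',
                          'cloneUrl': 'https://github.com/octo/demo'}}
real = {'realProject': [{'displayName': 'octo/demo', 'key': 'r1',
                         'slug': 'g/octo/demo', 'repoProvider': 'github_apps'}]}

for payload in (proto, real, {}):
    print(classify(payload))
```

Note the asymmetry the PR inherits from the older code: the org comes from `displayName` for protoprojects but from the second segment of `slug` for real projects.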
36 changes: 0 additions & 36 deletions test.py

This file was deleted.

11 changes: 11 additions & 0 deletions unfollow_all_followed_projects.py
@@ -0,0 +1,11 @@
from lgtm import LGTMSite, SimpleProject, LGTMDataFilters
import time

site = LGTMSite.create_from_file()

projects = site.get_my_projects()

for project in projects:
    simple_project = LGTMDataFilters.build_simple_project(project)
    if simple_project.is_valid_project:
        site.unfollow_repository(simple_project)