-
Notifications
You must be signed in to change notification settings - Fork 144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: auto-fetching (initial iteration) #196
base: master
Are you sure you want to change the base?
Conversation
For the error, do you have a specific testing procedure you would use to get the weird behaviour? |
@jdalrymple No, other than what I wrote on #187, it's basically just running the |
I think ive fixed this up, check out the latest PR! |
* fix: Adjusting file access that was clashing with uncontrolled promises * review: Adding code review changing and linting * fix(cli): rectify an async call Co-authored-by: Maximilian Berkmann <[email protected]>
Note (to anyone stumbling upon this PR): the ML model isn't well equipped, as the dataset used is still quite imbalanced (fortunately, not as much as before) so results may not be amazing from the get-go. |
@Berkmann18 @gr2m this one is the last stale milestone. Can you guide me on what are the missing details? so we can complete this one. Thanks! 🎉 |
Well, other than solidifying the ML model, it would be to test the command (see if #187 doesn't occur for you) and see if on repos (dummy or real ones), the contributors/type list is fetched properly and look correct. |
* fix: Adjusting file access that was clashing with uncontrolled promises * review: Adding code review changing and linting * fix(cli): rectify an async call Co-authored-by: Maximilian Berkmann <[email protected]>
and some visibly forgotten DidYouMean stuff from the person who added that feature
@@ -0,0 +1,28 @@ | |||
const {existsSync} = require('fs') | |||
const Learner = require('ac-learn') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm -1 on directly including a machine learning approach such as ac-learn
for a couple reasons:
- It's not 100% predictable, only high 90s at best. For dev tooling I personally prefer simpler, predictable patterns that can be debugged & changed more directly & transparently
- It requires more compute & storage per-repository (see the large
learner.json
file here)
It feels ... unusual? to me to start off by including the more heavyweight, targeted approach directly in the CLI (or at least its repository).
From my (admittedly not very informed) perspective, it would be nice to see a bit more trying it out before fully onboarding. Could we add it to the docs first as a third party approach? See existing repos try it out successfully?
By the way, sorry for not posting serious thoughts on this till now 😞. I'd been meaning to say something and it slipped my mind. But I'm very up for discussing more - and am not very high confidence that what I'm saying is at all reasonable!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see where you're coming from and initially started that feature with some non-ML pattern matching and stuff, which were... okay but not scalable enough to support enough of the contributions.
I guess the ML-driven categorisation could be put behind a command-line flag (like a feature flag) and when it's off it will only categorise the minimal set of contribution types (like review, code, bug and docs).
What:
Adds an auto-fetching mechanism accessible via the
fetch
command (which requires aPRIVATE_TOKEN
env. variable to be set) like #186 but with a cleaner PR that isn't too far from the master branch (which made the other PR hard to rebase).As noted in all-contributors/all-contributors#18, not all 27 categories can be picked from a GH repo alone (cf. #186 for a table listing what it was able to detect so far).
Why:
To resolve #117 and partly all-contributors/all-contributors#18 (TL;DR: auto adding contributors from a repo).
Re mntnr/name-your-contributors#45
How:
Using
name-your-contributors
andac-learn
.Checklist:
TODO:
[email protected]
(or later if I find a way to get it to return a list of PR titles and issue titles for eachprCreators
andissueCreators
respectively without having to get the full output and doing some of the summarisation here).labels
array forprCreators
(only keeping those withstate = 'MERGED
if looking at thefull
output withpullRequests
in it instead of the summarised one) objects, then feed that to thelearner
(similar or better than how it's done forissues
).title
field ofpullRequests
objects (or what comes out of 1.) and try to parse it to get info on the PR (it will be assumed to becode
by default).