Handling old issues #13890
Also, the wiki should be mentioned by the issue format bot when contributors make new issues ("Be advised that new domains requested here will be transferred into the GitHub wiki at [URL here] and this issue will be closed"). Contributors should also be gently discouraged from creating PRs directly for domains that are still issues, instead of in the wiki, so as to avoid circumventing the issue-to-wiki process. |
I don't think the wiki is a good idea. It vastly increases the workload of maintainers. It also makes it impossible to add any useful information (for example, that only some paths can be rewritten). Instead, each domain should be submitted in a separate issue and contributors should be encouraged to use the GitHub auto-closing functionality. To prevent issues from cluttering the issue list, we should improve the bot. All ruleset issues should be tagged as such. I also noticed that most issue reporters don't follow the template or the bot instructions, which means that it should be revised (but that's another topic). |
I probably shouldn't have added the discussion about domain tracking and the wiki. It's been discussed at length before (#6307, #6322). For example, in #6322 (comment) over six months ago I estimated without proof that manually tracking issues is "15 minutes a month of manual work", which also applies to @Bisaloo's comment about "vastly increases the workload of maintainers". For whatever reason there seems to be no interest other than from me in using the wiki for this, and I don't want to have that discussion again. For this discussion I'm more interested in ideas for resolving other types of issues. |
I think moving the list to the wiki is a great improvement. To avoid additional maintenance overhead a bot can be used to update the wiki (the wiki is just another git repo, so this should work).
Currently every GitHub user should have access to the wiki. Let's see if it works.
Again, I hope a bot can do this. New contributors may not be aware of the list in the wiki. Edit: |
A big advantage to me of moving the domain list to the wiki is that it would be restricted-access to collaborators and above and maybe a few others. I want to avoid Wikipedia-style public edit wars over the list, or to have to watch it to make sure people aren't making malicious or simply bad edits. As you know we have way too many issues/PRs and too few people reviewing them, so someone is not going to be happy. Issues are a lot worse than PRs to resolve. They are often poorly written, or require a login or some complicated interaction with a site, or that you use some other extension, or whatever. You know what I mean and there are hundreds of examples open right now. And, to fix something that is actually broken will require a new PR. Some sort of automated handling of old issues is essential. At a minimum we should have a bot go through and ask the issue contributor to confirm they are still active and that the problem still exists. If they don't respond soon, we close it. It's not ideal but that's probably the nicest way to start. It needs to be a bot and not a person because otherwise the person who pings them can be put on the spot to help them out. I'm finding this as I ping people on old PRs (#13859). We should also only ping small groups of people at a time in case there is some overwhelming response. |
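The "ping old issues in small groups" idea above could be sketched roughly as follows. This is only an illustration in Python, not the project's bot: the issue dictionaries, the `select_stale_issues` and `batches` helpers, the one-year threshold, and the batch size are all assumptions made for the example, and no GitHub API calls are made.

```python
from datetime import datetime, timedelta


def select_stale_issues(issues, now, max_age_days=365):
    """Return the numbers of issues last updated more than max_age_days ago.

    `issues` is a hypothetical list of dicts with "number" and "updated_at"
    keys; a real bot would fetch this data from the issue tracker.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [issue["number"] for issue in issues if issue["updated_at"] < cutoff]


def batches(items, size):
    """Split items into consecutive groups of at most `size` elements."""
    return [items[k:k + size] for k in range(0, len(items), size)]


# Made-up sample data, not real issues from this repository.
sample_issues = [
    {"number": 1, "updated_at": datetime(2016, 1, 5)},
    {"number": 2, "updated_at": datetime(2017, 11, 20)},
    {"number": 3, "updated_at": datetime(2015, 6, 1)},
    {"number": 4, "updated_at": datetime(2016, 9, 9)},
]

stale = select_stale_issues(sample_issues, now=datetime(2018, 1, 1))
# Ping only a couple of reporters at a time to avoid an overwhelming response.
for group in batches(stale, size=2):
    print("Would ask reporters to confirm issues:", group)
```

Keeping the batching separate from the selection makes it easy to tune how many reporters are pinged per round without touching the staleness rule.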
@jeremyn I think #3069 belongs in the wiki, it makes the most sense in dealing with clutter. As to @Bisaloo's concerns, if there are idiosyncrasies with a proposed ruleset, a table in the wiki can link to that issue for additional information. As long as the row is still present in the list of proposed rulesets, we can additionally close that issue as well. This would also be useful for de-duplication of proposed rulesets - a I do think that the complexity of such a bot increases - we're not just talking about parsing an issue and adding tags, we're now tasking the bot with editing a live document. That's bound to be more difficult than what we currently have in place. As a temporary measure, I propose we:
This will unify the issue types from the template with the issue labels, and should make filtering easier, since I'm happy to get started on this work, pending feedback. |
Forgot to explain the asterisk:
|
I don't want my list to be moved. I started the list with sites I came across myself, so I could keep them all in one place. Slowly, others added other sites to the issue, so I added them to my list as well if they fit my criteria. The issues section is the perfect place for it; it consists of things that should be added to HTTPSE. A wiki is for information, reading material, etc. |
If we're going to use the wiki, I think we should do it manually for a while before bringing the bot into it. There are only about 20 open issues with the I disagree with having all these Who uses the labels? People who feel like working on issues. I guess most people who feel like tackling some problems want to work on either code problems, or ruleset problems. Once they've decided which of these two categories they're interested in, they'll go through the relevant list of open issues themselves and figure out what to do. I doubt they will care much which are |
Project issues aren't the best place to keep private lists. List issues are fine if they are limited in scope, for example issue #9842 "Audit default_off rulesets" is appropriate. But an open-ended issue about "Sites that I'm interested in" is not great. There's also a need for an official list of "Sites that people want covered", and your list will compete with that, which is also not great. I recognize that some of that is my fault, since I've advocated using #3069 as a dumping ground for these domains for a while, since it's the closest thing we've had to an official list. |
It's not a private list, it's public, available to anyone who wants to make new rulesets. Maybe you can pin it to the top of the issues list for more visibility. There won't be any competition between my list and any official list as long as no official list is made. Or go ahead and copy my list if you really want, but I would like to retain control over the list I've compiled over the years. |
@terrorist96 Either the list in #3069 is purely private to you, in which case it shouldn't be a GitHub issue, or the list is important to the project, in which case it should be under the control of people officially attached to the project, for the reasons I gave above in #13890 (comment). |
Who says it has to be either or? Having it private to me is of no use. What am I gonna do with a private list if it's not available for others to see and make rules from it? Having it under the control of the maintainers of this repo is akin to taking my work. I said I'm fine if you wanna copy it; it's not a copyrighted list. Feel free to "fork" it. But what you're proposing is taking away my ownership of it, which I'm not ok with. By your logic "if it's important to the project, it should be under EFF's official control", then all issues posted in this repo should be under EFF's control. To address your concerns: |
I'm using the label taxonomy that we already have in place. Are you proposing that we change this taxonomy to merge |
There may be some confusion here. As far as domain-coverage issues go, we're really talking about two things. One thing we're talking about is reducing the number of issues by funneling some of them toward a centralized list of domains that people want covered. This is a win all around, because it reduces the backlog and makes less trivial issues easier to find, and it creates a place where volunteers can go to find domains if they feel like making rulesets. This list should be controlled by the EFF for the reasons I gave earlier. (Certainly if we were starting this project from scratch, the list would start under EFF control, and would not simply be assigned to one contributor for them to maintain exclusively.) The closest thing we have to such a list is issue #3069, so to kickstart a new official list, we can copy over the contents of that issue. The other thing we're talking about, which is less important to the project but perhaps important to you, is what to do with #3069. It's not urgent to close one issue out of five hundred. No one wants to directly take over that issue, like by making hostile edits to it or anything. But, what would be the point of keeping it going if an official list exists elsewhere, in a place that is more visible to contributors? How would having two lists help the project? |
Maybe? I'm not proposing we change these labels right now, just that if we decide to rework other processes related to |
Unfortunately, GitHub wikis are relatively feature-sparse. There does not seem to be a way to lock specific pages to collaborators & maintainers only. There is a setting on the project-wide level:
@terrorist96 I tend to think that the canonical place for a list of sites that should have HTTPS enforced should be the wiki, but no one is stopping you from maintaining your own list. As long as we're able to derive our entries from each other, in the best spirit of open-source collaboration 😄. I think @jeremyn's distinction makes sense, and if you feel strongly about cultivating your own list in a single issue, this doesn't bother me. |
@jeremyn I tend to think if issues are mislabelled Note: I'm distinguishing issuer and contributor here as:
This is not to imply issuers do not contribute to the project, but just to draw a semantic distinction between two distinct groups. |
I guess my concern is that stuffing it in a wiki will reduce its visibility. Who actually goes into the wiki looking for issues that need to be resolved? How about you copy my list into the wiki, and we retain my issue and I'll link to the wiki at the top? People may still add sites to 3069. |
We don't use the wiki for anything else right now anyway, so locking the whole thing to members and collaborators is the same as locking just one page. Anyway I can't think of anything that should go in the wiki that I want random people editing, for the reasons I gave at the top of #13890 (comment), so locking it all works for me. For the label distinctions, the question as usual is who benefits and is it worth the cost? The cost is paid by contributors ("issuers") trying to figure out how to write their issue, maintainers fixing label problems and educating contributors, and bot programmers coding these various distinctions. The only benefit is to the hypothetical PR contributor who cares about creating rulesets vs updating existing ones, which I guess can happen but seems unlikely. I assume most volunteers thinking about adding a domain care about stuff like whether they use the site, its Alexa ranking, its language, its cultural relevance. If I were thinking about creating vs updating, what I'd want to know is: for creating, is this a small site or is it going to end up being really complicated? For updating, is it an easy edit or will I need to work with something like that 2000+ line One label I'd like to see is a label for PRs that makes it clear that that PR resolves an issue ( The wiki is visible in a tab at the top of every HTTPS Everywhere GitHub page, plus if we had an official list we would probably mention it in I'm okay with a semi-private list for you and your friends/colleagues/etc, but if we have two lists and people get confused about where to report a domain (example: #3069 (comment)) then that's a problem. It's also a problem if maintainers are expected/obligated to manually sync the official list from your list. |
Why don't you give me write access to the wiki and I can keep the list updated there? |
@terrorist96 To restate, my intent wasn't just to move the list into the wiki, but also to open ownership of the list to the EFF/collaborator team to maintain as a group, possibly with a bot involved, similar to how issues and PRs are handled as a group and with a bot involved. The proposed process is:
So even if you were given access to the wiki, it would not be your personal list. You would just be one of several people, maybe with a bot, who are transferring domains from issues to the wiki. Is that what you're looking to do? Also from @Hainish 's comment at #13890 (comment) it sounds like GitHub doesn't have a way to allow individual access to the wiki: it's either restricted to collaborators and above, or open to everyone. But this doesn't really matter because with the proposed process, you also need to be able to close issues, and you would need to be a collaborator to do that. |
So if I want to add a new site to the list, would I need to create a new issue and include |
@terrorist96 Yes, that's the behavior that would be nice for a bot to have, but in practice I don't think we get enough of these issues to justify the work to develop and maintain this bot functionality compared to just doing the transfer manually. |
Should I start a wiki page with domains from the Tor bug tracker (after checking which ones haven't been added since the issue was opened)? |
I am talking about individual bugs. |
I think for now domains should be listed as comments in issue #3069 . That's where I put them. If and when we want to move to the wiki, we can transfer stuff from the issue. |
I'm going to wait until we find a better solution then. I'm not a big fan of #3069. |
Me neither, but at least it's a place where we can put domains so we can close the trivial "Add example.com" issues, and to close I'm not particularly interested in cleaning up the Tor tracker, nor do I expect there are any extremely valuable domains reported there that are not also reported here, so for what it's worth I'm fine with leaving the Tor issues alone too. |
This isn't true. The page listing downstream dependencies (maintained mostly by @Bisaloo) was created on December 1st. Since then, we've had @cschanaj also create a new wiki page. It seems to me that a wiki isn't the optimal solution for this. To enumerate the list of desired features for a page containing all requested rulesets, we'd have:
We can consider this feature set an MVP for the next iteration of the issues bot. It seems to me the easiest place to put the list is in a file at the top level of the project, linked to from the @strugee would you have any cycles opening up for amending the bot in #11615? |
Having the list in a repository seems reasonable. Maybe it should be in a new repository, not this one, to keep the number of commits down, and to limit the damage that an (unintentionally) misbehaving bot with commit access could do. |
I'm closing this issue with these observations: There is some interest in revisiting/improving how we process simple requests to add coverage for some domains, but not any clear consensus. Perhaps interested people can open specific issues or pull requests along those lines, for updating the bot or wiki or similar. There has been little discussion here about solving the substantive problems in the current 500+ issue backlog. |
I'd just like to add that the issue backlog is not only a burden from a technical point of view (it is hard to find the actual/current issues among the noise). It also has a real impact on HTTPS Everywhere's image among potential users. Of course, we are not trying to sell anything and technical relevance should always prevail over marketing. But the end goal of HTTPS Everywhere and (I assume) most of its contributors is to make the web a safer place. If we can increase HTTPS adoption, we should strive for it. Increasing our userbase is one way to do so. I've seen people linking to our bug tracker claiming this was proof that HTTPS Everywhere broke way too many sites to be practical. While I don't deny we sometimes make mistakes, the huge outdated issue backlog blows this problem way out of proportion. |
Some ideas to fix this situation:
|
One possible community-oriented thing we can do to relieve the backlog of issues is to convene an HTTPS Everywhere hack-a-thon, reaching out to the technical community around EFF and gathering those interested in participating in a little house-cleaning. I'm envisioning volunteers commenting on issues, soliciting feedback from those who originally opened them if there are still open questions, and tagging ruleset maintainers where appropriate. This would drive engagement with the project and perhaps promote longer-term, useful participation. |
For the hack-a-thon idea, if the EFF is going to arrange a burst of effort over a couple days, it should also organize its own maintainers for the event too. I don't want to discover one day that I've been pinged on a couple hundred issues. |
Hey @Hainish, sorry I missed your comment above. In principle I am happy to work more on the bot, though I've been without a working laptop recently so it'd have to wait until that was repaired. I'm hoping to get it fixed soon. I just looked and apparently there are already labels matching the bot's
As then it would be more obvious that most of the issues are about new rulesets (since they'd be labeled as such).
I wonder if we should make it so the bot allows people to do this on their own. I.e. you could say "@https-everywhere-bot type: new ruleset, domain: eff.org" and the bot would edit the issue description to add the metadata, then label the issue appropriately. That way maintainers wouldn't get a million pings. People could deliberately mess with the bot to add bad metadata, but that seems kinda unlikely, especially because a human would notice right away. Plus we could turn it off again right after the hackathon or whatever. |
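For what it's worth, parsing a command like the one quoted above might look something like this. It is a minimal Python sketch, not the actual bot code: the `parse_bot_command` function, the `COMMAND_RE` pattern, and the set of accepted keys are assumptions for illustration, and the step of editing the issue description or applying labels is left out entirely.

```python
import re

# Matches the bot mention and captures the rest of that line.
COMMAND_RE = re.compile(r"@https-everywhere-bot\s+(.*)", re.IGNORECASE)

# Hypothetical whitelist of metadata keys the bot would accept.
ALLOWED_KEYS = {"type", "domain"}


def parse_bot_command(comment):
    """Extract "key: value" metadata from a comment that mentions the bot.

    Returns a dict of the recognised fields, or None when the comment
    contains no command or no recognised field.
    """
    match = COMMAND_RE.search(comment)
    if match is None:
        return None
    metadata = {}
    for field in match.group(1).split(","):
        key, sep, value = field.partition(":")
        key, value = key.strip().lower(), value.strip()
        # Ignore malformed fields and keys outside the whitelist.
        if sep and key in ALLOWED_KEYS and value:
            metadata[key] = value
    return metadata or None


print(parse_bot_command("@https-everywhere-bot type: new ruleset, domain: eff.org"))
```

Restricting the accepted keys is one cheap way to limit the damage from people deliberately feeding the bot bad metadata, since unknown fields are simply dropped.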
@strugee, if you want, you could open a new issue to gather suggestions for the bot. I have some ideas but I am not sure if they are possible with the GitHub API. @Hainish, a hackathon sounds exciting but I'd like to see the two following points solved beforehand:
|
I am not sure where to post this, but here is another argument for cleaning up our tracker and standardizing the issue format: DuckDuckGo Privacy Essentials apparently uses it to determine which rulesets are broken: |
On a related note: is it acceptable to ping ruleset maintainers or @Hainish if we find an issue that has been fixed server side? |
@Bisaloo I'm not sure what you are asking. |
Can we ping you to close an issue after we checked it is fixed? |
@Bisaloo You mean issues that aren't fixed by a PR, that is, you want to ping us for issues that seem to have "fixed themselves" or that you can't reproduce? The ideal case for any issue is for the issue reporter to close the issue themselves. So if an issue seems fixed, you can add a note to the issue saying you think it's fixed and then try to convince the reporter to close it. If that doesn't go anywhere then I personally am not interested in getting a ping for it. Whoever looks at the issue notes later on will see what you wrote, and what the reporter wrote, and can decide what to do with it then, no ping needed. |
@Bisaloo I'm happy to be pinged with issues that I've resolved, but haven't yet been closed (as you've been doing). |
Type: other
Ping @brainwane @gloomy-ghost @Hainish @J0WI @wonderchook but this discussion is open to anyone.
I want to have a discussion on what we should do about the large number of old (year+) issues.
Old issues are trickier to deal with than old PRs:
So, what should we do?
One thing that would help is if EFF staff could go through their own open issues, especially ancient issues, and close or update them as appropriate. I don't want to put too much attention on these EFF issues -- there aren't a ton of them -- but there are more than a few.