Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconsidering auto-detecting Yarn files as auto-generated #4348

Closed
4 tasks done
arcanis opened this issue Dec 9, 2018 · 28 comments · Fixed by #4459 or #4445
Closed
4 tasks done

Reconsidering auto-detecting Yarn files as auto-generated #4348

arcanis opened this issue Dec 9, 2018 · 28 comments · Fixed by #4459 or #4445
Assignees

Comments

@arcanis
Copy link
Contributor

arcanis commented Dec 9, 2018

Preliminary Steps

Please confirm you have...

Problem Description

#3431 added support for generated file detection for Yarn. I'm not sure this is a good idea.

While generated, the lockfile contains important information regarding the dependencies used by a project; maybe it would make sense to display them so that they can be properly reviewed? For example, it would be important for a reviewer to understand that a PR adds N packages to the lockfile (which might be much higher than the number of lines added to the package.json file).

This might also have security implication. A malicious author could easily change some resolved versions inside the yarn.lock file while also upgrading a package.json dependency to a innocent release. Since Github won't show the lockfile content by default the reviewer might forget to check it, accept the apparently harmless upgrade, and let the malicious override make its way to the lockfile.

Note that the same concerns might apply to the package-lock.json file from npm (except that they are much harder to review due to their nested design).

@mohamedmansour
Copy link

Its very unpleasant to review lockfiles, many devs glance over it without paying attention. I agree with you that marking it as auto generate would bring in more vulnerable packages.

Is there a way to validate the lockfile was derived from the package.json for example, make sure the PR doesn't downgrade a package from the previous already committed file given the version within package.json. Github should create such server side hook by default instead of devs placing that action in their build definition.

@arcanis
Copy link
Contributor Author

arcanis commented Dec 9, 2018

I think it could be solved in interesting ways by offering a dedicated diff view for lockfiles that would pretty-print the changes (similar to how images are directly displayed rather than being shown as binary diff). That would require development time on the Github side, though, so in the meantime removing the file from the generated list would be a good start.

@lildude
Copy link
Member

lildude commented Dec 10, 2018

Your reasoning makes sense to me, however it's important to keep in mind that this has been in place for nearly two years so will potentially startle devs when they start to see these appearing in their diffs. After recent furores around simple things like colours, I'm apprehensive making such a noticeable change without larger community buy-in and support first.

Is there a general Yarn community discussion or consensus on this change that we can reference when making such a change?

If there isn't, are you able to raise such a discussion on the appropriate community forum that can be used as qualification for this?

@arcanis
Copy link
Contributor Author

arcanis commented Dec 10, 2018

Is there a general Yarn community discussion or consensus on this change that we can reference when making such a change?

Nope, but this one is as good as any! I maintain Yarn on a daily basis so on our side I don't think it'd be an issue. I can also tweet this discussion from the Yarn twitter account if you wish?

Paging some other contributors if you want to shim in: @rally25rs / @Gudahtt / @bestander

@pchaigno
Copy link
Contributor

Let's also ping the person who made the change we want to revert, in case they have something to add: @sunderls.

@rally25rs
Copy link

Trying to catch up on the discussion... It doesn't look like #3431 was ever merged, so I'm guessing this functionality isn't actually in-place?


My take on this is that the behvior yarn.lock should follow whatever other lockfiles do; npm's package-lock.json, Ruby bundler's Gemfile.lock, etc...

Personally, I want to know when my lockfile changes. Changes to these lock files represent actual functional changes to the final system (a version change to a dependency can have breaking changes).


The other thing to note here is that I assume the intent behind the "generated files" behaviors is that "this generated file change is the result of a different file changing" for example changing a coffeescript file will result in a corresponding "generated"/compiled js file change.

Though lockfiles are generated from package.json, a change to one does not always mean a change to the other. A yarn.lock file can change without package.json changing, which could indicate that someone changed the version of a dependency incorrectly.


To @mohamedmansour 's comment "Its very unpleasant to review lockfiles" that can be true, however to me it's more of a "am I expecting this file to have changes?" thing. If I don't expect that file to change but it shows up in my diff, then I know something is wrong. If the changes are so massive that its "unpleasant" to review, then you probably make some dependency updates and are expecting a big change to that file. That's the case where I don't actually inspect it line by line.

If you are talking about it from a package maintainer / PR reviewer perspective, then I do think it's even more important to show that change for the reasons @arcanis mentioned. If someone is manipulating your dependencies, they should have a good reason. Sure they can still hide a subtle version change amongst some other "noise" in the file changes, but that's no excuse for a maintainer to say " 🤷‍♂️ eh, too complex. Approve/merge. Whatever."

@pchaigno
Copy link
Contributor

It doesn't look like #3431 was ever merged, so I'm guessing this functionality isn't actually in-place?

It was, in a different pull request, #3432.

@arcanis
Copy link
Contributor Author

arcanis commented Jan 10, 2019

Ping? 🙂

@Alhadis
Copy link
Collaborator

Alhadis commented Jan 10, 2019

@arcanis's reasoning is sound. NPM is plagued with enough security concerns as it is, I don't think we should be adding one more potential attack vector for malicious users.

In most cases, yarn.lock files will be listed at the end of a PR's file diffs, making them pretty easy to overlook. Since dependencies don't get changed/added by contributors too often, it's unlikely that yarn.lock files will cause too much noise in the long run.

@sunderls
Copy link

+1 for removing yarn.lock from the list.
since large diff is already auto collapsed, its not gonna be a problem.

@edmorley
Copy link
Contributor

Another datapoint: when I review Renovate bot dependency update PRs on mobile, there is no way to view the contents of the automatically hidden yarn.lock changes without first switching to desktop view. This means the I have to use the frustrating workflow of:
open PR at files tab -> switch to desktop view -> click the expand link for yarn.lock and review the changes -> switch back to mobile view -> approve PR -> ...

Having yarn.lock changes be visible by default would save having to do this.

@pchaigno
Copy link
Contributor

Would one of you like to open the pull request to revert #3431?

@stale
Copy link

stale bot commented Feb 16, 2019

This issue has been automatically marked as stale because it has not had activity in a long time. If this issue is still relevant and should remain open, please reply with a short explanation (e.g. "I have checked the code and this issue is still relevant because ___."). Thank you for your contributions.

@stale stale bot added the Stale label Feb 16, 2019
@stale
Copy link

stale bot commented Mar 2, 2019

This issue has been automatically closed because it has not had activity in a long time. Please feel free to reopen it or create a new issue.

@stale stale bot closed this as completed Mar 2, 2019
@Alhadis Alhadis removed the Stale label Mar 3, 2019
@Alhadis Alhadis self-assigned this Mar 3, 2019
@Alhadis Alhadis reopened this Mar 3, 2019
Alhadis added a commit that referenced this issue Mar 3, 2019
This reverts commit b36ea7a for reasons previously discussed in #4348.

Closes #4348.
@Alhadis
Copy link
Collaborator

Alhadis commented Mar 3, 2019

Would one of you like to open the pull request to revert #3431?

Done. See #4445.

Alhadis added a commit that referenced this issue Mar 6, 2019
This reverts commit b36ea7a for reasons previously discussed in #4348.

Closes #4348.
@edmorley
Copy link
Contributor

edmorley commented Mar 12, 2019

@Alhadis hi! Thank you for opening #4445. However it seems to have only reverted the tests, rather than the logic that implements the behaviour?

@edmorley
Copy link
Contributor

I've opened a follow-up, PR #4459.

@Alhadis
Copy link
Collaborator

Alhadis commented Mar 13, 2019

Damn, sorry about that. 😕 I kind of assumed reverting a merge commit would undo everything else in a PR, but I guess not. Lesson learnt.

@billyjanitsch
Copy link

@Alhadis have you considered following suit for npm lockfiles (package-lock.json and npm-shrinkwrap.json), as suggested in the OP? It seems odd to me for Yarn's lockfiles to behave differently than npm's.

@Ghazgkull
Copy link

If the linguist maintainers prefer to treat package-lock.json files as auto-generated, is there a way for Github users to override that preference for individual repos?

@lildude
Copy link
Member

lildude commented Feb 12, 2020

@Ghazgkull
Copy link

Ghazgkull commented Feb 18, 2020

@lildude Thanks for taking the time to reply. As a poor soul who just uses Git and isn't really familiar with Linguist, I'm not sure what the inverse of that looks like. Unless I'm missing it, that doc doesn't seem to explain how to inverse a statement.

Could I bother you for a concrete example of what I would need to add to my .gitattributes file in order to not treat package-lock.json files as auto-generated?

Something like... ?

!package-lock.json linguist-generated
package-lock.json !linguist-generated
package-lock.json NOT linguist-generated
package-lock.json not-linguist-generated```

@pchaigno
Copy link
Contributor

@Ghazgkull See the third example for linguist-detectable: a dash negates your attribute; =false should also work:

package-lock.json -linguist-generated
package-lock.json linguist-generated=false

@silverwind
Copy link

I have to strongly disagree with marking yarn lockfiles as non-generated because that's what they are: generated by a machine, not meant for human consumption. They are certainly polluting diffs, e.g. on medium sized project a dependency update it's still a 1-2k line change.

The fact that yarn lockfile is the only lockfile that is not considered generated is rather disturbing, but I guess it may be time to migrate off of it anyways 😉.

@arcanis
Copy link
Contributor Author

arcanis commented Dec 11, 2020

not meant for human consumption

You're meant to review them - so insofar as you are a human, they are in part meant for human consumption.

@silverwind
Copy link

silverwind commented Dec 11, 2020

In an ideal world, yes. But realistically no one is really going to review a 2k lockfile change in a big react project for example.

If the author is trusted, I just look at package.json and skip over the lockfile. If not, I pull the branch and regenerate the lockfile to see if it matches the provided one.

@arcanis
Copy link
Contributor Author

arcanis commented Dec 11, 2020

Yarn's changesets have very rarely that many changed lines - since we don't encode the recursive dependency tree via a nested structure, adding a package is often just 5-10 lines. Having more than that is actually a useful signal, since it shows that perhaps you're installing more packages than you expected.

That being said we also have more improvements in the pipeline to improve the general security even further, and having a reviewable lockfile is important to this goal.

@silverwind
Copy link

Yeah, yarn diffs are usually not that bad. For example order of dependencies in the file seems stable (as opposed to npm which sometimes reorders for no apparent reason) and YAML also reduces line noise.

But still I think it should be collapsed by default in diffs because while I generally fly over lockfiles while reviewing, I prefer to have such files openable on-demand.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet