Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancement: replace fast-glob with tinyglobby #10533

Open
4 tasks done
benmccann opened this issue Dec 21, 2024 · 20 comments
Open
4 tasks done

Enhancement: replace fast-glob with tinyglobby #10533

benmccann opened this issue Dec 21, 2024 · 20 comments
Labels
blocked by another issue Issues which are not ready because another issue needs to be resolved first enhancement New feature or request

Comments

@benmccann
Copy link
Contributor

Before You File a Proposal Please Confirm You Have Done The Following...

Relevant Package

typescript-estree

My proposal is suitable for this project

  • I believe my proposal would be useful to the broader TypeScript community (meaning it is not a niche proposal).

Description

fast-glob has 17 dependencies with 12 people having publish access to those libraries and takes 505kb after installation
https://npmgraph.js.org/?q=fast-glob
https://pkg-size.dev/fast-glob

tinyglobby has only 2 dependencies with 6 people having publish access to those libraries and takes 155kb after installation
https://npmgraph.js.org/?q=tinyglobby
https://pkg-size.dev/tinyglobby

I would be happy to send a PR for this

Additional Info

This was discussed a bit in #9453, but I thought it would be a good time to revisit as tinyglobby usage has grown 30x since that time having been adopted by projects like vite, nx, and nuxt. tinyglobby is likely to dedupe in user projects now. E.g. this repo already pulls in tinyglobby multiple times (via nx and cspell). And other projects in the eslint ecosystem like eslint-import-resolver-typescript (downloaded 10m times per week) depend on tinyglobby already. While fast-glob is still downloaded more frequently, 80% of those downloads come from this project, so whatever library this project uses will end up being the most used.

@benmccann benmccann added enhancement New feature or request triage Waiting for team members to take a look labels Dec 21, 2024
@james-pre
Copy link

james-pre commented Dec 23, 2024

@benmccann

I'm not a maintainer, but I think replacing fast-glob would be great. Going off of the replacement of is-number in https://github.com/micromatch/to-regex-range (which hasn't been released after 5 months), I think this could save a few Terabytes of weekly traffic on npm, which is a huge win.

Just a thought, it may be worth-while to use fs.globSync rather than another package- this has some pretty big benefits considering it completely eliminates the dependency. It would require a change to the Node.js version needed though (>= 22.0.0).

I'd also be willing to work on a PR for this, please let me know if you're interested.

@bradzacher
Copy link
Member

it may be worth-while to use fs.globSync rather than another package- this has some pretty big benefits considering it completely eliminates the dependency. It would require a change to the Node.js version needed though (>= 22.0.0).

We cannot do that until our minimum version is v22 - which won't happen until v20 is out of LTS

@james-pre
Copy link

Thanks for the quick reply.

It looks like Node v20 is out of active LTS (as of Oct. 2024) and will be in maintenance until April 2026.

It was trivial for me to make the change to using the built-in globSync, so I went ahead and made the change locally. We can always put this on ice: james-pre@d555589.

I'll look into switching to tinyglobby tomorrow if I have time.

@JoshuaKGoldberg
Copy link
Member

I'm mildly 👍 on this. It's a fine idea. I see no downside now that the library is widely depended-upon.

@ronami
Copy link
Member

ronami commented Dec 23, 2024

I like this idea, having less dependencies is always nice.

However, I think it may take some time for tinyglobby to be as stable as fast-glob, and using it now may potentially introduce bugs (as in 1, 2, 3).

From what I can see, tinyglobby is getting more popular, and those bugs are getting fixed, so I think it depends how stable it currently is.

@JoshuaKGoldberg
Copy link
Member

Oof, yeah, good call @ronami. I think we should wait a few months for this to all settle down.

@JoshuaKGoldberg JoshuaKGoldberg added blocked by another issue Issues which are not ready because another issue needs to be resolved first and removed triage Waiting for team members to take a look labels Dec 23, 2024
@james-pre
Copy link

It looks like the problems referenced by @ronami have either been resolved or don't affect this project, since all of the tests in the draft PR passed.

@JoshuaKGoldberg
Copy link
Member

True, but the fact that there were problems in the first place shakes confidence in the project.

This isn't a pressing issue IMO. It's nice to reduce dependency sizes, but given how many projects are still using fast-glob -including those linked ones- I don't think we're the only blocker to a bunch of npm savings.

@benmccann
Copy link
Contributor Author

I think waiting a few months is fair as tinyglobby is still a somewhat new library.

I will say that the tinyglobby author has fixed any issues very quickly as they've been reported and has added a test each time to prevent any regressions. E.g. in that link to the cspell repo, the issue was promptly fixed and then they upgraded to the latest rather than rolling back and vitest is also in process of switching back to tinyglobby again. I'm also talking to the tinyglobby author about seeing if we can automatically run the test suites of all the major projects that depend on it to instill more confidence as well (this is something we've done for Svelte and Vite).

Btw, globby / fast-glob are not without issues. I've hit multiple issues with it when switching where tinyglobby is returning something different and it's very clearly a bug in globby / fast-glob, but people have just gotten used to working around it. New bugs tend to be more noticeable than old bugs though.

Anyway, I think it makes sense for a project as large as this one to be more conservative and wait for any issues to shake out with other consumers first. Let's check back in a few months from now and see if everything has been quiet with others adopting the library.

@mrmlnc
Copy link
Contributor

mrmlnc commented Jan 5, 2025

Btw, globby / fast-glob are not without issues. I've hit multiple issues with it when switching where tinyglobby is returning something different and it's very clearly a bug in globby / fast-glob, but people have just gotten used to working around it. New bugs tend to be more noticeable than old bugs though.

To me, this looks like too aggressive marketing. In many issues and PR's mentioning fast-glob, I see your comment. New glob library come along every year, that has only a subset of the functionality of the old one and claims to be better. When these libraries start to be used, people start asking to add the same options that are available in fast-glob. In your case, it will be brace expansion, stream API, error handling, smart complex pattern processing when reading deeply, …. Then the maintainers face a choice – to write a solution themselves (without dependencies) or to use already existing and tested solutions (like braces, merge2, …).

I do not share the opinion that the number of dependencies (under reasonable assumptions) is a selling point. Stability, functionality and maintainability – yes. For me. Some rational balance is needed here. People may think differently.

…it's very clearly a bug in globby / fast-glob

Please report any issues you find with fast-glob. This is the main purpose of this comment. Yes, I don't fix them as quickly as people would like, but the critical problems have long been resolved.

No negativity. Just my thoughts. Having competitors is a positive thing. That's how I started this project [fast-glob] in 2017. I was not happy with the speed, complexity and maintainability of the existing solutions.

@bradzacher
Copy link
Member

My 2c:
It was not even 6 months ago that we switched from globby to fast-glob.
We did this for two reasons:

  • globby is built on top of fast-glob, but we weren't using the extra features globby added.
  • globby (as with all sindresorhus packages) switched to ESM - meaning we were unable to upgrade and get new patches.

These were good reasons to migrate - with a quick, 1:1 migration we cut one layer of indirection from our dependencies and we unlocked patches for a dependency.

Note that install size simply isn't a concern for us - we're a local, dev-only tool. You realistically install our dependencies once and never again as we very rarely actively bump our required minimum ranges. We're also talking about a 353,122b difference. That's nothing - it won't impact anyone's download/install times.
For reference when you install the typescript-eslint package the total installed size is 40,367,158b. So by cutting 353,122b we've reduced the total weight by just 0.8% -- and that's IF AND ONLY IF no other package in the repo's closure depends on fast-glob (which is unlikely - eg eslint-import-resolver-typescript and @next/eslint-plugin-next both depend on it and are widely used).

In order for me personally to back this -- what I'd want to see is some benefit beyond just "it's smaller". Is it faster? Does it provide some killer API that makes our lives better? Is it safer and more secure? etc.
A minimal shift in install size is not good enough reason for us to action this, IMO.

@benmccann
Copy link
Contributor Author

In many issues and PR's mentioning fast-glob, I see your comment.

I think in general that many projects have more dependencies than they need because they haven't paid much attention to it or prioritized it. In my own projects, I've often been able to reduce the number of dependencies by 80-90% just by spending some time on it (example, example). I'm not affiliated with tinyglobby, but I have sent PRs to many projects reducing their dependency count and switching them to tinyglobby as I think there's benefit to it. I think fast-glob is great and have nothing negative to say about it. I think most projects don't need everything it offers though and can get by with less.

When these libraries start to be used, people start asking to add the same options that are available in fast-glob. In your case, it will be brace expansion, stream API, error handling, smart complex pattern processing when reading deeply, …. Then the maintainers face a choice – to write a solution themselves (without dependencies) or to use already existing and tested solutions (like braces, merge2, …).

Yes, I agree that fast-glob is far more feature filled than tinyglobby. Most projects don't need most of the features that fast-glob has. tinyglobby has committed to staying small and not adding many of the features in fast-glob. Some may be added to tinyglobby, but most won't.

Note that install size simply isn't a concern for us - we're a local, dev-only tool.
Is it safer and more secure?

Less code and fewer dependencies means less attack surface.

Just a few months ago there was a malware that affected a package that eslint developers install. Reducing the number of dependencies by a good percentage reduces your risk of facing these issues by a fair amount.

Also, braces which gets pulled in by fast-glob had a recent security issue. Whether or not this posed real security risk to users, they're notified and then have to spend the time to figure out whether there's some risk they face, upgrade their project to a newer version, etc. If you drop a large percentage of your dependencies, you also reduce the time you spend dealing with these sorts of issues.

IF AND ONLY IF no other package in the repo's closure depends on fast-glob (which is unlikely - eg eslint-import-resolver-typescript and @next/eslint-plugin-next both depend on it and are widely used).

eslint-import-resolver-typescript, the more widely used of these packages, uses tinyglobby (source)

@bradzacher
Copy link
Member

Just a few months ago there was a malware that affected a package that eslint developers install.

The article you linked is from 2021 - not "just a few months ago".
And it's about UAParser.js which most eslint users wouldn't install AFAIK (what popular eslint plugin is doing UA parsing??).

Also, braces which gets pulled in by fast-glob hada recent security issue.

Reading that issue:

An attacker can cause the application to allocate excessive memory and potentially crash by sending imbalanced braces as input

Which is a non-issue for eslint usecases. Because the only user inputs are from a user's config. So worst case a user could craft a glob that crashes their eslint run...
It's the same as any REDOS "vuln" - a complete non-issue for eslint usecases.

If you drop a large percentage of your dependencies, you also reduce the time you spend dealing with these sorts of issues.

Does it? What makes tinyglobby immune to such issues? Does it do advanced fuzz testing or something to look for and catch such vulnerabilities before a release? I would say that it's no less vulnerable than any other package. One could argue it's more vulnerable because the maintainers are responsible for a wider breadth of code due to not using focused dependencies like braces and so they are more likely to accidentally introduce vulns.

eslint-import-resolver-typescript, the more widely used of these packages, uses tinyglobby

That version is yet to be published.
For the foreseeable future people are likely going to be on a version that uses fast-glob.

As an example - we currently have over 20 paths that fast-glob is transitively depended on in this repo. So if we remove our direct dep on it in /typescript-estree then our total install size will not change.

@benmccann
Copy link
Contributor Author

The article you linked is from 2021 - not "just a few months ago".

Ah, my mistake on the date. Google shows the date as 2024 in the search results and the article says "The incident was detected on Friday, October 22" without mentioning the year, so I didn't notice it was a few years ago. As for whether eslint repo installs it, it appears that it doesn't currently, but did at that time. I happened to have a copy of eslint on my disk, which is what I checked when I saw the article, and it turns out that my fork is a few years old. ua-parser-js was installed in the eslint repo via karma at that time. My point was that even if a package is installed as a dev dependency, the number of dependencies it has still matters. And some eslint developers and contributors themselves very possibly could have been hit by such a malicious package given that eslint was installing this one. Note that I was referring to developers and contributors of eslint and not end users of eslint. My point isn't that eslint users have been hit by such an issue, but that these issues are common (rspack was hit just 2 weeks ago) and even may have affected eslint contributors.

Screenshot from 2025-01-05 15-05-02

It's the same as any REDOS "vuln" - a complete non-issue for eslint usecases

My point wasn't that it posed a security risk, but that users need to take the time to evaluate whether it does and handle it accordingly. And even if they do research it and come to the conclusion that there's no risk, many are still going to do an upgrade to make dependabot or a vulerability scanner quiet down.

That version is yet to be published.

Sure, but if this project were to adopt tinyglobby it will have been published and users are likely to have updated it before they upgrade this library or to do so at the same time.

@james-pre
Copy link

james-pre commented Jan 5, 2025

I think its important to point out that fast-glob's dependency tree includes is-number:

fast-glob

(fast-glob -> micromatch -> braces -> fill-range -> to-regex-range -> is-number).

is-number was removed from to-regex-range back in July to save 440GB of weekly NPM traffic (to-regex-range#17), but Jon Schlinkert (the maintainer with NPM publish permissions) is refusing to release this because of his personal opinions (to-regex-range#19). This means to-regex-range will no longer receive updates, so any security issues would go unpatched.

From some quick research, it looks like Jon is actively trying to prevent projects from moving away from his packages. Additionally, Jon is a maintainer of over 1,500 packages (npm), most of which haven't been updated in over a decade. Even if he wanted to maintain these packages, I seriously doubt any programmer would have the time to. 8/17 (47%) of fast-glob's dependencies are maintained by Jon.

I see numerous problems similar to the ones leading up to the xzutils incident. A package (to-regex-range) is very widely depended on, is not being released on time (or at all), and has frustration around the maintainer. While there isn't any malicious code yet, there is the real possibility of a supply chain attack.

But other popular projects, like X still use fast-glob

This can go in circles... project A doesn't switch to tinyglobby because project B is still using fast-glob and project B doesn't switch to tinyglobby because project A is still using fast-glob, and so on. This argument creates a deadlock, preventing meaningful changes to the dependency tree. Realistically, both fast-glob and tinyglobby are popular enough that it doesn't really matter.

But tinyglobby isn't as stable as fast-glob, it may have bugs

In my draft PR, all of the checks passed (https://github.com/typescript-eslint/typescript-eslint/pull/10544/checks). Sure, some extreme edge case may not work, but fast-glob will also have extreme edge cases that don't work. While tinyglobby may have the occasional bug, it will get fixed quickly. While fast-glob does okay with releases, its dependencies are another story- 9/17 (53%) haven't been updated in years: reusify, queue-microtask, run-parallel, is-extglob, is-glob, glob-parent, merge2, to-regex-range, is-number.

fs.glob and fs.globSync

fs.glob and fs.globSync were introduced in Node v22, which is the current "active" LTS. Even though Node v20 will be maintenance-only until it is deprecated next year, it is worth considering whether we can just use the native implementation instead of a dependency, which would have the best impact on supply chain security.

Wrapping up

As @mrmlnc is fast-glob's maintainer, I feel like they are biased towards keeping fast-glob. I don't maintain any of the current or proposed dependencies, so I don't have anything to gain or lose. My only motivation is improving the project and its supply chain.

If all that wasn't enough, tinyglobby comes with some notable performance improvements. For example, it uses fdir, which "can easily crawl a directory containing 1 million files in < 1 second."

Overall, there are more risks associated with fast-glob than tinyglobby, especially in the long term.

It would be incredibly short-sighted to stay with fast-glob given the clear issues with fast-glob's dependencies, the performance and package size improvements provided by tinyglobby, and most importantly the massive supply chain security impact of switching.

Please let me know if there are any other things concerning fast-glob or tinyglobby that I didn't address.

@SuperchupuDev
Copy link
Contributor

SuperchupuDev commented Jan 6, 2025

As an example - we currently have over 20 paths that fast-glob is transitively depended on in this repo. So if we remove our direct dep on it in /typescript-estree then our total install size will not change.

@bradzacher correct me if I'm wrong (I don't develop ts-eslint after all) but I suppose that all those 20 paths come from dev dependencies, because according to npmgraph, typescript-eslint only uses fast-glob in one place in production deps (the ones users will install) which is in @typescript-eslint/typescript-estree and no other subdependency. So install size would go down for users https://npmgraph.js.org/?q=typescript-eslint#select=exact%3Afast-glob%403.3.3

@Josh-Cena
Copy link
Member

I'm +1 to keeping fast-glob just because it's already in production and we have verified that "it works". I don't think the benefits offered here, while definitely desirable, outweigh the overhead of a migration and dealing with possible incompatibilities with existing usage. After we require Node 22+ we can just go to FS.glob since that seems like the same level of potential breakage as migrating to another library.

@bradzacher
Copy link
Member

bradzacher commented Jan 6, 2025

but I suppose that all those 20 paths come from dev dependencies, because according to npmgraph, typescript-eslint only uses fast-glob in one place in production deps (the ones users will install) which is in @typescript-eslint/typescript-estree and no other subdependency. So install size would go down for users

My point wasn't that our transitive deps come from dev deps -- my point was that our repo functions pretty well as an example of a pretty idiomatic TypeScript repo -- it has webpack plugins to bundle react code, it uses knip, etc.

I.e. our repo has some 20+ transitive paths through which fast-glob is currently depended on - so it's likely that the average TS repo also has at least 5 or more similar transitive dependency paths. So removing it from our production dep closure would provide zero improvement for the average TS repo because they would still be installing fast-glob for the remaining transitive requirements.

As an example - I just looked at my company's lockfile and I could quickly see over 30 transitive dependency pathways to fast-glob. Funnily enough too - there are ZERO references to tinyglobby - which means that if we make this change we don't reduce the install size or the total dependency count for the repo -- we will INCREASE both numbers -- we'll regress the very metric we're looking to improve!

@bradzacher
Copy link
Member

As @Josh-Cena said - we have a working solution right now.
If we have a truly better library because it is meaningfully faster for users - then I'm all for it.

But if the metric we're working to improve is purely:

  • "dependency count"
  • "disk install size"
  • security by reducing dependency count

But these are a false metrics to optimise for because there are so many usages of globby and fast-glob that removing just our usage won't actually improve things for anyone and could actually regress these very metrics for some users.

What is the improvement that end users will actually see?

@benmccann
Copy link
Contributor Author

Fair enough. We can create some benchmarks to show what speed differences there are. I'd like to wait for the next release of tinyglobby before doing so as the upcoming release is significantly faster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocked by another issue Issues which are not ready because another issue needs to be resolved first enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants