Some much needed refactoring #59

Merged 9 commits on May 21, 2021
41 changes: 41 additions & 0 deletions ARCHITECTURE.md
@@ -0,0 +1,41 @@
# Architecture

This document describes the high-level architecture of `fdir`. If you want to familiarize yourself with the code base, you are in the right place!

---

At the highest level, `fdir` is a library that accepts a path to a directory as input and outputs all the file paths in that directory recursively.

More specifically, the input consists of a path to a directory (`rootDirectory`) plus flags and filters that control the walking process. To increase performance, `fdir` builds its internal functions conditionally, based on the flags passed. Since these "conditional" functions are tiny, the JavaScript engine can inline them, which reduces branching and allocations.
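The idea of building functions conditionally can be sketched roughly as follows. This is a minimal illustration, not the actual `fdir` code: the option name `onlyCounts` and the helpers `buildPushFile` and `processFilenames` are assumptions made up for the example.

```javascript
// Hypothetical sketch; the option and helper names below are illustrative.

// Variant 1: collect every file.
function pushFile(filename, files) {
  files.push(filename);
}

// Variant 2: collect only files that pass every filter.
function makePushFileFilter(filters) {
  return function pushFileFilter(filename, files) {
    if (filters.every((filter) => filter(filename))) files.push(filename);
  };
}

// Variant 3: only count files, never store them.
function makePushFileCount(counts) {
  return function pushFileCount() {
    counts.files++;
  };
}

// Decide once, up front, which variant the hot loop will call.
function buildPushFile(options, counts) {
  if (options.onlyCounts) return makePushFileCount(counts);
  if (options.filters.length > 0) return makePushFileFilter(options.filters);
  return pushFile;
}

function processFilenames(filenames, options) {
  const counts = { files: 0 };
  const files = [];
  const push = buildPushFile(options, counts);
  // No flag checks inside the loop: `push` was chosen above.
  for (const filename of filenames) push(filename, files);
  return options.onlyCounts ? counts : files;
}

console.log(processFilenames(["a.js", "b.md"], { onlyCounts: false, filters: [] }));
console.log(processFilenames(["a.js", "b.md"], { onlyCounts: true, filters: [] }));
```

The flags are inspected once per crawl rather than once per file, which is the property the paragraph above describes.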

## Entry Points

`index.js` is the main entry point and exports the `fdir` class. There is nothing of importance in this file aside from that export.

`src/builder/index.js` contains the main API of `fdir` exposed via a `Builder` class. This is where all the flags & filters are built and passed (as an `options` Object) onto the core of `fdir`.
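As a rough picture of how a builder can collect flags into an `options` object, consider the sketch below. `SketchBuilder` and its option fields (`includeBasePath`, `normalizePath`) are hypothetical; only the method names `withBasePath`, `normalize` and `crawl` are taken from the test in this pull request, and the real `Builder` may work differently.

```javascript
// Hypothetical builder: every method records a flag and returns `this`
// so that calls can be chained.
class SketchBuilder {
  constructor() {
    // Assumed option names, chosen for illustration only.
    this.options = {
      includeBasePath: false,
      normalizePath: false,
      filters: [],
    };
  }

  withBasePath() {
    this.options.includeBasePath = true;
    return this;
  }

  normalize() {
    this.options.normalizePath = true;
    return this;
  }

  // Assumed helper for registering a filter function.
  filter(fn) {
    this.options.filters.push(fn);
    return this;
  }

  // Final step: hand the root directory and the options to the core.
  crawl(rootDirectory) {
    return { rootDirectory, options: this.options };
  }
}

const job = new SketchBuilder().withBasePath().normalize().crawl("/");
console.log(job.rootDirectory, job.options.includeBasePath, job.options.normalizePath);
```

Returning `this` from each method is what makes the chained style in the test (`new fdir().withBasePath().normalize().crawl("/")`) possible.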

## Code Map

This section talks briefly about all the directories and what each file in each directory does.

### `src/api`

This is the core of `fdir`.

**`walker.js`:** This contains the `Walker` class which is responsible for controlling and maintaining the state of the directory walker. It builds the conditional functions, processes the `Dirents` and delegates the actual filesystem directory reading to sync/async APIs.

**`async.js`:** This contains the asynchronous (`fs.readdir`) logic. This is the starting point of the async crawling process.

**`queue.js`:** This contains a tiny `Queue` class that makes sure `fdir` doesn't exit early during walking. It increments a counter for each "walk" queued and decrements it when that walk finishes. Once the counter hits 0, it calls the callback, which returns the output to the user.

**`sync.js`:** This contains the synchronous (`fs.readdirSync`) logic. This is the starting point of the sync crawling process.

**`fns.js`:** This contains the implementations of all the conditional functions.
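How the walker, the async reader and the queue counter fit together can be sketched on an in-memory tree. Everything here is a stand-in: `tree`, `fakeReaddir` and `walkTree` are hypothetical names, and a plain `pending` counter plays the role of the `Queue` class.

```javascript
// In-memory "filesystem": objects are directories, strings are files.
const tree = {
  etc: { hosts: "127.0.0.1", ssh: { config: "" } },
  "readme.md": "",
};

// Fake async readdir over the in-memory tree (stand-in for fs.readdir).
function fakeReaddir(node, callback) {
  setImmediate(() => callback(null, Object.keys(node)));
}

function walkTree(root, readdir, done) {
  const files = [];
  let pending = 0; // number of directory reads still in flight

  function walk(node, prefix) {
    pending++; // "queue" a walk
    readdir(node, (error, names) => {
      // Simplification: report the first error and stop processing this directory.
      if (error) return done(error);
      for (const name of names) {
        const child = node[name];
        const childPath = `${prefix}/${name}`;
        if (typeof child === "object") walk(child, childPath);
        else files.push(childPath);
      }
      // "dequeue": the last walk to finish reports the result.
      if (--pending === 0) done(null, files);
    });
  }

  walk(root, "");
}

walkTree(tree, fakeReaddir, (error, files) => {
  if (error) throw error;
  console.log(files);
});
```

Without the counter, the first directory read to complete could report a result while reads of sub directories are still outstanding.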

### `src/builder`

This is what gets exposed to the developer. It contains two builders that aid in building an `options` object to control various aspects of the walker.

### `src/compat`

Since `fdir` supports Node versions older than 10, this directory contains the compatibility code that bridges the newer (v10.0) filesystem API with the older (v8.0) filesystem API.
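One way such a bridge could look is sketched below. It is not the code in `src/compat`: `readdirCompatSync` is a hypothetical helper, and the fallback assumes that an older runtime ignores the unknown `withFileTypes` option and returns plain file names.

```javascript
const fs = require("fs");
const path = require("path");

// Returns entries shaped like { name, isDirectory } on old and new runtimes.
function readdirCompatSync(directoryPath) {
  const entries = fs.readdirSync(directoryPath, { withFileTypes: true });
  return entries.map((entry) => {
    if (typeof entry !== "string") {
      // Dirent objects are available: no extra filesystem call is needed.
      return { name: entry.name, isDirectory: entry.isDirectory() };
    }
    // Assumed fallback: plain names came back, so ask lstat for the type.
    const stats = fs.lstatSync(path.join(directoryPath, entry));
    return { name: entry, isDirectory: stats.isDirectory() };
  });
}

console.log(readdirCompatSync("."));
```

The caller sees the same shape either way, so the rest of the walker does not need to know which filesystem API produced the entries.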
92 changes: 92 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,92 @@
# Contributing

When contributing to this repository, please first discuss the change you wish to make via issue,
email, or any other method with the owners of this repository before making a change.

Please note that we have a code of conduct; please follow it in all your interactions with the project.

## Pull Request Process

1. Ensure any install or build dependencies are removed before the end of the layer when doing a
build.
2. Update the README.md with details of changes to the interface; this includes new environment
variables, exposed ports, useful file locations and container parameters.
3. Increase the version numbers in any example files and the README.md to the new version that this
Pull Request would represent. The versioning scheme we use is [SemVer](http://semver.org/).
4. You may merge the Pull Request in once you have the sign-off of two other developers, or if you
do not have permission to do that, you may request the second reviewer to merge it for you.

## Code of Conduct

### Our Pledge

In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, gender identity and expression, level of experience,
nationality, personal appearance, race, religion, or sexual identity and
orientation.

### Our Standards

Examples of behavior that contributes to creating a positive environment
include:

- Using welcoming and inclusive language
- Being respectful of differing viewpoints and experiences
- Gracefully accepting constructive criticism
- Focusing on what is best for the community
- Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

- The use of sexualized language or imagery and unwelcome sexual attention or
advances
- Trolling, insulting/derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as a physical or electronic
address, without explicit permission
- Other conduct which could reasonably be considered inappropriate in a
professional setting

### Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.

### Scope

This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.

### Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at [INSERT EMAIL ADDRESS]. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.

### Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at [http://contributor-covenant.org/version/1/4][version]

[homepage]: http://contributor-covenant.org
[version]: http://contributor-covenant.org/version/1/4/
13 changes: 7 additions & 6 deletions __tests__/fdir.test.js
@@ -137,15 +137,16 @@ describe.each(["withPromise", "sync"])("fdir %s", (type) => {

test("recurse root (files should not contain multiple /)", async () => {
mock({
"/": {
etc: {
hosts: "dooone",
},
"/etc": {
hosts: "dooone",
},
});
const api = new fdir().normalize().crawl("/");
const api = new fdir()
.withBasePath()
.normalize()
.crawl("/");
const files = await api[type]();
expect(files.every((file) => !file.includes("/"))).toBe(true);
expect(files.every((file) => !file.includes("//"))).toBe(true);
mock.restore();
});

55 changes: 42 additions & 13 deletions src/api/async.js
@@ -1,41 +1,70 @@
const { readdir } = require("../compat/fs");
const Queue = require("./queue");
const { makeWalkerFunctions, readdirOpts } = require("./shared");
const { Walker, readdirOpts } = require("./walker");

function promise(dir, options) {
/**
* This is basically a `promisify` around the callback function.
* @param {string} directoryPath Directory path to start walking from
* @param {Object} options The options to configure the Walker
* @returns {Promise} Promise that resolves to Output
*/
function promise(directoryPath, options) {
return new Promise((resolve, reject) => {
callback(dir, options, (err, output) => {
callback(directoryPath, options, (err, output) => {
if (err) return reject(err);
resolve(output);
});
});
}

function callback(dirPath, options, callback) {
const { init, walkSingleDir } = makeWalkerFunctions();
/**
* Register a Walker and start walking asynchronously until we reach
* the end (or maxDepth); then call the callback function and exit.
* @param {string} directoryPath Directory path to start walking from
* @param {Object} options The options to configure the Walker
* @param {(error: Object, output: Object) => void} callback
*/
function callback(directoryPath, options, callback) {
let walker = new Walker(options, callback);
walker.registerWalker(walkDirectory);
walker.state.queue = new Queue(walker.callbackInvoker);

const { state, callbackInvoker, dir } = init(dirPath, options, callback);
state.queue = new Queue(callbackInvoker);

// perf: we pass everything in arguments to avoid creating a closure
walk(state, dir, options.maxDepth, walkSingleDir);
const root = walker.normalizePath(directoryPath);
walker.walk(walker, root, options.maxDepth);
}

function walk(state, dir, currentDepth, walkSingleDir) {
/**
* Walk a directory asynchronously. This function is called internally
* by the Walker whenever it encounters a sub directory.
*
* Since this is async, we use a custom queue system to track all concurrent
* fs.readdir calls. Once the queue counter hits 0, we call the callback and exit.
* @param {Walker} walker The core Walker that controls the whole walking process (we don't use `this` to keep things explicit)
* @param {string} directoryPath Path to the directory
* @param {number} currentDepth The depth walker is at currently (value starts from options.maxDepth and decreases every time a sub directory is encountered)
* @returns {void}
*/
function walkDirectory(walker, directoryPath, currentDepth) {
const { state } = walker;

state.queue.queue();

if (currentDepth < 0) {
state.queue.dequeue(null, state);
return;
}

readdir(dir, readdirOpts, function (error, dirents) {
// Perf: Node >= 10 introduced withFileTypes that helps us
// skip an extra fs.stat call.
// However, since this API is not available in Node < 10, I had to create
// a compatibility layer to support both variants.
readdir(directoryPath, readdirOpts, function(error, dirents) {
if (error) {
state.queue.dequeue(error, state);
return;
}

walkSingleDir(walk, state, dir, dirents, currentDepth);
walker.processDirents(dirents, directoryPath, currentDepth);
state.queue.dequeue(null, state);
});
}
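The `promise` wrapper above follows a common pattern: wrap a Node-style callback API in a `Promise`, rejecting on error and resolving with the output. A self-contained sketch of that pattern is shown here. `listWithCallback` and `listWithPromise` are hypothetical stand-ins, not functions from this pull request.

```javascript
// Hypothetical callback-style API standing in for the `callback` function.
function listWithCallback(directoryPath, options, callback) {
  if (typeof directoryPath !== "string") {
    callback(new TypeError("directoryPath must be a string"));
    return;
  }
  callback(null, [`${directoryPath}/example.txt`]);
}

// Same shape as the wrapper in the diff: reject on error, resolve with output.
function listWithPromise(directoryPath, options) {
  return new Promise((resolve, reject) => {
    listWithCallback(directoryPath, options, (err, output) => {
      if (err) return reject(err);
      resolve(output);
    });
  });
}

listWithPromise("/tmp", {}).then((files) => console.log(files));
```

For functions that take an error-first callback as their last argument, Node's built-in `util.promisify` produces an equivalent wrapper.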
81 changes: 28 additions & 53 deletions src/api/fns.js
@@ -1,93 +1,70 @@
const { sep } = require("path");
const fs = require("fs");

/* GET ARRAY */
module.exports.getArray = function(state) {
return state.paths;
};

module.exports.getArrayGroup = function() {
return [""].slice(0, 0);
};

/** PUSH FILE */
module.exports.pushFileFilterAndCount = function(filters) {
return function(filename, _files, _dir, state) {
if (filters.every((filter) => filter(filename, false)))
state.counts.files++;
};
module.exports.pushFileFilterAndCount = function(walker, filename) {
if (walker.options.filters.every((filter) => filter(filename, false)))
module.exports.pushFileCount(walker);
};

module.exports.pushFileFilter = function(filters) {
return function(filename, files) {
if (filters.every((filter) => filter(filename, false)))
files.push(filename);
};
module.exports.pushFileFilter = function(walker, filename, files) {
if (walker.options.filters.every((filter) => filter(filename, false)))
files.push(filename);
};

module.exports.pushFileCount = function(_filename, _files, _dir, state) {
state.counts.files++;
module.exports.pushFileCount = function(walker) {
walker.state.counts.files++;
};
module.exports.pushFile = function(filename, files) {
module.exports.pushFile = function(_walker, filename, files) {
files.push(filename);
};

/** PUSH DIR */
module.exports.pushDir = function(dirPath, paths) {
module.exports.pushDir = function(_walker, dirPath, paths) {
paths.push(dirPath);
};

module.exports.pushDirFilter = function(filters) {
return function(dirPath, paths) {
if (filters.every((filter) => filter(dirPath, true))) {
paths.push(dirPath);
}
};
module.exports.pushDirFilter = function(walker, dirPath, paths) {
if (walker.options.filters.every((filter) => filter(dirPath, true))) {
paths.push(dirPath);
}
};

/** JOIN PATH */
module.exports.joinPathWithBasePath = function(filename, dir) {
return `${dir}${sep}${filename}`;
return `${dir}${dir.endsWith(sep) ? "" : sep}${filename}`;
};
module.exports.joinPath = function(filename) {
return filename;
};

/** WALK DIR */
module.exports.walkDirExclude = function(exclude) {
return function(walk, state, path, dir, currentDepth, walkSingleDir) {
if (!exclude(dir, path)) {
module.exports.walkDir(
walk,
state,
path,
dir,
currentDepth,
walkSingleDir
);
}
};
};

module.exports.walkDir = function(
walk,
state,
module.exports.walkDirExclude = function(
walker,
path,
_dir,
currentDepth,
walkSingleDir
directoryName,
currentDepth
) {
state.counts.dirs++;
walk(state, path, currentDepth, walkSingleDir);
if (!walker.options.excludeFn(directoryName, path)) {
module.exports.walkDir(walker, path, directoryName, currentDepth);
}
};

module.exports.walkDir = function(walker, path, _directoryName, currentDepth) {
walker.state.counts.dirs++;
walker.walk(walker, path, currentDepth);
};

/** GROUP FILES */
module.exports.groupFiles = function(dir, files, state) {
state.counts.files += files.length;
state.paths.push({ dir, files });
};
module.exports.empty = function() {};

/** CALLBACK INVOKER */
module.exports.callbackInvokerOnlyCountsSync = function(state) {
return state.counts;
};
@@ -111,8 +88,6 @@ function callbackInvokerBuilder(output) {
};
}

/** SYMLINK RESOLVER */

module.exports.resolveSymlinksAsync = function(path, state, callback) {
state.queue.queue();
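The change to `joinPathWithBasePath` in this file is what the updated test ("files should not contain multiple /") exercises. The effect can be seen in isolation with the sketch below. `joinNaive` and `joinChecked` are illustrative names, and `path.posix` is used here (an assumption made for the example) so that the separator is `/` on every platform.

```javascript
// Use the POSIX separator so the output is the same on every OS.
const { sep } = require("path").posix;

// Old behaviour: always insert a separator.
const joinNaive = (filename, dir) => `${dir}${sep}${filename}`;

// New behaviour: skip the separator when `dir` already ends with one.
const joinChecked = (filename, dir) =>
  `${dir}${dir.endsWith(sep) ? "" : sep}${filename}`;

console.log(joinNaive("etc", "/")); // "//etc"
console.log(joinChecked("etc", "/")); // "/etc"
console.log(joinChecked("hosts", "/etc")); // "/etc/hosts"
```

Crawling from the root directory is the case where the base path already ends with a separator, which is why the naive join produced a doubled slash there.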

12 changes: 9 additions & 3 deletions src/api/queue.js
@@ -1,14 +1,20 @@
/**
* This is a custom stateless queue to track concurrent async fs calls.
* It increments a counter whenever a call is queued and decrements it
* as soon as it completes. When the counter hits 0, it calls onQueueEmpty.
* @param {(error: any, output: any) => void} onQueueEmpty the callback to call when queue is empty
*/
function Queue(onQueueEmpty) {
this.onQueueEmpty = onQueueEmpty;
this.queuedCount = 0;
}

Queue.prototype.queue = function () {
Queue.prototype.queue = function() {
this.queuedCount++;
};

Queue.prototype.dequeue = function (...args) {
if (--this.queuedCount === 0) this.onQueueEmpty(...args);
Queue.prototype.dequeue = function(error, output) {
if (--this.queuedCount === 0 || error) this.onQueueEmpty(error, output);
};

module.exports = Queue;
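With the new `|| error` condition, `dequeue` reports an error immediately, even while other walks are still pending. The sketch below copies the new `Queue` and shows one consequence: the callback may fire a second time when the counter later reaches 0. The `once` guard is a hypothetical helper added for illustration; whether `fdir`'s own callback needs such a guard is not something this diff shows.

```javascript
function Queue(onQueueEmpty) {
  this.onQueueEmpty = onQueueEmpty;
  this.queuedCount = 0;
}

Queue.prototype.queue = function() {
  this.queuedCount++;
};

Queue.prototype.dequeue = function(error, output) {
  if (--this.queuedCount === 0 || error) this.onQueueEmpty(error, output);
};

// Hypothetical helper: forward only the first invocation.
function once(fn) {
  let called = false;
  return function(...args) {
    if (called) return;
    called = true;
    fn(...args);
  };
}

const calls = [];
const queue = new Queue(once((error, output) => calls.push({ error, output })));

queue.queue(); // walk A starts
queue.queue(); // walk B starts
queue.dequeue(new Error("EACCES"), null); // A fails: the callback fires right away
queue.dequeue(null, ["b.txt"]); // B finishes: counter hits 0, the guard swallows this call

console.log(calls.length); // 1
console.log(calls[0].error.message); // "EACCES"
```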