fs.readdir() with filter, recursion, absolute paths, promises, streams, and more!

Enhanced fs.readdir()

Features

Example

import readdir from "@jsdevtools/readdir-enhanced";
import through2 from "through2";

// Synchronous API
let files = readdir.sync("my/directory");

// Callback API
readdir.async("my/directory", (err, files) => { ... });

// Promises API
readdir.async("my/directory")
  .then((files) => { ... })
  .catch((err) => { ... });

// Async/Await API
let files = await readdir.async("my/directory");

// Async Iterator API
for await (let item of readdir.iterator("my/directory")) {
  ...
}

// EventEmitter API
readdir.stream("my/directory")
  .on("data", (path) => { ... })
  .on("file", (path) => { ... })
  .on("directory", (path) => { ... })
  .on("symlink", (path) => { ... })
  .on("error", (err) => { ... });

// Streaming API
let stream = readdir.stream("my/directory")
  .pipe(through2.obj(function(data, enc, next) {
    console.log(data);
    this.push(data);
    next();
  }));

Installation

Install using npm:

npm install @jsdevtools/readdir-enhanced

Pick Your API

Readdir Enhanced has multiple APIs, so you can pick whichever one you prefer. Here are some things to consider about each API:

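Function                                    | Returns         | Syntax                                          | Blocks the thread? | Buffers results?
--------------------------------------------|-----------------|-------------------------------------------------|--------------------|-----------------
readdirSync(), readdir.sync()               | Array           | Synchronous                                     | yes                | yes
readdir(), readdir.async(), readdirAsync()  | Promise         | async/await, Promise.then(), callback           | no                 | yes
readdir.iterator(), readdirIterator()       | Iterator        | for await...of                                  | no                 | no
readdir.stream(), readdirStream()           | Readable Stream | stream.on("data"), stream.read(), stream.pipe() | no                 | no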

Blocking the Thread

The synchronous API blocks the thread until all results have been read. Only use this if you know the directory does not contain many items, or if your program needs the results before it can do anything else.

Buffered Results

Some APIs buffer the results, which means you get all the results at once (as an array). This can be more convenient to work with, but it can also consume a significant amount of memory, depending on how many results there are. The non-buffered APIs return each result to you one-by-one, which means you can start processing the results even while the directory is still being read.
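
The difference between buffered and non-buffered results can be sketched with Node's built-in fs module alone. This is not how Readdir Enhanced is implemented; listBuffered and listOneByOne are hypothetical helper names used only for illustration.

```javascript
const fs = require("node:fs");

// Buffered: the whole listing is built in memory and returned as one array.
function listBuffered(dir) {
  return fs.readdirSync(dir);
}

// Non-buffered: entries are handed to the caller one at a time, so the
// caller can start working (or stop early) before the listing is complete.
function* listOneByOne(dir) {
  const handle = fs.opendirSync(dir);
  try {
    let entry;
    while ((entry = handle.readSync()) !== null) {
      yield entry.name;
    }
  } finally {
    // Runs when the listing is exhausted or when the caller stops early.
    handle.closeSync();
  }
}

// Usage: stop after the first entry without reading the rest of the directory.
for (const name of listOneByOne(".")) {
  console.log("first entry:", name);
  break;
}
```

The generator version keeps memory use flat regardless of directory size, at the cost of not being able to sort or count the results until they have all been consumed.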

Alias Exports

The example above imported the readdir default export and used its properties, such as readdir.sync or readdir.async, to call specific APIs. For convenience, each of the different APIs is also exported as a named function that you can import directly.

  • readdir.sync() is also exported as readdirSync()
  • readdir.async() is also exported as readdirAsync()
  • readdir.iterator() is also exported as readdirIterator()
  • readdir.stream() is also exported as readdirStream()

Here's how to import named exports rather than the default export:

import { readdirSync, readdirAsync, readdirIterator, readdirStream } from "@jsdevtools/readdir-enhanced";

Enhanced Features

Readdir Enhanced adds several features to the built-in fs.readdir() function. All of the enhanced features are opt-in, which makes Readdir Enhanced fully backward compatible by default. You can enable any of the features by passing an options argument as the second parameter.

Crawl Subdirectories

By default, Readdir Enhanced will only return the top-level contents of the starting directory. But you can set the deep option to recursively traverse the subdirectories and return their contents as well.

Crawl ALL subdirectories

The deep option can be set to true to traverse the entire directory structure.

import readdir from "@jsdevtools/readdir-enhanced";

readdir("my/directory", {deep: true}, (err, files) => {
  console.log(files);
  // => subdir1
  // => subdir1/file.txt
  // => subdir1/subdir2
  // => subdir1/subdir2/file.txt
  // => subdir1/subdir2/subdir3
  // => subdir1/subdir2/subdir3/file.txt
});

Crawl to a specific depth

The deep option can be set to a number to only traverse that many levels deep. For example, calling readdir("my/directory", {deep: 2}) will return subdir1/file.txt and subdir1/subdir2/file.txt, but it won't return subdir1/subdir2/subdir3/file.txt.

import readdir from "@jsdevtools/readdir-enhanced";

readdir("my/directory", {deep: 2}, (err, files) => {
  console.log(files);
  // => subdir1
  // => subdir1/file.txt
  // => subdir1/subdir2
  // => subdir1/subdir2/file.txt
  // => subdir1/subdir2/subdir3
});
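
To make the depth rule concrete, here is a minimal sketch of a depth-limited traversal written with Node's built-in modules only. It mirrors the output shown above but is not the library's implementation; crawlToDepth is a hypothetical helper, and counting top-level items as depth 0 is an assumption inferred from the example.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper (not part of Readdir Enhanced): list a directory tree,
// descending into a directory only while its depth is below maxDepth.
// Items directly inside the starting directory have depth 0.
function crawlToDepth(root, maxDepth) {
  const results = [];

  function visit(relDir, depth) {
    const entries = fs.readdirSync(path.join(root, relDir), { withFileTypes: true });
    for (const entry of entries) {
      // Join with "/" so the output looks the same on every OS.
      const relPath = relDir ? `${relDir}/${entry.name}` : entry.name;
      results.push(relPath);
      if (entry.isDirectory() && depth < maxDepth) {
        visit(relPath, depth + 1);
      }
    }
  }

  visit("", 0);
  return results;
}
```

With this rule, crawlToDepth(dir, 0) lists only the top level, and crawlToDepth(dir, 2) lists subdir1/subdir2/subdir3 itself but not its contents.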

Crawl subdirectories by name

For simple use-cases, you can use a regular expression or a glob pattern to crawl only the directories whose path matches the pattern. The path is relative to the starting directory by default, but you can customize this via options.basePath.

NOTE: Glob patterns always use forward-slashes, even on Windows. This does not apply to regular expressions though. Regular expressions should use the appropriate path separator for the environment. Or, you can match both types of separators using [\\/].

import readdir from "@jsdevtools/readdir-enhanced";

// Only crawl the "lib" and "bin" subdirectories
// (notice that the "node_modules" subdirectory does NOT get crawled)
readdir("my/directory", {deep: /lib|bin/}, (err, files) => {
  console.log(files);
  // => bin
  // => bin/cli.js
  // => lib
  // => lib/index.js
  // => node_modules
  // => package.json
});

Custom recursion logic

For more advanced recursion, you can set the deep option to a function that accepts an fs.Stats object and returns a truthy value if that directory should be crawled.

NOTE: The fs.Stats object that's passed to the function has additional path and depth properties. The path is relative to the starting directory by default, but you can customize this via options.basePath. The depth is the number of subdirectories beneath the base path (see options.deep).

import readdir from "@jsdevtools/readdir-enhanced";

// Crawl all subdirectories, except "node_modules"
function ignoreNodeModules (stats) {
  return stats.path.indexOf("node_modules") === -1;
}

readdir("my/directory", {deep: ignoreNodeModules}, (err, files) => {
  console.log(files);
  // => bin
  // => bin/cli.js
  // => lib
  // => lib/index.js
  // => node_modules
  // => package.json
});
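
The substring check above would also skip a directory named something like "my_node_modules_notes". A stricter variant, sketched below, compares whole path segments and also uses the depth property. shouldCrawl is a hypothetical name, and the function relies only on the path and depth properties described in the note; the exact depth values the library assigns are an assumption here.

```javascript
// Hypothetical predicate for the `deep` option.
function shouldCrawl(stats) {
  // Split on either separator so the check works for Windows and POSIX paths.
  const segments = stats.path.split(/[\\/]/);
  const name = segments[segments.length - 1];

  // Skip dependency folders and hidden directories, and stop below depth 3.
  return name !== "node_modules" && !name.startsWith(".") && stats.depth < 3;
}

// Intended usage, following the examples above:
// readdir("my/directory", {deep: shouldCrawl}, (err, files) => { ... });
```

Because the predicate only reads two plain properties, it can be unit-tested with simple objects such as {path: "lib", depth: 0} without touching the file system.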

Filtering

The filter option lets you limit the results based on any criteria you want.

Filter by name

For simple use-cases, you can use a regular expression or a glob pattern to filter items by their path. The path is relative to the starting directory by default, but you can customize this via options.basePath.

NOTE: Glob patterns always use forward-slashes, even on Windows. This does not apply to regular expressions though. Regular expressions should use the appropriate path separator for the environment. Or, you can match both types of separators using [\\/].

import readdir from "@jsdevtools/readdir-enhanced";

// Find all .txt files
readdir("my/directory", {filter: "*.txt"});

// Find all package.json files
readdir("my/directory", {filter: "**/package.json", deep: true});

// Find everything with at least one number in the name
readdir("my/directory", {filter: /\d+/});
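
As a small illustration of the [\\/] advice in the note above, the following sketch checks a separator-agnostic regular expression against a few sample paths. The sample paths are made up, and the pattern is only one possible way to write such a filter.

```javascript
// Matches "package.json" at the top level or in any subdirectory,
// whether the path uses "/" or "\" as its separator.
const packageJsonPattern = /(^|[\\/])package\.json$/;

const samplePaths = [
  "package.json",
  "lib/package.json",
  "lib\\utils\\package.json",
  "lib/package.json.bak",
  "lib/my-package.json",
];

const matches = samplePaths.filter((p) => packageJsonPattern.test(p));
console.log(matches);
// Only the first three sample paths match.
```

A pattern like this could then be passed as the filter option, for example {filter: packageJsonPattern, deep: true}.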

Custom filtering logic

For more advanced filtering, you can specify a filter function that accepts an fs.Stats object and returns a truthy value if the item should be included in the results.

NOTE: The fs.Stats object that's passed to the filter function has additional path and depth properties. The path is relative to the starting directory by default, but you can customize this via options.basePath. The depth is the number of subdirectories beneath the base path (see options.deep).

import readdir from "@jsdevtools/readdir-enhanced";

// Only return file names containing an underscore
function myFilter(stats) {
  return stats.isFile() && stats.path.indexOf("_") >= 0;
}

readdir("my/directory", {filter: myFilter}, (err, files) => {
  console.log(files);
  // => __myFile.txt
  // => my_other_file.txt
  // => img_1.jpg
});
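
Filter functions are ordinary functions, so they can be built by small factories and reused. The sketch below is one possible pattern; byExtension is a hypothetical helper, and it assumes only that the object passed to the filter has the isFile() method and path property described above.

```javascript
const path = require("node:path");

// Hypothetical factory: builds a filter function that keeps only files
// whose extension is in the given list (compared case-insensitively).
function byExtension(extensions) {
  const allowed = new Set(extensions.map((ext) => ext.toLowerCase()));
  return (stats) =>
    stats.isFile() && allowed.has(path.extname(stats.path).toLowerCase());
}

// Intended usage, following the examples above:
// readdir("my/directory", {filter: byExtension([".js", ".ts"]), deep: true});
```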

Get fs.Stats objects instead of strings

All of the Readdir Enhanced functions listed above return an array of strings (paths). But in some situations, the path isn't enough information. Setting the stats option returns an array of fs.Stats objects instead of path strings. The fs.Stats object contains all sorts of useful information, such as the size, the creation date/time, and helper methods such as isFile(), isDirectory(), isSymbolicLink(), etc.

NOTE: The fs.Stats objects that are returned also have additional path and depth properties. The path is relative to the starting directory by default, but you can customize this via options.basePath. The depth is the number of subdirectories beneath the base path (see options.deep).

import readdir from "@jsdevtools/readdir-enhanced";

readdir("my/directory", { stats: true }, (err, stats) => {
  for (let stat of stats) {
    console.log(`${stat.path} was created at ${stat.birthtime}`);
  }
});
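
Once you have fs.Stats objects, aggregate questions become simple loops. The following sketch totals the size of all regular files in a result list; summarize is a hypothetical helper, and it uses only the standard isFile() method and size property of fs.Stats.

```javascript
// Hypothetical helper: count the files in a list of fs.Stats objects
// and add up their sizes in bytes. Directories and other entries are skipped.
function summarize(statsList) {
  let fileCount = 0;
  let totalBytes = 0;
  for (const stat of statsList) {
    if (stat.isFile()) {
      fileCount += 1;
      totalBytes += stat.size;
    }
  }
  return { fileCount, totalBytes };
}

// Intended usage, following the example above:
// readdir("my/directory", { stats: true, deep: true }, (err, stats) => {
//   console.log(summarize(stats));
// });
```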

Base Path

By default all Readdir Enhanced functions return paths that are relative to the starting directory. But you can use the basePath option to customize this. The basePath will be prepended to all of the returned paths. One common use-case for this is to set basePath to the absolute path of the starting directory, so that all of the returned paths will be absolute.

import readdir from "@jsdevtools/readdir-enhanced";
import { resolve } from "path";

// Get absolute paths
let absPath = resolve("my/directory");
readdir("my/directory", {basePath: absPath}, (err, files) => {
  console.log(files);
  // => /absolute/path/to/my/directory/file1.txt
  // => /absolute/path/to/my/directory/file2.txt
  // => /absolute/path/to/my/directory/subdir
});

// Get paths relative to the working directory
readdir("my/directory", {basePath: "my/directory"}, (err, files) => {
  console.log(files);
  // => my/directory/file1.txt
  // => my/directory/file2.txt
  // => my/directory/subdir
});
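
Conceptually, basePath acts as a prefix on every returned path. The sketch below reproduces that effect on a plain array of relative paths; withBasePath is a hypothetical helper, and joining with path.posix is an assumption made so the output is the same on every OS (the library may build paths differently).

```javascript
const path = require("node:path");

// Hypothetical helper: prefix each relative result with a base path.
function withBasePath(basePath, relativePaths) {
  return relativePaths.map((p) => path.posix.join(basePath, p));
}

console.log(withBasePath("my/directory", ["file1.txt", "file2.txt", "subdir"]));
// => [ 'my/directory/file1.txt', 'my/directory/file2.txt', 'my/directory/subdir' ]
```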

Path Separator

By default, Readdir Enhanced uses the correct path separator for your OS (\ on Windows, / on Linux & MacOS). But you can set the sep option to any separator character(s) that you want to use instead. This is usually used to ensure consistent path separators across different OSes.

import readdir from "@jsdevtools/readdir-enhanced";

// Always use Windows path separators
readdir("my/directory", {sep: "\\", deep: true}, (err, files) => {
  console.log(files);
  // => subdir1
  // => subdir1\file.txt
  // => subdir1\subdir2
  // => subdir1\subdir2\file.txt
  // => subdir1\subdir2\subdir3
  // => subdir1\subdir2\subdir3\file.txt
});

Custom FS methods

By default, Readdir Enhanced uses the default Node.js FileSystem module for methods like fs.stat, fs.readdir and fs.lstat. But in some situations, you may want to use your own FS methods (FTP, SSH, remote drive, etc.). You can provide your own implementation of the FS methods by setting options.fs, or override specific methods, such as options.fs.stat.

import readdir from "@jsdevtools/readdir-enhanced";

function myCustomReaddirMethod(dir, callback) {
  callback(null, ["__myFile.txt"]);
}

let options = {
  fs: {
    readdir: myCustomReaddirMethod
  }
};

readdir("my/directory", options, (err, files) => {
  console.log(files);
  // => __myFile.txt
});
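
A slightly fuller sketch of a custom FS is an in-memory tree that answers readdir, stat and lstat with the same callback shape as Node's fs methods. Everything here (tree, lookup, memoryFs) is hypothetical, and which methods Readdir Enhanced calls, and with which arguments, is an assumption; treat it as a starting point rather than a tested adapter.

```javascript
// Hypothetical in-memory "file system": nested objects are directories,
// strings are file contents.
const tree = {
  my: { directory: { "a.txt": "hello", sub: { "b.txt": "world" } } },
};

// Walk the tree along a path split on "/" or "\"; returns undefined if missing.
function lookup(p) {
  return p
    .split(/[\\/]/)
    .filter(Boolean)
    .reduce(
      (node, part) =>
        node && typeof node === "object" && Object.hasOwn(node, part)
          ? node[part]
          : undefined,
      tree
    );
}

// Simplified error: a real fs would distinguish ENOENT from ENOTDIR.
function notFound(p) {
  return Object.assign(new Error(`ENOENT: no such file or directory, '${p}'`), {
    code: "ENOENT",
  });
}

const memoryFs = {
  // Same callback shape as fs.readdir(path, callback).
  readdir(dir, callback) {
    const node = lookup(dir);
    if (!node || typeof node !== "object") return callback(notFound(dir));
    callback(null, Object.keys(node));
  },
  // Minimal stand-in for fs.Stats; a real one has many more fields.
  stat(p, callback) {
    const node = lookup(p);
    if (node === undefined) return callback(notFound(p));
    const isDir = typeof node === "object";
    callback(null, {
      size: isDir ? 0 : node.length,
      isFile: () => !isDir,
      isDirectory: () => isDir,
      isSymbolicLink: () => false,
    });
  },
  // There are no symlinks in the tree, so lstat behaves like stat.
  lstat(p, callback) {
    memoryFs.stat(p, callback);
  },
};

// Intended usage, following the example above:
// readdir("my/directory", { fs: memoryFs, deep: true }, (err, files) => { ... });
```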

Backward Compatible

Readdir Enhanced is fully backward-compatible with Node.js' built-in fs.readdir() and fs.readdirSync() functions, so you can use it as a drop-in replacement in existing projects without affecting existing functionality, while still being able to use the enhanced features as needed.

import { readdir, readdirSync } from "@jsdevtools/readdir-enhanced";

// Use it just like Node's built-in fs.readdir function
readdir("my/directory", (err, files) => { ... });

// Use it just like Node's built-in fs.readdirSync function
let files = readdirSync("my/directory");

A Note on Streams

The Readdir Enhanced streaming API follows the Node.js streaming API. A lot of questions around the streaming API can be answered by reading the Node.js documentation. However, we've tried to answer the most common questions here.

Stream Events

All events in the Node.js streaming API are supported by Readdir Enhanced. These events include "end", "close", "drain", "error", plus more. An exhaustive list of events is available in the Node.js documentation.

Detect when the Stream has finished

Using these events, we can detect when the stream has finished reading files.

import readdir from "@jsdevtools/readdir-enhanced";

// Build the stream using the Streaming API
let stream = readdir.stream("my/directory")
  .on("data", (path) => { ... });

// Listen to the end event to detect the end of the stream
stream.on("end", () => {
  console.log("Stream finished!");
});
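
If you want the "end" event as a promise, a small wrapper is enough. The sketch below uses a stand-in stream created with Node's Readable.from() instead of readdir.stream(), so it runs without a real directory; collect is a hypothetical helper, not part of Readdir Enhanced.

```javascript
const { Readable } = require("node:stream");

// Hypothetical helper: gather every item a readable stream emits and
// resolve once "end" fires, or reject if the stream reports an error.
function collect(stream) {
  return new Promise((resolve, reject) => {
    const items = [];
    stream
      .on("data", (item) => items.push(item))
      .on("end", () => resolve(items))
      .on("error", reject);
  });
}

// Stand-in for readdir.stream("my/directory").
const demo = Readable.from(["file1.txt", "file2.txt", "subdir"]);
collect(demo).then((items) => {
  console.log(`Stream finished with ${items.length} items`);
});
```

Note that collecting everything gives up the memory benefit of streaming, so this pattern fits best when the result set is known to be small.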

Paused Streams vs. Flowing Streams

As with all Node.js streams, a Readdir Enhanced stream starts in "paused mode". For the stream to start emitting files, you'll need to switch it to "flowing mode".

There are many ways to trigger flowing mode, such as adding a stream.on("data") handler, using stream.pipe(), or calling stream.resume().

Unless you trigger flowing mode, your stream will stay paused and you won't receive any file events.

More information on paused vs. flowing mode can be found in the Node.js documentation.
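
You can observe the mode switches through the standard readableFlowing property of Node.js readable streams. The sketch below uses a stand-in stream created with Readable.from() rather than readdir.stream(); the assumption is that a Readdir Enhanced stream behaves like any other Readable in this respect.

```javascript
const { Readable } = require("node:stream");

// Stand-in for readdir.stream("my/directory").
const stream = Readable.from(["file1.txt", "file2.txt"]);

// null: nothing is consuming the stream yet.
console.log(stream.readableFlowing);

// Adding a "data" handler switches the stream to flowing mode.
stream.on("data", (path) => console.log("got", path));
console.log(stream.readableFlowing); // true

// pause() stops "data" events until resume() is called.
stream.pause();
console.log(stream.readableFlowing); // false

// Back to flowing mode; the items are delivered after this script yields.
stream.resume();
```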

Contributing

Contributions, enhancements, and bug-fixes are welcome! Open an issue on GitHub and submit a pull request.

Building

To build the project locally on your computer:

  1. Clone this repo
    git clone https://github.com/JS-DevTools/readdir-enhanced.git

  2. Install dependencies
    npm install

  3. Run the tests
    npm test

License

Readdir Enhanced is 100% free and open-source, under the MIT license. Use it however you want.

This package is Treeware. If you use it in production, then we ask that you buy the world a tree to thank us for our work. By contributing to the Treeware forest you’ll be creating employment for local families and restoring wildlife habitats.

Big Thanks To

Thanks to these awesome companies for their support of Open Source developers ❤

Travis CI SauceLabs Coveralls