wrap-segments

Wrap lines at Unicode word boundaries, using Intl.Segmenter.

Existing wrapping libraries tend to work very well on plain 7-bit ASCII text. However, much of the world's text is not ASCII, and it needs to be wrapped too.

You might want to turn this:

f̵̩̣̺ö̶̧̧̢o̶̥̩̗̹ ̶̨̢͔̳b̷̧̥͍̥a̷̛̦͓̜r̴̡͕̳̪

into this:

f̵̩̣̺ö̶̧̧̢o̶̥̩̗̹ ̶̨̢͔̳
b̷̧̥͍̥a̷̛̦͓̜r̴̡͕̳̪

by wrapping every 4 grapheme clusters.
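
To make "grapheme clusters" concrete, here is a minimal sketch that counts and chunks them with the runtime's own Intl.Segmenter. The helper name chunkGraphemes is hypothetical and is not part of this package. It also ignores word boundaries, which the real wrapper respects.

```javascript
// Hypothetical helper, not part of wrap-segments: split text into chunks
// of at most `width` grapheme clusters, ignoring word boundaries.
function chunkGraphemes(text, width) {
  const segmenter = new Intl.Segmenter(undefined, {granularity: 'grapheme'})
  const chunks = []
  let current = ''
  let count = 0
  for (const {segment} of segmenter.segment(text)) {
    if (count === width) {
      // The current chunk is full; start a new one.
      chunks.push(current)
      current = ''
      count = 0
    }
    current += segment
    count++
  }
  if (current) {
    chunks.push(current)
  }
  return chunks
}

// Each "e" + U+0301 (combining acute) pair is two code points but one cluster.
console.log(chunkGraphemes('e\u0301e\u0301e\u0301', 2))
```

Counting with String.prototype.length would split the base letter from its combining marks, which is the failure mode the example at the top is meant to avoid.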

Installation

npm install wrap-segments

API

None of the options are required, and you can omit the options entirely to take all of the defaults. The example below shows the default options:

import {SegmentWrapper} from '../lib/index.js'

const w = new SegmentWrapper({
  escape: identityTransform, // Escape inputs before processing
  indent: '', // Can be a string or number
  indentChar: ' ', // If indent is a number, repeat this that many times
  indentEmpty: false, // If the input is empty, still indent?
  indentFirst: true, // Indent the first line?
  isEmpty: /^\s*$/u, // Is a given text segment empty?  Only applies to non-wordLike segments.
  isNewline: /((?![\r\n\v\f\x85\u2028\u2029])\s)*[\r\n\v\f\x85\u2028\u2029]+(\s*)/gu, // Replace newlines matching this with newlineReplacement
  locale: DEFAULT_LOCALE, // Default is calculated by the JS runtime
  newline: '\n', // Insert this at the end of every line
  newlineReplacement: ' ', // What to replace isNewline with
  trim: true, // Trim whitespace from the end of the input
  width: 80, // In grapheme clusters, *including* indent
})

const wrapped = w.wrap('Lorem Ipsum...')

Generated API documentation is available.
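
As an illustration of what the default isNewline and newlineReplacement options describe, the following sketch applies that same regular expression with a plain String.prototype.replace. This is an assumption about the effect of the options based on their comments above, not the library's actual code path, and collapseNewlines is a hypothetical name.

```javascript
// The default isNewline pattern and newlineReplacement from the options above.
const isNewline = /((?![\r\n\v\f\x85\u2028\u2029])\s)*[\r\n\v\f\x85\u2028\u2029]+(\s*)/gu
const newlineReplacement = ' '

// Hypothetical helper: replace each run of newlines, together with the
// whitespace around it, with the replacement string.
function collapseNewlines(text) {
  return text.replace(isNewline, newlineReplacement)
}

console.log(JSON.stringify(collapseNewlines('foo  \n  bar')))
console.log(JSON.stringify(collapseNewlines('a\r\n\r\nb')))
```

The pattern only matches where at least one newline-like character is present, so ordinary spaces inside a line are left alone.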

Command line

A CLI is available as a separate package.

Caveats

This hasn't been tested with enough languages. Please submit an issue or PR if you speak Korean, a language that uses the Devanagari script, a language that uses a right-to-left script such as Arabic or Hebrew, etc.
This does not implement the full line breaking algorithm from Unicode TR14. I'm hoping that the Intl.Segmenter word boundaries are "close enough" for most cases. It's hard to get access to all of the needed properties from the JS runtime without including version-specific Unicode data, which I don't want to do. However, there are some rules in that algorithm that would be worth adding, with some careful thought.
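
To show what relying on Intl.Segmenter word boundaries looks like in practice, here is a rough greedy wrapper. It is a sketch under stated assumptions, not this package's implementation: greedyWrap and graphemeLength are hypothetical names, breaks are only considered before word-like segments, and punctuation is simply kept on the current line even if that overflows the width.

```javascript
// Hypothetical helper: length of text in grapheme clusters.
function graphemeLength(text, locale) {
  const graphemes = new Intl.Segmenter(locale, {granularity: 'grapheme'})
  let n = 0
  for (const _ of graphemes.segment(text)) {
    n++
  }
  return n
}

// Hypothetical greedy wrap: break before a word-like segment that would
// push the line past `width` grapheme clusters.
function greedyWrap(text, width, locale = 'en') {
  const words = new Intl.Segmenter(locale, {granularity: 'word'})
  const lines = []
  let line = ''
  let lineLen = 0
  for (const {segment, isWordLike} of words.segment(text)) {
    const segLen = graphemeLength(segment, locale)
    if (isWordLike && lineLen > 0 && lineLen + segLen > width) {
      lines.push(line.trimEnd())
      line = ''
      lineLen = 0
    }
    // Do not start a line with whitespace.
    if (lineLen === 0 && /^\s*$/u.test(segment)) {
      continue
    }
    line += segment
    lineLen += segLen
  }
  if (line.trimEnd()) {
    lines.push(line.trimEnd())
  }
  return lines
}

console.log(greedyWrap('foo bar baz', 7))
```

A word longer than the width is left on a line of its own rather than being split, and none of the TR14 rules about where punctuation may or may not break are applied. That gap is the one the caveat above describes.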
