xie392/lexer

A C language lexer built with TypeScript.

Documentation: Chinese / English

Lexical Analyzer

A lexical analyzer (Lexer, also known as a scanner) is a part of a compiler or interpreter that breaks down the input source code string into small pieces called tokens. Each token typically represents a basic syntactical unit in the source code, such as keywords, identifiers, operators, constants, etc. Lexical analysis is the first phase of the compilation process, and its primary task is to transform complex source code strings into an easily manageable stream of tokens.
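To make the idea of a token concrete, here is a minimal sketch of what a token stream for `int a = 1;` could look like. The `Token` interface, the `TokenKind` names, and the `exampleTokens` array are hypothetical and chosen only for illustration; the objects this package actually returns may be shaped differently.

```typescript
// Hypothetical token shape for illustration; not the type exported by the package.
type TokenKind = 'keyword' | 'identifier' | 'operator' | 'constant' | 'punctuator'

interface Token {
  kind: TokenKind
  text: string
}

// One plausible token stream for the C statement `int a = 1;`
const exampleTokens: Token[] = [
  { kind: 'keyword', text: 'int' },
  { kind: 'identifier', text: 'a' },
  { kind: 'operator', text: '=' },
  { kind: 'constant', text: '1' },
  { kind: 'punctuator', text: ';' },
]

// Joining the token texts with single spaces gives a normalized form of the source.
const rejoined = exampleTokens.map((t) => t.text).join(' ')
```

Later compiler phases work on entries like these rather than on raw characters.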

Principles of Lexical Analyzer

A lexical analyzer works in two steps:

  • It reads the source code string and groups its characters into lexical units.
  • It converts each lexical unit into a token and emits the tokens as a stream.
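The two steps above can be sketched with a regular expression and a small classifier. This is only an illustration under simplifying assumptions (a tiny keyword set, integer constants only); `readLexemes`, `toToken`, and `KEYWORDS` are hypothetical names and not part of this package.

```typescript
// Hypothetical helpers; a sketch of the two steps, not the package's code.
const KEYWORDS = new Set(['int', 'char', 'return', 'if', 'else', 'while'])

// Step 1: read lexical units from the source string.
// Matches identifiers/keywords, integer constants, or any single non-space character.
function readLexemes(source: string): string[] {
  return source.match(/[A-Za-z_]\w*|\d+|\S/g) || []
}

// Step 2: convert one lexical unit into a token.
function toToken(lexeme: string): { type: string; value: string } {
  if (KEYWORDS.has(lexeme)) return { type: 'keyword', value: lexeme }
  if (/^[A-Za-z_]\w*$/.test(lexeme)) return { type: 'identifier', value: lexeme }
  if (/^\d+$/.test(lexeme)) return { type: 'constant', value: lexeme }
  return { type: 'symbol', value: lexeme }
}

// The token stream for a short C statement.
const stream = readLexemes('int a = 1;').map(toToken)
```

Keeping the two steps separate makes it easy to change how units are classified without touching how they are read.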

Implementation of Lexical Analyzer

A Position class represents the position of a lexical unit, that is, its line number and column number, and is used to advance through the lexical units. The next() method returns the next lexical unit. A deterministic finite automaton (DFA) determines the kind of each lexical unit. Finally, the lexical units are converted into a stream of tokens.
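The design described above could be sketched roughly as follows. Only the names Position and next() come from this document; the `Scanner` class, the `step` transition function, the `State` names, and the `Lexeme` shape are assumptions made for illustration, and the package's real implementation may differ.

```typescript
// Hypothetical sketch: apart from Position and next(), all names are assumptions.
class Position {
  index = 0
  line = 1
  column = 1

  // Move past one character, updating the line and column counters.
  advance(ch: string): void {
    this.index++
    if (ch === '\n') {
      this.line++
      this.column = 1
    } else {
      this.column++
    }
  }
}

type State = 'start' | 'ident' | 'number' | 'symbol'

interface Lexeme {
  text: string
  state: State // state the DFA was in when the unit ended
  line: number
  column: number
}

// DFA transition function: returns the next state, or null when
// the current lexical unit cannot be extended by `ch`.
function step(state: State, ch: string): State | null {
  const isLetter = /[A-Za-z_]/.test(ch)
  const isDigit = /[0-9]/.test(ch)
  if (state === 'start') return isLetter ? 'ident' : isDigit ? 'number' : 'symbol'
  if (state === 'ident') return isLetter || isDigit ? 'ident' : null
  if (state === 'number') return isDigit ? 'number' : null
  return null // a symbol is a single character
}

class Scanner {
  private pos = new Position()
  private source: string

  constructor(source: string) {
    this.source = source
  }

  // Returns the next lexical unit, or null at the end of the input.
  next(): Lexeme | null {
    // Skip whitespace between lexical units.
    while (this.pos.index < this.source.length && /\s/.test(this.source.charAt(this.pos.index))) {
      this.pos.advance(this.source.charAt(this.pos.index))
    }
    if (this.pos.index >= this.source.length) return null

    const line = this.pos.line
    const column = this.pos.column
    const startIndex = this.pos.index
    let state: State = 'start'
    // Run the DFA until no transition is possible.
    while (this.pos.index < this.source.length) {
      const ch = this.source.charAt(this.pos.index)
      const nextState = step(state, ch)
      if (nextState === null) break
      state = nextState
      this.pos.advance(ch)
    }
    return { text: this.source.slice(startIndex, this.pos.index), state, line, column }
  }
}
```

Calling next() repeatedly until it returns null yields every lexical unit together with the line and column where it starts, which is the information needed to build tokens and report errors.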

Usage of Lexical Analyzer

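npm install lexers

import { createTokenizer } from 'lexers'

// C source code to tokenize
const code = `int a = 1;`

// Create a tokenizer for the source, then run the lexer to get the tokens
const tokenizer = createTokenizer(code)
const tokens = tokenizer.lexer()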

Contributions

[Image: Project Statistics]

Q&A

If you run into problems or have questions, please submit an issue.
