jayconrod edited this page Jun 14, 2011 · 1 revision

The Tungsten language is intended to be simple, uniform, and verbose. It consists of the following (in order of increasing complexity):

Overview

Tungsten is an intermediate language modeled after LLVM. Its main difference from LLVM is that it provides high-level language constructs like classes and type parameterization. These constructs make it easier for developers to implement new compilers, since less lowering and validation code needs to be written. They also make code written in different languages interoperable, since the low-level representation of these higher-level constructs is always the same.

Tungsten code is packaged in units called modules. A module is just a set of definitions with an optional name and version. Modules may be generated from individual definitions, files, or groups of files. A module may be treated as a program, a library, or an intermediate file (like an object file in C).

Definitions are the basic element of the language. Certain kinds of definitions (globals, functions, structs, classes, interfaces) are always global. Tungsten has a very limited concept of scope. There is no way to nest classes or functions. All definitions have unique names within the scope of a module. Most top-level definitions have internal definitions. For instance, classes and structs have fields, and functions have parameters and blocks.
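As a rough illustration of the naming rule above, the following Python sketch models a module as a flat table of definitions keyed by full name. The `Module` and `Definition` classes and the `add` method are hypothetical names invented for this sketch; they are not taken from the Tungsten implementation. The names `fac`, `fac.n`, and `fac.loop_cond` come from the factorial example on this page.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Definition:
    """Hypothetical stand-in for any Tungsten definition."""
    name: str  # full name, e.g. "fac" or "fac.loop_cond"
    kind: str  # e.g. "function", "parameter", "block"


@dataclass
class Module:
    """Hypothetical model: a set of definitions with an optional name and version."""
    name: Optional[str] = None
    version: Optional[str] = None
    definitions: Dict[str, Definition] = field(default_factory=dict)

    def add(self, definition: Definition) -> None:
        # Names must be unique across the whole module, not just within one function.
        if definition.name in self.definitions:
            raise ValueError(f"duplicate definition: {definition.name}")
        self.definitions[definition.name] = definition


module = Module(name="example")
module.add(Definition("fac", "function"))
module.add(Definition("fac.n", "parameter"))
module.add(Definition("fac.loop_cond", "block"))
```

Because internal definitions such as parameters and blocks live in the same table as their enclosing function, a single lookup by full name is enough to find any definition in the module.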

Code in Tungsten is encoded as instructions within basic blocks, which are defined in functions. A block is a list of instructions that are executed sequentially. Execution of a block always starts at the beginning and goes to the end; there is no way to jump into the middle of a block or exit early. Instructions and blocks are definitions, so they also have unique names within the scope of a module. Each instruction performs some very simple task, such as adding two numbers, loading a value from memory, or calling a function. Each instruction produces a value, which can be used by the instructions that come after it. These values act like variables in a high-level language, but the value produced never changes once the assignment is made. This is because Tungsten is a static single assignment (SSA) language. An SSA representation makes compiler transformations and analyses much easier to implement.

Every value in Tungsten (whether it comes from an instruction, a parameter, or a global) has a type. Tungsten is strongly, statically typed (although it is likely we will add some dynamic features in the future). Values are never implicitly cast to other types, even when it is always safe to do so.
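To make the "no implicit casts" rule concrete, here is a minimal Python sketch of the kind of check a validator might apply to a binary operation. It assumes that both operands must have exactly the same type, which is one reading of the rule above rather than a documented requirement. The function `check_binop` is hypothetical, and `int32` is an assumed type name; only `int64` and `boolean` appear on this page.

```python
def check_binop(left_type: str, right_type: str) -> str:
    """Return the result type of a binary operation, rejecting mismatched operands."""
    # Assumption: nothing is widened implicitly, so int32 and int64
    # cannot be mixed even though the conversion would be safe.
    if left_type != right_type:
        raise TypeError(f"operand types differ: {left_type} vs {right_type}")
    return left_type
```

Under this assumption, `check_binop("int64", "int64")` returns `"int64"`, while `check_binop("int32", "int64")` raises a `TypeError`; the front end would have to emit an explicit conversion first.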

An example

Here is the canonical factorial example. See the full source code in the examples folder for explanatory comments.

function int64 @fac(int64 %n) {
  block %entry {
    branch @fac.loop_cond(int64 @fac.n, int64 1)
  }
  block %loop_cond(int64 %n, int64 %p) {
    boolean %c = relop int64 %n != int64 0
    cond boolean %c ? @fac.loop_body(int64 %n, int64 %p) : @fac.exit(int64 %p)
  }
  block %loop_body(int64 %n, int64 %p) {
    int64 %p#1 = binop int64 %n * int64 %p
    int64 %n#1 = binop int64 %n - int64 1
    branch @fac.loop_cond(int64 %n#1, int64 %p#1)
  }
  block %exit(int64 %p) {
    return int64 %p
  }
}
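For readers more used to high-level code, the following Python sketch mirrors the control flow of the listing above. It is an informal reading of the example, not a translation produced by any Tungsten tool: each block becomes a branch of a dispatch loop, block parameters become a tuple of arguments, and the SSA names `%p#1` and `%n#1` become the fresh variables `p_1` and `n_1`. Python integers are unbounded, so `int64` overflow is not modeled, and the sketch is only meant for non-negative inputs.

```python
def fac(n: int) -> int:
    # block %entry: branch to loop_cond with the arguments (n, 1)
    block, args = "loop_cond", (n, 1)
    while True:
        if block == "loop_cond":
            bn, bp = args                  # block parameters %n and %p
            c = bn != 0                    # boolean %c = relop ...
            if c:
                block, args = "loop_body", (bn, bp)
            else:
                block, args = "exit", (bp,)
        elif block == "loop_body":
            bn, bp = args
            p_1 = bn * bp                  # int64 %p#1 = binop ...
            n_1 = bn - 1                   # int64 %n#1 = binop ...
            block, args = "loop_cond", (n_1, p_1)
        else:                              # block %exit
            (bp,) = args
            return bp
```

Note that no variable inside a block is reassigned after it is defined; new values flow to the next block only through its parameters, which is how the listing expresses a loop without mutable variables.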