Disclaimer:
- These docs are unofficial and may be inaccurate or incomplete.
- Please file bugs at https://github.com/ntrel/cpp2/issues.
- At the time of writing, Cpp2 is an unstable experimental language, see:
Note: Some examples are snipped/adapted from: https://github.com/hsutter/cppfront/tree/main/regression-tests
Note: Examples here use C++23 std::println instead of std::cout.
If you don't have it, you can use this definition:
std: namespace = {
println: (args...) = (std::cout << ... << args) << "\n";
}
- Declarations
- Variables
- Modules
- Types
- Memory Safety
- Expressions
- Statements
- Functions
- User-Defined Types
- Templates
- Aliases
These are of the form:
- declaration:
  - identifier : type? = initializer
type can be omitted for type inference (though not at global scope).
x: int = 42;
y := x;
A global declaration can be used before the line declaring it.
Cpp1 declarations can be mixed in the same file.
// Cpp2
x: int = 42;
// Cpp1
int main() {
return x; // use a Cpp2 definition
}
A Cpp2 declaration cannot use Cpp1 declaration format internally:
// declare a function
f: () = {
int x; // error
}
Note: cppfront has a -p switch to only allow pure Cpp2.
Use of an uninitialized variable is statically detected.
When the variable declaration specifies the type, initialization can be
deferred to a later statement.
Both branches of an if statement must initialize a variable, or neither.
x: int;
y := x; // error, x is uninitialized
if f() {
x = 1; // initialization, not assignment
} else {
x = 0; // initialization required here too, otherwise an error
}
x = 2; // assignment
x: const int;
x = 5; // initialization
x = 6; // error
y: int = 7;
z: const _ = y; // z is a `const int`
Note that x does not need to be initialized immediately; initialization can be deferred.
This is particularly useful when using if branches to initialize the constant.
https://github.com/ntrel/cppfront/wiki/Design-note:-const-objects-by-default
A variable is implicitly moved on its last use when the use site syntax may accept an rvalue. This includes passing the variable as a function argument, but not assigning to the variable on its last use.
inc: (inout v: int) = v++;
test2: () = {
v := 42;
inc(v); // OK, lvalue
inc(v); // error, cannot pass rvalue
}
This can be suppressed by adding a statement _ = v; after the final inc call.
Cpp2 files have the file extensions .cpp2 and .h2.
C++23 will support:
import std;
This will be implicitly done in Cpp2. For now, common std headers are imported.
See also: User-Defined Types.
Use:
- std::array for fixed-size arrays.
- std::vector for dynamic arrays.
- std::span to reference consecutive elements from either.
A pointer to T has type *T. Pointer arithmetic is illegal.
Address of and dereference operators are postfix:
x: int = 42;
p: *int = x&;
y := p*;
This makes p-> obsolete - use p*. instead.
To distinguish these from binary & and *, write the binary operators with preceding whitespace.
new<T> gives unique_ptr by default:
p: std::unique_ptr<int> = new<int>;
q: std::shared_ptr<int> = shared.new<int>;
Note: gc.new<T> will allocate from a garbage collected arena.
There is no delete operator. Raw pointers cannot own memory.
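For readers coming from Cpp1, the two allocations above correspond roughly to std::make_unique and std::make_shared. The following is a plain C++ sketch of that correspondence, not the code cppfront generates; make_owned and make_shared_owned are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: rough Cpp1 counterpart of `new<int>`.
// std::make_unique<int>() value-initializes the int to 0.
std::unique_ptr<int> make_owned() { return std::make_unique<int>(); }

// Hypothetical helper: rough Cpp1 counterpart of `shared.new<int>`.
std::shared_ptr<int> make_shared_owned() { return std::make_shared<int>(); }
```

In both cases the smart pointer owns the allocation, so no explicit delete is needed.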
Initialization or assignment from null is an error:
q: *int = nullptr; // error
Instead of using null for *T, use std::optional<*T>.
By default, cppfront also detects a runtime null dereference, for example when dereferencing a pointer created in Cpp1 code.
int *ptr;
f: () -> int = ptr*;
Calling f above produces:
Null safety violation: dynamic null dereference attempt detected
Cpp2 will not enforce a memory-safety subset 100%. It will diagnose or prevent type, bounds, initialization, and common lifetime memory-safety violations. This is done by:
- Runtime bounds checks
- Requiring each variable is initialized before use in every possible branch
- Not implemented yet: Compile-time tracking of a set of 'points-to' information for each pointer. When a pointed-to variable goes out of scope, the set is updated to replace the variable with an invalid item. Dereferencing a pointer with a set containing an invalid item is a compile-time error. See https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1179r1.pdf.
See:
- https://github.com/hsutter/cppfront#2015-lifetime-safety
- https://www.reddit.com/r/cpp/comments/16ummo8/cppfront_autumn_update/k2r3fto/
By default, cppfront does runtime bounds checks when indexing:
v: std::vector = (1, 2);
i := v[-1]; // aborts program
s: std::string = ("hi");
i = s[2]; // aborts program
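The closest standard Cpp1 facility to the checks above is the checked accessor at(). The sketch below is only an analogue: at() throws std::out_of_range, whereas the text above says the cppfront check aborts the program. index_is_rejected is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true when index i is rejected by the
// checked accessor at(). Unlike the cppfront check, at() throws an
// exception instead of aborting the program.
bool index_is_rejected(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```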
Besides the pointer operators, Cpp2 also only uses postfix instead of prefix form for:
- ++
- --
- ~
Unlike Cpp1, the immediate result of postfix increment/decrement is the new value.
i := 0;
assert(i++ == 1);
https://github.com/hsutter/cppfront/wiki/Design-note:-Postfix-operators
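Put in Cpp1 terms, the Cpp2 expression i++ behaves like the Cpp1 prefix form ++i. A minimal sketch, where cpp2_style_increment is a hypothetical name used only to illustrate the rule stated above:

```cpp
#include <cassert>

// Hypothetical helper: Cpp1 spelling of the Cpp2 expression `i++`,
// which yields the new value, like Cpp1 prefix `++i`.
int cpp2_style_increment(int& i) { return ++i; }
```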
An expression in parentheses with a trailing $ inside a string will evaluate the expression, convert it to string and insert it into the string.
a := 2;
b: std::optional<int> = 2;
s: std::string = "a^2 + b = (a * a + b.value())$\n";
assert(s == "a^2 + b = 6\n");
Note: $ means 'capture' and is also used in closures and postconditions:
https://github.com/hsutter/cppfront/wiki/Design-note%3A-Capture
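For comparison, the interpolated string in the example above can be built by hand in Cpp1. This is a sketch of the equivalent result, not of how cppfront lowers the literal; format_sum is a hypothetical name.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: builds the same text as the Cpp2 literal
// "a^2 + b = (a * a + b.value())$\n" by converting and concatenating by hand.
std::string format_sum(int a, std::optional<int> b) {
    return "a^2 + b = " + std::to_string(a * a + b.value()) + "\n";
}
```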
- anonymousVariable:
  - : type? = expression
f: (i: int) = { std::println("int"); }
f: (i: short) = { std::println("short"); }
main: () = {
f(5); // int
f(:short = 5); // short
}
The last statement is equivalent to: tmp: short = 5; f(tmp);
- identifierExpression:
  - identifier
  - identifier < expressions >
  - expression :: identifierExpression
Whenever any kind of identifier expression is used where it could parse as a type, it must be enclosed in parentheses:
- id1 - type
- (id1) - expression
An identifier expression does not need parentheses where a type would not be valid. Other expressions never need parentheses as they could not be parsed as a valid type, e.g. literals, unary expressions etc.
- asExpression:
  - expression as type
x as T attempts:
- type conversion (if the type of x implicitly converts to T)
- customized conversion (using operator as<T>), useful for std::optional, std::variant etc.
- construction of T(x)
- dynamic casting (equivalent to Cpp1 dynamic_cast<T>(x) when x is a base class of T)
An exception is thrown if the expression is well-formed but the conversion is invalid.
c := 'A';
i: int = c as int;
assert(i == 65);
v := std::any(5);
i = v as int;
s := "hi" as std::string;
assert(s.length() == 2);
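The three conversions in the example above have familiar Cpp1 spellings. The sketch below shows them side by side; it is an illustration of the listed strategies, not the actual lowering, and the helper names are hypothetical. The value 65 for 'A' assumes an ASCII-compatible character set, as the example above does.

```cpp
#include <any>
#include <cassert>
#include <string>

// Rough counterpart of `c as int`: a plain type conversion.
int char_to_int(char c) { return static_cast<int>(c); }

// Rough counterpart of `v as int` for a std::any: std::any_cast throws
// std::bad_any_cast if v does not hold an int.
int any_to_int(const std::any& v) { return std::any_cast<int>(v); }

// Rough counterpart of `"hi" as std::string`: construction of T(x).
std::string literal_to_string(const char* s) { return std::string(s); }
```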
- isExpression:
  - type is (type | template)
  - expression is (type | expression | template)
Not implemented yet.
Test a type T matches another type - T is Target attempts:
- true when T is the same type as Target.
- true if T is a type that inherits from Target.
Test a type against a template - T is Template attempts:
- true if T is an instance of Template.
- Template<T> if the result is convertible to bool.
Note: Testing an identifier expression needs to use parentheses.
Test type of an expression - (x) is T attempts:
- true when the type of x is T
- x.operator is<T>()
- (x) is void means x is empty
assert(5 is int);
i := 5;
assert((i) is int);
assert(!((i) is long));
v := std::any();
assert((v) is void); // `v.operator is<void>()`
v = 5;
assert((v) is int); // `v.operator is<int>()`
Test expression has a particular value - (x) is v attempts:
- x.operator is(v)
- x == v
- x as V == v where V is the type of v
- v(x) if the result is bool
i := 5;
assert((i) is 5);
v := std::any(i);
assert((v) is 5);
The last lowering allows testing a value by calling a predicate function:
pred: (x: int) -> bool = x < 20;
test_int: (i: int) = {
if (i) is (pred) {
std::println("(i)$ is less than 20");
}
}
main: () = {
test_int(5);
test_int(15);
test_int(25);
}
Note that pred is not a type identifier so it must be parenthesized.
Test an expression against a template - (x) is Template attempts:
- true if the type of x is an instance of Template.
- Template<(x)> if the result is convertible to bool.
- inspectExpression:
  - inspect constexpr? expression -> type { alternative+ }
- alternative:
  - alt-name? pattern = statement
  - alt-name? pattern { alternative+ }
- alt-name:
  - identifier :
- pattern:
  - is (type | expression | template)
  - as type
  - if expression
  - pattern || pattern
  - pattern && pattern
Only is alternatives without alt-name are implemented at the moment.
v : std::any = 12;
main: () = {
s: std::string;
s = inspect v -> std::string {
is 5 = "five";
is int = "some other integer";
is _ = "not an integer";
};
std::println(s);
}
An inspect expression must have an is _ case.
Unimplemented: an inspect statement has the same grammar except there must be no -> type after the expression.
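The inspect example above tests its alternatives in order and falls through to is _ when nothing else matches. A hand-written Cpp1 analogue of that order of tests might look like the following; describe is a hypothetical helper, and this is not the code cppfront generates.

```cpp
#include <any>
#include <cassert>
#include <string>

// Hypothetical helper mirroring the inspect alternatives in order:
// `is 5`, then `is int`, then the catch-all `is _`.
std::string describe(const std::any& v) {
    // The pointer overload of std::any_cast returns nullptr on a type mismatch.
    if (const int* p = std::any_cast<int>(&v)) {
        if (*p == 5) return "five";
        return "some other integer";
    }
    return "not an integer";
}
```

With the value 12 used in the example above, describe returns "some other integer".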
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2392r2.pdf
A variable can be explicitly moved. The move constructor of z takes the contents of x, leaving x in a moved-from state:
x: std::string = "hi";
z := (move x);
assert(z == "hi");
assert(x == "");
See also Implicit Move on Last Use.
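In Cpp1 the same explicit move is spelled with std::move. A minimal sketch, where take is a hypothetical helper; note that the C++ standard only guarantees the moved-from string is in a valid but unspecified state, so the sketch checks the destination only.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: Cpp1 spelling of `z := (move x);`.
// After the call, x is in a valid but unspecified moved-from state.
std::string take(std::string& x) {
    std::string z = std::move(x);
    return z;
}
```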
A condition expression does not require parentheses in Cpp2, though when a statement immediately follows a condition, a blockStatement is required.
- ifStatement:
  - if constexpr? expression blockStatement elseClause?
- elseClause:
  - else blockStatement
  - else ifStatement
if c1 {
...
} else if c2 {
...
} else {
...
}
x := 1;
assert(x == 1);
- parameterizedStatement:
  - parameterList statement
A parameterized statement declares one or more variables that are defined only for the scope of statement.
(tmp := some_complex_expression) func(tmp, tmp);
// tmp no longer in scope
Valid parameterStorage keywords are in, copy, inout.
- whileStatement:
  - while expression nextClause? blockStatement
- nextClause:
  - next expression
If next is present, its expression will be evaluated at the end of each loop iteration.
// prints: 0 1 2
(copy i := 0) while i < 3 next i++ {
std::println(i);
}
Note: The above is a parameterizedStatement.
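Taken together, the parameterized statement, the condition and the next clause play the roles of the three parts of a Cpp1 for loop. A sketch of that correspondence, collecting the values instead of printing them; count_to_three is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: Cpp1 for loop corresponding to
// `(copy i := 0) while i < 3 next i++ { ... }`.
std::vector<int> count_to_three() {
    std::vector<int> seen;
    for (int i = 0; i < 3; ++i) {  // i is scoped to the loop only
        seen.push_back(i);
    }
    return seen;
}
```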
- doWhileStatement:
  - do blockStatement nextClause? while expression ;
// prints: 0 1 2
i := 0;
do {
std::println(i);
} next i++ while i < 3;
- forStatement:
  - for expression nextClause? do ( parameter ) statement
The first expression must be a range.
parameter is initialized from each element of the range. The parameter type is inferred.
parameter can have inout parameterStorage.
vec: std::vector<int> = (1, 2, 3);
for vec do (inout e)
e++;
assert(vec[0] == 2);
for vec do (e)
std::println(e);
The target of a break or continue statement can be a labelled loop.
outer: while true {
j := 0;
while j < 3 next j++ {
if done() {
break outer;
}
}
}
- functionType:
  - parameterList returnSpec
- parameterList:
  - ( parameter? )
  - ( parameter (, parameter)+ )
- parameter:
  - parameterStorage? type
  - parameterStorage? identifier ...? : type
- returnSpec:
  - -> (forward | move)? type
  - -> parameterList
E.g. (int, float) -> bool.
- functionDeclaration:
  - identifier? : parameterList returnSpec? ;
  - identifier? : parameterList returnSpec? contracts? = functionInitializer
  - identifier? : parameterList expression ;
Function declarations extend the declaration form. Each parameter must have an identifier.
If returnSpec is missing with the first two forms, the function returns void.
The return type can be inferred from the initializer by using -> _.
See also Template Functions.
- functionInitializer:
  - (expression ; | statement)
A function is initialized from a statement or an expression.
d: (i: int) = std::println(i);
e: (i: int) = { std::println(i); } // same
If the function has a returnSpec, the expression form implies a return statement.
f: (i: int) -> int = return i;
g: (i: int) -> int = i; // same
Lastly, -> _ = together can be omitted:
h: (i: int) i; // same as f and g
This form is useful for lambda functions.
When a function returns a parameterList, each parameter must be named. A function with multiple named return parameters returns a struct with a member for each parameter.
f: () -> (i: int, s: std::string) = {
i = 10;
s = "hi";
}
main: () = {
t := f();
assert(t.i == 10);
assert(t.s == "hi");
}
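The struct-like result described above can be pictured in Cpp1 as a function returning a small aggregate. This is only a sketch: f_result is a hypothetical name, since the type produced for the Cpp2 function is not something the caller spells out.

```cpp
#include <cassert>
#include <string>

// Hypothetical struct standing in for the return type of the Cpp2
// function `f: () -> (i: int, s: std::string)`.
struct f_result {
    int i;
    std::string s;
};

f_result f() {
    f_result r;
    r.i = 10;    // corresponds to `i = 10;`
    r.s = "hi";  // corresponds to `s = "hi";`
    return r;
}
```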
- Unless a return parameter has a default value, it must be initialized in the function body.
- When only one return parameter is declared, the caller does not use member syntax to access the result.
f: () -> (ret: int = 42) = {}
main: () = {
assert(f() == 42);
}
- mainFunction:
  - main : ( args? ) (-> int)? = functionInitializer
If args is declared, it is a std::vector<std::string_view> containing each command-line argument to the program.
If a method doesn't exist when using method call syntax, and there is a function whose first parameter can take the type of the 'object' expression, then that function is called instead.
main: () -> int = {
// call C functions
myfile := fopen("xyzzy", "w");
myfile.fprintf("Hello %d!", 2); // fprintf(myfile, "Hello %d!", 2)
myfile.fclose(); // fclose(myfile)
}
- in - default, read-only. Will pass by reference when more efficient, otherwise pass by value.
- inout - pass by mutable reference.
- out - must be written to. Can accept an uninitialized argument, otherwise destroys the argument. The first assignment constructs the parameter. Used for constructors.
- move - argument can be moved from. Used for destructors.
- copy - argument can be copied from.
- forward - accepts lvalue or rvalue, pass by reference.
e: (i: int) = i++; // error, `i` is read-only
f: (inout i: int) = i++; // mutate argument
g: (out i: int) = {
v := i; // error, `i` used before initialization
// error, `i` was not initialized
}
Functions can return by reference:
first: (forward v: std::vector<int>) -> forward int = v[0];
main: () -> int = {
v : std::vector = (1,2,3);
first(v) = 4;
}
vec: std::vector<int> = ();
insert_at: (where: int, val: int)
pre(0 <= where && where <= vec.ssize())
post(vec.ssize() == vec.ssize()$ + 1) = {
vec.insert(vec.begin() + where, val);
}
The postcondition compares the vector size at the end of the function call with an expression that captures the vector size at the start of the function call.
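The capture in the postcondition can be mimicked in Cpp1 by saving the old size in a local before the body runs. The sketch below uses assert for both conditions; that is an assumption made for illustration, not how cppfront reports contract violations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<int> vec;

void insert_at(int where, int val) {
    // Plays the role of `vec.ssize()$`: the size at the start of the call.
    const std::size_t old_size = vec.size();
    // Precondition.
    assert(0 <= where && static_cast<std::size_t>(where) <= vec.size());
    vec.insert(vec.begin() + where, val);
    // Postcondition: exactly one element was added.
    assert(vec.size() == old_size + 1);
}
```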
A single named return is useful to refer to a result in a postcondition:
f: () -> (ret: int)
post(ret > 0) = {
ret = 42;
}
A function literal is declared like a named function, but omitting the leading identifier. Variables can be captured:
s: std::string = "Got: ";
f := :(x) = { std::println(s$, x); };
f(5);
f("str");
s$ means capture s by value. s&$* can be used to dereference the captured address of s.
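Capture by value corresponds to a Cpp1 lambda with s in its capture list. The sketch below returns the text instead of printing it so the result can be checked; demo_capture is a hypothetical name, and the "|" separator is only there to join the two results.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: a generic lambda that captures s by value,
// like `s$` in the Cpp2 function literal.
std::string demo_capture() {
    std::string s = "Got: ";
    auto f = [s](const auto& x) {  // [s] copies s at this point
        std::ostringstream out;
        out << s << x;
        return out.str();
    };
    s = "changed";  // does not affect the copy held by f
    return f(5) + "|" + f("str");
}
```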
A template function declaration can have template parameters:
- functionTemplate:
  - identifier? : templateParameterList? parameterList returnSpec? requiresClause?
E.g. size: <T> (v: T) -> _ = v.length();
When a function parameter type is _, this implies a template with a corresponding type parameter.
A template function parameter can also be just identifier.
f: (x: _) = {}
g: (x) = {} // same
print: (a0) = std::print(a0);
print: (a0, args...) = {
print(a0);
print(", ");
print(args...);
}
main: () = print(1, 2, 3);
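The recursive variadic print above has a direct Cpp1 counterpart using a parameter pack. In this sketch the function is renamed print_list and writes to a stream passed by the caller; both choices are assumptions made so the output can be checked and so the name does not clash with anything in the standard library.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Base case: one argument.
template <typename T>
void print_list(std::ostream& out, const T& a0) { out << a0; }

// Recursive case: print the first argument, a separator, then the rest.
template <typename T, typename... Args>
void print_list(std::ostream& out, const T& a0, const Args&... args) {
    print_list(out, a0);
    print_list(out, ", ");
    print_list(out, args...);
}
```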
type declares a user-defined type with data members and member functions.
When the first parameter is this, it is an instance method.
myclass : type = {
data: int = 42;
more: std::string = std::to_string(42);
// method
print: (this) = {
std::println("data: (data)$, more: (more)$");
}
// non-const method
inc: (inout this) = data++;
}
main: () = {
x: myclass = ();
x.print();
x.inc();
x.print();
}
Data members are private by default, whereas methods are public.
Member declarations can be prefixed with private or public.
Official docs: https://github.com/hsutter/cppfront/wiki/Cpp2:-operator=,-this-&-that.
operator= with an out this first parameter is called for construction.
When only one subsequent parameter is declared, assignment will also call this function.
operator=: (out this, i: int) = {
this.data = i;
}
...
x: myclass = 99;
x = 1;
With only one parameter move this, it is called to destroy the object:
operator=: (move this) = {
std::println("destroying (data)$ and (more)$");
}
Objects are destroyed on last use, not end of scope.
base: type = {
operator=: (out this, i: int) = {}
}
derived: type = {
this: base = (5); // declare parent class & construct with `base(5)`
}
- typeTemplate:
  - identifier? : templateParameterList? type requiresClause?
- templateParameterList:
  - < templateParameters >
- templateParameter:
  - identifier ...? (: type)?
  - identifier : type
The first parameter form accepts a type.
The second parameter form accepts a value. To use a constant identifier as a template parameter, enclose it in parentheses:
f: <i: int> () -> _ = i;
n: int == 5;
...
std::println(f<(n)>());
n is a constant alias.
- requiresClause:
  - requires constExpression
defaultValue: <T> () -> T requires std::regular<T> = { v: T = (); return v; }
...
assert(defaultValue<int>() == 0);
Note: Using an inline concept for a type parameter is not supported yet.
- concept:
  - identifier : templateParameterList concept requiresClause? = constExpression ;
arithmetic: <T> concept = std::integral<T> || std::floating_point<T>;
...
assert(arithmetic<i32>);
assert(arithmetic<float>);
Aliases are defined using == rather than =.
- alias:
  - identifier : templateParameterList? type? == constExpression
  - identifier : templateParameterList? functionType == functionInitializer
  - identifier : templateParameterList? type == type
  - identifier : namespace == identifierExpression
The forms above are equivalent to the following Cpp1 declarations:
- constexpr variable
- constexpr function
- using type alias
- namespace alias
// constant template
size: <T> size_t == sizeof(T);
// compile-time function
init: <T> () -> T == ();
main: () = {
static_assert(size<char> == 1);
// constant aliases
v := 5;
//n :== v; // error, cannot read `v` at compile-time
n :== 6; // OK
myfunc :== main;
static_assert(init<int>() == 0);
view: type == std::string_view;
N4: namespace == std::literals;
}
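Written out by hand, the four alias forms in the example above map onto Cpp1 roughly as follows. This is a sketch of the listed equivalents rather than cppfront output; the constant template is named size_of here (an assumed name) instead of size.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// constant template: `size: <T> size_t == sizeof(T);`
template <typename T>
constexpr std::size_t size_of = sizeof(T);

// compile-time function: `init: <T> () -> T == ();`
template <typename T>
constexpr T init() { return T(); }

// type alias: `view: type == std::string_view;`
using view = std::string_view;

// namespace alias: `N4: namespace == std::literals;`
namespace N4 = std::literals;

static_assert(size_of<char> == 1);
static_assert(init<int>() == 0);
```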