Introduction

Synopsis

To get a feel for the language syntax, look at the test/test_intro.wumps example file. As you read it, compare it with the resulting parse tree in test/test_results.txt to see how the language works. The wumps/lark/grammar.lark file may also be of interest; the Lark grammar documentation describes the format of this grammar file.

Walk-Through

Let us now discuss the test/test_intro.wumps example in detail.

Lines starting with the -- character sequence are classified as comments and are completely ignored by the parser.

-- This is a comment.

Blank lines are also discarded.

The following simple expression contains only a single identifier.

expr_1A

Simple identifiers like the one shown above are very similar to identifiers in most programming languages (e.g. C and C++) and have similar restrictions. Complex identifiers containing more unusual characters can be formed using single quotes.

'Weird + ID?'
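As a rough illustration of the two identifier forms, the Python sketch below classifies a token as a simple or a quoted identifier. Both regular expressions are assumptions modeled on C-style identifiers; they are not taken from grammar.lark, and the function name is hypothetical.

```python
import re

# Assumed patterns (not taken from grammar.lark):
# a C-style simple identifier, and any text enclosed in single quotes.
SIMPLE_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
QUOTED_ID = re.compile(r"'[^']+'")


def classify_identifier(token):
    """Return 'simple', 'quoted', or None for a candidate identifier."""
    if SIMPLE_ID.fullmatch(token):
        return "simple"
    if QUOTED_ID.fullmatch(token):
        return "quoted"
    return None


print(classify_identifier("expr_1A"))        # simple
print(classify_identifier("'Weird + ID?'"))  # quoted
print(classify_identifier("Weird + ID?"))    # None
```

The grammar file remains the authority on which characters are actually allowed in each form.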

Top-level expressions are separated by new lines or semicolons.

expr2
expr3a; expr3b

Indentation is significant, so the two lines above are deliberately indented by the same amount. The code above is not equivalent to the code below. Don’t worry about exactly what the indented version means for now; we will address that later.

expr2
  expr3a; expr3b

Sequence elements are separated by commas. Also note that partial line comments are permitted.

e1, e2, e3 -- This is a trailing line comment.
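The comment and blank-line rules described so far can be sketched as a small preprocessing pass in Python. This is only an illustration, not the parser's implementation: the function names are hypothetical, and the choice to leave a -- sequence alone when it appears inside quotes is an assumption.

```python
def strip_comment(line):
    """Remove a trailing '--' comment, ignoring '--' inside quotes (assumed)."""
    quote = None  # the quote character currently open, if any
    for i, ch in enumerate(line):
        if quote:
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif line.startswith("--", i):
            return line[:i].rstrip()
    return line.rstrip()


def significant_lines(source):
    """Yield the lines that still contain code after comment removal."""
    for line in source.splitlines():
        stripped = strip_comment(line)
        if stripped.strip():
            yield stripped


sample = """-- This is a comment.

e1, e2, e3 -- This is a trailing line comment.
'Weird -- ID?'
"""
print(list(significant_lines(sample)))
# ['e1, e2, e3', "'Weird -- ID?'"]
```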

Expressions can span multiple lines by escaping the newline character with a backslash. Indentation of the continued line is insignificant.

line1, line1, \
  line2, \
line3
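One way to picture backslash continuation is as a pass that merges physical lines into logical lines before parsing. The Python sketch below does this under simplifying assumptions: the function name is hypothetical, and backslashes inside strings or comments are not treated specially.

```python
def join_backslash_continuations(source):
    """Merge each line ending in a backslash with the line that follows."""
    logical_lines = []
    pending = None  # text accumulated from lines that ended in a backslash
    for line in source.splitlines():
        if pending is not None:
            # Indentation of a continued line is insignificant, so drop it.
            line = pending + line.lstrip()
            pending = None
        if line.rstrip().endswith("\\"):
            # Remove the backslash and wait for the next physical line.
            pending = line.rstrip()[:-1]
        else:
            logical_lines.append(line)
    if pending is not None:
        logical_lines.append(pending)
    return logical_lines


sample = "line1, line1, \\\n  line2, \\\nline3\nexpr2"
print(join_backslash_continuations(sample))
# ['line1, line1, line2, line3', 'expr2']
```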

Another way to span multiple lines is to place a continuation marker (...) at the end of the first line. In this case, the continuation lines must be indented, and if there is more than one, each subsequent line must be aligned with the first continuation line. The continuation lines themselves need no further continuation markers.

line1, line1, ...
  line2,
  line3

Expressions can be named like this:

name4: e4

Even an empty expression can be named.

emptyname:

Naming has a higher precedence than sequence construction, so the following line is a sequence of two elements, the first of which is named name5.

name5: e5a, e5b

Parentheses can be used for controlling precedence.

name6: (e6a, e6b)

Empty parentheses can also be used to construct an empty sequence.

()

Note that inside parentheses, new lines and alignment are ignored.

(
 i1,
     i2,
   i3, i4)

Calls with a single argument can be written like this.

f1 arg1

Multiple arguments can be passed as a sequence.

f4(arg1, arg2, arg3)

A call with no arguments can be made by using an empty sequence as an argument.

f_empty()

Multiple argument calls can also nest.

f4a(f4b arg4b, arg2, fa4c(arg4c))

Named expressions can be used to form keyword arguments.

f5(pos1, pos2, key1: value1, key2: value2)

The positional arguments are not required to be first.

f6(key1: value1, pos1, key2: value2, pos2, pos3)

Keyword arguments might not have a value.

f6b(key1: value2, pos1, key2:, key3: value3, pos2)

Although multiple positional arguments cannot be passed to a function without using parentheses, additional keyword arguments can be appended to the end of a function call.

f7 key1: value1 key2: value2

The above expression is equivalent to this one:

f7(key1: value1, key2: value2)

A call with one positional argument and any number of keyword arguments can be written without parentheses.

f8 pos1 key1: value1 key2: value2 key3: value3 key4: value4
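To make the relationship between the two call styles concrete, the Python sketch below rewrites a parenthesis-free call into its parenthesized form. It is a toy, token-based illustration with a hypothetical function name, not the parser's logic. It assumes that tokens are separated by whitespace, that every key is followed by exactly one value token, and that the f8 line above corresponds to the parenthesized form in the same way the f7 line does.

```python
def parenthesize_call(call):
    """Rewrite 'f pos key: value ...' as 'f(pos, key: value, ...)'."""
    tokens = call.split()
    name, rest = tokens[0], tokens[1:]
    args = []
    i = 0
    while i < len(rest):
        if rest[i].endswith(":"):
            # Assumption: every key is followed by exactly one value token.
            args.append(f"{rest[i]} {rest[i + 1]}")
            i += 2
        else:
            args.append(rest[i])
            i += 1
    return f"{name}({', '.join(args)})"


print(parenthesize_call("f7 key1: value1 key2: value2"))
# f7(key1: value1, key2: value2)
print(parenthesize_call("f8 pos1 key1: value1 key2: value2"))
# f8(pos1, key1: value1, key2: value2)
```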

Some of the expressions above are not easy to understand, so don’t write things like that just because you can. Hang in there and you will hopefully see the reason for some of these features very soon.

Whitespace is significant, as in YAML or Python. In Wumps, indentation can be used to create a sequence. An increase in indentation begins a sequence of newline-separated expressions, and a decrease in indentation ends the sequence.

fn1
  arg1
  arg2

The above code is equivalent to this:

fn1(arg1, arg2)

Nesting works.

fn2
  fn2a
    arg2a
  arg2
  key1: value1

The above expression is equivalent to this:

fn2(fn2a(arg2a), arg2, key1: value1)
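The equivalence between indented blocks and parenthesized calls can be sketched as a short recursive conversion in Python. This is a minimal illustration under stated assumptions, not how the Wumps parser is implemented: the function names are hypothetical, indentation is assumed to use spaces only, and continuation markers, braces, and partial unindents are not handled.

```python
def indent_of(line):
    """Count the leading spaces of a line."""
    return len(line) - len(line.lstrip(" "))


def to_call_form(lines, start=0, indent=0):
    """Convert the lines at `indent` into parenthesized expressions.

    Returns the list of expressions and the index of the first
    line that was not consumed.
    """
    exprs = []
    i = start
    while i < len(lines) and indent_of(lines[i]) == indent:
        head = lines[i].strip()
        i += 1
        if i < len(lines) and indent_of(lines[i]) > indent:
            # A deeper block becomes the argument sequence of `head`.
            children, i = to_call_form(lines, i, indent_of(lines[i]))
            head = f"{head}({', '.join(children)})"
        exprs.append(head)
    return exprs, i


def convert(source):
    """Convert an indented source string, skipping blank lines."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    return to_call_form(lines)[0]


sample = """fn2
  fn2a
    arg2a
  arg2
  key1: value1
"""
print(convert(sample))
# ['fn2(fn2a(arg2a), arg2, key1: value1)']
```

Running the same function on the fn1 example yields fn1(arg1, arg2), matching the equivalence stated earlier.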

This allows us to build flow-control constructs using a function call syntax instead of using language keywords.

if (greater(arg1, arg2)) ...
  then:
    do_this(arg1)
    do_that(arg2)
  else:
    do_the_other()

A partial unindent is seen as an implicit continuation of the parent expression, allowing things like this:

if done then:
    finish()
  else:
    keep_going()

For some constructs, named arguments might be repeated.

case value ...
  of: A then:
    do_1a()
    do_1b()
  of: B then:
    do_2()

Another way to create a sequence is to use braces. Inside the braces, expressions are separated by semicolons, but new lines and indentation are ignored.

if (done) then: {
    do_thing1() } else: {do_other1(); 
  do_other2()}

So far we have only used identifiers, but literals such as integers, floats, and strings are also supported.

0xff, 0x10, 10_000, +2.3e3, -0.000_001, "string"
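For readers unfamiliar with these notations, the Python sketch below shows the values such literals denote under Python's own rules for hexadecimal prefixes, digit-group underscores, and exponents. That Wumps interprets them identically is an assumption, and the function name is hypothetical.

```python
def literal_value(text):
    """Interpret a literal using Python's conventions (assumed to match)."""
    if text.startswith('"') and text.endswith('"'):
        return text[1:-1]  # string literal: drop the surrounding quotes
    try:
        return int(text, 0)  # base 0 accepts 0x prefixes and underscores
    except ValueError:
        return float(text)   # fall back to a floating-point literal


tokens = '0xff, 0x10, 10_000, +2.3e3, -0.000_001, "string"'.split(", ")
print([literal_value(t) for t in tokens])
# [255, 16, 10000, 2300.0, -1e-06, 'string']
```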

The syntax is good for declarative uses as well.

Drawing
  Circle
    radius: 5.0
    color: "green"
    x: 10
    y: 10
  Rectangle
    width: 20
    height: 20
    color: 0xff0000
    x: 100
    y: 100

Here is an example with nested named items:

top
  constrains:
    a: b: c: d 5
    5 seconds

More Examples

Some more contrived examples (mostly for testing purposes) can be found in the test directory. These examples may be of interest to those wanting to understand the language syntax in more detail. The resulting parse trees are also included in test/test_results.txt.