Introduction

Synopsis

To get an idea of the language syntax, you can take a look at the test/test_intro.wumps example file. As you read it over, you can examine the resulting parse tree in test/test_results.txt to help understand how the language works. The wumps/lark/grammar.lark file may also be of interest. The Lark grammar documentation describes the format of this grammar file.

Walk-Through

Let us now discuss the test/test_intro.wumps example in detail.

Lines starting with the -- character sequence are classified as comments and are completely ignored by the parser.

-- This is a comment.

Blank lines are also discarded.


The following simple expression contains only a single identifier.

expr_1A

Simple identifiers like the one shown above are very similar to identifiers in most programming languages (e.g. C and C++) and have similar restrictions. Complex identifiers containing more unusual characters can be formed using single quotes.

'Weird + ID?'

Top-level expressions are separated by new lines or semicolons.

expr2
expr3a; expr3b

Indentation is significant, so each of the lines above is indented by the same amount. That code is not equivalent to the code below. Don’t worry about exactly what the code means for now; we will address that later.

expr2
  expr3a; expr3b

Sequence elements are separated by commas. Also note that partial line comments are permitted.

e1, e2, e3 -- This is a trailing line comment.
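
As a rough illustration of the comment and blank-line rules so far, the following Python sketch removes -- comments and drops lines that end up empty. It is not the Wumps lexer: the function name strip_comments is hypothetical, and treating -- inside quotes as ordinary text is an assumption that the source does not confirm.

```python
def strip_comments(source: str) -> list[str]:
    """Return the non-empty lines of `source` with `--` comments removed."""
    kept = []
    for line in source.splitlines():
        quote = None  # the quote character we are currently inside, if any
        cut = len(line)
        for i, ch in enumerate(line):
            if quote:
                if ch == quote:
                    quote = None
            elif ch in "'\"":
                quote = ch
            elif line.startswith("--", i):
                cut = i  # the comment starts here
                break
        # Keep leading whitespace, because indentation is significant.
        text = line[:cut].rstrip()
        if text:
            kept.append(text)
    return kept
```

Only trailing whitespace is removed, so the indentation of the surviving lines is left intact for later processing.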

Expressions can span multiple lines by escaping the newline character with a backslash. Indentation of the continued line is insignificant.

line1, line1, \
  line2, \
line3
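
The backslash rule can be sketched in Python as a pre-processing step that merges physical lines into logical ones. This is only a sketch under assumptions: the helper name join_backslash_lines is hypothetical, and it assumes the backslash is the very last character on the line, with no trailing whitespace or comment after it.

```python
def join_backslash_lines(source: str) -> list[str]:
    """Merge each line ending in a backslash with the line that follows it."""
    logical = []
    pending = None  # text accumulated from earlier continued lines
    for line in source.splitlines():
        # Indentation of a continued line is insignificant, so drop it.
        piece = line if pending is None else pending + line.lstrip()
        if piece.endswith("\\"):
            pending = piece[:-1]  # remove the backslash and keep collecting
        else:
            logical.append(piece)
            pending = None
    if pending is not None:
        logical.append(pending)  # a backslash on the final line
    return logical
```

Applied to the three lines above, the sketch yields the single logical line line1, line1, line2, line3.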

Another way to span multiple lines is to end the first line with the continuation marker, three dots (...). In this case, the continuation lines must be indented. If more than one continuation line is required, each subsequent continuation line must be aligned with the first one. The continuation lines themselves do not need a marker.

line1, line1, ...
  line2,
  line3

Expressions can be named like this:

name4: e4

Even an empty expression can be named.

emptyname:

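Naming has a higher precedence than sequence construction, so in the following line the name name5 applies only to e5a, and e5b is a separate sequence element.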

name5: e5a, e5b

Parentheses can be used for controlling precedence.

name6: (e6a, e6b)

Empty parentheses can also be used to construct an empty sequence.

()

Note that inside parentheses, new lines and alignment are ignored.

(
 i1,
     i2,
   i3, i4)

Calls with a single argument can be written like this.

f1 arg1

Multiple arguments can be passed as a sequence.

f4(arg1, arg2, arg3)

A call with no arguments can be made by using an empty sequence as an argument.

f_empty()

Multiple argument calls can also nest.

f4a(f4b arg4b, arg2, fa4c(arg4c))

Named expressions can be used to form keyword arguments.

f5(pos1, pos2, key1: value1, key2: value2)

The positional arguments are not required to be first.

f6(key1: value1, pos1, key2: value2, pos2, pos3)

A keyword argument may omit its value.

f6b(key1: value2, pos1, key2:, key3: value3, pos2)

Although multiple positional arguments cannot be passed to a function without using parentheses, additional keyword arguments can be tacked on to the end of a function call.

f7 key1: value1 key2: value2

The above expression is equivalent to this one:

f7(key1: value1, key2: value2)

A call with one positional argument and any number of keyword arguments can be written without parentheses.

f8 pos1 key1: value1 key2: value2 key3: value3 key4: value4

Some of the expressions above are not easy to understand, so don’t write things like that just because you can. Hang in there and you will hopefully see the reason for some of these features very soon.

Whitespace is significant, as in YAML or Python. In Wumps, indentation can be used to create a sequence. An increase in indentation begins a sequence of newline-separated expressions, and a decrease in indentation ends the sequence.

fn1
  arg1
  arg2

The above code is equivalent to this:

fn1(arg1, arg2)

Nesting works.

fn2
  fn2a
    arg2a
  arg2
  key1: value1

The above expression is equivalent to this:

fn2(fn2a(arg2a), arg2, key1: value1)
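
The equivalence between the indented and parenthesized forms can be illustrated with a small Python sketch that rewrites one into the other. It is a toy under stated assumptions, not the Wumps parser: the function name indent_to_calls is hypothetical, indentation is assumed to use spaces only, and each line is treated as an opaque piece of text. It therefore does not cover continuation markers, partial unindents, or a name followed by an indented block.

```python
def indent_to_calls(source: str) -> str:
    """Rewrite an indented block as nested, parenthesized calls."""
    root = {"text": None, "children": []}
    stack = [(-1, root)]  # (indentation width, node), outermost first
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines are discarded
        indent = len(line) - len(line.lstrip(" "))
        node = {"text": line.strip(), "children": []}
        # Leave every block that is indented at least as deeply as this line.
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((indent, node))

    def render(node):
        if not node["children"]:
            return node["text"]
        args = ", ".join(render(child) for child in node["children"])
        return f"{node['text']}({args})"

    # Top-level expressions are separated by semicolons in the output.
    return "; ".join(render(child) for child in root["children"])
```

Given the two indented examples above as input, the sketch returns fn1(arg1, arg2) and fn2(fn2a(arg2a), arg2, key1: value1) respectively.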

This allows us to build flow-control constructs using a function call syntax instead of using language keywords.

if (greater(arg1, arg2)) ...
  then:
    do_this(arg1)
    do_that(arg2)
  else:
    do_the_other()

A partial unindent is seen as an implicit continuation of the parent expression, allowing things like this:

if done then:
    finish()
  else:
    keep_going()

For some constructs, named arguments might be repeated.

case value ...
  of: A then:
    do_1a()
    do_1b()
  of: B then:
    do_2()

Another way to create a sequence is to use braces. Inside the braces, expressions are separated by semicolons, but new lines and indentation are ignored.

if (done) then: {
    do_thing1() } else: {do_other1(); 
  do_other2()}

We have only been using identifiers so far, but literals of native types like integers, floats, and strings are also supported.

0xff, 0x10, 10_000, +2.3e3, -0.000_001, "string"
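
These literal forms happen to match Python's own literal syntax, so a short Python sketch can show the values they would denote if Wumps follows the same conventions. That is an assumption, since the text here does not spell out the semantics. The helper name literal_value is hypothetical, and string escape sequences are not handled.

```python
def literal_value(token: str):
    """Convert a numeric or double-quoted string token to a Python value."""
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        return token[1:-1]  # no escape-sequence handling in this sketch
    try:
        # Base 0 accepts 0x prefixes, a leading sign, and digit underscores.
        return int(token, 0)
    except ValueError:
        # float() accepts exponents and underscores between digits.
        return float(token)


tokens = ["0xff", "0x10", "10_000", "+2.3e3", "-0.000_001", '"string"']
values = [literal_value(t) for t in tokens]
# values == [255, 16, 10000, 2300.0, -1e-06, "string"]
```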

The syntax is good for declarative uses as well.

Drawing
  Circle
    radius: 5.0
    color: "green"
    x: 10
    y: 10
  Rectangle
    width: 20
    height: 20
    color: 0xff0000
    x: 100
    y: 100

Here is an example with nested named items:

top
  constrains:
    a: b: c: d 5
    5 seconds

More Examples

Some more contrived examples (mostly for testing purposes) can be found in the test directory. These examples may be of interest to those wanting to understand the language syntax in more detail. The resulting parse trees are also included in test/test_results.txt.