3. Language Reference

This section gives a systematic overview of the Statix language. Statix specifications are organized in modules, consisting of signatures and predicate rules.

Warning

This section is currently incomplete. The information that is there is up-to-date, but many constructs are not yet documented.

3.1. Lexical matters

3.1.1. Identifiers

id    = [A-Za-z][a-zA-Z0-9\_]*
lc-id = [a-z][a-zA-Z0-9\_]*
uc-id = [A-Z][a-zA-Z0-9\_]*

Most identifiers in Statix fall into one of the following categories:

  • Identifiers, that start with a character, and must match the regular expression [A-Za-z][a-zA-Z0-9\_]*.
  • Lowercase identifiers, that start with a lowercase character, and must match the regular expression [a-z][a-zA-Z0-9\_]*.
  • Uppercase identifiers, that start with an uppercase character, and must match the regular expression [A-Z][a-zA-Z0-9\_]*.

3.1.2. Comments

Comments in Statix follow the C-style:

  • // ... single line ... for single-line comments
  • /* ... multiple lines ... */ for multi-line comments

Multi-line comments can be nested, and run until the end of the file when the closing */ is omitted.

3.2. Terms

term = uc-id "(" {term ","}* ")"
     | "(" {term ","}* ")"
     | "[" {term ","}* "]"
     | "[" {term ","}* "|" term "]"
     | numeral
     | string
     | id
     | "_"

 numeral = "-"? [1-9][0-9]*

 string = "\"" ([^\"\\]|"\\"[nrt\\])* "\""

3.3. Modules and Tests

Statix specifications are organized in named modules and test files. Modules and test files can import named modules to extend and use the predicates from the imported modules.

3.3.1. Modules

module module-id

  section*

Statix specifications are organized in modules. A module is identified by a module identifier. Module identifiers consist of one or more names separated by slashes, as in {name /}+. The names must match the regular expression [a-zA-Z0-9\_][a-zA-Z0-9\_\.\-]*.

Every module is defined in its own file, with the extensions .stx. The module name and the last components of the file path must coincide.

Example. An empty module analysis/main, defined in a file .../analysis/main.stx.

module analysis/main

// work on this

Modules consist of sections for imports, signatures, and rule definitions. The rest of this section describes imports, and subsequents sections deal with signatures and rules.

3.3.2. Imports

imports

  module-ref*

A module can import definitions from other modules be importing the other module. Imports are specified in an imports section, which lists the modules being imported.

Imports make predicates defined in the imported module visible. The importing module can use the imported predicates, and extend the predicates with new rules. Imports are not transitive, and locally defined elements (e.g., sorts or predicates) shadow imported elements of the same kind and the same name.

Example. A main module importing several submodules.

module main

imports

   signatures/MyLanguage-sig

   types

3.3.3. Tests

resolve constraint

section*

Apart from named modules, stand-alone test can be defined in .stxtest files. All sections that are allowed in named modules are allowed in tests as well. This means tests can have signatures, rules, and import named modules.

Example. A test using the predicate concat imported from a named module.

resolve {xs} concat([1,2,3], [4,5,6], xs)

imports

  lists

Statix tests can be executed in Eclipse with the Spoofax > Evaluate > Evaluate Test menu action. The test output contains the values of top-level variables in the test constraint (i.e., xs in this example), as well as any errors from failed constraints.

3.4. Signatures

signatures

  signature*

3.4.1. Terms

Sorts

signature = ...
          | "sorts" sort-decl*

sort-decl = uc-id
          | uc-id "=" sort

sort = "string"
     | "int"
     | "list" "(" sort ")"
     | "(" {sort "*" }* ")"
     | "scope"
     | "occurrence"
     | "path"
     | "label"
     | "astId"
     | uc-id

Statix uses algebraic data types to validate term well-formedness. First, Statix has several built-in scalar data types, such as int, string and scope. In addition, Statix also has two built-in composite data types: tuples and lists. Next to the built-in sorts, custom syntactic categories can be defined by adding a name to a sorts subsection of the signatures section.

Example. Declaration of a custom sort Exp and a sort alias for identifiers.

signature
  sorts
    Exp
    ID = string

Constructors

signature = ...
          | "constructors" cons-decl*

cons-decl = uc-id ":" uc-id
          | uc-id ":" {sort "*"}* "->" uc-id

In order to construct or match actual data terms, constructors for these terms need to be declared. Constructors without arguments are declared by stating the constructor name and its sort, separated by a colon. For constructors with arguments, the argument sorts are separated by an asterisk, followed by an arrow operator and the target sort.

Example. Declaration of various constructors for the Exp sort.

signature
  sorts
    ID = string
    Exp

  constructors
    True : Exp
    Var  : ID -> Exp
    Plus : Exp * Exp -> Exp

The example above states three constructors for the Exp sort. The True constructor has no arguments, the Var constructor has a single name as argument, while the Plus constructor takes two subexpressions as arguments.

3.4.2. Name binding

Relations

signature = ...
          | "relations" rel-decl*

rel-decl = uc-id ":" {sort "*"}*
         | uc-id ":" {sort "*"}* "->" sort

In Statix, relations associate data with a scope. All used relations and the type of their data must be declared in the relations section. Each relation declaration consist of the name of the relation and the arguments it accepts.

Relation declarations come in two flavors. There is a predicative variant and a functional variant. The functional variant has the last two arguments separated with a ->, which indicates that the relation is intended to map the first terms to the last term (as in the regular notion of functions).

Apart from intended semantics, the difference between the variants has to do with the formulation of the predicates of queries. Please refer to the Queries section for more information on querying relations. Otherwise, both ways of declaring relations are equivalent. In fact, during compilation, the functional variant is normalized to the predicate variant.

Example. Declaration of a predicative relation this and a functional relation var.

signature
  sorts
    TYPE
    ID = string

  relations
    this : TYPE
    var  : ID -> TYPE

Namespaces

Warning

Usage of namespaces is strongly discouraged and will be removed or revised in a future version of Statix.

Name resolution

Warning

Usage of namespaces is strongly discouraged and will be removed or revised in a future version of Statix.

3.5. Predicates and Rules

rules

  rule-def*

Predicates and their rules make up the main part of a Statix specification.

3.5.1. Predicate rules

lc-id : {sort " * "}*

rule-name? lc-id(<{term ", "}*>).
rule-name? lc-id(<{term ", "}*>) :- constraint.
rule-name = "[" id "]"

Predicates are defined with the sorts of their arguments. Rules define the meaning of the predicate for different cases of the arguments. Rule patterns can be non-linear, i.e., variables can appear multiple times, in which case the terms in those positions must be equal.

Committed choice rule selection

Statix has a committed-choice semantics. This means that once a rule is selected, the solver does never backtrack on that choice. That is different from logic languages like Prolog, where rules are optimistically selected and the solver backtracks when the rule does not work out.

Committed choice evaluation has consequences for inference during constraint solving. If a predicate has multiple rules, a rule is only selected once the constraint arguments are sufficiently instantiated.

Rule order

The order in which the rules of a predicate apply is determined by the patterns it matches on, not by the order in which the rules appear in the specification. Most specific rules apply before more general rules. The parameter patterns are considered from left to right when determining this order. It is an error to have rules with overlapping patterns, where neither is more general than the other. These rules are marked with an error.

Example. An or predicate that computes a logical or, with its last argument the result.

or : Bool * Bool * Bool

or(True(), _, b) :- b == True().
or(_, True(), b) :- b == True().
or(_, _, b) :- b == False().

In the example above, the rules are considered in the order they are presented above. Beware that changing the rule order would not change the specifications behaviour. The last rule is the most general, and therefore comes last, as it matches any arguments. The first rule is more specific than the second because of the left-to-right nature of the ordering.

Non-linear patterns

Non-linear patterns are patterns in which at least one pattern variable occurs multiple times. Such patterns only match on terms that have equal subterms at the positions where such a variable occurs.

Example. An xor predicate that computes a logical exclusive or, with its last argument the result.

xor: Bool * Bool * Bool

xor(B, B, b) :- b == False().
xor(_, _, b) :- b == True().

In the example above, the first rule for xor has a non-linear pattern, because the variable B occurs both at the first and at the second position. In this way, the first rule only matches on equal input terms (either True(), True(), b or False(), False(), b).

Regarding the ordering of rules by specificity, it holds that an occurrence of a variable that is seen earlier is regarded as more specific than a free variable. Therefore, the first rule of xor takes precedence over the second rule. Bound variables are as specific as concrete constructors.

Ordering rules with non-linear patterns

Careful attention to rule order needs to be paid when non-linear patterns and concrete constructor patterns are mixed. For example, consider a subtype predicate with rules for record types and equal types:

subtype: TYPE * TYPE

subtype(REC(s_rec1), REC(s_rec2)) :- /* omitted */.
subtype(T, T).

In this example, equal record types match on both rules. Because of the left to right nature of the rule application, the first rule will be chosen, because for the first argument, the REC constructor is regarded as more specific than the (at that position free) T variable.

If that behavior is not desired, an explicit rule for the intersection of the domains of the pair of rules in question needs to be added. This rule is more specific than both of the other rules, and is therefore selected for any matching input. For example, consider this augmented subtype predicate with an additional rule for equal record types:

subtype: TYPE * TYPE

subtype(REC(s_rec), REC(s_rec)).
subtype(REC(s_rec1), REC(s_rec2)) :- /* omitted */.
subtype(T, T).

In this example, we added a rule that declares that a record type is a subtype of itself. This rule ensures that equal record types are regarded as subtypes without verifying additional constraints. So, while it seems that the first and the third rules are equivalent, and the first one superfluous, this is not the case because the rule ordering will choose the second rule when the behavior of the third rule is desired.

3.5.2. Functional rules

lc-id : {sort " * "}* -> sort

rule-name? lc-id(<{term ", "}*>) = term.
rule-name? lc-id(<{term ", "}*>) = term :- constraint.

Predicates can be defined in functional style as well. Functional predicates can be understood in terms of regular predicates. For example, the or predicate can be written in functional style as follows:

or : Bool * Bool -> Bool

or(True(), _) = True().
or(_, True()) = True().
or(_, _) = False().

This form is equivalent to the definition given above, however its use in the specification is slightly different. Function predicates are used in term positions, where they behave as a term of the output type.

Example. Rule for a functional predicate to type check expressions. The functional predicate typeOfExp is used in two term positions: as the result of a fucntional rule, and in an equality constraint.

typeOfExp : scope * Exp -> TYPE

typeOfExp(s, Int()) = INT().

typeOfExp(s, IfThen(c, e)) = typeOfExp(s, e) :-
  typeOfExp(s, c) == BOOL().

Every specification with functional predicates is normalized to a form with only regular predicates. To show the normal form of a specification in Eclipse, use the Spoofax > Syntax > Format normalized AST menu action.

3.5.3. Mapping rules

3.6. Constraints

3.6.1. Base constraints

3.6.2. Term equality

3.6.3. Name binding

Scope graph

Queries

Occurrences

3.6.4. Arithmetic