Vadalog Grammar Reference

This reference describes the Vadalog grammar: its syntax rules, operators, and built-in language constructs.

Program Structure

A Vadalog program consists of a sequence of clauses:

program: (clause)*

Clause Types

A clause can be one of three types:

clause: annotation | fact | rule

Annotation: Configuration directives (e.g., @bind, @output)
Fact: Ground statements that contain no variables (e.g., person("john", 25).)
Rule: Logical inference rules (e.g., adult(X) <- person(X, Age), Age >= 18.)
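
As a minimal sketch of how the three clause types fit together, the program below combines one annotation, two facts, and one rule. The predicate names are illustrative, and the @output("adult"). form assumes the annotation takes the target predicate name as a quoted string; the annotation reference is the place to confirm the exact syntax.

```vadalog
% Annotation: mark adult as an output predicate (argument form is an assumption)
@output("adult").

% Facts: ground statements
person("john", 25).
person("mia", 12).

% Rule: derive adult(X) for every person aged 18 or more
adult(X) <- person(X, Age), Age >= 18.
```

Under these assumptions, the only derived fact is adult("john"), because the second person does not satisfy Age >= 18.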

Facts

Facts are ground statements that establish base knowledge:

fact: annotationBody* atom '.'

Examples:

person("alice", 30).
employee("bob", "engineering", 75000).
company("acme", "technology").

Rules

Rules define logical inference patterns:

rule: annotationBody* head '<-' body '.'

Rule Head

The head can be:

Regular atom: result(X, Y)
EGD (Equality Generating Dependency): X = Y
False head: #F (for constraints)

head: atom (',' atom)* | egdHead | falseHead

Examples:

% Regular head
adult_customer(Name) <- customer(Name, Age), Age >= 18.

% Multiple heads
high_value(Name), vip(Name) <- customer(Name, _, Balance), Balance > 10000.

% EGD head
X = Y <- person(X, Age1), person(Y, Age2), Age1 = Age2.

% Constraint (false head)
#F <- employee(Name, Dept), Dept = "invalid".

Rule Body

The body consists of literals, conditions, and functions:

body: (literal | condition | function) (',' (literal | condition))*

Literals

Literals can be positive, negative, or domain predicates:

literal: atom          % Positive literal
       | 'not' atom    % Negative literal  
       | 'dom(*)'      % Domain literal

Examples:

% Positive literal
employee(Name, Dept) <- person(Name), works_in(Name, Dept).

% Negative literal
available(Name) <- employee(Name), not on_vacation(Name).

% Domain literal
all_values(X) <- dom(*), X = value.

Conditions

Conditions define constraints and comparisons:

Comparison Operators

gtCondition: varTerm GT expression     % >
ltCondition: varTerm LT expression     % <
geCondition: varTerm GE expression     % >=
leCondition: varTerm LE expression     % <=
eqCondition: varTerm EQ expression     % =
eqeqCondition: varTerm EQEQ expression % ==
neqCondition: varTerm NEQ expression   % != or <>

Set Membership

inCondition: varTerm IN expression      % in
notInCondition: varTerm NOTIN expression % !in

Examples:

high_salary(Name) <- employee(Name, Salary), Salary > 100000.
young_adult(Name) <- person(Name, Age), Age >= 18, Age < 30.
tech_employee(Name) <- employee(Name, Dept), Dept == "technology".
valid_dept(Name) <- employee(Name, Dept), Dept in {"engineering", "sales", "marketing"}.

Expressions

Expressions support arithmetic, logical, and functional operations:

Arithmetic Expressions

expression PLUS expression    % +
expression MINUS expression   % -
expression PROD expression    % *
expression DIV expression     % /
MINUS expression             % unary minus

Logical Expressions

expression AND expression     % &&
expression OR expression      % ||
NOT expression               % !

Set Operations

expression UNION expression        % |
expression INTERSECTION expression % &

Comparison Expressions

expression LT expression      % <
expression LE expression      % <=
expression GT expression      % >
expression GE expression      % >=
expression EQEQ expression    % ==
expression NEQ expression     % !=

Examples:

total_compensation(Name, Total) <- 
    employee(Name, Salary, Bonus), 
    Total = Salary + Bonus.

senior_employee(Name) <- 
    employee(Name, Years), 
    Years >= 5.

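% Set union is an expression operator, so it combines set values rather than atoms
department_union(Region, Depts) <- 
    region_depts(Region, EngineeringDepts, SalesDepts), 
    Depts = EngineeringDepts | SalesDepts.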

Aggregation Functions

Vadalog supports both monotonic and non-monotonic aggregations:

Monotonic Aggregations

msum(expression, varList?)     % Incremental sum
mprod(expression, varList?)    % Incremental product
mcount(expression?, varList?)  % Incremental count
munion(expression, varList?)   % Incremental union
mmax(expression)               % Incremental maximum
mmin(expression)               % Incremental minimum
mavg(expression)               % Incremental average
mmedian(expression, "variant") % Incremental median (exact/p2_algorithm/reservoir_sampling)

Standard Aggregations

sum(expression)                % Sum
prod(expression)               % Product
avg(expression)                % Average
count(expression, varList?)    % Count
min(expression)                % Minimum
max(expression)                % Maximum
maxcount(expression?)          % Maximum count

Examples:

% Calculate total salary by department (Dept is the group-by variable)
dept_total(Dept, Total) <- 
    employee(Name, Dept, Salary), 
    Total = msum(Salary).

% Average age by department
dept_avg_age(Dept, AvgAge) <- 
    employee(Name, Dept, Age), 
    AvgAge = mavg(Age).

% Median salary by department (robust to outliers)
dept_median_salary(Dept, Median) <- 
    employee(Name, Dept, Salary), 
    Median = mmedian(Salary, "exact").

% Approximate median for large datasets
dept_approx_median(Dept, Median) <- 
    employee(Name, Dept, Salary), 
    Median = mmedian(Salary, "reservoir_sampling").

% Count employees per department
dept_count(Dept, Count) <- 
    employee(Name, Dept), 
    Count = mcount().

% Global aggregation (no group-by variables in head)
total_employees(Count) <- 
    employee(Name, _, _), 
    Count = mcount().

Group-by variables appear in both the head and the body. The aggregation function takes only the expression to aggregate.

% Correct:
avg_age(Dept, Avg) <- employee(_, Dept, Age), Avg = mavg(Age).
median_age(Dept, Med) <- employee(_, Dept, Age), Med = mmedian(Age, "exact").
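
The examples above use only the monotonic variants. As a hedged sketch, the rules below apply two of the standard aggregations, assuming they follow the same convention (group-by variables in the head, only the aggregated expression inside the call). The employee(Name, Dept, Salary) predicate is the same illustrative one used earlier.

```vadalog
% Highest salary per department (Dept is the group-by variable)
dept_max_salary(Dept, Top) <-
    employee(Name, Dept, Salary),
    Top = max(Salary).

% Total payroll across all departments (no group-by variable in the head)
total_payroll(Total) <-
    employee(Name, Dept, Salary),
    Total = sum(Salary).
```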

String Operations

substring(string, start, length?)  % Extract substring
contains(string, substring)        % Check if contains
contains_any(string, keywords_array) % Check if contains any keyword
rlike(string, pattern)             % Check if matches regex
starts_with(string, prefix)        % Check if starts with
ends_with(string, suffix)          % Check if ends with
concat(str1, str2, ...)           % Concatenate strings
concat_ws(separator, str1, str2, ...) % Concatenate with separator
string_length(string)              % Get string length
is_empty(string)                   % Check if empty
to_lower(string)                   % Convert to lowercase
to_upper(string)                   % Convert to uppercase
split(string, delimiter)           % Split string
index_of(string, substring)        % Find substring index
replace(string, old, new)          % Replace substring
join(array, separator)             % Join array elements
strip(string)                      % Remove leading/trailing whitespace

Examples:

full_name(FullName) <- 
    person(FirstName, LastName), 
    FullName = concat(FirstName, " ", LastName).

email_domain(Domain) <- 
    user(Email), 
    Parts = split(Email, "@"), 
    Domain = collections:get(Parts, 2).

Logical Operations

and(expr1, expr2, ...)     % Logical AND
or(expr1, expr2, ...)      % Logical OR
not(expression)            % Logical NOT
xor(expr1, expr2)          % Exclusive OR
nand(expr1, expr2)         % NOT AND
nor(expr1, expr2)          % NOT OR
xnor(expr1, expr2)         % NOT XOR
implies(expr1, expr2)      % Implication
iff(expr1, expr2)          % If and only if
if(condition, then, else)  % Conditional expression
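
The functions above are listed without an example, so here is a small sketch of how they might appear in a rule body. It assumes a hypothetical applicant(Name, Age, Income) predicate and that a comparison expression can be passed as an argument, as the Comparison Expressions section suggests.

```vadalog
% Label each applicant with a conditional expression
age_group(Name, Group) <-
    applicant(Name, Age, Income),
    Group = if(Age >= 18, "adult", "minor").

% Combine two boolean expressions with and()
eligible(Name, Eligible) <-
    applicant(Name, Age, Income),
    Eligible = and(Age >= 18, Income > 20000).
```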

Interval Operations

between(value, min, max)    % Exclusive bounds
_between(value, min, max)   % Left inclusive
between_(value, min, max)   % Right inclusive  
_between_(value, min, max)  % Both inclusive

Examples:

young_adult(Name) <- 
    person(Name, Age), 
    _between_(Age, 18, 25).

working_hours(Hour) <- 
    time_entry(Hour), 
    _between_(Hour, 9, 17).
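
The four variants differ only in which bounds are included. As a sketch, each rule below pairs an interval call with the comparison conditions it should correspond to according to the descriptions above; the reading(Id, V) predicate is hypothetical.

```vadalog
% between: both bounds excluded, same as V > 0, V < 10
strictly_inside(Id) <- reading(Id, V), between(V, 0, 10).

% _between: lower bound included, same as V >= 0, V < 10
left_closed(Id) <- reading(Id, V), _between(V, 0, 10).

% between_: upper bound included, same as V > 0, V <= 10
right_closed(Id) <- reading(Id, V), between_(V, 0, 10).

% _between_: both bounds included, same as V >= 0, V <= 10
closed(Id) <- reading(Id, V), _between_(V, 0, 10).
```

The leading or trailing underscore marks the side whose bound is inclusive, which makes the variant easy to read off from the function name.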

Data Type Conversions

as_string(expression)              % Convert to string
as_double(expression)              % Convert to double
as_int(expression)                 % Convert to integer
as_long(expression)                % Convert to long
as_float(expression)               % Convert to float
as_boolean(expression)             % Convert to boolean
as_list(expression, expression)    % Convert to list
as_set(expression, expression)     % Convert to set
as_map(key, value, expression)     % Convert to map
as_date(expression)                % Convert to date
as_timestamp(expression)           % Convert to timestamp
as_json(expression)                % Convert to JSON
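
A short sketch of the single-argument conversions in use, assuming a hypothetical raw_order(Id, AmountText) predicate whose second argument arrives as text. The multi-argument conversions (as_list, as_set, as_map) are left out because their argument semantics are not described here.

```vadalog
% Parse a numeric string, then build a display label from the identifier
order_amount(Id, Amount, Label) <-
    raw_order(Id, AmountText),
    Amount = as_double(AmountText),
    Label = concat("order-", as_string(Id)).
```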

Null Handling

is_null(expression)        % Check if null
is_not_null(expression)    % Check if not null
nullManagement:ifnull(expression, valueIfNull, valueIfNotNull)  % Conditional null handling
nullManagement:coalesce(expr1, expr2, ...)  % First non-null value
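
As an illustration, the rules below use a hypothetical contact(Name, Email, Phone) predicate in which Email and Phone may be null. The first rule assumes a boolean function can stand alone in a rule body, as the interval functions do in their examples.

```vadalog
% Keep only contacts that have an email address
reachable(Name) <-
    contact(Name, Email, Phone),
    is_not_null(Email).

% Pick the first non-null contact channel, falling back to a constant
preferred_channel(Name, Channel) <-
    contact(Name, Email, Phone),
    Channel = nullManagement:coalesce(Email, Phone, "unknown").
```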

Terms and Constants

Variable Terms

Variables: Start with uppercase (Name, Age, X)
Anonymous variables: Start with underscore (_, _1, _temp)

Constant Terms

Strings: Double-quoted ("hello", "john doe")
Integers: Numeric (42, -10)
Doubles: Decimal (3.14, -2.5)
Booleans: #T (true), #F (false)
Dates: 2024-01-15 or 2024-01-15 14:30:00

Collection Terms

Lists: [1, 2, 3] or ["a", "b", "c"]
Sets: {1, 2, 3} or {"a", "b", "c"}
Empty collections: [] (list), {} (set)
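
The sketch below puts the constant and collection forms together. All predicate names are hypothetical, and storing collections directly in facts is an assumption based on the term syntax above.

```vadalog
% Fact with string, integer, double, and boolean constants
product("widget", 42, 3.14, #T).

% Facts holding a list and a set
tags("widget", ["new", "sale"]).
allowed_regions({"eu", "us"}).

% A set constant used with the in condition
eu_or_us(Name) <- customer(Name, Region), Region in {"eu", "us"}.
```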

External Functions

Call external functions by prefixing the function name with its namespace, for example math:sqrt:

functionCall: ID('(' (expression (',' expression)*)? ')')

Examples:

% Math functions
sqrt_value(Result) <- number(X), Result = math:sqrt(X).

% Date functions  
tomorrow(Date) <- today(Today), Date = date:next_day(Today).

% Hash functions
user_hash(Hash) <- user(Data), Hash = hash:sha1(Data).

Parameter Operations

Dynamic parameter substitution:

paramOperation: '${' paramTerm '}'

Example:

filtered_data(X) <- data(X, Value), Value > ${threshold}.

Comments

Line comments start with %:

% This is a comment
person("alice", 30).  % End-of-line comment

Operator Precedence

From highest to lowest precedence:

Parentheses: ()
Unary minus: -
Multiplication/Division: *, /
Addition/Subtraction: +, -
Comparison: <, <=, >, >=, ==, !=
Logical NOT: !
Logical AND: &&
Logical OR: ||
Set operations: |, &
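
To make the ordering concrete, the following sketch shows how an arithmetic expression would group under the table above, and how parentheses change that grouping. The employee(Name, Base, Bonus) predicate is illustrative.

```vadalog
% * binds tighter than +, so this reads as Base + (Bonus * 2)
payout(Name, Total) <-
    employee(Name, Base, Bonus),
    Total = Base + Bonus * 2.

% Parentheses override the default grouping: (Base + Bonus) * 2
doubled(Name, Total) <-
    employee(Name, Base, Bonus),
    Total = (Base + Bonus) * 2.
```

When an expression mixes several operator families, explicit parentheses are the safest way to state the intended grouping.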

Best Practices

Use meaningful predicate names: customer_analysis not ca
Follow naming conventions: Variables uppercase, constants lowercase
Group related rules: Keep similar logic together
Comment complex logic: Explain non-obvious rules
Use proper aggregations: mavg() for averages, msum() for sums
Handle edge cases: Consider null values and empty results
