annotationName indicates the specific annotation
and each of them accepts a specific list of parameters. In the following
sections we present the supported annotations.
@output
It specifies that the facts for an atom of the program will be exported to an external target, for example the standard output or a relational database. The syntax is the following:
atomName is the atom for which the facts have to be exported into an
external target.
It is assumed that an atom annotated with @output:
- does not have any explicit facts in the program.
If the @output annotation is used without any @bind annotation, it is
assumed that the default target is the standard output. Annotations @model,
@bind and @mapping can be used to customize the target system.
@model
The @model annotation is used to create and enforce a schema for a predicate,
ensuring the data adheres to a specified structure. This annotation not only
supports simple predicate schema definitions but also extends to handle complex
concepts such as superclass relationships and triple-based entity relationships.
The annotation syntax is as follows:
- predicate_name: The name of the predicate to which the schema is applied.
- ['field_name:type', 'field_name:type', '…']: A list defining the schema, where each argument specifies a field name and its corresponding type.
- optional_description: (Optional) A natural language description of the predicate, providing a readable explanation of what the predicate represents.
- first: string
- second: string
- third: double
- fourth: string
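To make the idea of a schema as a list of 'field_name:type' entries concrete, the following Python sketch parses such a list and casts a row to the declared types. It only illustrates the concept and is not how Vadalog implements @model; the `CASTS` table, both function names, and the sample row are assumptions made for this example, and only the string, int and double types are covered.

```python
import ast

# Assumed mapping from schema type names to Python casts (illustrative only).
CASTS = {"string": str, "int": int, "double": float}

def parse_schema(schema_text):
    """Parse a schema such as "['first:string', 'third:double']" into (name, type) pairs."""
    fields = ast.literal_eval(schema_text)
    return [tuple(field.split(":", 1)) for field in fields]

def cast_row(row, schema):
    """Cast each value of a row to the type declared at the same position."""
    if len(row) != len(schema):
        raise ValueError("row arity does not match the schema")
    return tuple(CASTS[type_name](value) for value, (_, type_name) in zip(row, schema))

schema = parse_schema("['first:string', 'second:string', 'third:double', 'fourth:string']")

# Hypothetical untyped row; the values are cast to string, string, double, string.
typed_row = cast_row((1, 2, "3.5", "x"), schema)
print(typed_row)  # ('1', '2', 3.5, 'x')
```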
Workflow
Assume you have a Parquet dataset containing the following row:
- Define a schema for the input predicate:
  This ensures that predicate b adheres to the specified schema.
- Bind the predicate to a data source:
  This reads data from the specified Parquet file into predicate b.
- Define and enforce a schema for the output predicate:
  This writes the Parquet file and casts the input data type fields to the output data type fields int, int, double, and string.
- Define the rules using the schema-defined predicates:
  This writes the following row in the Parquet file:
Natural Language Descriptions
You can include a natural language description within the @model annotation to
describe what the predicate represents. This description provides human-readable
context for predicates in addition to their schema definition.
If both a model annotation and a glossary file provide descriptions for a
certain predicate, the description in the glossary file takes precedence.
You may refer to terms of the predicate, which will be substituted with
values in the chase graph. Predicate terms referred to in this way must be
enclosed in square brackets [] in the description.
Automatic Generation of natural language description
If a model annotation does not include a natural language description for the predicate and if an LLM is available, the description will be automatically generated during the compilation phase.
- This autogenerated description is based on the schema and fields defined in the model annotation.
- Each time the .vada file is compiled, the autogenerated description is refreshed, ensuring that it remains up-to-date with any schema changes.
Superclasses
A superclass in the context of a @model annotation allows for the inheritance
of attribute schemas from a base predicate. This feature simplifies the
management of related predicates by allowing common attributes to be defined
once in a superclass predicate.
In order to extend a schema in this way, you must wrap the superclass model in
parentheses (). You can then refer to attributes of the superclass using square
brackets [] in the fields definition.
The syntax is as follows:
@model("subclass(superclass)", "['id:superclass[id]']").
Example
Consider a person as a superclass and engineer as a derived class from
person:
engineer inherits id and name fields from person and adds
a new field specialty.
Superclasses can also be modelled deeply as follows:
Triples
Traditional knowledge graphs are modelled using triples, where relationships between entities are expressed as a triple of [subject, predicate, object].
Example
Using person and engineer entities, a triple relationship can be defined to
capture ownership or control dynamics:
This defines a triple with a subject (person), a predicate manages, and an
object (engineer), with an additional field describing the level of
responsibility.
Notice how the actual triple is simply the manages relationship, but we’ve
added a schema for the level as well. In fact, all relationships between any
number of entities, and having any number of properties, can be modelled in this way.
Composition
In addition to primitive data types, the @model annotation allows a predicate
to include other predicates as complex data types. This facilitates the modeling of
intricate relationships and nested data structures directly within your schema
definitions, providing a robust mechanism for data integrity and hierarchical
data management.
When defining a predicate that uses composition, one of the fields can be
specified as another predicate. This nested predicate must define a primary key
that identifies its instances uniquely, which is used as the reference key in
the composite predicate.
Example
Consider modeling events and states where each event transitions from one state to another:
Start State and End State, with state_id serving
as the data type for these fields, implied to be string type due to the primary
key type of state.
Vadalog Examples
Typed Collections
Composition also allows you to include a predicate as a data type within a Collection. Specifically, the type of elements within the Collection is determined by the type of the primary key of the predicate defined within the brackets.
Bind, Mappings and Qbind
These annotations (@bind, @mapping, @qbind) allow you to customize the data
sources and targets for the @output annotation.
@bind
@bind binds an input or output atom to a source. The syntax for @bind is as
follows:
atomName is the atom we want to bind, data source is the name of a
source defined in the Vadalog configuration, outermost container is a
container in the data source (e.g., a schema in a relational database),
innermost container is a content in the data source (e.g. a table in a
relational database).
Let’s take a look at this example:
This example reads the facts for m from a Postgres data source, specifically
from schema doctors_source and table Medprescriptions, reads the facts for q
from a SQLite data source (in SQLite the schema is ignored), and performs a join.
Bind multiple sources to an input predicate
You can bind multiple external sources (csv, postgres, sqlite, neo4j, …) to a single input predicate. In this example we have a graph partitioned in a csv file and a postgres database and we bind them to the predicate edge. As a
result the facts from the two sources are merged into edge.
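As a rough illustration of what merging two sources into one predicate means, the Python sketch below reads edges from a CSV source and from a relational table and concatenates them into a single collection. It is not Vadalog code: the in-memory CSV text, the `edges` table, its column names, and the use of SQLite in place of Postgres are all assumptions made to keep the example self-contained.

```python
import csv
import io
import sqlite3

# Hypothetical CSV partition of the graph (stands in for a file on disk).
csv_source = io.StringIO("a,b\nb,c\n")

# Hypothetical relational partition; in-memory SQLite stands in for Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany("INSERT INTO edges VALUES (?, ?)", [("c", "d"), ("a", "b")])

def read_csv_facts(handle):
    """Read every CSV row as one fact (a tuple of strings)."""
    return [tuple(row) for row in csv.reader(handle)]

def read_table_facts(connection):
    """Read every row of the hypothetical edges table as one fact."""
    return [tuple(row) for row in connection.execute("SELECT src, dst FROM edges")]

# Facts from both sources end up in the same predicate.
# This sketch does not remove duplicates across sources.
edge = read_csv_facts(csv_source) + read_table_facts(conn)
print(edge)
```

Here the fact ("a", "b") appears in both sources, so it occurs twice in `edge`; whether the engine itself deduplicates at this stage is not stated in this section.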
@mapping
@mapping maps specific columns of the input/output source to a position of an
atom. An atom that appears in a @mapping annotation must also appear in a
@bind annotation.
The syntax is the following:
atomName is the atom we want to map, positionInAtom is an integer
(from 0) denoting the position of the atom that we want to map; columnName is
the name of the column in the source (or equivalent data structure),
columnType is an indication of the type in the source. The following types can
be specified: string, int, double, boolean and date.
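The effect of a set of mappings can be pictured with a small Python sketch: each mapping says which source column feeds which (0-based) position of the atom, and with which type. This is a conceptual model rather than Vadalog's implementation; the `apply_mappings` helper, the column names, and the sample row are hypothetical, and only three of the listed types are handled.

```python
# Assumed casts for a subset of the supported column types (illustrative only).
CASTS = {"string": str, "int": int, "double": float}

def apply_mappings(row, mappings):
    """Build a fact from a source row, given (positionInAtom, columnName, columnType) triples."""
    fact = [None] * len(mappings)
    for position, column_name, column_type in mappings:
        fact[position] = CASTS[column_type](row[column_name])
    return tuple(fact)

# Hypothetical columns of a source table, mapped to positions 0, 1 and 2.
mappings = [(0, "id", "int"), (1, "patient", "string"), (2, "dose", "double")]

# Hypothetical source row; the column order in the source does not matter.
row = {"patient": "Ann", "dose": "2.5", "id": "7"}

fact = apply_mappings(row, mappings)
print(fact)  # (7, 'Ann', 2.5)
```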
In this example, we map the columns of the Medprescriptions table:
@qbind
@qbind binds an atom to a source, generating the facts for the atom as
the result of a query executed on the source.
The syntax is the following:
atomName is the atom we want to bind, data source is the name of a
source defined in the Vadalog configuration, outermost container is a
container in the data source (e.g., a schema in a relational database), query
is a query in the language supported by the source (e.g., SQL for relational
databases).
Consider this example:
Here we bind atom t to the data source postgres, selecting specific content
from the table TestTable.
You can also use parametric @qbind, for example:
${1} is a parameter, which will take the values of the first input field of
t. Parametric @qbind should be used in joins with other atoms.
You can also use multiple parameters within a parametric @qbind:
${1} and ${2} are the first and second parameters of all t results.
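One way to picture parameter substitution is the Python sketch below, which replaces each ${n} placeholder in a query with the n-th bound value. This is only a conceptual model: how Vadalog actually passes values to the source is not described here, and the `instantiate` helper, the query text, and the column names a and b are assumptions for illustration. Production code talking to a database should use the driver's own parameter binding instead of string substitution.

```python
import re

def instantiate(query, bound_values):
    """Replace each ${n} placeholder with the n-th bound value (1-based)."""
    def substitute(match):
        index = int(match.group(1))
        # repr() quotes strings and leaves numbers bare; good enough for a sketch.
        return repr(bound_values[index - 1])
    return re.sub(r"\$\{(\d+)\}", substitute, query)

# Hypothetical parametric query over the TestTable table.
template = "SELECT a, b FROM TestTable WHERE a = ${1} AND b = ${2}"

query = instantiate(template, (1, "x"))
print(query)  # SELECT a, b FROM TestTable WHERE a = 1 AND b = 'x'
```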
Post-processing with @post
This category of annotations includes a set of post-processing operations that can be applied to the facts of atoms annotated with @output before exporting the result into the target. Note that the post-processing is applied even if the result is simply sent to the standard output. The syntax is the following:
atomName is the name of the atom (which must also be annotated with
@output) for which the post-processing is intended and post processing directive is a specification of the post-processing operation to be applied.
Multiple post-processing annotations can be used for the same atom, in case
multiple transformations are desired.
In the following sections we give the details.
Order by
It sorts the output over some positions of the atom. The syntax is the following:
atomName is the atom to be sorted, p1, …, pn are integers denoting a
valid position in atomName (starting from 1). The sorting is applied to the
positions in the order in which they are listed. A position can be prefixed
with the minus sign (-) to denote descending sorting.
For the various data types the usual order relations are assumed (to be
extended).
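The ordering semantics described above can be sketched in Python as follows. This models the behaviour only and is not the engine's implementation; the `order_by` helper and the sample facts are assumptions. Positions are 1-based and a negative position means descending order.

```python
def order_by(facts, positions):
    """Sort facts by 1-based positions; a negative position sorts descending."""
    ordered = list(facts)
    # Stable sorts applied from the least significant key to the most
    # significant one yield a multi-key ordering.
    for position in reversed(positions):
        index = abs(position) - 1
        ordered.sort(key=lambda fact: fact[index], reverse=position < 0)
    return ordered

# Hypothetical facts of a binary atom.
facts = [(1, "b"), (2, "a"), (1, "a"), (2, "c")]

# Ascending on position 1, then descending on position 2.
result = order_by(facts, [1, -2])
print(result)  # [(1, 'b'), (1, 'a'), (2, 'c'), (2, 'a')]
```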
Consider this example:
Min
It calculates the minimum value for one or more positions of an atom, grouping by the other positions. The syntax is the following:
atomName is the atom at hand, p1, …, pn are integers denoting a valid
position in atomName (starting from 1).
(1,"b"), (2,"c") and (1,"a") fall within one
group, and (1,"a") is a minimal tuple among them according to the
lexicographic order.
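The grouping and lexicographic comparison can be modelled with the Python sketch below. It is an illustration under assumptions, not the engine's code: the `post_min` helper and the three-position sample facts are hypothetical, with position 1 acting as the grouping position and positions 2 and 3 being minimised together.

```python
def post_min(facts, positions):
    """Group by the positions not listed and keep, per group, the fact whose
    values at the listed 1-based positions are lexicographically smallest."""
    indexes = [p - 1 for p in positions]
    best = {}
    for fact in facts:
        group = tuple(v for i, v in enumerate(fact) if i not in indexes)
        candidate = tuple(fact[i] for i in indexes)
        if group not in best or candidate < best[group][0]:
            best[group] = (candidate, fact)
    return [fact for _, fact in best.values()]

# Hypothetical facts: position 1 is the group, positions 2 and 3 are minimised.
facts = [("g", 1, "b"), ("g", 2, "c"), ("g", 1, "a"), ("h", 3, "z")]

result = post_min(facts, [2, 3])
print(result)  # [('g', 1, 'a'), ('h', 3, 'z')]
```

Replacing the `<` comparison with `>` gives the corresponding sketch for the maximum.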
Max
It calculates the maximum value for one or more positions of an atom, grouping by the other positions. The syntax is the following:
atomName is the atom at hand, p1, …, pn are integers denoting a valid
position in atomName (starting from 1).
("b",2), ("c",1) and ("a",2) fall within one
group, and ("c",1) is a maximal tuple among them according to the
lexicographic order.
Argmin
It groups the facts of an atom according to certain positions and, for each group, it returns only the facts that minimise a specific position. The syntax is the following:
atomName is the atom at hand, p is the position to minimise (from 1) and
p1, …, pn are integers denoting the positions that identify a specific
group.
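A minimal Python sketch of this behaviour is given below, assuming that all facts tied on the minimum value are kept (the description speaks of "the facts" that minimise the position). The `argmin` helper and the sample facts are hypothetical and only model the semantics.

```python
def argmin(facts, p, group_positions):
    """For each group (identified by the 1-based group_positions), keep only
    the facts whose value at 1-based position p is the smallest in the group."""
    groups = {}
    for fact in facts:
        key = tuple(fact[g - 1] for g in group_positions)
        groups.setdefault(key, []).append(fact)
    result = []
    for members in groups.values():
        lowest = min(member[p - 1] for member in members)
        result.extend(member for member in members if member[p - 1] == lowest)
    return result

# Hypothetical facts: group by position 1, minimise position 2.
facts = [("a", 3, "x"), ("a", 1, "y"), ("b", 5, "z"), ("a", 1, "w")]

result = argmin(facts, 2, [1])
print(result)  # [('a', 1, 'y'), ('a', 1, 'w'), ('b', 5, 'z')]
```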
Argmax
It groups the facts of an atom according to certain positions and, for each group, it returns only the facts that maximise a specific position. The syntax is the following:
atomName is the atom at hand, p is the position to maximise (from 1)
and p1, …, pn are integers denoting the positions that identify a specific
group.
Unique
In reasoning with Vadalog Parallel, there are particular situations where duplicate facts for a specific atom may occur in the output. In general, there is no guarantee that output atoms are duplicate-free. If such a guarantee is required, the unique post-processing annotation can be used. The syntax is as follows:
atomName is the name of the atom at hand.
Certain
As Vadalog Parallel handles marked nulls, it is possible that the facts of some output atoms contain such values. Sometimes this may not be desired, for example when the result needs to be stored into a relational database. The certain post-processing annotation filters out, for a given atom, all the
facts containing any marked nulls.
The syntax is as follows:
atomName is the name of the atom at hand.
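The filtering can be pictured with the following Python sketch. The `MarkedNull` class is a hypothetical stand-in for however the engine represents marked nulls internally, and the `certain` function only models the described behaviour.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MarkedNull:
    """Hypothetical representation of a marked null with an identifying label."""
    label: str

def certain(facts):
    """Keep only the facts that contain no marked null at any position."""
    return [
        fact for fact in facts
        if not any(isinstance(value, MarkedNull) for value in fact)
    ]

# Hypothetical output facts, two of which contain a marked null.
facts = [("alice", 1), ("bob", MarkedNull("z1")), (MarkedNull("z2"), 3)]

kept = certain(facts)
print(kept)  # [('alice', 1)]
```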
Limit and Prelimit
Sometimes it is useful to limit an output relation to a fixed number of tuples. One can achieve this in two different ways with the use of the post-processing annotations limit and prelimit as shown below.
@Param
The @param annotation is used to introduce and define parameters that can be
referenced throughout the rules within a program. Parameters allow for dynamic
values that can be modified without changing the core logic of the program,
making the rules more flexible and reusable.
For parameterization via API refer to
evaluateFromRepoWithParams.
Syntax
- parameter_name: A string representing the name of the parameter. It should be unique within the context of the program.
- value: The value associated with the parameter. This can be any valid value type in Vadalog (e.g., integer, string, double, list, etc.).

