Rule Inference API
The Rule Inference API infers Vadalog rules from the schema of a database or data source. It generates one linear Vadalog rule for each table or file, plus one join rule for each table that has foreign keys.
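As a rough illustration of the rule counts described above, the sketch below takes a hand-written schema description and lists which rules one would expect: one linear rule per table and one join rule per table with foreign keys. The `schema` dictionary and the `expected_rules` helper are hypothetical and not part of the SDK; the actual rules always come from `infer_schema`.

```python
# Hypothetical schema description used only for illustration; not an SDK structure.
schema = {
    "customers": {"foreign_keys": []},
    "orders": {"foreign_keys": ["customer_id"]},
    "order_items": {"foreign_keys": ["order_id", "product_id"]},
}


def expected_rules(schema):
    """Return (linear, join) lists of table names following the description above."""
    linear = sorted(schema)  # one linear rule per table or file
    join = sorted(name for name, info in schema.items() if info["foreign_keys"])
    return linear, join


linear, join = expected_rules(schema)
print(f"{len(linear)} linear rules: {linear}")
print(f"{len(join)} join rules: {join}")
```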
Infer Schema
- Python SDK
- REST API
import prometheux_chain as px
from prometheux_chain.data.database import Database
# Create a Database object
db = Database(
    database_type="postgresql",
    username="prometheux",
    password="prometheux",
    host="localhost",
    port=5432,
    database_name="prometheux"
)
# Infer Vadalog rules from the database
inferred_rules = px.infer_schema(db, add_bind=True, add_model=False)
# Save the inferred rules to a file
with open("infer-from-postgresql.vada", "w") as file:
    file.write(inferred_rules)
Function Signature
def infer_schema(database, add_bind=True, add_model=False)
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| database | Database | Yes | An instance of the Database class containing the connection details |
| add_bind | bool | No | Whether to add a bind statement to the inferred schema. Defaults to True |
| add_model | bool | No | Whether to add a model annotation statement. Defaults to False |
Returns
Returns the Vadalog rules inferred from the database or data source schema as a single string.
HTTP Request
POST /api/v1/data/infer-schema
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| database | object | Yes | Database connection configuration |
| addBind | boolean | No | Whether to add a bind statement (default: true) |
| addModel | boolean | No | Whether to add a model annotation (default: false) |
The database object should contain:
| Field | Type | Required | Description |
|---|---|---|---|
| databaseType | string | Yes | Database type (e.g., "postgresql", "mysql", "neo4j", "csv", "excel") |
| host | string | Yes | Database host |
| port | integer | No | Database port |
| username | string | Yes | Database username |
| password | string | Yes | Database password |
| databaseName | string | Yes | Database name |
| schema | string | No | Database schema |
| options | object | No | Additional database-specific options |
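The REST field names are the camelCase counterparts of the Python SDK arguments (`database_type` becomes `databaseType`, `add_bind` becomes `addBind`, and so on). A minimal sketch of building the request body from SDK-style arguments follows; `build_infer_schema_body` is a hypothetical helper, not an SDK function, and it only checks that the fields marked required above are present.

```python
import json

# SDK argument names mapped to REST field names, taken from the tables above.
FIELD_NAMES = {
    "database_type": "databaseType",
    "host": "host",
    "port": "port",
    "username": "username",
    "password": "password",
    "database_name": "databaseName",
    "schema": "schema",
    "options": "options",
}
REQUIRED = ("databaseType", "host", "username", "password", "databaseName")


def build_infer_schema_body(add_bind=True, add_model=False, **database):
    """Build the JSON body for POST /api/v1/data/infer-schema (hypothetical helper)."""
    # Drop arguments set to None (for example port=None) so they are omitted from the body.
    db = {FIELD_NAMES[key]: value for key, value in database.items() if value is not None}
    missing = [name for name in REQUIRED if name not in db]
    if missing:
        raise ValueError(f"missing required database fields: {missing}")
    return {"database": db, "addBind": add_bind, "addModel": add_model}


body = build_infer_schema_body(
    database_type="postgresql",
    host="localhost",
    port=5432,
    username="prometheux",
    password="prometheux",
    database_name="prometheux",
)
print(json.dumps(body, indent=2))
```

Empty strings are accepted for required fields, since several of the examples on this page pass `""` for values that do not apply to the data source.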
cURL Example
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "postgresql",
"host": "localhost",
"port": 5432,
"username": "prometheux",
"password": "prometheux",
"databaseName": "prometheux"
},
"addBind": true,
"addModel": false
}'
Response
{
"data": "@bind(\"customers\", \"postgresql\", ...).\ncustomers(Id, Name, Email) :- ...",
"message": "Schema inferred successfully",
"status": "success"
}
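The same request can be prepared from Python with only the standard library. The sketch below builds the request object without sending it and extracts the rules from a response shaped like the one above. The URL and token are the placeholders from the cURL example. Treating any `status` other than `"success"` as an error is an assumption, since only the success shape is documented here, and `extract_rules` is a hypothetical helper.

```python
import json
import urllib.request

# Placeholder URL and token copied from the cURL example.
URL = "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema"
TOKEN = "YOUR_JWT_TOKEN"

body = {
    "database": {
        "databaseType": "postgresql",
        "host": "localhost",
        "port": 5432,
        "username": "prometheux",
        "password": "prometheux",
        "databaseName": "prometheux",
    },
    "addBind": True,
    "addModel": False,
}

request = urllib.request.Request(
    URL,
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json", "Authorization": f"Bearer {TOKEN}"},
    method="POST",
)


def extract_rules(response_text):
    """Return the inferred rules from a response shaped like the example above."""
    payload = json.loads(response_text)
    # Assumption: failures are reported with a status other than "success".
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("message", "schema inference failed"))
    return payload["data"]


# To send the request for real:
# with urllib.request.urlopen(request) as response:
#     rules = extract_rules(response.read().decode("utf-8"))

sample = json.dumps({
    "data": "customers(Id, Name, Email) :- ...",
    "message": "Schema inferred successfully",
    "status": "success",
})
print(extract_rules(sample))
```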
Database Examples
PostgreSQL
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
db = Database(
    database_type="postgresql",
    username="prometheux",
    password="prometheux",
    host="localhost",
    port=5432,
    database_name="prometheux"
)
inferred_rules = px.infer_schema(db, add_bind=True)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "postgresql",
"host": "localhost",
"port": 5432,
"username": "prometheux",
"password": "prometheux",
"databaseName": "prometheux"
},
"addBind": true
}'
Neo4j
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
db = Database(
    database_type="neo4j",
    username="neo4j",
    password="neo4j2",
    host="localhost",
    port=7687,
    database_name="neo4j"
)
inferred_rules = px.infer_schema(db, add_bind=True)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "neo4j",
"host": "localhost",
"port": 7687,
"username": "neo4j",
"password": "neo4j2",
"databaseName": "neo4j"
},
"addBind": true
}'
CSV File from S3
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
db = Database(
    database_type="csv",
    username="AKIA4xxxx12",
    password="JyxxxxU+",
    host="s3a://prometheux-data",
    port=None,
    database_name="companies.csv",
    options={
        "region": "eu-west-2",
        "endpoint": "s3.amazonaws.com",
        "credentials.provider": "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
        "delimiter": "\t"
    }
)
inferred_rules = px.infer_schema(db, add_bind=True)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "csv",
"host": "s3a://prometheux-data",
"username": "AKIA4xxxx12",
"password": "JyxxxxU+",
"databaseName": "companies.csv",
"options": {
"region": "eu-west-2",
"endpoint": "s3.amazonaws.com",
"credentials.provider": "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
"delimiter": "\t"
}
},
"addBind": true
}'
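Both examples write the delimiter as `"\t"`, but the two forms are not the same thing: in Python source it is a single tab character, while in the JSON body it is the two-character escape sequence that decodes to a tab. The short sketch below, using only the standard `json` module, shows that serialising the Python options produces the escaped form used in the cURL body.

```python
import json

options = {
    "region": "eu-west-2",
    "delimiter": "\t",  # a single tab character in Python
}

# json.dumps writes the tab as the escape sequence \t, as in the cURL body.
encoded = json.dumps(options)
print(encoded)

# Decoding restores the single tab character.
decoded = json.loads(encoded)
print(len(decoded["delimiter"]))  # 1
```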
Databricks
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
db = Database(
    database_type="databricks",
    username="token",
    password="dapixxxx",
    host="dbc-xxxx-02fe.cloud.databricks.com",
    port=443,
    database_name="/sql/1.0/warehouses/3283xxxx"
)
inferred_rules = px.infer_schema(db, add_bind=True)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "databricks",
"host": "dbc-xxxx-02fe.cloud.databricks.com",
"port": 443,
"username": "token",
"password": "dapixxxx",
"databaseName": "/sql/1.0/warehouses/3283xxxx"
},
"addBind": true
}'
Databricks with Specific Schema
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
inferred_rules = px.infer_schema(
    Database(
        database_type="databricks",
        username="token",
        password="dapixxxx",
        host="dbc-xxxx-02fe.cloud.databricks.com",
        port=443,
        database_name="/sql/1.0/warehouses/3283xxxx",
        schema="my_catalog.my_schema"
    ),
    add_bind=True,
    add_model=False
)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "databricks",
"host": "dbc-xxxx-02fe.cloud.databricks.com",
"port": 443,
"username": "token",
"password": "dapixxxx",
"databaseName": "/sql/1.0/warehouses/3283xxxx",
"schema": "my_catalog.my_schema"
},
"addBind": true,
"addModel": false
}'
Snowflake
Instead of using a password, you can use a Programmatic Access Token (PAT) for authentication. This avoids MFA prompts during automated workflows. See the Snowflake data source documentation for details.
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
db = Database(
    database_type="snowflake",
    username="my_username",
    password="my_password",  # Or pass your PAT here instead
    host="jdbc:snowflake://A77885826xxxx-IV3xxxx.snowflakecomputing.com",
    port=443,
    database_name="my_database",
    schema="my_schema",
    options={"warehouse": "my_warehouse"}
)
inferred_rules = px.infer_schema(db, add_bind=True)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "snowflake",
"host": "jdbc:snowflake://A77885826xxxx-IV3xxxx.snowflakecomputing.com",
"port": 443,
"username": "my_username",
"password": "my_password",
"databaseName": "my_database",
"schema": "my_schema",
"options": {
"warehouse": "my_warehouse"
}
},
"addBind": true
}'
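Because the password or PAT travels in the `password` field, it is usually better to read it from the environment than to hard-code it in a script. The sketch below builds the `database` object for the REST request that way. The variable name `SNOWFLAKE_SECRET` and the helper `snowflake_database_fields` are hypothetical, chosen only for this example.

```python
import os


def snowflake_database_fields(username, host, database_name, schema, warehouse):
    """Build the `database` object, reading the password or PAT from the environment."""
    # SNOWFLAKE_SECRET is a hypothetical variable name, not one the platform defines.
    secret = os.environ.get("SNOWFLAKE_SECRET")
    if not secret:
        raise RuntimeError("set SNOWFLAKE_SECRET to a password or PAT")
    return {
        "databaseType": "snowflake",
        "host": host,
        "port": 443,
        "username": username,
        "password": secret,
        "databaseName": database_name,
        "schema": schema,
        "options": {"warehouse": warehouse},
    }


# Demo value only; in practice export the real secret in your shell or secret manager.
os.environ.setdefault("SNOWFLAKE_SECRET", "my_pat")

fields = snowflake_database_fields(
    username="my_username",
    host="jdbc:snowflake://A77885826xxxx-IV3xxxx.snowflakecomputing.com",
    database_name="my_database",
    schema="my_schema",
    warehouse="my_warehouse",
)
# Mask the secret before printing.
print({**fields, "password": "***"})
```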
Excel File
An Excel file is treated as a database in which each sheet is a table.
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
db = Database(
    database_type="excel",
    username="",
    password="workbookPassword",
    host="path/to/excel_file",
    port=None,
    database_name="excel_file.xlsx",
)
inferred_rules = px.infer_schema(db, add_bind=True)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "excel",
"host": "path/to/excel_file",
"username": "",
"password": "workbookPassword",
"databaseName": "excel_file.xlsx"
},
"addBind": true
}'
BigQuery
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
# Read the GCP access token from the environment rather than hard-coding it
gcp_access_token = os.environ["GCP_ACCESS_TOKEN"]
db = Database(
    database_type="bigquery",
    username="",
    password="",
    host="",
    port=None,
    database_name="my_project_id",
    schema="datasetId",
    options={
        "authMode": "gcpAccessToken",
        "gcpAccessToken": gcp_access_token,
        "parentProject": "my_parent_project_id",
        "billingProjectId": "my_billing_project_id",
        "region": "us-central1"
    }
)
inferred_rules = px.infer_schema(db, add_bind=True, add_model=True)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "bigquery",
"host": "",
"username": "",
"password": "",
"databaseName": "my_project_id",
"schema": "datasetId",
"options": {
"authMode": "gcpAccessToken",
"gcpAccessToken": "YOUR_GCP_ACCESS_TOKEN",
"parentProject": "my_parent_project_id",
"billingProjectId": "my_billing_project_id",
"region": "us-central1"
}
},
"addBind": true,
"addModel": true
}'
Text File
Infer concepts and relationships from text content.
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
db = Database(
    database_type="text",
    username="",
    password="",
    host="path/to/file",
    port=None,
    database_name="document.txt"
)
inferred_rules = px.infer_schema(db, add_bind=True, add_model=False)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "text",
"host": "path/to/file",
"username": "",
"password": "",
"databaseName": "document.txt"
},
"addBind": true,
"addModel": false
}'
Binary File (PDF, Images)
The binaryfile database type accepts PDF, JPG, PNG, and other binary formats.
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
db = Database(
    database_type="binaryfile",
    username="",
    password="",
    host="path/to/file",
    port=None,
    database_name="document.pdf"
)
inferred_rules = px.infer_schema(db, add_bind=True, add_model=False)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "binaryfile",
"host": "path/to/file",
"username": "",
"password": "",
"databaseName": "document.pdf"
},
"addBind": true,
"addModel": false
}'
Business Documents
Structured business documents, such as ID documents, receipts, tax forms, and mortgage documents, are processed as binary files with a documentType option. Supported document types include:
| Category | Document Types |
|---|---|
| Financial | check.us, bankStatement.us, payStub.us, creditCard, invoice |
| ID Documents | idDocument.driverLicense, idDocument.passport, idDocument.nationalIdentityCard, idDocument.residencePermit, idDocument.usSocialSecurityCard |
| Receipts | receipt.retailMeal, receipt.creditCard, receipt.gas, receipt.parking, receipt.hotel |
| Tax Documents | tax.us.1040.2023, tax.us.w2, tax.us.w4, tax.us.1095A, tax.us.1098, tax.us.1099 |
| Mortgage Documents | mortgage.us.1003 (URLA), mortgage.us.1004 (URAR), mortgage.us.closingDisclosure |
| Other | contract, healthInsuranceCard.us, marriageCertificate.us |
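A small lookup can catch a mistyped document type before a request is sent. The sketch below copies the identifiers from the table and returns the `options` object for a listed type. `document_options` is a hypothetical helper, the parenthesised form names (URLA, URAR) are assumed to be labels and not part of the identifier, and the table may not be exhaustive, so a miss here does not prove that the service rejects the type.

```python
# Document types copied from the table above.
DOCUMENT_TYPES = {
    "Financial": ["check.us", "bankStatement.us", "payStub.us", "creditCard", "invoice"],
    "ID Documents": [
        "idDocument.driverLicense",
        "idDocument.passport",
        "idDocument.nationalIdentityCard",
        "idDocument.residencePermit",
        "idDocument.usSocialSecurityCard",
    ],
    "Receipts": [
        "receipt.retailMeal",
        "receipt.creditCard",
        "receipt.gas",
        "receipt.parking",
        "receipt.hotel",
    ],
    "Tax Documents": [
        "tax.us.1040.2023",
        "tax.us.w2",
        "tax.us.w4",
        "tax.us.1095A",
        "tax.us.1098",
        "tax.us.1099",
    ],
    # Assumption: "(URLA)" and "(URAR)" in the table are labels, not part of the identifier.
    "Mortgage Documents": ["mortgage.us.1003", "mortgage.us.1004", "mortgage.us.closingDisclosure"],
    "Other": ["contract", "healthInsuranceCard.us", "marriageCertificate.us"],
}
KNOWN_TYPES = {name for names in DOCUMENT_TYPES.values() for name in names}


def document_options(document_type):
    """Return the `options` object for a listed document type (hypothetical helper)."""
    if document_type not in KNOWN_TYPES:
        raise ValueError(f"{document_type!r} is not in the table of document types")
    return {"documentType": document_type}


print(document_options("idDocument.driverLicense"))
```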
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
db = Database(
    database_type="binaryfile",
    username="",
    password="",
    host="path/to/file",
    port=None,
    database_name="driver_license.pdf",
    options={"documentType": "idDocument.driverLicense"}
)
inferred_rules = px.infer_schema(db, add_bind=True, add_model=False)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "binaryfile",
"host": "path/to/file",
"username": "",
"password": "",
"databaseName": "driver_license.pdf",
"options": {
"documentType": "idDocument.driverLicense"
}
},
"addBind": true,
"addModel": false
}'
Amazon DynamoDB
- Python SDK
- REST API
import os
import prometheux_chain as px
from prometheux_chain.data.database import Database
os.environ["PMTX_TOKEN"] = "YOUR_TOKEN"
db = Database(
    database_type="dynamodb",
    username="AKIA4xxxx12",  # AWS Access Key ID
    password="JyxxxxU+",  # AWS Secret Access Key
    host="",
    port=None,
    database_name="",
    options={
        "region": "us-east-1",
        "endpoint": "",
        "sessionToken": "",
        "sampleLimit": "100"
    }
)
inferred_rules = px.infer_schema(db, add_bind=True, add_model=True)
curl -X POST "https://api.prometheux.ai/jarvispy/my-org/my-user/api/v1/data/infer-schema" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-d '{
"database": {
"databaseType": "dynamodb",
"host": "",
"username": "AKIA4xxxx12",
"password": "JyxxxxU+",
"databaseName": "",
"options": {
"region": "us-east-1",
"endpoint": "",
"sessionToken": "",
"sampleLimit": "100"
}
},
"addBind": true,
"addModel": true
}'