Data API

The Data API allows you to connect, manage, and query external data sources within your workspaces.


Connect Data Source

Connect an external database to a workspace.

import prometheux_chain as px

# Create a database configuration
database_config = px.Database(
    database_type='postgresql',
    username='user',
    password='password',
    host='localhost',
    port=5432,
    database_name='mydb',
    tables=['users', 'orders']
)

# Connect the source
source_data = px.connect_sources(
    database_payload=database_config,
    compute_row_count=True
)

Function Signature

def connect_sources(
    database_payload: Database = None,
    compute_row_count=False
)

Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| database_payload | Database | Yes | The database connection configuration |
| compute_row_count | bool | No | Whether to compute row counts for tables. Defaults to False |

Returns

The connection response data containing source information.

Response

{
  "data": {
    "connectionStatus": true,
    "sources": [
      {
        "id": "source_123",
        "name": "customers",
        "type": "table",
        "row_count": 1500
      }
    ],
    "errorMessage": null
  },
  "message": "Database connection successful",
  "status": "success"
}
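A caller will usually want to check `connectionStatus` before using the returned sources. The following is a minimal sketch, assuming the payload has the shape of the sample response. The helper name `extract_source_ids` is hypothetical and not part of the SDK. Whether `connect_sources` returns the full envelope or only its inner `data` object is also an assumption, so the helper accepts either.

```python
def extract_source_ids(response):
    """Return the IDs of connected sources, or raise if the connection failed.

    Hypothetical helper, not part of prometheux_chain. It assumes the
    response shape shown in the sample response.
    """
    # Accept either the full envelope or just its inner "data" object.
    payload = response.get("data", response)
    if not payload.get("connectionStatus"):
        raise RuntimeError(payload.get("errorMessage") or "Connection failed")
    return [source["id"] for source in payload.get("sources", [])]


# Sample payload copied from the response documented for connect_sources.
sample = {
    "data": {
        "connectionStatus": True,
        "sources": [
            {"id": "source_123", "name": "customers", "type": "table", "row_count": 1500}
        ],
        "errorMessage": None,
    },
    "message": "Database connection successful",
    "status": "success",
}

source_ids = extract_source_ids(sample)
print(source_ids)  # ['source_123']
```

With the SDK configured, the same helper could be applied to the value returned by `px.connect_sources(...)`, provided the real payload matches the sample.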

List Data Sources

List all connected data sources in a workspace.

import prometheux_chain as px

# List all sources in the workspace
sources = px.list_sources()
print(f"Available sources: {sources}")

Function Signature

def list_sources()

Parameters

This function takes no parameters.

Returns

A list of data source information dictionaries.


Cleanup Data Sources

Remove data sources from a workspace.

import prometheux_chain as px

# Clean up all sources
px.cleanup_sources()

# Clean up specific sources
px.cleanup_sources(source_ids=['source1', 'source2'])

Function Signature

def cleanup_sources(source_ids=None)

Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| source_ids | list | No | List of specific source IDs to clean up. If None, cleans up all sources |
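Because calling `cleanup_sources()` with no arguments removes every source, it can be safer to build an explicit `source_ids` list first. The sketch below selects IDs by source name. It assumes each entry returned by `list_sources()` is a dictionary with `id` and `name` keys, as in the sample connection response; the helper `ids_for_names` and the sample entries are illustrative and not part of the SDK.

```python
def ids_for_names(sources, names):
    """Return the IDs of the sources whose name appears in `names`.

    Hypothetical helper. Assumes each source is a dict with "id" and
    "name" keys.
    """
    wanted = set(names)
    return [source["id"] for source in sources if source.get("name") in wanted]


# Illustrative data standing in for the result of px.list_sources().
sources = [
    {"id": "source_123", "name": "customers"},
    {"id": "source_456", "name": "orders"},
]

stale_ids = ids_for_names(sources, ["orders"])
print(stale_ids)  # ['source_456']

# With the SDK installed and configured, the same pattern would be:
#   sources = px.list_sources()
#   px.cleanup_sources(source_ids=ids_for_names(sources, ["orders"]))
```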

Infer Schema

Infer schema from a database configuration.

import prometheux_chain as px

db = px.Database(
    database_type='postgresql',
    username='user',
    password='password',
    host='localhost',
    port=5432,
    database_name='mydb',
)

schema_result = px.infer_schema(database=db, add_bind=True, add_model=False)

Function Signature

def infer_schema(database: Database, add_bind=True, add_model=False)

Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| database | Database or dict | Yes | The database connection configuration |
| add_bind | bool | No | Whether to add bind metadata. Defaults to True |
| add_model | bool | No | Whether to add model metadata. Defaults to False |

Returns

The schema inference result from the API.


Database Class

The Python SDK provides a Database class for structured database configuration.

from prometheux_chain.data.database import Database

db = Database(
    database_type='postgresql',
    username='user',
    password='password',
    host='localhost',
    port=5432,
    database_name='mydb',
    tables=['users', 'orders'],
    schema='public'
)

# Convert to dictionary for API call
db_config = db.to_dict()

Constructor Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| database_type | str | Yes | Database type (e.g., 'postgresql', 'mysql', 'csv', 'snowflake') |
| username | str | No | Database username |
| password | str | No | Database password |
| host | str | No | Database host address |
| port | int | No | Database port number |
| database_name | str | No | Name of the database |
| tables | list | No | List of specific tables to include |
| schema | str | No | Database schema name |
| catalog | str | No | Database catalog name |
| query | str | No | Custom SQL query |
| options | dict | No | Additional database-specific options |
| selected_columns | list | No | Specific columns to include |
| ignore_columns | list | No | Columns to exclude |
| ignore_tables | list | No | Tables to exclude |
| url | str | No | Direct connection URL |

Methods

  • to_dict(): Converts the Database object to a dictionary format suitable for API calls
  • from_dict(data): Class method to create a Database object from a dictionary
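One use for `from_dict` is building the configuration from values stored outside the source code. The sketch below assembles such a dictionary from environment variables. It assumes `from_dict` accepts keys that match the constructor parameter names; the helper `database_config_from_env` and the `DB_*` variable names are hypothetical and not part of the SDK.

```python
import os


def database_config_from_env(prefix="DB_"):
    """Build a database config dict from environment variables.

    Hypothetical helper. The variable names (DB_TYPE, DB_USER, ...) are
    illustrative; the dict keys mirror the Database constructor parameters.
    """
    config = {
        "database_type": os.environ.get(prefix + "TYPE", "postgresql"),
        "username": os.environ.get(prefix + "USER"),
        "password": os.environ.get(prefix + "PASSWORD"),
        "host": os.environ.get(prefix + "HOST", "localhost"),
        "port": int(os.environ.get(prefix + "PORT", "5432")),
        "database_name": os.environ.get(prefix + "NAME"),
    }
    # Drop unset values so only explicitly provided fields are passed on.
    return {key: value for key, value in config.items() if value is not None}


config = database_config_from_env()
print(sorted(config))

# With the SDK installed, the dictionary could then be converted
# (assuming from_dict accepts these keys):
#   from prometheux_chain.data.database import Database
#   db = Database.from_dict(config)
```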

Complete Workflow Example

import prometheux_chain as px
import os

# Set up authentication and configuration
os.environ['PMTX_TOKEN'] = 'my_pmtx_token'
px.config.set('JARVISPY_URL', "https://api.prometheux.ai/jarvispy/my-org/my-user")

# Create a database configuration using the Database class
database = px.Database(
    database_type='postgresql',
    username='myuser',
    password='mypassword',
    host='localhost',
    port=5432,
    database_name='my_database',
    tables=['customers', 'orders'],
    schema='public'
)

# Connect the database source
source_data = px.connect_sources(database_payload=database)

# List all sources
sources = px.list_sources()
print(f"Connected sources: {sources}")

# Clean up when done
px.cleanup_sources()
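In the workflow above, the final cleanup is skipped if any earlier step raises. One way to guarantee it runs is a small context manager. This is a sketch rather than an SDK feature: `connected_sources` is a hypothetical helper that takes the connect and cleanup steps as callables, and the stand-in functions below replace the real SDK calls so the example runs on its own.

```python
from contextlib import contextmanager


@contextmanager
def connected_sources(connect, cleanup):
    """Run `connect`, yield its result, and always run `cleanup` afterwards."""
    result = connect()
    try:
        yield result
    finally:
        cleanup()


# Stand-ins for px.connect_sources and px.cleanup_sources.
calls = []


def fake_connect():
    calls.append("connect")
    return {"status": "success"}


def fake_cleanup():
    calls.append("cleanup")


try:
    with connected_sources(fake_connect, fake_cleanup) as source_data:
        raise ValueError("query failed")  # simulate a failure mid-workflow
except ValueError:
    pass

print(calls)  # ['connect', 'cleanup']

# With the SDK installed and configured, the same pattern would be:
#   with connected_sources(
#       lambda: px.connect_sources(database_payload=database),
#       px.cleanup_sources,
#   ) as source_data:
#       print(px.list_sources())
```

Note that `px.cleanup_sources()` without arguments removes all sources in the workspace, so pass a function that supplies `source_ids` if only some should be removed.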