๐งฎ AI SQL Generation
How It Works
Natural Language Processing
The AI understands blockchain terminology and data patterns:
- Domain Knowledge: Recognizes DeFi protocols, token standards, transaction types
- Schema Awareness: Knows your project's tables, columns, and relationships
- Query Optimization: Generates efficient queries with proper indexing
- Error Prevention: Avoids common SQL mistakes and performance issues
Smart Table Selection
Instead of querying all tables, the AI:
- Analyzes your question to identify relevant data
- Selects optimal tables based on semantic similarity
- Considers table size and query performance
- Explains why specific tables were chosen
Supported Query Types
Time-Series Analysis
Natural Language: "Show me daily transaction volumes for the past month"
SELECT
DATE_TRUNC('day', block_time) AS day,
COUNT(*) AS transaction_count,
SUM(value_usd) AS volume_usd
FROM ethereum_transactions
WHERE block_time >= NOW() - INTERVAL '1 month'
GROUP BY 1
ORDER BY 1Aggregations & Grouping
Natural Language: "Top 10 tokens by trading volume this week"
SELECT
token_symbol,
token_address,
SUM(volume_usd) AS total_volume
FROM dex_trades
WHERE block_time >= DATE_TRUNC('week', NOW())
GROUP BY token_symbol, token_address
ORDER BY total_volume DESC
LIMIT 10Complex Filtering
Natural Language: "Failed transactions with gas usage above 100k"
SELECT
hash,
from_address,
to_address,
gas_used,
gas_price,
block_time,
error_message
FROM ethereum_transactions
WHERE status = 'failed'
AND gas_used > 100000
ORDER BY block_time DESCMulti-Table Joins
Natural Language: "Show me Uniswap pools with their token information"
SELECT
p.pool_address,
p.fee_tier,
t0.symbol AS token0_symbol,
t1.symbol AS token1_symbol,
p.tvl_usd,
p.volume_24h
FROM uniswap_v3_pools p
JOIN tokens t0 ON p.token0 = t0.address
JOIN tokens t1 ON p.token1 = t1.address
WHERE p.tvl_usd > 1000000
ORDER BY p.tvl_usd DESCAdvanced Features
Query Explanation
Every generated query includes:
- Plain English Summary: What the query does
- Table Usage: Why specific tables were selected
- Performance Notes: Index usage and optimization tips
- Result Schema: Expected output columns and data types
Automatic Optimization
The AI applies SQL best practices:
- Index Usage: Leverages existing database indexes
- Query Rewriting: Optimizes for ClickHouse columnar storage
- Memory Management: Limits result sets for large datasets
- Partition Pruning: Uses date partitioning when available
Error Detection & Fixing
When queries fail, the AI can:
- Analyze error messages and suggest fixes
- Rewrite queries to avoid common issues
- Suggest alternative approaches
- Explain why the original query failed
Query Refinement
Iterative Improvement
Start with a basic query and refine:
- "Show me Ethereum transactions"
- "Only from the last 24 hours"
- "With value greater than 1 ETH"
- "Ordered by gas price, highest first"
Performance Optimization
Ask the AI to optimize queries:
- "Make this query faster"
- "Reduce memory usage for this query"
- "Add proper indexes for better performance"
- "Limit results to avoid timeouts"
Data Quality
Improve data accuracy:
- "Exclude test transactions"
- "Only include successful transactions"
- "Filter out contract creation transactions"
- "Remove duplicate entries"
Best Practices
Effective Prompts
- Be Specific About Time Ranges: "Transactions from the past 7 days"
- Specify Data Granularity: "Daily trading volumes aggregated by hour"
- Include Business Logic: "Active users (those who made transactions)"
- Mention Edge Cases: "Exclude mints and burns"
Performance Considerations
- Use time filters and reasonable limits
- Choose appropriate aggregation levels for the analysis window
Query Templates
Token Analysis
-- Token holder distribution
SELECT
balance_tier,
COUNT(*) AS holder_count,
SUM(balance) AS total_balance
FROM (
SELECT
address,
balance,
CASE
WHEN balance < 100 THEN '0-100'
WHEN balance < 1000 THEN '100-1K'
WHEN balance < 10000 THEN '1K-10K'
ELSE '10K+'
END AS balance_tier
FROM token_balances
WHERE token_address = '0x...'
) t
GROUP BY balance_tier
ORDER BY MIN(balance)DeFi Protocol Analysis
-- Liquidity pool performance
SELECT
pool_address,
DATE_TRUNC('day', block_time) AS day,
AVG(tvl_usd) AS avg_tvl,
SUM(volume_usd) AS daily_volume,
SUM(fees_usd) AS daily_fees,
SUM(fees_usd) / NULLIF(AVG(tvl_usd), 0) * 365 AS apr
FROM pool_snapshots
WHERE block_time >= NOW() - INTERVAL '30 days'
GROUP BY 1, 2
ORDER BY 1, 2Transaction Analysis
-- Gas usage patterns
SELECT
DATE_TRUNC('hour', block_time) AS hour,
AVG(gas_price) AS avg_gas_price,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY gas_price) AS median_gas_price,
COUNT(*) AS tx_count,
SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) AS failed_count
FROM ethereum_transactions
WHERE block_time >= NOW() - INTERVAL '1 day'
GROUP BY 1
ORDER BY 1Troubleshooting
- "Table not found": Verify table names and access
- "Query timeout": Add time filters and adjust granularity
- "No results returned": Check filters and date ranges
- "Column doesn't exist": Browse schema and verify spelling/case
Integration with Dashboards
- Save generated SQL to your query library
- Create visualizations from AI-suggested chart types
- Schedule reports and set up alerts based on query results
Updated 15 days ago