Overview
Connect your agents to databases for intelligent data retrieval and analysis. MagOneAI provides two approaches to database access:- Natural language to SQL — agents ask questions in plain language, converted to SQL automatically
- Direct SQL execution — agents execute SQL queries directly for precise control
Vanna Text-to-SQL
Vanna is an AI-powered text-to-SQL system that converts natural language questions into SQL queries. This makes databases accessible to agents without requiring them to understand complex schema details.How it works
User or agent asks a question
Natural language question: “What were our top 5 customers by revenue last quarter?”
Query executes against database
MagOneAI executes the generated SQL against your configured database connection.
Supported databases
Vanna supports major SQL databases:- PostgreSQL — production-grade relational database
- MySQL — widely used open-source database
- Microsoft SQL Server — enterprise database system
- SQLite — lightweight embedded database
- MariaDB — MySQL-compatible database
- Amazon Redshift — cloud data warehouse
- Google BigQuery — serverless data warehouse
- Snowflake — cloud data platform
Training Vanna on your schema
For best accuracy, train Vanna on your specific database schema. This helps it understand:- Table relationships and foreign keys
- Common query patterns in your domain
- Business logic and calculated fields
- Naming conventions and terminology
How to train Vanna
How to train Vanna
Training Vanna improves query accuracy by teaching it about your database:
- Automatic schema extraction: Vanna reads your database schema (tables, columns, types, relationships)
- Example queries: Provide example question-SQL pairs for common queries
- Business terminology: Define domain-specific terms (e.g., “ARR” means “Annual Recurring Revenue”)
- Query patterns: Show Vanna how you typically join tables or calculate metrics
Limitations of text-to-SQL
While powerful, text-to-SQL has limitations:- Ambiguous questions may generate incorrect SQL
- Complex analytical queries may require human-written SQL for accuracy
- Schema changes require retraining for best results
- Performance depends on training quality and schema complexity
CSV-to-SQL tool
Query CSV files using SQL without setting up a database server. This is useful for:- Ad-hoc data analysis on uploaded files
- Prototyping workflows before connecting to production databases
- Processing data exports from external systems
How it works
- Upload CSV file to MagOneAI (via UI or API)
- File is loaded into SQLite automatically with inferred schema
- Query with SQL using standard SELECT statements
- Results returned in structured format
Use cases
- Analyze sales data exports from e-commerce platforms
- Process survey results from CSV downloads
- Join multiple CSV files for cross-dataset analysis
- Validate data quality before database import
CSV-to-SQL uses an ephemeral SQLite database. Files are not permanently stored — they’re loaded for the duration of the workflow execution.
Database connection configuration
Connect agents to your databases securely using the configuration workflow.Add database connection in project settings
Navigate to Project Settings → Integrations → Databases and click Add Connection.
Configure connection string
Provide connection details:
- Connection type: PostgreSQL, MySQL, etc.
- Host and port: Database server address
- Database name: Target database
- Username and password: Database credentials
Train text-to-SQL model (optional)
For Vanna text-to-SQL, train the model on your schema and example queries. This improves accuracy significantly. Training is optional but highly recommended.
Attach database tools to agents
In agent configuration, enable database tools and specify which database connections the agent can access.
Connection string format
Connection strings follow standard database URL formats: PostgreSQL:vault:database/production_db.
Security considerations
Database access by AI agents requires careful security design. Follow these best practices to protect your data.Use read-only credentials
Create a dedicated read-only database user: PostgreSQL example:Connection credentials in Vault
All database credentials are stored in HashiCorp Vault:- Encrypted at rest with AES-256
- Access controlled by role-based permissions
- Automatic key rotation supported
- Audit logging for all credential access
- Workflow configurations (reference by name only)
- Agent logs or traces
- API responses
- User interfaces
Query audit logging
MagOneAI logs all database queries executed by agents:- What query was executed (full SQL text)
- Who executed it (which agent, which user, which workflow)
- When it was executed (timestamp)
- What data was accessed (tables and columns)
- Query result metadata (row count, execution time)
Row-level security
Where supported by your database, implement row-level security (RLS) to restrict which data agents can access. PostgreSQL RLS example:Network security
Restrict database access at the network level:- Firewall rules: Allow connections only from MagOneAI platform IP addresses
- VPC peering: For cloud databases, use VPC peering to avoid exposing databases to the public internet
- SSL/TLS: Require encrypted connections to the database
- Private endpoints: Use private database endpoints when available
Prevent SQL injection
While MagOneAI uses parameterized queries to prevent SQL injection, additional safeguards:- Validate agent-generated SQL before execution (especially with text-to-SQL)
- Set query timeouts to prevent resource exhaustion
- Limit result set sizes to prevent memory issues
- Whitelist tables and columns that agents can access
Usage in workflows
Database tools can be used in both Agent nodes (intelligent query construction) and Tool nodes (predefined queries).Agent node with text-to-SQL
System prompt:- Uses Vanna to convert the question to SQL
- Executes the generated query
- Receives results (product names and revenue figures)
- Formats the answer: “Last month’s top products by revenue were…”
Tool node with direct SQL
For predefined, recurring queries, use a Tool node with direct SQL: Tool: Execute SQL Query Connection:production_db_readonly
Query:
Combining database and other tools
Database + Web Search:- Query internal sales data from database
- Search web for competitor pricing
- Compare and generate market positioning analysis
- Query database for weekly metrics
- Format results into report
- Email report to stakeholders
- Query database for customer information
- Search internal documents for customer history
- Synthesize complete customer profile
Performance optimization
Database queries can be slow or resource-intensive. Optimize for agent use:Query timeouts
Set maximum query execution time to prevent long-running queries:- Default timeout: 30 seconds
- Configurable per connection in project settings
- Queries exceeding timeout are cancelled automatically
Result set limits
Limit the number of rows returned to prevent memory issues:- Default limit: 1,000 rows
- Configurable per connection
- Agents should use pagination for large datasets
Query caching
Enable query result caching for frequently repeated queries:- Cache TTL: configurable (default 5 minutes)
- Cache key: query text + parameters
- Automatic invalidation on cache expiry
Indexing
Ensure database tables are properly indexed for common query patterns:- Index foreign keys used in joins
- Index columns used in WHERE clauses
- Composite indexes for multi-column filters
- Analyze query execution plans regularly
Troubleshooting
Connection fails
Connection fails
Symptoms: “Unable to connect to database” or timeout errors.Solutions:
- Verify connection string is correct (host, port, database name)
- Check database credentials are valid
- Ensure MagOneAI platform IPs are whitelisted in firewall
- Verify database is running and accepting connections
- Check SSL/TLS settings if required
Text-to-SQL generates incorrect queries
Text-to-SQL generates incorrect queries
Symptoms: Query results don’t match the question asked.Solutions:
- Train Vanna with example question-SQL pairs for your schema
- Provide clearer, more specific questions
- Define business terminology in training data
- For complex queries, use direct SQL execution instead
- Review generated SQL before execution (enable SQL preview in settings)
Queries timeout or run slowly
Queries timeout or run slowly
Symptoms: “Query timeout” errors or very slow response times.Solutions:
- Add indexes to tables for columns used in WHERE and JOIN clauses
- Reduce result set size with LIMIT clauses
- Optimize query structure (avoid SELECT *, use specific columns)
- Increase query timeout in connection settings if appropriate
- Consider materialized views for complex aggregations
Permission denied errors
Permission denied errors
Symptoms: “Access denied” or “Permission denied” when executing queries.Solutions:
- Verify database user has necessary SELECT permissions
- Check row-level security policies aren’t blocking access
- Ensure database user can access the specific tables being queried
- Review database audit logs for detailed permission errors
CSV upload fails
CSV upload fails
Symptoms: CSV-to-SQL tool returns errors when loading files.Solutions:
- Verify CSV is properly formatted (headers in first row)
- Check for encoding issues (use UTF-8)
- Ensure column names don’t have special characters
- Reduce file size if exceeding limits (max 50MB)
- Validate data types can be inferred correctly