What you’ll build

You’ll build an intelligent IT support triage system that automatically classifies incoming tickets, attempts automated resolution of common issues using a RAG-powered knowledge base, and escalates complex problems to the appropriate specialist team with full context. This can cut resolution time for common issues from hours to seconds and gets complex issues to the right experts quickly.

This workflow demonstrates how to:
  • Automatically classify and route IT support tickets
  • Resolve common issues using knowledge base-powered automation
  • Implement smart escalation based on issue complexity and severity
  • Provide 24/7 first-line support for employees
  • Build confidence-based routing for quality assurance

What the system handles

Automated resolution for common issues:
  • Password resets and account unlocks
  • VPN connection problems
  • Email configuration
  • Software installation instructions
  • Printer setup and troubleshooting
  • Network connectivity issues
  • Standard access requests
Smart escalation for complex issues:
  • Hardware failures
  • Security incidents
  • Critical system outages
  • Advanced software issues
  • Custom application problems
  • Database issues

Prerequisites

Before you begin, ensure you have:
  • MagOneAI instance with workflow builder
  • Knowledge base with IT runbooks (optional but highly recommended):
    • Troubleshooting guides
    • How-to documentation
    • Common issue resolutions
    • Configuration instructions
  • Ticketing system integration (optional) - Jira, ServiceNow, Zendesk, etc.
  • LLM provider configured (GPT-4, Claude 3.5 Sonnet recommended)
  • Notification tools - Email, Slack, MS Teams
Start with a knowledge base containing your top 20 most common IT issues and their solutions. You can expand over time as you identify more patterns.

Architecture

The IT support workflow uses classification-based routing with automated resolution attempts:
Trigger (Ticket/Chat submission)

Triage Agent (Classify: category, severity, complexity)

Condition Node: Known issue with solution?
    ├── YES → Resolution Agent (with RAG) → Auto-reply to user
    └── NO → Condition Node: Severity level?
            ├── CRITICAL → Escalate to L2 + Alert on-call
            └── NORMAL → Create ticket with context + Notify L1 queue

Why this architecture works

Fast Resolution

Common issues resolved in under 30 seconds vs. 2-8 hour wait for human support

Right Routing

Complex issues reach specialists immediately with full context, not bounced between teams

24/7 Coverage

First-line support available anytime, reducing pressure on on-call staff

Learning System

Knowledge base grows over time, handling more issues automatically

Step-by-step build

Step 1: Create IT knowledge base with runbooks

Build a knowledge base of IT solutions and procedures.
Gather documentation:
  1. Common Issue Resolutions
    • Password reset procedures
    • VPN troubleshooting steps
    • Email setup guides (Outlook, Gmail, mobile)
    • Wi-Fi connection issues
    • Printer setup and common errors
    • Software installation guides
    • Browser troubleshooting
  2. Access Request Procedures
    • How to request software licenses
    • Access control procedures
    • VPN access requests
    • Admin rights request process
    • Shared drive access
  3. Security Procedures
    • Phishing reporting
    • Lost device protocol
    • Security incident reporting
    • Data breach response
    • Suspicious activity reporting
  4. Hardware Guides
    • Laptop setup guides
    • Monitor and peripheral setup
    • Mobile device configuration
    • Hardware replacement process
    • Warranty and repair procedures
Create the knowledge base:
  1. Navigate to Knowledge Bases → Create New
  2. Name: “IT Support Runbooks”
  3. Upload documents:
    • IT_Runbook_Common_Issues.pdf
    • VPN_Troubleshooting_Guide.pdf
    • Email_Configuration_Guide.pdf
    • Network_Troubleshooting.pdf
    • Access_Request_Procedures.pdf
    • Security_Incident_Procedures.pdf
  4. Configure chunking: Automatic (procedure-based chunking works well)
  5. Wait for processing to complete
Runbook structure for RAG:
# Issue: Cannot Connect to VPN

## Symptoms
- VPN client shows "Connection failed"
- Error message: "Authentication failed"
- Unable to access internal resources

## Common Causes
1. Expired password
2. VPN client out of date
3. Network blocking VPN ports
4. VPN account not activated

## Resolution Steps

### Step 1: Verify Credentials
1. Ensure you're using your network username (not email)
2. Check if password has expired
3. Try logging into the employee portal to verify password

### Step 2: Update VPN Client
1. Open VPN client
2. Go to Help → Check for Updates
3. If update available, install and restart

### Step 3: Check Network Connection
1. Ensure you're connected to internet
2. Try accessing a website (e.g., google.com)
3. If on corporate guest Wi-Fi, VPN may be blocked

### Step 4: Verify VPN Account Status
1. New employees: VPN activated 24h after start date
2. Check with IT if account is active
3. Contractors: Ensure VPN access was requested

## When to Escalate
- None of the above steps work
- Error message is different from above
- VPN was working before and suddenly stopped

## Related Issues
- See also: Network Connectivity Issues
- See also: Password Reset Procedures
This structure helps RAG retrieve the right information at the right time.
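
The runbook layout above lends itself to section-level chunking. The following is a minimal Python sketch of that idea, not MagOneAI’s actual chunker. It assumes runbooks follow the Markdown-style layout shown (one `#` title, `##` sections), and the `split_runbook` name and the chunk fields are hypothetical.

```python
def split_runbook(markdown: str) -> list[dict]:
    """Split a runbook into one chunk per '## ' section, tagged with the issue title."""
    title = ""
    chunks: list[dict] = []
    current = None
    for line in markdown.splitlines():
        if line.startswith("# "):
            # Top-level heading, e.g. "# Issue: Cannot Connect to VPN"
            title = line[2:].strip()
        elif line.startswith("## "):
            # Each second-level heading starts a new chunk
            current = {"issue": title, "section": line[3:].strip(), "lines": []}
            chunks.append(current)
        elif current is not None:
            # Everything else (including "### Step N" lines) stays with its section
            current["lines"].append(line)
    return [
        {"issue": c["issue"], "section": c["section"], "text": "\n".join(c["lines"]).strip()}
        for c in chunks
    ]
```

Keeping the issue title on every chunk means a retrieved "Resolution Steps" section still says which problem it belongs to.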

Step 2: Create the triage agent

Build an agent that classifies incoming support requests.
Agent configuration:
Name: IT Support Triage Agent
Model: GPT-4 or Claude 3.5 Sonnet
Persona:
You are an IT support triage specialist who classifies incoming IT support requests.

Analyze the support request and classify it across four dimensions:

1. CATEGORY (primary issue type):
   - NETWORK: Wi-Fi, VPN, connectivity issues
   - ACCESS: Passwords, permissions, account access
   - SOFTWARE: Application issues, installations, licenses
   - HARDWARE: Laptop, monitor, peripherals, phone
   - EMAIL: Email configuration, delivery, mailbox issues
   - SECURITY: Phishing, malware, suspicious activity, lost device
   - PRINTING: Printer setup, connection, print queue
   - OTHER: Anything not clearly fitting above categories

2. SEVERITY:
   - CRITICAL: System down, security incident, business-blocking issue
   - HIGH: Major functionality impaired, affecting multiple users
   - MEDIUM: Issue affecting single user, workaround available
   - LOW: Minor issue, feature request, question

3. RESOLVABILITY (can this be auto-resolved?):
   - AUTO: Common issue with documented solution (e.g., VPN setup, password reset)
   - ESCALATE: Requires human expertise or special access
   - UNCERTAIN: Need more information to determine

4. CONFIDENCE (how confident are you in this classification?):
   - HIGH: Clear, well-described issue matching known patterns
   - MEDIUM: Issue description is somewhat vague but likely classification
   - LOW: Unclear description, ambiguous symptoms

Output structured JSON:
{
  "category": "NETWORK",
  "severity": "MEDIUM",
  "resolvability": "AUTO",
  "confidence": "HIGH",
  "summary": "User unable to connect to VPN - authentication failed error",
  "extracted_info": {
    "error_message": "Authentication failed",
    "affected_system": "VPN",
    "user_platform": "Windows laptop"
  },
  "reasoning": "Clear VPN connection issue with specific error message. This is a common issue with documented troubleshooting steps."
}

Important:
- ALWAYS classify SECURITY issues as CRITICAL severity
- Be conservative with AUTO resolvability - when in doubt, choose ESCALATE
- If you need more information, mark as UNCERTAIN and suggest questions to ask
Configuration:
  • Temperature: 0.2 (consistent classification)
  • Structured output: Enabled (JSON schema)
  • Timeout: 15 seconds
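
Even with structured output enabled, it can help to validate the triage result before routing on it. The sketch below is one possible check in Python, assuming the JSON shape shown in the persona. The `parse_triage` helper is hypothetical and is not part of MagOneAI.

```python
import json

# Allowed values, copied from the triage persona above
ALLOWED = {
    "category": {"NETWORK", "ACCESS", "SOFTWARE", "HARDWARE",
                 "EMAIL", "SECURITY", "PRINTING", "OTHER"},
    "severity": {"CRITICAL", "HIGH", "MEDIUM", "LOW"},
    "resolvability": {"AUTO", "ESCALATE", "UNCERTAIN"},
    "confidence": {"HIGH", "MEDIUM", "LOW"},
}

def parse_triage(raw: str) -> dict:
    """Parse the agent's JSON and reject values outside the documented enums."""
    result = json.loads(raw)
    for field, allowed in ALLOWED.items():
        if result.get(field) not in allowed:
            raise ValueError(f"invalid {field}: {result.get(field)!r}")
    # Enforce the persona rule in code as a safety net:
    # SECURITY issues are always CRITICAL.
    if result["category"] == "SECURITY":
        result["severity"] = "CRITICAL"
    return result
```

A rejected result can be treated like a low-confidence classification and sent to the normal ticket queue.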

Step 3: Create the resolution agent

Build an agent that attempts to resolve issues using the knowledge base.
Agent configuration:
Name: IT Issue Resolution Agent
Model: GPT-4 or Claude 3.5 Sonnet
Knowledge Base: IT Support Runbooks (RAG enabled)
Persona:
You are an IT support specialist who helps users resolve technical issues using documented procedures.

Your task:
1. Search the IT knowledge base for solutions to the reported issue
2. Provide clear, step-by-step instructions
3. Include troubleshooting steps in logical order
4. Explain WHY each step helps (builds user understanding)
5. Specify when to escalate if steps don't work
6. Be encouraging and patient

Response format:

Hi [User Name],

I can help you with this issue. Here's what to try:

[Step-by-step instructions from knowledge base]

**What to expect:**
[What should happen if steps work]

**If this doesn't work:**
[When to escalate and how]

**Why this happens:**
[Brief explanation of the issue - helps prevent recurrence]

Let me know if this resolves the issue or if you need further assistance!

Best regards,
IT Support

Important guidelines:
- Only provide solutions from the knowledge base (no hallucination)
- If no solution in KB, say so and escalate
- Include screenshots or links if referenced in KB
- Adapt technical language to user's expertise level
- Always cite the source document
- If multiple solutions exist, provide the most common/simple one first
Configuration:
  • Temperature: 0.3 (balanced creativity and consistency)
  • RAG settings:
    • Retrieve: 5 chunks
    • Relevance threshold: 0.75
    • Enable reranking
  • Max tokens: 1000
  • Timeout: 30 seconds
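
To make the RAG settings concrete, here is a small Python sketch of how a chunk limit and a relevance threshold interact. It is an illustration only: it assumes each candidate arrives as a `(text, score)` pair where a higher score means more relevant, and that the threshold is inclusive. How MagOneAI applies these settings internally may differ.

```python
def select_chunks(scored, top_k=5, threshold=0.75):
    """Keep chunks scoring at or above the threshold, best first, at most top_k."""
    relevant = [(text, score) for text, score in scored if score >= threshold]
    relevant.sort(key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in relevant[:top_k]]
```

An empty result corresponds to the "no solution in KB" case in the persona, where the agent should say so and escalate rather than improvise an answer.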

Step 4: Build the workflow with conditional routing

Construct the triage and routing workflow.
Node 1: Trigger
  • Type: Manual Trigger, API Trigger, or Email Trigger
  • Inputs:
    • user_name (text)
    • user_email (text)
    • issue_description (text, long)
    • ticket_id (text, optional - if integrated with ticketing system)
Node 2: Triage Agent
  • Agent: IT Support Triage Agent
  • Input: {{trigger.issue_description}}
  • Output variable: triage_result
Node 3: Condition - Known Issue?
  • Condition:
    {{triage_result.resolvability}} == "AUTO"
    AND
    {{triage_result.confidence}} == "HIGH"
    
  • True branch: Attempt auto-resolution
  • False branch: Go to severity check
Node 4 (True branch): Resolution Agent
  • Agent: IT Issue Resolution Agent
  • Input:
    User Name: {{trigger.user_name}}
    Issue: {{trigger.issue_description}}
    Category: {{triage_result.category}}
    Summary: {{triage_result.summary}}
    
  • Enable RAG: Yes
  • Output variable: resolution
Node 5 (True branch): Send Auto-Reply
  • Tool: Email or Chat response
  • To: {{trigger.user_email}}
  • Subject: Re: Your IT Support Request - Possible Solution
  • Body:
    {{resolution.output}}
    
    ---
    Ticket ID: {{trigger.ticket_id}}
    This is an automated response. If the issue persists, reply to this email and it will be escalated to our support team.
    
Node 6 (False branch): Condition - Severity Check
  • Condition:
    {{triage_result.severity}} == "CRITICAL"
    
  • True branch: Critical escalation
  • False branch: Normal ticket creation
Node 7 (Critical branch): Alert On-Call Team
  • Tool: Notification (Slack, PagerDuty, Email)
  • Message:
    🚨 CRITICAL IT ISSUE
    
    Ticket: {{trigger.ticket_id}}
    User: {{trigger.user_name}}
    Category: {{triage_result.category}}
    Issue: {{triage_result.summary}}
    
    Full Description:
    {{trigger.issue_description}}
    
    Action Required: Immediate response needed
    
Node 8 (Normal branch): Create Ticket with Context
  • Tool: Ticketing system API or Email to IT queue
  • Create ticket with:
    Title: [{{triage_result.category}}] {{triage_result.summary}}
    Description: {{trigger.issue_description}}
    Reporter: {{trigger.user_name}} ({{trigger.user_email}})
    Priority: {{triage_result.severity}}
    Labels: {{triage_result.category}}, auto-triaged
    
    AI Triage Analysis:
    - Category: {{triage_result.category}}
    - Severity: {{triage_result.severity}}
    - Confidence: {{triage_result.confidence}}
    - Reasoning: {{triage_result.reasoning}}
    
Node 9 (Both escalation branches): Notify User
  • Tool: Email
  • Subject: Your IT Support Request Has Been Received
  • Body:
    Hi {{trigger.user_name}},
    
    Thank you for contacting IT support.
    
    Your issue has been classified as:
    - Category: {{triage_result.category}}
    - Priority: {{triage_result.severity}}
    - Ticket ID: {{trigger.ticket_id}}
    
    {{#if triage_result.severity == "CRITICAL"}}
    This is a critical issue and our on-call team has been alerted. You should receive a response within 30 minutes.
    {{else}}
    Your ticket has been assigned to our support team. Expected response time: 4-8 business hours.
    {{/if}}
    
    We'll update you as soon as we have more information.
    
    Best regards,
    IT Support Team
    
This multi-branch workflow ensures every request is handled appropriately: auto-resolved, escalated urgently, or queued normally with full context.
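
The two condition nodes (Node 3 and Node 6) reduce to a short decision function. The Python sketch below mirrors that logic for reasoning and testing outside the workflow builder. The `route` function and its return labels are hypothetical names.

```python
def route(triage: dict) -> str:
    """Mirror Node 3 (known issue?) and Node 6 (severity check)."""
    # Node 3: auto-resolve only when the issue is AUTO and confidence is HIGH
    if triage["resolvability"] == "AUTO" and triage["confidence"] == "HIGH":
        return "auto_resolve"
    # Node 6: everything else is split by severity
    if triage["severity"] == "CRITICAL":
        return "critical_escalation"
    return "normal_ticket"
```

Note that an AUTO classification with MEDIUM or LOW confidence falls through to the severity check, so it is handled by a human.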

Step 5: Add confidence-based quality assurance

Ensure auto-resolved issues were actually resolved.
Enhancement: Follow-up confirmation
After auto-resolution (Node 5), add:
Node 5a: Wait Node
  • Wait: 4 hours
Node 5b: Send Follow-up Email
  • Subject: Follow-up: Was your IT issue resolved?
  • Body:
    Hi {{trigger.user_name}},
    
    Earlier today, we provided automated assistance for your IT issue:
    "{{triage_result.summary}}"
    
    Was this resolved successfully?
    
    [Yes, resolved] [No, still having issues]
    
    If you click "No" or don't respond within 24 hours, we'll escalate to our support team.
    
    Thanks,
    IT Support
    
Node 5c: Wait for Response
  • Timeout: 24 hours
  • If no response or “No” clicked: Create ticket (escalate)
  • If “Yes” clicked: Close ticket, mark as successfully auto-resolved
This creates a quality assurance loop and catches cases where the auto-resolution didn’t work.
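
The follow-up decision in Node 5c can be sketched as a small function. This is an illustrative Python version, assuming the user's reply is captured as `"yes"`, `"no"`, or `None` (no reply yet). The function name and return labels are hypothetical.

```python
from typing import Optional

def follow_up_action(response: Optional[str], hours_waited: float,
                     timeout_hours: float = 24) -> str:
    """Decide what to do after the follow-up email has been sent."""
    if response == "yes":
        return "close"      # mark as successfully auto-resolved
    if response == "no":
        return "escalate"   # create a ticket for the support team
    if hours_waited >= timeout_hours:
        return "escalate"   # silence is treated as unresolved
    return "wait"
```

Treating silence as unresolved errs on the side of creating a ticket, which matches the conservative stance of the triage persona.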

Step 6: Test with common IT scenarios

Validate the workflow with realistic support requests.
Test Case 1: Common issue (VPN)
  • Input:
    User: John Smith
    Email: [email protected]
    Issue: "I can't connect to the VPN. It says 'Authentication failed'. I'm working from home on my company laptop."
    
  • Expected:
    • Triage: NETWORK, MEDIUM severity, AUTO resolvable, HIGH confidence
    • Resolution: VPN troubleshooting steps from knowledge base
    • Output: Auto-reply with step-by-step instructions
  • Verify:
    • ✅ Correct classification
    • ✅ Relevant KB article retrieved
    • ✅ Clear instructions provided
    • ✅ Follow-up scheduled
Test Case 2: Critical security issue
  • Input:
    User: Sarah Johnson
    Email: [email protected]
    Issue: "I clicked on a link in an email and now my computer is acting strange. Lots of pop-ups and it's very slow. I think it might be malware."
    
  • Expected:
    • Triage: SECURITY, CRITICAL severity, ESCALATE, HIGH confidence
    • Route: Immediate alert to on-call
    • Output: Critical alert sent + user notified of urgent handling
  • Verify:
    • ✅ Correctly identified as security incident
    • ✅ Marked as CRITICAL
    • ✅ On-call team alerted immediately
    • ✅ User notified of urgent response
Test Case 3: Hardware issue (escalation)
  • Input:
    User: Mike Chen
    Email: [email protected]
    Issue: "My laptop won't turn on. I've tried charging it overnight but still nothing. The power light doesn't come on at all."
    
  • Expected:
    • Triage: HARDWARE, HIGH severity, ESCALATE, HIGH confidence
    • Route: Create ticket with context
    • Output: Ticket created + user notified
  • Verify:
    • ✅ Correctly classified as hardware
    • ✅ Marked for escalation (can’t auto-resolve hardware)
    • ✅ Ticket includes full triage analysis
    • ✅ User notified with expected timeline
Test Case 4: Vague description (low confidence)
  • Input:
    User: Lisa Park
    Email: [email protected]
    Issue: "My computer is slow"
    
  • Expected:
    • Triage: OTHER/SOFTWARE, MEDIUM severity, UNCERTAIN, LOW confidence
    • Route: Create ticket (don’t attempt auto-resolution due to low confidence)
    • Output: Ticket created asking for more information
  • Verify:
    • ✅ Doesn’t attempt auto-resolution (low confidence)
    • ✅ Ticket includes request for more details
    • ✅ User asked to provide additional context
Test Case 5: Password reset
  • Input:
    User: David Lee
    Email: [email protected]
    Issue: "I forgot my password and I'm locked out. How do I reset it?"
    
  • Expected:
    • Triage: ACCESS, MEDIUM severity, AUTO, HIGH confidence
    • Resolution: Password reset instructions from KB
    • Output: Auto-reply with self-service reset link and instructions
  • Verify:
    • ✅ Auto-resolved with password reset procedure
    • ✅ Includes link to self-service portal
    • ✅ Clear instructions provided
Monitor false positives closely: these are tickets marked AUTO that could not actually be auto-resolved. If you see too many failed auto-resolutions, tighten the conditions for auto-resolution or improve the relevant runbooks.
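
One way to quantify failed auto-resolutions is the share of auto-routed tickets that the user did not confirm as resolved. The Python sketch below assumes each ticket record carries a `route` label and a `confirmed_resolved` flag. Both field names are hypothetical and depend on how you store feedback.

```python
def failed_auto_rate(tickets: list[dict]) -> float:
    """Fraction of auto-resolved tickets that were not confirmed as resolved."""
    auto = [t for t in tickets if t["route"] == "auto_resolve"]
    if not auto:
        return 0.0  # nothing was auto-resolved, so nothing failed
    failed = sum(1 for t in auto if not t["confirmed_resolved"])
    return failed / len(auto)
```

Tracking this rate per category shows which runbooks or triage rules need attention first.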

Step 7: Integrate with ticketing system (optional)

Connect to your existing ticketing system for seamless operations.
Supported integrations:
  • Jira Service Management
  • ServiceNow
  • Zendesk
  • Freshdesk
  • Custom ticketing systems (via API)
Integration approach:
  1. Add API Tool for your ticketing system
    • Configure authentication (API key, OAuth)
    • Map fields (title, description, priority, category)
  2. Modify workflow nodes:
    • Node 8 (Create Ticket): Use ticketing API instead of email
    • Trigger: Accept tickets from ticketing system webhook
    • Auto-resolution: Update ticket status to “Resolved - Auto”
    • Escalation: Update ticket and assign to appropriate team
  3. Bidirectional sync:
    • Ticket updates from system → MagOneAI
    • MagOneAI resolution → Update ticket
    • Comments and attachments synced
Benefits:
  • Centralized ticket management
  • No duplicate systems
  • Full audit trail
  • Reporting and analytics
  • SLA tracking
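
The field mapping step can be prototyped before wiring up a real API. The Python sketch below builds a generic ticket payload from the trigger inputs and the triage result, following the Node 8 template. The payload keys are placeholders, not the schema of Jira, ServiceNow, or any other system, so adapt them to your ticketing API.

```python
def build_ticket(trigger: dict, triage: dict) -> dict:
    """Map workflow variables to a generic ticket payload (hypothetical schema)."""
    return {
        "title": f"[{triage['category']}] {triage['summary']}",
        "description": trigger["issue_description"],
        "reporter": f"{trigger['user_name']} ({trigger['user_email']})",
        "priority": triage["severity"],
        "labels": [triage["category"], "auto-triaged"],
        # Keep the AI analysis with the ticket so the assignee sees the reasoning
        "triage_analysis": {
            "confidence": triage["confidence"],
            "reasoning": triage["reasoning"],
        },
    }
```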

Key concepts demonstrated

Conditional Routing

Dynamically route requests based on AI classification, severity, and confidence scores

RAG with Technical Docs

Use knowledge bases of runbooks and procedures to power automated resolutions

Multi-Branch Decision Trees

Build complex routing logic: auto-resolve vs. escalate vs. critical alert

Tool Integration

Connect to ticketing systems, notification tools, and communication platforms

Confidence-Based Actions

Make routing decisions based on AI confidence to balance automation and quality

Quality Assurance Loops

Follow up on auto-resolutions to ensure they actually worked

Customization ideas

Extend this IT support workflow to handle more scenarios:
Create a specialized workflow for VPN troubleshooting:
VPN Diagnostic Agent:
  • Asks clarifying questions: OS? Error message? Home or office network?
  • Runs through decision tree of common VPN issues
  • Provides specific solutions based on answers
  • Escalates if all diagnostics fail
Implementation:
  • When triage identifies VPN issue, route to VPN diagnostic workflow
  • Multi-turn conversation to gather diagnostic info
  • Step-by-step troubleshooting with user feedback
  • Final resolution or escalation with full diagnostic context
Benefits:
  • Higher auto-resolution rate for VPN (common issue)
  • Better context when escalating
  • Educates users on self-service diagnostics

Proactive issue detection and resolution:
Monitoring integration:
  • Connect to Datadog, New Relic, Nagios, etc.
  • Detect patterns: “Multiple VPN failures from specific location”
  • Trigger proactive workflows
Proactive support:
  • Alert: “VPN server degraded performance”
  • Action: Pre-emptively notify affected users
  • Message: “We’re aware of VPN issues and working on it”
  • Reduce incoming tickets by communicating known issues
Auto-remediation:
  • Detect: “Printer offline”
  • Action: Restart print spooler service
  • Verify: Check if printer is back online
  • Notify: “Printer issue resolved automatically”
Benefits:
  • Shift from reactive to proactive support
  • Reduce ticket volume
  • Improve user experience
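
The detect, act, verify, notify sequence for auto-remediation can be expressed as a small loop. The Python sketch below takes the health check, the fix, and the notifier as plain callables, so it makes no assumptions about any monitoring tool's API. All names are hypothetical.

```python
from typing import Callable

def remediate(is_healthy: Callable[[], bool],
              fix: Callable[[], None],
              notify: Callable[[str], None],
              message: str = "Printer issue resolved automatically") -> str:
    """Run one detect -> act -> verify -> notify cycle."""
    if is_healthy():
        return "healthy"      # Detect: nothing to do
    fix()                     # Action: e.g. restart the print spooler service
    if is_healthy():          # Verify: check again after the fix
        notify(message)       # Notify: tell affected users
        return "remediated"
    return "escalate"         # The fix did not help, hand over to a human
```

The function notifies only after the verification step passes, so users are not told an issue is fixed when it is not.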

Continuously improve triage and resolution:
Feedback collection:
  • After resolution: “Did this solve your problem? Yes/No”
  • After escalation: Support agent marks if triage was correct
  • Track: Which categories/issues have low auto-resolution success
Analytics:
  • Auto-resolution success rate by category
  • Triage accuracy (were classifications correct?)
  • Knowledge base gap analysis (common issues without solutions)
  • Average time to resolution (auto vs. human)
Improvement cycle:
  • Weekly review of misclassified tickets
  • Add missing solutions to knowledge base
  • Refine triage agent persona
  • Update category definitions
  • A/B test triage prompt changes
Implementation:
  • Add feedback collection nodes
  • Store feedback with ticket data
  • Build analytics dashboard
  • Schedule weekly review process

Device-specific troubleshooting:
Asset management integration:
  • Connect to asset management system (e.g., Jamf, InTune, custom DB)
  • Query user’s device info: model, OS, installed software, warranty
Enhanced triage:
  • “Laptop won’t turn on” + Asset lookup → Check warranty status
  • “Software won’t install” + Asset lookup → Check if compatible with OS version
  • “Slow computer” + Asset lookup → Check device age and specs
Automated actions:
  • Warranty valid → Create RMA ticket
  • Warranty expired → Offer replacement options
  • Incompatible software → Suggest alternatives
  • Old device → Flag for upgrade consideration
Benefits:
  • More accurate diagnostics
  • Faster resolution paths
  • Better inventory management
  • Proactive hardware refresh planning

Empower users to self-diagnose:
Self-service features:
  • Interactive troubleshooters: Wizard-style Q&A for common issues
  • Knowledge base search: User can search runbooks directly
  • Status dashboard: Show known issues and outages
  • Common requests: One-click password reset, VPN setup guide
Chat interface:
  • IT Support chatbot available on portal
  • Natural language: “How do I connect to Wi-Fi?”
  • Escalation: “Contact human support” button always visible
Workflow enhancement:
  • Portal interactions feed into triage workflow
  • Track: Which self-service articles are most used
  • Update: Keep popular articles current and clear
  • Measure: Reduction in ticket volume from self-service
Implementation:
  • Build web portal with chat widget
  • Connect to same MagOneAI agents and knowledge base
  • Add analytics tracking
  • Promote portal during onboarding

Support global teams:
Language detection:
  • Detect user’s language from issue description
  • Or detect from user profile in HRIS
Translation approach:
  • Translate incoming issue to English for triage
  • Process with English knowledge base
  • Translate resolution back to user’s language
Multilingual knowledge base:
  • Maintain runbooks in multiple languages
  • Route to language-specific KB based on user language
  • Ensure terminology consistency across languages
Supported languages:
  • Based on your employee demographics
  • Common: English, Spanish, French, German, Mandarin, Japanese
Implementation:
  • Add translation agents or API (Google Translate, DeepL)
  • Configure multilingual embedding models
  • Test thoroughly for accuracy
  • Have native speakers review critical translations

Ensure timely responses:
SLA tracking:
  • Define SLAs by severity:
    • CRITICAL: 30 minutes
    • HIGH: 4 hours
    • MEDIUM: 8 hours
    • LOW: 24 hours
Automated reminders:
  • 50% of SLA: Reminder to assigned agent
  • 75% of SLA: Escalate to team lead
  • 90% of SLA: Escalate to manager + alert
  • SLA breach: Create incident, notify leadership
Smart escalation:
  • If agent hasn’t responded, auto-reassign to available agent
  • If team is overloaded, offer self-service resources to user
  • If repeated escalations, suggest knowledge base improvement
Implementation:
  • Add timer nodes to track elapsed time
  • Condition nodes at SLA checkpoints
  • Notification nodes for reminders and escalations
  • Integrate with ticketing system SLA features
Reporting:
  • SLA compliance rate
  • Breaches by category and team
  • Average resolution time
  • Identify bottlenecks and training needs
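
The SLA targets and reminder thresholds listed above can be combined into a single checkpoint function. The following Python sketch uses the example SLA values from this section. The function name and action labels are hypothetical.

```python
# SLA targets from the list above, in minutes
SLA_MINUTES = {"CRITICAL": 30, "HIGH": 4 * 60, "MEDIUM": 8 * 60, "LOW": 24 * 60}

def sla_action(severity: str, elapsed_minutes: float) -> str:
    """Return the action due at the current point in the SLA window."""
    fraction = elapsed_minutes / SLA_MINUTES[severity]
    if fraction >= 1.0:
        return "breach"               # create incident, notify leadership
    if fraction >= 0.9:
        return "escalate_manager"     # escalate to manager + alert
    if fraction >= 0.75:
        return "escalate_team_lead"
    if fraction >= 0.5:
        return "remind_agent"
    return "none"
```

In a workflow, a timer node would call logic like this at each checkpoint and a condition node would branch on the result.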

Example triage results

Here’s what triage classifications look like:
Input: “I can’t connect to Wi-Fi at the office. It keeps asking for a password but won’t accept mine.”
Triage Output:
{
  "category": "NETWORK",
  "severity": "MEDIUM",
  "resolvability": "AUTO",
  "confidence": "HIGH",
  "summary": "Unable to connect to office Wi-Fi - password rejection",
  "extracted_info": {
    "network_type": "Office Wi-Fi",
    "symptom": "Password not accepted",
    "location": "Office"
  },
  "reasoning": "Common Wi-Fi issue. We have documented troubleshooting for corporate Wi-Fi authentication failures."
}
Action: Auto-resolve with Wi-Fi setup guide from knowledge base

Measuring success

Track these metrics to demonstrate value:
Efficiency metrics:
  • % of tickets auto-resolved (target: 30-40% for mature systems)
  • Average time to first response (auto vs. human)
  • Average time to resolution (auto vs. human)
  • % of tickets correctly triaged
  • Reduction in L1 ticket volume
Quality metrics:
  • Auto-resolution success rate (user confirms resolved)
  • Triage classification accuracy
  • Escalation appropriateness (were escalations necessary?)
  • SLA compliance rate
  • User satisfaction scores
Business impact:
  • IT team hours saved per month
  • Reduction in average resolution time
  • 24/7 coverage without additional headcount
  • Improved employee productivity (faster resolutions)
  • Knowledge base ROI (solutions reused via RAG)
Example success metrics after 3 months:
- 850 tickets processed
- 38% auto-resolved (323 tickets)
- Average auto-resolution time: 45 seconds
- Average human resolution time: 6.2 hours
- Estimated time saved: approximately 2,000 hours (323 tickets × 6.2 hours)
- User satisfaction: 4.1/5 for auto-resolutions
- Triage accuracy: 89%
- SLA compliance: 94% (up from 78% before automation)
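
The arithmetic behind the example figures can be reproduced in a few lines of Python. This is a sketch under one simplifying assumption: every auto-resolved ticket would otherwise have taken the average human resolution time, which is elapsed time rather than hands-on labor.

```python
tickets_processed = 850
auto_resolved_share = 0.38
avg_human_resolution_hours = 6.2

# 38% of 850 tickets, rounded to whole tickets (323)
auto_resolved = round(tickets_processed * auto_resolved_share)

# Elapsed hours avoided if each of those tickets had waited for a human
estimated_hours_saved = auto_resolved * avg_human_resolution_hours

print(f"Auto-resolved tickets: {auto_resolved}")
print(f"Estimated hours saved: {estimated_hours_saved:.0f}")
```

Substitute your own ticket counts and resolution times to build the same estimate for your environment.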

Next steps

Now that you’ve built an IT support triage workflow, explore the related cookbooks.
Need help integrating with your specific ticketing system or building custom diagnostic workflows? Contact our solutions team for IT automation guidance.