AI Assistant

The AI Assistant page provides professionals with a conversational interface to access key business insights—such as upcoming appointments, client no-show trends, and revenue summaries—without navigating multiple dashboard sections. Built on a function-calling LLM pipeline, it translates natural language queries into structured data retrieval, delivering accurate, context-aware responses grounded in real-time database records.

Overview

The assistant operates through a two-phase LLM interaction: first, the model decides which data tool to invoke based on the user’s query; then, after the server executes the relevant database query, the model generates a final response from the results. Because every figure in the reply comes from a query result rather than from the model’s memory, this architecture reduces the risk of hallucinated numbers. The system uses drizzle-orm to query a schema with 10 tables, including appointments, clients, and professionals, and is protected by rate limiting and plan-based access control. Three core tools power the assistant: get_stats, get_upcoming, and get_no_show_clients, each mapped to a specific analytical intent.

The /api/ai/chat endpoint handles all interactions, enforcing authentication via getUser() and authorization through canUseFeature(), which restricts access to Pro and Premium plans. Rate limiting is applied per user ID using getAiChatLimiter(), allowing 20 requests per hour—configured in the rateLimits object under aichat. The LLM backend is Groq’s llama-3.3-70b-versatile, chosen for low-latency function calling.
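The checks described above can be sketched as a single pure function. This is a minimal illustration, not the route's actual code: the check order, the HTTP status codes, the "free" plan name, and the RequestContext shape are all assumptions. Only the Pro/Premium restriction and the 20-per-hour limit come from this page.

```typescript
// Plan names "pro" and "premium" come from the page; "free" is an assumed third tier.
type Plan = "free" | "pro" | "premium";

type Allowed = { ok: true };
type Denied = { ok: false; status: 401 | 403 | 429; reason: string };

// Hypothetical snapshot of what the route knows before calling the LLM.
interface RequestContext {
  userId: string | null; // null when no authenticated user was found
  plan: Plan;
  requestsThisHour: number; // requests already counted in the current window
}

const HOURLY_LIMIT = 20; // from the page: 20 requests per hour

// Assumed order: authenticate, then check the plan, then check the rate limit.
function checkAccess(ctx: RequestContext): Allowed | Denied {
  if (ctx.userId === null) {
    return { ok: false, status: 401, reason: "not signed in" };
  }
  if (ctx.plan === "free") {
    return { ok: false, status: 403, reason: "plan does not include the assistant" };
  }
  if (ctx.requestsThisHour >= HOURLY_LIMIT) {
    return { ok: false, status: 429, reason: "rate limit reached" };
  }
  return { ok: true };
}
```

Running the cheap checks first means an unauthenticated or ineligible request never consumes rate-limit quota or reaches the LLM.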

Design decisions

The system uses a two-call LLM pattern to separate tool selection from response generation. The first prompt carries only the tool definitions and no raw data, which preserves token budget and reduces cost. Tool results are injected as tool-role messages, so the final LLM pass synthesizes its reply from just the records the selected tool returned.
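The two-call flow can be sketched as follows. This is an illustration under stated assumptions, not the code in route.ts: the Message, Llm, and Tools types and the answer function are hypothetical, and the message shapes are only loosely modeled on OpenAI-compatible chat APIs.

```typescript
// Hypothetical message shapes, loosely modeled on OpenAI-compatible chat APIs.
type ToolCall = { id: string; name: string; arguments: string };
type AssistantMessage = { role: "assistant"; content: string | null; toolCalls?: ToolCall[] };
type Message =
  | { role: "system" | "user"; content: string }
  | AssistantMessage
  | { role: "tool"; toolCallId: string; content: string };

// Hypothetical LLM client: takes the conversation, returns one assistant message.
type Llm = (messages: Message[]) => Promise<AssistantMessage>;

// Hypothetical tool registry: tool name -> function returning JSON-serializable data.
type Tools = Record<string, (args: unknown) => Promise<unknown>>;

async function answer(llm: Llm, tools: Tools, system: string, question: string): Promise<string> {
  const messages: Message[] = [
    { role: "system", content: system },
    { role: "user", content: question },
  ];

  // Call 1: the model either answers directly or requests one or more tools.
  const first = await llm(messages);
  if (!first.toolCalls || first.toolCalls.length === 0) {
    return first.content ?? "";
  }
  messages.push(first);

  // Run each requested tool and append its result as a tool-role message.
  for (const call of first.toolCalls) {
    const tool = tools[call.name];
    const result = tool
      ? await tool(JSON.parse(call.arguments || "{}"))
      : { error: `unknown tool: ${call.name}` };
    messages.push({ role: "tool", toolCallId: call.id, content: JSON.stringify(result) });
  }

  // Call 2: the model writes the final reply from the tool results.
  const second = await llm(messages);
  return second.content ?? "";
}
```

Injecting the LLM client as a function keeps the flow testable with a stub and independent of any particular SDK.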

Tool definitions are intentionally narrow, each serving a distinct query class. For example, get_stats handles aggregate trends, while get_no_show_clients is reserved for client-specific rankings. This prevents ambiguous tool routing and aligns with the system prompt’s explicit dispatch rules.
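To make the idea of narrow tools concrete, here is a sketch of how the three tools might be declared in the OpenAI-compatible function-calling format. The tool names come from this page; the descriptions, the limit parameter, and the ToolDefinition type are assumptions for illustration only.

```typescript
// Shape of one tool entry in an OpenAI-compatible chat completion request.
interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: {
      type: "object";
      properties: Record<string, unknown>;
      required: string[];
    };
  };
}

// Names are from the page; descriptions and parameters are illustrative guesses.
const tools: ToolDefinition[] = [
  {
    type: "function",
    function: {
      name: "get_stats",
      description: "Aggregate trends: appointments, revenue, and no-shows by month.",
      parameters: { type: "object", properties: {}, required: [] },
    },
  },
  {
    type: "function",
    function: {
      name: "get_upcoming",
      description: "List the professional's upcoming appointments.",
      parameters: { type: "object", properties: {}, required: [] },
    },
  },
  {
    type: "function",
    function: {
      name: "get_no_show_clients",
      description: "Rank individual clients by no-shows. Use only for per-client questions.",
      parameters: {
        type: "object",
        // Hypothetical parameter: how many clients to return.
        properties: { limit: { type: "integer", minimum: 1, maximum: 20 } },
        required: [],
      },
    },
  },
];
```

Descriptions that state when a tool should and should not be used give the model the same dispatch rules the system prompt spells out.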

Data fetching is batched per time window: getStats() retrieves all appointments from the last six months in one query, then computes monthly breakdowns in memory. This reduces database round-trips compared to one query per month, trading minimal memory overhead for significantly lower latency.
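The in-memory breakdown can be sketched like this. It is a minimal example, assuming each fetched row exposes a startsAt date and a status field; the real getStats() row shape and status values are not shown on this page, and months are bucketed in UTC for simplicity.

```typescript
// Assumed row shape; the actual schema fields may differ.
interface AppointmentRow {
  startsAt: Date;
  status: "completed" | "no_show" | "cancelled";
}

interface MonthSummary {
  month: string; // "YYYY-MM", in UTC
  total: number;
  noShows: number;
}

// Builds a sortable month key such as "2024-01" from a date, using UTC.
function monthKey(d: Date): string {
  const month = String(d.getUTCMonth() + 1).padStart(2, "0");
  return `${d.getUTCFullYear()}-${month}`;
}

// Groups rows that were fetched in one query into per-month totals.
function summarizeByMonth(rows: AppointmentRow[]): MonthSummary[] {
  const buckets = new Map<string, MonthSummary>();
  for (const row of rows) {
    const key = monthKey(row.startsAt);
    const bucket = buckets.get(key) ?? { month: key, total: 0, noShows: 0 };
    bucket.total += 1;
    if (row.status === "no_show") bucket.noShows += 1;
    buckets.set(key, bucket);
  }
  // "YYYY-MM" keys sort chronologically when compared as strings.
  return Array.from(buckets.values()).sort((a, b) => a.month.localeCompare(b.month));
}
```

A single pass over the rows keeps the memory overhead proportional to the number of months, not the number of appointments.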

Potential improvements

The getStats() function computes monthly aggregates in application memory; moving this to SQL could improve performance. For example, using date_trunc('month', starts_at) in a GROUP BY clause (via Drizzle’s sql template) would shift the work to PostgreSQL, reducing data transfer and simplifying logic.
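A rough sketch of the SQL-side aggregation follows. It uses plain SQL text and a row mapper rather than Drizzle's query builder, so it stays independent of the actual schema. The table name, the professional_id and status columns, and the 'no_show' value are assumptions; only starts_at is named on this page.

```typescript
// Assumed table and column names; only starts_at appears on this page.
// $1 = professional id, $2 = start of the six-month window.
const MONTHLY_STATS_SQL = `
  SELECT date_trunc('month', starts_at) AS month,
         count(*) AS total,
         count(*) FILTER (WHERE status = 'no_show') AS no_shows
  FROM appointments
  WHERE professional_id = $1
    AND starts_at >= $2
  GROUP BY 1
  ORDER BY 1
`;

// count(*) is a bigint in PostgreSQL, which many drivers return as a string,
// so the row type accepts both strings and numbers.
interface MonthlyRow {
  month: Date;
  total: string | number;
  no_shows: string | number;
}

// Converts driver rows into plain numbers and "YYYY-MM" labels.
// Assumes the database session time zone is UTC, so month boundaries line up.
function toMonthlyStats(rows: MonthlyRow[]) {
  return rows.map((r) => ({
    month: r.month.toISOString().slice(0, 7),
    total: Number(r.total),
    noShows: Number(r.no_shows),
  }));
}
```

With this shape the database returns at most one row per month, instead of one row per appointment.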

The get_no_show_clients() query lacks a minimum appointment threshold, potentially surfacing clients with 1 session and 1 no-show. Adding a HAVING count(*) >= 2 clause would make the ranking more meaningful by filtering out statistically insignificant cases.
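The effect of such a threshold can be illustrated in memory. This sketch assumes per-client counts are already available and ranks by no-show rate. The ClientCounts shape, the rate-based ordering, and the default limit of 5 are assumptions, not the tool's actual behavior.

```typescript
// Hypothetical per-client counts, e.g. the result of a grouped query.
interface ClientCounts {
  clientId: string;
  total: number;
  noShows: number;
}

// Mirrors HAVING count(*) >= minAppointments, then ranks by no-show rate.
function rankNoShowClients(counts: ClientCounts[], minAppointments = 2, limit = 5) {
  return counts
    .filter((c) => c.total >= minAppointments && c.noShows > 0)
    .map((c) => ({ ...c, rate: c.noShows / c.total }))
    // Highest rate first; ties go to the client with more no-shows.
    .sort((a, b) => b.rate - a.rate || b.noShows - a.noShows)
    .slice(0, limit);
}
```

Without the threshold, a client with one appointment and one no-show has a 100% rate and would outrank a client who missed 3 of 4 sessions.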

The system prompt mandates strict output formatting for no-show responses, but there’s no runtime validation of the final reply. A malformed LLM output could break UX expectations. Wrapping the second LLM call in a retry loop with a stricter response_format or post-hoc parsing would increase robustness.
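One possible shape for the retry-and-validate idea is sketched below. The line format checked here is purely hypothetical, since the format the system prompt requires is not shown on this page. The generateValidated helper, the attempt count, and the fallback text are likewise assumptions.

```typescript
type Generate = (attempt: number) => Promise<string>;
type Validate = (reply: string) => boolean;

// Calls the generator until the reply passes validation or attempts run out.
async function generateValidated(
  generate: Generate,
  isValid: Validate,
  maxAttempts = 3,
  fallback = "Sorry, I could not format that answer. Please try again.",
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const reply = await generate(attempt);
    if (isValid(reply)) return reply;
  }
  return fallback;
}

// Hypothetical format: every non-empty line looks like "1. Name: 3 no-shows".
const NO_SHOW_LINE = /^\d+\. .+: \d+ no-shows?$/;

function isNoShowList(reply: string): boolean {
  const lines = reply
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return lines.length > 0 && lines.every((line) => NO_SHOW_LINE.test(line));
}
```

Capping the attempts and returning a fixed fallback keeps a misbehaving model from consuming the user's hourly quota or hanging the request.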

References

src/app/api/ai/chat/route.ts
lib/db/schema.ts (implicit via appointments, clients)
lib/plan-guard.ts (via canUseFeature)
