WhatsApp AI Intake
The WhatsApp AI intake agent processes inbound messages, uses Groq for tool-calling, and facilitates appointment creation.
The WhatsApp AI Intake agent lets clients interact with professionals and schedule appointments directly through WhatsApp. It interprets natural language requests, checks professional availability, and creates bookings using a large language model (LLM) served by Groq together with a small set of specialized tools. It is part of the platform's automated client engagement, streamlining the booking process and reducing manual work for professionals.
Overview
At its core, the WhatsApp AI Intake agent operates by receiving inbound messages via a dedicated webhook, processing them through an LLM, and executing specific actions based on the client's intent. The entire process follows a two-pass LLM pattern, similar to the general /api/ai/chat route:
- First Pass: The LLM analyzes the incoming message and the conversation history to determine if a specific tool needs to be invoked to fulfill the user's request.
- Second Pass: If a tool was called, its results are fed back to the LLM, which then generates a natural language response to the client, incorporating the tool's output.
This architecture relies heavily on several database tables, including professionals for professional details, availability for scheduling rules, appointments for managing bookings, clients for client information, and notifications for alerting professionals. Conversation state and transactional context are managed using Redis, ensuring continuity across interactions. The system also integrates with WhatsApp's messaging capabilities for sending templated responses, such as msgBookingRequest, and uses email for professional notifications.
WhatsApp Webhook
The entry point for all inbound WhatsApp messages is the /api/whatsapp/webhook API route, which handles POST requests. The route's implementation is not covered here; its role is to receive messages from the WhatsApp platform and forward them to the AI intake logic. The webhook is subject to the whatsapp rate limit of 10 requests per minute, which guards against abuse and helps keep the system stable under load. For each message, the webhook extracts the message content, the sender's phone number, and the professional's identifier, then passes them to the handleIncomingMessage function in src/lib/ai-intake.ts.
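A limit of 10 requests per minute can be enforced with a fixed-window counter. The sketch below is an in-memory illustration only: the class name, the choice to key by sender phone, and the injected clock are assumptions, and the actual route may use a shared store or a different algorithm.

```typescript
// Hypothetical fixed-window limiter; the real webhook's limiter is not shown in this document.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Returns true if the request is allowed, false if the key is over its limit. */
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    // Start a fresh window when none exists or the previous one has elapsed.
    if (!current || now - current.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

// 10 requests per 60 seconds, matching the limit described above.
const whatsappLimiter = new FixedWindowLimiter(10, 60_000);
```

A handler built on this would typically answer with HTTP 429 when allow returns false, before any LLM work is started.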
AI Intake Agent (src/lib/ai-intake.ts)
The src/lib/ai-intake.ts file encapsulates the core logic for the AI WhatsApp Intake Agent. It orchestrates the interaction between the incoming message, the LLM, and the backend services.
handleIncomingMessage
This asynchronous function is the central orchestrator. It takes the client's message, sender's phone, and the target professionalId as input.
- Professional and Conversation Loading: It first retrieves the professional details from the professionals table and loads the ongoing conversation state from Redis using getConversation. If no conversation exists, it attempts to identify the client from the clients table based on their phone number and initializes a new conversation state.
- Message History and Context: The user's message is appended to the conversation history. A systemPrompt is constructed using buildSystemPrompt, providing the LLM with its persona and rules. Crucially, if the inbound message is a reply to a recent transactional notification (e.g., a reminder or confirmation), the outboundContext (stored in Redis via setOutboundContext) is used to inject specific details about the referenced appointmentId into the LLM's system messages. This helps the LLM understand the context of replies like "I need to reschedule."
- Groq Interaction: The function then interacts with the Groq LLM using the groq-sdk library, specifically the llama-3.3-70b-versatile model.
  - First Call: The initial call to Groq includes the systemPrompt, client context, conversation history, and the TOOLS definitions. The tool_choice: "auto" setting allows the LLM to decide whether to invoke one of the defined functions.
  - Tool Execution: If the LLM decides to call a tool, the handleIncomingMessage function parses the tool call, executes the corresponding local function (e.g., getAvailableSlots, bookAppointment), and captures the result.
  - Second Call: The tool's output is then added back to the message history as a tool message, and a second call is made to the Groq LLM. This time, the LLM generates the final, human-readable response to the client, incorporating the results of the tool execution.
- Conversation Saving: After generating a reply, the updated conversation state, including the LLM's response, is saved back to Redis using saveConversation, ensuring continuity for future interactions.
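The two-call flow described above can be sketched without any network access by injecting the model call. Everything in this block is a simplified assumption for illustration: the message and tool-call shapes are flattened versions of what an OpenAI-compatible chat API returns, and twoPassReply, Complete, and ToolHandlers are hypothetical names, not functions from src/lib/ai-intake.ts.

```typescript
// Simplified shapes, loosely modeled on OpenAI-compatible chat APIs (assumption).
type ToolCall = { id: string; name: string; arguments: string };
type Message =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; toolCalls?: ToolCall[] }
  | { role: "tool"; toolCallId: string; content: string };
type Completion = { content: string | null; toolCalls?: ToolCall[] };

// Stand-in for the Groq chat-completion call, injected so the sketch runs offline.
type Complete = (messages: Message[], allowTools: boolean) => Promise<Completion>;
type ToolHandlers = Record<string, (args: Record<string, unknown>) => Promise<unknown>>;

async function twoPassReply(
  complete: Complete,
  tools: ToolHandlers,
  messages: Message[],
): Promise<string> {
  // First pass: the model either answers directly or requests tool calls.
  const first = await complete(messages, true);
  const calls = first.toolCalls ?? [];
  if (calls.length === 0) return first.content ?? "";

  const followUp: Message[] = [
    ...messages,
    { role: "assistant", content: first.content, toolCalls: calls },
  ];
  for (const call of calls) {
    const handler = tools[call.name];
    // Unknown tools are reported back to the model instead of throwing.
    const result = handler
      ? await handler(JSON.parse(call.arguments) as Record<string, unknown>)
      : { error: `unknown tool: ${call.name}` };
    followUp.push({ role: "tool", toolCallId: call.id, content: JSON.stringify(result) });
  }

  // Second pass: the model turns the tool output into the client-facing reply.
  const second = await complete(followUp, false);
  return second.content ?? "";
}
```

Injecting the completion function keeps the orchestration testable with a fake model; a production version would also need to handle malformed tool arguments, which JSON.parse would reject here.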
Timezone Helpers
Brazil abolished DST in 2019, making São Paulo consistently UTC-3. The saoPauloDate, formatSaoPauloTime, and formatSaoPauloDate functions ensure that all date and time operations, especially for appointment scheduling, are correctly handled in the America/Sao_Paulo timezone. This prevents common off-by-one errors related to timezones.
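As a rough illustration of what such helpers might look like, the sketch below uses only the built-in Intl API. The function names come from this document, but their signatures and return formats here are assumptions, as is the saoPauloParts helper.

```typescript
const TZ = "America/Sao_Paulo";

// Hypothetical helper: breaks an instant into São Paulo wall-clock fields.
function saoPauloParts(instant: Date): Record<string, string> {
  const formatter = new Intl.DateTimeFormat("en-US", {
    timeZone: TZ,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    hour: "2-digit",
    minute: "2-digit",
    hourCycle: "h23",
  });
  const fields: Record<string, string> = {};
  for (const part of formatter.formatToParts(instant)) fields[part.type] = part.value;
  return fields;
}

// Assumed signature: local date "YYYY-MM-DD" plus local time "HH:mm" -> absolute instant.
// The fixed -03:00 offset is valid only because Brazil has had no DST since 2019.
function saoPauloDate(dateStr: string, time: string): Date {
  return new Date(`${dateStr}T${time}:00-03:00`);
}

// Assumed output format: "HH:mm" in São Paulo local time.
function formatSaoPauloTime(instant: Date): string {
  const f = saoPauloParts(instant);
  return `${f.hour}:${f.minute}`;
}

// Assumed output format: "YYYY-MM-DD" in São Paulo local time.
function formatSaoPauloDate(instant: Date): string {
  const f = saoPauloParts(instant);
  return `${f.year}-${f.month}-${f.day}`;
}
```

The off-by-one risk is easy to see with an evening appointment: 01:30 UTC on 11 March is still 22:30 on 10 March in São Paulo, so formatting the instant in UTC would report the wrong calendar day.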
Tool Definitions (TOOLS)
The TOOLS array defines the functions that the Groq LLM can invoke. Each tool has a name, a description (in Portuguese, guiding the LLM on when to use it), and parameters schema. The agent defines three tools:
- get_available_slots: Used to retrieve available appointment times for a specific date. The LLM is instructed to always use this before suggesting times.
- book_appointment: Creates a new appointment. The LLM is instructed to use this only after confirming the date, time, and client's name.
- get_professional_info: Fetches details about the professional, such as their name, profession, session duration, price, and general availability schedule.
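For orientation, a TOOLS array in the function-calling format used by OpenAI-compatible chat APIs such as Groq's might look like the sketch below. The three tool names and the date parameter come from this document; the remaining parameter names (time, client_name) and the Portuguese description texts are illustrative assumptions.

```typescript
// Illustrative only: parameter names other than `date` and all description texts are assumptions.
const TOOLS = [
  {
    type: "function",
    function: {
      name: "get_available_slots",
      description:
        "Consulta os horários disponíveis em uma data. Use sempre antes de sugerir horários.",
      parameters: {
        type: "object",
        properties: {
          date: { type: "string", description: "Data no formato YYYY-MM-DD" },
        },
        required: ["date"],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "book_appointment",
      description:
        "Cria um agendamento. Use somente após confirmar data, horário e nome do cliente.",
      parameters: {
        type: "object",
        properties: {
          date: { type: "string", description: "Data no formato YYYY-MM-DD" },
          time: { type: "string", description: "Horário no formato HH:mm" },
          client_name: { type: "string", description: "Nome do cliente" },
        },
        required: ["date", "time", "client_name"],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "get_professional_info",
      description:
        "Retorna nome, profissão, duração da sessão, preço e horários de atendimento.",
      parameters: { type: "object", properties: {}, required: [] },
    },
  },
] as const;
```

Note that professionalId and the client's phone are not model-supplied parameters in this sketch; the assumption is that the caller fills them in from the webhook context so the LLM cannot book on behalf of another professional or number.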
Tool Implementations
Each tool definition corresponds to an asynchronous function that interacts with the database or other services:
- getAvailableSlots(professionalId, dateStr): This function queries the professionals table for session duration and the availability table for the professional's weekly schedule. It then fetches existing appointments for the given date, excluding rejected or cancelled ones. It iterates through the professional's availability rules for the specified day, generating potential slots based on sessionDuration, and filters out any slots that conflict with existing appointments or are in the past.
- bookAppointment(professionalId, dateStr, time, clientName, clientPhone): This is a critical function that handles the creation of a new appointment. It performs an atomic operation within a Drizzle ORM transaction with serializable isolation level. This isolation level is chosen to prevent race conditions where two clients might attempt to book the same slot simultaneously.
  - It first checks for any conflicting appointments within the desired time slot. If a conflict is detected, it throws a SLOT_TAKEN error, which is caught and translated into a user-friendly message.
  - It then either finds an existing client by phone number or creates a new entry in the clients table.
  - Finally, it inserts the new appointment into the appointments table, setting its status to confirmed or pending_confirmation based on the professional's autoConfirm setting.
  - After a successful booking, it triggers several notifications: an in-app notification for the professional, a WhatsApp message to the professional using msgBookingRequest, and an email to the professional if an email address is available.
- getProfessionalInfo(professionalId): This function retrieves basic information about the professional from the professionals table and their availability rules, formatting the schedule for the LLM to present to the client.
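The slot-generation step of getAvailableSlots can be illustrated as a pure function, with the database queries left out. This is a sketch under stated assumptions: the AvailabilityWindow and BookedRange types, the minutes-since-midnight representation, and the generateSlots name are all hypothetical, and slots are assumed to start back to back within each availability window.

```typescript
type AvailabilityWindow = { startTime: string; endTime: string }; // "HH:mm", local time
type BookedRange = { start: number; end: number }; // minutes since local midnight

const toMinutes = (hhmm: string): number => {
  const [h = 0, m = 0] = hhmm.split(":").map(Number);
  return h * 60 + m;
};

const toLabel = (minutes: number): string =>
  `${String(Math.floor(minutes / 60)).padStart(2, "0")}:${String(minutes % 60).padStart(2, "0")}`;

// nowMinutes is the current local time when dateStr is today, or null for future dates.
function generateSlots(
  windows: AvailabilityWindow[],
  sessionMinutes: number,
  booked: BookedRange[],
  nowMinutes: number | null,
): string[] {
  if (sessionMinutes <= 0) return []; // guard against a non-terminating loop
  const slots: string[] = [];
  for (const w of windows) {
    const windowEnd = toMinutes(w.endTime);
    for (let start = toMinutes(w.startTime); start + sessionMinutes <= windowEnd; start += sessionMinutes) {
      const end = start + sessionMinutes;
      // Two half-open ranges overlap when each starts before the other ends.
      const conflicts = booked.some((b) => start < b.end && b.start < end);
      const inPast = nowMinutes !== null && start <= nowMinutes;
      if (!conflicts && !inPast) slots.push(toLabel(start));
    }
  }
  return slots;
}
```

Rejected and cancelled appointments would simply be left out of the booked list before calling this function, which matches the filtering described above.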
System Prompt Builder (buildSystemPrompt)
The buildSystemPrompt function constructs the initial instructions for the LLM. It defines the AI's persona as the scheduling assistant for a specific professional, sets the primary goal of helping clients schedule appointments, and outlines crucial rules: always respond in Brazilian Portuguese, never invent times, ask for missing information (date, client name), confirm appointments clearly, and politely redirect out-of-scope questions. It also explicitly forbids the LLM from identifying itself as an AI.
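A prompt builder covering these rules could be as simple as the sketch below. The rule list mirrors the description above, but the exact wording, the prompt language, and the ProfessionalSummary shape are assumptions; the real prompt text is not reproduced in this document.

```typescript
// Hypothetical input shape; the real function may receive the full professional record.
type ProfessionalSummary = { name: string; profession: string };

function buildSystemPrompt(professional: ProfessionalSummary): string {
  return [
    `You are the scheduling assistant for ${professional.name} (${professional.profession}).`,
    "Your goal is to help clients schedule appointments.",
    "Rules:",
    "- Always reply in Brazilian Portuguese.",
    "- Never invent times; only offer slots returned by get_available_slots.",
    "- If the date or the client's name is missing, ask for it.",
    "- Confirm every appointment clearly, stating date and time.",
    "- Politely redirect questions that are outside scheduling.",
    "- Do not identify yourself as an AI.",
  ].join("\n");
}
```

Keeping the rules as separate lines makes it straightforward to append per-conversation context, such as details of a referenced appointment, as additional system messages rather than editing the base prompt.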
Conversation Management (src/lib/conversation.ts)
The src/lib/conversation.ts file manages the state of ongoing conversations using Redis.
- ConversationState: This interface defines the structure for storing conversation data, including professionalId, clientPhone, clientName, a history of messages, and a context object to track pending booking details (pendingDate, pendingSlot, appointmentCreated).
- Redis Storage: Functions like getConversation and saveConversation interact with Redis to retrieve and persist ConversationState objects. Conversations are stored with a TTL (Time To Live) of 30 minutes, ensuring that inactive conversations are automatically cleared. A MAX_MESSAGES limit of 20 messages is enforced to keep the conversation history within the LLM's context window and manage memory usage. Phone numbers are normalized using stripDigits to create consistent Redis keys (conv:{professionalId}:{phone}).
- Client Activity Tracking: The markClientActive and isClientActive functions manage a client_active:{phone} key in Redis with an ACTIVE_TTL of 15 minutes. This mechanism is used by the cron job (/api/cron/reminders) to check if a client is currently engaged in a conversation with the AI. If a client is active, transactional reminders are deferred to avoid interrupting an ongoing scheduling dialogue.
- Outbound Context: The setOutboundContext and getOutboundContext functions store information about the last transactional message sent to a client (e.g., a reminder or confirmation). This OutboundContext includes the professionalId and optionally an appointmentId, and has a TTL of 2 hours. When a client replies, the handleIncomingMessage function can retrieve this context, allowing the LLM to understand that the reply might be related to a specific, recent appointment.
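The storage behavior described above (key format, 30-minute TTL, 20-message cap) can be sketched with an in-memory stand-in for Redis. Several details are assumptions: the TtlStore class replaces the real Redis client, the functions here are synchronous and take the store as a parameter, stripDigits is assumed to keep only digit characters, and the optional fields on ConversationState are guesses at the real interface.

```typescript
interface ConversationState {
  professionalId: string;
  clientPhone: string;
  clientName?: string;
  messages: { role: "user" | "assistant"; content: string }[];
  context: { pendingDate?: string; pendingSlot?: string; appointmentCreated?: boolean };
}

const CONVERSATION_TTL_SECONDS = 30 * 60;
const MAX_MESSAGES = 20;

// Assumed behavior: drop every non-digit so "+55 (11) 9..." and "55119..." share a key.
const stripDigits = (phone: string): string => phone.replace(/\D/g, "");

const conversationKey = (professionalId: string, phone: string): string =>
  `conv:${professionalId}:${stripDigits(phone)}`;

// In-memory stand-in for a Redis SET with expiry plus GET (not the real client).
class TtlStore {
  private readonly entries = new Map<string, { value: string; expiresAt: number }>();

  set(key: string, value: string, ttlSeconds: number, now: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: now + ttlSeconds * 1000 });
  }

  get(key: string, now: number = Date.now()): string | null {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= now) return null;
    return entry.value;
  }
}

function saveConversation(store: TtlStore, state: ConversationState, now?: number): void {
  // Keep only the most recent messages so the history fits the context window.
  const trimmed = { ...state, messages: state.messages.slice(-MAX_MESSAGES) };
  const key = conversationKey(state.professionalId, state.clientPhone);
  store.set(key, JSON.stringify(trimmed), CONVERSATION_TTL_SECONDS, now);
}

function getConversation(
  store: TtlStore,
  professionalId: string,
  phone: string,
  now?: number,
): ConversationState | null {
  const raw = store.get(conversationKey(professionalId, phone), now);
  return raw ? (JSON.parse(raw) as ConversationState) : null;
}
```

The client_active:{phone} and outbound-context keys would follow the same pattern with their own TTLs (15 minutes and 2 hours respectively); each save resets the expiry, so only conversations idle for the full TTL are dropped.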
Design Decisions
The design of the WhatsApp AI Intake agent reflects several deliberate choices to balance functionality, performance, and user experience:
- Two-Pass LLM Architecture: The decision to use a two-pass approach for LLM interaction (first for tool selection, second for response generation) provides greater control and reliability. It separates the complex task of intent recognition and tool parameter extraction from the final response generation, leading to more accurate tool use and coherent replies.
- Transactional Booking with SERIALIZABLE Isolation: The bookAppointment function's use of a Drizzle transaction with serializable isolation level is a critical choice for data integrity. This prevents race conditions where multiple concurrent requests might attempt to book the same appointment slot, ensuring that each slot is assigned uniquely. The explicit handling of SLOT_TAKEN and SQLSTATE 40001 errors provides a clear feedback mechanism to the user.
- Redis for Ephemeral Conversation State: Employing Redis for conversation state management (ConversationState) is a pragmatic choice. Redis's in-memory nature offers low-latency access, which is essential for interactive chat applications. The use of TTL automatically prunes stale conversations, preventing unbounded memory growth, while MAX_MESSAGES keeps the LLM context manageable.
- Explicit Tool Definitions and Descriptions: Providing the LLM with clear, descriptive tool definitions (including their purpose and parameters) is fundamental to enabling effective tool-calling. The descriptions in Portuguese directly guide the LLM's decision-making process, improving its ability to correctly identify when and how to use each function.
- Dedicated Timezone Handling: The explicit timezone helper functions (saoPauloDate, etc.) for America/Sao_Paulo demonstrate a commitment to accuracy in scheduling. Given the complexities of timezones and the abolition of DST in Brazil, this dedicated logic prevents common scheduling errors and ensures that appointments are recorded and displayed correctly.
- Client Activity Key for Reminder Deferral: The client_active key in Redis is a thoughtful design choice to enhance user experience. By deferring automated reminders when a client is actively conversing with the AI, the system avoids sending potentially confusing or redundant messages, maintaining a smoother and more natural dialogue flow.
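The error handling around the serializable transaction can be illustrated as a small mapping from thrown errors to a tool result. This sketch makes several assumptions: that the database driver exposes the SQLSTATE on a code property of the error, that a 40001 serialization failure is reported to the client the same way as SLOT_TAKEN rather than retried, and that the helper names and the UNKNOWN reason are hypothetical.

```typescript
type BookingFailure = { success: false; reason: "SLOT_TAKEN" | "UNKNOWN" };
type BookingOutcome = { success: true; appointmentId: string } | BookingFailure;

// Maps a failed transaction to a result the LLM can relay to the client.
function toBookingFailure(err: unknown): BookingFailure {
  const message = err instanceof Error ? err.message : "";
  const code =
    typeof err === "object" && err !== null && "code" in err
      ? String((err as { code: unknown }).code)
      : "";
  // SQLSTATE 40001 is a serialization failure: a concurrent transaction won the race.
  if (message === "SLOT_TAKEN" || code === "40001") {
    return { success: false, reason: "SLOT_TAKEN" };
  }
  return { success: false, reason: "UNKNOWN" };
}

// runTransaction stands in for the Drizzle transaction; it resolves to the new appointment id.
async function safeBook(runTransaction: () => Promise<string>): Promise<BookingOutcome> {
  try {
    return { success: true, appointmentId: await runTransaction() };
  } catch (err) {
    return toBookingFailure(err);
  }
}
```

Treating a serialization failure as a taken slot is the conservative option; an alternative design would retry the transaction once before giving up, since 40001 does not by itself prove that the slot was booked.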
Potential Improvements
- Dynamic Tool Augmentation: The TOOLS array is statically defined. The get_professional_info tool already fetches some professional data. This could be expanded to dynamically augment tool descriptions or the system prompt with more specific details about the professional's services, specialties, or even custom booking policies. For example, if a professional specializes in "therapy for anxiety," this information could be injected to guide the LLM's responses and tool choices more effectively, making the agent feel more personalized.
- Structured Error Handling for LLM Interpretation: The getAvailableSlots and bookAppointment functions return simple error messages (e.g., { error: "Data inválida." } or { success: false, reason: "SLOT_TAKEN" }). While the LLM can parse these, providing more structured error objects (e.g., with specific error codes or enumerated types) could allow the LLM to generate more precise, context-aware, and helpful error messages to the client, rather than relying solely on natural language inference.
- Advanced Conversation Context Management: The ConversationState.context object currently tracks pendingDate, pendingSlot, and appointmentCreated. For more complex multi-turn dialogues, this context could be enriched to track multiple potential appointment options, client preferences (e.g., "morning appointments"), or explicit user intent states (e.g., "rescheduling_flow", "new_booking_flow"). This would allow the LLM to maintain a deeper understanding of the conversation's trajectory and handle more nuanced requests without losing track.
- Proactive Slot Suggestion: The get_available_slots tool requires a date parameter. The system prompt instructs the LLM to ask for a date if not provided. An enhancement could involve modifying get_available_slots or adding a new tool that, when called without a specific date, returns the next few available slots across different days. This would allow the AI to proactively suggest options like "I have openings next Tuesday at 10:00 AM or Wednesday at 2:00 PM" if the client expresses general interest in booking, reducing the number of turns required.
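As one possible shape for the proactive suggestion idea, the sketch below scans forward day by day until it has collected enough openings. It is purely hypothetical: nextAvailableSlots, SlotLookup, and the default limits are invented for illustration, and the lookup function stands in for a per-date availability query.

```typescript
// Stand-in for a per-date availability query returning "HH:mm" labels.
type SlotLookup = (dateStr: string) => string[];

// Adds whole days to a "YYYY-MM-DD" string using UTC arithmetic to avoid local-offset drift.
const addDays = (dateStr: string, days: number): string => {
  const d = new Date(`${dateStr}T00:00:00Z`);
  d.setUTCDate(d.getUTCDate() + days);
  return d.toISOString().slice(0, 10);
};

function nextAvailableSlots(
  lookup: SlotLookup,
  fromDate: string,
  maxResults: number = 3,
  maxDaysAhead: number = 14,
): { date: string; time: string }[] {
  const found: { date: string; time: string }[] = [];
  // Stop as soon as enough slots are collected or the search horizon is reached.
  for (let offset = 0; offset < maxDaysAhead && found.length < maxResults; offset++) {
    const date = addDays(fromDate, offset);
    for (const time of lookup(date)) {
      found.push({ date, time });
      if (found.length === maxResults) break;
    }
  }
  return found;
}
```

Bounding the scan with maxDaysAhead keeps the number of availability queries predictable even for professionals with very sparse schedules.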
References
- src/lib/ai-intake.ts
- src/lib/conversation.ts
- src/app/api/whatsapp/webhook