Beck_Moulton

Posted on Feb 6

From "Ouch" to "Booked": Building an AI Medical Triage Agent with OpenAI & Playwright

#ai #openai #python #webdev

We’ve all been there: you wake up with a nagging pain in your stomach, open a hospital registration app, and find yourself staring at 50 different departments. Is this a "General Surgery" issue or "Gastroenterology"? Searching for the right doctor shouldn't require a medical degree.

In this tutorial, we’ll build an AI Medical Triage Agent that turns a simple chat message into a doctor's appointment. We’ll use OpenAI function calling (tool use) for decision making, Playwright for browser automation, and FastAPI to serve the assistant. By the end of this post, you'll understand how to bridge the gap between LLM reasoning and real-world web interactions.

If you're looking for even more advanced production-ready patterns for AI agents, I highly recommend checking out the deep dives over at the WellAlly Tech Blog, which served as a major inspiration for this architecture.

The Architecture

Our agent follows a "Reason-Act" loop. It doesn't just guess; it uses tools to fetch real-time data from a (simulated) hospital portal and acts on your behalf.

graph TD
    User((User)) -->|Symptom: Sharp back pain| API[FastAPI Wrapper]
    API --> Agent{LLM Agent GPT-4o}
    Agent -->|Tool Use: search_department| Browser[Playwright Automation]
    Browser -->|Scrape HTML| Portal[Hospital Portal]
    Portal -->|Dept List| Browser
    Browser -->|Return Depts| Agent
    Agent -->|Logic: Back pain -> Orthopedics| Agent
    Agent -->|Tool Use: book_appointment| Browser
    Browser -->|Submit Form| Portal
    Portal -->|Confirmation| Browser
    Browser -->|Success| Agent
    Agent -->|Final Answer| User
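
The loop in the diagram can be sketched without any network calls. This is a minimal illustration of the control flow only, not the agent built later in this post: `scripted_model` is a hypothetical stand-in for the LLM, the two tool functions return canned data instead of driving a browser, and the patient name and date are made up.

```python
import json

# Hypothetical stand-ins for the real tools; they return canned data.
def search_department(symptom_summary: str) -> list[str]:
    return ["Orthopedics", "Physiotherapy"]

def book_appointment(department_name: str, patient_name: str, preferred_date: str) -> str:
    return f"Booked {department_name} for {patient_name} on {preferred_date}"

TOOLS = {"search_department": search_department, "book_appointment": book_appointment}

def scripted_model(history: list[dict]) -> dict:
    """Hypothetical stand-in for the LLM: picks the next step from what has happened so far."""
    tool_results = [m for m in history if m["role"] == "tool"]
    if len(tool_results) == 0:
        return {"tool": "search_department", "args": {"symptom_summary": "sharp back pain"}}
    if len(tool_results) == 1:
        departments = json.loads(tool_results[0]["content"])
        return {
            "tool": "book_appointment",
            "args": {
                "department_name": departments[0],
                "patient_name": "Jane Doe",
                "preferred_date": "2025-01-15",
            },
        }
    return {"answer": json.loads(tool_results[-1]["content"])}

def run_agent(user_input: str, max_steps: int = 5) -> str:
    history = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        decision = scripted_model(history)                      # Reason
        if "answer" in decision:
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])    # Act
        history.append({"role": "tool", "content": json.dumps(result)})  # Observe
    return "Stopped: step limit reached."
```

The `max_steps` cap is a design choice worth keeping in a real agent too, since it stops a confused model from calling tools forever.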

Tech Stack

OpenAI API: The "brain" (GPT-4o or GPT-3.5-turbo).
Playwright: To navigate hospital portals that lack a public API.
FastAPI: For the asynchronous backend.
Pydantic: For strict data validation and schema definitions.

Step-by-Step Implementation

1. Defining the "Tools" (Schemas)

First, we need to tell the LLM exactly what it can do. We define two tools: search_department and book_appointment.

from pydantic import BaseModel, Field

class SearchDeptSchema(BaseModel):
    symptom_summary: str = Field(description="A concise summary of the user's symptoms.")

class BookAppointmentSchema(BaseModel):
    department_name: str = Field(description="The specific department identified by triage.")
    patient_name: str = Field(description="Full name of the patient.")
    preferred_date: str = Field(description="The date for the appointment (YYYY-MM-DD).")

2. The Browser Automation (Playwright)

Since most hospital systems don't provide clean REST APIs, we use Playwright to "be the human." Here is a simplified helper to fetch department availability.

from playwright.async_api import async_playwright

class HospitalService:
    async def get_available_slots(self, symptom: str) -> list[str]:
        """Search the portal for a symptom and return the text of each department entry."""
        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=True)
            try:
                page = await browser.new_page()
                # In a real scenario, you'd navigate to the actual hospital URL
                await page.goto("https://mock-hospital-portal.com/triage")

                # Simulate searching based on symptoms
                await page.fill("#search-input", symptom)
                await page.click("#search-btn")
                await page.wait_for_selector(".dept-list")

                # Scrape the visible text of each department entry
                return await page.eval_on_selector_all(
                    ".dept-item", "nodes => nodes.map(n => n.innerText)"
                )
            finally:
                # Close the browser even if navigation or scraping fails
                await browser.close()
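
Browser automation against a page you don't control can fail for transient reasons (slow loads, a selector that appears late). One common way to soften that is a retry wrapper with a per-attempt timeout and exponential backoff. The sketch below is an assumption about how you might do it, not something Playwright provides: `with_retries` and its parameters are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")

async def with_retries(
    operation: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 0.5,
    timeout: float = 30.0,
) -> T:
    """Run an async operation with a per-attempt timeout and exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(operation(), timeout=timeout)
        except Exception as exc:  # in real code, catch narrower exception types
            last_error = exc
            if attempt < attempts - 1:
                # Wait 0.5s, 1s, 2s, ... (with the defaults) before the next attempt
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"Operation failed after {attempts} attempts") from last_error
```

With the helper above, a call could look like `await with_retries(lambda: HospitalService().get_available_slots("lower back pain"))`.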

3. Orchestrating the Agent

Now for the magic! We use OpenAI's tools parameter to let the model decide when to browse.

import openai

async def medical_agent_loop(user_input: str):
    client = openai.AsyncOpenAI()

    # Initial Call
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_input}],
        tools=[
            {
                "type": "function",
                "function": {
                    "name": "search_department",
                    "description": "Searches for medical departments based on symptoms",
                    "parameters": SearchDeptSchema.model_json_schema()
                }
            }
        ]
    )

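    message = response.choices[0].message

    # The model may answer directly, in which case tool_calls is None
    if not message.tool_calls:
        return message.content

    tool_call = message.tool_calls[0]
    if tool_call.function.name == "search_department":
        # Arguments arrive as a JSON string; validate them against the schema
        args = SearchDeptSchema.model_validate_json(tool_call.function.arguments)

        # Execute the Playwright service with the model's symptom summary
        service = HospitalService()
        results = await service.get_available_slots(args.symptom_summary)

        # Feed results back to LLM for final booking decision
        # ... (Loop continues to booking)
        return f"I found the following slots: {results}. Would you like to book?"

    return f"Unsupported tool requested: {tool_call.function.name}"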
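
To actually continue the loop, the tool's output has to go back into the conversation as a `tool` message that carries the id of the call it answers. Below is a minimal sketch of that step. The handlers are hypothetical stubs (in the agent above they would call `HospitalService`), and `run_tool_call` is an illustrative name; the returned dictionary follows the Chat Completions tool-message shape (`role`, `tool_call_id`, `content`).

```python
import asyncio
import json

# Hypothetical stub handlers; a real agent would drive the browser here.
async def search_department(symptom_summary: str) -> list[str]:
    return ["Orthopedics", "Physiotherapy"]

async def book_appointment(department_name: str, patient_name: str, preferred_date: str) -> dict:
    return {"confirmed": True, "department": department_name, "date": preferred_date}

HANDLERS = {"search_department": search_department, "book_appointment": book_appointment}

async def run_tool_call(call_id: str, name: str, arguments_json: str) -> dict:
    """Execute one tool call and return the `tool` message to append to the conversation."""
    handler = HANDLERS.get(name)
    if handler is None:
        content = {"error": f"Unknown tool: {name}"}
    else:
        try:
            content = {"result": await handler(**json.loads(arguments_json))}
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON or wrong argument names: report it so the model can retry
            content = {"error": str(exc)}
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(content)}
```

In `medical_agent_loop`, you would append the assistant message that requested the tool, then this dictionary, to `messages` before calling `client.chat.completions.create` again. Returning errors as content instead of raising gives the model a chance to correct its own arguments.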

The "Official" Way to Scale

While this DIY approach works for a hobby project, building a production-grade healthcare agent requires robust error handling, session management, and HIPAA-compliant data flow.

For advanced implementation patterns—like Multi-Agent Orchestration or Stateful Tool Use—I strongly recommend reading the architectural guides at wellally.tech/blog. They provide excellent resources on how to handle edge cases like "no slots available" or "ambiguous symptoms" using more sophisticated state machines.

Testing the Agent

Using FastAPI, we can wrap this into a clean endpoint:

from fastapi import FastAPI

app = FastAPI()

@app.post("/triage")
async def start_triage(query: str):
    result = await medical_agent_loop(query)
    return {"status": "success", "agent_response": result}
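
One detail to watch: because `query` is declared as a bare `str` parameter, FastAPI reads it from the URL query string, not from the request body. Here is a small standard-library client sketch, assuming the app is running locally on port 8000 (for example via uvicorn). The base URL and both helper names are assumptions.

```python
import json
import urllib.parse
import urllib.request

def build_triage_request(query: str, base_url: str = "http://127.0.0.1:8000") -> urllib.request.Request:
    """Build a POST request for /triage with the symptom text URL-encoded in the query string."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(f"{base_url}/triage?{params}", method="POST")

def send_triage_request(query: str) -> dict:
    """Send the request and decode the JSON response (requires the server to be running)."""
    with urllib.request.urlopen(build_triage_request(query)) as response:
        return json.load(response)
```

If you would rather send the symptoms in a JSON body, declaring the parameter as a Pydantic model is the usual FastAPI route.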

Example Request:

"I've had a dull ache in my lower back for three days, and it's getting worse when I sit down. Can you find me a doctor?"

Agent Action:

Understand: Recognizes "lower back pain."
Tool Call: Requests search_department, which runs the Playwright search.
Result: Finds "Orthopedics" and "Physiotherapy."
Response: "I recommend the Orthopedics department. There is an opening tomorrow at 10:00 AM. Should I book it for you?"

Conclusion

By combining the reasoning power of LLMs with the action capabilities of Playwright, we’ve turned a static chat box into a functional, autonomous agent. This pattern isn't just for hospitals—it works for travel booking, customer support, and any legacy web system without an API.

What are you building with Function Calling? Let me know in the comments below! 👇

Stay curious and keep shipping!
