How to decide what gets automated

The first conversation we have with every practice owner, and what those conversations keep teaching us about choosing well.

Taxo Team

May 27, 2026

10 min read

The conversation always starts the same way. A practice owner sits across from us, usually a little tired, often holding a phone that has buzzed twice since they sat down, and asks the question everyone asks. Where do we start. They already have an answer in mind, and the answer is almost always the loudest thing in the building. The phone that never stops ringing. The voicemail box that fills faster than anyone can empty it. The fax machine, still alive in the year of our lord, spitting referrals onto a tray that nobody has time to sort. They want us to point at the fire and put it out, and they want us to do it this week.

We have learned to slow that conversation down. Not because the fire is not real, it is very real, but because the loudest problem and the most automatable problem are rarely the same thing, and choosing the loudest one first is how good intentions turn into stalled pilots and quiet disappointment. So before we talk about what hurts, we talk about the work itself. What is the work actually made of. What has to be decided inside it. What happens when that decision comes out wrong. And whether the world around the task is willing to cooperate with anyone, human or machine, who tries to do it.

What follows is the thinking we walk every owner through. It is not a scoring sheet, though you could turn it into one. It is closer to a way of seeing. Once you can see it, the right place to start tends to announce itself, and it is almost never the place the owner walked in expecting.

Start with the decision, not the task

Most administrative work in a clinic looks like motion. Someone picks up a phone, types into a portal, reads a fax, sends a reminder, hangs up, starts again. But underneath the motion there is always a decision being made, and the nature of that decision is the single best predictor of whether a software agent can do the job. The useful question is not how hard is this task. It is how knowable is the decision buried inside it.

Some decisions are bounded. They live inside a finite set of states, and the rules that move you between those states can be written down, even when there are a great many of them. Checking whether a patient is eligible for a visit is bounded. The payer holds a record, the record says active or inactive, and the path to that record, however tedious, is the same path every time. Booking a patient against a provider's availability is bounded. So is routing a refill request to the right queue, or asking a payer where a claim sits in its adjudication. The decision space is large, but it has edges, and an agent can be trusted inside something that has edges precisely because it can be told what falls outside them.

Other decisions are not bounded at all. They ask a person to weigh, to interpret, to read a situation that has never occurred in exactly that form before. The moment a task starts to lean on clinical judgment, whether a symptom warrants an earlier appointment, whether a medication is appropriate, whether the catch in a patient's voice means something the words do not, you have left the territory where automation belongs. This is not a temporary limitation waiting on a better model to arrive. It is a line we draw on purpose. Administrative work that sits cleanly apart from clinical reasoning carries a fraction of the liability and almost none of the moral hazard. Work that blurs into it should stay in human hands, and we say so plainly, even when the owner would happily hand it over.

So the first cut is simple. Is the decision inside this task bounded, and does it stay on the administrative side of the clinical line. If the answer to both is yes, keep going. If not, this is not where you start, no matter how much it hurts.

Make sure you can tell when it goes wrong

A bounded decision is necessary, but it is not enough, because an agent that makes a clean decision you cannot verify is just an expensive way to be confidently mistaken. The second thing we look for is proof, and we look for it in three forms.

The first is whether the work has a ground truth you can check. After the agent acts, can you point to something in the world that tells you, without argument, whether it succeeded. The appointment is on the calendar or it is not. The payer returned an active status or an inactive one. The claim is in a state the system will confirm if you ask it. Verifiability is what lets you build trust slowly instead of betting all of it at once, and it is also what lets you measure the agent against the very people it is meant to relieve. When we deploy reminders, the standard we hold ourselves to is whether the show-up rate matches what a human caller produces, and we can only ask that question honestly because the outcome is something we can count.

The second is the cost of being wrong, and whether the mistake can be undone. A reminder that lands at a slightly odd hour is a small sin, easily forgiven, and it corrects itself by the next day. An eligibility error is a different animal, because it does not announce itself. It slips quietly downstream and resurfaces weeks later as a denied claim, by which point the trail has gone cold and someone is spending an afternoon reconstructing what happened. We are far more comfortable automating work whose errors are loud and reversible than work whose errors are silent and compounding. Where the stakes are real but the work is otherwise a strong candidate, the answer is not to walk away from it. The answer is to build a checkpoint into it, a single moment where a person confirms before anything irreversible commits. The craft is in placing that human exactly where the risk lives, and nowhere else, so the review protects you without erasing the gain.
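One way to picture that checkpoint is the minimal Python sketch below. It is an illustration under assumptions, not a description of any particular system: `Action`, `run_with_checkpoint`, and the two example actions are hypothetical names, and which steps count as irreversible is a judgment each practice would make for itself.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    description: str
    reversible: bool  # can the effect be undone cheaply if it is wrong?


def run_with_checkpoint(
    action: Action,
    commit: Callable[[Action], None],
    approve: Callable[[Action], bool],
) -> str:
    """Commit reversible actions directly; ask a person before irreversible ones."""
    if not action.reversible and not approve(action):
        return "held for review"
    commit(action)
    return "committed"


committed = []
reminder = Action("send appointment reminder", reversible=True)
write_back = Action("write eligibility result to the claim", reversible=False)

# The reviewer here declines everything, to show that only the risky step is gated.
print(run_with_checkpoint(reminder, committed.append, approve=lambda a: False))
print(run_with_checkpoint(write_back, committed.append, approve=lambda a: False))
```

The reminder goes out without anyone being asked, while the write-back waits for a person. The review sits only on the step where the risk lives.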

The third is whether the agent can recognize the edge of its own competence. The workflows that deploy well are not the ones with no exceptions. They are the ones where the exceptions can be seen coming. An agent that handles the routine confidently and hands the unusual case off cleanly is something you can put into production tomorrow. An agent that cannot tell which cases it is failing is a quiet liability, because those failures accumulate without anyone noticing until they have grown too large to ignore. Before we automate anything, we ask whether the task has a natural place for the agent to say, I am not sure about this one, please take it. When that place exists, everything downstream of it becomes easier.
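A small Python sketch of what that natural place can look like, assuming an eligibility check whose only expected answers are active and inactive. The names `Outcome`, `KNOWN_STATUSES`, and `handle_eligibility` are hypothetical, and a real payer response would need far more careful parsing than this.

```python
from __future__ import annotations

from enum import Enum


class Outcome(Enum):
    DONE = "done"
    HANDOFF = "handoff"


# Assumption for the sketch: the only answers the agent is trusted to act on.
KNOWN_STATUSES = {"active", "inactive"}


def handle_eligibility(payer_response: str) -> tuple[Outcome, str]:
    """Act on expected answers; hand anything else to a person with a reason."""
    status = payer_response.strip().lower()
    if status in KNOWN_STATUSES:
        return Outcome.DONE, status
    return Outcome.HANDOFF, f"unrecognized payer response: {payer_response!r}"


print(handle_eligibility(" Active "))
print(handle_eligibility("pending review"))
```

The point of the sketch is the second branch. Anything outside the expected set is never guessed at; it is routed to a person along with the reason it was not handled.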

Ask whether the terrain will cooperate

The last theme is about the world the work lives in, and it is the one owners least expect, because it has nothing to do with the task and everything to do with its surroundings.

The first feature of friendly terrain is something we call patternability, and it carries a surprise inside it. You would assume the most automatable work is the work with the cleanest software interface, some tidy connection between systems where data moves without a human ever touching it. Sometimes it is. But some of the best candidates we have ever found run over the ugliest channels imaginable. A voice agent calling a payer to check claim status is working through a phone tree, a hold queue, and a scripted representative, not one of which is an interface in any modern sense. And yet it works, and works beautifully, because the other side of that call is itself running a nearly scripted process. The phone tree is predictable. The representative is reading from a playbook. The mess is regular. What matters is not whether the channel is clean but whether the interaction repeats in a shape an agent can learn. A channel mediated by humans that behaves the same way every single time is friendlier ground than a modern portal that quietly redesigns itself every quarter.

The second feature is time. Some work has to happen the instant a patient is on the line, with no second chance, and that work demands an agent that can fail gracefully in front of a real person who is waiting. Other work can wait. A sweep of claim statuses can run overnight, in batches, retrying the calls that drop, requeuing the ones that stall, and landing a clean summary on someone's desk before they have finished their first coffee. Work that tolerates delay is far more forgiving to automate, because patience buys you reliability for free. You can retry. You can review before you deliver. You can let a person glance at the result before it counts for anything. We tend to begin where the clock is generous and earn our way toward the work that demands an answer right now.
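The overnight sweep described above can be sketched in a few lines of Python. This is a simplified illustration: `overnight_sweep`, `flaky_check`, the claim IDs, and the limit of three attempts are all assumptions made for the example, and the stand-in `check_status` function replaces what would really be a call to a payer.

```python
from typing import Callable, Iterable


def overnight_sweep(
    claim_ids: Iterable[str],
    check_status: Callable[[str], str],
    max_attempts: int = 3,
) -> dict:
    """Check every claim, retrying dropped calls, and list what still needs a person."""
    results = {}
    needs_human = []
    for claim_id in claim_ids:
        for _ in range(max_attempts):
            try:
                results[claim_id] = check_status(claim_id)
                break  # got an answer, stop retrying this claim
            except ConnectionError:
                continue  # the call dropped, try again
        else:
            # Every attempt failed, so leave this one for the morning summary.
            needs_human.append(claim_id)
    return {"results": results, "needs_human": needs_human}


calls = {}


def flaky_check(claim_id: str) -> str:
    """Stand-in for a payer call: C-2 drops once, C-3 never connects."""
    calls[claim_id] = calls.get(claim_id, 0) + 1
    if claim_id == "C-2" and calls[claim_id] == 1:
        raise ConnectionError("call dropped")
    if claim_id == "C-3":
        raise ConnectionError("line busy")
    return "paid"


summary = overnight_sweep(["C-1", "C-2", "C-3"], flaky_check)
print(summary)
```

Because nobody is waiting on the line, the dropped call for C-2 is simply retried, and the one claim that never connects shows up in the summary instead of failing in front of a patient.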

The third feature is plain economics, and it is the one that quietly decides whether any of this was worth doing in the first place. The return on automating a workflow scales with how often it happens multiplied by how much human time each instance consumes. Work that recurs constantly (refills, reminders, eligibility checks, status calls) pays back the cost of building for it many times over, and as a bonus it produces the volume of examples you need to keep making the agent better. The isolated, complicated task that happens twice a month is almost never the place to begin, however satisfying it would be to finally solve. Density is what turns a clever demonstration into something that actually changes a clinic's week.
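To make that arithmetic concrete, here is a minimal Python sketch. The numbers are invented for the example and are not measurements from any clinic, and `Workflow` and `monthly_hours` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    runs_per_month: int      # how often the work happens (illustrative)
    minutes_per_run: float   # staff time each instance consumes (illustrative)


def monthly_hours(w: Workflow) -> float:
    """Staff hours per month: frequency multiplied by time per instance."""
    return w.runs_per_month * w.minutes_per_run / 60


workflows = [
    Workflow("claim status calls", runs_per_month=600, minutes_per_run=12),
    Workflow("isolated complicated task", runs_per_month=2, minutes_per_run=180),
]

for w in sorted(workflows, key=monthly_hours, reverse=True):
    print(f"{w.name}: {monthly_hours(w):.1f} staff hours per month")
```

With these made-up figures the routine calls consume 120 staff hours a month and the complicated task consumes 6, even though each instance of the complicated task takes fifteen times longer.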

Why the loudest problem is rarely the first one

Put the three themes together and the shape of a good first project comes into focus. It is bounded and safely administrative in its decision. It is verifiable, reversible, and self-aware in its failures. It runs over patternable ground, tolerates a little delay, and happens often enough to matter. The strongest candidates are strong on several of these at once, and the reason owners guess wrong so reliably is that they choose on a single axis. Usually that axis is pain. Sometimes it is raw volume. Neither one alone tells you whether the work can actually be done well.

Consider two workflows that look like cousins from across the room. Outbound claim status calls and prior authorization both put a person on the phone with a payer, both devour staff hours, both are tempting to point at. But claim status is bounded, its result is verifiable against the payer's own system, its errors are reversible, the payer side is patternable, the work batches overnight, and it happens all day every day. Prior authorization shares the volume and the phone time, but its decision leans toward clinical justification, the cost of a misstep is a delay in someone's care, and the path to approval bends case by case in ways that resist any clean script. The first is a place to start. The second is a place to arrive at, carefully, once you have earned the trust and built the checkpoints to handle it. The volume on the two is nearly identical. The right answer is not.
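The comparison above can be written down as a very small checklist. The Python sketch below encodes only the criteria the comparison states for both workflows; the criterion names and the yes-or-no simplification are our assumptions for illustration, and a real assessment would be less binary than this.

```python
CRITERIA = (
    "bounded_administrative",
    "low_cost_errors",
    "patternable",
    "high_volume",
)

# Yes-or-no answers taken from the comparison above, simplified for illustration.
workflows = {
    "claim status calls": {
        "bounded_administrative": True,
        "low_cost_errors": True,
        "patternable": True,
        "high_volume": True,
    },
    "prior authorization": {
        "bounded_administrative": False,  # leans toward clinical justification
        "low_cost_errors": False,         # a misstep delays someone's care
        "patternable": False,             # the path bends case by case
        "high_volume": True,
    },
}


def unmet(answers: dict) -> list:
    """Return the criteria a workflow fails, in checklist order."""
    return [c for c in CRITERIA if not answers[c]]


for name, answers in workflows.items():
    gaps = unmet(answers)
    verdict = "a place to start" if not gaps else "not yet, missing: " + ", ".join(gaps)
    print(f"{name}: {verdict}")
```

Looking at `high_volume` alone, the two workflows are indistinguishable. Only the full checklist separates them.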

This is the part we most want practice owners to hear, and it is the part that takes the longest to land. The goal of automation is not to silence the loudest problem in the building. It is to find the work where a machine can be reliably, verifiably, and safely better than the status quo, and then to let that one good choice compound into the next. Because the deeper truth underneath all of it, the one that animates everything we build, is that administrative failure is not a clerical inconvenience. A call that never gets returned, a verification that never gets run, a reminder that never goes out: these are not back-office problems. They are how patients quietly fall through. Deciding well about what to automate is not an operations decision dressed up in software. It is a decision about who gets cared for and who gets missed, and we think it deserves to be made with exactly that much seriousness.

So when an owner asks us where to start, we no longer point at the fire. We ask what the work is made of, whether we will know the moment it goes wrong, and whether the ground beneath it will hold. The right place to begin has a way of stepping forward on its own.