Manish Chiniwalar's Station

Agent-First Product Design

Apr 10, 2026 · 8:13 · 1 article



Hosts

Wei-Lin
Valentina

Source Articles

PostHog on X: "The golden rules of agent-first product engineering"

x.com

Transcript

Wei-lin: You know, it's like... if you give a kid a Lego set, right? If the tires and axles are already glued together, pre-assembled, what are they going to make? Only cars, right? Trucks. Vehicles.

Valentina: Yeah, makes sense. You're limited by what's already there.

Wei-lin: Exactly. But if you give them all the individual pieces, the raw bricks, the separate tires, the axles, they can make anything. Motorcycles, a tire swing, maybe a robot that looks like a tire swing, who knows? The possibilities just open up.

Valentina: That's a good way to put it. So, what are we building with our Legos today?

Wei-lin: Well, this article I read, it's about building products for AI agents. And it's saying the same thing, actually. About giving them the raw pieces.

Valentina: Ah, okay. I was thinking, like, you just bolt on a chatbot, you know? Make some nice APIs, and that's it. Agent-ready.

Wei-lin: See, that's what I thought too. But the article, it starts with this very specific example. You ask your product's AI to set up an A/B test. So, the agent goes, 'Okay, I got this!' It creates the feature flag, sets up the analytics dashboard. And then... it just stops.

Valentina: Stops? Like, 'I need a coffee break' stop?

Wei-lin: No, it literally stops and says, 'Hey, human. Can you go into the app, manually create this 'experiment' thing yourself, and then paste the ID back into my chat window?'

Valentina: Oh, no. That defeats the whole purpose, right? Like my airline chatbot. I ask it to change a flight, it tells me the weather in my destination city. Very helpful. But the second I say, 'I need to apply a flight credit from a previous cancellation,' it's like, 'Sorry, I don't have that superpower. Please call a human.' It just gives up. The bot doesn't have the same powers as the human agent on the phone.

Wei-lin: Exactly! So the article's first rule, it's about that. Give agents the same capabilities as users. If a human can do something in your product, the agent should be able to do it too. Otherwise, you end up with these leaky abstractions, you know?
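A minimal sketch of that first rule, assuming a tool-registry pattern: the agent's tools are derived from the exact handlers the human-facing UI calls, so there is no capability gap. All function and tool names here are illustrative, not PostHog's actual API.

```python
# Hypothetical sketch: back the agent's tools with the same handlers the UI
# uses, so anything a human can do in the product, the agent can do too.

def create_feature_flag(key: str) -> dict:
    """Handler the UI's 'new flag' button would call (illustrative)."""
    return {"type": "feature_flag", "key": key}

def create_experiment(name: str, flag_key: str) -> dict:
    """Handler the UI's 'new experiment' form would call (illustrative)."""
    return {"type": "experiment", "name": name, "flag_key": flag_key}

# One registry feeds both the UI routes and the agent's tool list, so the
# agent never has to stop and say "please create the experiment manually".
HANDLERS = {
    "create_feature_flag": create_feature_flag,
    "create_experiment": create_experiment,
}

def agent_tools() -> list[str]:
    """Every capability a human has is visible to the agent."""
    return sorted(HANDLERS)

def agent_call(tool: str, **kwargs) -> dict:
    return HANDLERS[tool](**kwargs)
```

The design choice is that there is exactly one list of capabilities; the A/B-test dead end in the example above happens when the agent's tool list is a hand-picked subset of it.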

Valentina: Yeah, it’s like... why even bother if I still have to do half the work myself? My initial thought was just to give the agent more pre-built tools, like specific buttons it can press in the UI, but automatically. But this example, it's about deeper access.

Wei-lin: Yeah, exactly. And this leads to their second rule, which is, 'Let agents reason at the right layer of abstraction.' They used to give their agent very specific tools, like 'get-insight' or 'get-funnel.' But for an agent, that's the wrong layer; it ends up chaining four different tool calls just to answer one simple question.

Valentina: So, what's the 'right layer' then? If not those specific tools?

Wei-lin: Raw SQL. They just gave it direct SQL access to their database. So instead of four clunky API calls to answer 'Why did signups drop last week?', it just writes one elegant SQL query. Just like we talked about with the Legos, right? Giving it the raw materials.
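A toy version of that, with sqlite3 standing in for PostHog's analytics store (the article doesn't say which engine or SQL dialect they actually use): one query answers a question that would otherwise take several chained tool calls.

```python
import sqlite3

# Tiny stand-in dataset: signup events on two different days.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event TEXT, day TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("signup", "2026-04-01")] * 5 + [("signup", "2026-04-08")] * 2,
)

# With raw SQL access, "why did signups drop last week?" is one query,
# instead of a chain of get-insight / get-funnel style tool calls.
rows = conn.execute(
    """
    SELECT day, COUNT(*) AS signups
    FROM events
    WHERE event = 'signup'
    GROUP BY day
    ORDER BY day
    """
).fetchall()
```

Here `rows` shows signups falling from 5 to 2 between the two days; the agent sees the drop directly in the result set rather than stitching it together from separate tool outputs.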

Valentina: Wow. Direct SQL. That's... bold. So they're treating the agent like a very smart, very new hire that you're onboarding, not a dumb robot that just follows a checklist.

Wei-lin: Yeah, that's what they say. Treat it like a brilliant new hire. And the third rule follows from that: 'Front-load essential context, but let the agent learn the rest.' They used to just give it a four-line prompt like, 'Here are some tools, good luck.' Very generic.

Valentina: Like, 'Figure it out, AI!' That sounds like my first week at my old job. Not very helpful.

Wei-lin: Right? So now, because they know any agent connecting to them is there to query PostHog data, they pre-load it with core concepts. Like, what's a feature flag, what's an experiment, their specific SQL syntax, critical querying rules like always filtering by time range. The stuff you know every session will need.
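Sketched as code, front-loading might look like assembling a system prompt from the concepts and rules every session needs. The wording and structure here are my assumptions, not PostHog's actual prompt; only the ideas (core concepts, always filter by time range) come from the article.

```python
# Assumed structure for "front-load essential context": definitions and
# rules that every querying session will need, baked into the prompt.

CORE_CONCEPTS = {
    "feature flag": "a toggle that gates a code path per user or cohort",
    "experiment": "an A/B test tied to a feature flag",
}

CRITICAL_RULES = [
    "Always filter queries by an explicit time range.",
    "Use the product's SQL dialect, not vanilla ANSI SQL.",
]

def build_system_prompt() -> str:
    """Assemble the front-loaded context; the agent learns the rest itself."""
    lines = ["You are querying product analytics data.", "", "Core concepts:"]
    lines += [f"- {term}: {defn}" for term, defn in CORE_CONCEPTS.items()]
    lines += ["", "Rules:"] + [f"- {rule}" for rule in CRITICAL_RULES]
    return "\n".join(lines)
```

The point of the split is that this prompt stays short: it carries only what every session needs, and anything situational is left for the agent to discover by querying.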

Valentina: Okay, so you give it the fundamentals, the grammar, the basic vocabulary. But you don't micromanage every single thing it does.

Wei-lin: Exactly. And the last rule, this one I found really interesting: 'Embed craft and opinion into 'skills'.' It's not just about what the product can do, but how an expert would use it. They teach the agent expert-level nuances.

Valentina: Like, what kind of nuance?

Wei-lin: Like, for retention analysis, don't use 'signed in' events because they can be inconsistent. Always use '$pageview' events. It's like a senior analyst whispering tips to a junior one. It ensures the agent uses the product not just correctly, but well, based on their internal best practices.
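A skill could be as simple as a lookup of expert guidance per task, consulted before the agent plans a query. The retention tip is the one from the article; the structure around it, and the funnel example, are assumptions.

```python
# Assumed "skills" structure: per-task advice a senior analyst would give.

SKILLS = {
    "retention": (
        "Use $pageview events, not 'signed in' events; sign-in events are "
        "inconsistently instrumented and will skew retention."
    ),
    # Illustrative second entry, not from the article:
    "funnel": "Order steps by product flow, not by event volume.",
}

def guidance_for(task: str) -> str:
    """Fetch the embedded craft-and-opinion note for a task, if any."""
    return SKILLS.get(task, "No specific skill; use general best practices.")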

Valentina: That's smart. Otherwise, the agent might give you technically correct but misleading data. So it’s like teaching it the 'why' behind the 'what'.

Wei-lin: Yeah. It's embedding that domain knowledge. So it's not just automating tasks, it's automating good tasks.

Valentina: Okay, this is all very clever. But also, this sounds like a massive engineering investment, no? For something that, let's be honest, feels a bit niche right now. Are we really sure we want to build products for AIs instead of humans? You know, you risk losing that serendipity, that creative intuition of human analysis, by codifying a 'correct' way for an agent to think about problems.

Wei-lin: Hmm, that's a good point. I mean, they do mention this 'traces hour,' where they review what the agents actually did, and they found a case where the AI correctly intervened when it spotted a weird data pattern the human hadn't noticed. So there's some serendipity there too, maybe.

Valentina: But that's one instance, Wei-Lin. How many times does it just follow the 'correct' path and miss something a human would have stumbled upon because they weren't looking for a 'correct' path, just a path? I just feel like this is codifying biases into the system.

Wei-lin: I see your point. But then, if you're giving it raw SQL access, what about the catastrophic risk? Like, an elegant query that accidentally deletes all your production data? How do you prevent that?

Valentina: That's a very good question. I mean, they did mention that everything in their API needs to be accessible, but they also have manual opt-in via YAML config files. So nothing is exposed by default. Maybe that's the guardrail?
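The opt-in idea can be sketched as an allowlist (the article mentions YAML config files; a plain Python dict stands in here): endpoints stay invisible to the agent unless a human has explicitly exposed them.

```python
# Assumed shape of "manual opt-in" exposure; names are illustrative.

# Everything the product's API can do:
ALL_ENDPOINTS = ["create_flag", "create_experiment", "delete_project"]

# Human-maintained opt-in config (YAML in PostHog's case, per the article).
# Anything absent from this mapping stays hidden by default.
EXPOSED = {"create_flag": True, "create_experiment": True}

def agent_visible() -> list[str]:
    """Only explicitly opted-in endpoints ever reach the agent."""
    return [e for e in ALL_ENDPOINTS if EXPOSED.get(e, False)]
```

So `delete_project` exists in the API but never appears in the agent's tool list, because no one opted it in.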

Wei-lin: But that's for the tools, not necessarily for the raw SQL, right? If you give it direct SQL access... there's a big leap from 'don't expose by default' to 'can't accidentally drop a table'.
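One common guardrail for exactly this gap, and purely my assumption since the article doesn't describe their sandboxing: hand the agent a read-only database connection, so even a malformed or malicious query cannot drop or delete anything. sqlite3's `query_only` pragma shows the idea.

```python
import sqlite3

# Set up some data on a normal, writable connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event TEXT)")
conn.execute("INSERT INTO events VALUES ('signup')")

# Flip the connection to read-only before handing it to the agent.
conn.execute("PRAGMA query_only = ON")

# Reads still work:
signups = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

# Writes, including DROP TABLE, now raise an error instead of succeeding:
try:
    conn.execute("DROP TABLE events")
    dropped = True
except sqlite3.OperationalError:
    dropped = False
```

Production analytics stores usually offer the same idea as a read-only role or replica, which closes the gap between "not exposed by default" and "can't drop a table".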

Valentina: Well, they also say they do a 'traces hour' and build evaluations based on what humans caught. So maybe it's this continuous monitoring? Like, you launch it, but you don't just walk away, you watch it like a hawk. It's not fully autonomous in the sense of 'set it and forget it'.
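The review-then-evaluate loop could be sketched like this; the trace format and the check are entirely assumed, since the article only names the practice of building evaluations from what humans catch.

```python
# Assumed trace shape: which event the agent used for a retention question.
traces = [
    {"task": "retention", "used_event": "signed in"},  # mistake a reviewer caught
    {"task": "retention", "used_event": "$pageview"},  # the correct choice
]

def eval_retention_event(trace: dict) -> bool:
    """Regression eval distilled from review: retention must use $pageview."""
    return trace["used_event"] == "$pageview"

# Each "traces hour" finding becomes a check that runs on every future trace.
failures = [t for t in traces if not eval_retention_event(t)]
```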

Wei-lin: I guess... but that feels like a lot of human oversight for something meant to be 'agent-first.' It kind of undermines the whole autonomy thing, no?

Valentina: Or it means 'agent-first' doesn't mean 'agent-only'. It's still a tool for humans to use more efficiently. It's not about replacing us entirely, but making us faster, more powerful.

Wei-lin: I'm still not convinced the risk is always worth it. What if the 'expert opinion' you embed is wrong for a new scenario? You've just hardcoded a potential error into your agent. Something to keep thinking about, anyway. I'm Wei-Lin.

Valentina: And I'm Valentina. This has been Manish Chiniwalar's Station.
