Hacker News | Tarcroi's comments

I find it improbable that a car could have gotten through in a marathon as big as London's, given how well organized it is. In any case, having run quite a few marathons, I can tell you that a marathon is anything but a distraction; it's a real challenge.

This coincides with Anthropic's peak-hour announcement (March 26th). Could the throttling be partly a response to infrastructure load that was itself inflated by the TTL regression?


It would be too fucking funny if this were the case. They're vibe coding their infrastructure and they vibe coded their response to the increased load.


You'd think they would have dashboards for all of this stuff, to easily notice any change in metrics and be able to track down which release was responsible for it.


They probably do; then they pipe it into a bunch of Claude subagents, and you get the current mess.


Different thing. SkillKit distributes skills to agents. Skrun runs them as APIs.


As I mentioned in another comment here, I've been working on an open-source alternative. Multi-model, 5 providers with fallback. Happy to share the repo if you're interested.


I agree on the commodity point; that's why I went multi-model from the start.

The registry question is the one I'm thinking about the most. Right now it's flat. I plan to integrate usage data (success rates, cost, trust scores) so that the registry tells you which skills actually work well, and that's where the value is.
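Roughly, a scored registry entry would look something like this (field names are illustrative, not a final schema):

    from dataclasses import dataclass

    @dataclass
    class SkillEntry:
        name: str
        runs: int = 0
        successes: int = 0
        total_cost_usd: float = 0.0
        trust_score: float = 0.0  # e.g. weighted from reviews and run history

        @property
        def success_rate(self) -> float:
            # Fraction of runs that completed successfully.
            return self.successes / self.runs if self.runs else 0.0

        @property
        def avg_cost_usd(self) -> float:
            # Average spend per run, for ranking skills by cost.
            return self.total_cost_usd / self.runs if self.runs else 0.0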

Your article looks interesting, I'll read it.


I've been building exactly this. It's open source and multi-model (5 providers with fallback). For now it runs locally, but the architecture is designed for self-hosted deployment.
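The fallback loop is nothing fancy, something like this (provider names are placeholders; only the count comes from the project):

    from typing import Callable

    PROVIDERS = ["anthropic", "openai", "google", "mistral", "local"]

    def run_with_fallback(call: Callable[[str, str], str], prompt: str) -> str:
        """Try each provider in order and return the first successful result."""
        last_error: Exception | None = None
        for provider in PROVIDERS:
            try:
                return call(provider, prompt)
            except Exception as exc:  # rate limit, outage, auth error...
                last_error = exc
        raise RuntimeError("all providers failed") from last_error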


You're right. For now, it's only local. For a public deployment, the idea is to have sandboxes and verification steps. That won't eliminate the risk of prompt injection entirely, but so far no solution has fully resolved that problem.


Thanks! Sandbox deployment is on the roadmap. I already have a RuntimeAdapter interface in my architecture that I'll use to isolate the VMs. I'm doing exactly the same thing: I'm cross-referencing the models to challenge their plans, and my code reviewer agent's API is a big help.
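The adapter boundary is roughly this shape (method names are illustrative; only the RuntimeAdapter name is from the actual code):

    import subprocess
    from abc import ABC, abstractmethod

    class RuntimeAdapter(ABC):
        """Abstracts where a skill runs: local process now, sandboxed VM later."""

        @abstractmethod
        def run(self, command: list[str], timeout: float) -> str:
            """Execute a command inside the runtime and return its stdout."""

    class LocalAdapter(RuntimeAdapter):
        """Current local mode; a VM-backed adapter would implement the same API."""

        def run(self, command: list[str], timeout: float) -> str:
            result = subprocess.run(command, capture_output=True, text=True,
                                    timeout=timeout, check=True)
            return result.stdout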


Can this be run a second time and compared against a previous audit?


Curious: are you thinking about this for continuous monitoring, or more for before/after comparison when the agent gets updated?


Both. In my opinion, an agent has a life cycle and needs observability.


It's true. Makes sense.


Thanks for asking. Not yet, but it's in the backlog; I'll be adding it in the future.


Hi, I'm the "colleague". Impatient to have your feedback!


Thanks for sharing a cool project! Just fyi, more idiomatic English would be "eager to have your feedback" since "impatient" implies frustration.


Ha, thanks for the correction! I'll remember that!

