
Nah they do. They push Sonnet pretty hard rather than Opus for most tasks.

Also: https://platform.claude.com/docs/en/agents-and-tools/tool-us...


Core AWS services use it too. Even if you are hosted in another region, you can still be affected by a us-east-1 outage.

The idea would be to actually load distribute between different cloud providers.

But even then, the load balancer needs to run somewhere, which becomes a new single point of failure.

I’m sure someone smarter than me has figured this out.


Yes, they have. It just costs a shit ton of money and is extremely difficult to get the suits to sign off on TWO full 'cloud services' bills. It generally doubles your cost and workload and increases your uptime by a couple of hours a year, assuming you don't have bugs that affect one or the other cloud in your deployment stack.

It's basically a wash for almost all organizations for twice the cost and effort.


Ok...

But where does the load balancer actually run? Does the main load balancer run on AWS, and the backup on Oracle?


Short TTL DNS or BGP anycast.
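
For the short-TTL DNS option, here's a rough sketch of the kind of health-check-and-flip I mean. It's a toy: the IPs, hostname and zone ID are made up, and it uses Route 53 purely as an example API, which of course parks the DNS control plane back on a single provider.

    import boto3
    import requests

    # Hypothetical endpoints: primary on one provider, standby on another.
    PRIMARY_IP = "203.0.113.10"    # e.g. AWS
    STANDBY_IP = "198.51.100.20"   # e.g. some other provider
    ZONE_ID = "Z_EXAMPLE"          # hypothetical hosted zone
    NAME = "app.example.com."

    def healthy(ip):
        try:
            return requests.get(f"http://{ip}/healthz", timeout=2).ok
        except requests.RequestException:
            return False

    def point_dns_at(ip):
        # Short TTL so resolvers pick up a flip within roughly a minute.
        boto3.client("route53").change_resource_record_sets(
            HostedZoneId=ZONE_ID,
            ChangeBatch={"Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": NAME, "Type": "A", "TTL": 60,
                    "ResourceRecords": [{"Value": ip}],
                },
            }]},
        )

    if __name__ == "__main__":
        point_dns_at(PRIMARY_IP if healthy(PRIMARY_IP) else STANDBY_IP)

Run something like that from outside either provider and the record follows whichever side is up; anycast gets you the same effect without waiting out TTLs.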

Also, these things don't go down THAT often... well, AWS doesn't, not some others. More uptime than you probably had before. Even the stock market takes a few days off every decade. Just ask W.

> not some others.

Looking at Azure and GitHub in particular. ;)



Not really. Your clients can round-robin across connection points on different providers and move write heads upon connection. If you worry about hard-coding, you can reduce the surface to a per-context minimal first contact point.
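
A minimal sketch of the client side of that, with made-up endpoints (the /whoami path and write_head field are hypothetical, just to show the shape):

    import random
    import requests

    # Hypothetical connection points spread across providers.
    ENDPOINTS = [
        "https://a.provider-one.example.com",
        "https://b.provider-two.example.com",
        "https://c.provider-three.example.com",
    ]

    def connect():
        # Shuffle so clients spread themselves out instead of pinning to one provider.
        for base in random.sample(ENDPOINTS, len(ENDPOINTS)):
            try:
                resp = requests.get(f"{base}/whoami", timeout=2)
                if resp.ok:
                    # The response tells the client where the current write head lives.
                    return base, resp.json().get("write_head")
            except requests.RequestException:
                continue
        raise RuntimeError("no provider reachable")

No load balancer to keep alive; the only shared state is the endpoint list baked into the client.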

Bingo. This is the one most people don't know about.

I was surprised recently when setting up CloudFront with AWS certs that it forced me to use us-east-1 to provision the certs.

STS is only in us-east-1, I believe.

Yep. All of the identity and access management services for the non-China public cloud are in us-east-1. https://news.ycombinator.com/item?id=48071472

All the control plane, that is. The data plane is distributed, and roles using IAM to access resources can still do so during a control-plane outage.

Yes, you're right, but in my experience the boundary between the data plane and the control plane is not always clear, and it's especially unclear for these foundational services.

There were enough "surprisingly control-plane" IAM operations in the AWS services I dealt with that we had to exercise extreme caution during outages.


It's literally documented. Try reading it and educating yourself.

I worked there.

Even if I were the stupidest and least curious engineer around (and I was far from it), that's basically irrelevant to what you're scolding me for here…

As part of a team with both software development and operational responsibilities, like most teams at AWS, I had to deal not only with the consequences of my own imperfect knowledge, but also with the imperfect knowledge of my coworkers past and present.


No offense, this is a crazy worthless contribution to the discussion.

Why?


Because everyone in these replies is in complete denial about the physical limits of memory and scaling in general. Y'all are literally living in an alternate reality where model capability increases as size decreases; it's simply not the case. There will be small, focused models that perform well on very narrow tasks, yes, but you will not have "agents" capable of "building most things" running on consumer hardware until more capable (and affordable) consumer hardware exists.

Ah, you haven't realized that consumer hardware gets more capable over time

Not this year, when many vendors either offer lower memory capacities or demand higher prices for their devices.

Correct, the progress is not perfectly linear. But do you believe technological progress has stalled forever? If so, I'd get out of tech and start selling bomb shelters.

Do you really think the trend in consumer hardware is heading towards more memory and better specs? Apple's most popular product this year is a laptop with 8GB of RAM...

The trend is heading in the opposite direction: fewer options for strong consumer hardware, and a push towards cloud-based products. This is a memory issue more than anything. Nvidia is done selling its GDDR7 to gamers and people with AI girlfriends.


This is more than just the hardware evolving over time; we're also seeing big improvements in quantization and efficiency.

There are physical limits to how much you can compress data. I'm just saying, don't sit on your hands waiting for this to happen, because it's probably not going to for another decade-plus. There's no use in waiting; just write the code your fkin self and stop being lazy.

Just so that I have your position straight: you actually believe that over the long term, like 10, 20 years, that the amount of RAM in a laptop is going to go down?

It's not out of the realm of possibility, but I just want to make you aware that this would be a very surprising development in computing history.


This seems to be a different discussion than was going on up thread about:

> in the next few years a "good enough" model will run on entry-level hardware


Exactly. In the next few years, entry-level hardware will not be advancing beyond 16GB. And anything beyond 32GB will remain decidedly high-end.

And that's for laptops with unified memory. In the desktop space, 8GB discrete GPUs are going to be sticking around for a very long time.


I guess we'll find out! I bet all the vendors who supply RAM are looking at the current shortages and thinking "well, it's a shame we could never manufacture more RAM than we currently do."

A future with less RAM is possible if more applications use computational storage with SSD/NVMe.

But that's not my main argument. My main argument is that it's delusional for OP to think it's reasonable to expect that we'll soon be able to run models on consumer hardware capable of building basically most things.

But I do think there will be many compromises made in consumer electronics. I don't think the powers that be are eager to give consumers all the best memory (that should be clear by now). There are three DDR5 DRAM manufacturers in the world that have to supply memory to all the world's militaries, governments, and datacenters/corporations. Consumers are the last priority.


Did they modify their post? I can't see who claimed that consumer hardware will be able to build most things.

> If you looked at a graph of GPU power in consumer hardware and model capability per billion parameters over time, it seems inevitable that in the next few years a "good enough" model will run on entry-level hardware.

Of course there will always be larger flagship models, but if you can count on decent on-device inference, it materially changes what you can build.

I'm making some assumptions about what they're saying, but it seems clear they have no idea what they're talking about and that they're betting their competency on this technology.


If you're not paying attention to what's happening with small models, I suggest you take a closer look. Keeping parameter count constant, the quality of small models is rising fast. When you look at what you could do with Llama just 3 years ago vs Gemma 4 on the same 16GB hardware, the trend is clear.

Meanwhile, this year Apple bumped the base of their Mac lineup from 8GB to 16GB RAM, and the iPhone 17 Pro ships with 12GB. The Neo is at 8GB but is a brand new product tier which is not comparable to any past model.


Small models are gaining useful reasoning ability, and that's a genuinely helpful development, but they'll be heavily limited in world knowledge for the foreseeable future. BTW, the base of the Mac lineup is now once again an 8GB device with a small and low-performance SSD. Many people will tell you that it's broadly comparable (though of course not identical!) to the original base model M1.

For many tasks, including lots of agentic applications, world knowledge is not a "must-have."

To me the Neo is an exception, and doesn't represent the core Mac lineup, which is all at 16GB+ of RAM. If you're developing pro software that would rely on an on-device LLM, you probably wouldn't be targeting the Neo anyway.


Anything can technically "run" on almost any hardware; the meaningful question is what the real-world performance is. I for one have made a case in this thread that DeepSeek V4 is de facto optimal for wide batching, not single-request or single-agent inference, even on consumer hardware (which is unique among practical AI models). I might still be wrong, of course, but if so I'd like to understand what's wrong with my assumptions.

I find it disconcerting that an article about cognitive debt contains many "tells" of being written by AI.

Independent of that, the article is just a summary of a HN thread...

I had the same reaction, but the article is not AI-generated according to Pangram, which I've generally found reliable. I wonder if LLM turns of phrase, and even thought patterns, are creeping into normal human thought.

Or, stay with me here, the LLMs were trained on how we, statistically, write.

There are typical LLM voices and styles, just like human writers have differentiated voices and styles. And some common elements of the typical LLM style are distinct from humans I've previously read.

I recognize this. It's also the case that I suspect I've read more about how annoying suspected LLM output is to read than I've read actual LLM output. The slop is, to me, an incredibly unwelcome contribution from humans who don't enjoy the craft, but complaining about it is equally stuck in the froth, and further exacerbates it, rather than distilling things down to the substance. That is, it keeps the focus on the surface rather than on what the core content is and whether it has value.

LLM writing doesn't have substance; it's statistically likely text generated from some bullet points, without intention or style.

When you say "we" you're talking about Twitter, right?

I used that once, during a conference about six years ago, and never since. My use of "we" refers to humanity.

In that case, I feel that "we" needs some correction. Because these slopcannons get their ammunition from scraping the gargantuan septic tanks of the Internet, like Twitter and Facebook and whatever 600M Chinese people are using that I've never heard of.

Comparatively very little of that "humanity" corpus is coming from Shakespeare or Swift or Douglas Adams, much as we might prefer if it did.


I never claimed it was trained on the notables of humanity. On the other hand, to your adjacent point (to twist it) that humanity needs some correction, I wholeheartedly agree.

Anytime I see “this is not just x, it’s y” I can almost guarantee, with a high degree of confidence, that slop was used.

I'm still pissed that I had to practice removing that from my writing habits. I liked that device, dammit!

It's not just AI-generated, it's also slop!

As someone from outside the Anglophone cultural sphere, when I first learned to write in English, the kind of writing that AI now often produces was taught to me as “formal” writing.

But these days, when I write in that formal style, people sometimes say it sounds like AI. That has been a difficult and frustrating point for me.

I still find the subtle difference hard to understand.


I was raised and educated well inside the Anglosphere (USA) and was also taught to write formally in that way.

Do the people who say you sound like AI give you any specifics?

Also, if you don't mind, what was your English education like? I understand that quite a few Americans work in South Korea as teachers but I have no details about how that manifests.


That used to happen to me more often. When I first came to HN, and even now if I am not careful, my comments can get flagged. Also, when I translate from Korean using DeepL and paste the result, people often say it sounds awkward or unnatural, or it gets flagged. I studied English more seriously in graduate school, although I dropped out. In Korea, there are quite a few Americans who teach English; public schools often have native English-speaking instructors. In my case, though, I learned it mostly in graduate school, and universities also make students study English semi-compulsorily.

In Seoul, there are probably many teachers who mainly teach middle and high school students, but a lot of that is through private education rather than the public school system.


It's worth mentioning that Pangram is more confident in its positive detections than its negative ones, as stated by the founder in an interview on the most recent ThursdAI episode.

I think it's bidirectional. We change our writing based on what we see (AI-generated content on the internet), and AI will learn based on what we write.

Guarantee enterprises with SLAs aren't accepting them

The thing about an SLA is that once you've broken it, you've lost the trust. It doesn't _really_ matter what the cost of breaking it is; nobody chooses their platform based on the refund they'll get if they're down. But they absolutely do choose based on reliability and uptime. The enterprise SLA refund credit will show up as a (big) metering blip, but the problem is that the people who signed the contracts are going to be speaking to GitLab now.

I wonder about the nuance within the data. For example, does AI do much worse with children than with adults, but still better overall? Or with biological males vs. females? I think we'd want it to do better across all groups, ages, etc., so we're not introducing some kind of horrible bias that results in deaths or serious health consequences for some groups.
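
To make that concrete, here's a toy sketch with made-up numbers showing how an overall win (or tie) can hide a subgroup loss:

    import pandas as pd

    # Hypothetical per-case results: 1 = correct call, 0 = miss,
    # for both the AI and a clinician baseline.
    df = pd.DataFrame({
        "age_group":         ["child", "child", "child", "adult", "adult", "adult"],
        "sex":               ["f", "m", "m", "f", "m", "f"],
        "ai_correct":        [0, 1, 0, 1, 1, 1],
        "clinician_correct": [1, 1, 1, 0, 1, 0],
    })

    # Overall, the AI looks as good as the baseline...
    print(df[["ai_correct", "clinician_correct"]].mean())

    # ...but the per-group breakdown shows it doing much worse on children.
    print(df.groupby(["age_group", "sex"])[["ai_correct", "clinician_correct"]].mean())

That's the kind of slice I'd want reported before deciding the system is safe to deploy for everyone.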

True. But this makes it easier to stand out in a sea of monotony.

It's not, but the author did say they have used this test against models as they come out. So it's possible that this put the unpublished text into the training data for the next model, somehow linked back to the author's identity.

> The obsession with the word "seam" as it pertains to coding

I quite liked this term when it started using it. And I appreciate the consistent way it talks about coding work even when working on radically different stacks and codebases


"Seam" has been stretched by AI from its original legacy-code context to any point in code where something can be plugged in. I actually asked an AI about this a few weeks ago because I was surprised by the consistent, frequent use of "seam".

Frequent words I see from GPT: "shape", "seam", "lane", "gate" (especially as verb), "clean", "honest", "land", "wire", "handoff", "surface" (noun), "(un)bounded", "semantics" (but this one is fair enough), and sometimes "unlock"

It feels like AI really likes to pick the shortest way to express an idea even if it isn't the most common one, which I suppose would make sense if that's actually what's happening.


This has happened in other industries before. Drafting, for example, when CAD arrived: entry level wasn't "can draw, willing to learn" anymore, but demanded high domain understanding. So the pathway became compressed learning through study and field exposure.

Studying senior drafters' "red lines": what they changed in the initial drawing and why, RFI responses, etc. Reverse engineering good work. Failed design studies, etc.

SWE equivalents: PRs, code review, studying high-quality codebases (guess what: LLMs are amazing at helping here), pair programming (learning why what the LLM did was wrong, how to improve it, etc.), customer support, debugging prod incidents, studying post-mortems, etc.

We don't hire juniors and throw them boilerplate and tiny bugs while expecting them to learn along the way ad hoc through some pair programming and the occasional deep end. We give them specific tasks and studies that develop their domain understanding and taste, actively support and mentor them, and expect them to drive some LLMs on the side to solve simple issues that still need human eyes on them.


> We don't hire juniors and throw them boilerplate and tiny bugs while expecting them to learn along the way ad hoc through some pair programming and the occasional deep end.

Is that generally the case though? I'm about two years into my first job in the industry and that's exactly my experience, and certainly frustrating...


>We don't hire juniors and throw them boilerplate and tiny bugs while expecting them to learn along the way ad hoc

Huh? This is exactly what almost everyone does

