“We hold a meeting to talk about the meetings, and another to plan the meetings about the meetings.”

I dropped scrum last year.


People must realize that it's people who book meetings, not processes.

And some of them book meetings to try to justify their presence.

Blaming scrum for the meetings on the calendar is scapegoating. Those meetings will show up regardless of which process you choose to replace scrum with.


Lies, damned lies.

Scrum by definition is meetings.

In fact, I like this topic, because people usually play motte and bailey with Agile (to some, Agile is a manifesto; to others, a process), but Scrum is very well defined, and it is defined by structured meetings.


Is there, in Scrum, a meeting "to plan the meetings about the meetings", as the critique suggests?

My understanding is there are 4 meetings in core scrum:

- planning (which is not about planning meetings, but about breaking down work)

- daily scrum (a short daily sync)

- review

- retrospective

None of those are 2nd or 3rd order "meeting planning" meetings.

If people throw in backlog grooming or other sessions, that's up to them.


I used to think evil killer robot discussions among AI researchers were an idea based in Hollywood, not science.

Then I realized how effective the fear was at fundraising...


I personally don't mind letting Claude write about work.

You could spend 80% of your time doing the work and 20% writing about it, or 99% doing the work and 1% copy-pasting Claude's writeup into a blog.

There is nothing wrong with writing if you are into it, and yes, you can probably do better than Claude, but I can relate to engineers who just want to build.


If you can’t be bothered to write it, why should I bother to read it?


Because it contains information of value to you? I mean, if it doesn’t, just don’t read it.



To quote a recent HN comment:

> Using AI to write content is seen so harshly because it violates the previously held social contract that it takes more effort to write messages than to read messages. If a person goes through the trouble of thinking out and writing an argument or message, then reading is a sufficient donation of time.

> However, with the recent chat based AI models, this agreement has been turned around. It is now easier to get a written message than to read it. Reading it now takes more effort. If a person is not going to take the time to express messages based on their own thoughts, then they do not have sufficient respect for the reader, and their comments can be dismissed for that reason.


To a large extent I appreciate that argument; however, I feel it applies more to throwaway comments or sales outreach: writing with low information density. In this case, the work that went into the piece is substantial and would otherwise be lost or inaccessible to me. I am genuinely grateful someone stuck their work in an LLM, said "tidy this up to post", and hit enter.


I could spend 100% doing the work with my own Claude, and 0% reading yours. That's a negative-sum outcome. I do think that the 80%/20% split is better (though anything that is mostly human voice is fine for me).


Because the failures are so frequent, and so often load-bearing, that it becomes negative-sum to even attempt to read material that appears generated.


One of my lessons from using different accelerators, whether different NVIDIA generations or a GPU->TPU port, is that someone needs to do the work of indexing, partitioning, mapping, scheduling, and benchmarking. That work is labor-intensive.

In this case, Google has already done it, and that will generally be true when well-resourced accelerator companies like Google work on the most popular operations, like attention.

As long as you use those operations, you are okay. But if you do something different, you need to be prepared to do all of this yourself.
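
For illustration, here is a minimal sketch of the kind of benchmarking work this implies, assuming PyTorch >= 2.0 on a CUDA device (the shapes and iteration count are made up): the vendor-tuned fused attention versus the same math composed out of individual ops that nobody has scheduled or tuned for your chip.

    import time
    import torch
    import torch.nn.functional as F

    def naive_attention(q, k, v):
        # the same math as the fused kernel, but as separate ops that
        # no one has partitioned, scheduled, or fused for this hardware
        scores = (q @ k.transpose(-2, -1)) / (q.shape[-1] ** 0.5)
        return torch.softmax(scores, dim=-1) @ v

    def bench(fn, *args, iters=50):
        # synchronize around the loop so we time GPU work, not kernel launches
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            fn(*args)
        torch.cuda.synchronize()
        return (time.perf_counter() - start) / iters

    q, k, v = (torch.randn(8, 16, 1024, 64, device="cuda", dtype=torch.float16)
               for _ in range(3))
    print("fused: ", bench(F.scaled_dot_product_attention, q, k, v))
    print("naive: ", bench(naive_attention, q, k, v))

The gap between the two numbers is exactly the work the vendor did for you; stray from the popular operations and that gap becomes your job to close.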


Results as good as Qwen has been posting would seem to trigger a power struggle.

I think companies that don’t navigate these correctly eventually lose.


It was inevitable.


This is why I like Dario as a CEO - he has a system of ethics that is not just about who writes the largest check.

You may not agree with it, but I appreciate that it exists.


I know the frontier “labs” are holding back publications.

I don’t think it will last among researchers who think beyond production LLMs.


Most people don't appreciate how many dead end applications NVIDIA explored before finding deep learning. It took a very long time, and it wasn't luck.


It was luck that a viable non-graphics application like deep learning existed which was well-suited to the architecture NVIDIA already had on hand. I certainly don't mean to diminish the work NVIDIA did to build their CUDA ecosystem, but without the benefit of hindsight I think it would have been very plausible that GPU architectures would not have been amenable to any use cases that would end up dwarfing graphics itself. There are plenty of architectures in the history of computing which never found a killer application, let alone three or four.


Even that is arguably not luck; it just followed a non-obvious trajectory. Graphics uses a fair amount of linear algebra, so people with large-scale physical modeling needs (among many others) became interested. To an extent, the deep learning craze kicked off because developments in GPU computation made training economical.


Nvidia started their GPGPU adventure by acquiring a physics engine and porting it over to run on their GPUs. Supporting linear algebra operations was pretty much the goal from the start.


They were also full of lies when they started their GPGPU adventure (as they still are today).

For a few years they continuously repeated that GPGPU could provide about 100 times the speed of CPUs.

That has always been false. GPUs really are much faster, but their performance per watt has mostly hovered around 3 times, and occasionally up to 4 times, that of CPUs. That is impressive, but very far from the factor of 100 originally claimed by NVIDIA.

Far more annoying than the exaggerated performance claims is how the NVIDIA CEO talked, during the first GPGPU years, about how their GPUs would democratize computing, giving everyone access to high-throughput computing.

After a few years, these optimistic prophecies stopped, and NVIDIA promptly removed FP64 support from its reasonably priced GPUs.

A few years later, AMD followed NVIDIA's example.

Now only Intel has made an attempt to revive GPUs as "GPGPUs", but there seems to be little conviction behind it, as they do not even advertise the capabilities of their GPUs. If Intel also abandons this market, then the "general-purpose" in GPGPU will truly be dead.


GPGPU is doing better than ever.

Sure, FP64 is a problem and not always available in the capacity people would like, but there are a lot of things you can do just fine with FP32, and all of that research and engineering absolutely is done on GPUs.

The AI craze also made all of it much more accessible. You don't need advanced C++ knowledge to write and run a CUDA project anymore. You can just take PyTorch, JAX, CuPy, or whatnot and accelerate your numpy code by an order of magnitude or two. Basically everyone in STEM is using Python these days, and the scientific stack works beautifully with NVIDIA GPUs. Guess which chip maker will benefit if any of that research turns out to be a breakout success in need of more compute?
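
As a minimal sketch of what that looks like in practice (assuming CuPy is installed and an NVIDIA GPU is present), the CuPy API mirrors numpy closely enough that often only the array's home changes:

    import numpy as np
    import cupy as cp

    x_cpu = np.random.rand(4096, 4096).astype(np.float32)
    x_gpu = cp.asarray(x_cpu)        # copy to GPU memory

    y_cpu = np.fft.fft2(x_cpu)       # runs on the CPU
    y_gpu = cp.fft.fft2(x_gpu)       # same expression, runs on the GPU

    result = cp.asnumpy(y_gpu)       # copy back only when needed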


> GPGPU can provide about 100 times more speed than CPUs

Ok. You're talking about performance.

> their performance per watt has oscillated during most of the time around 3 times and sometimes up to 4 times greater in comparison with CPUs

Now you're talking about perf/W.

> This is impressive, but very far from the "100" factor originally claimed by NVIDIA.

That's because you're comparing apples to apples per apple cart.


For determining the maximum performance achievable, performance per watt is what matters, as power consumption will always be limited by cooling and by the available power supply.

Even if we interpret the NVIDIA claim as referring to the performance available in a desktop, the GPU cards drew at most double the power of CPUs. Even with that extra factor, there remained more than an order of magnitude between reality and NVIDIA's claims.

Moreover, I am not sure whether around 2010 and earlier, when these NVIDIA claims were frequent, the power permitted for PCIe cards had already reached 300 W or was still lower.

In any case, the "100" factor claimed by NVIDIA was supported by flawed benchmarks, which compared an optimized parallel CUDA implementation of some algorithm against a naive sequential implementation on the CPU, instead of against an optimized multithreaded SIMD implementation on that CPU.
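
To illustrate that last point (a toy sketch with made-up sizes, not one of NVIDIA's actual benchmarks): the choice of CPU baseline alone can easily account for two orders of magnitude, since a pure-Python loop and a vectorized BLAS call differ by roughly that much on the very same CPU.

    import time
    import numpy as np

    def naive_dot(a, b):
        # "naive sequential" baseline: no SIMD, no threads
        total = 0.0
        for i in range(len(a)):
            total += a[i] * b[i]
        return total

    a = np.random.rand(1_000_000)
    b = np.random.rand(1_000_000)

    t0 = time.perf_counter(); naive_dot(a, b); t1 = time.perf_counter()
    np.dot(a, b); t2 = time.perf_counter()   # SIMD + multithreaded BLAS
    print("baseline inflation factor:", (t1 - t0) / (t2 - t1))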


At the time, desktop power consumption was never a true limiter. Even for the notorious GTX 480, TDP was only 250 W.

That aside, it still didn't make sense to compare apples to apples per apple cart...


Well, power envelope IS the limit in many applications; anyone can build a LOBOS (Lots Of Boxes On Shelves) supercomputer, but data bandwidth and power will limit its usefulness and size. Everyone has a power budget. For me, it's my desk outlet capacity (1.5 kW); for a hyperscaler, it's the capacity of the power plant that feeds their datacenter (1.5 GW); neither of us can exceed Pmax * (MIPS/W) of computation.
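
A toy illustration of that bound (the efficiency numbers are hypothetical, chosen to match the roughly 3x perf/W figure discussed above):

    P_DESK = 1.5e3               # watts: one desk outlet
    P_DC = 1.5e9                 # watts: a hyperscaler's power plant

    CPU_OPS_PER_JOULE = 1.0e10   # hypothetical CPU efficiency
    GPU_OPS_PER_JOULE = 3.0e10   # hypothetical ~3x perf/W advantage

    for label, p_max in [("desk", P_DESK), ("datacenter", P_DC)]:
        # Pmax * (ops/J) bounds sustained ops/s at either scale
        print(label, "cpu:", p_max * CPU_OPS_PER_JOULE,
              "gpu:", p_max * GPU_OPS_PER_JOULE)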


All of that may be true but it’s irrelevant.

If you’re dividing perf by perf/W, it makes no sense to yell "it's not equal to 100!" You simply failed at the dimensional analysis taught in high school.


> A few years later, AMD has followed the NVIDIA example.

When bitcoin was still profitable to mine on GPUs, AMD's cards performed better because they were not segmented like NVIDIA's. It didn't help AMD, not that it matters. AMD started segmenting because they couldn't make a competitive card at a competitive price for the consumer market.


That physics engine is an example of a dead end.


There's something of a feedback loop here, in that the reason that transformers and attention won over all the other forms of AI/ML is that they worked very well on the architecture that NVIDIA had already built, so you could scale your model size very dramatically just by throwing more commodity hardware at it.


It was luck, but that doesn't mean they didn't work very hard too.

Luck is when preparation meets opportunity.


It was definitely luck, Greg. And NVIDIA didn't invent deep learning; deep learning found NVIDIA's investment in CUDA.


I remember it differently. CUDA was built with the intention of finding and enabling something like deep learning. I thought it was unrealistic too, and took it on faith from people more experienced than me, until I saw deep learning work.

Some of the near misses I remember included bitcoin. Many of the other attempts never saw the light of day.

Luck in English often means success by chance rather than through one's own efforts or abilities. I don't think that characterizes CUDA. I think it was eventual success in the face of extreme difficulty, many failures, and sacrifices. In hindsight, I'm still surprised that Jensen kept funding it as long as he did. I've never met a leader since who I think would have done that.


Nobody cared about deep learning back in 2007, when CUDA was released. It wasn't until the 2012 AlexNet milestone that deep neural nets started to come back into vogue.


I clearly remember CUDA being made for HPC and scientific applications. They added dedicated operations for neural nets years after the boom was already underway. Both instances were reactions: people were already using graphics shaders for scientific purposes and CUDA for neural nets, and in both cases NVIDIA was like, oh cool, money to be made.


Parallel computing goes back to the 1960s (at least). I've been involved in it since the 1980s. Generally you don't create an architecture and associated tooling for some specific application. The people creating the architecture have only a sketchy understanding of application areas and their needs. What you do is have a bright idea or pet peeve. Then you get someone to fund building the thing you imagined. Then marketing people scratch their heads over who they might sell it to. It's at that point that you observed "this thing was made for HPC, etc.", because the marketing folks put out stories and material that said so. But really it wasn't. And as you note, it wasn't made for ML or AI either. That said, in the 1980s we had "neural networks" as a potential target market for parallel processing chips, so it's always there as a possibility.


CUDA was profitable very early because of oil and gas code, like reverse time migration and the like. There was no act of incredible foresight from Jensen. In fact, I recall him threatening to kill the program if the large projects it depended on to stay profitable, like the Titan supercomputer at Oak Ridge, had failed.


I remember it being less profitable than graphics for a long time.

It did make money that would be interesting to a startup, but not to a public company.


Again, it wasn't exactly a huge sink of resources. There was no genius gamble from Jensen like you are suggesting. I suspect your view here is intrinsically tied to your need to feel that you, and others in your position, are responsible for your own success, when in fact it's mostly about luck.


So it could just as easily have been Intel or AMD, despite them not having CUDA or any interest in that market? Pure luck that the one large company that invested to support a market reaped most of the benefits?

