Hacker News | new | past | comments | ask | show | jobs | submit | jstimpfle's comments

> The truth here is that you might want a lot of different operations and the C choice is not only to provide a single choice, which made a lot more sense 50+ years ago than it does today, but to provide a singularly bad default.

C has a single '+' operator, just like Rust does. And what that operator does depends on the types on the left and on the right. You can cast between integer types to get different behaviours, depending on what you want.

About u8::unchecked_sub() etc.: those are just regular functions, not really a language thing. Yes, none of that is standardized in C AFAIK, but in practice I'll happily use e.g. __builtin_add_overflow() or whatever.

We can argue all day long about what the right defaults are, checked or unchecked operations. If you want to be safe, you want the compiler to emit checks; it's probably possible to get some of those in GCC. If you want streamlined machine code, you definitely don't want a check added after every machine instruction.


> About u8::unchecked_sub() etc, those are just regular functions. Not really a language thing. Yes, nothing of that is standardized in C

Well "regular functions" in the sense that these are methods of the primitive type u8†, and of course neither C nor C++ can do that at all. So, yeah, it's a language thing.

In C++ what you'd do here instead is invent custom types and add the methods you want to them. I would give C++ credit here if the stdlib provided, say, a bunch-of-bits base type with all the bit-twiddling methods defined, and maybe specialisations for the 32-bit and 8-bit unsigned integers or something, but AFAICT it doesn't do anything like that.

"I could go out of my way to do this" is true for everything in any of the general purpose languages by their nature.

† In Rust if we define a function associated to a type T with a "self" first parameter then you can call that function as just a method on any value of type T and the appropriate parameter is inferred. So e.g. u8::checked_sub(u8::MAX, 10) is Some(245) but u8::MAX.checked_sub(10) is also Some(245) because it de-sugars to the same call.


__builtin_add_overflow() is type-generic as well, so what is the big difference from this Rust stuff? I'd say Rust is more ergonomic here (standardized calls; I don't have to resort to GCC builtins), but it's not fundamentally different in capabilities.

I really don't care about functions vs. methods; what's the difference? It's just syntax. Actually, sticking to regular function calls is mostly more readable to me than using methods with short unqualified names, mixing function calls and method calls, and nesting/chaining them.


Sure, other than the ergonomics it's the same. But other than the ergonomics, Fortran and C# are the same, so what are we even talking about?

Better ergonomics means the programmer is more likely to write what they actually meant. And if you do that, modern compilers have got better at making what you meant fast, even as they remain the same (or perhaps slightly worse) at turning vague gestures, which aren't clear about what you meant, into what you had hoped for without expressing it.


Signed quantities are a good default, and are easier to deal with when doing subtractions and mixing integers of different widths. (And "integers" includes pointers here, so it's very hard not to have different widths.)

However, unsigned integers are still very useful, I'd say essential, in low-level programming. For example when doing buffer management and memory allocation:

   - bitwise operations
   - modular arithmetic implemented with just ++, -- (ring buffers, e.g. TCP sequence numbers)
   - using the full range of an 8-bit, 16-bit, or 32-bit datatype (quite common)
   - splitting a positive quantity into two smaller quantities, e.g. using a 16-bit index as an 8-bit major index plus an 8-bit minor index
   - etc.

Don't forget that signed vs. unsigned integers is in some sense an artificial distinction. Machines put the distinction in the CPU instructions themselves; they don't track a "signed" property as part of values. And it can make sense to use the same value in different ways. However, C and many other languages decided to put a tag on the type, so operator syntax can be agnostic to signedness and the compiler will choose the appropriate CPU instruction.


> However, C and many other languages decided to put a tag on the type, so operator syntax can be agnostic to signedness, and the compiler will choose the appropriate CPU instruction.

It mostly comes up with widening conversions (signed numbers must extend the sign bit, unsigned numbers set the extra bits to zero), unsigned/signed divide (and multiply, in case of a widened result), and greater-than/less-than comparisons (and of course geq/leq). (With signed comparison, A is less than B if, starting from INT_MIN (included) and iteratively incrementing, you reach A before B. With unsigned comparison, A is less than B if, starting from 0 (included) and iteratively incrementing, you reach A before B. Phrasing comparison as range inclusion like this is convenient, since it works around the wrapping concern in a rather clean way.)


The way he conducted those interviews, and the conclusions he drew from them, may have been flawed. Because the situation now is that C has unsigned types and Java mostly does not.

And despite all the pitfalls, especially around mixing signed and unsigned in C, unsigned types are very useful; I'd in fact say that for low-level programming they are essential.


That doesn't seem to affect the extent to which Java is used across the industry, including for many workloads where, last century, companies would have used C instead.

Books like the Yourdon Structured Method were mainly targeted at business uses of C back in the day.


> He could've removed implicit signed conversion

By adding to the language itself, you mostly make stuff worse. The major reason why C is useful is its quite stable syntax and semantics; the language is typically not the area where you want to add things. It's much better (and much easier) to invent function APIs, see how they shake out, and if they're good you might get some adoption.

Bounds checks have nothing to do with data races. GP is right: you can add bounds checks, either using macros or (in C++) with templates and operator overloading.


Alas, in C or C++ you have mutable aliasing, so I'm afraid you do incur a potential data race because your bounds might alias. Be careful out there.

Also remember that in C++ you may get a reference in these cases, and if you keep that reference rather than using it immediately, you now also have a potential TOCTOU race, because the reference was only valid when you did the bounds check.


True, but you do incur potential data races _everywhere_. There's no relation to bounds checking specifically.


Ah, maybe I should have made the example clearer

With mutable aliasing, the length might change even though the data you care about did not, and so adding the check means incurring a race which did not previously exist and which the naive C programmer certainly cannot see...

We can definitely mitigate this in the type system for most real world scenarios, but you don't mitigate problems you don't know about, so knowing is what's important.


> Fun fact: wifi is not an acronym for anything, the inventors simply liked how it sounded.

Most certainly it's a reference to "Sci-Fi" or "Hi-Fi".


I always thought Wi-Fi meant wireless fidelity? (Or wireless fiction since in the end, everything is wired).


It doesn't, but the phrase was used in the early days.

https://boingboing.net/2005/11/08/wifi-isnt-short-for.html


It was made to sound like "Hi-Fi" (which stands for "high fidelity") crossed with "wireless", but "wireless fidelity" is a meaningless phrase and not what it was intended to directly mean.


> Less and less people feel it, because people old-enough to have used branch-powered VCSes have long forgotten about them, and those who didn't forget are under-represented in comparison to the newcomers who never have experienced anything else since git became a monopoly.

I'm old enough to have used SVN (and some CVS), and let me tell you, branching was no fun, so much so that we didn't really do it.


Just because there is one project apparently using this in a way that suggests someone could perceive something as a weakness, it doesn't mean it's a real weakness (nor that it's serious).

You can just not move branches. But once you can do it, you will like it. And you are going to use

   git branch --contains COMMIT
which will tell you ALL the branches a commit is part of.

Git's model is clean and simple, and makes a whole lot of sense. IMHO.


Most code most people work on isn't about algorithms at all. The most straightforward algorithm will do; maybe put some clever data structure somewhere in the core. But for the vast majority of code, there isn't any clear algorithmic improvement, and even if there were, it wouldn't make a difference for the typically small workloads that most pieces of code process.

I'll take that back a little, because there _is_ in fact a lot of algorithmically inefficient code out there, which slows everything down a lot. But after getting the most obvious algorithmic problems out of the way: even a log-n algorithm isn't much of an improvement over a linear scan if n < 1000. It's much more important to get that 100+x speedup by implementing the algorithm in a straightforward and cache-friendly way.


Now that is interesting, because git is very fast for everything I have ever done. It may not scale to Google-monorepo size; it would be the wrong tool for that. But if you are talking Linux kernel source scale, it absolutely is fast enough even for that.

For everything I've ever done, git was practically instant (except network IO, of course). It's one of the fastest and most reliable tools I know. If it isn't fast for you, chances are you are on a slow Windows filesystem, additionally impeded by a virus scanner.


The fact that Git has an extremely strong preference for storing full and complete history on every machine is a major annoyance! “Except for network IO” is not a valid excuse imho. Cloning the Linux kernel should take only a few seconds. It does not. This is slow and bad.

The mere fact that Git is unable to handle large binary files makes it an unusable tool for literally every project I have ever worked on in my entire career.


   git clone --bare --depth=1 https://github.com/torvalds/linux

Takes 21 seconds on my work laptop, indeed a corporate Windows laptop with antivirus installed. The majority of that time is simply network I/O. The cloned repository is 276 MB.

Actually checking the kernel out takes 90 seconds. This amounts to creating 99195 individual files, totaling 2 GB of data. Expect this to be ~10 times faster on a Linux file system.

So what's your problem?


--depth=1 is a hack and breaks assorted things. It's irritating. No, I can't tell you what random rakes I've stepped on in the past because of this. Yes, they still exist.

If you’d like to argue that version control should be centralized, shallow, and sparse by default then I agree.


> If you’d like to argue that version control should be centralized, shallow, and sparse by default then I agree.

I get your sentiment, but I know how working with e.g. SVN feels. Just doing "svn log" was a pain when I had to do it. The "distributed" aspect of a DVCS doesn't prevent you from keeping central what you need central; e.g. you can have GitHub or your own hosting server that your team exchanges through.

The main point of distributed is speed and self-sufficiency, which is a huge plus. E.g. occasional network outages and general lack of bandwidth are still a thing in 2026 (and will remain so to some extent for the foreseeable future).

Now, could git improve and allow some things to be staged/tiered/transparently cached better? Probably, and that's where some things like LFS come in. I don't have a large amount of experience in this field though, because what I work with is adequately served by the out-of-the-box git experience.


Then just do git pull --unshallow whenever you see fit. I normally don't do --depth 1, because cloning repositories is rarely my bottleneck. Just saying that when you need a relatively fast clone, you can have it.


Git LFS has existed for a while now. Does that fix your issue? Or do you mean that it doesn't support binary diffs?


Git LFS is a gross hack that results in pain and suffering. Effectively all games use Perforce because Git and GitLFS suck too much. It’s a necessary evil.


We use git-lfs quite contentedly, but we don't require diffs on binaries. What pain and suffering are you alluding to specifically?


Git handles large text files and large directories fairly poorly too.

