I'm reminded of the recent post about R's (CRAN's) somewhat radical approach to integration testing [1] and I wonder if something like that would help with the composition issues described here.
[1] https://news.ycombinator.com/item?id=45259623
The Julia world is already quite careful with testing and CI. Apart from the usual unit testing, many packages do employ integration testing. The Julia project itself (compiler, etc) is tested against the package ecosystem quite often (regularly and for select pull requests).
maxbond 1 hour ago [-]
I certainly didn't mean to imply that Julia's community was incompetent or that they were not doing integration testing. CRAN's approach (which is mandatory integration testing against all known dependents enforced by the packaging authority - the global and mandatory nature being what makes it different) is genuinely innovative and radical. I don't think that's an approach that should be adopted lightly or by most ecosystems, but I do observe that a.) these languages have similar goals and b.) it's an approach intended to solve problems of much the same shape as described in the article.
Again I think this approach is too radical for most ecosystems, but Julia is pursuing a similarly radical level of composability/reusability and evidently encountering difficulties with it, so I think there may be a compatibility there.
mgkuhn 6 hours ago [-]
Julia is a very powerful and flexible language. With very powerful tools you can get a lot done quickly, including shooting yourself in the foot. Julia's type system allows you to easily compose different elements of Julia's vast package ecosystem in ways that were possibly never tested, intended, or even foreseen by the authors of those packages. If you don't do that, you may have a much better experience than the author. My own Julia code generally does not feed the custom type of one package into the algorithms of another package.
bobbylarrybobby 6 hours ago [-]
One can hardly call using the canonical autograd library and getting incorrect gradients or using arrays whose indices aren't 1:len and getting OOB errors “shooting oneself in the foot” — these things are supposed to Just Work, but they don't. Interfaces would go a long way towards codifying interoperability expectations (although wouldn't help with plain old correctness bugs).
With regard to power and flexibility, homoiconicity and getting to hook into compiler passes does make Julia powerful and flexible in a way that most other languages aren't. But I'm not sure if that power is what results in bugs — more likely it's the function overloading/genericness, whose power and flexibility I think is a bit overstated.
patagurbon 4 hours ago [-]
Zygote hasn’t been the “canonical” autodiff library for some time now; the community recognized its problems long ago. Enzyme and Mooncake are the major efforts, and both have a serious focus on correctness.
jacobolus 6 hours ago [-]
One of the basic marketing claims of the language developers is that one author's algorithm can be composed with another author's custom data type. If that's not really true in general, even for some of the most popular libraries and data types, maybe the claims should be moderated a bit.
hatmatrix 5 hours ago [-]
> one author's algorithm can be composed with another author's custom data type
This is true, and it's a powerful part of the language - but you can implement it incorrectly when you compose elements together that expect some attributes from the custom data type. There is no way to formally enforce that, so you can end up with correctness bugs.
ekjhgkejhgk 1 hour ago [-]
My experience is that it is true, if you thoroughly implement the interfaces that your types are supposed to respect. If you don't, well, that's not the language's fault.
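For example, the informal `AbstractArray` interface asks a custom type to define at least `size` and `getindex`; once those exist, generic Base code composes with it. A minimal sketch (the `MyVec` type is invented for illustration):

```julia
# A minimal custom vector type implementing the informal AbstractArray interface.
struct MyVec{T} <: AbstractVector{T}
    data::Vector{T}
end

# The two methods the interface asks for:
Base.size(v::MyVec) = size(v.data)
Base.getindex(v::MyVec, i::Int) = v.data[i]

v = MyVec([1.0, 2.0, 3.0])
println(sum(v))      # generic reductions now work: 6.0
println(maximum(v))  # 3.0
```

Skipping one of these methods is exactly the kind of partial implementation that produces the composability surprises discussed in this thread.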
dhampi 6 hours ago [-]
I quit Julia after running into serious bugs in basic CSV package functionality a few years back.
The language is elegant, intuitive and achieves what it promises 99% of the time, but that’s not enough compared to other programming languages.
ForceBru 6 hours ago [-]
Has anything changed since then? What are y'all's thoughts about correctness in Julia in 2025?
postflopclarity 6 hours ago [-]
improving, still not perfect. it was true then, and is even more true now, that a large fraction of "correctness bugs" (maybe even the majority) arise from `OffsetArrays.jl`, so a simple solution besides "avoid Julia" is "avoid that package"
hatmatrix 5 hours ago [-]
The greater issue is that there is still no way to prevent those types of composability bugs.
krull10 4 hours ago [-]
The simplest approach is to always read the interface of the packages one wants to use, and if one isn't provided, to look at the code or open an issue to ask the developers about their input assumptions. One should also write tests to ensure the interface behaves in the expected manner when working with your code.
Using this approach since 2017, I've never really encountered the types of issues mentioned in Yuri's blog post. The biggest issue I've had is when some user package makes a change that is effectively breaking but they don't flag the associated release as breaking. But this isn't really a Julia issue so much as a user-space issue, and can happen in any language when relying on others' libraries.
postflopclarity 5 hours ago [-]
sure there are ways. they're just not employed as diligently as they should be. that's more of a social problem than a technical problem.
missinglugnut 4 hours ago [-]
It baffles me that they dug this hole in the first place. I have feelings on the zero-indexing vs one-indexing debate, but at the end of the day you can write correct code in either, as long as you know which one you're using.
But Julia fucked it up to where it's not clear what you're using, and library writers don't know which one has been passed! It's insane. They chose style over consistency and correctness and it's caused years of suffering.
csvance 3 hours ago [-]
Technically you don't need to know what array indexing is being used if you iterate using firstindex(arr):lastindex(arr). AFAIK the issue was that this wasn't consistently done across the Julia ecosystem including parts of the standard library at the time. No clue as to whether this still holds true, but I don't worry about it because I don't use OffsetArrays.
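A sketch of that index-agnostic style, using only Base (no OffsetArrays needed to illustrate the point):

```julia
# A sum written without assuming 1-based indexing.
function mysum(a)
    s = zero(eltype(a))
    for i in firstindex(a):lastindex(a)   # or simply: for i in eachindex(a)
        s += a[i]
    end
    return s
end

println(mysum([10, 20, 30]))  # 60

# The fragile version iterates 1:length(a); on an OffsetArray with
# indices -1:1 that would throw a BoundsError, while mysum keeps working.
```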
leephillips 5 hours ago [-]
This is a reasonable article, but way out of date now. Almost all issues raised were solved a while ago.
hatmatrix 5 hours ago [-]
I say this as a huge Julia fan, but the point is not the specific bugs in the article; it's the culture of not prioritizing correctness in computation. The initial response by many (not all) in the community was: look, those specific bugs are fixed; all languages have bugs; and, more importantly, look at the benchmark speeds of these computations! Which only reinforced this negative perception.
My understanding is that it's a difficult problem to solve, and there are people working on traits/interfaces - but these are still peripheral projects and not part of the core mission to my knowledge. In practice, composability problems seldom arise, but there is no formal way to guard against them yet. I believe there was some work done at Northeastern U. [1] toward this goal, but it's still up to the user to "be careful", essentially.
[1] https://repository.library.northeastern.edu/files/neu:4f20cn...
> the culture of not prioritizing correctness in computation
On the contrary, it is my impression the experienced Julia programmers, including those involved in JuliaLang/julia, take correctness seriously. More so than in many other PL communities.
> there are people working on traits/interfaces - but these are still peripheral projects and not part of the core mission to my knowledge
What exactly do you mean by "traits" or "interfaces"? Why do you think these "traits" would help with the issues that bug you?
munificent 5 hours ago [-]
> the culture of not prioritizing correctness in computation.
In a language with pervasive use of generic methods, I don't know what that actually means. If I write a function like:
function add3(x, y, z)
    x + y + z
end
Is it correct or not? What does "correct" even mean here? If you call it with values where `+` is defined on them and does what you expect, then my function probably does what you expect too. But if you pass values where `+` does something weird, then `add3()` does something weird too.
Is that correct? What expectations should someone have about a function whose behavior is defined in terms of calls to other open-ended generic functions?
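To make the question concrete, here is a sketch where `+` is overloaded to saturate, so `add3` silently inherits that behavior (the `Sat` type is invented for illustration):

```julia
function add3(x, y, z)
    x + y + z
end

# With ordinary numbers, + does what you expect:
println(add3(1, 2, 3))   # 6

# But + is an open generic function. Define a type whose + saturates at 10...
struct Sat
    v::Int
end
Base.:+(a::Sat, b::Sat) = Sat(min(a.v + b.v, 10))

# ...and add3 silently inherits that behavior:
println(add3(Sat(4), Sat(5), Sat(6)))   # Sat(10), not Sat(15)
```

Whether the second result is "correct" depends entirely on what `+` is supposed to mean for `Sat`, which `add3`'s author never sees.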
Grikbdl 5 hours ago [-]
Looking now, an issue such as "the base `sum!` is wrong" is still open. Which, I think, is a bit ridiculous.
leephillips 5 hours ago [-]
This is a typical example:
• The documentation (currently) of the function warns not to use it this way;
• This is a rather perverse use of the function(s) that would be unlikely unless you’re trying to break things;
• The discussion on the issue page demonstrates the exact opposite of a culture not caring about correctness;
• This kind of stuff doesn’t matter to all the scientists who are actually using Julia to do real work.
Nevertheless, sum!() and friends should be, somehow, made to avoid this problem, certainly.
krull10 4 hours ago [-]
I've nearly exclusively used Julia since 2017. I don't think this is a perverse use of such functions -- long ago I naturally guessed I could use `cumsum!` on the same input and output and it would correctly overwrite the values (which now gives a similar warning in the documentation). However, when I first used it that way I tested if it did what I expected to verify my assumption.
It is good the documentation is now explicit that the behavior is not guaranteed in this case, but even better would be if aliasing were detected and handled (at least for base Julia arrays, so that the warning would only be needed for non-base types).
Still, the lesson is that when using generic functions one should look at what they expect of their input, and if this isn't documented one should at least test what they are giving thoroughly and not assume it just works. I've always worked this way, and never run into surprises like the types of issues reported in the blog post.
Currently there is no documentation on what properties an input to `sum!` must support in the doc string, so one needs to test its correctness when using it outside of base Julia data types (I haven't checked the broader docs for an interface specification, but if there is one it really should be linked in the docstring).
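The aliasing handling suggested above could be sketched like this; note that `Base.mightalias` is an internal utility rather than a documented API, so this is an assumption about tooling, not a supported pattern:

```julia
# Fall back to an out-of-place computation when output and input may alias.
function safe_cumsum!(out, a)
    if Base.mightalias(out, a)
        copyto!(out, cumsum(a))   # compute fresh, then overwrite
    else
        cumsum!(out, a)
    end
    return out
end

a = [1, 2, 3]
println(safe_cumsum!(a, a))   # [1, 3, 6] even though out and a alias
```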
leephillips 4 hours ago [-]
But your use of cumsum!() seems natural; I can see using it that way, and might have done so myself. The use of sum!() under discussion seems weird, though.
nsajko 2 hours ago [-]
Julia is not without warts, but this blog post is kinda rubbish. The post claims vague but scary "correctness issues", trying to support this with a collection of unrelated issue tickets from all across Julia and the Julia package ecosystem. Not all of which were even bugs in the first place, and many of which have long been resolved.
The fact that bugs happen in software should not surprise anyone. Even software of critical importance, such as GCC or LLVM, whose correctness is relied upon by the implementations of many programming languages (including C, C++ and Julia itself), are buggy.
Instead the post could have focused more on actual design issues, such as some of the Base interfaces being underspecified:
> the nature of many common implicit interfaces has not been made precise (for example, there is no agreement in the Julia community on what a number is)
The underspecified nature of Number (or Real, or IO) is an issue, albeit not related with the rest of the blog post. It does not excuse the scaremongering in the blog post, however.
elcritch 6 hours ago [-]
Which is a bummer; however, the basics are always critical. Numerical stability and known correctness are still (or were) a reason a lot of old Fortran libraries were used for so long.
I'm really surprised by the list of issues as some of those are pretty recent (2024) and pretty important parts of the ecosystem like ordereddict.
sgaure 2 hours ago [-]
I use julia intensively and have done so for 5 years or so. I have never encountered anything I would call a correctness bug. I guess it depends on what you count as a correctness bug, and what you mean by "julia". The core language has no obvious bugs, but there are packages of dubious quality.
Say you use some package for numerical integration. One day you cook up your own floating point type, and use the same package with success. Then you change your floating point type subtly, and suddenly weird things start to happen. Is it a correctness bug? Whose bug?
Surely, the author of the integration package didn't have your weird floating type in mind, but it still worked. Until you made it even weirder. These are the things some people think are correctness bugs in julia. It's mostly poor coding.
max_ 6 hours ago [-]
I hear similar bugs exist in python libraries.
Any recommended libraries (or languages) that have thoroughly verified libraries?
mmacvicarprett 6 hours ago [-]
Do you mean any specific library?
max_ 6 hours ago [-]
Any, really: library, language, etc.
TimorousBestie 6 hours ago [-]
In my professional experience, the older numerics libraries tend to be more reliable, with the notable exception of Intel’s MKL.
mmacvicarprett 6 hours ago [-]
How to quickly kill a language primarily meant for technical computing.
croes 6 hours ago [-]
> My conclusion after using Julia for many years is that there are too many correctness and composability bugs throughout the ecosystem to justify using it in just about any context where correctness matters.
Does this have any impact on the cosmological emulator written in Julia?
https://news.ycombinator.com/item?id=45346538
I wonder if you can even distinguish correctness issues caused by these bugs in Julia from just the underlying ML model behaving weirdly.
leephillips 5 hours ago [-]
It certainly would if it were a timely and justified conclusion. Since it’s not, no, it has no impact.
CyberDildonics 6 hours ago [-]
I like the design of the language, but I eventually went through too many cycles of "fast compilation and/or module caching is coming in the next release" and "ahead of time compilation is coming soon" and got burned out. I remember believing the same stuff from java for years until forgetting about it.
mgkuhn 6 hours ago [-]
The pre-compilation speed/caching performance ("time to first plot") has practically been solved since 2024, when Julia 1.10 became the current LTS version. The current focus is on improving the generation of reasonably-sized stand-alone binaries.
CyberDildonics 5 hours ago [-]
I heard it for 10 years, I gave it too many chances. Each time it was solved, then it was going to be solved in a new release right around the corner, again and again. Maybe it is now, I don't care anymore.
TheRealPomax 6 hours ago [-]
I think I may have missed what alternative is being recommended instead, after scrolling through the whole article.
cs702 6 hours ago [-]
The OP shows examples of people being unable to solve a problem in Julia that they solve quickly after switching to PyTorch, Jax, or TensorFlow, so the OP is implicitly recommending those alternatives.
umvi 6 hours ago [-]
He doesn't recommend any alternatives. Looking at his GitHub profile (https://github.com/yurivish?tab=repositories), it looks like he's using a lot of Rust and Go these days, though mainly for projects unrelated to the sorts of data crunching suitable for Julia, R, etc.
acomjean 6 hours ago [-]
We’ve had a lot of scientist use R and the “tidyverse” collection of packages. Ggplot2 is fantastic for graphing.
I think a lot of them used “rstudio” to browse the data.
https://www.tidyverse.org/
https://github.com/scipy/scipy/issues?q=is%3Aissue%20state%3...
Scrolling through this list, it’s clear that many are “correctness issues.”
I do not link this to argue that scipy bugs are more serious or more frequent. I don’t think that kind of statistical comparison is meaningful.
However, I think a motivated reasoner could write a very similar blog post to the OP, but about $arb_python_lib instead of $arb_julia_lib.
I suppose my position is closer to “only Kahan, Boyd and Higham write correct numerical algorithms” (a hyperbole, but illustrative of the difficulty level).
Grikbdl 5 hours ago [-]
Explicit errors are not correctness issues. Some numerical instability issues in certain algorithms in certain corner cases might be considered as such though.
Regardless, overall, these are of a grossly different order of complexity and seriousness than the base sum function being just wrong, or a culture among developers of not verifying inputs "for performance", or things of that nature. The scientific Python community has, in my experience, a much higher adherence to good standards than that.
postflopclarity 3 hours ago [-]
I have found multiple "correctness bugs" of equal seriousness in Polars, and one of them is still open. that is not to throw shade at polars --- I love that package! but my point is that these things happen everywhere in software.
TimorousBestie 4 hours ago [-]
> Explicit errors are not correctness issues.
Yes, of course. I am not conflating the two.
> The scientific Python community has, in my experience, a much higher adherence to good standards than that.
Not in my experience. Nor am I defending Julia.
jerf 7 hours ago [-]
@dang: I'm not sure exactly when this was posted since it seems to have no date, but it's at least (2022) per HN's link from that year: https://news.ycombinator.com/item?id=31396861
I mention this because this is definitely the sort of content that can age poorly. I have no direct experience, I've never so much as touched Julia.
tokai 6 hours ago [-]
It cites an example from 2024 so the text has definitely been updated since 2022.
dang 6 hours ago [-]
Usually when an article is substantively the same but has been updated, we use the original year. I've put 2022 in the title now.
Edit: since there are (again? I seem to remember this last time) complaints about the title being a bit too baity, I've pilfered that previous title for this thread as well
The previous HN thread:
Correctness and composability bugs in the Julia ecosystem - https://news.ycombinator.com/item?id=31396861 - May 2022 (407 comments)
cs702 6 hours ago [-]
@dang, thank you for doing that.
When I posted the OP, I considered changing the title, but decided not to editorialize it, per the guidelines.
As an experiment, I would be interested to see if somebody would make a 1-based Python list-like data structure (or a 0-based R array), to check how many 3rd-party (or standard-library) functions would no longer work.
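A sketch of that experiment in Python (the class is hypothetical): functions that only iterate keep working, while anything that indexes from 0 breaks immediately.

```python
# A 1-based, list-like container in Python.
class OneBasedList:
    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, i):
        if not 1 <= i <= len(self._items):
            raise IndexError(f"index {i} out of range 1..{len(self._items)}")
        return self._items[i - 1]

    def __iter__(self):
        return iter(self._items)

xs = OneBasedList([10, 20, 30])
print(xs[1])        # 10
print(sum(xs))      # 60 -- code that only iterates still works
# ...but any third-party code that does xs[0] or xs[len(xs)-1] breaks:
try:
    xs[0]
except IndexError as e:
    print("broken:", e)
```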
add-sub-mul-div 6 hours ago [-]
We really don't need all the hall monitoring here, it's lame when you see a submission has comments but then it's meta stuff like complaints about dates and titles.
dang 6 hours ago [-]
I don't think 'hall monitoring' is fair here - it's standard for HN titles to include the year that an article dates from, and it's standard for users to point out when such a year hasn't been added to the title yet.
wk_end 6 hours ago [-]
The culture of "hall monitoring" is one of the best things about HN, IMO. It's one of the few places on the internet where people - including/not just the mods - care about maintaining a high quality of discourse.
jerf 6 hours ago [-]
"This Is A Really Cool Insect" doesn't really need (1998) in it, but for things like deficiencies of programming language it's helpful to know when the article comes from. "C++ Really Sucks (1995)" is a very different article from "C++ Really Sucks (2025)", and has very different takeaways for the reader.
refulgentis 6 hours ago [-]
Jerf's been here for 17 years, me, 16 years.
I've seen this article several times, and I'm sure Jerf has as well.
Our instinct, with years of being here, is it isn't a good fit for HN, at least at its current age and as labelled.
It is not conducive to healthy discussion to have an aged[1] blanket dismissal of a language coupled to an assertion that saying "the issues looked fixed?" is denial of people's lived experience.
[1] We can infer it was written in 2021, as the newest issue they created is from then, and they avowed never to use the language again.
I may refresh the post with more recent information at some point. In the meantime, those curious can find a short story of one newer correctness bug here: https://discourse.julialang.org/t/why-is-it-reliable-to-use-...
The person who eventually fixed the issue, mkitti, had to push through a lot of "institutional" friction to do so, and the eventual fix is the result of his determined efforts.
While his part of the story mostly played out in venues outside of the Discourse forum some of it is on display in this thread: https://discourse.julialang.org/t/csv-jl-findmax-and-argmax-...