The TLDR here, IMO, is simply stated: the OSAID fails to require public reproducibility of the scientific process of building these systems, because the OSAID fails to place sufficient requirements on the licensing and public disclosure of training sets for so-called “Open Source” systems. The OSI refused to add this requirement because of a fundamental flaw in their process; they decided that “there was no point in publishing a definition that no existing AI system could currently meet”. This fundamental compromise undermined the community process and amplified the role of stakeholders who would financially benefit from OSI’s retroactive declaration that their systems are “open source”. The OSI should have refrained from publishing a definition for now, and instead labeled this document as “recommendations”.
I really can’t overstate how much respect I have for Kuhn and the SFC. If RMS and the FSF are the Free Software movement’s past, Kuhn and the SFC are its future, and I can’t imagine anyone better to carry that particular torch.
Oh, wow. Should be pretty obvious that something isn’t open source, …well… unless the source is open…
You would think that, but even here on the Fediverse, where many users have an affection for technology and are generally wary of AI, I’ve seen people gobbling up the Open Source label when the model was open weights at best.
I’ve also seen that. And I’m not even sure whether I want to hold it against them. For some reason it’s an industry-wide effort to muddy the waters and slap “open source” on their products. From the largest company, which chose to have “Open” in its name but opposes transparency with every fibre of its being, to Meta, the current pioneer(?) of “open sourcing” LLMs, to the smaller underdogs who pride themselves on publishing their models that way… They’ve all homed in on the term.
And lots of journalists and bloggers pick up on it too. I personally think terms should be well-defined. And open source had a well-defined meaning. I get that it’s complicated with the transformative nature of AI, copyright… But I don’t think reproducibility is in question here at all. Of course we need that; it’s core to something being open. And I don’t even understand why the OSI claims it isn’t achievable… Didn’t we have datasets available up until LLaMA 1, along with an extensive scientific paper that enabled people to reproduce the model? And LLMs aside, we sometimes have that with other kinds of machine learning…
(And by the way, this is an old article, from the end of October last year.)
That’s a really beautiful & concise way of putting it <3
You see this on GitHub already. People publish paper results and manuals, along with a few files, and treat that as if it were open source. And this isn’t limited to LLMs: people with CNN papers or crawlers publish a few files and their results on GitHub as if it were open source. I think this is a clash between current scientific-community thinking plus Big Tech on one side, and Free Software and Free Culture initiatives on the other.
Additionally, you can’t expect anything Microsoft or Meta touches to remain untainted for long.