I drafted an unofficial document named ML-Policy[5][...]
[5]: https://salsa.debian.org/deeplearning-team/ml-policy/-/blob/master/ML-Policy.rst
Maybe it is time for us to build a consensus on how we tell whether a
piece of AI is DFSG-compliant or not, instead of waiting for
ftp-masters to interpret those binary blobs case-by-case.
Do we need a GR to reach a consensus?
My mind remains mostly the same as it was six years ago. After those
five or six years, the most important concept in ML-Policy is still
ToxicCandy: AI released under an open-source license but with its
training data hidden.
While I agree with Stefano's later followup that GRs are not good tools
for building consensus, I'm not sure such a policy decision is really in
the spirit of the FTP master delegation. I recognize that my skepticism
is influenced by the fact that I would consider following the proposed "OSAID" model to be a substantial weakening of the DFSG.
I don't understand your argument that this decision is not in the realm
of the ftpmaster activities. How could it *not* be, given they are the
team deciding NEW queue acceptance, and that most notably they do so
based on licensing aspects?
The companies [...] want to restrict what you can actually use it
for, and call it open source? And then OSI makes a definition that
seems carefully crafted to let these kinds of licenses slip through?
What is the OSI's motivation for creating such an incredibly lax definition for open source AI? Meta is already calling their absolutely-not-open-source model Open Source and promoting it as such, without so much as a *peep* from the OSI condemning the abuse of the term. (Although, while doing a quick search to make sure that's true, I found this link from the OSI to an article that keeps insisting that Llama 3 is open source: https://opensource.org/press-mentions/meta-inches-toward-open-source-ai-with-new-llama-3-1)[...]
Meta is confusing “open source” with “resources available to
some users under some conditions,” two very different things.
We’ve asked them to correct their misstatement.
On 2024/10/29 13:03, Stefano Zacchiroli wrote:
To make Llama models OSAID-compliant Meta [...] will also have to:
[...] (3) release under DFSG-compatible terms their entire training pipeline (currently unreleased).
Again, the OSAID doesn't particularly care about DFSG compatibility, so
I'm not sure where point 3 comes in here, but if there's something
obvious I missed, I'm all ears.
Code: The complete source code used to train and run the system. The
Code shall represent the full specification of how the data was
processed and filtered, and how the training was done. Code shall be
made available under OSI-approved licenses.
In order to be OSAID compliant, Meta will precisely have to change
those licensing terms and make them DFSG-compliant. That would be a
*good* thing for the world and would fix the main thing you are
upset about.
Unfortunately that's not the case. Meta won't have to make Llama3 DFSG compliant in order to be OSAID compliant, since the OSAID is not as robust as
the OSD.
Parameters: The model parameters, such as weights or other[...]
configuration settings. Parameters shall be made available under *OSI-approved terms*.
The Open Source AI Definition does not require a specific legal
mechanism for assuring that the model parameters are *freely available
to all*. They may be free by their nature or a license or other legal instrument may be required to ensure their freedom.
Hi folks,
While diverse issues persist, the world and the software ecosystem are still proceeding with the advancement of AI. As a particular type of software, AI is
quite different from the paradigm of traditional software, since more components are involved as integral parts of an AI system. People gradually realize that the Open Source Definition[3], derived from the DFSG[4], can no longer cover AI software very well.
(...)
The OSAID 1.0 has now been released (with no modifications from the RC2).
Are we still going to take any actions or will we let this go?
Gerardo
I'm planning to draft a GR for this, but that is only going to happen after I get through some busy weeks.
Wondered if you'd had another chance to look at this.
On Sat, 2025-01-25 at 12:09 +0000, Sean Whitton wrote:
Wondered if you'd had another chance to look at this.
Ummm... You know what may happen when there is no deadline.
Did you ping this because there are some thoughts from the policy side?
Or just curious?
Nothing to do with Debian Policy, no.
I'm just interested in your thoughts on the matter.
From the Debian side, my concerns are unchanged. The OSAID does not guarantee freedom to our users.
On Sat, 2025-01-25 at 12:09 +0000, Sean Whitton wrote:
Wondered if you'd had another chance to look at this.
Ummm... You know what may happen when there is no deadline.
The best time to do this was last year around the OSAID 1.0 release.
The next best time is now. Do you need our help?
I'll focus on a simpler topic for the GR:
"how does the Debian community interpret the DFSG and software freedom
with respect to AI models and software?"
Will Debian accept a GR that requires all training data to be free,
including training data that belongs to the core of human dignity? That
would be disturbing, and would in practice lobotomize good projects.
On Sat, 2025-01-25 at 17:08 +0100, Sam Johnston wrote:
The best time to do this was last year around the OSAID 1.0 release.
The next best time is now. Do you need our help?
I lean towards making things simpler.
Yes, I disagree with the OSI's decision on the OSAID, and the definition
does not guarantee freedom at all. But a bold move to pick a fight with
the OSI on this matter through a Debian General Resolution sounds
terrible and reckless to me.
I'll focus on a simpler topic for the GR:
"how does the Debian community interpret the DFSG and software freedom
with respect to AI models and software?"
I'll draft it from a purely technical point of view: neutral toward
individuals and organizations, without commenting on how others think
and act. In that case it is as simple as elaborating the "toxic candy"
case and analyzing the OSAID's implications from a technical point of
view.
In that sense, things will be more constructive and doable.
The FSF will also be able to learn from Debian's GR discussion.
I will put what limited energy I have for this matter in that direction.