👁🗨 Why Do AI Companies Only Pretend to Be Open?
About 50 years ago, a group of computer enthusiasts in California founded the Homebrew Computer Club. Its members established the core idea of the open-source movement: programs should be accessible for anyone to use, study, modify, and share with others.
Thanks to open source, developers worldwide have created tools that drive scientific and technological progress. Notable examples include the Python programming language and the Linux operating system kernel.
🍹 How AI Companies Mislead Us
Today, many AI companies label their products as "open." In practice, however, they typically release only the model weights and a general description of the architecture, while withholding the most important details, such as the training data and the full training pipeline.
An analysis by the Open Source Initiative (OSI) revealed that popular language models advertised as "open"—such as Llama 2 and Llama 3.x (Meta), Grok (xAI), Phi-2 (Microsoft), Mixtral (Mistral AI)—fail to meet the fundamental principles of the open-source community. Developers do not disclose the datasets used to train the models or other key details. At best, we can call these products "models with open weights."
However, fully open-source models do exist, developed mainly by nonprofit organizations such as the Allen Institute for AI and by independent enthusiasts.
🖥 Why Do Companies Do This?
In 2024, the European Union adopted the AI Act, which strictly regulates the industry. However, this regulation includes exceptions for open-source software, which some companies try to exploit. Experts call this practice "openwashing."
To combat this confusion, OSI has developed a standard defining what constitutes truly open AI. According to this standard, developers must publish the code and information about the data used to train their models.
If publishing the data directly is not possible (e.g., due to privacy restrictions), companies should at least disclose its sources and how it was processed.
Predictably, Meta criticized this idea in 2024, arguing that traditional open-source definitions do not adequately reflect the complexities of today's rapid AI development.
Authorities can also help address this issue. For example, in the U.S., recipients of scientific grants are already required to publish their research data openly. Widespread adoption of similar rules could ensure that AI remains a transparent and accessible technology rather than a closed resource controlled by a few corporations.
More About Open Source:
🟥 The Success of DeepSeek: How a Chinese Model Challenges ChatGPT
🟥 OpenAI May Open-Source Its Older Models
#news #opensource @hiaimediaen

