The company has publicly released its latest technology so people can build their own chatbots. Rivals like OpenAI and Google argue that approach can be dangerous.
The only reason Bard is free for now is that Google is building up a base of users who become invested in the service before it turns into a subscription. The business model will clearly be to sell access to the service, and people being able to run their own models is the core danger for them.
I absolutely agree with you. That is the internet platform business model after all.
Still though, OpenAI and Google, I think, have a legitimate argument that LLMs without limitation may be socially harmful.
That doesn’t mean a $20 subscription is the one and only means of addressing that problem though.
In other words, I think we can take OpenAI and Google at face value without also saying their business model is the best way to solve the problem.
Personally, I think it’s far more socially harmful to allow a handful of megacorps to control this technology going forward.
I agree, but I think that computational power requirements already do that: complex models that do interesting stuff need a bunch of specialized video cards running for days to train, and they need a lot of data to train on, so it’s natural that those who already have the data and the money to process it get there first.
I think their argument is not even about their monopoly, but about shutting down the question of why and how to trust THEM with policing their LLMs before it even comes up. An open system can be investigated, and we could find out that they over- or under-regulated some stuff, made it biased, or had copyrighted material, personal information, gore, or CSAM in their training samples, et cetera. They save themselves metric tons of possible lawsuits by making it an industry rule that no one gets to look under the hood of their machines.
Initial training of the models is expensive, but a trained model can be run on a laptop from that point. The problem of initial training can also be addressed by doing it in a distributed fashion. There are also open source projects, such as Petals, that let you run distributed models BitTorrent-style. Other approaches like LoRA allow taking existing models and tuning them for a particular task without the need to do training from scratch. There’s a pretty good article from Steve Yegge on the recent advances in open source models.
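To make the LoRA point a bit more concrete, here is a rough toy sketch of the idea in plain Python. It is not the API of any real library, and the function names (`matvec`, `lora_forward`, `trainable_params`) are made up for illustration. The assumption is the usual LoRA setup: the pretrained weight matrix W stays frozen and only two small matrices A and B are trained, so the layer computes W x + (alpha / r) * B (A x).

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, x, alpha=1.0):
    """Frozen base layer plus a low-rank update.

    W: d_out x d_in pretrained weights (never updated)
    A: r x d_in and B: d_out x r are the only trainable parts
    """
    r = len(A)  # rank of the update
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + (alpha / r) * d for b, d in zip(base, delta)]


def trainable_params(d_out, d_in, r):
    """Number of weights in A and B combined."""
    return r * (d_in + d_out)


# Hypothetical layer size, chosen only to show the scale of the savings.
d = 4096
full = d * d                        # weights touched by full fine-tuning
lora = trainable_params(d, d, r=8)  # weights touched by a rank-8 update
print(full, lora, full // lora)
```

With B starting at all zeros, the adapted layer initially behaves exactly like the original one, and training only has to move the small A and B matrices. For the made-up 4096 x 4096 layer above, that is 65,536 trainable weights instead of about 16.8 million.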
I do agree that avoiding regulation and scrutiny are most definitely additional goals these companies have. They want to keep this tech opaque and frame themselves as responsible guardians of the technology that shouldn’t fall into the hands of unwashed masses who can’t be trusted with it.