"GPT-4 will be a text-only large language model with better performance on a similar size as GPT-3. It will also be more aligned with human commands and values. "
However it is capable of being scaled and there is another consideration which is training tokens. The article states:-
"We can safely assume, for a compute-optimal model, OpenAI will increase training tokens by 5 trillion. It means that it will take 10-20X FLOPs than GPT-3 to train the model and reach minimal loss. "
I don't think we know the details of all of this yet.
What is fascinating is the way in which huge data sets and training models are being experimented with, with the emphasis on the term "experiment", since outcomes cannot necessarily be anticipated.
The GPT-4 100 trillion parameter rumours are known to be false. Sam Altman himself debunked them!
Reading here:
https://www.datacamp.com/blog/what-we-know-gpt4
it seems that
"GPT-4 will be a text-only large language model with better performance on a similar size as GPT-3. It will also be more aligned with human commands and values. "
However, it could still be scaled up, and there is another consideration: training tokens. The article states:
"We can safely assume, for a compute-optimal model, OpenAI will increase training tokens by 5 trillion. It means that it will take 10-20X FLOPs than GPT-3 to train the model and reach minimal loss. "
I don't think we know the details of all of this yet.
What is fascinating is the way huge data sets and training runs are being experimented with, and "experiment" is the operative word, since the outcomes cannot necessarily be anticipated.