Two former Google AI experts have formed their own AI company, Sakana AI, with a goal of exploring alternative techniques for generative AI based on structures seen in nature.
David Ha and Llion Jones announced that they had formed Sakana AI, which they described as an AI research lab focused on R&D and the creation of innovative foundation models.
The pair have rejected the focus on large language models (LLMs) that many large tech companies are currently pursuing, arguing instead for new approaches to building neural networks, such as evolutionary computation.
This is a methodology in which a large number of candidate programs with random variations are created to tackle the same problem and pitted against one another over many generations, with the best performers surviving and mutating, much as in evolutionary biology.
It can produce refined neural networks that perform a task efficiently, without the huge, compute-intensive training runs that most AI development requires today.
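As a rough illustration, the core loop of an evolutionary algorithm can be sketched in a few lines of Python. The population size, fitness function, and mutation scheme here are toy assumptions for demonstration, not Sakana AI's actual method:

```python
import random

def evolve(fitness, genome_len=8, pop_size=30, generations=100, sigma=0.1):
    """Toy evolutionary search: random variants compete over generations."""
    # Start from a population of random candidate solutions ("genomes").
    population = [[random.uniform(-1, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank candidates by fitness; the fittest quarter survives.
        population.sort(key=fitness, reverse=True)
        survivors = population[:pop_size // 4]
        # Refill the population with randomly mutated copies of survivors.
        population = survivors + [
            [gene + random.gauss(0, sigma) for gene in random.choice(survivors)]
            for _ in range(pop_size - len(survivors))
        ]
    return max(population, key=fitness)

# Toy problem: evolve a genome that matches a hidden target vector.
target = [0.5] * 8
best = evolve(lambda g: -sum((x - t) ** 2 for x, t in zip(g, target)))
```

No gradients or backpropagation are involved: selection pressure alone pushes the population toward the target over the generations.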
In a LinkedIn post, Ha encouraged users to get in touch if they “love fish, and you’re excited about developing collective intelligence, evolutionary computing, ALIFE, or neuro-inspired methods to advance the state of foundation models beyond the current AI paradigm”.
Jones has a strong background in natural language processing and was one of the authors of “Attention Is All You Need”, the 2017 Google research paper that first proposed the transformer architecture underpinning today’s generative AI.
Transformer models take natural-language input, such as a user’s sentence, and encode it as vectors that capture both content and context. As data passes through a transformer, an attention mechanism maps the relationships between the words in the input sentence, which then informs the order and content of the output.
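The attention step at the heart of this process can be illustrated in plain Python. This is a simplified, single-head sketch of scaled dot-product attention over made-up word vectors, not production transformer code:

```python
import math

def attention(queries, keys, values):
    """Scaled dot-product attention: each output mixes the value vectors,
    weighted by how strongly its query matches every key."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        # Softmax turns raw scores into weights that sum to 1.
        exps = [math.exp(s - max(scores)) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # The output is a weighted average of the value vectors.
        outputs.append([sum(w * v[i] for w, v in zip(weights, values))
                        for i in range(len(values[0]))])
    return outputs

# Three toy word vectors standing in for an input sentence.
vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(vecs, vecs, vecs)
```

Each output row is a blend of all the inputs, weighted by relatedness — the “tagging” of relationships between words described above.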
The introduction of transformer models provided the building blocks for LLMs and some of the most powerful models available today, including OpenAI’s GPT-4 and Google’s PaLM 2.
A problem with transformers is that they require very large amounts of training and refinement data to work well.
This means they take up lots of space – transformer-based LLMs such as PaLM 2 can have up to 340 billion individual parameters – and need to be trained on extremely powerful hardware, such as AI chips made by Nvidia or Google’s own accelerators.
Sakana AI could combat this problem by developing the “collective intelligence” models Ha hinted at in his LinkedIn post.
‘Sakana’ is Japanese for ‘fish’, and the company’s logo is a stylized depiction of a school of fish with one fish diverging from the rest. Schools are one example of collective intelligence, in which collaboration lets less powerful constituent parts work together as a powerful whole.
“Rather than building one huge model that sucks all this data, our approach could be using a large number of smaller models, each with their own unique advantage and smaller data set, and having these models communicate and work with each other to solve a problem,” Ha told CNBC.
Like individual ants, each of the firm’s models might be ineffective on its own, but grouped together, as in a colony, they could form a highly cohesive and efficient system.
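The “weak parts, strong whole” idea can be demonstrated with a deliberately simple voting ensemble. The unreliable classifier and the majority-vote rule below are illustrative assumptions, not a description of Sakana AI’s models:

```python
import random
from collections import Counter

def weak_model(x, noise=0.35):
    """A deliberately unreliable model: answers 'is x non-negative?'
    correctly only about 65% of the time."""
    truth = x >= 0
    return truth if random.random() > noise else not truth

def collective(x, n_models=51):
    """The collective answer: many weak models vote, the majority wins."""
    votes = Counter(weak_model(x) for _ in range(n_models))
    return votes.most_common(1)[0][0]
```

Each model alone is wrong more than a third of the time, yet the majority vote of 51 of them is right almost every time – mirroring how a colony outperforms any single ant.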
This is similar to AWS’ pick-and-choose approach to AI: the firm has rejected general-purpose LLMs in favour of letting customers select, through a unified platform, the AI model suited to the specific task they want to complete.
However, while each LLM in this system varies in ability across tasks, they are all capable of performing end-to-end operations on their own.
By combining collective intelligence with evolutionary computation, Sakana AI could develop powerful systems that come together to perform a task and then break apart into their constituent smaller models when it is finished.
Neural networks refined through generations of evolutionary competition – much like animals that have become specialised over millions of years of evolution – could each play a small but efficient role that, taken together, adds up to a significant impact.
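One way to picture the combination is to evolve the composition of a team whose fitness is measured collectively rather than individually. Everything below, from the “specialists” to the fitness rule, is a hypothetical toy, not Sakana AI’s design:

```python
import random

def make_specialist(skill):
    """A hypothetical specialist: succeeds at its sub-task with
    probability `skill`."""
    return lambda: random.random() < skill

def team_fitness(team, trials=200):
    """Collective fitness: the team scores when a majority of its
    members succeed together on a trial."""
    wins = sum(sum(m() for m in team) > len(team) / 2
               for _ in range(trials))
    return wins / trials

# Evolve which specialists make up a five-member team.
pool = [make_specialist(random.uniform(0.3, 0.9)) for _ in range(20)]
team = random.sample(pool, 5)
for _ in range(50):
    # Mutation: swap one member for a random candidate from the pool,
    # keeping the change only if the team does at least as well.
    candidate = list(team)
    candidate[random.randrange(len(candidate))] = random.choice(pool)
    if team_fitness(candidate) >= team_fitness(team):
        team = candidate
```

Selection here acts on the group, not the individual: members survive because the team they belong to performs well, which is the collective twist on the evolutionary loop described earlier.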