Algorithmic Interactions: AI Emergent Behaviour

Today, I looked at my screen time statistics on my phone, and the app I use the most is ChatGPT. Since its release, AI has become an interest of mine. I started watching educational YouTube videos about large language models, and the concept of emergent behaviours piqued my interest.

What are emergent abilities?

Emergent abilities are capabilities an AI develops on its own, without its designers intending them. As LLMs like ChatGPT were scaled up, they were claimed to exhibit emergent abilities that could not be predicted by extrapolating from the performance of smaller models.

When I first heard about this, I was frightened, because it raises concerns about AI safety and alignment. If we keep scaling AI up, it might develop capabilities that are adverse to human well-being, and we cannot predict what the outcome would be.

Some reported emergent behaviours include arithmetic, such as addition, and the ability to learn a whole language automatically.

However, these claims have been contested, with some AI researchers arguing that the apparent emergence is only a mirage.

Are these behaviours only illusions?

Some researchers argue that the way we measure models creates the appearance of emergent behaviour. To measure a model's capability, researchers often use an all-or-nothing scoring system: points are awarded only if the model completes the task perfectly, and a single small mistake earns nothing. Under such a harsh metric, a model that is improving gradually registers near zero for a long time and then appears to leap forward once it crosses a threshold, so the drastic jump in scores may simply reflect the AI becoming steadily, marginally better.
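To see how a harsh metric can manufacture a sudden jump, here is a minimal Python sketch. It uses a toy setting (not real model data): it assumes a model's per-digit accuracy on 10-digit addition improves smoothly as the model is scaled up, and shows that the exact-match score, which only rewards a perfectly correct answer, still looks like an abrupt leap.

```python
# Toy illustration (assumed numbers, not real model results): if per-digit
# accuracy improves smoothly with scale, an all-or-nothing "exact match"
# score on a 10-digit answer still looks like a sudden jump.

def exact_match_rate(per_digit_accuracy: float, num_digits: int = 10) -> float:
    """Probability of getting every digit right, assuming independent errors."""
    return per_digit_accuracy ** num_digits

# Pretend these are per-digit accuracies for progressively larger models.
per_digit_scores = [0.50, 0.70, 0.85, 0.95, 0.99]

for p in per_digit_scores:
    print(f"per-digit accuracy {p:.2f} -> exact-match score {exact_match_rate(p):.3f}")

# Prints roughly 0.001, 0.028, 0.197, 0.599, 0.904: the underlying skill
# improves steadily, but the harsh metric makes it look like emergence.
```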

This way of measuring AI abilities seems flawed, but it is not the researchers' fault. AI research is difficult because of the private nature of the models. Science is conducted by running experiments; in the case of AI, scientists would change the input and observe how the output changes. Unfortunately, this is rarely possible, because researchers do not have direct access to the models: the details of how an output is produced are kept secret by AI companies. In the current structure of AI research, companies receive inputs from scientists and give back only a performance score.

Companies' financial incentives have also made the results of these experiments hard to trust. AI is sold as a commercial product, so companies have a vested interest in exaggerating the positives and downplaying the negatives for financial gain.

We have relied on science to drive innovation, yet in the field of AI scientists cannot run controlled experiments. For example, if a scientist wanted to analyse how a model would perform with different training data, retraining it would cost millions.

The problem of interpretability

However, even with access to these models, there remains the problem of interpretability. To accomplish a task, an AI model forms intermediate subgoals, yet it cannot explain to humans why it chose those subgoals or how they help achieve the main goal.

Thus, even with access, scientists do not yet fully understand how to test and interpret AI models. The private nature of AI only exacerbates the challenge.