With the recent sacking and swift rehiring of Sam Altman by OpenAI, debates around the development and use of artificial intelligence (AI) are once again in the spotlight. What’s more unusual is that a prominent theme in media reporting has been the ability of AI systems to do maths.
Apparently, some of the drama at OpenAI was related to the company’s development of a new AI algorithm called Q*. The system has been talked about as a significant advance, and one of its salient features is said to be a capability to reason mathematically.
But isn’t mathematics the foundation of AI? How could an AI system have trouble with mathematical reasoning, given that computers and calculators can perform mathematical tasks?
AI is not a single entity. It’s a patchwork of strategies for performing computation without direct instruction from humans. As we’ll see, some AI systems are competent at maths. However, one of the most important current technologies, the large language models (LLMs) behind AI chatbots such as ChatGPT, has struggled so far to emulate mathematical reasoning. This is because LLMs have been designed to concentrate on language.
If OpenAI’s new Q* algorithm can solve unseen mathematical problems, then that might well be a significant breakthrough. Mathematics is an ancient form of human reasoning that LLMs, the technology underlying systems such as ChatGPT, have so far struggled to emulate.
At the time of writing, few details of the Q* algorithm and its capabilities are available, but those that are known are highly intriguing. There are therefore various subtleties to consider before deeming Q* a success. Some existing AI systems could already be described as competent at maths. However, it’s likely that Q* is not being used to help academics in their work but rather is intended for another purpose.
As a society, we are increasingly comfortable with specialist AI being used to solve predetermined types of problem. For example, digital assistants, facial recognition, and online recommendation systems will be familiar to most people. What remains elusive is a so-called “artificial general intelligence” (AGI) that has broad reasoning capabilities comparable to those of a human.
Naturally, any approach to mathematical reasoning that relies on linguistic probabilities is going to be driving outside its lane. One way around this could be to incorporate some system of formal verification into the architecture (the way the LLM is built), which continuously checks the logic behind the leaps made by the large language model.
A clue that this has been done could be in the name Q*, which could plausibly refer to an algorithm developed all the way back in the 1970s to help with deductive reasoning. Alternatively, Q* could refer to Q-learning, in which a model can improve over time by testing for and rewarding conclusions that are correct.
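To make the idea of Q-learning concrete, here is a minimal sketch of the textbook tabular version on a toy problem. It is not a description of how Q* works, since those details are not public. The corridor environment, the function names and every parameter value are assumptions chosen purely for illustration: an agent starts at the left end of a five-cell corridor and earns a reward only when it reaches the right end.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical corridor of n_states cells."""
    rng = random.Random(seed)
    moves = (-1, +1)                 # action 0 = step left, action 1 = step right
    goal = n_states - 1              # reaching the last cell ends an episode
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore occasionally; otherwise take the best-known action,
            # breaking ties at random so early episodes are not stuck.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in range(2) if q[state][a] == best])

            nxt = min(max(state + moves[action], 0), goal)
            reward = 1.0 if nxt == goal else 0.0   # only the correct outcome pays

            # Q-learning update: nudge the estimate towards the reward plus
            # the discounted value of the best action in the next state.
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q

def greedy_policy(q):
    """Best-known action for each non-terminal cell."""
    return ["left" if row[0] > row[1] else "right" for row in q[:-1]]

if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

After training, the greedy policy steps right from every cell: the table of values has improved over time simply because outcomes that reached the goal were rewarded. Whether anything like this sits inside Q* is speculation.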
But several challenges exist to building mathematically able AIs. For instance, some of the most interesting mathematics consists of highly unlikely events. There are many situations in which one may think that a pattern exists based on small numbers, but it unexpectedly breaks down when one checks enough cases. The ability to anticipate such breakdowns is difficult to incorporate into a machine.
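A classic example of such a pattern, chosen here for illustration rather than taken from the reporting on Q*, is Euler’s polynomial n² + n + 41. It produces a prime number for every whole number n from 0 to 39, which makes it tempting to conclude that it always does. The short sketch below simply checks the cases one by one; the function names are made up for this example.

```python
def is_prime(n):
    """Trial division: slow, but fine for the small numbers used here."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True

def first_failure(limit=100):
    """Smallest n below limit for which n*n + n + 41 is not prime."""
    for n in range(limit):
        if not is_prime(n * n + n + 41):
            return n
    return None  # the pattern held for every case checked

if __name__ == "__main__":
    n = first_failure()
    print(n, n * n + n + 41)  # prints "40 1681", and 1681 = 41 * 41
```

Forty consecutive successes are followed by a failure at n = 40. A system that judges what is likely from the cases it has seen would have every reason to expect the pattern to continue.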