People say that mathematics is a universal language — mathematical concepts, theorems and definitions can be expressed as symbols that are understandable regardless of language.
In this article, I test the mathematical capabilities of GPT-4 in sixteen different languages.
Early experiments showed GPT-4 scoring high on the SAT Math and AP Calculus tests and on college-level mathematics. However, most of these experiments test GPT-4's mathematical capabilities in English only. To better understand the mathematical capabilities of GPT-4 beyond English, I give it the same mathematical problems in fifteen other languages.
So how good is GPT-4 at math in different languages? In theory, it should be equally good (or bad) in all languages, but unfortunately (as you may have guessed), this is not the case. GPT-4 is much better at solving math problems in English. In other languages, it could solve some of the problems, with results depending on the language. However, for traditionally low-resource languages such as Burmese and Amharic, GPT-4 could not solve the problems I posed.
I use mathematical problems from the Project Euler website to test GPT-4. (This is also a throwback to one of my previous articles from this year, where I used prompt engineering with ChatGPT to solve some Project Euler problems.) Project Euler, named after the mathematician Leonhard Euler, is a website with hundreds of mathematical and computer programming problems of varying difficulty. Founded in 2001, it has over 850 problems (as of October 2023) and publishes a new problem approximately every week.
The best thing about Project Euler questions is that each problem has a numerically “correct” answer; this makes it easy to check whether GPT-4's answer is objectively correct or not. They also tend to be much more complicated than high school or college level math problems. Currently, there is no comprehensive, large-scale understanding of the mathematical capabilities of GPT-4 (or other large language models, for that matter)…
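To make the "numerically correct answer" idea concrete, here is a minimal sketch of how such a check could work, using Project Euler Problem 1 (the sum of all multiples of 3 or 5 below 1000) as the example. The function names and the "take the last number in the response" rule are my own illustrative assumptions, not the exact procedure used for the experiments in this article.

```python
import re
from typing import Optional


def reference_answer_problem_1(limit: int = 1000) -> int:
    """Brute-force answer to Project Euler Problem 1."""
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)


def extract_final_number(response: str) -> Optional[int]:
    """Return the last integer found in a model response, or None.

    Assumption (hypothetical rule): the model states its final answer last.
    Python's \\d and int() also accept non-ASCII decimal digits, which
    matters when the response is not written in English.
    """
    # Digit runs, optionally signed, optionally with comma separators
    matches = re.findall(r"-?\d[\d,]*", response)
    if not matches:
        return None
    return int(matches[-1].replace(",", ""))


def is_correct(response: str, expected: int) -> bool:
    """Compare the extracted number with the known correct answer."""
    return extract_final_number(response) == expected


expected = reference_answer_problem_1()
print(is_correct("Adding the multiples gives 233,168.", expected))
```

Because the comparison is on a single number, the same check applies no matter which language the response is written in; only the extraction rule would need adjusting if a model formats numbers differently.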