Natural language conveys ideas, actions, information, and intent through context and syntax; further, there are volumes of it contained in databases. This makes it an excellent data source for training machine learning systems. Two engineering master's students in MIT's 6-A MEng Thesis Program, Irene Terpstra '23 and Rujul Gandhi '22, are working with mentors at the MIT-IBM Watson AI Lab to harness this power of natural language to build AI systems.
As computing becomes more advanced, researchers seek to improve the hardware it runs on, which means designing new computer chips. And since there is already a body of literature on the modifications that can be made to achieve certain parameters and performance, Terpstra, together with her mentors and advisors Anantha Chandrakasan, dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science, and IBM researcher Xin Zhang, is developing an AI algorithm that assists in chip design.
“I'm creating a workflow to systematically analyze how these language models can help the circuit design process. What reasoning powers do they have, and how can they be integrated into the chip design process?” says Terpstra. “And then on the other hand, if that proves to be useful enough, we'll see if they can automatically design the chips themselves, attaching them to a reinforcement learning algorithm.”
To do this, Terpstra's team is creating an AI framework that can iterate on different designs. This means experimenting with several pre-trained large language models (such as ChatGPT, Llama 2, and Bard), using an open-source circuit simulator language called NGspice, which holds the chip's parameters in code form, and a reinforcement learning algorithm. With text prompts, the researchers can ask the language model how a physical chip should be modified to achieve a given goal and have it generate guidance for adjustments. This is then passed to a reinforcement learning algorithm that updates the circuit design and outputs new physical parameters for the chip.
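To make that loop concrete, here is a minimal sketch of how an LLM-in-the-loop optimizer might be wired together. Everything in it is a stand-in: the `query_llm` helper, the `simulate_circuit` function, the parameter names, and the greedy accept-if-better update are illustrative placeholders, not the team's actual code or a real NGspice interface.

```python
# Hypothetical sketch of an LLM-guided circuit-tuning loop.
import random

def query_llm(prompt: str) -> dict:
    """Placeholder for a call to a large language model.
    Here it just proposes a random tweak to one circuit parameter."""
    name = random.choice(["transistor_width_um", "load_resistance_ohm"])
    return {"parameter": name, "direction": random.choice([-1, 1])}

def simulate_circuit(params: dict) -> float:
    """Placeholder for an NGspice run; returns a mock 'gain' score
    that peaks at width = 2.0 um and load = 1000 ohms."""
    return (10.0
            - abs(params["transistor_width_um"] - 2.0)
            - abs(params["load_resistance_ohm"] - 1000.0) / 500.0)

params = {"transistor_width_um": 1.0, "load_resistance_ohm": 1500.0}
step = {"transistor_width_um": 0.1, "load_resistance_ohm": 50.0}
goal = "maximize amplifier gain"

for _ in range(200):
    reward = simulate_circuit(params)
    hint = query_llm(f"Goal: {goal}. Params: {params}. Gain: {reward:.2f}")
    trial = dict(params)
    trial[hint["parameter"]] += hint["direction"] * step[hint["parameter"]]
    # Greedy stand-in for the RL update: keep a change only if it helps.
    if simulate_circuit(trial) > reward:
        params = trial

print(params, round(simulate_circuit(params), 2))
```

In the team's framing, the language model supplies the design reasoning (which parameter to touch, and why) while the reinforcement learning component owns the numerical update; the greedy rule above merely stands in for that second piece.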
“The ultimate goal would be to take the reasoning powers and the knowledge base that is built into these large language models, combine them with the optimization power of reinforcement learning algorithms, and have that design the chip itself,” says Terpstra.
Rujul Gandhi works with the raw language itself. As an undergraduate at MIT, Gandhi explored linguistics and computer science, bringing the two together in her MEng work. “I've been interested in communication, both between humans and between humans and computers,” Gandhi says.
Robots and other interactive AI systems are one area where both humans and machines need to understand each other. Researchers often write instructions for robots using formal logic, which helps ensure that commands are followed safely and as intended, but formal logic can be difficult for users to learn, while natural language comes easily. To enable this smooth communication, Gandhi and her advisors, Yang Zhang of IBM and MIT assistant professor Chuchu Fan, are building a parser that converts natural language instructions into a machine-friendly form. Leveraging the linguistic structure encoded by the pre-trained T5 encoder-decoder model and a dataset of annotated, basic English commands for performing certain tasks, Gandhi's system identifies the smallest logical units, or atomic propositions, present in a given instruction.
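A hedged sketch of that parsing step is below, using the stock T5 classes from the Hugging Face `transformers` library. The task prefix, checkpoint, and output format are assumptions; Gandhi's actual system is fine-tuned on annotated command data, so the untuned `t5-small` model used here would not produce meaningful parses out of the box.

```python
# Sketch only: mapping an English command to atomic propositions with T5.
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "t5-small"  # stand-in; the real system uses a fine-tuned T5
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

command = "go to the kitchen and wait there until the timer rings"
# A fine-tuned model might be prompted with a task prefix like this one.
inputs = tokenizer("parse to propositions: " + command, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# A fine-tuned model's desired output might look like:
#   p1 = go_to(kitchen); p2 = wait(kitchen); p3 = timer_rings
```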
“Once you've given your instruction, the model identifies all the smaller subtasks you want it to carry out,” Gandhi says. “Then, using a large language model, each subtask can be compared against the actions and objects available in the robot's world, and if any subtask can't be carried out because a certain object isn't recognized, or an action isn't possible, the system can stop right there to ask the user for help.”
This approach of breaking instructions into subtasks also allows her system to understand logical dependencies expressed in English, such as “perform task X until event Y occurs.” Gandhi uses a dataset of step-by-step instructions across robot task domains such as navigation and manipulation, with a focus on household tasks. Using data that is written just the way humans would talk to each other has many advantages, she says, because it means a user can be more flexible about how they phrase their instructions.
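As a toy illustration of these two ideas, the snippet below grounds a subtask against a small, made-up inventory of robot actions and objects, and rewrites an English “until” dependency as the temporal-logic “until” operator (written U). The inventory, instruction format, and regex are hypothetical simplifications of what a real parser would do.

```python
# Toy grounding and "until" handling; not the actual parser.
import re

# Made-up robot world: which actions and objects the robot knows about.
KNOWN_ACTIONS = {"go_to", "pick_up", "wait"}
KNOWN_OBJECTS = {"kitchen", "mug", "table"}

def ground(subtask: str):
    """Map 'action object' onto the robot's world; None means 'ask for help'."""
    action, _, obj = subtask.strip().partition(" ")
    if action in KNOWN_ACTIONS and obj in KNOWN_OBJECTS:
        return f"{action}({obj})"
    return None

instruction = "wait kitchen until timer_rings"
match = re.match(r"(.+)\s+until\s+(\S+)", instruction)
if match:
    task, event = ground(match.group(1)), match.group(2)
    if task is None:
        print("Can't ground the subtask, so ask the user for help.")
    else:
        # Temporal-logic reading: keep doing the task until the event occurs.
        print(f"{task} U {event}")  # prints: wait(kitchen) U timer_rings
```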
Another project Gandhi is working on involves developing speech models. In the context of speech recognition, some languages are considered “low resource,” since they may not have a lot of transcribed speech available, or may not have a written form at all. “One of the reasons I applied for this internship at the MIT-IBM Watson AI Lab was my interest in language processing for low-resource languages,” she says. “A lot of language models today are very data-driven, and when it's not that easy to acquire all of that data, that's when you need to use the limited data efficiently.”
Speech is just a stream of sound waves, but humans having a conversation can easily figure out where words and thoughts start and end. In speech processing, both humans and language models use their existing vocabulary to recognize word boundaries and understand meaning. In low- or no-resource languages, a written vocabulary may not exist at all, so researchers can't provide one to the model. Instead, the model can note which sound sequences occur together more frequently than others and infer that those might be individual words or concepts. In Gandhi's research group, these inferred words are then collected into a pseudo-vocabulary that serves as a labeling method for the low-resource language, creating labeled data for further applications.
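A minimal sketch of that frequency-based intuition is below, assuming the audio has already been discretized into unit symbols (single letters here stand in for acoustic units): count recurring unit sequences and promote frequent ones to pseudo-words. Real segmentation models are far more sophisticated; this only illustrates the counting idea.

```python
# Sketch: infer pseudo-words from recurring sound-unit sequences.
from collections import Counter

# Pretend each character is one discrete acoustic unit from an utterance stream.
corpus = "badupidabaduwematibadu"

# Count all unit sequences of length 2-4 across the stream.
counts = Counter()
for n in range(2, 5):
    for i in range(len(corpus) - n + 1):
        counts[corpus[i:i + n]] += 1

# Sequences that recur often enough become pseudo-vocabulary entries.
pseudo_vocab = sorted(seq for seq, c in counts.items() if c >= 3)
print(pseudo_vocab)  # e.g., ['ad', 'adu', 'ba', 'bad', 'badu', 'du', ...]

# These recurring sequences can then label new audio, creating training data.
```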
Applications of language technology are “pretty much everywhere,” says Gandhi. “You could imagine people being able to interact with software and devices in their native language, their native dialect. You could imagine improving all the voice assistants we use. You could imagine it being used for translation or interpretation.”