Speaking to retail executives in 2010, Rama Ramakrishnan came to two conclusions. First, although retail systems that offered customers personalized recommendations were receiving a lot of attention, these systems often provided few benefits to retailers. Second, for many of the companies, most customers bought only once or twice a year, so the companies didn’t really know much about them.
“But if we’re very diligent about noting the interactions a customer has with a retailer or e-commerce site, we can create a very nice, detailed composite picture of what that person does and what they care about,” says Ramakrishnan, a professor of practice at the MIT Sloan School of Management. “Once you have that, you can apply proven machine learning algorithms.”
These discoveries led Ramakrishnan to found CQuotient, a startup whose software has now become the foundation of Salesforce’s widely adopted AI e-commerce platform. “On Black Friday alone, CQuotient technology likely sees and interacts with over a billion shoppers in a single day,” he says.
After a highly successful business career, in 2019 Ramakrishnan returned to MIT Sloan, where he had earned a master’s degree and a PhD in operations research in the 1990s. He teaches students “not only how these incredible technologies work, but also how to take them and put them into practice pragmatically in the real world,” he says.
Additionally, Ramakrishnan enjoys participating in MIT executive education. “This is a great opportunity for me to pass on the things I’ve learned, but also, more importantly, to learn what these senior executives think and to guide them and push them in the right direction,” he says.
For example, executives are understandably concerned about the need for massive amounts of data to train machine learning systems. Now, however, they can be pointed to a large number of models pre-trained for specific tasks. “The ability to take these pre-trained AI models and adapt them very quickly to your particular business problem is an incredible advancement,” Ramakrishnan says.
Rama Ramakrishnan – Using AI in real-world applications for smart working
Video: MIT Industrial Liaison Program
Understanding AI Categories
“AI is the quest to give computers the ability to perform cognitive tasks that normally only humans can perform,” he says. Understanding the history of this complex and supercharged landscape helps in putting the technologies to use.
The traditional approach to ai, which basically solved problems by applying if/then rules learned from humans, was useful for relatively few tasks. “One reason is that we can do a lot of things effortlessly, but if we are asked to explain how we do them, we can’t really articulate how we do them,” Ramakrishnan says. Additionally, those systems can be flummoxed by new situations that don’t match the rules enshrined in the software.
Machine learning takes a radically different approach: software fundamentally learns by example. “You give it lots of examples of inputs and outputs, questions and answers, tasks and answers, and you make the computer automatically learn how to go from the input to the output,” he says. Credit scoring, loan decision making, disease prediction, and demand forecasting are among the many tasks machine learning performs.
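The input-to-output learning Ramakrishnan describes can be illustrated with a deliberately tiny sketch: a one-nearest-neighbor classifier that "learns" a credit-scoring rule purely by copying the label of the most similar past example. The features, labels, and data below are invented for illustration and are not from the article.

```python
# A minimal sketch of learning from examples: a 1-nearest-neighbor
# classifier for a toy credit-scoring task. All data is made up.

def nearest_neighbor_predict(examples, query):
    """Predict the label of `query` by copying the label of the
    closest training example (squared Euclidean distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    best = min(examples, key=lambda ex: dist(ex[0], query))
    return best[1]

# Training data: (features, label) pairs.
# Features: (income in $k, debt-to-income ratio); label: decision.
examples = [
    ((80, 0.2), "approve"),
    ((95, 0.1), "approve"),
    ((30, 0.9), "deny"),
    ((25, 0.7), "deny"),
]

print(nearest_neighbor_predict(examples, (85, 0.15)))  # → approve
print(nearest_neighbor_predict(examples, (28, 0.8)))   # → deny
```

The point is the workflow, not the algorithm: no human wrote an if/then rule for approving loans; the mapping from input to output was induced from the example pairs.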
But machine learning only worked well when the input data was structured, for example in a spreadsheet. “If the input data was unstructured, such as images, video, audio, ECGs or X-rays, it was not very good at going from that to a predicted output,” Ramakrishnan says. That meant humans had to manually structure the unstructured data to train the system.
Around 2010, deep learning began to overcome that limitation, providing the ability to work directly with unstructured input data, he says. Based on a long-standing artificial intelligence strategy known as neural networks, deep learning became practical due to the global deluge of data, the availability of extraordinarily powerful parallel processing hardware called graphics processing units (originally invented for video games) and advances in algorithms and mathematics.
Finally, within deep learning, the generative AI software packages that appeared last year can create unstructured output, such as human-sounding text, images of dogs, and three-dimensional models. Large language models (LLMs), like OpenAI’s ChatGPT, go from text input to text output, while text-to-image models like OpenAI’s DALL-E can produce realistic-looking images.
Rama Ramakrishnan – Taking note of small data to improve customer service
Video: MIT Industrial Liaison Program
What Generative AI Can (and Can’t) Do
Trained on the unimaginably vast text resources of the Internet, an LLM’s “fundamental capability is to predict the most likely and plausible next word,” Ramakrishnan says. “Then attach the word to the original sentence, predict the next word again, and continue doing so.”
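The loop Ramakrishnan describes can be sketched in a few lines. Here a hard-coded bigram lookup table stands in for the neural network purely to show the shape of the loop; a real LLM predicts over tens of thousands of tokens with a learned model, not a dictionary.

```python
# A toy sketch of autoregressive generation: predict the most likely
# next word, append it, and repeat. The bigram table below is a
# stand-in for the model, invented purely for illustration.

BIGRAMS = {
    "the": "cat",
    "cat": "sat",
    "sat": "on",
    "on": "the",
}

def generate(prompt, n_words):
    """Extend `prompt` by up to `n_words`, one prediction at a time."""
    words = prompt.split()
    for _ in range(n_words):
        next_word = BIGRAMS.get(words[-1])
        if next_word is None:  # no prediction available: stop early
            break
        words.append(next_word)
    return " ".join(words)

print(generate("the", 4))  # → the cat sat on the
```

Everything an LLM produces comes out of exactly this append-and-repeat loop; the sophistication lives entirely in the next-word predictor.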
“To the surprise of many, including many researchers, an LLM can do very complicated things,” he says. “It can compose wonderfully coherent poetry, write Seinfeld episodes, and solve some types of reasoning problems. It’s really amazing how predicting the next word can lead to these incredible capabilities.”
“But you always have to keep in mind that what it is doing is not so much finding the right answer to your question as finding a plausible answer to your question,” Ramakrishnan emphasizes. Its output may be inaccurate, irrelevant, toxic, biased, or offensive.
That puts the burden on users to ensure that the output is correct, relevant, and useful for the task at hand. “You need to make sure there’s some way to check the output for errors and fix them before it’s published,” he says.
Intensive research is underway on techniques to address these shortcomings, adds Ramakrishnan, who expects many innovative tools for doing so to emerge.
Finding the right corporate roles for LLMs
Given the astonishing progress in LLMs, how should the industry think about applying software to tasks like content generation?
First, Ramakrishnan advises, consider the costs: “Is it a much less expensive effort to proofread and fix a draft than to create the whole thing yourself?” Second, if the LLM makes a mistake that goes unnoticed and the erroneous content is published to the outside world, can the business live with the consequences?
“If you have an application that satisfies both considerations, then it’s good to run a pilot project to see whether these technologies can really help you with that particular task,” says Ramakrishnan. He stresses the need to treat the pilot as an experiment, not as a normal IT project.
At this time, software development is the most mature corporate application of LLMs. “ChatGPT and other LLMs are text in, text out, and a software program is just text,” he says. “Programmers can go from English text input to text output in Python, just as they can go from English to English or from English to German. There are many tools that help you write code using these technologies.”
Of course, programmers must ensure that the result works correctly. Fortunately, software development already provides infrastructure for testing and verifying code. “This is a beautiful sweet spot,” he says, “where it’s much cheaper to have the technology write the code for you, because you can check and verify it very quickly.”
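That check-and-verify step can be as simple as running assertions against the draft. Suppose, hypothetically, an LLM produced the function below from the English request "merge two sorted lists"; the function and the tests are illustrative, not taken from the article.

```python
# Sketch of verifying LLM-drafted code before accepting it.
# Hypothetical draft returned by the model:

def merge_sorted(a, b):
    """Merge two already-sorted lists into one sorted list."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

# Cheap verification: seconds to run, and it catches the most
# common drafting mistakes (dropped tails, empty inputs) at once.
assert merge_sorted([1, 3, 5], [2, 4]) == [1, 2, 3, 4, 5]
assert merge_sorted([], [7]) == [7]
assert merge_sorted([], []) == []
print("all checks passed")
```

This is the sweet spot he describes: the existing testing infrastructure makes accepting or rejecting a machine-written draft far cheaper than writing the code by hand.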
Another important use of LLMs is content generation, such as writing marketing copy or e-commerce product descriptions. “Again, it can be much cheaper to fix ChatGPT’s draft than to write the whole thing yourself,” says Ramakrishnan. “However, companies need to be very careful to make sure there is a human being in the loop.”
LLMs are also rapidly spreading as internal tools for searching business documents. Unlike conventional search algorithms, an LLM chatbot can offer a conversational search experience because it remembers every question you ask. “But then again, it will occasionally make things up,” he says. “When it comes to chatbots for external customers, we are at a very early stage, because of the risk of saying the wrong thing to the customer.”
Overall, Ramakrishnan notes, we live in an extraordinary time for grappling with the rapidly evolving potentials and obstacles of ai. “I help companies figure out how to take these transformative technologies and put them into practice, making products and services much smarter, employees much more productive, and processes much more efficient,” he says.