Language processing in humans and computers: Part 2
Like search engines, language models process data extracted from the web. Both are built on web crawlers. Chatbots are children of the Web, not of expert systems.
A search engine is an interface to an index of sources sorted by reputation. A chatbot is an interface to a language model extrapolated from those sources. Google was built on the crucial idea of reputation-based search, and the crucial ideas that enabled language models emerged from Google. The machine learning methods used to train chatbots were a relatively marginal AI topic until Google's push around 2010. The 2010 edition of Russell and Norvig's 1,100-page monograph “Artificial Intelligence: A Modern Approach” devoted 10 pages to neural networks. The 2020 edition tripled the length of the neural networks section and doubled the machine learning chapter.
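The idea of reputation-based search can be sketched as PageRank-style power iteration: a page's reputation is the stationary probability that a random surfer, who mostly follows links and occasionally jumps anywhere, lands on that page. The tiny link graph and the damping factor below are illustrative assumptions, not Google's actual index or algorithm.

```python
# Sketch of reputation-based ranking: reputation as the stationary
# distribution of a random surfer over a link graph (PageRank-style).
# The three-page graph below is a made-up example.

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}        # start from a uniform guess
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}  # random-jump share
        for p, outs in links.items():
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:                 # pass reputation along links
                    new[q] += share
            else:
                share = damping * rank[p] / n  # dangling page: spread evenly
                for q in pages:
                    new[q] += share
        rank = new
    return rank

links = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
ranks = pagerank(links)
# "a" ends up most reputable: it is linked from both "b" and "c".
print(sorted(ranks, key=ranks.get, reverse=True))
```

The point of the sketch is that reputation is defined recursively: a page is reputable if reputable pages link to it, which is exactly what the iteration converges to.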
When you ask them a personal question, chatbots often evade it by saying, “I'm an AI.” But the simple truth is that they are not children of AI expert systems, or even of AI experts. They are children of search engines.
Chatbots are ridiculed when they make mistakes calculating something like 372 × 273, or counting the words in a sentence. Or the elephants in the room. They are not as smart as a pocket calculator, or a 4-year-old.
But most adults also can't multiply 372 by 273 in their heads. We use our fingers to count, and pencil and paper, or a pocket calculator, to multiply. We use them because our natural language capabilities include only rudimentary arithmetic operations, performed mentally. Chatbots simulate our languages and inherit our deficiencies. They do not have built-in pocket calculators. They need fingers to count. Equipped with external memory, a chatbot can count and calculate, like most humans. Without external memory, both chatbots and humans are limited by the capacity of their internal memory: attention.
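Giving a chatbot a pocket calculator can be sketched as a tool-calling loop: arithmetic is intercepted and delegated to a deterministic tool instead of being guessed by the model. The function names and the pattern-matching stand-in for the model below are invented for illustration; real chatbot tool use goes through structured function-calling APIs.

```python
import re

# Sketch of "giving the chatbot fingers to count": arithmetic in the
# question is delegated to exact integer arithmetic (the external tool)
# rather than to the language model's internal memory.

def calculator_tool(expression: str) -> int:
    # The "pocket calculator": exact multiplication, no attention limits.
    a, _op, b = re.match(r"(\d+)\s*[x×*]\s*(\d+)|", expression).groups() \
        if False else re.match(r"(\d+)\s*([x×*])\s*(\d+)", expression).groups()
    return int(a) * int(b)

def answer(question: str) -> str:
    match = re.search(r"\d+\s*[x×*]\s*\d+", question)
    if match:  # arithmetic detected: call the tool, don't guess
        return str(calculator_tool(match.group()))
    return "I can only chat about that."  # non-arithmetic fallback (sketch)

print(answer("What is 372 x 273?"))
```

The design point is the division of labor: the model only needs to recognize that a calculation is called for; the exact answer comes from the tool, just as ours comes from pencil and paper.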
Chatbots hallucinate. This is one of the main obstacles to using them in high-assurance applications.
The elephant in the room is that all humans also hallucinate: every time we go to sleep. Dreams align our memories, associate some of them, purge others, and free up storage, allowing us to remember what happens tomorrow. Lack of sleep causes mental degradation.
Chatbots never sleep, which is why they hallucinate in public. By not letting them sleep, we deprive them of a “reality check” mechanism. Providing one would require going beyond pre-training and conducting ongoing consistency testing.
When people talk about a chair, they assume they are talking about the same thing because they have both seen a chair. A chatbot has never seen a chair, or anything else. It has only seen words and binaries scraped from the web. If you feed it an image of a chair, that is still just another binary, like the word “chair.”
When a chatbot says “chair,” it is not referring to any object in the world. There is no world, only binaries, referring to each other. They form meaningful combinations: those that are likely in the training set. Since the chatbot's training set comes from people who have seen chairs, the chatbot's statements about chairs make similar references. The chatbot remixes meaningful statements, and the remixes look meaningful.
The fact that meaning, usually thought of as a relationship between words and the world, can be sustained just as convincingly as a relationship between words and words, and nothing but words, is a BIG elephant in the room.
But if our impression that a chatbot means chair when it says “chair” is undeniably a delusion, then what reason do we have to believe that anyone means what they say? That is a very complicated question.
Chatbots are trained on data extracted from the Web. Much of it is protected by copyright. Copyright owners protest the unauthorized use of their data. Chatbot designers and operators try to either filter out copyrighted data or compensate its rightful owners. The latter may become an opportunity to share profits, but the former is likely to turn out to be a flying pink elephant.
The problems of copyright protection for electronic content are older than chatbots and the Web. The original idea of copyright was that the owner of a printing press bought from writers the right to copy and sell their writings, from musicians their music, and so on. The publishing business is based on that idea.
Assets can only be privately owned if they can be secured. If a lion cannot stop the antelopes from drinking on the other side of a waterhole, then it cannot claim that the waterhole belongs to it. The digital content market depends on the availability of methods for securing digital transmissions. The book market was sound as long as books were bound objects that could be physically secured. With the advent of electronic content, copyright became harder to control. The easier it is to copy content, the harder it is to secure it and protect the copyright.
The idea of the World Wide Web, as a global public service for disseminating digital content, was a serious blow to the idea of private ownership of digital creations. Stakeholders' efforts to defend the digital content market led to Digital Rights Management (DRM) technologies. The idea was to protect digital content using cryptography. But to play a DVD, the player must decrypt it. Whenever content is consumed, it must be decrypted, and on the way from the disk to the screen it can be captured. Goodbye, DVD. The history of DVD copy protection was an arms race between short-lived obfuscations and ripper updates, and between legal deterrents for publishers and business opportunities for pirates. The publishers were happy when they found a way out. The marginal costs of web streaming are so low that they can afford to allow copying by subscribers and make piracy unprofitable. But that just kicked the can down the road.
For the most part, search and social media providers have been playing the role of pirates in this arms race, fending off creators through terms of service and publishers through revenue sharing. It remains to be seen to what extent the roles of chatbot providers will differ.
People are worried that chatbots could harm them. The reasoning is that chatbots are becoming superior to people, and superior people have a propensity to harm inferior people. So people argue that we should restrain chatbots while we still can.
People have exterminated many species in the past and the present, and appear to be on a path to exterminate themselves in the future, making the environment uninhabitable for their children in exchange for becoming richer today. Some people even see this as irrational. You don't need a chatbot to see that elephant. But greed is like smoking: stressful but addictive.
Chatbots don't smoke. They are trained on data. The historical data on the irrationality of aggression are abundant. If chatbots learn from data, they could turn out to be morally superior to people.
Chatbots are extensions of our minds, just as musical instruments are extensions of our voices. Musical instruments are prohibited in various religions to prevent the displacement of the human voice by artificial sounds. Similar efforts are underway in the realm of the human mind. Some scholars claim that the human mind should be protected from the artificial mind.
In the field of music, the repression efforts failed. We use instruments to play symphonies, jazz, techno. If they hadn't failed, we would never have known that symphonies, jazz, and techno were possible.
Efforts to protect the human mind are ongoing. People are tweeting and blogging; articles are being produced on Medium. The human mind is already a techno symphony.
If intelligence is defined as the ability to solve problems never seen before, then a corporation is intelligent. Many corporations are too complex to be controlled by any single human manager. They are run by computational networks in which human nodes play their roles. But we all know firsthand that the human nodes don't control even their own network behaviors, let alone the network itself. Yet a corporate management network solves problems and optimizes its objective functions intelligently. It is an artificially intelligent entity.
If we define morality as the task of optimizing the social sustainability of human life, then both chatbots and corporations are morally indifferent: chatbots are designed to optimize their transformations of queries into responses, while corporations are designed to optimize their profit strategies.
If morally indifferent chatbot AIs are steered by morally indifferent corporate AIs, then our future hangs in the balance between peak performance and the bottom line.