With 22 million daily queries, Brave Search is fast becoming one of the most popular search engines. Brave delivers unbiased search results based on its own index of the Web. Now Brave has gone a step further, using artificial intelligence to introduce its Summarizer. Brave places a strong emphasis on users’ right to privacy and does not track their searches or other actions.
The Brave team explained that it built Summarizer by combining existing technologies, in response to the growing prevalence of AI in search engines, including the release of ChatGPT, an AI language model, and Microsoft’s announcement that it will incorporate OpenAI’s model into its Bing search engine.
Summarizer’s large language models (LLMs) are trained to process multiple sources of information on the Web, which Brave says makes the feature more reliable than a purely generative AI model. When a user types a question into Brave Search, Summarizer provides a short, informative answer at the top of the page, based solely on the web search results for that query. It also credits its sources for transparency and accountability, in contrast to AI chat tools that can provide fabricated answers.
In addition, links to the original sources from which the information was drawn are always provided. The authority bias of large language models can be reduced by maintaining proper attribution and giving users the tools to assess the credibility of information sources.
The Brave Search team built Summarizer from the ground up, so users can be sure it adheres to the same high standards of transparency and privacy. Summarizer relies on Brave’s own models, hosted and run by Brave and tuned for maximum inference efficiency. Rather than rely on ChatGPT or its underlying infrastructure, Summarizer comprises three LLMs, each trained for a specific task:
- The first is a question-answering (QA) model, which extracts answers from pieces of text. This expands on what was already in place to power Brave Search’s Knowledge Graph and Featured Snippets features; Brave has used LLMs for some time to improve search relevance. What differs now is the length and number of text segments evaluated.
- A pool of zero-shot classifiers then further categorizes the candidates that remain after the QA extraction step according to a wide range of criteria (hate speech, vulgar writing, spam, etc.).
- Finally, the summarization/paraphrasing model rewrites the candidate texts to remove redundancies and standardize the language for better readability.
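The three stages above can be sketched as a small pipeline. This is only an illustration of the flow the article describes, not Brave’s implementation: every function is a hypothetical stand-in (word overlap instead of a QA model, a keyword blocklist instead of zero-shot classifiers, sentence deduplication instead of a paraphrasing model), and the `Snippet` type, URLs, and thresholds are assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    url: str   # page the text came from, kept so the summary can cite it
    text: str


def _words(text):
    """Lowercased word set, used by the toy stand-ins below."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def extract_candidates(question, snippets, min_overlap=2):
    """Stage 1 stand-in for the QA model: keep snippets that share
    at least `min_overlap` words with the question."""
    q = _words(question)
    return [s for s in snippets if len(q & _words(s.text)) >= min_overlap]


# Hypothetical spam markers; a real system would use trained classifiers.
BLOCKED_TERMS = {"click", "free", "winner"}


def is_acceptable(snippet):
    """Stage 2 stand-in for the classifiers: reject flagged text."""
    return not (BLOCKED_TERMS & _words(snippet.text))


def summarize(snippets):
    """Stage 3 stand-in for the summarizer: drop repeated sentences."""
    seen, kept = set(), []
    for s in snippets:
        for sentence in re.split(r"(?<=[.!?])\s+", s.text.strip()):
            key = sentence.lower()
            if sentence and key not in seen:
                seen.add(key)
                kept.append(sentence)
    return " ".join(kept)


def answer(question, snippets):
    """Run the three stages; return a summary plus its sources, or None."""
    candidates = [s for s in extract_candidates(question, snippets)
                  if is_acceptable(s)]
    if not candidates:
        return None  # no summary is shown for this query
    return {"summary": summarize(candidates),
            "sources": [s.url for s in candidates]}


results = [
    Snippet("https://example.com/a", "Brave Search runs on its own index of the Web."),
    Snippet("https://example.com/b", "Click here, winner! Brave Search index prizes."),
    Snippet("https://example.com/c", "Bananas are rich in potassium."),
]
result = answer("What index does Brave Search use?", results)
print(result)
```

In this toy run only the first snippet survives: the second is relevant but is rejected by the filter stage, and the third never passes extraction. Returning `None` when no candidate remains mirrors the fact that only some queries produce a summary, and carrying `url` through every stage is what makes source attribution possible at the end.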
Brave Search users on both desktop and mobile can now access Brave Summarizer. The AI model has been tested against the peak of 600 requests per second that Brave Search now processes. Only about 17% of queries currently generate a summary, but the team expects to scale up by applying Summarizer to all searches, while Bing and Google have yet to open up their systems.
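For a rough sense of scale, the figures quoted in this article can be combined in a quick back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that the 17% summary rate applies uniformly across the day and at peak load; the article does not state this.

```python
# Figures quoted in the article.
DAILY_QUERIES = 22_000_000
PEAK_RPS = 600            # peak requests per second
SUMMARY_PERCENT = 17      # share of queries that get a summary

SECONDS_PER_DAY = 24 * 60 * 60

# Average load implied by the daily total.
average_rps = DAILY_QUERIES / SECONDS_PER_DAY

# Assumption: the 17% rate holds uniformly, including at peak.
summaries_per_day = DAILY_QUERIES * SUMMARY_PERCENT // 100
peak_summaries_per_second = PEAK_RPS * SUMMARY_PERCENT // 100

print(f"average load: about {average_rps:.0f} queries per second")
print(f"summaries per day: about {summaries_per_day:,}")
print(f"summaries per second at peak: about {peak_summaries_per_second}")
```

Under that assumption, the summarization models would handle on the order of a hundred requests per second at peak, and a few million per day.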
Considerable work has gone into ensuring that the generated summaries are of high quality and that the system scales. However, since the model is still in its infancy, “hallucinations” are possible, in which seemingly unconnected pieces of text are combined into a single conclusion. The team plans to address these issues soon and to keep refining the model.
Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast and has a strong interest in the applications of artificial intelligence in various fields. She is passionate about exploring new advances in technology and their real-life applications.