Named Entity Recognition (NER) is vital in natural language processing, with applications spanning medical coding, financial analysis, and legal document review. Custom models are typically built on transformer encoders pre-trained with self-supervised objectives such as masked language modeling (MLM). In recent years, however, large language models (LLMs) such as GPT-3 and GPT-4 have emerged that can address NER tasks through well-designed prompts, but they bring high inference costs and potential privacy concerns.
The NuMind team presents an approach that uses an LLM to minimize human annotation when creating custom models. Instead of employing the LLM to annotate a single-domain dataset for one specific NER task, the idea is to have it annotate a diverse, multi-domain dataset covering many NER problems. A smaller base model such as BERT is then pre-trained on this annotated dataset, and the resulting model can be fine-tuned for any downstream NER task.
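To make the second stage concrete, here is a minimal sketch of pre-training a compact encoder on LLM-annotated data, assuming that corpus has already been reduced to word-level BIO tags. The toy sentence, tag set, and hyperparameters are illustrative placeholders, not NuMind's actual recipe.

```python
# Minimal sketch: one token-classification training step for a compact
# encoder (BERT) on LLM-annotated NER data. The toy example and tag set
# are placeholders standing in for the multi-domain annotated corpus.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

tags = ["O", "B-ENT", "I-ENT"]  # placeholder tag set
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(tags))

# One toy sentence standing in for the LLM-annotated multi-domain data.
words = ["Apple", "hired", "Tim", "Cook"]
word_tags = [1, 0, 1, 2]  # B-ENT O B-ENT I-ENT

enc = tok(words, is_split_into_words=True, return_tensors="pt")
# Align word-level tags to subword tokens: special tokens (CLS/SEP) get
# -100 so the loss ignores them; subword pieces inherit their word's tag.
labels = [-100 if wid is None else word_tags[wid] for wid in enc.word_ids(0)]
enc["labels"] = torch.tensor([labels])

optim = torch.optim.AdamW(model.parameters(), lr=5e-5)
loss = model(**enc).loss  # one gradient step
loss.backward()
optim.step()
```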
The team has released three NER models:
- NuNER Zero: A zero-shot NER model that adopts the GLiNER (Generalist Model for Named Entity Recognition via Bidirectional Transformer) architecture and takes as input a concatenation of entity types and text. Unlike GLiNER, NuNER Zero works as a token classifier, allowing it to detect arbitrarily long entities. Trained on the NuNER v2.0 dataset, which fuses subsets of Pile and C4 annotated via an LLM using the NuNER procedure, NuNER Zero is the leading compact zero-shot NER model, with a token-level F1 improvement of +3.1% over GLiNER-large-v2.1 on the GLiNER benchmark. A usage sketch follows this list.
- NuNER Zero 4k: The long-context (4k tokens) version of NuNER Zero. It generally performs slightly below NuNER Zero but can outperform it in applications where context size matters.
- NuNER Zero-span: The span-prediction version of NuNER Zero. It shows slightly better performance than NuNER Zero but cannot detect entities longer than 12 tokens.
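The model cards show NuNER Zero being loaded through the gliner Python package; the sketch below follows that pattern, with an example text and entity types of our own (the model card notes that entity types should be lower-cased).

```python
# Zero-shot inference with NuNER Zero via the gliner package
# (pip install gliner). The text and entity types are illustrative.
from gliner import GLiNER

model = GLiNER.from_pretrained("numind/NuNER_Zero")

text = "NuMind has released NuNER Zero, trained on subsets of Pile and C4."
labels = ["organization", "model", "dataset"]  # lower-cased, per the model card

for ent in model.predict_entities(text, labels):
    print(ent["start"], ent["end"], ent["label"], "->", ent["text"])
```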
The key features of these three models are:
- NuNER Zero: Originated from NuNER; a solid default for inputs of moderate length.
- NuNER Zero 4k: A variant of NuNER Zero that works best in scenarios where context size matters.
- NuNER Zero-span: The span-prediction version of NuNER Zero; not suitable for entities longer than 12 tokens (the merge sketch below shows how the token-classifier variants avoid this limit).
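Because the token-classifier variants predict per token, adjacent predictions of the same type can be merged into one entity after inference, which is how they sidestep the 12-token span limit. Below is a sketch of such a post-processing step, modeled on the merge helper shown on the NuNER Zero model card; it assumes gliner-style entity dicts with start, end, and label fields.

```python
# Sketch of merging adjacent same-label predictions into one entity,
# letting the token-classifier variants recover arbitrarily long spans.
# Assumes gliner-style dicts with "start", "end", and "label" keys.
def merge_entities(text, entities):
    merged = []
    for ent in sorted(entities, key=lambda e: e["start"]):
        prev = merged[-1] if merged else None
        # Extend the previous entity when this one has the same label
        # and starts right at (or just after) where the previous ended.
        if prev and ent["label"] == prev["label"] and ent["start"] <= prev["end"] + 1:
            prev["end"] = ent["end"]
            prev["text"] = text[prev["start"]:prev["end"]]
        else:
            merged.append(dict(ent))
    return merged
```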
In conclusion, NER is crucial in natural language processing, but building custom models has typically relied on transformer encoders pre-trained through MLM, while the rise of LLMs such as GPT-3 and GPT-4 brings high inference costs. The NuMind team proposes an approach that uses an LLM to reduce human annotation by annotating a multi-domain dataset. They present three NER models: NuNER Zero, a compact zero-shot model; NuNER Zero 4k, which emphasizes a broader context; and NuNER Zero-span, which prioritizes span prediction with slightly better performance but is limited to entities of at most 12 tokens.
Sources
- https://huggingface.co/numind/NuNER_Zero-4k
- https://huggingface.co/numind/NuNER_Zero
- https://huggingface.co/numind/NuNER_Zero-span
- https://arxiv.org/pdf/2402.15343
- https://www.linkedin.com/posts/tomaarsen_numind-yc-s22-has-just-released-3-new-state-of-the-art-activity-7195863382783049729-kqko/
Sana Hassan, a consulting intern at Marktechpost and a dual-degree student at IIT Madras, is passionate about applying technology and artificial intelligence to address real-world challenges. With a strong interest in solving practical problems, she brings a fresh perspective to the intersection of AI and real-life solutions.