NEC Corporation has expanded its NEC cotomi generative AI services and launched NEC cotomi Pro and NEC cotomi Light, two new high-speed generative AI large language models (LLMs) featuring updated training data and architectures.
With the rapid development of generative AI in recent years, many organisations have been exploring and validating business transformation using LLMs. As specific application scenarios emerge, customers need models and deployment formats that meet their requirements for response time, integration with business data, information protection and other security considerations during implementation and operation.
NEC’s newly developed NEC cotomi Pro and NEC cotomi Light are high-speed models that match the performance of leading global LLMs while running at more than ten times their speed.
Typically, improving an LLM’s performance requires making the model larger, which slows inference. NEC says it has improved both speed and performance through a newly developed training method and model architecture.
NEC cotomi Pro achieves performance comparable to top-level global models such as GPT-4 and Claude 2, and responds approximately 87% faster than GPT-4 on an infrastructure of two graphics processing units (GPUs).
The even faster NEC cotomi Light matches the performance of global models such as GPT-3.5-Turbo, yet can process a large number of requests at high speed on an infrastructure of roughly one to two GPUs, providing sufficient performance for many tasks.
For example, in an in-house document retrieval system using retrieval-augmented generation (RAG), the model achieved a higher correct-response rate than GPT-3.5 without fine-tuning and a higher correct-response rate than GPT-4 after fine-tuning, while responding approximately 93% faster.
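The article does not describe NEC's RAG implementation, but the general pattern is straightforward: retrieve the in-house passages most relevant to a query, then pass them to the LLM as context for the answer. The sketch below illustrates that flow only; the document store, scoring function and call_llm() are hypothetical placeholders, not NEC cotomi's actual interfaces.

```python
# Minimal sketch of a RAG flow, assuming a generic LLM API (not NEC cotomi's).
from collections import Counter

DOCUMENTS = {
    "leave-policy": "Employees may carry over up to five days of unused leave into the next year.",
    "expense-rules": "Travel expenses above 50,000 yen require prior approval from a department manager.",
    "security": "In-house documents must not be uploaded to external generative AI services.",
}

def score(query: str, text: str) -> int:
    """Crude relevance score: how many query words appear in the document."""
    query_words = set(query.lower().split())
    doc_words = Counter(text.lower().split())
    return sum(doc_words[w] for w in query_words)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    ranked = sorted(DOCUMENTS.values(), key=lambda text: score(query, text), reverse=True)
    return ranked[:k]

def call_llm(prompt: str) -> str:
    """Placeholder for a call to an LLM endpoint; stubbed here for illustration."""
    return f"[model answer based on a prompt of {len(prompt)} characters]"

def answer(query: str) -> str:
    # Retrieval step: pull in-house passages relevant to the question.
    context = "\n".join(retrieve(query))
    # Generation step: ask the model to answer using only the retrieved context.
    prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)

if __name__ == "__main__":
    print(answer("How many days of leave can be carried over?"))
```

In this pattern, fine-tuning (as reported above) adapts the generation model itself to a company's data, while retrieval supplies up-to-date documents at query time; production systems typically replace the keyword scoring shown here with vector search.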
NEC says that a model combining high processing power with high speed and the capacity to handle large volumes of concurrent access can significantly shorten the response times of business applications that use generative AI and improve the user experience. In addition, that high processing power can deliver significant performance gains after fine-tuning on each company's own data.