Last week saw model releases, funding news, and company formations, as well as developments from OpenAI and Nvidia. Here are five of the most intriguing news items from last week.
OpenAI introduces Sora
As we see dramatic improvements in generative AI, OpenAI's Sora serves as a prime example of how advanced technology can augment human creativity. Sora, OpenAI's newly announced text-to-video model, can generate videos up to one minute long while maintaining visual quality and adhering to user prompts. Sora's internal model of the physical world allows it to create multi-character scenes with realistic movement and precision based on user prompts. The results are truly breathtaking and have to be seen to be believed.
Sora represents another breakthrough for OpenAI, democratizing video production and enabling rapid prototyping and creation of visual content. But its capabilities go far beyond simplifying content creation and reducing the time it takes skilled professionals to create short-form videos.
Of course, on the business side, shorter production cycles mean organizations can respond to market trends faster than ever before. But I'm more interested in Sora's potential impact on other fields. Consider education as just one example. Imagine being able to cater to different learning styles and improve knowledge retention by delivering personalized lessons at the click of a button.
Reka launches new multimodal, multilingual model
Reka announced Reka Flash, a 21-billion-parameter multimodal model that matches the performance of other leading models. Trained from scratch, Reka Flash serves as the turbo offering in the company's lineup of generative AI models. Reka also announced a more compact version called Reka Edge, which has 7 billion parameters and can run locally on-device for more efficient deployment.
At the moment, performance is a key element of every model launch. And as new models are released, evaluating their performance against benchmarks is important to prove their feasibility, reliability, accuracy, and underlying functionality. With this in mind, Reka released benchmark results highlighting the model's performance across several generative AI benchmarks spanning language, multilingual reasoning, vision, and video, including:
- MMLU for knowledge-based question answering.
- GSM8K for reasoning and mathematics.
- HumanEval for code generation.
- GPQA for graduate-level question answering.
The model performed very well on the core language tests, outperforming models such as Google's Gemini Pro, OpenAI's GPT-3.5, and Meta's Llama 2 70B, while falling slightly short of GPT-4 and Gemini Ultra. What I like most about this announcement, though, is the level of transparency Reka provides by disclosing its model's performance on these benchmarks. We've covered just a few of the results here, but the announcement includes many more, compared against other models.
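For readers unfamiliar with how these benchmark numbers are produced, here is a minimal, hypothetical sketch of an exact-match scoring harness of the kind used for knowledge and math benchmarks such as MMLU or GSM8K. The `ask_model` stub and the sample item are placeholders for illustration, not Reka's actual evaluation code.

```python
# Minimal benchmark-scoring sketch (hypothetical; not Reka's harness).
# Each item pairs a prompt with a reference answer; the benchmark score
# is simply the fraction of model answers that match the reference.

def ask_model(prompt: str) -> str:
    """Placeholder for a real model call (API request or local inference)."""
    return "56"  # canned answer so the sketch runs end to end

def exact_match_accuracy(items: list[dict]) -> float:
    """Score a model on exact-match QA items, MMLU/GSM8K style."""
    correct = 0
    for item in items:
        answer = ask_model(item["prompt"]).strip().lower()
        correct += answer == item["reference"].strip().lower()
    return correct / len(items)

# Illustrative item only; real benchmarks contain thousands of such items.
sample_items = [{"prompt": "Q: What is 7 * 8? Answer with a number only.",
                 "reference": "56"}]
print(exact_match_accuracy(sample_items))  # 1.0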
Lambda raises $320M to fuel GPU cloud growth
Founded in 2012, Lambda Labs has pushed the boundaries of AI and helped businesses leverage it in unique ways. Starting in 2017, Lambda began promoting its GPU cloud offering. What started with wearable technology and image recognition has evolved into a purpose-built GPU cloud for training, inference, deep learning, and more.
More recently, Lambda has focused on training large language models (LLMs) and other types of generative AI through its on-demand and reserved clouds, making it easy for customers to leverage thousands of Nvidia GPUs. And this is exactly where the $320 million will be spent: accelerating the growth of Lambda's GPU cloud and enabling teams to use thousands of Nvidia GPUs with high-speed Nvidia Quantum-2 InfiniBand networking. Amid the global GPU shortage, Lambda's capital injection will help ensure the company's continued expansion of GPU resources and alleviate potential bottlenecks in AI development and deployment.
The most important part of this announcement is how well it aligns with what organizations prioritize when choosing AI infrastructure. Research from Enterprise Strategy Group revealed that the top two features organizations seek are performance and ease of deployment. Lambda consistently checks both of these boxes, and this funding will help ensure it keeps checking them as organizations pursue generative AI initiatives.
Guardrails AI launches with seed round, aims to improve LLM reliability
Trust has been at the center of many conversations about generative AI lately. As organizations pursue generative AI, they seek stability, accuracy, and compliance. They want to protect their reputation. They want transparent insight into model performance. And they want control, along with the ability to empower end users to use generative AI responsibly and for the better.
In other words, organizations need the assurance of a systematic methodology for ensuring the safety and effectiveness of generative AI. Guardrails AI is a platform focused on enabling safe and effective use of generative AI with increased accuracy and reliability. At launch, the company also introduced Guardrails Hub, an open source product that enables AI developers to build, contribute, share, and reuse advanced validation technologies called validators.
These validators can be used in conjunction with the core Guardrails platform to serve as an important layer of trust for building AI applications, ensuring compliance with specified guidelines and standards. Guardrails offers over 50 pre-built validators created by Guardrails AI as well as several partners and open source contributors. These validators provide essential AI risk management tools that help enterprises address compliance, security, and ethical AI in a programmatic way.
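To illustrate the validator pattern in general terms, here is a minimal, hypothetical sketch of output checks that enforce simple guidelines before an LLM response reaches a user. This is generic illustrative Python, not the actual Guardrails AI API; the validator names and logic are my own assumptions.

```python
# Hypothetical sketch of the validator pattern (not the Guardrails AI API).
# A validator inspects an LLM output and either passes it through or
# reports a violation of a specified guideline.

import re

def no_email_addresses(text: str) -> tuple[bool, str]:
    """Validator: fail if the output leaks something that looks like an email."""
    if re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", text):
        return False, "Output contains an email address."
    return True, ""

def max_length(limit: int):
    """Validator factory: fail if the output exceeds a word limit."""
    def check(text: str) -> tuple[bool, str]:
        if len(text.split()) > limit:
            return False, f"Output exceeds {limit} words."
        return True, ""
    return check

def validate_output(text: str, validators) -> list[str]:
    """Run every validator and collect the violations, if any."""
    return [msg for ok, msg in (v(text) for v in validators) if not ok]

# Usage: check a model response against a small validator suite.
response = "Contact me at jane.doe@example.com for details."
print(validate_output(response, [no_email_addresses, max_length(50)]))
# ['Output contains an email address.']
```

In practice, checks like these sit between the model and the end user so that non-compliant responses can be blocked or revised before delivery; pre-built validators from a hub spare teams from writing each check from scratch.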
Nvidia offers a new chatbot that runs on your PC
Nvidia's new custom chatbot, Chat with RTX, allows users to personalize a chatbot with their own content on their PC. The demo is free to download and uses retrieval-augmented generation (RAG), Nvidia TensorRT-LLM software, and Nvidia RTX acceleration to bring generative AI capabilities locally to GeForce-powered Windows PCs.
End users can connect local files as data sets to open source LLMs such as Mistral and Llama 2, enabling queries that return fast, contextual answers. It's also incredibly flexible in terms of file format support, handling txt, pdf, doc, docx, and xml files. Inside the chat, users can even paste the URL of a YouTube video for additional context.
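For readers curious about what retrieval-augmented generation looks like under the hood, here is a minimal, hypothetical sketch of the general pattern: index local documents, retrieve the passages most relevant to a question, and hand them to a locally running model as context. This is generic illustrative Python with placeholder functions, not Nvidia's implementation, and the word-overlap retrieval is a stand-in for real embedding-based search.

```python
# Minimal RAG sketch (hypothetical; not Nvidia's Chat with RTX code).
# 1) Split local files into passages, 2) retrieve the passages most
# similar to the question, 3) prepend them to the prompt for a local LLM.

from pathlib import Path

def load_passages(folder: str) -> list[str]:
    """Read local .txt files and split them into paragraph-sized passages."""
    passages = []
    for path in Path(folder).glob("*.txt"):
        passages += [p.strip() for p in path.read_text().split("\n\n") if p.strip()]
    return passages

def retrieve(question: str, passages: list[str], k: int = 3) -> list[str]:
    """Toy retrieval: rank passages by word overlap with the question.
    A real system would use embeddings and a vector index instead."""
    q_words = set(question.lower().split())
    ranked = sorted(passages,
                    key=lambda p: len(q_words & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]

def run_local_llm(prompt: str) -> str:
    """Placeholder: an optimized local inference engine would run here."""
    return "(model output)"

def answer(question: str, folder: str) -> str:
    """Build a context-grounded prompt for a locally running model."""
    context = "\n---\n".join(retrieve(question, load_passages(folder)))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return run_local_llm(prompt)

print(answer("What does the report say about Q3?", "./docs"))  # usage
```

The key point is the retrieval step: because answers are grounded in passages pulled from the user's own files, the chatbot stays contextual while everything runs on the local machine.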
With so much emphasis on cloud-based LLM services these days, this announcement is a breath of fresh air. When you run your chatbot locally, all data remains on your device. There is no need to share your data with third parties or over the internet. This means you can handle sensitive data without worrying about it leaving your device.
Of course, there are certain requirements the PC itself must meet, such as a GeForce RTX 30 Series GPU with 8 GB of video RAM and a Windows 10 or 11 OS. But either way, the ability to have a personalized, context-aware chatbot on a personal device is very powerful in itself.
Mike Leone is a Principal Analyst in TechTarget's Enterprise Strategy Group, covering data, analytics, and AI.
Enterprise Strategy Group is a division of TechTarget. The company's analysts have business relationships with technology vendors.