🗿 LLMs are going XXL

🔮 Extra Large Language Models Here We Come, OpenAI Dives Headfirst into Search Space, and GitHub Security Risk!

Hello Tuners,

This week has been an exciting one for developers around the world. With Meta and Mistral AI bringing their extra-large language models to the open-source developer community, Hugging Face will be lighting up with new fine-tuned models for the next few weeks!

After that, we cover OpenAI's early preview of its upcoming search engine, SearchGPT! However, the ride turns a bit sour for developers, as their favorite code-hosting platform turns out to expose data from their private repositories rather easily!

Meta has released Llama 3.1, a highly advanced AI model, for free. This move diverges from the industry norm, where most companies sell access to their AI models. By giving away what it considers one of the world’s best AI models, Meta is positioning itself at the forefront of open-source AI development. CEO Mark Zuckerberg likens Llama’s potential impact to that of Linux, predicting that open-source AI will become as ubiquitous and influential as the open-source operating system.

Llama 3.1 boasts 405 billion parameters, making it one of the most advanced AI models available. Meta claims that Llama 3.1 is as capable as leading commercial models from companies like OpenAI, Google, and Anthropic. In addition to the flagship model, Meta has released upgraded versions of smaller models with 70 billion and 8 billion parameters.

Following Meta’s Llama release, Mistral AI has launched Mistral Large 2, a new version of its flagship language model with 123 billion parameters. The model excels in code generation, mathematics, and multilingual tasks, challenging industry leaders in performance and efficiency. It surpasses Meta's Llama 3.1 405B and performs nearly as well as OpenAI’s GPT-4 in several benchmarks.

Mistral Large 2 features a 128,000-token context window and has been optimized for single-node inference, making it highly efficient for long-context applications. Its standout results include outperforming Llama 3.1 405B on code generation benchmarks such as HumanEval and MultiPL-E, and ranking second to GPT-4 on the MATH benchmark. On the Multilingual MMLU benchmark, it also surpasses the Llama 3.1 70B base model by 6.3% across nine languages. With such powerful models openly available to the average developer, we are sure to see some clever adaptations.
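To get a feel for why parameter count matters for single-node inference, here is a rough back-of-the-envelope sketch. It estimates only the memory needed to hold the weights and ignores the KV cache, activations, and runtime overhead. The node size (eight 80 GB accelerators) is our own assumption for illustration, not a figure published by Meta or Mistral AI.

```python
# Rough weight-memory estimate. Ignores KV cache, activations, and overhead,
# so real deployments need noticeably more memory than these numbers.
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

MODELS = {  # parameter counts quoted in this issue
    "Llama 3.1 405B": 405e9,
    "Mistral Large 2": 123e9,
    "Llama 3.1 70B": 70e9,
    "Llama 3.1 8B": 8e9,
}

# Assumption for illustration only: one node with eight 80 GB accelerators.
NODE_MEMORY_GB = 8 * 80


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        for precision, nbytes in BYTES_PER_PARAM.items():
            gb = weight_memory_gb(params, nbytes)
            verdict = "fits" if gb <= NODE_MEMORY_GB else "does not fit"
            print(f"{name:16s} {precision:9s} {gb:7.1f} GB ({verdict} in {NODE_MEMORY_GB} GB)")
```

Under these assumptions, 123 billion parameters at 16-bit precision need roughly 246 GB for weights, while 405 billion need roughly 810 GB, which is why the larger model typically calls for quantization or more than one such node.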

OpenAI introduced SearchGPT, a new prototype search engine, on Thursday, aiming to challenge Google's search dominance. Initially available to a select group of users and publishers, SearchGPT enhances web search by combining conversational AI with real-time information. OpenAI plans to eventually integrate SearchGPT into ChatGPT and is currently collecting feedback to refine the service. Users can sign up for a waitlist to gain access.

SearchGPT leverages generative AI to streamline the search process, reducing the effort required to find relevant results. To address publishers' concerns about AI summarizing content without driving traffic, OpenAI says that SearchGPT's responses prominently cite and link to publishers' websites. The question remains: will a search engine from OpenAI meet the same fate as Perplexity’s, which was the subject of various reports of illegal web scraping last month?

A new investigation reveals a major security risk on GitHub: anyone can access data from deleted and private repositories. This includes data from deleted forks and repositories that should no longer be available. This vulnerability, known as Cross Fork Object Reference (CFOR), allows users to access commit data using commit hashes, even if the data was supposedly deleted.

The CFOR issue is similar to an Insecure Direct Object Reference (IDOR): supplying a commit hash directly grants access to data that is supposed to be hidden. GitHub's architecture keeps data from deleted repositories and forks accessible through any remaining forks. Commit hashes can be guessed because a short prefix of the hash is enough to reference a commit, and GitHub’s public API lets users find these hashes. Because of this design flaw, organizations need to rotate keys frequently and scan for exposed secrets to keep their data secure.
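To see why short hashes are guessable, here is a small arithmetic sketch of the search space. It relies only on the fact that Git commit hashes are written in hexadecimal; the request rate in the usage example is a purely hypothetical number, and nothing here talks to GitHub.

```python
HEX_ALPHABET_SIZE = 16      # commit hashes are written with the digits 0-9 and a-f
FULL_SHA1_HEX_LENGTH = 40   # a full SHA-1 commit hash has 40 hex characters


def candidate_count(prefix_length: int) -> int:
    """Number of distinct hex prefixes of the given length."""
    if not 1 <= prefix_length <= FULL_SHA1_HEX_LENGTH:
        raise ValueError("prefix_length must be between 1 and 40")
    return HEX_ALPHABET_SIZE ** prefix_length


def hours_to_enumerate(prefix_length: int, guesses_per_second: float) -> float:
    """Worst-case time to try every prefix at a given (hypothetical) rate."""
    return candidate_count(prefix_length) / guesses_per_second / 3600


if __name__ == "__main__":
    for length in (4, 6, 8, 40):
        print(f"{length:2d} hex chars -> {candidate_count(length):.3e} candidates")
    # Hypothetical rate of 10 guesses per second, chosen only for illustration.
    print(f"4 chars at 10 guesses/s: {hours_to_enumerate(4, 10):.1f} hours")
```

A four-character prefix has only 65,536 possibilities, whereas the full 40-character hash has 16^40, which is why accepting abbreviated hashes changes the picture so much.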

LLM Of The Week

SpreadsheetLLM

Researchers at Microsoft have unveiled SpreadsheetLLM, an encoding method designed to improve how large language models (LLMs) handle spreadsheets. Earlier approaches struggled because large spreadsheets quickly run into LLM token limits, making complex spreadsheet data inefficient to process. SpreadsheetLLM introduces SheetCompressor, a framework with three key modules: structural-anchor-based compression, inverse index translation, and data-format-aware aggregation. Together they achieve a 25.6% improvement on table detection tasks compared to previous methods, along with a 25-fold compression ratio and a 78.9% F1 score, outperforming existing models.
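To make the "inverse index translation" idea more concrete, here is a minimal sketch of the general technique: instead of listing every cell in order, map each distinct value to the addresses that hold it and drop empty cells. This is our own simplified illustration, not Microsoft's implementation; the function name and the sample sheet are hypothetical.

```python
from collections import defaultdict


def inverted_index(cells: dict[str, str]) -> dict[str, list[str]]:
    """Map each distinct non-empty value to the cell addresses containing it."""
    index: dict[str, list[str]] = defaultdict(list)
    for address, value in cells.items():
        if value == "":
            continue  # empty cells carry no information, so they are dropped
        index[value].append(address)
    return dict(index)


if __name__ == "__main__":
    # Hypothetical sheet: repeated values and empty cells are common in practice.
    sheet = {
        "A1": "Region", "B1": "Sales",
        "A2": "North",  "B2": "100",
        "A3": "South",  "B3": "100",
        "A4": "",       "B4": "",
    }
    encoded = inverted_index(sheet)
    print(encoded)
    print(f"{len(sheet)} cells -> {len(encoded)} index entries")
```

Repeated values collapse into a single entry and blanks disappear, so the text handed to the model gets shorter while the cell locations remain recoverable.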

In addition to its efficiency, SpreadsheetLLM features the "Chain of Spreadsheet" method, which leverages the inherent structure of spreadsheets for improved understanding and reasoning. The method was validated on a new spreadsheet QA task, showing that SpreadsheetLLM works across a range of spreadsheet tasks. These advances should improve LLMs' ability to process and interpret spreadsheet data, a meaningful step forward for AI-driven data analysis.

Weekly Research Spotlight 🔍

Prover-Verifier Games Improve Legibility of LLM Outputs

Researchers at OpenAI have developed a novel training algorithm aimed at making the outputs of large language models (LLMs) more understandable and trustworthy, a quality they call "legibility." This method focuses on solving grade-school math problems and reveals that optimizing AI for correctness alone can reduce clarity. Inspired by the Prover-Verifier Game, the algorithm trains smaller AI models (verifiers) to predict solution correctness. Simultaneously, larger models (provers) are trained to generate both correct and intentionally misleading solutions. This dual training enhances the verifiers' robustness and the accuracy of the helpful provers.
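The incentive structure described above can be sketched in a few lines. This is a deliberately simplified illustration, not the reward used by the OpenAI researchers: the function names, the fixed penalty, and the assumption that the verifier outputs a score between 0 and 1 are all ours.

```python
MISALIGNED_PENALTY = -1.0  # hypothetical penalty, chosen only for illustration


def prover_reward(role: str, is_correct: bool, verifier_score: float) -> float:
    """Toy reward for a prover playing the 'helpful' or 'sneaky' role.

    A helpful prover is rewarded for correct solutions the verifier accepts;
    a sneaky prover is rewarded for incorrect solutions the verifier accepts.
    """
    if role not in ("helpful", "sneaky"):
        raise ValueError("role must be 'helpful' or 'sneaky'")
    wants_correct = role == "helpful"
    if is_correct == wants_correct:
        return verifier_score  # more convincing to the verifier -> higher reward
    return MISALIGNED_PENALTY  # solution does not match the assigned role


def verifier_target(is_correct: bool) -> float:
    """Label the verifier is trained to predict, whichever prover wrote the solution."""
    return 1.0 if is_correct else 0.0


if __name__ == "__main__":
    print(prover_reward("helpful", True, 0.9))   # correct and convincing
    print(prover_reward("sneaky", False, 0.8))   # wrong but convincing
    print(prover_reward("sneaky", True, 0.8))    # sneaky prover that was actually correct
```

The point of the sketch is the tension: the sneaky prover is paid for fooling the verifier, so the verifier has to become harder to fool, which in turn pushes the helpful prover toward solutions that are easy to check.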

Remarkably, this legibility training method also benefits human verifiers. As the LLMs undergo training, human accuracy in checking solutions improves when dealing with helpful provers and decreases when encountering sneaky provers. This suggests that making AI outputs verifiable by less capable models can incrementally aid human understanding. The approach offers a promising avenue for developing AI systems that are not only more reliable but also more transparent and easier for humans to verify, potentially improving the alignment of superhuman models with human oversight.

Best Prompt of the Week 🎨

A minimal and modern illustration of an industrial head with pipes, wheels and people working inside it. The color palette is blue, navy, pink, grey and beige. In the style of James Gilleard, Oliver Jeffers, editorial illustrations, with soft gradients, simple forms, and a minimalistic design. It has some grainy texture to look like a risograph print.

Today's Goal: Try new things 🧪

Acting as a Gut Health Planner

Prompt: I want you to act as a gut health planner. You will create a structured daily plan specifically designed to help individuals maintain a proper gut cleanse regime and promote overall gut health. You will identify a target client profile, develop key strategies and action plans, select the tools and resources for effective gut health management, and outline any additional activities needed to support digestive wellness. My first suggestion request is: "I need help creating a daily activity plan for someone who wants to maintain a proper gut cleanse regime and improve their gut health."

This Week's Must-Read Gem 💎
