Tune AI
🎗️From Theory to Noble Glory

🔮 AI Contributes to this Year's Physics and Chemistry Nobel Prizes, Anthropic Price Changes, and Reflection 70B Post-Mortem

Aryan Kargwal
October 11, 2024

Hello Tuners,

This week marked a historic moment for AI: the Royal Swedish Academy of Sciences awarded the 2024 Nobel Prize in Physics to John J. Hopfield and Geoffrey E. Hinton for their groundbreaking work in machine learning and neural networks. Hinton, the "godfather of deep learning," helped develop and popularize backpropagation, while Hopfield's research on associative memory transformed our understanding of learning in biological systems. This recognition emphasizes AI's significant role in driving innovation across various industries. The Nobel Prize in Chemistry likewise honours Demis Hassabis and John M. Jumper for using AI, through DeepMind's AlphaFold, to predict protein structures, and David Baker for computational protein design, opening doors for drug discovery and biotechnology advancements.

Meanwhile, Matt Shumer's launch of Reflection 70B has sparked controversy after its benchmark scores turned out to be inflated by an evaluation bug, underscoring the need for transparency in AI development. In a competitive move, Anthropic’s Message Batches API lets businesses process large volumes of queries at reduced cost, positioning the company as a strong contender against OpenAI. These developments illustrate the rapidly evolving AI landscape, where groundbreaking discoveries and fierce competition are shaping the future.

Hopfield X Hinton: The Noble AI
See Everyone, AI isn’t just Scheduling Calls Now
Thanks for Clarifying Reflection 70B, I have Llama 3.1 405B Now!
Anthropic Decreases API Costs, Almost Matches GPT-4o Costs

Hopfield X Hinton: The Noble AI

In a groundbreaking development for the AI industry, the Royal Swedish Academy of Sciences has awarded the 2024 Nobel Prize in Physics to John J. Hopfield and Geoffrey E. Hinton for their foundational contributions to machine learning and artificial neural networks. Their work laid the cornerstone for modern AI technologies, enabling machines to mimic aspects of human learning and decision-making. Hinton, often called the "godfather of deep learning," was instrumental in developing backpropagation, the algorithm used to train most neural networks. Hopfield's eponymous "Hopfield networks," meanwhile, revolutionized how we understand associative memory in biological systems.
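For readers who want a concrete feel for associative memory, here is a minimal sketch of a Hopfield network in plain Python. It assumes the classic textbook formulation (units that are +1 or -1, Hebbian weights, one-unit-at-a-time sign updates); the function names and the two toy patterns are our own illustrations, not taken from Hopfield's paper.

```python
def train_hopfield(patterns):
    """Build a Hebbian weight matrix from a list of +1/-1 patterns."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights, state, sweeps=5):
    """Update units one at a time until the state settles on a stored memory."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(weights[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
    return s


# Two toy "memories" (hypothetical data, chosen to be orthogonal).
memory_a = [1, 1, 1, 1, -1, -1, -1, -1]
memory_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([memory_a, memory_b])

# Flip the first bit of memory_a and let the network repair it.
noisy = [-1] + memory_a[1:]
restored = recall(weights, noisy)
print(restored == memory_a)  # True
```

The point of the sketch is that the network is queried with a damaged pattern rather than an address: the stored memory closest to the input is what comes back.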

Awarding a Nobel Prize for AI-related work is a monumental step for the tech industry, marking the increasing recognition of AI's impact on society. This achievement not only highlights the profound influence of artificial intelligence on scientific advancement but also underscores the role of AI in shaping future innovations across industries, from healthcare and finance to transportation. By honouring these visionaries, the Nobel Committee has acknowledged the far-reaching implications of AI in modern life, bringing the field to the forefront of scientific recognition.

See Everyone, AI isn’t just Scheduling Calls Now

In an extraordinary leap for chemistry and artificial intelligence, the 2024 Nobel Prize in Chemistry was awarded to Demis Hassabis, John M. Jumper, and David Baker. Their revolutionary work has propelled the understanding of proteins, life's essential building blocks, into new territory. Hassabis and Jumper, using cutting-edge AI through DeepMind’s AlphaFold, solved the long-standing challenge of predicting the three-dimensional structure of almost all known proteins, a feat that had eluded scientists for over 50 years. Baker, for his part, has developed computational methods to design entirely new proteins, offering wide-ranging possibilities for medical and biotechnological advancements.

This breakthrough represents a pivotal moment for the intersection of AI and biochemistry, as the ability to predict and design proteins opens doors to new treatments, bioengineering innovations, and an enhanced understanding of life at the molecular level. The Nobel Committee’s decision highlights AI's profound impact on scientific discovery, acknowledging that integrating computing power into biological sciences could reshape the future of medicine, disease prevention, and understanding of fundamental life processes. This news comes shortly after the announcement of the Nobel Prize in Physics for Hopfield and Hinton, whose work on neural networks is, however indirectly, part of the lineage behind AlphaFold, which is itself a neural network.

Thanks for Clarifying Reflection 70B, I have Llama 3.1 405B Now!

The launch of Reflection 70B by Matt Shumer, co-founder and CEO of Hyperwrite AI, has generated significant controversy in the AI community, which we discussed at length in some of our previous editions. The model was initially touted as the world's top open-source model based on impressive benchmark results, but subsequent evaluations by third-party researchers failed to reproduce these claims. Discrepancies arose, particularly on benchmarks like MATH and GSM8K, leading to accusations of inflated performance and a lack of transparency surrounding the model's development.

On September 5th, @mattshumer_ announced Reflection 70B, a model fine-tuned on top of Llama 3.1 70B, showing SoTA benchmark numbers, which was trained by me on Glaive generated data.
Today, I'm sharing model artifacts to reproduce the initial claims and a post-mortem to address… x.com/i/web/status/1…
— Sahil Chaudhary (@csahil28)
10:28 PM • Oct 2, 2024

In response to these concerns, Sahil Chaudhary, founder of Glaive, released a post-mortem report admitting to a bug in the evaluation code that had contributed to the inflated scores. While the revised benchmarks still demonstrate strong performance, they are lower than initially claimed. Chaudhary acknowledged the hasty release of the model, emphasizing the need for better transparency and communication regarding its strengths and weaknesses. Despite the challenges, he expressed hope of further exploring the "reflection tuning" approach, which aims to improve a model's accuracy before it outputs a response to the user. This incident highlights a shortcoming of the current AI sphere, where shipping a new and supposedly more robust model has become a rat race, often littered with fine-tuned models trained inefficiently to chase bloated evaluation scores.

Anthropic Decreases API Costs, Almost Matches GPT-4o Costs

The recent departure of an OpenAI co-founder to Anthropic has reignited discussions about the shifting dynamics within the AI industry. Despite OpenAI's significant funding, the trend of key personnel leaving suggests that competitors like Anthropic are becoming increasingly attractive to talent. Such exits may not only signify a loss of expertise for OpenAI; they could also lead to innovations and improvements at rival companies, ultimately calling OpenAI's future relevance into question in a rapidly evolving landscape.

Introducing the Message Batches API—a cost-effective way to process vast amounts of queries asynchronously.
You can submit batches of up to 10,000 queries at a time. Each batch is processed within 24 hours and costs 50% less than standard API calls.
— Anthropic (@AnthropicAI)
4:50 PM • Oct 8, 2024

Anthropic's recent launch of its Message Batches API, which allows businesses to process large volumes of queries asynchronously at half the cost of standard API calls, exemplifies this competitive momentum. By significantly reducing costs and making AI technologies more accessible, Anthropic positions itself as a formidable competitor to OpenAI. This new pricing strategy reflects a deeper understanding of enterprise needs. It could lead to broader adoption of AI among mid-sized businesses, potentially at the expense of OpenAI's market share. As talent continues to migrate and competitors innovate, OpenAI may find it increasingly challenging to maintain its dominance in the AI sector.
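To put the two numbers from the announcement (up to 10,000 queries per batch, 50% off) into practice, here is a rough planning sketch. It does not call Anthropic's API; the helper names and the per-query price are hypothetical placeholders you would replace with your own figures.

```python
MAX_BATCH_SIZE = 10_000  # per the announcement: up to 10,000 queries per batch
BATCH_DISCOUNT = 0.5     # per the announcement: 50% less than standard calls


def split_into_batches(queries, max_size=MAX_BATCH_SIZE):
    """Chunk a list of queries so that no batch exceeds the size limit."""
    return [queries[i:i + max_size] for i in range(0, len(queries), max_size)]


def estimate_cost(num_queries, cost_per_query, batched):
    """Estimate total spend; cost_per_query is whatever you pay today."""
    standard = num_queries * cost_per_query
    return standard * (1 - BATCH_DISCOUNT) if batched else standard


# Hypothetical workload: 25,000 queries at an assumed $0.01 per standard call.
queries = [f"query-{i}" for i in range(25_000)]
batches = split_into_batches(queries)

print(len(batches))                    # 3
print([len(b) for b in batches])       # [10000, 10000, 5000]
print(round(estimate_cost(len(queries), 0.01, batched=False), 2))  # 250.0
print(round(estimate_cost(len(queries), 0.01, batched=True), 2))   # 125.0
```

The trade-off to keep in mind is latency: each batch is processed within 24 hours, so this only suits work that does not need an immediate answer.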

Weekly Research Spotlight 🔍

Were RNNs All We Needed?

The scalability limitations of Transformers, particularly regarding sequence length, have sparked renewed interest in recurrent sequence models that remain parallelizable during training. Recent architectures like S4, Mamba, and Aaren have emerged, demonstrating performance competitive with Transformers. This work revisits established recurrent neural networks (RNNs), specifically Long Short-Term Memory networks (LSTMs) from 1997 and Gated Recurrent Units (GRUs) from 2014. Historically, these models were hindered by the need to backpropagate through time (BPTT): each step depends on the previous hidden state, so training had to proceed one step at a time, which made them slow.

The authors propose removing the hidden-state dependencies from the input, forget, and update gates of LSTMs and GRUs, so that the gates depend only on the current input. This modification eliminates the need for BPTT, allowing for efficient parallel training. They introduce minimal versions of these networks, minLSTMs and minGRUs, that use significantly fewer parameters than their traditional counterparts and are fully parallelizable during training. Remarkably, these minimal architectures are reportedly 175 times faster to train on sequences of length 512 and achieve empirical performance comparable to modern sequence models. This advancement highlights the potential of revisiting and optimizing traditional RNN architectures in the current AI landscape.
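Here is a small sketch of why dropping the hidden state from the gates matters, using a one-dimensional minGRU in plain Python. We assume the update h_t = (1 - z_t) * h_(t-1) + z_t * c_t, with gate z_t and candidate c_t computed from x_t alone; the scalar weights and function names are hypothetical, and the cumulative-product form below merely stands in for the parallel scan a real implementation would use.

```python
import math
import operator
from itertools import accumulate


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def min_gru_coefficients(xs, w_z, b_z, w_h, b_h):
    """Gates depend only on x_t, so every step's coefficients are known up front."""
    zs = [sigmoid(w_z * x + b_z) for x in xs]
    candidates = [w_h * x + b_h for x in xs]
    decay = [1.0 - z for z in zs]                      # a_t, multiplies h_(t-1)
    inputs = [z * c for z, c in zip(zs, candidates)]   # b_t, added at step t
    return decay, inputs


def run_sequential(decay, inputs, h0):
    """Classic step-by-step recurrence: h_t = a_t * h_(t-1) + b_t."""
    h, states = h0, []
    for a_t, b_t in zip(decay, inputs):
        h = a_t * h + b_t
        states.append(h)
    return states


def run_closed_form(decay, inputs, h0):
    """Same states from prefix products and sums, with no step-by-step loop over h.

    With A_t = a_1 * ... * a_t, h_t = A_t * (h0 + sum over k <= t of b_k / A_k).
    Prefix operations like these can be parallelized; this naive version is
    only numerically safe for short sequences.
    """
    prefix_prod = list(accumulate(decay, operator.mul))
    prefix_sum = list(accumulate(b / a for b, a in zip(inputs, prefix_prod)))
    return [a * (h0 + s) for a, s in zip(prefix_prod, prefix_sum)]


xs = [0.5, -1.0, 2.0, 0.0, 1.5]  # toy input sequence
decay, inputs = min_gru_coefficients(xs, w_z=0.8, b_z=0.1, w_h=1.2, b_h=-0.3)
sequential = run_sequential(decay, inputs, h0=0.0)
closed_form = run_closed_form(decay, inputs, h0=0.0)
print(all(abs(p - q) < 1e-9 for p, q in zip(sequential, closed_form)))  # True
```

In a standard GRU the gate z_t would also depend on h_(t-1), so the coefficients could not be computed ahead of time and only the sequential loop would be available.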

LLM Of The Week

Movie Gen

Meta has recently unveiled a paper introducing Movie Gen, a suite of foundation models capable of generating high-quality 1080p HD videos with synchronized audio and varying aspect ratios. The technology not only excels in text-to-video synthesis but also offers precise, instruction-based video editing and the ability to create personalized videos based on a user's image. The models achieve state-of-the-art performance across multiple tasks, including video personalization, video-to-audio generation, and text-to-audio generation, marking a significant advancement in the field of media generation.

Movie Gen in Action

The standout feature of Movie Gen is its largest video generation model, a 30-billion parameter transformer that can handle a maximum context length of 73K video tokens. This capability translates to generating 16-second videos at 16 frames per second. The paper details numerous technical innovations, including enhancements in architecture, latent spaces, training objectives, data curation, and evaluation protocols. It also outlines parallelization techniques and inference optimizations that leverage increased pre-training data, model size, and computational resources. By sharing these advancements, Meta aims to foster further progress and innovation within the research community focused on media generation models. For more insights, all videos related to this research can be accessed at MovieGen Research Videos.
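As a quick back-of-the-envelope check on those figures, the snippet below works out how many frames a 16-second clip contains and what the context length amounts to per frame on average. It treats "73K" as exactly 73,000 tokens, which is our assumption, and the per-frame figure is only an average budget: the model works in a compressed latent space, so tokens do not map one-to-one onto output frames.

```python
CONTEXT_TOKENS = 73_000  # "73K" video tokens, assumed to mean exactly 73,000
DURATION_SECONDS = 16
FRAMES_PER_SECOND = 16

total_frames = DURATION_SECONDS * FRAMES_PER_SECOND
avg_tokens_per_frame = CONTEXT_TOKENS / total_frames

print(total_frames)          # 256
print(avg_tokens_per_frame)  # 285.15625
```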

Best Prompt of the Week 🎨

A traditional Indian festive setup featuring a brass plate of golden, syrup-soaked jalebis, garnished with rose petals, silver foil, and sliced almonds. The scene is set on a dark, textured surface, with minimal décor. Surrounding the plate are simple brass bowls and utensils, creating a rustic yet elegant look. The lighting is soft and warm, highlighting the richness of the food and the traditional brassware. The overall vibe is festive yet minimal, with a focus on the food and the traditional Indian aesthetic.

Today's Goal: Try new things 🧪

Acting as a Political Campaign Planner

Prompt: I want you to act as a political campaign planner. You will create a detailed daily plan specifically designed to help an agency organize and execute an effective political campaign for a leading political party in Tamil Nadu. You will identify key strategies for voter engagement, develop action steps for messaging and outreach, select tools and resources for campaign management, and outline any additional activities needed to ensure the campaign’s success. My first suggestion request is: "I need help creating a daily activity plan for an agency that is planning a political campaign for a top political party in Tamil Nadu."

This Week's Must Read Gem 💎

Creating Thread Datasets using Tune Studio

Learn how to create and manage thread datasets for fine-tuning LLMs using Tune Studio’s API. Enhance your assistant workflows with context-rich datasets and decrease hallucinations.

tunehq.ai/blog/creating-thread-datasets-using-tune-studio