📚 After Books and Paper, YouTubers to Train LLMs
💥 AI Companies Backing Out, The Chip's Where The Revenue At, and Anthropic Got Money for You
Hello Tuners,
From corporations being caught red-handed scraping data from questionable sources to similar corporations pulling out of countries over compliance issues with the government, this week sure is a legal conundrum.
As Anthropic joins hands with Menlo Ventures, things look brighter for startups relying on Anthropic’s products. And, oh well, another set of models is being trained on illegally scraped data. I wonder when Figma will bring back its AI features.
Anthropic Takes Inspiration from Apple and Joins Menlo Ventures
Anthropic and Menlo Ventures launched the $100 million Anthology Fund to support early-stage startups using Anthropic’s AI technology. Menlo Ventures will provide funding, while Anthropic offers $25,000 in credits for its large language models. This initiative mirrors Apple's 2008 iFund, aiming to boost developer engagement and innovation in AI.
Menlo Ventures' partner Matt Murphy noted that AI advancements are occurring much faster than previous tech waves. With AI funding more than doubling in the second quarter, the Anthology Fund aims to attract top AI startups by offering enhanced support and resources, driving rapid innovation in the field.
Data Hungry Corporations Backing out From Countries
Meta Platforms suspended its generative AI tools in Brazil following government objections to its new privacy policy. Brazil's National Data Protection Authority (ANPD) halted Meta's use of personal data for AI training, requiring policy adjustments. Brazil, with over 200 million people, is a crucial market for Meta, especially for its WhatsApp user base.
Meta previously launched an AI-driven ad targeting program in Sao Paulo but paused it amid regulatory concerns. Meta is now in discussions with ANPD to resolve issues surrounding generative AI and data privacy, emphasizing the importance of compliance in this significant market. More and more companies suspending operations over compliance issues points to a challenging path ahead for innovation in AI.
After Nvidia, TSMC Shows Promising “AI” Revenue
TSMC now expects sales growth to exceed its previous guidance, forecasting up to $23.2 billion in revenue for the current quarter, surpassing analysts' expectations. The chipmaker also raised its capital spending forecast to $30-32 billion, reflecting confidence in sustained AI demand despite US-Chinese trade tensions.
Driven by high-performance computing and AI infrastructure investments, TSMC reported a 36% rise in June-quarter profit. AI demand has bolstered its revenue, with high-performance computing accounting for 52% of sales. TSMC’s shares rose 4% in New York, underlining its pivotal role in the global AI race. As AI demand surges, the synergy between hardware and software will shape the next wave of tech innovation.
After Reddit, YouTubers are Training AI Models
Tech companies are using controversial methods to gather data for AI models, often without the creators' knowledge. An investigation by Proof News revealed that firms like Anthropic, Nvidia, Apple, and Salesforce used subtitles from 173,536 YouTube videos for training, violating YouTube’s policies. The videos came from educational sources like Khan Academy and popular channels like MrBeast and PewDiePie (also, Apple, why would you wrong Marques Brownlee like this? Tsk, tsk.).
This practice raises ethical concerns about consent and data usage. For amateur developers, it's crucial to consider the implications of using such data. Understanding the sources and ethical use of training data can help build more responsible and transparent AI systems, promoting trust and integrity in AI development.
LLM Of The Week
Mistral NeMo
NVIDIA and Mistral have launched Mistral NeMo, a 12B model featuring a 128k token context window and state-of-the-art performance in reasoning, world knowledge, and coding for its size category. The model is user-friendly, serving as a drop-in replacement for Mistral 7B. Pre-trained base and instruction-tuned checkpoints are available under the Apache 2.0 license to encourage widespread adoption among researchers and enterprises.
Mistral NeMo is optimized for global, multilingual applications, supporting numerous languages including English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi. It introduces a new tokenizer, Tekken, which is more efficient than the tokenizer used in previous Mistral models, especially at compressing source code and text in various languages. This advancement brings cutting-edge AI capabilities to a broader audience across diverse linguistic and cultural contexts.
Weekly Research Spotlight 🔍
RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs
RankRAG is a groundbreaking Retrieval-Augmented Generation (RAG) framework that instruction-tunes a single large language model (LLM) for both top-k context ranking and answer generation. By integrating a small portion of ranking data into its training blend, RankRAG excels in context ranking, outperforming existing expert models, including the same LLM fine-tuned exclusively on extensive ranking data.
In retrieval-augmented generation tasks, RankRAG models significantly surpass GPT-4-0613, GPT-4-turbo-2024-0409, and ChatQA-1.5 across nine knowledge-intensive benchmarks. Additionally, RankRAG shows exceptional generalization capabilities to new domains, such as biomedical tasks, setting a new standard in RAG performance and versatility.
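For Tuners who want to see the rank-then-generate idea in code, here is a minimal Python sketch. It is not the RankRAG implementation: `rank_then_generate`, `toy_score`, and `toy_generate` are hypothetical names, and the word-overlap scorer and string-joining generator are stand-ins assumed purely for illustration. In the framework described above, a single instruction-tuned LLM would play both the ranking and the answering role.

```python
from typing import Callable, List


def rank_then_generate(
    question: str,
    contexts: List[str],
    score: Callable[[str, str], float],
    generate: Callable[[str, List[str]], str],
    k: int = 2,
) -> str:
    """Rerank retrieved contexts, keep the top k, then generate an answer."""
    ranked = sorted(contexts, key=lambda c: score(question, c), reverse=True)
    return generate(question, ranked[:k])


def toy_score(question: str, context: str) -> float:
    """Hypothetical relevance score: fraction of question words found in the context."""
    q_words = set(question.lower().split())
    c_words = set(context.lower().split())
    return len(q_words & c_words) / max(len(q_words), 1)


def toy_generate(question: str, top_contexts: List[str]) -> str:
    """Hypothetical generator: just shows which contexts an LLM would be given."""
    return f"Q: {question} | using: {' || '.join(top_contexts)}"


if __name__ == "__main__":
    retrieved = [
        "Paris is the capital of France",
        "Bananas are yellow",
        "France borders Spain",
    ]
    print(rank_then_generate("What is the capital of France", retrieved, toy_score, toy_generate))
```

Swapping `toy_score` and `toy_generate` for calls to a real model keeps the same shape: score every retrieved passage, keep the top k, and answer from those only.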
Best Prompt of the Week 🎨
A surreal sculpture of a human bust made of fragmented marble-like pieces, interwoven with soft orange flowers and green foliage, serene face with closed eyes, soft gradient lilac background, ethereal and dreamlike atmosphere, nature reclaiming the sculpture. --s 250 --v 6.0
Today's Goal: Try new things 🧪
Acting as a Tax Planner
Prompt: I want you to act as a tax planner. You will create a structured daily plan specifically designed to help individuals manage their taxes efficiently, ensure timely payments, and maintain a clean income tax history. You will identify a target client profile, develop key strategies and action plans, select the tools and resources for effective tax management, and outline any additional activities needed to ensure compliance and financial order. My first suggestion request is: "I need help creating a daily activity plan for someone who wants to pay their taxes on time and maintain a spotless income tax record."
This Week's Must-Read Gem 💎