LLMs are not enough
In 2023, Large Language Models (LLMs) took center stage in the AI world, showcasing their ability to perform general tasks through simple prompting. However, as we move through 2024, a significant shift is occurring in the AI landscape. State-of-the-art AI results are increasingly being achieved not by single, monolithic models, but by compound systems with multiple interacting components.
The Emergence of Compound AI Systems
Compound AI Systems are defined as systems that tackle AI tasks using multiple interacting components, including multiple calls to models, retrievers, or external tools. This contrasts with a traditional AI model, which is simply a statistical model, such as a Transformer, that predicts the next token in text.
Several high-profile examples illustrate this shift:
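To make the definition concrete, here is a minimal sketch of one common compound pattern: a retrieval-augmented pipeline that chains a retriever, a model call, and a verification step. Every component below (`retrieve`, `generate`, `verify`) is a hypothetical stub standing in for a real vector store, model API, or fact-checking tool.

```python
def retrieve(query, corpus, k=2):
    """Toy retriever: rank documents by word overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(words & set(d.lower().split())))
    return scored[:k]

def generate(prompt):
    """Stand-in for an LLM call; a real system would call a model API."""
    return f"Answer based on: {prompt[:60]}..."

def verify(answer, docs):
    """Toy fact check: require the answer to echo retrieved context."""
    return any(doc.split()[0].lower() in answer.lower() for doc in docs)

def rag_pipeline(query, corpus):
    """Chain the three components: retrieve, generate, then verify."""
    docs = retrieve(query, corpus)
    prompt = "Context:\n" + "\n".join(docs) + f"\nQuestion: {query}"
    answer = generate(prompt)
    return answer if verify(answer, docs) else "Unable to verify an answer."

corpus = ["Paris is the capital of France.", "The Nile is a river in Africa."]
print(rag_pipeline("What is the capital of France?", corpus))
```

The point is structural, not the toy logic: the system's behavior emerges from how the components interact, not from any single model call.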
- Google’s AlphaCode 2 uses LLMs to generate up to 1 million possible solutions for programming tasks and then filters them down.
- AlphaGeometry combines an LLM with a traditional symbolic solver to tackle complex mathematical problems.
- In enterprise settings, 60% of LLM applications use some form of retrieval-augmented generation (RAG), and 30% use multi-step chains.
- Microsoft’s chaining strategy exceeded GPT-4’s accuracy on medical exams by 9%.
- Google’s Gemini uses a complex inference strategy called CoT@32, which samples the model 32 times and aggregates the answers, to report its benchmark results.
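Several of these examples share a sample-and-aggregate structure: call the model many times, then filter or vote over the candidates, as AlphaCode 2 and CoT@32 do. A toy sketch of the majority-vote variant follows; `sample_model` is a hypothetical stub for a stochastic (temperature > 0) model call that happens to be right two times out of three.

```python
from collections import Counter

def sample_model(question, call_index):
    """Stand-in for one stochastic LLM sample; this stub answers
    correctly on two of every three calls."""
    return "4" if call_index % 3 else "3"

def cot_at_k(question, k=32):
    """CoT@32-style inference: draw k samples and keep the most
    common answer."""
    answers = [sample_model(question, i) for i in range(k)]
    return Counter(answers).most_common(1)[0][0]

print(cot_at_k("What is 2 + 2?"))  # majority answer: "4"
```

Even with an unreliable sampler, aggregation across 32 calls recovers the majority answer, which is why these multi-call strategies can outperform a single model invocation.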
Why the Shift to Compound Systems?
Several factors are driving this trend:
- Ease of Improvement: For many tasks, improving system design offers better returns than scaling up model size.
- Dynamic Capabilities: Systems can incorporate timely data and respect access controls, overcoming limitations of static training datasets.
- Enhanced Control and Trust: Compound systems allow for better control of AI behavior and can increase user trust through features like fact verification.
- Flexibility in Performance and Cost: Systems can be tailored to meet varied performance goals and budget constraints.
Challenges in Developing Compound AI Systems
While promising, compound AI systems present new challenges:
- Design Space: The range of possible system designs is vast, requiring careful exploration and resource allocation.
- Optimization: Co-optimizing multiple components to work well together is complex, especially with non-differentiable components.
- Operation: MLOps becomes more challenging, requiring new tools for monitoring, debugging, and securing these complex systems.
Emerging Approaches and Solutions
To address these challenges, several new approaches are emerging:
- Composition Frameworks: Tools like LangChain, LlamaIndex, and AutoGPT help developers build applications with multiple AI components.
- Automatic Optimization: Frameworks like DSPy aim to optimize systems composed of LLM calls and other tools to maximize target metrics.
- Cost Optimization: FrugalGPT and AI gateways help route inputs to different AI model cascades to optimize performance within budget constraints.
- Advanced Operations: New LLMOps and DataOps tools are being developed to monitor and debug complex AI systems more effectively.
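The cost-optimization idea can be sketched in a few lines. Below is a hedged, FrugalGPT-style model cascade: try a cheap model first and escalate to a stronger, more expensive one only when a confidence score falls below a threshold. Both models and the scoring rule are hypothetical stubs, not any library's actual API.

```python
def cheap_model(prompt):
    """Stub small model: confident only on short prompts."""
    return ("ok-answer", 0.9) if len(prompt) < 40 else ("guess", 0.3)

def strong_model(prompt):
    """Stub large model: assumed reliable but costly."""
    return ("strong-answer", 0.95)

def cascade(prompt, threshold=0.8):
    """Return (answer, model_used), escalating when confidence is low."""
    answer, score = cheap_model(prompt)
    if score >= threshold:
        return answer, "cheap"
    answer, _ = strong_model(prompt)
    return answer, "strong"

print(cascade("short question"))                       # served by the cheap model
print(cascade("a much longer, harder question " * 3))  # escalates to the strong model
```

Real gateways add routing policies, caching, and per-request budgets on top of this basic escalation loop, but the core trade-off (quality versus cost per call) is the same.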
The Future of AI Development
As AI continues to evolve, compound systems are likely to remain at the forefront of innovation. They offer a way to maximize AI quality and reliability beyond what single models can achieve. This shift opens new possibilities for AI applications but also requires developers to adapt their approaches and toolsets.
The trend towards compound AI systems represents a maturation of the field, moving from the excitement of individual powerful models to the engineering of sophisticated AI solutions. As we progress through 2024 and beyond, mastering the art of designing, optimizing, and operating these systems will likely become a crucial skill for AI developers and researchers alike.