Working with AI: Under-the-Hood Components vs. In-Cockpit Techniques
Boardroom.AI Staff
AI is smashing boundaries these days, and the relationship between humans and language models like ChatGPT is undergoing a significant shift. Until now, users had to adapt to the quirks and limitations of AI systems, meeting the machines on their terms. Recent feature rollouts, however, have made these models able to understand and produce virtually any type of media content. The future of AI lies in an increasingly user-centric approach, where large data models (coined here as LDMs) are designed to understand and cater to the unique needs and preferences of individual users across many different modes of interaction. Google Gemini and OpenAI’s GPT-4o “omni” model already embrace multimodality from the ground up, meaning that nearly any type of data can serve as input or output. These improvements will coincide with more and more happening under the hood and less human intervention from the cockpit.
Just as different aircraft are built and configured for specific purposes, deep learning algorithms come in various shapes and sizes to suit different applications. Sometimes, a massive, general-purpose model with billions of parameters is required to tackle a wide range of tasks. Other times, a smaller, faster, and more specialized model is the best fit. It’s all about choosing the right tool for the job, whether it’s a 747 for a long-haul flight or a single-engine Cessna for a short hop.
Under the Hood: The Engine of Language Models
To understand how language models work, it’s essential to examine the key components that power their performance. Under-the-hood elements worth understanding include:
- Vast training data: Just as an aircraft needs fuel, language models require vast amounts of data to generate accurate and contextually relevant responses. This data comes from pre-training on diverse datasets, continuous learning, and the integration of structured and unstructured sources.
- Custom instructions: Like a GPS navigation system, custom instructions guide the model towards desired outputs, ensuring alignment with user expectations and domain-specific requirements.
- Retrieval of file attachments: Language models have access to a toolbox of external data sources, documents, images, and other media, which provide supplementary context and examples to enhance their performance.
- Tool use: Language models can call on external tools such as code interpreters, web search, and third-party APIs, much as an aircraft relies on auxiliary systems beyond its airframe, extending what the model can accomplish on its own.
- Fine-tuning and multimodal processing: Fine-tuning jobs allow models to adapt to specific tasks or domains (when given a bit of time and effort to provide specific data or reinforce with human feedback), while multimodal processing enables them to integrate information from various input modalities, such as text, speech, images, and video (as part of the move toward multimodality or what may soon be called omnimodality).
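To make the components above concrete, here is a minimal sketch of how they might be assembled into a single chat-style request. The function, field names, and tool definition are illustrative assumptions loosely modeled on common chat APIs, not any specific vendor's schema:

```python
# Hypothetical sketch: assembling "under the hood" components into one request.
# A system message carries custom instructions, retrieved file snippets supply
# context, and a tool definition exposes an external capability.

def build_request(user_question, custom_instructions, retrieved_snippets, tools):
    """Assemble a chat-style request payload (illustrative structure)."""
    context = "\n\n".join(retrieved_snippets)  # retrieval of file attachments
    messages = [
        {"role": "system", "content": custom_instructions},  # custom instructions
        {"role": "system", "content": f"Reference material:\n{context}"},
        {"role": "user", "content": user_question},
    ]
    return {"messages": messages, "tools": tools}  # tools enable tool use

request = build_request(
    user_question="Summarize Q3 revenue by region.",
    custom_instructions="You are a concise financial analyst. Cite sources.",
    retrieved_snippets=["Q3 report excerpt: EMEA revenue rose 12% year over year."],
    tools=[{"name": "spreadsheet_query",
            "description": "Run a query over an uploaded spreadsheet."}],
)
```

The key point is separation of concerns: standing instructions, retrieved context, and tool definitions each occupy their own slot, so any one component can change without disturbing the others.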
In the Cockpit: Techniques for Steering Language Models
While the under-the-hood components provide the foundation for language models, it’s the cockpit techniques that allow users to steer them toward desired outcomes. These techniques include:
- Zero-shot, one-shot, and few-shot prompting: These techniques give language models zero, one, or a handful of worked examples of a task, letting them adapt to new challenges on the fly; in general, the more representative examples a user can supply, the more reliable the output.
- Chain of thought and chain of feedback prompting: By generating step-by-step reasoning over multiple prompts and iteratively refining outputs based on human feedback, these techniques help models tackle complex problems that would otherwise be unattainable in one go.
- Scratchpad prompting and end-of-interaction summation: Scratchpad prompting allows models to store intermediate computations and notes while the interaction unfolds. At the end of a chat, it is often extremely useful to ask the AI to summarize the entire conversation into a single report, a single paragraph, or even a concise email.
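The cockpit techniques above are ultimately about how a prompt is constructed. The following sketch shows one plausible way to build few-shot, chain-of-thought, and end-of-interaction summation prompts; the helper functions and wording are illustrative assumptions, not a prescribed method:

```python
# Illustrative prompt construction for the cockpit techniques above.
# These helpers only build prompt strings; sending them to a model is
# left to whatever chat interface or API the user has available.

def few_shot_prompt(task, examples, query):
    """Zero-shot when examples is empty; one- or few-shot otherwise."""
    lines = [task]
    for inp, out in examples:  # worked examples steer the model's format
        lines.append(f"Input: {inp}\nOutput: {out}")
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

def chain_of_thought(prompt):
    """Ask for step-by-step reasoning before the final answer."""
    return prompt + "\n\nThink through this step by step, then give the final answer."

def summation_request(style="a single paragraph"):
    """End-of-interaction summation: condense the whole chat."""
    return f"Summarize our entire conversation so far as {style}."

prompt = few_shot_prompt(
    task="Classify the sentiment of each review as positive or negative.",
    examples=[("Great battery life!", "positive"),
              ("Broke after a week.", "negative")],
    query="Setup was painless and it just works.",
)
```

With zero examples the same helper produces a zero-shot prompt, which illustrates why these techniques sit on a single spectrum rather than being entirely separate tools.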
The Interplay of Components and Techniques
Just as a skilled pilot must understand both the mechanics of their aircraft and the techniques for navigating it, users of AI must grasp the interplay between under-the-hood components and cockpit techniques. By seamlessly integrating these elements, large data models can achieve remarkable performance and provide intricately tailored, user-centric experiences. Nonetheless, the shift will continue: as more and more happens under the hood, everyone effectively becomes a better prompt engineer over time, without having to lift a finger.
As the field continues pushing forward, ongoing improvements in both components and techniques will keep opening new frontiers in what modern computing can achieve. By staying on top of these developments and mastering the art of piloting AI models, users can unlock the full potential of these transformative tools and chart a course toward new horizons for their business, their personal life, or both.