The rise of AI has touched nearly every industry. From the early days of custom models to the arms race around large-scale foundation models, AI is disrupting every sector, and debates over everything from copyright in training data to the new challenges of generated content surround this gold rush. One thing seems remarkably clear: there is a seemingly insatiable appetite for AI. When demand grows like this, we have to look at the challenges in the supply chain. We are seeing substantial scalability challenges in infrastructure, power, and talent with the right skills. In this series, we’ll talk about some of those challenges; after all, they are also opportunities for innovation. But before we can talk about where we are going, it’s important to establish a clearer understanding of how AI is delivered.
Instead of yet another description, please check out this pretty complete summary. Still, it is important to understand the AI family:

- AI enables machines to perform cognitive tasks that typically require human intelligence, such as analyzing data, recognizing patterns, making predictions, and automating complex decisions. AI is the broadest category; Machine Learning and Deep Learning both sit inside it.
- Machine Learning enables systems to automatically learn patterns from data and improve performance without being explicitly programmed for each task, allowing organizations to extract insights and automate decisions at scale. Unlike broader AI, ML specifically learns and adapts from experience with data.
- Deep Learning uses neural networks to process complex, unstructured data like images, speech, and text, powering advanced applications in computer vision, natural language processing, and pattern recognition. Deep Learning differs by using multi-layered networks, loosely inspired by the structure of the human brain, to handle the most complex data types.

AI Lifecycle and Value Stream
In the lifecycle of an AI model, two distinct and critical phases exist: model development and model inference. The first, development, is an intensive, iterative process focused on building, training, and validating the model. The second, inference, is the deployment and operational use of that trained model to make predictions or decisions.
Model development begins with model design, a foundational stage where the problem is defined and a suitable model architecture is chosen. This involves selecting algorithms (e.g., deep neural networks or decision trees) and defining the model’s structure, such as the number of layers and neurons in a neural network. Once designed, the model is trained on data and then moves to model testing, a crucial phase that evaluates its performance on unseen data. Testing measures qualities such as accuracy, precision, and robustness, checking that the model generalizes well beyond its training data and catching issues like overfitting.
Once the model proves its reliability, it undergoes release and promotion to inference, a process that packages the trained model and its dependencies into a deployable artifact. This artifact is then optimized for the production environment, often for speed and memory usage, and pushed to a live system where it can serve real-world applications.
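
To make the development phase concrete, here is a minimal sketch of the train, test, and release steps using only the Python standard library. Everything in it is an illustrative assumption rather than a prescribed workflow: the synthetic data, the one-variable linear model, the function names (`make_data`, `train`, `evaluate`, `package`), and the 0.5 error threshold used as a release gate. A real project would use a proper ML framework and a richer artifact format.

```python
import json
import random

def make_data(n, seed=0):
    """Synthetic data: y = 3x + 2 plus small noise (an illustrative assumption)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 3.0 * x + 2.0 + rng.gauss(0, 0.1)) for x in xs]

def train(rows):
    """Fit slope and intercept by ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = cov / var
    return {"slope": slope, "intercept": mean_y - slope * mean_x}

def predict(model, x):
    """Apply the fitted line to a single input."""
    return model["slope"] * x + model["intercept"]

def evaluate(model, rows):
    """Mean absolute error on rows the model has not seen during training."""
    return sum(abs(predict(model, x) - y) for x, y in rows) / len(rows)

def package(model, test_error):
    """Serialize the parameters and the test result into a deployable artifact."""
    return json.dumps({"version": 1, "params": model, "test_mae": test_error})

data = make_data(200)
train_rows, test_rows = data[:160], data[160:]  # hold out 20% as unseen data
model = train(train_rows)
test_mae = evaluate(model, test_rows)

# Release gate: only promote the model if it clears the (hypothetical) threshold.
artifact = package(model, test_mae) if test_mae < 0.5 else None
```

The point of the sketch is the ordering: the held-out rows never touch `train`, and the artifact is only produced after the model passes its test.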
Model inference, the operational side of AI, refers to the process of using a trained model to make predictions on new data. The style of inference deployment is often dictated by the application’s requirements for latency, throughput, and cost. Online inference is a style where predictions are made in real time as each request comes in. This is ideal for applications requiring low latency, such as a recommendation engine on a website or a fraud detection system processing a transaction. In contrast, batch inference processes a large volume of data at scheduled intervals, making it suitable for tasks where real-time predictions are not necessary, such as generating monthly sales forecasts or processing nightly reports. A third style, edge inference, moves the computation directly to the device (e.g., a smartphone or a smart camera) rather than a centralized server. This style is critical for applications that need to function offline or where data privacy is paramount, like facial recognition on a personal device. Understanding these styles is essential for business technologists, because the choice directly shapes system architecture, resource allocation, and the overall user experience. It matters just as much for business executives, because it affects growth opportunity, cost and efficiency, safety and risk, and environmental, social, political, and governance considerations.
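
The difference between online and batch inference can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the artifact, its parameters, the request shape, and the function names (`load_model`, `online_inference`, `batch_inference`) are all hypothetical, and a production system would sit behind a web server or a job scheduler rather than plain function calls.

```python
import json

# Hypothetical packaged artifact for a one-variable linear model.
ARTIFACT = json.dumps({"params": {"slope": 3.0, "intercept": 2.0}})

def load_model(artifact):
    """Unpack the trained parameters from the artifact."""
    return json.loads(artifact)["params"]

def predict(model, x):
    """Apply the model to a single input."""
    return model["slope"] * x + model["intercept"]

def online_inference(model, request):
    """Answer one request as it arrives; per-request latency is what matters."""
    return {"id": request["id"], "prediction": predict(model, request["x"])}

def batch_inference(model, records, chunk_size=2):
    """Score stored records in chunks on a schedule; throughput is what matters."""
    results = []
    for start in range(0, len(records), chunk_size):
        chunk = records[start:start + chunk_size]
        results.extend(predict(model, record["x"]) for record in chunk)
    return results

model = load_model(ARTIFACT)
single = online_inference(model, {"id": "req-1", "x": 1.0})
nightly = batch_inference(model, [{"x": 0.0}, {"x": 1.0}, {"x": 2.0}])
```

Both paths call the same `predict` function on the same model; what changes is when and how the data arrives. Edge inference would run the same logic with the artifact stored on the device itself instead of on a central server.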
Stay tuned for Part 2: Differences in Model Building and Inference