Cristian Dordea
Feb 2, 2024
The process Microsoft is using for its ML & AI Projects
Microsoft's modern process for delivering and managing ML and AI projects, how its lifecycle compares with CRISP-DM, and the downside of this approach.
Highlights:
This edition reviews Microsoft's modern process for delivering and managing ML and AI projects, how its lifecycle compares with CRISP-DM, and the downside of this approach.
As always, see AI for Business News You May Have Missed at the end of our newsletter.
Intro
In 2016, Microsoft released its own process for handling ML & AI projects, called the Team Data Science Process (TDSP). It is an agile, iterative data science methodology for delivering predictive analytics and AI solutions.
After digging deeper into the process, I realized it closely resembles the CRISP-DM lifecycle combined with Scrum as the delivery framework. Let's go through the details so you can better understand how it works. At the end, I will point out the biggest drawback of this approach.
The Microsoft process is one of the most complete approaches that I've seen in this space. That is because most of the data science, ML, and AI processes out there usually just have a standalone lifecycle. It's similar to how we have the software development lifecycle (SDLC) in developing software solutions.
In most cases, this is not enough: without an integrated delivery framework, it is difficult to coordinate and organize work across larger teams and complex projects, especially in large enterprises. Just as we combine the SDLC with a delivery framework like Scrum or Waterfall in software development, AI & ML also require a delivery framework.
TDSP lifecycle
The TDSP lifecycle is composed of five major stages that are executed iteratively:
- Business Understanding: Define objectives and identify data sources
  - Artifacts during this stage:
    - Charter Document
    - Data Sources
    - Data Dictionaries
- Data Acquisition and Understanding: Ingest the data, explore the data, and set up a data pipeline
  - Artifacts during this stage:
    - Data Quality Report
    - Solution Architecture
- Modeling: Feature engineering and model training to decide if the model is suitable for production
  - Artifacts during this stage:
    - Generated Features
    - Report of the Model
- Deployment: Operationalize the model
  - Artifacts during this stage:
    - Status Dashboard
    - Final Modeling Report
    - Final Solution Architecture
- Customer Acceptance: System validation & project hand-off
  - Artifacts during this stage:
    - Exit report of the project for the customer
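For readers who like to see structure as code, the stages and artifacts listed above can be sketched as a simple checklist. This is only an illustration: the `Stage` class, the `missing` method, and the `next_incomplete` helper are hypothetical names of my own, not part of any Microsoft tooling. Only the stage names, goals, and artifacts come from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Stage:
    """One TDSP lifecycle stage with its expected artifacts (hypothetical model)."""
    name: str
    goal: str
    artifacts: list[str]
    done: set[str] = field(default_factory=set)  # artifacts delivered so far

    def missing(self) -> list[str]:
        """Return the artifacts that have not been delivered yet, in list order."""
        return [a for a in self.artifacts if a not in self.done]


# Stage names, goals, and artifacts are taken from the list above.
TDSP_STAGES = [
    Stage("Business Understanding",
          "Define objectives and identify data sources",
          ["Charter Document", "Data Sources", "Data Dictionaries"]),
    Stage("Data Acquisition and Understanding",
          "Ingest the data, explore the data, and set up a data pipeline",
          ["Data Quality Report", "Solution Architecture"]),
    Stage("Modeling",
          "Feature engineering and model training",
          ["Generated Features", "Report of the Model"]),
    Stage("Deployment",
          "Operationalize the model",
          ["Status Dashboard", "Final Modeling Report",
           "Final Solution Architecture"]),
    Stage("Customer Acceptance",
          "System validation & project hand-off",
          ["Exit report of the project for the customer"]),
]


def next_incomplete(stages: list[Stage]) -> Stage | None:
    """Return the first stage that still has missing artifacts, or None."""
    for stage in stages:
        if stage.missing():
            return stage
    return None


# Example: the team has finished every Business Understanding artifact.
TDSP_STAGES[0].done.update(TDSP_STAGES[0].artifacts)
current = next_incomplete(TDSP_STAGES)
print(current.name, "still needs:", current.missing())
```

Because the stages are executed iteratively, a real team would revisit earlier stages as they learn more. The checklist only tracks which artifacts exist, not how many passes it took to produce them.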
Compared to CRISP-DM
Mapped against the CRISP-DM phases, the TDSP stages look like this:
| CRISP-DM | Microsoft TDSP |
| --- | --- |
| Business understanding | Business understanding |
| Data understanding | Data acquisition and understanding |
| Data preparation | Data acquisition and understanding |
| Modeling | Modeling |
| Evaluation | Modeling |
| Deployment | Deployment |
| None | Customer acceptance |
You can see how the lifecycles are similar and how the phases map to each other.
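The same mapping can be written as a small lookup table and inverted to show where TDSP merges CRISP-DM phases. This is a minimal sketch: the variable names are my own, and the data is only what appears in the comparison table.

```python
from collections import defaultdict

# CRISP-DM phase -> TDSP stage, copied from the comparison table.
CRISP_DM_TO_TDSP = {
    "Business understanding": "Business understanding",
    "Data understanding": "Data acquisition and understanding",
    "Data preparation": "Data acquisition and understanding",
    "Modeling": "Modeling",
    "Evaluation": "Modeling",
    "Deployment": "Deployment",
}

# The five TDSP stages, in lifecycle order.
TDSP_STAGE_NAMES = [
    "Business understanding",
    "Data acquisition and understanding",
    "Modeling",
    "Deployment",
    "Customer acceptance",
]

# Invert the mapping: TDSP stage -> list of CRISP-DM phases it covers.
tdsp_to_crisp = defaultdict(list)
for crisp_phase, tdsp_stage in CRISP_DM_TO_TDSP.items():
    tdsp_to_crisp[tdsp_stage].append(crisp_phase)

# TDSP stages that absorb more than one CRISP-DM phase.
merged = {stage: phases for stage, phases in tdsp_to_crisp.items()
          if len(phases) > 1}

# TDSP stages with no CRISP-DM counterpart.
tdsp_only = [stage for stage in TDSP_STAGE_NAMES
             if stage not in tdsp_to_crisp]

print("Merged stages:", merged)
print("TDSP-only stages:", tdsp_only)
```

Inverting the table makes the two structural differences explicit: TDSP folds data preparation into data acquisition and understanding, folds evaluation into modeling, and adds customer acceptance as a stage that CRISP-DM does not have.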
Combined with the Delivery Framework
The TDSP lifecycle is combined with Scrum, using time-based Sprints and the regular Scrum artifacts and meetings. Microsoft also outlines how to break down the backlog into features, user stories, and tasks, which you can read in more detail here. Using Scrum gives teams an incremental approach to ML & AI work, which is a step up from the Waterfall approach that used to be the norm. We all know the risk that comes with Waterfall.
The approach doesn't stop here. Microsoft also defined six different roles and even developed templates for the artifacts, which you can read more about here.
Despite the detailed process outlines and great documentation from Microsoft, there is one big drawback to this delivery approach.
Downside
As we mentioned in our last newsletter post, one of the biggest differences between AI/ML projects and software development is the level of unknowns and the experimental nature of the work. Estimating timelines and resources is more straightforward in software projects because the tasks are more predictable. For ML/AI projects, the experimental nature of the work and its data dependencies make accurate estimation challenging. Committing to specific work in a time-based sprint is even harder. In software development, most teams run one- or two-week sprints in which they commit to what will be delivered in that timeframe.
In data science and ML work, most of the time it is impossible or extremely difficult to make commitments at the sprint level. Sometimes, the work itself is trying to answer a question or confirm a hypothesis, and the team doesn't even know if the question can be answered.
Because of its fixed, time-based Sprints, Scrum is a sub-optimal approach for ML/AI teams.
In Conclusion
The Microsoft TDSP is one of the most complete approaches for ML/AI because it combines a data science lifecycle with an agile delivery framework. Sure, the lifecycle is very similar to CRISP-DM, but there is a reason why CRISP-DM has been the most popular lifecycle for the last 25 years: it is fairly simple, and it works. Microsoft TDSP doesn't stop there. It defines a detailed, thoroughly documented process, including specific artifacts with templates and clearly defined roles and responsibilities.
I personally haven't seen any other delivery process for ML/AI documented as well as the Microsoft TDSP. With that said, I believe Microsoft TDSP falls short because it relies on regular, time-based Scrum Sprints. Data science and AI work is not the same as software development, and most of the time the team cannot predict or commit to the outcome of each sprint.
There must be a better way. Our exploration of better ways of managing AI projects continues.
Generative AI for Business News You May Have Missed
Amazon debuts AI-powered Rufus shopping assistant
Amazon.com Inc. today introduced a new chatbot, dubbed Rufus, that will help online shoppers browse products and decide which ones to buy. (read more)
Financial services introducing AI but hindered by data issues
According to research by EXL, around 89 percent of insurance and banking firms in the UK have introduced AI solutions over the past year. However, issues with data optimisation could hinder their impact. (read more)
Rapid growth in AI and cloud topics evident in latest O’Reilly annual report
A new report released today by O’Reilly Media Inc. finds, not surprisingly, that there was an unprecedented surge of interest in generative artificial intelligence in 2023, based on usage metrics from the O’Reilly Learning Platform. (read more)
Riding the wave: How Dynatrace looks to capitalize on growth of AI and need for transparency
Companies are moving rapidly to integrate artificial intelligence into key workloads, but can anyone really be sure of what’s happening inside the AI engine? (read more)
Google announces UK data centre to meet ‘growing demand’ for AI
Google has announced plans to invest $1 billion in a new data centre in the UK which it says will help to meet “growing demand” for its AI and cloud services. (read more)
Italian data protection regulator accuses OpenAI of violating GDPR rules
Italy’s data protection authority has OpenAI’s ChatGPT lined up in its sights: After briefly banning the service from being used last year because of alleged privacy violations, the Garante today formally filed charges against the artificial intelligence developer. (read more)
How will AI impact the gambling industry?
Using AI in gambling experiences opens up new dimensions of interactivity and engagement. AI algorithms can optimise live dealer games, making them more responsive and adaptable to player preferences. (read more)
AI Training & Certifications
Building AI Applications with Vector Databases:
In this 1-hour beginner-friendly course, you’ll harness the versatility of vector databases to build a wide range of applications using minimal coding!
Andrew Ng, founder of DeepLearning.AI, launches “AI for Everyone” course:
“AI for Everyone”, a non-technical course, will help you understand AI technologies and spot opportunities to apply AI to problems in your own organization.
Generative AI for Executives by AWS
This class shows you how to mitigate hallucinations, data leakage, and jailbreaks. Incorporating these ideas into your development process will make your apps safer and higher quality.
Introduction to Artificial Intelligence (AI) by IBM on Coursera
In this course you will learn what Artificial Intelligence (AI) is, explore use cases and applications of AI, and understand AI concepts and terms like machine learning, deep learning, and neural networks.
Microsoft Certified: Azure AI Engineer Associate - Certifications
Target audience: Professionals tasked with building, managing, and deploying AI solutions using Azure AI, covering all phases of AI solution development.
Jetson AI Courses and Certifications | NVIDIA Developer
Target audience: Suitable for anyone interested in AI and Edge AI, from beginners to advanced learners.