While Lean and Agile principles are nothing new in the software development community, they have so far struggled to make a recognised impact within the data science community.
Data science teams tend to take a mainly academic approach to projects: long periods of research and discovery, coupled with modelling and evaluation exercises carried out in isolation on limited or inaccurate data sets. While these parts of the process are valuable, they produce few tangible deliverables within any reasonable time frame, and the results often need to be re-engineered before they can be of any use to the business.
As a result, the value of the data science team is seriously undermined, and the team is criticised for lacking accountability and, in turn, for failing to show a return on investment (ROI) to the business.
Applying Lean and Agile methodologies in conjunction with good data science practices helps ensure that regular milestones are met and that a continuous flow of deliverable items can be identified and measured against a desired objective, which contributes to the overall success of a data science project.
A popular misconception is that using Agile will undermine the quality of the overall deliverable. It does not. Quality is maintained by breaking the project into prioritised, deliverable iterations that follow a critical path through the overall data science project. This approach also allows many traditionally linear tasks to be performed in parallel by a cross-functional team whose members are all working towards the same goal: successful delivery within an agreed timescale. The Agile continuous feedback loop with the business on progress and delivery milestones allows regular check-pointing against business strategy and goals, and helps keep the project relevant.
Using an established data science process model such as CRISP-DM, the team can break any data science project into a clearly defined list of tasks with clearly understood delivery milestones, resource requirements and a logical order of priority. If we then convert the project into user stories, map the various delivery elements to tasks in an Agile backlog and assign a cross-functional Scrum team, we can deliver a data science project using Lean and Agile techniques while still adhering to CRISP-DM best practices.
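To make the mapping concrete, here is a minimal sketch of how user stories tagged with CRISP-DM phases might be ordered in a backlog and grouped into sprints. The six phase names come from CRISP-DM itself; everything else (the Story fields, the example stories, their priorities and point estimates, the sprint capacity and the greedy packing rule) is a hypothetical illustration, not something prescribed by Scrum or CRISP-DM.

```python
from dataclasses import dataclass

# The six CRISP-DM phases, in their usual order.
CRISP_DM_PHASES = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]


@dataclass
class Story:
    """A backlog item (hypothetical structure for illustration)."""
    title: str
    phase: str      # one of CRISP_DM_PHASES
    priority: int   # 1 = highest business priority
    points: int     # effort estimate in story points


def plan_sprints(backlog, capacity):
    """Pack stories into sprints by priority, then CRISP-DM phase order.

    Assumption: a simple greedy rule that starts a new sprint whenever the
    next story would exceed the capacity. Real teams negotiate this.
    """
    for story in backlog:
        if story.phase not in CRISP_DM_PHASES:
            raise ValueError(f"Unknown CRISP-DM phase: {story.phase}")
        if story.points > capacity:
            raise ValueError(f"Story too large for one sprint: {story.title}")

    ordered = sorted(
        backlog,
        key=lambda s: (s.priority, CRISP_DM_PHASES.index(s.phase)),
    )

    sprints, current, used = [], [], 0
    for story in ordered:
        if current and used + story.points > capacity:
            sprints.append(current)
            current, used = [], 0
        current.append(story)
        used += story.points
    if current:
        sprints.append(current)
    return sprints


# Hypothetical backlog for a customer churn project.
backlog = [
    Story("Baseline churn model", "Modeling", 2, 5),
    Story("Agree churn KPI with sales", "Business Understanding", 1, 3),
    Story("Expose scores via batch job", "Deployment", 3, 5),
    Story("Profile CRM export", "Data Understanding", 1, 5),
    Story("Review lift with stakeholders", "Evaluation", 3, 3),
    Story("Build feature pipeline", "Data Preparation", 2, 8),
]

sprints = plan_sprints(backlog, capacity=10)
for number, sprint in enumerate(sprints, start=1):
    titles = ", ".join(story.title for story in sprint)
    print(f"Sprint {number}: {titles}")
```

Sorting on priority first and phase second keeps the business in control of what ships next, while the phase order acts only as a tie-breaker so that, for example, data understanding work is scheduled ahead of data preparation at the same priority.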
This allows you to adhere to a recognised data science process while using an Agile framework to keep everyone on track with regular, successful deliveries, remaining accountable to the business and flexible enough to embrace change throughout.