So what does this actually mean in the real world? In simple terms, it's about bringing the core values of Lean, or more precisely Agile, principles and methodology into the world of Analytics and Data Science.
Traditionally, the data science community has taken an autonomous approach to data science projects. This stems from the academic tradition of researching and developing new algorithms and models that learn from historic data in order to predict future behaviour. That is all well and good in a slow-moving environment, where slow, reactive behaviour is the accepted norm when it comes to making major changes to how a business operates.
In the new world of real-time data, with mass processing capability in the cloud, vast storage and huge computational capacity, the world demands a faster-moving approach to data science.
In the mid-1990s the software development community faced a similar dilemma in its approach to software delivery: it needed to deliver priority-driven solutions in a more efficient and timely manner. The community looked to the Lean techniques that had been introduced with great success in the automotive manufacturing industry, and from this several initiatives were born that used Lean principles and were primarily branded under the term Agile.
At the core of this was the Agile Manifesto for Software Development, published in 2001, which sets out four key values:
Individuals and Interactions over processes and tools
Working Software over comprehensive documentation
Customer Collaboration over contract negotiation
Responding to Change over following a plan
In each pair the second concern is still important, but the first is considered more critical to success. In more detail:
Individuals and interactions - Self-organisation and motivation are important, as are interactions like co-location and pair programming.
Working software - Working software is more useful and welcome than just presenting documents to clients in meetings.
Customer collaboration - Requirements cannot be fully collected at the beginning of the software development cycle, therefore continuous customer or stakeholder involvement is very important.
Responding to change - Agile software development methods are focused on quick responses to change and continuous development.
Building on the Agile Manifesto, the community agreed on twelve key principles for delivering successful projects:
1. Customer satisfaction by early and continuous delivery of valuable software
2. Welcome changing requirements, even in late development
3. Working software is delivered frequently (weeks rather than months)
4. Close, daily cooperation between business people and developers
5. Projects are built around motivated individuals, who should be trusted
6. Face-to-face conversation is the best form of communication (co-location)
7. Working software is the primary measure of progress
8. Sustainable development, able to maintain a constant pace
9. Continuous attention to technical excellence and good design
10. Simplicity—the art of maximizing the amount of work not done—is essential
11. Best architectures, requirements, and designs emerge from self-organizing teams
12. Regularly, the team reflects on how to become more effective, and adjusts accordingly
These principles form the core of all successful Agile projects. Such projects are no longer exclusive to the software development community: Lean and Agile techniques have become widespread in many different sectors as a successful agent for change.
In the Data Science community, it is recognised that all successful data science projects share a number of key elements and that very few data science deliverables remain static. The ability to analyse vast quantities of data in real time and react accordingly is now seen as necessary for business survival in the online and IoT world in which we now live and do business. A faster-moving, iterative and automated process is therefore required to speed up delivery. This in turn supports the constant need to train and improve complex data models to meet the fast-moving requirements of the business community.
By combining the principles and techniques of Lean Agile Delivery with the recognised stages of a good Data Science project it’s possible to deliver true Lean Analytics in Data Science.