As Elastacloud diversifies into its own product set, I've had the chance to evaluate some of the problems with organisational adoption of data science. The last 18 months have been particularly instructive for me in this respect: we've had some big customers that just can't achieve anything with data science and can't see how to get any measurable value from it, so they simply continue to invest in nothing, or not a lot.
Without sounding too nihilistic, this is fairly endemic across every industry. We know which companies are successful at this and which don't know how to achieve anything. Companies like Amazon and Expedia live on data science and understand the continual improvement cycle that modelling depends on. This is just like software: most good software developers are never pleased with their code and keep refactoring it. The value of software is measurable and it's easy to build a cost-benefit analysis for it, but it's much harder to quantify the value of data science.
The first issue is that many companies do data science wrong. They cannot organise themselves around practical goals. We come across all sorts at Elastacloud, and some of the key indicators of a lack of progress can be surfaced with the following questions:
Do I have different processes for getting data science models into production than for standard software development?
Do I know what my data science teams are working on at any one time?
Is data science centralised in my business or are my data scientists in specific teams?
Do my data science teams write production code for large frameworks like Spark, Storm and Hadoop without the rigour of standard software development (TDD, BDD, mocks, unit tests, etc.)?
Do I have standard software tools which allow data scientists to collaborate on projects?
Do I understand the implications of using "Agile Data Science"?
Do I understand the drivers of most data scientists?
Can I appreciate that some generalised AI models from the likes of Microsoft may give much better results than my data scientists? Do I care about the economics associated with this?
Do I have ways to measure the impact of the data science changes in production with either cost reductions or unbiased revenue increases?
Do I understand enough about the skill sets in data science to match the best person to each category of workload?
If the answer to any of these is no, or not clear-cut, then you should probably speak with us or another company that has experience building successful data projects in-house or for customers. One key way to measure success is to build your own data product teams around many internal projects, have an external consultancy take ownership of one project team, and compare the progress of the two after 6 months. This has been our ongoing litmus test.
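To make the question about measuring impact in production a little more concrete, here is a minimal sketch of one way it could be done, assuming you can hold back a control group that still gets the old model while a treatment group gets the new one. The function name `uplift_with_interval`, the bootstrap approach and the sample figures are all illustrative assumptions on my part, not a description of any particular customer's setup.

```python
import random
import statistics


def uplift_with_interval(control, treatment, n_resamples=2000, alpha=0.05, seed=0):
    """Estimate mean uplift (treatment minus control) with a bootstrap interval.

    control / treatment: per-customer values (e.g. revenue) for the group on
    the old model and the group on the new model. Hypothetical helper.
    """
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    observed = statistics.fmean(treatment) - statistics.fmean(control)

    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement and recompute the difference.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treatment, k=len(treatment))
        diffs.append(statistics.fmean(t) - statistics.fmean(c))

    diffs.sort()
    # Percentile interval: cut off alpha/2 of the resamples at each end.
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return observed, lower, upper


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    old_model = [10, 12, 11, 9, 10, 11, 12, 10]
    new_model = [20, 22, 21, 19, 20, 21, 22, 20]
    uplift, lower, upper = uplift_with_interval(old_model, new_model)
    print(f"uplift={uplift:.2f}, interval=({lower:.2f}, {upper:.2f})")
```

If the interval sits comfortably above zero, you have at least some evidence that the change made a difference. Splitting customers at random between the two groups is what keeps the comparison unbiased, and that is the part most organisations skip.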