The odds appear to be stacked against data scientists: according to multiple reports, up to 85% of their projects never make it to production. Thomas Wood, a data science consultant writing for Fast Data Science, explains why only a minority of these meant-to-be solutions are ever completed, and why an even smaller number generate any value.
What is the problem exactly?
Wood explains that a common lament from his data scientist colleagues is that they’ve made plenty of great, accurate models but nobody ever used them, or that they couldn’t get business executives excited about them. Stakeholders counter that, whilst the models were good and the data scientists’ qualifications impressive, the models didn’t answer the fundamental questions in the brief.
What’s behind project breakdown?
On the business side, the potential reasons are numerous:
- Those championing data science in the boardroom have often been met with sluggishness or resistance from other executives in implementing the changes recommended by data scientists
- The problems are greater still if the data science champion moves to another department or company
- Data science projects are deprioritised in favour of other, more immediate, business operations. Without stakeholders communicating and engaging regularly, it can be hard to maintain momentum in bringing about change
- Likewise, as most of these projects are long-term, executives may lose patience, or the organisation may change direction
- Project sign-off may not be a democratic decision, meaning that the project could be benched because a single senior stakeholder, who cannot be convinced, simply says no
On the data science side:
- Did the data scientist communicate their findings effectively in a non-technical manner?
- Did the data scientist pursue the correct metric?
- Did the data scientist spend excessive time experimenting with models?
What can be done to forge successful collaboration?
For the best chance of avoiding project breakdown, the data science project should be structured into a series of stages in which communication between the analytics team and the business is maximised.
- Business question: The project should first and foremost be based upon this question, rather than on the technologies that can be built. The initial hypothesis should be formulated and refined by the stakeholders and scientists together at the very beginning.
- Data collection: To accept or reject this initial hypothesis, the scientist need only collect the relevant data, so that findings can be presented as quickly as possible.
- After the initial findings are presented, the stakeholders should be heavily involved in establishing, based on the data insights, what they want to achieve. At this stage, it is imperative that both parties understand where the ROI will be achieved if the project proceeds.
- Investigation can now be undertaken by the data scientist. Wood recommends that scientists meet once a week with the main stakeholder(s) and have slightly less frequent catch-ups with the larger group of executives that are involved. Choosing transparent AI solutions, rather than complex ones, can help the data scientist maintain engagement with stakeholders. And everyone involved must keep in mind the core focus of the project: is it heading towards the ROI for the company?
- Finally, the data scientist should present their insights, as well as their recommendations for the organisation, to all the stakeholders involved. Oversharing is advised here: think of sharing videos, presentations, white papers, code, work notebooks and data with executives, for a comprehensive handover to the commissioning company.
If the five points above are adhered to, Wood is confident that the value will be clear to senior stakeholders. As always, communication is key, both during the project and on an ongoing basis between the data science team and the business, to ensure value is delivered.
Source article: Why do data science projects fail?