
Lessons from building an end-to-end ML platform

Caleb Kaiser, writing for Towards Data Science, shares his experience of trying – and failing – to build an end-to-end framework, in a cautionary tale for those similarly seeking to improve the machine learning user experience. Whilst the innate complexity and fluidity of the ML ecosystem threw up some really tricky hurdles, the failure ultimately led the group to launch their own ML model serving venture.

Kaiser explains how he and his team wrote abstractions for different stages of the ML pipeline, used Kaggle datasets for testing, and eventually open-sourced the software repository. Sadly, while it proved popular on GitHub, no one actually used it.

The appeal of end-to-end frameworks is having a coherent and connected pipeline, with a single interface, to avoid the frustrations and glitches of jumping between different environments. The team were able to develop a tool that did just that. Its failing, however, lay in its applicability to the real world: it simply did not work with most other stacks.

The take-home problem with end-to-end ML frameworks: “the production machine learning ecosystem is too young for an end-to-end framework to be both opinionated and right”.

Kaiser uses some examples to illustrate how rapidly the field of production ML is changing, and it is this constant flux that meant their project to build an end-to-end framework was doomed from the outset. It would have been near impossible to build a tool that supported the supposedly “correct” stack and “correct” feature for every application.

Model serving infrastructure

After coming to this realisation (which admittedly came later than they would have liked), they decided to focus on building model serving infrastructure, a part of the stack that at least offered greater consistency and stability.

The team found that the “model-as-a-microservice” pattern was a niche yet to be exploited, and the opportunity led them to launch Cortex Labs, an early-stage start-up that supports data scientists in deploying ML models.

Data scientists and developers can use Cortex's open-source tools on any stack as follows:

  1. Deploy your model to AWS
  2. Monitor your API
  3. Make a prediction
  4. Make another prediction
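
To make the “model-as-a-microservice” idea behind these steps more concrete, here is a minimal sketch in Python using only the standard library. It is not Cortex's API, and it runs locally rather than on AWS: the `predict` function, the `/predict` path, the `features` field and the helper names are all hypothetical stand-ins chosen for illustration. The point is only the shape of the pattern: a model wrapped behind an HTTP endpoint that any stack can call.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)


class PredictHandler(BaseHTTPRequestHandler):
    """Exposes the model over HTTP: POST a JSON body, get a JSON prediction back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence the default per-request logging to keep the example quiet.
        pass


def serve(port=0):
    """Start the service on a background thread; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_prediction(url, features):
    """Client side: send features to the service and return its prediction."""
    data = json.dumps({"features": features}).encode()
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["prediction"]


if __name__ == "__main__":
    server = serve()
    url = f"http://127.0.0.1:{server.server_port}/predict"
    print(request_prediction(url, [1.0, 2.0, 3.0]))  # make a prediction
    print(request_prediction(url, [10.0, 20.0]))     # make another prediction
    server.shutdown()
```

Because the caller only ever sees an HTTP endpoint, the model's framework and the infrastructure behind it stay hidden from the rest of the stack, which is the separation of concerns the article argues for.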

Kaiser summarises the philosophical lessons that they’ve learnt from their backgrounds in web development and how these principles should eventually extend to the entire ML ecosystem. In a nutshell, he explains, “Data scientists shouldn’t have to learn Kubernetes, they should get to focus on data science. Engineers shouldn’t have to spend days figuring out how to keep a 5 GB model from blowing up their AWS bill, they should be free to build software.”

You can read the full article here
