In my previous post, I presented the way API products can become data products when they are combined with a data processing strategy. Data-enhanced API products merge the insights obtained from the data being processed with APIs' ubiquity and accessibility. I also presented the process an API product has to go through to become a useful and profitable data product. This process comprises identifying its goal, value, implementation, and evolution. The first two steps of this process were already discussed. This post presents the remaining two.
Implementing your data product
Once the goal and value of an API have been identified, the next logical step is to actually implement a software solution to provide this value. While it is tempting to start building a solution to support all the possible scenarios identified during the first two steps, a better approach is to build a minimal product so that we can get actual feedback from users as soon as possible. The main goal in this stage is to kick-start the data product implementation in order to understand what the users need and, as a consequence, improve and refine the value provided by the solution. Without this initial feedback, it is impossible to know which features are actually relevant and useful. We could easily fall into the trap of building an over-engineered solution that consumes a huge amount of resources at the outset and provides little to no value when finished.
A good way to prevent this problem is to implement a Minimum Viable Product, or MVP. The way an MVP fits in the API delivery process was presented in a previous post. An MVP creates a feedback loop by implementing a data product with a minimum set of features so it can be shipped to the final users in the shortest time possible. In other words, instead of adding one feature at a time over a long period, an MVP allows the team to create a fully usable, although limited, product in a relatively short time. Dat Tran wrote a great article on how an MVP can be applied to data products enhanced by Machine Learning. The key takeaway here is to start with the simplest solution that generates results as quickly as possible. With this baseline properly established, it is a lot easier to implement more complex and complete solutions later on.
This same reasoning should be applied to infrastructure. In other words, instead of investing tons of resources in data centers, expensive on-premises servers, GPUs, and clunky frameworks, all initial efforts should be directed toward the use of the cloud. By using cloud services from any provider, you can get access to any of these resources for a fraction of the up-front cost. Even better, you can skip all of this, forget about infrastructure altogether, and use managed cloud services to streamline the entire implementation process. Examples of managed cloud services include AWS Lambda, Google Cloud Functions, Amazon SageMaker, Google Datalab, AWS Glue, and AWS Batch. Future posts will cover these and many other tools.
Finally, the implementation of your data product also needs to include an effective way to deploy the finished solution to production. Once again, managed cloud services can be used to expedite this process as much as possible and to allow for an efficient feedback loop that provides valuable insights into the deployment process. In this context, it is necessary to make the deployment process automated, repeatable, and predictable. Specifically, deploying your product should not involve any manual configuration or human intervention, and it should be possible to run the deployment at any time and as many times as necessary, always with the same expected result.
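To make the idea of a repeatable, predictable deployment more concrete, here is a minimal sketch in Python. It assumes the deployment is described declaratively as a dictionary of resources; the names `plan` and `apply`, and the resource names used in the example, are hypothetical and not tied to any cloud provider's tooling. The point is only that running the same deployment twice should leave nothing left to change.

```python
def plan(desired: dict, current: dict) -> list:
    """Compute the actions needed to move `current` to `desired`.

    Both arguments map a (hypothetical) resource name to its configuration.
    """
    actions = []
    for name in sorted(desired):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != desired[name]:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(desired: dict, current: dict) -> dict:
    """Return the state that results from executing the plan, without mutating `current`."""
    state = dict(current)
    for action, name in plan(desired, current):
        if action == "delete":
            del state[name]
        else:  # "create" and "update" both set the desired configuration
            state[name] = desired[name]
    return state


if __name__ == "__main__":
    desired = {"api": {"memory_mb": 256}, "etl": {"schedule": "daily"}}
    current = {"api": {"memory_mb": 128}, "old-job": {}}

    print(plan(desired, current))
    new_state = apply(desired, current)
    # A second run finds nothing to do: the deployment is repeatable.
    print(plan(desired, new_state))
```

Because the plan is derived only from the desired and current states, no manual step can change the outcome, and re-running the deployment against an up-to-date environment produces an empty plan.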
Improving your data product
A data product is never done. The more your users use it, the more you get to know the problem and its potential solutions. Therefore, it is important for a data product to learn and improve from its own interaction with its users. The information generated from this interaction is unique and valuable. The data product should use it to enhance the value it provides, to adapt itself to the users' needs, and to learn from them. Data product self-improvement is the first direct consequence of the implementation of an MVP, as described previously. This makes user feedback a requirement for the product’s success.
Incorporating an appropriate feedback system in the final product is of paramount importance. This feedback system should provide a way to acquire information from the users in a seamless, subtle way, and it should encourage and reward users who are willing to provide additional feedback such as suggestions, reviews, upvotes, and tags. Once the feedback from the users is acquired, the feedback system should be able to validate, transform, and refine this information so that it can be used by the data product as input to improve its value and usage. The insights obtained from user feedback should be directly related to the product features. This helps in identifying the features that provide the highest value and the ones that should potentially be discarded or redesigned.
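The validate, transform, and refine steps described above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the event fields (`user_id`, `feature`, `kind`), the feedback kinds, and the weights in `KIND_WEIGHTS` are all hypothetical, and a real product would choose its own.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical feedback kinds and weights, chosen only for illustration.
KIND_WEIGHTS = {"upvote": 1.0, "review": 2.0, "suggestion": 0.5, "downvote": -1.0}


@dataclass(frozen=True)
class Feedback:
    user_id: str
    feature: str
    kind: str


def validate(raw_events: list) -> list:
    """Validate and normalize raw events; drop anything malformed or unknown."""
    valid = []
    for item in raw_events:
        user_id = str(item.get("user_id") or "").strip()
        feature = str(item.get("feature") or "").strip().lower()
        kind = str(item.get("kind") or "").strip().lower()
        if user_id and feature and kind in KIND_WEIGHTS:
            valid.append(Feedback(user_id, feature, kind))
    return valid


def score_features(events: list) -> dict:
    """Aggregate validated feedback into one score per product feature."""
    scores = defaultdict(float)
    for event in events:
        scores[event.feature] += KIND_WEIGHTS[event.kind]
    return dict(scores)


def rank_features(scores: dict) -> list:
    """Order features from most to least valued."""
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    raw = [
        {"user_id": "u1", "feature": "Search", "kind": "upvote"},
        {"user_id": "u2", "feature": "search", "kind": "review"},
        {"user_id": "u3", "feature": "export", "kind": "downvote"},
        {"user_id": "", "feature": "export", "kind": "upvote"},    # no user: dropped
        {"user_id": "u4", "feature": "export", "kind": "unknown"}, # unknown kind: dropped
    ]
    print(rank_features(score_features(validate(raw))))
```

Tying every feedback event to a feature name is what lets the ranking point at candidates to invest in and candidates to redesign or discard.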
Finally, product improvement is based not only on user feedback but also on analyzing the product's own performance over time. To achieve this, it is necessary to define a set of concrete and measurable metrics that will be observed and scrutinized after the product reaches production. These metrics should cover aspects like performance, accuracy, availability, and usability.
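As a rough illustration of what concrete and measurable can mean, the sketch below computes three of these metrics from request logs. The `RequestLog` fields are assumptions made for this example, the 95th-percentile latency uses the simple nearest-rank method, and usability is left out because it usually needs user studies or surveys rather than logs.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RequestLog:
    latency_ms: float
    succeeded: bool                     # the request was served without an error
    prediction_correct: Optional[bool]  # None when the true answer is not known


def summarize(logs: list) -> dict:
    """Compute availability, accuracy, and p95 latency for a batch of requests."""
    if not logs:
        raise ValueError("at least one request log is required")

    availability = sum(1 for log in logs if log.succeeded) / len(logs)

    # Accuracy is only defined over requests whose true answer is known.
    labelled = [log for log in logs if log.prediction_correct is not None]
    accuracy = (
        sum(1 for log in labelled if log.prediction_correct) / len(labelled)
        if labelled
        else None
    )

    # Nearest-rank percentile: the smallest value with at least 95% of samples at or below it.
    latencies = sorted(log.latency_ms for log in logs)
    rank = math.ceil(0.95 * len(latencies))
    p95_latency_ms = latencies[rank - 1]

    return {
        "availability": availability,
        "accuracy": accuracy,
        "p95_latency_ms": p95_latency_ms,
    }


if __name__ == "__main__":
    logs = [
        RequestLog(100.0, True, True),
        RequestLog(200.0, True, False),
        RequestLog(300.0, True, None),
        RequestLog(400.0, False, None),
    ]
    print(summarize(logs))
```

Tracking the same summary release after release gives a baseline against which each improvement to the product can be judged.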