Friday, May 13, 2022

Preliminary study at the product design stage

After this preliminary study, we approach the product design stage with a clear understanding of the integration scenarios. That lets us create an architecture tailored to the actual task rather than to a spherical horse in a vacuum. The savings in time and money, and the lower cost of development and subsequent ownership, are obvious.


Carrying out all these tasks with data science tools turns out to be much more compact and efficient than with classical programming languages and frameworks. And since the right data science tools support a full-fledged DevOps methodology, in most cases the results can be transferred "as is" to the production architecture, with only some automated tests and logging added.


For those who doubt this, let's take a classic problem of this kind: building a platform for selling tickets to events. The example is topical, since many large companies are now building ecosystems that combine their core business with services from partners.


Tickets are sold from your site. The response time for the user, whether on a website or in an application, should be tens to hundreds of milliseconds. All up-to-date ticket information lives on the partners' sites (and there may be several partners). The data models of these partners may not match each other at all, either technically or semantically. A simple example: one partner is responsible for cinemas, the second for theaters, the third for outdoor concert venues.

From this, and taking into account how often the data typically changes, roughly the following architecture emerges: an internal, universal, stable data model for all events (the reference book); a daily offline update of the reference book; and online interaction with the partner, based on "live" information, at the ticket purchase stage (API or widget). Do not forget that partners are constantly evolving: a new partner appears, or a current one goes through a global refactoring. Everything is very dynamic.
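The reference-book part of this architecture can be sketched roughly as follows. This is a minimal illustration, not an actual partner integration: the `Event` fields, the two partner payload shapes, and the function names are all assumptions made up for the example. The point is only that each partner gets a small adapter into one stable internal model, so a new or refactored partner touches one adapter and nothing else.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Event:
    """Internal, partner-independent record of the reference book (hypothetical fields)."""
    partner: str
    partner_event_id: str
    title: str
    venue: str
    starts_at: datetime


# The payload shapes below are invented for illustration only.
def from_cinema(raw: Dict[str, Any]) -> Event:
    return Event(
        partner="cinema",
        partner_event_id=str(raw["session_id"]),
        title=raw["movie"]["name"],
        venue=raw["hall"],
        starts_at=datetime.fromisoformat(raw["start"]),
    )


def from_theater(raw: Dict[str, Any]) -> Event:
    return Event(
        partner="theater",
        partner_event_id=str(raw["id"]),
        title=raw["play_title"],
        venue=raw["stage"]["title"],
        starts_at=datetime.strptime(f'{raw["date"]} {raw["time"]}', "%d.%m.%Y %H:%M"),
    )


# One adapter per partner; adding a partner means adding one entry here.
ADAPTERS: Dict[str, Callable[[Dict[str, Any]], Event]] = {
    "cinema": from_cinema,
    "theater": from_theater,
}


def rebuild_reference_book(dumps: Dict[str, List[Dict[str, Any]]]) -> List[Event]:
    """Daily offline job: normalize every partner dump into the internal model."""
    events: List[Event] = []
    for partner, records in dumps.items():
        adapter = ADAPTERS[partner]
        events.extend(adapter(record) for record in records)
    return sorted(events, key=lambda event: event.starts_at)
```

The live check at purchase time would then go to the partner's API or widget using `partner` and `partner_event_id`, while user-facing search reads only the local reference book.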


Such an integration scheme can be set up and supported in production many times faster and more easily with data science tools than with Java/C++/SQL/PHP.

Formulating tasks at the initial stage of creating a new enterprise product

Consider these tasks through the lens of an external API (REST API):

  • characteristic structure and content profile/statistics;
  • an optimal pipeline for transforming incoming objects and non-rectangular hierarchical representations;
  • assessment of data completeness and correctness, identification of elements whose content is potentially problematic;
  • development of an optimal internal scheme for storing the received data, with an emphasis on the algorithms that will later be applied to this data within the product;
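As a rough illustration of the first items, here is a minimal sketch that flattens hierarchical (non-rectangular) API responses into flat rows and then computes a simple completeness profile per field. The function names and the dotted-key convention are assumptions for this example, not something prescribed above; in practice a data frame library would usually do this work.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def flatten(obj: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts and lists into one level with dotted keys."""
    flat: Dict[str, Any] = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        # Leaf value: drop the trailing dot that was added by the parent level.
        flat[prefix[:-1]] = obj
    return flat


def completeness(records: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Share of records in which each flattened field is present and non-empty."""
    rows = [flatten(record) for record in records]
    filled: Counter = Counter()
    for row in rows:
        for key, value in row.items():
            # Count None and "" as missing, but still register the field.
            filled[key] += 0 if value in (None, "") else 1
    return {key: filled[key] / len(rows) for key in sorted(filled)}
```

Fields with a low share are the first candidates for the "potentially problematic" list, and the set of flattened keys gives a first draft of the internal storage scheme.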

The same applies to assessing the speed and stability of external APIs and working out optimal parameters for the interaction procedures, for example:


  • functions and scope of requests;
  • failure handling;
  • multithreaded integration;
  • characteristic behavior of external sources;
  • detection of errors in the API documentation or the API itself;
  • design and construction of integration pipelines for cases where the required data set can be obtained only by calling several data-dependent functions in sequence;
  • formulating hypotheses and developing algorithms for matching (exact or fuzzy) data obtained from different sources;
  • and much more.
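Two of the items above, failure handling and fuzzy matching, can be sketched in a few lines. This is only an illustration under stated assumptions: the retry policy (exponential backoff, three attempts), the 0.8 similarity threshold, and all function names are chosen for the example and would be tuned against the measured behavior of the real APIs. The similarity measure is `difflib.SequenceMatcher` from the standard library, used here as a stand-in for whatever matching algorithm the study ends up selecting.

```python
import time
from difflib import SequenceMatcher
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace before comparing titles."""
    return " ".join(title.lower().split())


def best_match(
    title: str, candidates: Iterable[str], threshold: float = 0.8
) -> Optional[Tuple[str, float]]:
    """Return the most similar candidate and its score, or None below the threshold."""
    best: Optional[Tuple[str, float]] = None
    for candidate in candidates:
        score = SequenceMatcher(None, normalize(title), normalize(candidate)).ratio()
        if score >= threshold and (best is None or score > best[1]):
            best = (candidate, score)
    return best
```

Passing `sleep` as a parameter keeps the retry logic testable without real delays; in production the default `time.sleep` applies.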
