
The Challenge of Delivering Events in Business Decision-Making

By Oleg Yermilov

In this blog, we will examine the common challenges of event design, ingestion, and processing when it comes to delivering business value. The article is based on Playtika’s experiences and internal observations but addresses common issues faced by the industry today. The next article in the series will illustrate Playtika’s solution to these recurring problems.

Introduction

The gaming industry, especially mobile gaming, is an extremely competitive and dynamic market. New titles, trends, and game mechanics appear on a daily or weekly basis. This vibrant and competitive environment challenges companies to find new ways to develop and operate their products successfully.

It is no secret that AI and analytics play crucial roles in the modern gaming industry. Day-to-day operations, new player acquisition, and most importantly, player retention have become increasingly dependent on these two pillars of decision-making in data-driven companies like Playtika. The demand is constantly growing for rapid insights, in-depth analysis, and immediate actions that require a streamlined integration of AI and analytical solutions with operational processes. At the end of the day, all of these capabilities boil down to high-quality data. We can only extract true value from AI and analytics if the data is accurate, reliable, delivered with low latency, and conveys its business meaning through proper modeling.

Value of Events

The modern gaming industry faces numerous data processing challenges. Playtika’s approach to overcoming these hurdles focuses on the following key aspects:

  1. Gaming is a creative industry where freedom of expression is essential. Each member of the studio team, be it a designer, a developer, or an analyst, must have unlimited access to all necessary tools so that they may implement and verify whatever unique ideas they may have. These ideas often result in new product features which necessitate usage analysis as well as constant feedback.
  2. Rapid feature development is at the core of the mobile gaming industry. Agile methodologies, the advancement of CI/CD, and A/B testing are just some of the engineering best practices that help studios quickly and regularly adapt to ever-changing demands and deliver new, proven features to production.
  3. The players’ feedback loop is crucial for constant improvement and obtaining the right balance in the game. Studios cannot solely rely on customer surveys to understand whether a specific feature is receiving positive or negative feedback from the players. To gain meaningful insights, they must be equipped with an analytical toolset that leverages granular metrics and key performance indicators.
  4. Gaming is one of the most interactive domains where event processing plays a crucial role. The ability to react quickly to game events is what allows Playtika to deliver unique experiences to the players. The advent of AI brought content personalization to an unprecedented level, as player experiences are created in real time based on the chain of in-game events.

These aspects highlight the vibrant creative environment in which Playtika operates and emphasize the importance of the specific type of data – event data. If we were to liken analytics and AI to the eyes and the brain of the business, event processing would act as its nervous system, and events would be the neurons that circulate and bring important information about the environment back to the decision-making processes (both human- and AI-based).

At Playtika, we have thousands of event types that need to be engineered and designed in a certain way so that they can serve several purposes (a simplified sketch of such an event follows the list below):

  • Be a means for cross-service messaging
  • Be at the core of various types of analytics (e.g. real-time, batch, ad-hoc)
  • Serve as a basis for AI model training and inference
  • Unite with other events to serve operational processes like player segmentation
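
To make this more concrete, here is a minimal Python sketch of what such a multi-purpose event could look like. The event name, envelope fields, and serialization are illustrative assumptions rather than Playtika’s actual schema; the point is that one well-defined record can be published to a message bus, landed in a warehouse, and fed to AI pipelines without being reshaped.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

# Hypothetical event envelope; field names are illustrative, not Playtika's schema.
@dataclass
class GameEvent:
    event_type: str   # e.g. "level_completed"
    player_id: str    # stable player identifier for joins and segmentation
    event_time: str   # ISO-8601 timestamp, the analytical "when"
    payload: dict     # feature-specific attributes (level, score, duration, ...)
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))  # supports deduplication
    schema_version: int = 1  # supports controlled schema evolution

    def to_message(self) -> bytes:
        """Serialize for cross-service messaging (e.g. a message bus topic)."""
        return json.dumps(asdict(self)).encode("utf-8")

# The same record can be flattened into a warehouse row or an AI feature vector.
event = GameEvent(
    event_type="level_completed",
    player_id="player-42",
    event_time=datetime.now(timezone.utc).isoformat(),
    payload={"level": 7, "score": 1830, "duration_sec": 95},
)
print(event.to_message())
```

Envelope fields such as event_id and schema_version are what later make deduplication and controlled schema evolution tractable downstream.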

Event data is unique in that it is simultaneously transactional and analytical. No other type of data combines such velocity and rate of change with so much analytical value.

Even without complex downstream processing, event data can drive business insights and answer important questions about every feature of your game. If event data is modeled well from the very beginning, it can serve as a basis for further data enrichment, aggregations, and more complex analysis.
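
As a toy illustration of this point, the snippet below answers a simple business question, the completion rate per level, straight from raw events shaped like the hypothetical envelope above; the sample events and field names are made up for the example.

```python
from collections import defaultdict

# Toy sample of raw events; in practice these would be read from a topic or a table.
events = [
    {"event_type": "level_started",   "player_id": "p1", "payload": {"level": 7}},
    {"event_type": "level_completed", "player_id": "p1", "payload": {"level": 7}},
    {"event_type": "level_started",   "player_id": "p2", "payload": {"level": 7}},
    {"event_type": "level_started",   "player_id": "p3", "payload": {"level": 8}},
    {"event_type": "level_completed", "player_id": "p3", "payload": {"level": 8}},
]

starts, completions = defaultdict(int), defaultdict(int)
for e in events:
    level = e["payload"]["level"]
    if e["event_type"] == "level_started":
        starts[level] += 1
    elif e["event_type"] == "level_completed":
        completions[level] += 1

for level in sorted(starts):
    rate = completions[level] / starts[level]
    print(f"level {level}: completion rate {rate:.0%}")
```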

Challenges in Delivering Events to Data Platforms

Event ingestion plays an exceptional role in our data platform. Serving as a gateway to the data universe, it consumes more than 90% of all of Playtika’s data and acts as a link between operations, analytics, and AI. Event design, ingestion, and processing are “a big deal” not only because of their importance for the business but also because of their complexity due to the number of factors that have to be taken into account.

That is why at Playtika, like in many other companies, we have separate product (Client, Backend) and data (BI, MLE, Big Data Engineering) teams for designing, ingesting, processing, and analyzing events. The operational and data worlds differ in their requirements, technologies, and infrastructures, and consequently require different engineering skills and SDLCs. But sometimes this separation can cause organizational, technological, and quality issues and lead to frustration and misunderstandings within and between the teams.

We carefully examined the challenges that our studios face when transforming their event data into actionable results. Based on observations gathered over the past few years, we identified the following recurring pain points in the process of bringing event data to data platforms and ultimately to actionable insights:

  1. Data teams do not have full domain knowledge of the operational data. Conversely, product teams are not familiar with the analytical and data integration requirements that may affect how events should look. This can lead to data misinterpretation, as well as integration and quality issues, which delay time-to-market and fall short of delivering the expected results.
  2. It is hard to develop event ingestion pipelines, since development and modeling may require specific programming language knowledge or data modeling skills. Delegating ingestion pipeline development exclusively to data engineering teams ultimately leads to a lot of back-and-forth between the R&D and data teams to clarify the requirements.
  3. Ingesting raw events without validating that they satisfy the quality requirements postpones the resolution of data quality issues to later stages of event processing and analytics (a minimal validation sketch follows this list). This delay in identifying and quickly rectifying the issues hurts the business by missing SLAs or producing misleading results. Raw, unfiltered, unconsolidated, and non-standardized data also requires significant preparation work for warehousing and AI model training.
  4. Changes in the source event may be left unattended and can cause massive downstream pipeline breaks. While modifying an existing event, the developer might not take event data consumers – or their level of dependency on the original event data – into consideration. Identifying the downstream problems after the ingestion is counterproductive and ultimately leads to production incidents.
  5. Operating data pipelines might be too complicated for game developers, as it requires them to deal with data flows, integration, event modeling, warehousing concepts, data processing framework specifics, and sometimes complicated configuration flows.
  6. A single source of events may be multiplied via different models to serve streaming and batch use cases. This is fairly common, but without centralized governance and clear boundaries, it can lead to uncontrolled growth and ambiguity in data models. This will make it harder for analysts to connect the dots between batch and operational analytics, or for data scientists to compare warehouse models during the discovery phase and operational models for stream inference.
  7. Without centralized governance, personal data from events may proliferate across the data platform. It becomes more challenging to protect players’ privacy when their personal information exists on several different platforms at once.
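
As an illustration of pain point 3, here is a minimal sketch of validating events against a declared contract at ingestion time, using the open-source jsonschema package. The schema and events are hypothetical; the idea is that rejecting malformed events at the gateway surfaces quality issues immediately instead of during analytics.

```python
from jsonschema import Draft7Validator

# Hypothetical contract for a "level_completed" event; not Playtika's actual schema.
LEVEL_COMPLETED_SCHEMA = {
    "type": "object",
    "required": ["event_type", "player_id", "event_time", "payload"],
    "properties": {
        "event_type": {"const": "level_completed"},
        "player_id": {"type": "string"},
        "event_time": {"type": "string", "format": "date-time"},
        "payload": {
            "type": "object",
            "required": ["level", "score"],
            "properties": {
                "level": {"type": "integer", "minimum": 1},
                "score": {"type": "integer", "minimum": 0},
            },
        },
    },
}

validator = Draft7Validator(LEVEL_COMPLETED_SCHEMA)

def ingest(event: dict) -> bool:
    """Accept an event only if it satisfies the contract; otherwise report and reject."""
    errors = sorted(validator.iter_errors(event), key=lambda e: e.json_path)
    if errors:
        for err in errors:
            print(f"rejected {event.get('event_type')}: {err.message} at {err.json_path}")
        return False
    return True

# A well-formed event passes; an event with a missing score is rejected at the gateway.
ingest({"event_type": "level_completed", "player_id": "p1",
        "event_time": "2024-01-01T12:00:00Z", "payload": {"level": 7, "score": 1830}})
ingest({"event_type": "level_completed", "player_id": "p2",
        "event_time": "2024-01-01T12:01:00Z", "payload": {"level": 7}})
```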

It would appear that the root cause of these difficulties lies in the traditional separation between product and data teams. Typically, these independent teams require specialized skills, have their own release cadences, and scale at different paces. Time after time, the overloaded data teams cannot keep up with the rate of change set by the product teams, who do not necessarily keep the data teams’ requirements in mind. As an organization grows, these issues grow with it and become even more pronounced.

Another factor to consider is the infrastructure topology; data ‘lives’ in a separate environment that requires particular skills to operate. Since many different teams and products have access to and rely on the shared infrastructure, one careless misstep can have a negative impact on a global scale. It is extremely problematic when product teams cannot manage even the most trivial data pipelines that deliver event data to the platform.

How can we enable and encourage our studios’ creativity when designing events without adding extra supervision and micro-management from the data and infrastructure teams?

To answer this question, we considered the following:

  • With regard to event ingestion and processing, we identified that more than 80% of all events do not require specific data modeling knowledge. With a certain level of automation and guidance, product teams can easily transform their events into high-quality datasets. These datasets can serve a variety of analytical, AI, and operational use cases without necessitating further cleansing, transformation, or aggregation.
  • The evolution of Big Data technologies brought about new and convenient means of leveraging data without calling for specialized knowledge or exposure to their technical internals.
  • The development of the infrastructure layer introduced better stability, resource management, scalability, workload isolation, and granular security, and generally leaves more room for maneuvering without constant fear of making critical mistakes.
  • Certain aspects of data quality, such as schema evolution or deduplication, can be optimized through automation (implemented as part of the data teams’ requirements) and integration with feature development tools and the SDLC (see the sketch after this list).
  • Centralized governance allows for easier management of things like schemas and configurations. Together with data catalogs, they constitute the metadata layer of the data platform, thus providing a unified interface for all data assets in the company, including event data and ingestion/transformation pipelines.
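
As one simplified example of the automation mentioned in the fourth point, the sketch below checks a proposed event schema for backward compatibility against the currently registered one, the kind of check that could run as a CI step. The schema format and compatibility rules here are deliberately naive assumptions; a production setup would more likely rely on a schema registry’s built-in compatibility modes.

```python
# Naive backward-compatibility check between two event schemas, meant to run in CI.
# Rule of thumb used here (an assumption, not a formal spec): a change is backward
# compatible if it does not remove existing fields and does not change their types.

def compatibility_issues(old_schema: dict, new_schema: dict) -> list[str]:
    issues = []
    for field_name, old_type in old_schema.items():
        if field_name not in new_schema:
            issues.append(f"field '{field_name}' was removed")
        elif new_schema[field_name] != old_type:
            issues.append(
                f"field '{field_name}' changed type {old_type} -> {new_schema[field_name]}"
            )
    return issues

# Hypothetical registered schema vs. a proposed change (types as plain strings).
registered = {"event_type": "string", "player_id": "string", "level": "int", "score": "int"}
proposed   = {"event_type": "string", "player_id": "string", "level": "string",
              "duration_sec": "int"}  # 'score' removed, 'level' retyped, new field added

problems = compatibility_issues(registered, proposed)
if problems:
    print("Schema change rejected:")
    for p in problems:
        print(" -", p)
else:
    print("Schema change is backward compatible")
```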

In the next article, entitled Event Processing in Playtika Data Platform: Part I, we explain how we leverage these considerations to democratize data development for product teams without jeopardizing their velocity or the stability of the data platform.


