How SQLstream Can Reduce Complex Business Logic to a Single SQL Query

In the game industry, complex game logic needs to be applied to streams of events generated by gameplay.  In single player games, this logic is simply handled by applying the correct computations.  However, in an Internet based social game where millions of players interact together online, the problem takes on an entirely different dimension.  Storing the game events on disk inside of a database becomes increasingly difficult as the rate of gameplay events increases.  Logic and computation must be applied to the data and a disparate set of data must be queried to correctly update the game state.

The solution can be surprisingly simple and similar in form to storing all gameplay events in a traditional database.  SQLstream’s powerful streaming and windowed aggregation capabilities can reduce this use case and complex logic to a single query.

For example, consider a game where users create videos that are viewed and rated by other users:

  • A score for the video must be computed based upon the last 4 weeks of gameplay.
  • Let’s say a view is worth 12 points and various ratings levels are worth between 0 and 30 points.
  • The content’s score is the total points accumulated in the last week, plus 50% of the points in the week before that, plus 20% the points in the week before that and 10% of the points in the week before that.

Let’s assume we have two streams of data which contain streams of gameplay events:

--
-- A stream containing 1 row per an individual view of a video
--
CREATE STREAM "s_video_view" (
++++++++++++++++"video_id" INTEGER,+++++-- the viewed video
++++++++++++++++"user_id" INTEGER,++++++-- the id of the viewer
++++++++++++++++"performer_id" INTEGER++-- the id of the performer
++++++++++++++++);


--
-- A stream containing 1 row per an individual rating of a video
--
CREATE STREAM "s_video_rating" (
++++++++++++++++"video_id" INTEGER,+++++-- the viewed video
++++++++++++++++"performer_id" INTEGER,+-- the id of performer
++++++++++++++++"rater_id" INTEGER,+++++-- the id of the rater
++++++++++++++++"rating" INTEGER++++++++-– the rating given
++++++++++++++++);

To handle all the described business logic, we must first compute the total number of points generated from each gameplay event and add that to the stream of tuples entering the system.  The following query accomplishes this:

SELECT STREAM "performer_id", "video_id", 12 AS "points"
+++++FROM "s_video_view"
UNION ALL
SELECT STREAM "performer_id", "video_id",
++++++++++CASE "s_video_rating"."rating" WHEN 2 THEN 1
+++++++++++++++++++++++++++++++++++++++++WHEN 3 THEN 8
+++++++++++++++++++++++++++++++++++++++++WHEN 4 THEN 20
+++++++++++++++++++++++++++++++++++++++++WHEN 5 THEN 30
+++++++++++++++++++++++++++++++++++++++++ELSE 0
+++++++++++++++++++++++++++++++++++++++++END AS "points"
+++++FROM "s_video_rating"

This query will produce a stream of data containing the performer_id, video_id, and points for each scoring event in the system.

Next we must compute rolling time based score over the event stream for each unique video_id.  SQLstream provides the SQL:99, SQL:2003, and SQL:2008 standard WINDOW facility to make this easy:

WINDOW "last_7_days" AS ( PARTITION BY "video_id"
++++++++++++++++++++++++++RANGE INTERVAL '7' DAY PRECEEDING),
+++++++"last_14_days" AS ( PARTITION BY "video_id"
+++++++++++++++++++++++++++RANGE INTERVAL '14' DAY PRECEEDING),
+++++++"last_21_days" AS ( PARTITION BY "video_id"
+++++++++++++++++++++++++++RANGE INTERVAL '21' DAY PRECEEDING),
+++++++"last_28_days" AS ( PARTITION BY "video_id"
+++++++++++++++++++++++++++RANGE INTERVAL '28' DAY PRECEEDING)

These rolling windows contain all the scoring events over the last 7, 14, 21, and 28 days respectively grouped by the video_id.  This means that any aggregation function applied to that window will be applied to the stream events with the same video_id.  So, COUNT(*) OVER "last_7_days" would produce one row for each unique video_id with a scoring event in the last week.  Those rows would contain a count of the number of scoring events for each unique video_id.

By subtracting the SUM of the points in the 7 day window from the number of points in the 14 day window, we can compute the number of points in the week starting two weeks ago and ending one week ago.  This technique allows us to implement computations on rolling windows that are not bounded by the current time.

Putting the entire example together, we get the following view:

CREATE VIEW "v_video_score" AS
+++++SELECT STREAM "video_id", "performer_id",
+++++++++++++++SUM("points") OVER "last_7_days" +
+++++++++++++++++((SUM("points") OVER "last_14_days" -
+++++++++++++++++++SUM("points") OVER "last_7_days") * 0.5) +
+++++++++++++++++((SUM("points") OVER "last_21_days" -
+++++++++++++++++++SUM("points") OVER "last_14_days") * 0.2) +
+++++++++++++++++((SUM("points") OVER "last_28_days" -
+++++++++++++++++++SUM("points") OVER "last_21_days") * 0.1) AS "score"
++++++++FROM ( SELECT STREAM "performer_id", "video_id", 12 AS "points"
++++++++++++++++++++FROM "s_video_view"
+++++++++++++++UNION ALL
+++++++++++++++SELECT STREAM "performer_id", "video_id",
+++++++++++++++++++++++++++CASE "s_video_rating"."rating" WHEN 2 THEN 1
+++++++++++++++++++++++++++++++++++++++++++++++++WHEN 3 THEN 8
+++++++++++++++++++++++++++++++++++++++++++++++++WHEN 4 THEN 20
+++++++++++++++++++++++++++++++++++++++++++++++++WHEN 5 THEN 30
+++++++++++++++++++++++++++++++++++++++++++++++++ELSE 0
+++++++++++++++++++++++++++++++++++++++++++++++++END AS "points"
++++++++++++++++++++FROM "s_video_rating" )
++++++++WINDOW "last_7_days" AS ( PARTITION BY "video_id"
++++++++++++++++++++++++++++++++++RANGE INTERVAL '7' DAY PRECEEDING),
+++++++++++++++"last_14_days" AS ( PARTITION BY "video_id"
+++++++++++++++++++++++++++++++++++RANGE INTERVAL '14' DAY PRECEEDING),
+++++++++++++++"last_21_days" AS ( PARTITION BY "video_id"
+++++++++++++++++++++++++++++++++++RANGE INTERVAL '21' DAY PRECEEDING),
+++++++++++++++"last_28_days" AS ( PARTITION BY "video_id"
+++++++++++++++++++++++++++++++++++RANGE INTERVAL '28' DAY PRECEEDING);

which produces a stream of rows containing the video_id, performer_id, and a score computed over the rolling windows.  This view outputs a row every time a new event is inserted added the system.  The complex business logic has been reduced to one SQL query.

By attaching the results of this query to a foreign table, a nearly real-time cache of the videos and their current scores can be maintained in your database.  Alternatively, by using SQLstream’s JDBC driver, a distributed caching system such as memcached could be kept updated with the latest scores for each video simply by calling this query.  But those are topics for another post.

I hope you enjoyed this simplified real world example of how streaming SQL can:

  • simplify your business logic processing
  • improve your ability to deliver real-time data to your customers, clients, and colleagues.