Updated for Ver 4.002, 2021-11-20
The data one stores on the Rule Game server can be broadly divided into 3 large groups:
This document is written as a guide for the experiment manager, i.e. a researcher who designs new experiment plans and processes the data accumulated in the experiments. The first section concerns the data files you need to create to run an experiment, and the other two, the data that the server will generate for you as players play their games.
Before proceeding with this document, make sure that you are familiar with the main concepts of the Rule Game Server, such as repeat users, players, experiment plans, trial lists, series, episodes, etc.
These files, created by the experiment designer, determine how experiments are run.
The trial list files are in subdirectories of the main trial list directory, /opt/tomcat/game-data. For each static experiment plan, the subdirectory name is identical to that of the experiment plan; so, for example, the trial list files for the experiment plan default, sent by Gary in mid-August 2020, are in /opt/tomcat/game-data/default.
Trial list files should have extension .csv. For example, if your experiment plant has 3 trial lists, named list_A, list_B, and list_C, these lists shoud be described in files named list_A.csv, list_B.csv, and list_C.csv. Please make sure not to name any of your trial lists defect, because the file name defect.csv is reserved, and is used for the defect file.
When a player first enters the system, the server figures what experiment plan the player is associated with (as supplied in the URL, or, early on as encoded in the PlayerId). Among all the trial list files located in that experiment's directory, the server picks one trial list file based on the balancing algorithm, the main consideration being to assign approximately equal number of "usable" players to each trial list. Once the assignment has been completed, it is permanently recorded in the table, for use in all episodes played by this player.
Each row of the trial list file contains a parameter set. The meaning of parameters is explained briefly in the following document compiled by Paul: _0.GameParameters, with additional notes elsewhere. However, parameters n and b mentioned in that document are not actually used. See the document on Control flow for additional discussion of how some parameters are used by the server when directing a player's activities and playing games.
As outlined elsewhere, since GS 4.*, the Game Server also supports dynamic experiment plans, which are convenient for quick experimentation. A dynamic experiment plan does not need an experiment plan directory under trial-lists. Instead, it has a modifier file, somewhere under modifiers. A modifier is a CSV file whose structure is similar to that of a trial list file, but it contains fewer columns. For more details on how modifier files are used, and some examples, see Adding a plan or a rule set to the MLC or APP launch pages.
Optionally, you can create a defect file, named defect.csv in an experiment's trial list directory, to influence the balancing algorithm assigning new players to trial lists. For details on this files syntax and semantics, see The defect file in the balancing algorithm write-up.
Note that because the name defect.csv is reserved for the defect file, it cannot be used for a trial list file, you therefore must not name any of your trial lists defect.
Each line of a trial list file, i.e. each parameter set, contains parameter rule_id, which refers to a particular set of rules for playing the game. The rule set files are all in /opt/tomcat/game-data/rules; the name of each file consists of the value of the rule_id parameter and the extension .txt.
For the syntax and semantics of the rule sets, see the Rule Set Syntax guide. (It overrides the early document, _0.Semantics2020.07.10.txt, from GS 1.*).
When creating an episode, it is possible to load the initial board description from a pre-existing JSON file. Doing that, instead of generating random boards in runtime, allows one to give exactly the same experience to all players.
All JSON files defining initial boards should be placed into subdirectories of /opt/tomcast/game-data/boards. Please refer to the specifying initial boards section of the parameter set document for the details on configuring your trial list file for various modes in which predefined initial board files can be used.
Each initial board is defined in a JSON file, which may look like this:
{"id":"vb", "value":[{"color":"red","shape":"square","id":"1","x":1,"y":1}, {"color":"black","shape":"circle","id":"2","x":1,"y":2}, {"color":"yellow","shape":"star","id":"3","x":1,"y":3}, {"color":"blue","shape":"square","id":"4","x":1,"y":4}, {"color":"red","shape":"circle","id":"5","x":1,"y":5}, {"color":"black","shape":"star","id":"6","x":1,"y":6} ], "name":"A vertical bar"}In this example, the board has 6 pieces on it, all of them located in the leftmost column of the board (x=1), and occupying the entire column (y=1 thru 6). For each piece you want to place on the board you must specify:
It is also possible to pre-create inital boards by using a shell script that will first create a large number of random boards, and then delete some and only keep those that satisfy your criteria. For details, see Creating initial boards with a random board generator.
While Game Server 1.* had just 4 built-in colors, Game Server 2.* allows the experiment manager to use arbitrary user-defined colors. While the colors used in the Game Server are identified by name, they are not necessarily the same as any of the 16 colors supported in HTML 4 standard, or any of the more numerous named colors of the CSS. Instead, the colors used in the Game Server are entirely user defined, as described below.
A single file, /opt/tomcat/game-data/colors/colors.csv is used by the Game Server to define all colors used in all experiment plans. (Therefore, it is not possible for the same name -- e.g. RED to refer to different colors in different experiment plans). The file content may look as follows:
#color_name,R,G,B BLUE,30,90,210 RED,220,20,0 YELLOW,250,240,0 BLACK,0,0,0 GREEN,0,250,0 PINK,220,100,100 PURPLE,200,0,200 ... ... ...Each line consists of 4 fields. The first field is the color name, writen using upper-case English characters; optionally, digits, dashes, and underscores may also be included in a color name. The following 3 colors contain decimal numbers, in the range 0 thru 255, representing the R, G, and B components of the color representation in the standard RGB model.
The experiment manager may add invent any color he or she can think of, but s/he should make sure that each color name is unique. Every color used in your experiment (and other experiments) should be listed in the color map file; otherwise, the GUI client will have trouble displaying the game pieces on the board properly.
Note that since Game Server 2.* does not really have built-in default colors anymore, it is desireable that the color map file contains at the very least the definitions of the four "legacy" colors, i.e. BLACK, RED, YELLOW, and BLUE.
While Game Server 1.* only allowed to use the four "legacy shapes" of game pieces, in Game Server 2.* you can use arbitrary shapes. For every shape you use, you must obtain a SVG file representing it (e.g., download one from the icon collection at the Feather collection, or create one by hand), and place it into the Game Server's shape file directory, /opt/tomcat/game-data/shapes.
At present, shape names are case-insensitive, but the SVG directory and file names are expected to be in the lower case. E.g. if your trial list file, rule set file, or initial board file refers to a shape STAR or star, the corresponding file should be named star.svg. If you use shape subdirectories, it may reduce confusion that you always use the lower case both in the subdirectory names, and when referring to shapes in trial set files and rule files.
At the minimum, the shape file directory must contain at least the four files for the four "legacy shapes": circle.svg, square.svg.star.svg, triangle.svg.
Additionally, the file named blank.svg should be kept in the shape file directory; it is used for blank squares in the HTML play. For this reason, one should not use a shape named BLANK, because if you do, you won't be able to see the pieces of this shape very well on the board.
To better manage shape files, it is also possible to create subdirectories in the shape directory. For example, you have decided to create a game with the game piece shapes named AU/KANGAROO, AU/KOALA, AU/BOOMERANG, and AU/WOOMERA; the slash in the names of the shapes indicates that the files kangaroo.svg, koala.svg, boomerang.svg, and woomera.svg, should go into a new subdirectory, /opt/tomcat/game-data/shapes/au.
If you use shape subdirectories, you must surround shape names with double quotes (e.g. "au/kangaroo") when referring to them in rule set files. This requirement exists because the syntax of the rule sets is quite complex (and, potentially, it may be expanded in the future), and using double quotes ensures correct parsing.
Note that double quotes around individual shape names should not be used when defining the set of shapes in the trial list file. (There, the entire list of shapes can be double-quoted, if desired, but that's not necessary).
It is important that you don't modify the data files directly in /opt/tomcat/game-data, since if you do that, your updates will be lost later. Instead, you should modify the files in your working directory, and deploy from there. While it's possible to carry out the deployment manually, the preferred way to do it is by checking in your work to your repository in Github, which will trigger an automatic deployment. (Thanks, Kevin!) For details, see the section on GitHub inStandard Operating Procedures: Preparing and running an experiment with the Rule Game Server.
For development and testing purposes, it is possible to make Game Server work with rule files and initial board files located in your home directory, rather than in the /opt/tomcat/game-data tree.
Whenever a rule name, initial boards directory, or a boards order file mentioned in a trial list file starts with a slash ("/"), it will be interpreted not as a reference to an entity under /opt/tomcat/game-data, but as an actual path of an arbitrary file or directory on sapir (e.g. /home/kantorp/some-rule.txt, /home/kantorp/my-boards/some-board-dir, /home/kantorp/board-list.csv). In this way, once an experiment plan with such rules and board directories are created, you can just edit the rules and board files in your own directory on sapir, and whenever a new player is created within that experiment plan, it will use those current rules and boards of yours. (Of course, this activity will mean that a later analysis of the experimental data recorded for these players would make no sense, but one can simply delete those players' data from the database later).
Note that the trial list files for your experiment plan still must be located in an appropriate subdirectory under the main trial list directory.
We're using a database named game in the MySQL server on sapir.
There are two tables in this database that contain data of importance to you: PlayerInfo and Episode.
There is one PlayerInfo entry (table row) for each player.
mysql> describe PlayerInfo; +-------------------+--------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +-------------------+--------------+------+-----+---------+----------------+ | id | bigint(20) | NO | PRI | NULL | auto_increment | | currentSeriesNo | int(11) | YES | | NULL | | | date | datetime | YES | | NULL | | | inBonus | bit(1) | YES | | NULL | | | playerId | varchar(255) | YES | | NULL | | | totalRewardEarned | int(11) | YES | | NULL | | | trialListId | varchar(255) | YES | | NULL | | | experimentPlan | varchar(255) | YES | | NULL | | +-------------------+--------------+------+-----+---------+----------------+
The meaning of the columns is as follows:
The Episode table has one entry (row) for each episode played by each player.
mysql> describe Episode;
+-----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| attemptCnt | int(11) | YES | | NULL | |
| cleared | bit(1) | YES | | NULL | |
| doneMoveCnt | int(11) | YES | | NULL | |
| episodeId | varchar(255) | YES | | NULL | |
| givenUp | bit(1) | YES | | NULL | |
| nPiecesStart | int(11) | YES | | NULL | |
| stalemate | bit(1) | YES | | NULL | |
| startTime | datetime | YES | | NULL | |
| DTYPE | varchar(255) | YES | MUL | NULL | |
| bonus | bit(1) | YES | | NULL | |
| earnedBonus | bit(1) | YES | | NULL | |
| endTime | datetime | YES | | NULL | |
| finishCode | int(11) | YES | | NULL | |
| rewardBonus | int(11) | YES | | NULL | |
| rewardMain | int(11) | YES | | NULL | |
| seriesNo | int(11) | YES | | NULL | |
| PLAYER_ID | bigint(20) | YES | MUL | NULL | |
| guessSaved | bit(1) | YES | | NULL | |
| bonusSuccessful | bit(1) | YES | | NULL | |
| lost | bit(1) | YES | | NULL | |
| guess | varchar(255) | YES | | NULL | |
| guessConfidence | int(11) | YES | | NULL | |
| attemptSpent | double | YES | | NULL | |
+-----------------+--------------+------+-----+---------+----------------+
The meaning of the columns is as follows:
To view data in the mysql client, simply type
mysqlon the Linux command prompt on sapir, and then, once logged in to the MySQL server, type
use game;to start using the relevant database.
While you may use commands such as
select * from PlayerInfo; select * from Episode;to see the entire tables, various other space saving commands are available. For example, this is how you can view all episodes for a given player, in a compact format:
select e.PLAYER_ID, concat(p.playerID, " / ", cast(e.seriesNo as char)) 'P/Ser', e.episodeId, e.startTime, e.endTime, concat(cast(e.attemptCnt as char), ":", cast(e.doneMoveCnt as char), "/", cast(e.nPiecesStart as char)) 'Mv/N0', concat(cast(e.rewardMain as char), "+", cast(e.rewardBonus as char)) as '$$', concat( (CASE WHEN (e.bonus) THEN 'b' ELSE '-' END), "/", (CASE WHEN (e.bonusSuccessful) THEN 'B' ELSE '-' END), "/", (CASE WHEN (e.earnedBonus) THEN 'BB' ELSE '-' END)) as 'Bonus', concat( (CASE WHEN (e.givenUp) THEN 'G' ELSE '-' END), "/", (CASE WHEN (e.lost) THEN 'L' ELSE '-' END), "/", (CASE WHEN (e.cleared) THEN 'C' ELSE '-' END), "=", cast(e.finishCode as char)) as 'GLC=F', substr(concat(cast(e.guessSaved+0 as char), ":", cast(e.guessConfidence as char), ":", e.guess),1,10) 'guess' from PlayerInfo p, Episode e where e.PLAYER_ID=p.id and p.playerId= 'vm012' order by startTime;This produces a table like this:
+-----------+-----------+------------------------+---------------------+---------------------+-------+------+-------+---------+------------+ | PLAYER_ID | P/Ser | episodeId | startTime | endTime | Mv/N0 | $$ | Bonus | GLC=F | guess | +-----------+-----------+------------------------+---------------------+---------------------+-------+------+-------+---------+------------+ | 136 | vm012 / 0 | 20201014-015309-H534HK | 2020-10-14 06:53:09 | 2020-10-14 06:53:21 | 3:1/2 | 0+0 | -/-/- | G/-/-=3 | NULL | | 136 | vm012 / 1 | 20201014-015321-KDSUUF | 2020-10-14 06:53:21 | 2020-10-14 06:53:37 | 5:4/4 | 9+0 | -/-/- | -/-/C=1 | 1:3:testin | | 136 | vm012 / 1 | 20201014-015346-ICKEIG | 2020-10-14 06:53:47 | 2020-10-14 06:54:41 | 6:5/5 | 9+0 | -/-/- | -/-/C=1 | 1:4:testin | | 136 | vm012 / 1 | 20201014-015452-IT11CA | 2020-10-14 06:54:53 | 2020-10-14 06:55:01 | 2:1/6 | 0+0 | -/-/- | G/-/-=3 | NULL | +-----------+-----------+------------------------+---------------------+---------------------+-------+------+-------+---------+------------+The semantics of the columns is as follows:
Instead of and p.playerId='vm012' you can type any other selection clause, e.g. and p.playerId like 'Aria%' or and e.startTime>'2020-10-10'. e.startTime> '2020-10-10' order by startTime;
There are many ways to export the content of SQL tables to CSV files. As an example, I have supplied a script that resides on sapir in ~vmenkov/w2020/game/sql/export.sh
. In order to be able to use this particular script, you must have possess several qualifications:
Before using the script, switch to your mysql group:
newgrp mysqlThen run the script:
~vmenkov/w2020/game/sql/export.shThis will create 2 CSV files, PlayerInfo.csv and Export.csv, in your current directory. There are no header lines in them, but the columns are ordered the same way as described above.
Note: for its successful operation, this script relies on the fact that your MySQL server allows saving data files into the directory /var/lib/mysql-files. If the script does not worl on your server, check the server variable @@secure_file_priv (using the SQL command SELECT @@secure_file_priv;). If this variable's value on your server is null (which is often the case on MacOS installations), the script won't work, unless you figure how to change the server's configuration.
While export.sh has many restrictions on which hosts it can be used on, we also have a JDBC-based tool, scripts/export-2.sh which can be used on any host. It writes files into the current directory, so run it in the directory to which you want the outpiut files to go. The usage, for exporting locally accumulated data (database game), is as follows:
mkdir export-directory cd export-directory export-2.sh
(For brevity, we assume that you have added the directory in which the scripts are, such as ~/w2020/game/scripts, to you PATH).
If you have Pulled data from a remote server, they are in a separate database on your server. The name of the database is in the config file produced during the pull process.
mkdir pulls cd pulls pull-remote-data.sh wwwtest.rulegame export-2.sh -config w2020_game_wwwtest_rulegame_2024_01_30.conf(The file name you provide in the -config option is the name of the config file that pull-remote-data.sh has produced).
At the end of each episode, the Game server saves the essential information about the episode into CSV file. That includes the initial board description, all the attempted moves (successful and unsuccessful), and any guess submitted by the player.
The initial boards are saved in /opt/tomcat/saved/boards/ , in file names based on the PlayerID + ".board.csv". The GS 1.* and 2.* format:
more /opt/tomcat/saved/boards/qt-07.boards.csv #playerId,episodeId,y,x,shape,color qt-07,20200910-122815-2S2435,4,1,STAR,BLACK qt-07,20200910-122815-2S2435,5,4,STAR,BLACK qt-07,20200910-123116-PRDLET,3,1,TRIANGLE,BLACK qt-07,20200910-123116-PRDLET,5,1,TRIANGLE,BLACKIn the GS 3.*, one more column, objectType is added to identify image-and-property-based objects (for which the shape and color columns contains the null value); for such objects, this column contains an identifying string for the image. For the traditional shape-and-color-tuple objects, the objectType column contains a string based on the shape and color, e.g. RED_STAR (analogously to the objectType column that has existed in he detailed transcripts file since GS 1.*).
#playerId,episodeId,y,x,shape,color qt-07,20200910-122815-2S2435,4,1,null,null,vm/image_test_01/cat-01.png ....
For each piece, we have it y and x coordinates (y=row, x=column), shape and color.
The transripts are saved in /opt/tomcat/saved/transcripts/ , in file names based on the PlayerID + ".transcripts.csv".
more /opt/tomcat/saved/transcripts/qt-07.transcripts.csv #pid,episodeId,moveNo,timestamp,y,x,by,bx,code qt-07,20200910-122815-2S2435,0,20200910-122851,4,1,0,0,0 qt-07,20200910-122815-2S2435,1,20200910-122902,5,4,7,0,0 qt-07,20200910-123116-PRDLET,0,20200910-123125,3,1,7,7,0 qt-07,20200910-123116-PRDLET,1,20200910-123151,5,1,7,7,0
For each move, we have the coordinates (y=row, x=column) of the piece that the player tries to move, and of the destination bucket. (The four buckets are supposed to be located in columns 0 and 7, and rows 0 and 7).
The last column, code, contains the server response. The value of 0 is recorded when the move attempt is successful; positive values (usually 4, sometimes 2) are recorded on rejections, as per CODE.
Note: in certain circumstances the Game Server may save a partial transcript before the episode is completed, and then write out the full transcript again at the end of the episode. It is recommended that the data user remove duplicate lines from each transcript before analyzing the data.
The detailed transripts are saved in /opt/tomcat/saved/detailed-transcripts/ , in file names based on the PlayerID + ".-detailed-transcripts.csv".
The schema of these CSV files is as per Aria Duan's request (see paul01.transcripts_revised.xlsx, originally posted on on Slack ca. August 2020). All sequential indexes (series No., episode No., move No.) are 0-based.
/opt/tomcat/saved/detailed-transcripts$ more qt-02.detailed-transcripts.csv #playerId,trialListId,seriesNo,ruleId,episodeNo,episodeId,moveNo,timestamp,reactionTime,objectType,objectId,y,x,bucketId,by,bx,code,objectCnt qt-02,trial_1,1,TD-02,0,20200923-130509-VP7ECA,0,20200923-130521.677,11.829,BLUE_TRIANGLE,0,3,3,1,7,7,0,1 qt-02,trial_1,1,TD-02,0,20200923-130509-VP7ECA,1,20200923-130535.256,13.579,BLUE_TRIANGLE,1,5,6,1,7,7,0,0 qt-02,trial_1,1,TD-02,1,20200923-205001-F18F6J,0,20200923-205021.973,20.923,YELLOW_TRIANGLE,1,4,1,2,0,7,0,2 qt-02,trial_1,1,TD-02,1,20200923-205001-F18F6J,1,20200923-205036.976,15.003,YELLOW_TRIANGLE,2,4,3,2,0,7,0,1 qt-02,trial_1,1,TD-02,1,20200923-205001-F18F6J,2,20200923-205053.298,16.322,YELLOW_TRIANGLE,0,3,5,2,0,7,0,0 qt-02,trial_1,1,TD-02,2,20200923-211803-VPL2LF,0,20200923-211813.309,10.306,BLACK_CIRCLE,1,6,2,3,0,0,0,1 qt-02,trial_1,1,TD-02,2,20200923-211803-VPL2LF,1,20200923-211821.644,8.335,BLACK_CIRCLE,0,2,4,3,0,0,0,0 qt-02,trial_1,1,TD-02,3,20200923-211834-AR21RU,0,20200923-211846.699,12.453,RED_TRIANGLE,1,3,4,2,0,7,4,3 qt-02,trial_1,1,TD-02,3,20200923-211834-AR21RU,1,20200923-211858.671,11.972,RED_TRIANGLE,1,3,4,0,7,0,0,2 qt-02,trial_1,1,TD-02,3,20200923-211834-AR21RU,2,20200923-211911.614,12.943,RED_TRIANGLE,2,6,4,0,7,0,0,1 qt-02,trial_1,2,TD-03,0,20200923-211954-8M8E8V,0,20200923-212014.455,20.221,RED_SQUARE,1,4,3,3,0,0,0,1 qt-02,trial_1,2,TD-03,0,20200923-211954-8M8E8V,1,20200923-212025.978,11.523,RED_SQUARE,0,3,3,1,7,7,0,0
Note that even though the name of the experiment plan to which a player belongs is not explicitly listed as a separate column; however, it can be extracted from the player Id.
One may also "join" this information with information from the Episode table if, for example, one wants to look at the finishCode column.
The objectType column contains, for the traditional shape-and-color-tuple objects, a value such as BLACK_SQUARE. For the image-and-properties-based objects in GS 3, this column contains the same value as the image column of the initial boards CSV files, e.g. vm/image_test_01/cat-01.png.
The guesses are saved in /opt/tomcat/saved/guesses/ , in file names based on the PlayerID + ".guess.csv".
/opt/tomcat/saved/guesses$ cat qt-02.guesses.csv #playerId,trialListId,seriesNo,ruleId,episodeNo,episodeId,guess,quessConfidence qt-02,trial_1,1,TD-02,2,20200923-211803-VPL2LF,the guess will go here,3 qt-02,trial_1,1,TD-02,3,20200923-211834-AR21RU,this episode will have lost=true,3 qt-02,trial_1,2,TD-03,0,20200923-211954-8M8E8V,"here, is also a guess, for the next series maybe",4
Below are some examples of how, in general terms, the data can be processed. While the discussion of the data processing in each example is given in general terms, it is not difficult to convert it to specific SQL queries or operations with lists and hash tables in a language such as Perl or Python.
This can proceed as follows:
We have several tools that can be used to extract the data and analyze them. See the Tools Guide