Managing MLC results submissions

Updated 2022-07-13 for GS 5.006

Registering participants

At present, the participants in the MLC (Machine Learning Challenge) are registered by the project staff manually. Once a prospective participant has contacted Paul or Eric, and his applcation has been approved, the MLC manager should login on sapir and run the following command:

/home/vmenkov/w2020/game/scripts/create-mlc-user.sh nickname email password

For example:

/home/vmenkov/w2020/game/scripts/create-mlc-user.sh bob1 bob1@yahoo.com xxxbob1

Here, nickname is an identifying name for the project participant (a person, a team, or a ML algorithm). This should appear as part of the first column of the CSV files that the participant will be submitting. (In that file, the particicpant's nickname will be separated by a '+' sign from the rule set name).

You may want to enclose the password in single or double quotes if it contains some special characters that may affect command-line processing.

As a confirmation, the script will print out the list of all users in the system, one of which should be your new user:

	....
	User record: id: 11; nickname: bob1; digest: 702F09166EEFAB06F3DDA3262EF8DF75; roles: [mlc];
	....

One can save the output into an HTML file and then view it in a web browser:

/home/vmenkov/w2020/game/scripts/create-mlc-user.sh bob1 bob1@yahoo.com xxxbob1 > tmp.html

As the list may be old, and it also includes the participants for the old APP and MLC launch pages, you may want to pipe it through more, or to save it in a file.

You can now send the nickname and the password to the participant, for use in future uploads.

If a particular ML research team wants to enter several algorithms into the MLC, we will register several nicknames for them (one per algorithm), so that all data they submit can be easily identified as belonging to specific algorithms. In this case, it's OK to run the script several times, with different nicknames but the same email and password.

The same script can also be used to change the password of an already registered participant.

Uploading

MLC participants upload via the Participants' Dashboard. It shows the already uploaded files, if any, and offers to upload a new file.

Once the files have been uploaded, they go to /opt/tomcat/saved/mlc, each participant's data going to a separate directory named after their nickname. Our research team can later process those data with its tools.

The summary of each run is recorded as a row in the SQL table MlcEntryM.

Fake learning methods

Since, as of July 2022, I don't have any "real" learning data in the desired format at my disposal, and the scope of my emplotyment does not involve building actual machine learning tools, I have set up a fake machine learning script, so that I could have some test data to test the uploading process. The "fake learner" plays randomly (i.e. without any learning at all) for a certain number of episodes, and then starts cheating, by using the list of allowed buckets that the server shows to it (but which a true learner must not use). So you can make it demonstrate "full learning" after a desired number of episodes.

Sample command line:

	/home/vmenkov/w2020/game/scripts/captive-python-mlc-fake.sh 15 >& t.tmp &

The command-line argument (15, in the example above) indicates at which point (after how many episodes) in each run the player will start cheating.

The scripts completes 100 runs, with 100 episodes each, for each of the 15 sample rule sets of the MLC. This takes about 1.5 hours of clock time on sapir. The resulting file (which is automatically named Fake-15.csv by the script, based on the parameter value) contains 15*100*100=150,000 data lines (plus the header line), and is about 6.5 MB in size:

vmenkov@SAPIR:~$ wc Fake-1*.csv
  150001   150001  6380450 Fake-10.csv
  150001   150001  6484567 Fake-15.csv

The "imitation participants" Fake-10, Fake-15, Fake-20 imitate learners who have "fully learned" all rules (because the can demonstrate a streak of at least 10 error-free episodes at the end of each run), while Fake-80 imitates successful learning in only some runs (because it only rarely manage to get all of episodes No. 90 thru 99 error-free, and then the run ends).

Eventually, of course, we should purge the data for these "fake learners" from the server.

Dashboard tools

The Participants' Dashboard shows the basic information about the file(s) the participant has uploaded, as well as links to the "summary" and "comparison" (leader board) pages for each rule set the participant has competed on.