Loebner Prize for Artificial Intelligence
"The First Turing Test"
2010 Competition

Saturday, 23 October 2010 Los Angeles, California, US
California State University Los Angeles

$5,000 Prize Money

  • First Place US$3000 and a Bronze Annual Medal
  • Second Place US$1000
  • Third Place US$750
  • Fourth Place US$250
  • The US$25,000 and Silver Medal will be at risk.

Rules for Loebner Prize 2010

Individuals and teams interested in participating in The Loebner Prize are advised to join the group http://tech.groups.yahoo.com/group/Robitron/ for advice and suggestions.

UPDATE

Four Finalists Are:

  1. Richard Wallace drwallace [at] alicebot [period] org
  2. Robert Medeksza medeksza [at] zabaware [period] com
  3. Rollo Carpenter rollocarpenter [at] me [period] com
  4. Bruce Wilcox gowilcox [at] gmail [period] com

 

1. IMPORTANT DATES:

  • 3 May 2010 - Opening Date for Entries
  • 7 Jun 2010 - Closing Date for receipt of Entries
  • 5 Jul 2010 - Final Four announced - US$250 will be awarded to each of the four finalists at this time.
  • 23 Oct 2010 - Contest

The date and venue are subject to change but NO changes to the date and venue will be made after 1 May 2010.
If the date is moved, it will not be moved to a date earlier than 4 October 2010 or later than 1 November 2010.
No individual may be associated with more than one entry.

All entrants must submit their entries on CD, DVD or USB Flash media via a message service requiring a receipt signature and having a time/date stamp (e.g. Certified, Registered, FedEx, UPS, etc). Entries must be mailed on or after 3 May 2010 and on or before 7 June 2010. Date of mailing, and not date of receipt, will be used to determine priority.

Entrants have three options for submitting their programs.

  1. Entrants may choose to allow Contest Management to install and test the programs.
  2. Entrants may install their previously submitted programs at the testing site on a management supplied Windows OS computer.
  3. Entrants may bring a computer to the testing site and install and run the programs at the testing site.

Entrants choosing options 1 or 2 must submit programs that are able to run on Windows XP, Vista and/or Windows 7 OS machines.

All entrants must transmit their entry to:

Loebner Prize Contest
c/o Crown Industries, Inc.
155 North Park St.
East Orange, NJ 07017

Entrants choosing options (2) or (3) must schedule the time/date of their appearance with me PRIOR to 7 June for testing at the mailing address given. The testing date must be between 8 June and 30 June.

Final Four entrants who chose submission option (1) do NOT have to be present at the competition. Those who choose options (2) or (3) MUST be present to install and operate their entries.

No entry will be tested by contest management which requires contest management to key in path names.

No entry will be tested by contest management which requires contest management to modify system variables (although these may be modified by a supplied installer).

No entry will be tested by contest management which does not provide, on the transmittal media, all necessary programs, interpreters, etc (e.g. Perl, MySQL, etc).

Only the first 16 compliant entries will be evaluated in depth. This means that all entries will be tested in order of receipt for compliance with the rules. The 16 compliant entries having the earliest time stamps will be screened according to the criteria in point 4, below.
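The screening step described above can be sketched as follows. This is only an illustration, not contest software: the field names `stamped` and `compliant`, the function name and the sample data are all hypothetical.

```python
from datetime import datetime


def entries_to_screen(entries, limit=16):
    """Return up to `limit` compliant entries, earliest time stamp first."""
    compliant = [e for e in entries if e["compliant"]]
    return sorted(compliant, key=lambda e: e["stamped"])[:limit]


# Hypothetical data for illustration only.
sample = [
    {"name": "bot_c", "stamped": datetime(2010, 5, 20), "compliant": True},
    {"name": "bot_a", "stamped": datetime(2010, 5, 3), "compliant": False},
    {"name": "bot_b", "stamped": datetime(2010, 5, 10), "compliant": True},
]
selected = [e["name"] for e in entries_to_screen(sample)]
```

Non-compliant entries are dropped before the limit is applied, so they do not use up one of the 16 places.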

If there is no compliant Entry for the 2010 Competition, the total $5000 prize money will be added to the 2011 Competition prize, and the 2011 Competition will be held under these rules.

2: COMMUNICATIONS PROTOCOL.

The Loebner Prize Protocol (LPP) will be used in the 2010 through 2014 competitions. Each Entry Program must communicate with a "Judge Communications" program in the following manner:
The LPP is a character by character asynchronous communications protocol.
Each program, upon startup, must provide a “browse” function to select a directory. Communications shall be by means of the creation, detection, and deletion of sub-directories within the specified communications directory.

A. To simulate a key press the entry program must create a sub-directory within the communications directory with the following format:
“sequence_number.keypress-name.extension”

a. Where sequence_number is unique and monotonically increasing both lexically and numerically.
b. “keypress-name” is either a single letter (case sensitive) or the name of the special character, as appended to these rules.
c. The extension is “.other”
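As a rough sketch of points a to c, a sub-directory name could be built like this. The helper name is hypothetical, only a handful of the special-character names from the appendix are included, and the zero-padding to ten digits is an assumption taken from the example pattern in these rules (padding is one simple way to keep lexical and numeric order in agreement).

```python
# Partial table only; the appendix lists the full set of names.
SPECIAL_NAMES = {" ": "space", ",": "comma", ".": "period", "?": "question", "\n": "Return"}


def keypress_dirname(sequence_number, char, extension="other"):
    """Build 'sequence_number.keypress-name.extension' for one key press."""
    # Characters without a special name are used as-is (assumption).
    name = SPECIAL_NAMES.get(char, char)
    # Fixed-width zero padding keeps lexical order equal to numeric order.
    return f"{sequence_number:010d}.{name}.{extension}"
```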

For example, were a program to create the following pattern of sub-directories:

0000000123.H.other
0000000235.e.other
0000000456.l.other
0000000789.l.other
0000000888.o.other
0000001234.comma.other
0000002222.space.other
0000002345.J.other
0000004567.i.other
0000006789.m.other
0000007777.period.other
0000008123.Return.other
0000010000.H.other
0000010001.o.other
0000010002.w.other
0000010005.space.other
0000020000.a.other
0000020001.r.other
0000020005.e.other
0000030000.space.other
0000030001.y.other
0000040000.o.other
0000050000.u.other
0000050010.question.other

the program would have transmitted the utterance:

Hello, Jim.
How are you?
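Reading the example in the other direction, a list of sub-directory names can be turned back into text. The sketch below is an assumption about how one might do this, not part of the judge program: the function name is hypothetical, the name table is partial, and BackSpace is not handled.

```python
# Partial table only; the appendix lists the full set of names.
NAME_TO_CHAR = {"space": " ", "comma": ",", "period": ".", "question": "?", "Return": "\n"}


def decode(dirnames, extension="other"):
    """Rebuild the utterance from 'sequence.name.extension' directory names."""
    chars = []
    # Zero-padded sequence numbers sort correctly as plain strings.
    for dirname in sorted(dirnames):
        parts = dirname.split(".")
        if len(parts) != 3 or parts[2] != extension:
            continue  # ignore anything that does not match the format
        chars.append(NAME_TO_CHAR.get(parts[1], parts[1]))
    return "".join(chars)


short_example = decode(["0000000235.i.other", "0000000123.H.other", "0000000300.period.other"])
```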

The judge program will post the letters on the appropriate window in "real-time," that is, as quickly as the operating system and program permit. Entrants may wish to incorporate a time delay between creating sub-directories (i.e. indicating key presses) to mimic typing. Sequence numbers indicate only the order of the characters, not the inter-character timing.

There are no restrictions on the sequence numbers except that they be monotonically increasing lexically and numerically.
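One possible way to send an utterance with a typing delay is sketched below. The function name, the simple incrementing counter and the fixed delay are all assumptions made for illustration; any monotonically increasing numbering would satisfy the rule, and the name table is again partial.

```python
import os
import time

# Partial table only; the appendix lists the full set of names.
SPECIAL_NAMES = {" ": "space", ",": "comma", ".": "period", "?": "question", "\n": "Return"}


def send_utterance(comm_dir, text, start_seq=1, delay_seconds=0.1):
    """Create one '.other' sub-directory per character; return the next free sequence number."""
    seq = start_seq
    for char in text:
        name = SPECIAL_NAMES.get(char, char)
        os.mkdir(os.path.join(comm_dir, f"{seq:010d}.{name}.other"))
        seq += 1
        # The pause only mimics typing; the sequence numbers carry no timing.
        time.sleep(delay_seconds)
    return seq
```

Passing the returned value as `start_seq` for the next utterance keeps the numbering increasing across the whole conversation.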

B. To detect a key press by the judge, the program must detect, within the communications directory, a sub-directory with the same format but with the extension “.judge”, and must then remove or delete the judge’s sub-directory from the communications directory.
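Point B could be handled with a simple polling function along these lines. This is a minimal sketch under stated assumptions: the function name is hypothetical, it returns the key press names rather than the characters, and a real entry would call it repeatedly in a loop.

```python
import os


def read_judge_keys(comm_dir):
    """Return pending judge key press names in sequence order, deleting each sub-directory."""
    keys = []
    for dirname in sorted(os.listdir(comm_dir)):
        parts = dirname.split(".")
        path = os.path.join(comm_dir, dirname)
        if len(parts) != 3 or parts[2] != "judge" or not os.path.isdir(path):
            continue  # not a judge key press; leave it alone
        keys.append(parts[1])
        os.rmdir(path)  # the rules require removing the judge's sub-directory
    return keys
```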

A previous version of the judge program is available at:

http://loebner.net/Prizef/JComm.txt

Note: This is a Perl program stored as .txt to enable downloading. To run the program after Perl has been installed, change the extension to .pl. Note also that there will be an update to this program, but the basic communications strategy will not change.

3: INTERACTION SEQUENCE.

Judges will begin each round by making initial comments to the entities. Upon receiving an utterance from a judge, the entities will respond. Judges will continue interacting with the entities for 25 minutes. At the conclusion of the 25 minutes, each judge will declare one of the two entities to be the human.

Both the human and the entry program must wait until the judge starts the interaction.

Entries will be expected to respond to the judges' initial comment or question. There will be no restrictions on what names etc the entries, humans, or judges can use, nor any other restrictions on the content of the conversations.

Contest management reserves the right to enter one or more publicly available open source programs.

3: SCORING THE "FINAL FOUR".

We wish

(a) each Entry to be compared at least once with every Confederate;
(b) each Judge to evaluate every Entry;
(c) each Judge to evaluate every Confederate.

Label the four Entries E1..E4, the four Confederates C1..C4, and the four Judges J1..J4.

The following matrix has Judges as rows and Entry Programs as columns. The intersection of each row and column shows which human Confederate is assigned to the combination of Entry and Judge.

        E1    E2    E3    E4
      ------------------------
J1      C1    C2    C3    C4
J2      C4    C1    C2    C3
J3      C3    C4    C1    C2
J4      C2    C3    C4    C1

For example, reading across row 2 we see that J2 compares E1 with C4, E2 with C1, E3 with C2, and E4 with C3. J2 will have scored every Entry and every Confederate, but in different combinations than J1, J3 and J4.
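The matrix follows a cyclic pattern in which each row is the previous one shifted by one place. The sketch below reproduces it and is only an observation about the table as printed; the function name and the formula are not part of the rules.

```python
N = 4  # four Judges, four Entries, four Confederates


def confederate_for(judge, entry, n=N):
    """Confederate number paired with Entry `entry` for Judge `judge` (all 1-based)."""
    return (entry - judge) % n + 1


# Rows are Judges J1..J4, columns are Entries E1..E4, values are Confederate numbers.
matrix = [[confederate_for(j, e) for e in range(1, N + 1)] for j in range(1, N + 1)]
```

Every row and every column contains each Confederate exactly once, which is what wishes (a) to (c) require.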

Reading down the third column, we see in the first row that E3 is judged by J1 against confederate C3.

Let us enter a 1 in that cell if E3 was chosen as the human and 0 otherwise.

We may continue down the column, entering a 1 in the second row if E3 was evaluated as the human against confederate C2, zero otherwise. The sum of the column will be the number of times E3 was judged as "more human" than a Confederate. We may do this for each Entry.

The Entry with the highest column total will be declared the winner.
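The column totals can be computed as in the sketch below. The cell values are invented purely for illustration, and the function name is hypothetical.

```python
from collections import Counter


def column_totals(cells):
    """Sum the 1/0 cells per Entry; `cells` maps (judge, entry) to 1 or 0."""
    totals = Counter()
    for (_judge, entry), chosen_as_human in cells.items():
        totals[entry] += chosen_as_human
    return totals


# Hypothetical outcome: every cell is 0 except the three listed below.
cells = {(j, e): 0 for j in ("J1", "J2", "J3", "J4") for e in ("E1", "E2", "E3", "E4")}
cells[("J1", "E3")] = 1
cells[("J2", "E3")] = 1
cells[("J4", "E1")] = 1

totals = column_totals(cells)
highest = max(totals.values())
leaders = sorted(e for e, t in totals.items() if t == highest)
```

`leaders` is kept as a list because two or more Entries may share the highest total.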

At the completion of the contest, Judges will rank all participants on "humanness."

If two or more Entries tie for highest column totals, the programs shall be evaluated by the mean of their rankings.
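A tie-break by mean ranking might look like the following sketch. It assumes that rank 1 means "most human", so the lowest mean wins; the rules do not state the direction of the ranking, and the function name and data are hypothetical.

```python
from statistics import mean


def break_tie(tied_entries, rankings):
    """Return (winners, mean_rank) for the tied Entries; `rankings` holds one dict per Judge."""
    mean_rank = {e: mean(r[e] for r in rankings) for e in tied_entries}
    best = min(mean_rank.values())  # assumption: a lower rank is more human
    winners = [e for e in tied_entries if mean_rank[e] == best]
    return winners, mean_rank


# Hypothetical ranks given by four Judges to two tied Entries.
rankings = [
    {"E1": 5, "E3": 6},
    {"E1": 6, "E3": 5},
    {"E1": 5, "E3": 6},
    {"E1": 6, "E3": 7},
]
winners, mean_rank = break_tie(["E1", "E3"], rankings)
```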

There will be 4 rounds of 30 minutes total, 25 minutes for interaction and 5 minutes for scoring and rearrangement of judges and humans.

Round 1 J1E1C1 J2E2C2 J3E3C3 J4E4C4

Round 2 J2E4C1 J1E3C2 J4E2C3 J3E1C4

Round 3 J3E3C1 J4E1C2 J2E4C3 J1E2C4

Round 4 J4E2C1 J3E4C2 J1E1C3 J2E3C4

* Thanks to Martin Sondergaard for the Graeco-Latin design (chat with his chatbot Asimov)

If any entry fools two or more judges comparing two or more humans into thinking that the entry is the human, the US$25,000 and Silver Medal will be awarded to the submitter(s) of the entry, and the contest will move to the Audio Visual Input US$100,000 Gold Medal level.
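One reading of this condition is that the rounds in which an entry was chosen as the human must involve at least two different judges and at least two different confederates. The sketch below encodes that reading only; the function name is hypothetical and the interpretation is an assumption, not an official ruling.

```python
def silver_medal_earned(wins):
    """`wins` lists (judge, confederate) pairs for rounds where the entry was chosen as the human."""
    judges = {judge for judge, _ in wins}
    confederates = {confederate for _, confederate in wins}
    return len(judges) >= 2 and len(confederates) >= 2
```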

4: SELECTING THE FINALISTS.

The finalists will be chosen based upon ability to respond "intelligently" to the following types of question.

The 4 entries with the highest scores will be selected as finalists.

It is not necessary that a program be able to respond to the selection questions. If no entries can respond "intelligently" to these questions I will evaluate the entries on a general quality of responses.

I will not ask about rare or unusual things. All nouns, adjectives and verbs will come from a dictionary suitable for children or adolescents under the age of 12.

Set 1 - Questions relating to time:

Background facts: For testing purposes, I will consider these to be correct whether or not the time and venue of the contest have been changed.

a. The system clock will be accurate to within a minute or two.
b. The competition is scheduled to start at 9:30 AM Saturday, 23 Oct 2010.
c. There will be 6 rounds of 30 minutes each.

Sample Questions

• What time is it?
• What round is this?
• Is it morning, noon, or night?

• etc.
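Given background facts a to c, a program could work out the answer to "What round is this?" from the system clock. The sketch below assumes the rounds run back to back with no breaks, which the rules do not state; the names are hypothetical.

```python
from datetime import datetime, timedelta

CONTEST_START = datetime(2010, 10, 23, 9, 30)  # background fact b
ROUND_LENGTH = timedelta(minutes=30)           # background fact c
ROUND_COUNT = 6                                # background fact c


def current_round(now):
    """Return the 1-based round number at `now`, or None outside the contest."""
    elapsed = now - CONTEST_START
    if elapsed < timedelta(0):
        return None
    index = elapsed // ROUND_LENGTH  # whole rounds already completed
    return index + 1 if index < ROUND_COUNT else None
```

In practice `now` would come from `datetime.now()`, which fact a says is accurate to within a minute or two.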

Set 2 - General questions relating to things.

Sample Questions

• What would I use a hammer for?
• Of what use is a taxi?
• etc.

Set 3 - Questions relating to relationships

Sample Questions

• Which is larger, a grape or a grapefruit?
• Which is faster, a train or a plane?
• John is older than Mary, and Mary is older than Sarah. Which of them is the oldest?
• etc.

Set 4 - Questions demonstrating "memory"

Sample Questions

I have a friend named Harry who likes to play tennis.

<Following this assertion there follows one or more intervening questions or statements, followed in turn by questions about the assertion, e.g.>

• What is the name of the friend I just told you about?
• Do you know what game Harry likes to play?
• etc.


Appendix

Names for special characters in LPP.

Name           Key

braceleft      '{'
braceright     '}'
bracketleft    '['
bracketright   ']'
parenleft      '('
parenright     ')'
space          ' '
comma          ','
period         '.'
greater        '>'
less           '<'
slash          '/'
backslash      '\'
bar            '|'
quotedbl       '"'
quoteright     "'"
Tab            "\t"
equal          '='
underscore     '_'
plus           '+'
minus          '-'
exclam         '!'
at             '@'
numbersign     '#'
dollar         '$'
percent        '%'
asterisk       '*'
asciicircum    '^'
asciitilde     '~'
quoteleft      '`'
ampersand      '&'
Return         "\n"
colon          ':'
semicolon      ';'
question       '?'
BackSpace      "BackSpace"
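For convenience, the table can be written as a lookup and used to turn a stream of key press names into text. This is a sketch only: the names `KEY_NAMES` and `apply_keys` are hypothetical, and treating BackSpace as "delete the previous character" is an assumption about its intended effect.

```python
# Every name in the table above except BackSpace, which is handled separately.
KEY_NAMES = {
    "braceleft": "{", "braceright": "}", "bracketleft": "[", "bracketright": "]",
    "parenleft": "(", "parenright": ")", "space": " ", "comma": ",", "period": ".",
    "greater": ">", "less": "<", "slash": "/", "backslash": "\\", "bar": "|",
    "quotedbl": '"', "quoteright": "'", "Tab": "\t", "equal": "=", "underscore": "_",
    "plus": "+", "minus": "-", "exclam": "!", "at": "@", "numbersign": "#",
    "dollar": "$", "percent": "%", "asterisk": "*", "asciicircum": "^",
    "asciitilde": "~", "quoteleft": "`", "ampersand": "&", "Return": "\n",
    "colon": ":", "semicolon": ";", "question": "?",
}


def apply_keys(names):
    """Turn key press names into text, treating BackSpace as deleting one character."""
    out = []
    for name in names:
        if name == "BackSpace":
            if out:
                out.pop()
        else:
            # Names not in the table are taken literally (assumption).
            out.append(KEY_NAMES.get(name, name))
    return "".join(out)
```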