Loebner Prize for Artificial Intelligence
"The First Turing Test"
2010 Competition
Saturday, 23 October 2010
Los Angeles, California, US
California State University Los Angeles
$5,000 Prize Money
- First Place US$3000 and a Bronze Annual Medal
- Second Place US$1000
- Third Place US$750
- Fourth Place US$250
- The US$25,000 and Silver Medal will be at risk.
Rules for Loebner Prize 2010
Individuals and teams interested in participating in The Loebner
Prize are advised to join the group http://tech.groups.yahoo.com/group/Robitron/ for
advice and suggestions.
UPDATE
Four Finalists Are:
- Richard Wallace drwallace [at] alicebot [period] org
- Robert Medeksza medeksza [at] zabaware [period] com
- Rollo Carpenter rollocarpenter [at] me [period] com
- Bruce Wilcox gowilcox [at] gmail [period] com
1. IMPORTANT DATES:
- 3 May 2010 - Opening Date for Entries
- 7 Jun 2010 - Closing Date for receipt of Entries
- 5 Jul 2010 - Final Four announced - US$250 will be
awarded to each of the four finalists at this time.
- 23 Oct 2010 - Contest
The date and venue are subject to change but NO changes to the date
and venue will be made after 1 May 2010.
In the case that the date is moved, it will not be moved to a date
earlier than 4 October 2010 or later than 1 November 2010.
No individual may be associated with more than one entry.
Entrants have three options for submitting their programs. All entrants must submit their entries on CD, DVD or
USB Flash media via a message service requiring a receipt signature and
having a time/date stamp (e.g. Certified, Registered, FedEx, UPS, etc.).
Entries must be mailed on or after 3 May 2010 and on or before 7 June
2010. Date of mailing, and not date of receipt, will be used to
determine priority.
(1) Entrants may choose to allow Contest Management to install and
test the programs.
(2) Entrants may install their previously submitted programs at the
testing site on a management supplied Windows OS computer.
(3) Entrants may bring a computer to the testing site and install and
run the programs at the testing site.
Entrants choosing options 1 or 2 must submit programs that are able
to run on Windows XP, Vista and/or Windows 7 OS machines.
All entrants must transmit their entry to:
Loebner Prize Contest
c/o Crown Industries, Inc.
155 North Park St.
East Orange, NJ 07017
Entrants choosing options (2) or (3) must schedule the time/date of
their appearance with me PRIOR to 7 June for testing at the mailing
address given. The testing date must be between 8 June and 30 June.
Final Four entrants who chose submission option (1) do NOT have to
be present at the competition. Those who choose options (2) or (3) MUST
be present to install and operate their entries.
No entry will be tested by contest management which requires contest
management to key in path names.
No entry will be tested by contest management which requires contest
management to modify system variables (although these may be modified
by a supplied installer).
No entry will be tested by contest management which does not
provide, on the transmittal media, all necessary programs,
interpreters, etc (e.g. Perl, MySQL, etc).
Only the first 16 compliant entries will be evaluated in depth. This
means that all entries will be tested in order of receipt for
compliance with the rules. The 16 compliant entries having the earliest
time stamps will be screened according to the criteria in point 4,
below.
If there is no compliant Entry for the 2010 Competition, the total
$5000 prize money will be added to the 2011 Competition prize, and the
2011 Competition will be held under these rules.
2: COMMUNICATIONS PROTOCOL.
The Loebner Prize Protocol (LPP) will be used in the 2010 through
2014 competitions. Each Entry Program must communicate with a "Judge
Communications" program in the following manner:
The LPP is a character by character asynchronous communications
protocol.
Each program, upon startup, must provide a “browse” function to
select a directory. Communications shall be by means of the creation,
detection, and deletion of sub-directories within the specified
communications directory.
A. To simulate a key press the entry program must create a
sub-directory within the communications directory with the following
format:
“sequence_number.keypress-name.extension”
a. Where sequence_number is unique and
monotonically increasing both lexically and numerically.
b. “keypress-name” is either a single letter (case sensitive) or
the name of the special character, as appended to these rules.
c. The extension is “.other”
For example, were a program to create the following pattern of
sub-directories:
0000000123.H.other
0000000235.e.other
0000000456.l.other
0000000789.l.other
0000000888.o.other
0000001234.comma.other
0000002222.space.other
0000002345.J.other
0000004567.i.other
0000006789.m.other
0000007777.period.other
0000008123.Return.other
0000010000.H.other
0000010001.o.other
0000010002.w.other
0000010005.space.other
0000020000.a.other
0000020001.r.other
0000020005.e.other
0000030000.space.other
0000030001.y.other
0000040000.o.other
0000050000.u.other
0000050010.question.other
the program would have transmitted the utterance:
Hello, Jim.
How are you?
The judge program will post the letters on the appropriate window in
"real-time," that is as quickly as the operating system and program
permit. Entrants may wish to incorporate a time delay between
creating sub-directories (i.e. indicating key presses) to mimic typing.
Sequence numbers indicate only the order of the characters, not the
inter-character timing.
There are no restrictions on the sequence numbers except that they
be monotonically increasing lexically and numerically.
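As an illustration of part A, the following is a minimal Python sketch of how an entry program might transmit an utterance. It is not the official implementation. The function names, the 10-digit zero padding (copied from the example above), and the optional delay are assumptions, and only a subset of the special-character names from the appendix is included.

```python
import os
import time

# Subset of the special-character names listed in the appendix to these rules.
SPECIAL_NAMES = {
    " ": "space",
    ",": "comma",
    ".": "period",
    "?": "question",
    "\n": "Return",
}


def keypress_name(ch):
    """Return the LPP name for one character (letters stand for themselves)."""
    if ch in SPECIAL_NAMES:
        return SPECIAL_NAMES[ch]
    if ch.isascii() and ch.isalpha():
        return ch
    raise ValueError(f"no LPP name known in this sketch for {ch!r}")


def send_utterance(comm_dir, text, start_seq=1, delay=0.0):
    """Create one '<seq>.<name>.other' sub-directory per character.

    Returns the next unused sequence number so that a later call can
    continue the monotonically increasing sequence.
    """
    seq = start_seq
    for ch in text:
        # Fixed-width zero padding keeps lexical and numeric order the same.
        name = f"{seq:010d}.{keypress_name(ch)}.other"
        os.mkdir(os.path.join(comm_dir, name))
        seq += 1
        if delay:
            time.sleep(delay)  # optional pause to mimic typing
    return seq
```

Calling send_utterance(comm_dir, "Hello, Jim.\n") would create sub-directories of the same form as the example above, numbered consecutively from start_seq.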
B. To detect a key press by the judge, the program must detect,
within the communications directory a sub-directory with the same
format, but extension “.judge” and then must remove or delete the
judge’s sub-directory from the communications directory.
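Part B can be sketched in the same way. The Python function below is a hypothetical helper, not part of the judge program: it scans the communications directory for ".judge" sub-directories, orders them by sequence number, converts the names back to characters, and removes each sub-directory once it has been read. Only a subset of the appendix names is mapped here; any other name is returned unchanged.

```python
import os

# Subset of the appendix names, mapped back to characters.
NAME_TO_CHAR = {
    "space": " ",
    "comma": ",",
    "period": ".",
    "question": "?",
    "Return": "\n",
}


def read_judge_keys(comm_dir):
    """Collect pending '.judge' key presses in sequence order, then remove them."""
    pending = []
    for entry in os.listdir(comm_dir):
        parts = entry.split(".")
        # Expect exactly 'sequence_number.keypress-name.judge'.
        if len(parts) != 3 or parts[2] != "judge" or not parts[0].isdigit():
            continue
        if not os.path.isdir(os.path.join(comm_dir, entry)):
            continue
        pending.append((int(parts[0]), parts[1], entry))

    keys = []
    for _seq, name, entry in sorted(pending):
        # Letters stand for themselves; unmapped names pass through as-is.
        keys.append(NAME_TO_CHAR.get(name, name))
        os.rmdir(os.path.join(comm_dir, entry))  # delete after reading
    return keys
```

An entry program would typically call such a function in a polling loop, since the protocol is asynchronous and the judge may type at any time.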
A previous version of the judge program is available at:
http://loebner.net/Prizef/JComm.txt
Note: This is a Perl program stored as .txt to enable downloading.
To run the program after Perl has been installed, change the extension
to .pl. Note also that there will be an update to this program, but the
basic communications strategy will not change.
3: INTERACTION SEQUENCE.
Judges will begin each round by making initial comments to the
entities. Upon receiving an utterance from a judge, the entities will
respond. Judges will continue interacting with the entities for 25
minutes. At the conclusion of the 25 minutes, each judge will
declare one of the two entities to be the human.
Both the human and the entry program must wait until the judge
starts the interaction.
Entries will be expected to respond to the judges' initial comment
or question. There will be no restrictions on what names etc the
entries, humans, or judges can use, nor any other restrictions on the
content of the conversations.
Contest management reserves the right to enter one or more publicly
available open source programs,
3: SCORING THE "FINAL FOUR".
We wish
(a) each Entry to be compared at least once
with every Confederate;
(b) each Judge to evaluate every Entry,
(c) each Judge to evaluate every Confederate.
Label the four Entries E1..E4, four Confederates C1..C4, and four
judges J1..J4
The following matrix has Judges as rows and Entry Programs as
columns. The intersection of each row and column shows which human
Confederate is assigned to the combination of Entry and Judge.
        E1 .... E2 .... E3 .... E4
----------------------------------
J1 .... C1 .... C2 .... C3 .... C4
J2 .... C4 .... C1 .... C2 .... C3
J3 .... C3 .... C4 .... C1 .... C2
J4 .... C2 .... C3 .... C4 .... C1
For example, reading across the row 2 we see that J2 compares E1
with C4, E2 with C1, E3 with C2, and E4 with C3. J2 will have scored
every Entry and every Confederate, but in different combinations than
J's 1, 3 and 4.
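The matrix above follows a cyclic pattern: each row is the previous row shifted one place to the right. The rules do not state a formula; the expression in the Python sketch below is only an observation about the printed table, used to rebuild it and to check goals (a) and (c). Goal (b) holds by construction, because every row has a cell for every Entry.

```python
N = 4  # four Entries, four Confederates, four Judges


def confederate_for(judge, entry, n=N):
    """Confederate number (1-based) paired with `entry` for `judge`."""
    return (entry - judge) % n + 1


# Rows are Judges J1..J4, columns are Entries E1..E4.
matrix = [[confederate_for(j, e) for e in range(1, N + 1)]
          for j in range(1, N + 1)]

for j, row in enumerate(matrix, start=1):
    print(f"J{j} .... " + " .... ".join(f"C{c}" for c in row))

everyone = set(range(1, N + 1))
# (c) each Judge evaluates every Confederate: each row holds C1..C4 once.
rows_ok = all(set(row) == everyone for row in matrix)
# (a) each Entry meets every Confederate: each column holds C1..C4 once.
cols_ok = all(set(col) == everyone for col in zip(*matrix))
```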
Reading down the third column, we see in the first row that E3 is
judged by J1 against confederate C3.
Let us enter a 1 in that cell if E3 was chosen as the human
and 0 otherwise.
We may continue down the column, entering a 1 in the second row if
E3 was evaluated as the human against confederate C2, zero otherwise.
The sum of the column will be the number of times E3 was judged as
"more human" than a Confederate. We may do this for each Entry.
The Entry with the highest column total will be declared the
winner.
At the completion of the contest, Judges will rank all participants
on "humanness."
If two or more Entries tie for highest column totals, the
programs shall be evaluated by the mean of their rankings.
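The column totals and the tie-break can be sketched in Python as follows. The function names and the sample data are hypothetical. The rules do not say how ranks are numbered, so this sketch assumes that rank 1 means "most human" and that a lower mean rank therefore wins a tie.

```python
from statistics import mean


def column_totals(votes):
    """votes[j][e] is 1 if judge j chose entry e as the human, else 0."""
    return [sum(col) for col in zip(*votes)]


def pick_winner(votes, rankings):
    """Return the 0-based index of the winning entry.

    rankings[j][e] is the rank judge j gave entry e. Assumption: rank 1
    is "most human", so the lowest mean rank wins among tied entries.
    """
    totals = column_totals(votes)
    best = max(totals)
    tied = [e for e, t in enumerate(totals) if t == best]
    if len(tied) == 1:
        return tied[0]
    return min(tied, key=lambda e: mean(r[e] for r in rankings))


# Hypothetical data: rows are J1..J4, columns are E1..E4.
votes = [
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
rankings = [
    [7, 5, 6, 8],
    [8, 5, 6, 7],
    [7, 6, 5, 8],
    [8, 5, 6, 7],
]
winner = pick_winner(votes, rankings)  # E2 and E3 tie on votes; ranks decide
```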
There will be 4 rounds of 30 minutes total, 25 minutes for
interaction and 5 minutes for scoring and rearrangement of judges and
humans.
Round 1 J1E1C1 J2E2C2 J3E3C3 J4E4C4
Round 2 J2E4C1 J1E3C2 J4E2C3 J3E1C4
Round 3 J3E3C1 J4E1C2 J2E4C3 J1E2C4
Round 4 J4E2C1 J3E4C2 J1E1C3 J2E3C4
* Thanks to Martin Sondergaard for the Graeco-Latin design (chat with his chatbot Asimov)
If any entry fools two or more judges comparing two or more humans
into thinking that the entry is the human, the US$25,000 and Silver
Medal will be awarded to the submitter(s) of the entry and the contest
will move to the Audio Visual Input US$100,000 Gold Medal level.
4: SELECTING THE FINALISTS.
The finalists will be chosen based upon ability to respond
"intelligently" to the following types of question.
The 4 entries with the highest scores will be selected as
finalists.
It is not necessary that a program be able to respond to the
selection questions. If no entries can respond "intelligently" to these
questions I will evaluate the entries on a general quality of
responses.
I will not ask about rare or unusual things. All nouns, adjectives
and verbs will come from a dictionary suitable for children or
adolescents under the age of 12.
Set 1 - Questions relating to time:
Background facts: For testing purposes, I will consider these to be
correct whether or not the time and venue of the contest has been
changed.
a. The system clock will be accurate to within a minute or two.
b. The competition is scheduled to start at 9:30 AM Saturday, 23 Oct
2010.
c. There will be 6 rounds of 30 minutes each.
Sample Questions
• What time is it?
• What round is this?
• Is it morning, noon, or night?
• etc.
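Given background facts (a) to (c), an entry could answer questions of this kind from the system clock. The Python sketch below is one possible approach and not a required method. The function names and the hour boundaries for morning, afternoon, and night are assumptions.

```python
from datetime import datetime, timedelta

START = datetime(2010, 10, 23, 9, 30)  # background fact (b)
ROUNDS = 6                             # background fact (c)
ROUND_LENGTH = timedelta(minutes=30)   # background fact (c)


def current_round(now):
    """Return the 1-based round number at `now`, or None outside the contest."""
    elapsed = now - START
    if elapsed < timedelta(0) or elapsed >= ROUNDS * ROUND_LENGTH:
        return None
    return elapsed // ROUND_LENGTH + 1


def part_of_day(now):
    """Rough answer to "Is it morning, noon, or night?" (cut-offs assumed)."""
    h = now.hour
    if 5 <= h < 12:
        return "morning"
    if h == 12:
        return "noon"
    if 12 < h < 18:
        return "afternoon"
    return "night"
```

In use, an entry would pass datetime.now() to both functions, relying on fact (a) that the system clock is accurate to within a minute or two.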
Set 2 - General questions relating to things.
Sample Questions
• What would I use a hammer for?
• Of what use is a taxi?
• etc.
Set 3 - Questions relating to relationships
Sample Questions
• Which is larger, a grape or a grapefruit?
• Which is faster, a train or a plane?
• John is older than Mary, and Mary is older than Sarah. Which of
them is the oldest?
• etc.
Set 4 - Questions demonstrating "memory"
Sample Questions
I have a friend named Harry who likes to play tennis.
<Following this assertion there follows one or more intervening
questions or statements, followed in turn by questions about the
assertion, e.g.>
• What is the name of the friend I just told you about?
• Do you know what game Harry likes to play?
• etc.
Appendix
Names for special characters in LPP.
Name Key
braceleft '{',
braceright '}',
bracketleft '[',
bracketright ']',
parenleft '(',
parenright ')',
space ' ',
comma ',',
period '.',
greater '>',
less '<',
slash '/',
backslash '\',
bar '|',
quotedbl '"',
quoteright "'",
Tab "\t",
equal '=',
underscore '_',
plus '+',
minus '-',
exclam '!',
at '@',
numbersign '#',
dollar '$',
percent '%',
asterisk '*',
asciicircum '^',
asciitilde '~',
quoteleft '`',
ampersand '&',
Return "\n",
colon ":",
semicolon ";",
question "?",
BackSpace "BackSpace"
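For convenience, the table above can be transcribed into a lookup structure. The Python sketch below is one way to do so. The names LPP_NAMES, CHAR_TO_NAME, and decode are hypothetical, and treating BackSpace as "delete the previous character" is an assumption, since these rules only give its name.

```python
# Name -> character, transcribed from the table above (BackSpace handled below).
LPP_NAMES = {
    "braceleft": "{", "braceright": "}",
    "bracketleft": "[", "bracketright": "]",
    "parenleft": "(", "parenright": ")",
    "space": " ", "comma": ",", "period": ".",
    "greater": ">", "less": "<",
    "slash": "/", "backslash": "\\", "bar": "|",
    "quotedbl": '"', "quoteright": "'", "quoteleft": "`",
    "Tab": "\t", "Return": "\n",
    "equal": "=", "underscore": "_", "plus": "+", "minus": "-",
    "exclam": "!", "at": "@", "numbersign": "#", "dollar": "$",
    "percent": "%", "asterisk": "*", "asciicircum": "^",
    "asciitilde": "~", "ampersand": "&",
    "colon": ":", "semicolon": ";", "question": "?",
}

# Character -> name, for building sub-directory names when sending.
CHAR_TO_NAME = {ch: name for name, ch in LPP_NAMES.items()}


def decode(names):
    """Turn a sequence of key-press names into text.

    Single letters stand for themselves. BackSpace is assumed to remove
    the previously typed character.
    """
    out = []
    for name in names:
        if name == "BackSpace":
            if out:
                out.pop()
        else:
            out.append(LPP_NAMES.get(name, name))
    return "".join(out)
```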