honours project 2 - carleton universitypeople.scs.carleton.ca/~arpwhite/documents/honours... ·...
Natural Language Processing Chatbot for Stock Quotes Honours Project Report Matt Austin - 264681
Abstract
This project involves an application that runs in the background, waits for
incoming messages, and responds with an answer. This application is called a
“chatbot” (short for “chatting robot”), and it responds to English questions about
stock quotes. A chatbot is a computer program that runs without human interaction and
replies to messages that are sent to it. This chatbot combines the
functionality of the Jabber protocol for messaging, Alice for rule-based natural language
processing, and Cocoa for the user interface. It runs under Mac OS X and has been
verified to run on version 10.2.2 of the OS.
Acknowledgements
The author would like to thank Dr. Tony White for all of his advice and guidance
throughout this project. The author would also like to thank all the people who worked on
the Alice tool, especially those who created J-Alice.
Table of Contents:
1) Introduction
   1.1) Chatbot
   1.2) Jabber
   1.3) Alice
   1.4) Cocoa
2) Alice
   2.1) How Alice Works
   2.2) Rules
   2.3) AIML
3) Jabber
   3.1) Architecture
   3.2) Message Example
4) Chatbot
   4.1) Purpose
   4.2) Choice of Technology
5) User Interface
6) Program Flow
   6.1) Receiving a Question
      6.1.1) Connecting to the Jabber Server
      6.1.2) Receiving a Message
      6.1.3) Parsing the Message
   6.2) Processing the Question
      6.2.1) Interacting with Alice
      6.2.2) Stock Handler
      6.2.3) Rules for Stock Handler
   6.3) Replying With an Answer
7) Testing
   7.1) What Was Expected
   7.2) Connecting to the Jabber Server
   7.3) Responding to Messages
   7.4) Results
8) Conclusion
   8.1) Future Work
   8.2) Bugs
9) References
10) Licenses
11) Appendix A
List of Figures:
FIGURE 1 - Screen shot of the chatbot user interface.
FIGURE 2 - Screen shot of the chatbot after connecting to the server.
FIGURE 3 - Two messages received while connecting to the server.
FIGURE 4 - Screen shot of the chat session within Fire.
FIGURE 5 - A screen shot showing the received messages.
FIGURE B-1 - Simple system overview.
FIGURE B-2 - UML overview of the chatbot architecture.
FIGURE B-3 - Message Sequence Chart for Chatbot Connection
FIGURE B-4 - Message Sequence Chart for Incoming Messages
FIGURE B-5 - Message Sequence Chart for Alice Response
1) Introduction
This project is a chatbot that responds to queries about stock prices and returns
the current price of the requested stock. It contains a User Interface (UI) that was
written in Cocoa using Objective-C. In addition to the UI, the chatbot contains code that
works with the Jabber Instant Messaging (IM) protocol, and Alice for natural language
processing. A simple overview of the system can be seen in Fig. B-1.
1.1) Chatbot
A chatbot is a program that runs in the background on a networked computer and
waits for messages to be sent to it. Once a message is received from a user,
the chatbot decides what to respond with and sends a message back to the user. This
way, the program can run unattended, choosing its responses without human
interaction or supervision.
1.2) Jabber
Jabber is an open-source Instant Messaging protocol that is based on XML.
Jabber has other attractive features: the server is free, it has transports
that allow it to work with other IM schemes, and the protocol is simple. The
transports for MSN, ICQ, etc. are not used in this project and therefore will not be
discussed.
1.3) Alice
Alice (Artificial Linguistic Internet Computer Entity) is an open-source program
that does natural language processing using rules. It is mostly used as a robot for
chatting. It uses AIML (Artificial Intelligence Markup Language), which is an XML-
compliant language for the rules. There is a version called J-Alice that is written in C++,
which is the version used for this project.
1.4) Cocoa
Cocoa is a framework from Apple that runs under Mac OS X. Cocoa was
developed from OpenStep and is therefore tied in with Objective-C. Objective-C is an
object-oriented language that, like C++, is built on C. The other language that works with
Cocoa is Java, but since the Jabber and Alice code is in C++, Objective-C is a better fit
because it can work in unison with C++. In addition, the development tools for Cocoa
are free and included with Mac OS X.
2) Alice
The next three sections describe how Alice works, AIML files, and the rules that
are inside these files.
2.1) How Alice Works
Internally, Alice is composed of two main parts. The first is
the Kernel; the second is the set of Handlers that work with the additional XML tags. The
Kernel is responsible for loading the AIML files, which contain the rules, and for
processing statements. In the chatbot, a new Kernel is created when the application is
launched. This new Kernel reads in the specified AIML files. Once the files are read in,
Alice is ready to match statements to rules and provide an answer. When a statement is
passed into Alice, it matches the statement to a rule and, based on the information in the
rule, provides a response.
2.2) Rules
Once the AIML files are read in, each rule is “learned”. Once Alice has learned a rule,
it can match that rule against a message received from an outside
user. A rule matches when the words of the message fit the pattern in the
rule. For example, the rule “_ school” matches the word
“school” after the beginning of a sentence, so Alice would match it to the message “I am
at school”. By
having multiple rules, Alice can give a meaningful answer to a wide range of questions
about the same topic.
There are two main symbols that are used for most of the rules for the chatbot: the
‘*’ and ‘_’ symbols. The ‘*’, or wildcard symbol, tells Alice to match any word
in its place. So the rule “I like *” would match “I like computers” or “I like
skiing”. The ‘_’ symbol can be used at the beginning or the end of a rule. If it is used at
the beginning, as in the rule “_ dogs”, Alice will match any sentence that has the word
“dogs” after the beginning of the sentence. Since nothing follows the word in such a rule,
the sentence must also end there: “_ school” matches any sentence that ends with the
word “school”. The
‘*’ and ‘_’ symbols can be used together in a rule, but Alice only supports one ‘*’ per
rule.
2.3) AIML
The files that contain the rules are called AIML files. The files are
based on XML and contain information about how to handle specific questions. For the
chatbot, four tags are important in the AIML files. An example
of an AIML file containing one rule, and the meaning of each tag, are as follows:
<aiml version="1.0">
<category>
<pattern>_ happy *</pattern>
<template>Why are you happy?</template>
</category>
</aiml>
<aiml> - This tag indicates that the file is of type AIML and the version is 1.0.
<category> - This tag tells Alice that everything inside of it is a unit of knowledge which
is also referred to as a rule.
<pattern> - This tag is what needs to be matched up for the rule to occur. In this case if
the question passed in contains the word “happy” somewhere other than the beginning of
the sentence, the result “Why are you happy?” will be passed back to the user.
<template> - This tag contains the answer that Alice will pass back to the user if the rule
is matched up with the question.
There are many more tags available to use, such as the <srai> tag for recursive pattern
matching, but since they were not in the scope of this project, they will not be discussed
here. More information can be found on the web at http://www.alicebot.org.
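The loading and matching steps described in sections 2.1 to 2.3 can be sketched in a few lines of Python. This is only an illustration and not J-Alice's code: the function names are hypothetical, both ‘*’ and ‘_’ are assumed to stand for one or more words, and the first matching rule wins, whereas a real AIML interpreter also ranks competing rules.

```python
import re
import xml.etree.ElementTree as ET

# The one-rule AIML file from section 2.3.
AIML_TEXT = """<aiml version="1.0">
<category>
<pattern>_ happy *</pattern>
<template>Why are you happy?</template>
</category>
</aiml>"""


def pattern_to_regex(pattern):
    """Turn an AIML-style pattern into a regex ('*' and '_' match one or more words)."""
    parts = []
    for token in pattern.upper().split():
        if token in ("*", "_"):
            parts.append(r"\S+(?: \S+)*")
        else:
            parts.append(re.escape(token))
    return re.compile("^" + " ".join(parts) + "$")


def load_rules(aiml_text):
    """Read every <category> and return (compiled pattern, template text) pairs."""
    root = ET.fromstring(aiml_text)
    rules = []
    for category in root.iter("category"):
        pattern = category.findtext("pattern")
        template = category.findtext("template")
        rules.append((pattern_to_regex(pattern), template))
    return rules


def respond(rules, message):
    """Normalise the message and return the template of the first matching rule."""
    words = re.sub(r"[^A-Za-z0-9' ]", " ", message).upper().split()
    normalised = " ".join(words)
    for regex, template in rules:
        if regex.match(normalised):
            return template
    return None
```

With these assumptions, “I am happy today” matches the rule, while “happy days” does not, because the ‘_’ needs at least one word before “happy”.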
3) Jabber
The next two sections describe how Jabber works and give an example of an instant
message.
3.1) Architecture
Jabber uses a client-server architecture as opposed to the client-client architecture
that some other IM systems use. This enables a Jabber user to message other users who are
not on the same Jabber server. When a user is ready to log in to a server, a TCP/IP
connection is made on port 5222. This connection stays alive until the user logs off.
When a message arrives on the server for a user, the message is sent to the user’s client.
This means that the client does not have to poll the server to see if there are messages
waiting, which reduces the amount of traffic flowing to the clients. Because each user
could be on a different server, the user’s address is their username @ servername. For
example, it could be [email protected]. All messages that are exchanged between the server
and clients, including connection, registration, and instant messages, use Jabber’s XML
protocol.
If two users are communicating but are on different servers, Jabber
ensures that the message goes to the right person. Suppose there are two users, user1 and
user2, and they are on server1.com and server2.com respectively. If user1@server1.com
sends a message to user2@server2.com, server1 will connect with server2 and deliver the
message. server2 will then forward the message on to user2.
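The routing decision in this example depends only on the server part of the recipient's address. A minimal sketch of that decision follows; the function names are hypothetical and the optional “/resource” suffix is handled the way it appears in the sample messages later in this report.

```python
def split_address(address):
    """Split 'user@server/resource' into (user, server, resource)."""
    bare, _, resource = address.partition("/")
    user, _, server = bare.partition("@")
    return user, server, resource


def next_hop(local_server, recipient):
    """Return 'deliver' when the recipient is local, else name the server to forward to."""
    _, server, _ = split_address(recipient)
    if server == local_server:
        return "deliver"
    return "forward to " + server
```

Seen from server1.com, a message for user2@server2.com is forwarded; seen from server2.com, the same message is delivered to the user's client.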
3.2) Message Example
In the Jabber protocol, there are two different types of instant messages that can
be passed between the server and client once the client has logged in. The first is a single
instant message, and the second is a chat message. A single message is just that:
one message that is not part of a group of messages. The message is sent and the
client waits for a response. A chat message, on the other hand, is part of a chat session,
where each message is part of a group of messages that each user can see. The advantage
of chatting is that you don’t need to fill out a new message each time you want to say
something. The chat window stays open and all communication between users stays
visible until the session is over.
For the example, a single instant message will be used since it is very similar to a
chat message but contains a little more information. Here is a sample message, and a
description of what it contains:
<message to='[email protected]' from='[email protected]'>
<subject>Hello</subject>
<body>How are you doing today?</body>
</message>
In this example, a message is being sent from [email protected] to [email protected].
Notice that both the sender and the recipient are given as the username followed by
the address of the server. In this case, the addresses of the servers are different. The
subject of the message is “Hello”, and the body of the message is “How are you doing
today?”. The message is human-readable since it is in XML, which makes developing code
for Jabber easier.
4) Chatbot
For this project, a chatbot was developed that used Jabber for the IM protocol,
Alice for the natural language processing and Cocoa for the UI. The following sections
describe in detail the inner workings of this application. A UML overview of the
different parts of the chatbot can be seen in Fig. B-2.
4.1) Purpose
The purpose of the chatbot is to allow a user to send an IM to it asking about stock
prices. The user sends a question about a particular stock, and the chatbot responds with
the current price of that stock. For example, if the user sent the message “What price is
ERICY at?”, the chatbot could respond with “ERICY is at $11.53” (ERICY is the stock
ticker for Ericsson). There are two main benefits of the chatbot over a user manually
going to a webpage and finding the price. The first is that the message that the user needs
to send is small and only takes a few seconds. The second is that the user can ask in
English what stock they would like to look at. This means they can ask in a variety of
ways. Examples are: “How is Ericsson doing?”, “How is ERICY doing in the market?”
or simply “ERICY”.
4.2) Choice of Technology
For the instant messaging part of the chatbot, Jabber was chosen for a number of
reasons. The first is that the protocol is public; the second is that the server is free
and runs under Mac OS X. The final reason is that there are numerous open source clients
that made learning the Jabber protocol easier.
For the natural language processing, Alice was chosen because it is
mature, and there is a C++ version of the library. In addition, the C++ version (J-Alice)
is free and open source, which made the integration into the chatbot easier.
For the UI, Cocoa was used because this project runs under Mac OS X and Cocoa
is a powerful and easy to use framework. This allowed the UI to be developed quickly,
so the focus could remain on the internal workings of the chatbot.
5) User Interface
The user interface for the chatbot is quite simple. There are text fields for the
username and password of the Jabber account that the chatbot will be using and for the
address of the Jabber server. In addition, there is a button for connecting to the server
and a button to refresh the rules that Alice uses.
FIGURE 1 – Screen shot of the chatbot user interface.
(1) This is the text field for the IP address of the Jabber server.
(2) This is the text field for the username that the chatbot will use to connect to
the Jabber server.
(3) This is the text field for the password for the given username. Notice that it is a
secure text field, in that the password is hidden from the user. This is to prevent someone
who is looking at the screen from knowing the password.
(4) This is the button that the user must press to have the chatbot connect with the Jabber
server.
(5) This button is used to have Alice re-read the AIML files that contain the rules.
(6) This is the area where incoming messages from the server are displayed.
To use the chatbot, the user must first fill in the username, password and server address
text fields with the appropriate information. Once this is done, the user must press the
“Connect” button. After pressing this, the chatbot will connect with the Jabber server,
and any incoming messages from the server will be displayed in the “Incoming
Messages” text area.
FIGURE 2 - Screen shot of the chatbot after connecting to the server.
6) Program Flow
There are 3 major steps that the chatbot goes through from the time it receives a
question, to the time it responds. The first is communicating with Jabber to receive the
instant messages (questions). The second is processing the question with Alice to come
up with an answer. The last is returning this answer back to the user that sent the original
message. Each of these steps will be detailed in the sections below.
6.1) Receiving a Question
Before the chatbot can receive a question, it must first establish a connection with
the Jabber server. For this project, it is assumed that the user account that
the chatbot is using has already been set up. Once the chatbot is connected to the server,
it waits for an instant message (the question). After receiving a question, the chatbot
parses it and prepares it for processing in Alice. Each of these three steps is detailed in
the sections below.
6.1.1) Connecting to the Jabber Server
Connecting to the Jabber server is a fairly straightforward process since all the
messages are in human-readable XML. To begin the connection, the user must first fill in
the “Server IP”, “Username” and “Password” fields in the chatbot’s user interface. Once
this is done, the user must click on the “Connect” button to initiate the
connection. A BSD socket is created and an attempt to connect to the
Jabber server begins. Once the socket connection is made, the registration process
begins. This process has three steps: connect, registration and presence. All
of the sending and receiving with the Jabber server is done through BSD socket calls. A
message sequence chart for the connection can be seen in Fig. B-3.
Now that a socket has been established, the first XML string is sent to the server. It
contains information telling the server that the chatbot would like to connect as a Jabber
user. This string contains the address of the server, and a message that identifies us as
type “client”. This string is:
"<?xml version=\"1.0\" encoding=\"UTF-8\" ?><stream:stream
to='24.42.217.7' xmlns='jabber:client'
xmlns:stream='http://etherx.jabber.org/streams'>"
The string also contains a web address. It serves as the XML namespace identifier for
the stream and is not used to reach other IM systems such as AIM, ICQ or MSN. Since only
the Jabber protocol is being used in this project, it will not be discussed further here.
The server receives the request for a connection, and it replies with an answer. This
response contains a unique identification number (id) that is assigned to all of the
messages in this connection coming from the server to the chatbot. This string is:
<?xml version='1.0'?><stream:stream
xmlns:stream='http://etherx.jabber.org/streams' id='3DF53FBB'
xmlns='jabber:client' from='24.42.217.7'>
The next step is to log in to the server as a user by sending a registration request. The
request contains the username and password. In addition, the id of the message is
“auth2”, which the server repeats in its reply so that the two can be paired, and the
jabber:iq:auth namespace tells the server we are authenticating ourselves. The <resource>
tag is set to “client”, since we are a client of the server. The string sent is:
"<iq id='auth2' type='set'>
<query xmlns='jabber:iq:auth'>
<username>myUsername</username>
<password>myPassword</password>
<resource>client</resource>
</query>
</iq>"
The response from the server to this message is:
<iq id='auth2' type='result'/><stream:stream
xmlns:stream='http://etherx.jabber.org/streams' id='3DF53FBB'
xmlns='jabber:client' from='24.42.217.7'>
This message is close to the first message from the server, except the type is “result”.
This refers to the fact that the server has accepted our registration, and we must now
announce our presence, so that we are visible to other users. This means that once the
chatbot is visible on the Jabber server, it can start to receive instant messages.
The last step is to send a presence message to the server, which is:
"<presence/>"
This is a simple message indicating that the chatbot is available and can accept messages
from other users.
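The three steps of this section (connect, registration, presence) can be sketched as follows. This is an illustration in Python rather than the chatbot's own Objective-C and BSD socket code. The function names are hypothetical, and reading one reply per step with a single recv() is a simplifying assumption; a robust client would buffer and parse the incoming XML stream.

```python
import socket
from xml.sax.saxutils import escape

JABBER_PORT = 5222  # client port named in section 3.1


def stream_header(server):
    """Step 1: ask the server to open a client stream."""
    return ('<?xml version="1.0" encoding="UTF-8" ?>'
            "<stream:stream to='%s' xmlns='jabber:client' "
            "xmlns:stream='http://etherx.jabber.org/streams'>" % server)


def auth_request(username, password, resource="client"):
    """Step 2: the jabber:iq:auth request, with XML-special characters escaped."""
    return ("<iq id='auth2' type='set'>"
            "<query xmlns='jabber:iq:auth'>"
            "<username>%s</username>"
            "<password>%s</password>"
            "<resource>%s</resource>"
            "</query></iq>" % (escape(username), escape(password), escape(resource)))


PRESENCE = "<presence/>"  # step 3: announce that the chatbot is available


def login(server, username, password, timeout=5.0):
    """Open the socket and run the three steps; returns the socket and the raw replies."""
    sock = socket.create_connection((server, JABBER_PORT), timeout=timeout)
    replies = []
    for step in (stream_header(server), auth_request(username, password)):
        sock.sendall(step.encode("utf-8"))
        replies.append(sock.recv(4096).decode("utf-8", errors="replace"))
    sock.sendall(PRESENCE.encode("utf-8"))
    return sock, replies
```

Escaping the username and password keeps a character such as “&” from breaking the XML that is sent to the server.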
6.1.2) Receiving a Message
Now that the chatbot is connected to the server, it must wait for incoming messages
(questions). This is accomplished by setting up a timer that polls the socket using the
select() call every 0.10 seconds. By polling the socket, the chatbot can quickly see if
an incoming message is waiting to be read. If select() reports that
there is a message, the chatbot reads the message into a buffer, and the next step is to
parse this data. A sample instant message could be:
<message type='chat' to='[email protected]'
from='[email protected]/everybuddy'><body>What is the price of
AAPL?</body></message>
Notice that the message has a minimal amount of data: the type of message, who it
is from, who it is for, and the body. A message sequence chart for receiving a message
can be seen in Fig. B-4.
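The polling step can be illustrated with Python's select module, which wraps the same select() call. The sketch below is an assumption-laden stand-in for the chatbot's timer callback: poll_once is a hypothetical name, and a local socket pair takes the place of the connection to the Jabber server.

```python
import select
import socket


def poll_once(sock, timeout=0.10):
    """Return bytes waiting on the socket, or None when nothing arrives within the timeout."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(4096)


def demo():
    """Use a local socket pair in place of the real server connection."""
    server_side, client_side = socket.socketpair()
    try:
        first = poll_once(client_side)  # nothing has been sent yet
        server_side.sendall(
            b"<message type='chat'><body>What is the price of AAPL?</body></message>")
        second = poll_once(client_side)  # the message is now waiting
        return first, second
    finally:
        server_side.close()
        client_side.close()
```

A single recv() may return only part of a long message, so a fuller client would keep appending to its buffer until a complete message element has arrived.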
6.1.3) Parsing the Message
After receiving a message, it must be parsed to determine the specific information
needed for a reply. This information includes who it is from, the subject, the body of the
message, and whether it is a single instant message or part of a chat. The first, who it is
from, is determined by examining the “from” property. It contains the username and the
address of the sender. The subject, if there is one, is enclosed in the <subject></subject>
tag; some messages only have a body. The body of the message is
contained in the <body></body> tag. Finally, to determine if the message is a single
instant message or part of a chat, the “type” property is examined. If it is equal to “chat”
then the message is from a chat session, otherwise it is a single message. Here is an
example using the sample message from the previous part:
<message type='chat' to='[email protected]'
from='[email protected]/everybuddy'><body>What is the price of
AAPL?</body></message>
From: “[email protected]”
To: “[email protected]”
Subject: “” (empty)
Body: “What is the price of AAPL?”
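The fields listed above can be pulled out of a message with a standard XML parser. The following Python sketch is illustrative only: parse_message is a hypothetical name, the addresses used with it are placeholders, and it assumes the message is handed over as a standalone element without a namespace declaration, as in the samples in this report.

```python
import xml.etree.ElementTree as ET


def parse_message(stanza):
    """Extract sender, recipient, subject, body and chat flag from one <message>."""
    element = ET.fromstring(stanza)
    sender = element.get("from", "")
    return {
        "from": sender.split("/", 1)[0],       # username@server without the resource
        "resource": sender.partition("/")[2],  # e.g. the client name after the slash
        "to": element.get("to", ""),
        "subject": element.findtext("subject", default=""),
        "body": element.findtext("body", default=""),
        "is_chat": element.get("type") == "chat",
    }
```

A message without a <subject> tag yields an empty subject, which corresponds to the “(empty)” entry in the example above.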
6.2) Processing the Question
Alice now needs to process the message so it can find an appropriate answer.
This is done through a new handler called the StockHandler, whose job is to find the price
of a specific stock. How the chatbot interacts with Alice, and how the StockHandler
works, are described in the sections below. A message sequence chart for the processing
of a message within Alice can be seen in Fig. B-5.
6.2.1) Interacting with Alice
Now that a question has been received from a user, it must be passed on to Alice
for processing. Alice will determine the correct response to send back to the user based
on the rules it previously learned, and the context of the question. The chatbot does this
by passing the body of the message to the Kernel class (which is part of Alice). Recall that
the Kernel was created when the application was launched. The Kernel finds the
appropriate handler, which in this case will be the StockHandler, and passes the
information to it. So, for example, if the message body was “What is the price of
Corel?”, then the StockHandler will end up receiving that sentence. The way the
StockHandler works, and the rules involved, are detailed in the two sections below.
6.2.2) Stock Handler
Since the chatbot needs to respond with the stock price of a company, a new
handler was added to Alice to deal with this. The job of this handler, called
StockHandler, was to look up a ticker symbol on Yahoo Finance
(http://finance.yahoo.com), parse the web page and return the price of that ticker at that
moment. This is also known as “screen scraping”, which refers to the fact that the chatbot
is only interested in one small part of the web page. To have this handler called when a
stock price was needed, a new XML tag was used. This tag is defined as: <stock>ticker
symbol</stock>. The ticker symbol that is enclosed in the tag is the stock that will be
looked up. So when Alice matches the question to a rule, the rule has the
stock ticker inside the <stock> tag. The StockHandler is then called with the
ticker symbol. An example of this would be:
Question:
“What is the price of apple stock?”
Rule:
<category>
<pattern>_ APPLE STOCK</pattern>
<template><stock>AAPL</stock></template>
</category>
Result:
Alice matches up the string “Apple Stock” to the ticker “AAPL”. The StockHandler
would then be called with AAPL as the ticker symbol since that is what is enclosed in the
<stock> tag. The price of Apple Computer (AAPL) would then be returned.
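The way a custom tag inside a <template> leads to a handler call can be sketched as a lookup table from tag names to functions. This Python illustration does not reproduce J-Alice's handler interface: HANDLERS, evaluate_template and the stand-in handler are hypothetical names, and the price in the table is a made-up placeholder rather than a real quote.

```python
import xml.etree.ElementTree as ET


def fake_stock_handler(ticker):
    """Stand-in for the StockHandler; a fixed table replaces the web lookup."""
    prices = {"AAPL": "10.00"}  # placeholder value, not a real price
    return "%s is at $%s" % (ticker, prices.get(ticker, "?"))


HANDLERS = {"stock": fake_stock_handler}


def evaluate_template(template_xml, handlers=HANDLERS):
    """Expand a <template>: plain text is kept, known child tags go to their handler."""
    template = ET.fromstring(template_xml)
    pieces = [template.text or ""]
    for child in template:
        handler = handlers.get(child.tag)
        if handler is not None:
            pieces.append(handler((child.text or "").strip()))
        pieces.append(child.tail or "")
    return "".join(pieces).strip()
```

A template that contains only text is returned unchanged, so ordinary rules and stock rules can share the same evaluation path.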
The StockHandler is a fairly simple piece of code. It finds the stock price of a ticker
symbol on the web and returns the current price. The steps involved are detailed below.
The first step was to build the correct web address for the given ticker symbol. This was
done by taking the address of the Yahoo Finance quote page, which contains the placeholder
<ticker symbol>, and replacing the <ticker symbol> with the actual symbol. For example,
if the stock ticker was “MTIB.TO”, then the address would be the same quote page address
with MTIB.TO in place of the placeholder. This page provides basic
information for a stock including the current stock price. Once this address was formed,
the source for the page was downloaded. Within the source, a specific, unique string was
searched for. This string is “</font></td><td nowrap><font face=arial size=-1><b>”.
Immediately after this string is the price of the stock. Following the price is the string
“</b>”, so it is easy to find where the price starts and ends. The following is a clipping of
HTML source from the Yahoo Finance webpage, with the price between <b> and </b>:
<font face=arial size=-1>Dec 11</font></td><td nowrap><font face=arial size=-1><b>15.49</b></font>
Once the price has been found, the string “<ticker symbol> is at <current price>” is
returned to the chatbot. An example of this could be “MTIB.TO is at $0.15”.
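The search for the marker string and the closing “</b>” can be sketched with plain string operations. This Python version is an illustration, not the StockHandler's C++ code: the function names are hypothetical, the download step is left out, and the page source is assumed to already be in memory.

```python
# Marker that precedes the price, and the tag that follows it (both from the text above).
MARKER = "</font></td><td nowrap><font face=arial size=-1><b>"
END = "</b>"


def extract_price(page_source):
    """Return the text between the marker and the next '</b>', or None if not found."""
    start = page_source.find(MARKER)
    if start == -1:
        return None
    start += len(MARKER)
    end = page_source.find(END, start)
    if end == -1:
        return None
    return page_source[start:end].strip()


def format_answer(ticker, page_source):
    """Build the '<ticker symbol> is at <current price>' reply, or None on failure."""
    price = extract_price(page_source)
    if price is None:
        return None
    return "%s is at $%s" % (ticker, price)
```

Returning None when the marker is missing lets the caller notice that the page layout has changed, which is the main weakness of screen scraping.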
6.2.3) Rules for Stock Handler
To match the questions that users send to the chatbot, a special set of rules
for Alice needed to be made. These rules are contained in AIML files that Alice reads in
when the chatbot is launched.
Each stock ticker (e.g. AAPL, CORL, etc.) has its own AIML file that Alice reads in on
startup. This way, when the user sends a question to the chatbot, Alice can try to match
the question to a rule. Since each rule maps to a stock ticker, Alice can send the
stock ticker to the StockHandler for further processing. The alternative would be to have
a list of every single stock ticker and company name, and have Alice check if the user
mentioned any of them along with certain keywords.
Each rule looks for a different way that the user could be asking the price of a stock.
Since there are many different variations, the most common were chosen. This amounts
to about 30 rules for each stock. There are different groups within the rules. There are
groups that look for the company’s name with specific keywords, and others that look for
the stock ticker. Examples of each group of rules for Apple Computer (AAPL):
Company name: _ STOCK MARKET, * APPLE COMPUTER *
_ APPLE COMPUTER'S PRICE *
Stock ticker: _ AAPL *
AAPL _
A complete list of the rules used for AAPL can be seen in Appendix A.
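Because every stock gets the same kinds of rules, a per-ticker AIML file could in principle be produced from a list of pattern shapes. The report does not say how its AIML files were written, so the Python sketch below is purely an assumption; build_aiml and PATTERN_SHAPES are hypothetical names, and only four shapes taken from the examples above are included.

```python
from xml.sax.saxutils import escape

# Pattern shapes modelled on the example rules for Apple Computer (AAPL).
PATTERN_SHAPES = [
    "* {name} *",
    "_ {name}'S PRICE *",
    "_ {ticker} *",
    "{ticker} _",
]


def build_aiml(name, ticker):
    """Return AIML text with one <category> per pattern shape for the given stock."""
    lines = ['<aiml version="1.0">']
    for shape in PATTERN_SHAPES:
        pattern = shape.format(name=name.upper(), ticker=ticker.upper())
        lines.append("<category>")
        lines.append("<pattern>%s</pattern>" % escape(pattern))
        lines.append("<template><stock>%s</stock></template>" % escape(ticker.upper()))
        lines.append("</category>")
    lines.append("</aiml>")
    return "\n".join(lines)
```

Calling build_aiml("Apple Computer", "AAPL") gives four categories whose templates all contain <stock>AAPL</stock>.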
6.3) Replying With an Answer
The final step in processing a question from a user is to respond with a valid
answer. Since the chatbot deals with stock prices, the answer is a message
back to the user with the stock price they were asking about. Once the chatbot has the
answer from Alice, it sends a message with the answer back to the user. This message
will either be part of a chat, or a single instant message.
An example of a chat message is:
<message type='chat' to='[email protected]'
from='[email protected]/Chatbot'><body>ERICY is at $8.93</body></message>
An example of a single instant message is:
<message to='[email protected]'
from='[email protected]/Chatbot'><subject>Hello</subject><html
xmlns='http://www.w3.org/1999/xhtml'><body>ERICY is at
$8.93</body></html></message>
So, an example of a sequence of messages could be:
User: “What is the price of Ericsson?”
Chatbot: “ERICY is at $9.10”
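Building the reply amounts to swapping the two addresses and keeping the type of the incoming message. The Python sketch below illustrates this under several assumptions: build_reply is a hypothetical name, the addresses are placeholders, the subject of a single message is copied into the reply, and the <html> wrapper seen in the single-message example above is left out for brevity.

```python
import xml.etree.ElementTree as ET


def build_reply(incoming_stanza, answer, resource="Chatbot"):
    """Create the reply to one incoming <message>, mirroring its chat/single type."""
    incoming = ET.fromstring(incoming_stanza)
    reply = ET.Element("message")
    if incoming.get("type") == "chat":
        reply.set("type", "chat")
    reply.set("to", incoming.get("from", ""))
    # The chatbot answers from its own address plus a resource name.
    reply.set("from", incoming.get("to", "").split("/", 1)[0] + "/" + resource)
    subject = incoming.findtext("subject")
    if subject:
        ET.SubElement(reply, "subject").text = subject
    ET.SubElement(reply, "body").text = answer
    return ET.tostring(reply, encoding="unicode")
```

Letting the XML library write the body means characters such as “&” or “<” in an answer are escaped automatically.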
7) Testing
There were two parts to the testing phase. The first was to ensure that the chatbot
connected to the Jabber server properly. The second was to use an external Jabber client,
in this case Fire, and have it initiate a chat session with the chatbot. These steps are
detailed, along with what was expected and the actual results, in the sections below.
7.1) What Was Expected
It was expected that the chatbot would be able to successfully connect to the
Jabber server as well as respond with the correct answer to incoming messages from an
outside Jabber client.
7.2) Connecting to the Jabber Server
First off, the chatbot was launched. Once the user interface appeared, the
username was set to “maustin” and the password to “test”. This is an account that was
created earlier on the Jabber server for the chatbot to use. The server address was set to
“24.42.217.7” since this is my home machine’s address (the server was running on it).
Then the “Connect” button was pressed so that the chatbot would begin to connect to the
server. Two messages appeared in the chatbot’s “Incoming Messages” text box which
confirmed that it had connected successfully to the server.
FIGURE 3 - Two messages received while connecting to the server.
In addition, debug output from the Jabber server shows that there is now 1 user, which is
the chatbot. This text is: “Sat Dec 14 17:02:53 2002 usercount 1 total users”
7.3) Responding to Messages
The next step in the testing was to ensure that the chatbot could respond properly
to incoming messages. The external Jabber client used is called “Fire” and was used to
connect to the server with username “user” and password “user”. This account was set
up earlier and used for testing purposes. Messages asking for stock quotes were sent to
the chatbot. The chatbot responded to each question.
FIGURE 4 - Screen shot of the chat session within Fire. Notice that the same question
was asked using different sentences, but the answer was always the same.
FIGURE 5 - A screen shot showing the received messages.
7.4) Results
The result of the testing was that the chatbot performed exactly as expected. It
was able to connect to the Jabber server and respond correctly to the messages it
received. In addition it was able to screen scrape the price of AAPL (Apple Computer)
and figure out the correct answer using Alice.
8) Conclusion
Overall, the chatbot project was a success. Integrating the Cocoa UI, Jabber and
Alice was a challenge considering the UI was developed in Objective-C, the Jabber
routines were in C/Objective-C and the Alice code was in C++. The fact that it all works
together is nice to see, and the end result is a functioning chatbot that is able to give stock
quotes from English questions sent to it.
This project allowed me to learn about topics I didn’t even know about. Jabber
was new to me, and after seeing how easy it is to implement a client I was impressed.
Learning about natural language processing was a bit of a challenge since I had never
researched anything on this topic. After finding an open source version of Alice that was
written in C++, I found it was fairly easy after a few modifications to have it working
with the chatbot. Learning AIML took a while since it was new, but after learning it, the
rules were easy to create. Putting it all together gave me the experience of working with
multiple packages, integrating them, and having a user interface to show the user what
was happening.
Because the chatbot can be easily extended to handle other topics, such as weather
reports, how many items are in stock, or even someone’s telephone number, it lends itself
to business use. For example, it could be used in e-commerce to
quickly find out how much a product costs at a particular store. This would be quick
because the user could ask the question in English and get a rapid response. It would
save the user time from having to go to the store’s webpage, search for the item, and then
find the price.
Perhaps asking questions in a natural manner will be the future of computer
interactions. Certain web pages could be replaced or complemented by a chatbot that
responds to simple user questions. Either way, this project is an excellent start for a
full-fledged chatbot or a specialized, custom chatbot.
8.1) Future Work
There are a few things that could be added to the chatbot to make it even more
useful. They are:
a) Have an intermediate server with a static IP that the chatbot connects to. This means
that any user wishing to communicate with the chatbot only has to know the address of
the intermediate server, which is static.
b) Add a “disconnect” feature that gracefully logs off from the Jabber server.
c) Add functionality to create a new Jabber user through the chatbot.
d) Have the chatbot keep a log of all the incoming/outgoing messages.
e) Add new handlers for Alice to respond to topics other than stock prices. These could
include weather reports or current news.
f) Allow the user to save or print the incoming/outgoing messages.
g) Add rules for different stock tickers.
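Item e) could be approached with a small registry that maps a topic to its handler. The Python sketch below is only an illustration of that idea: the names HANDLERS, register and dispatch, and the placeholder replies, are hypothetical and are not part of J-Alice or of the chatbot's code.

```python
from typing import Callable, Dict

# Maps a topic name to the function that produces its answer.
HANDLERS: Dict[str, Callable[[str], str]] = {}


def register(topic: str):
    """Decorator that files a handler function under a topic name."""
    def wrap(func: Callable[[str], str]) -> Callable[[str], str]:
        HANDLERS[topic] = func
        return func
    return wrap


@register("stock")
def stock_handler(subject: str) -> str:
    # Placeholder reply; a real handler would look up the quote.
    return f"(price lookup for {subject} would go here)"


@register("weather")
def weather_handler(subject: str) -> str:
    # Placeholder reply; a real handler would fetch a forecast.
    return f"(weather lookup for {subject} would go here)"


def dispatch(topic: str, subject: str) -> str:
    """Route a recognised topic to its handler, with a fallback reply."""
    handler = HANDLERS.get(topic)
    if handler is None:
        return "Sorry, I cannot answer that yet."
    return handler(subject)


print(dispatch("weather", "Ottawa"))
```

With this shape, adding a topic means writing one function and one set of rules, without touching the code that receives and sends messages.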
8.2) Bugs
a) The chatbot crashes in the Alice code after numerous messages are sent to it.
Because of time constraints, this bug has not been fixed. Since the Alice code is
third-party, an update from its author might fix the problem.
b) If the Jabber server is not available, the application might crash when it tries to
connect.
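One way to guard against bug b) is to check that the server accepts a TCP connection before starting the login, and to report a failure instead of proceeding. The Python sketch below shows the idea under stated assumptions: server_reachable is a hypothetical helper, the default port 5222 is the standard Jabber client port, and the chatbot's real connection code is in C/Objective-C and may need a different fix.

```python
import socket


def server_reachable(host: str, port: int = 5222, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the server can be opened."""
    try:
        # create_connection raises OSError when the connection is refused,
        # times out, or the host name cannot be resolved.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if not server_reachable("localhost"):
        print("Jabber server is not available; not connecting.")
```

The user interface could then show the message in the “Incoming Messages” text box rather than letting the application crash.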
9) References
Jabber Software Foundation – “Jabber :: About”, 2002. [On-line]
http://www.jabber.org/about/overview.html
Horn, Max & Moore, Jason – “JabberFoX – A Jabber Client for Mac OS X”, 2002. [On-line]
http://jabberfox.sourceforge.net
Func@all – “Func@ll : How to build Jabber 1.4.1 on Apple Mac OS X”, 2002. [On-line]
http://www.funcall.com/Documentation/Jabber/BuildingJabberOnMacOSX.html
The A.L.I.C.E. AI Foundation – “A.L.I.C.E. AI Foundation”, 2001-2002. [On-line]
http://www.alicebot.org
SourceForge.net – “J-Alice :: Home”, 2002. [On-line]
http://j-alice.sourceforge.net
Ringate, Thomas – “ALICE AIML Primer”, 2001. [On-line]
http://www.comp.mq.edu.au/courses/comp248/Resources/aiml-primer.html
Apple Computer, Inc. – “Cocoa Developer Documentation”, 2002. [On-line]
http://developer.apple.com/techpubs/macosx/Cocoa/CocoaTopics.html
10) Licenses
License for J-Alice (from http://www.opensource.org/licenses/mit-license.php):
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the
Software without restriction, including without limitation
the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to
do so, subject to the following conditions:
The above copyright notice and this permission notice shall
be included in all copies or substantial portions of the
Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS
OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Appendix A: Rules for the StockHandler
The file stocks.aiml contains the rules that the StockHandler uses to find
prices for AAPL. The rules fall into three groups: one matches the stock
ticker (AAPL), one matches the company’s name together with specific
keywords, and one does the same with an abbreviated version of the
company’s name. The rules in each group are listed below.
Stock Ticker Group:
AAPL
AAPL _
_ AAPL
_ AAPL *
Company’s Name with Keywords Group:
APPLE COMPUTER STOCK
_ APPLE COMPUTER STOCK *
_ APPLE COMPUTER STOCK
_ STOCK MARKET, * APPLE COMPUTER *
_ APPLE COMPUTER * STOCK MARKET
_ MARKET * APPLE COMPUTER *
_ APPLE COMPUTER * MARKET *
_ MARKET, * APPLE COMPUTER *
_ PRICE * APPLE COMPUTER *
_ PRICE * APPLE COMPUTER
_ APPLE COMPUTER'S PRICE *
_ APPLE COMPUTER'S PRICE
Company’s Abbreviated Name with Keywords Group:
APPLE STOCK
_ APPLE STOCK *
_ APPLE STOCK
_ STOCK MARKET * APPLE *
_ STOCK MARKET, * APPLE *
_ APPLE * STOCK MARKET *
_ APPLE * STOCK MARKET
_ MARKET * APPLE *
_ APPLE * MARKET *
_ MARKET, * APPLE *
_ PRICE * APPLE *
_ PRICE * APPLE
_ APPLES * PRICE *
_ APPLES PRICE *
_ APPLE'S * PRICE *
_ APPLE'S PRICE
_ APPLE'S PRICE *
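
To show how rules like these select a question, here is a minimal Python sketch of wildcard matching. In AIML 1.0 both “_” and “*” stand for one or more words and differ only in matching priority; the sketch ignores priority and uses a simplified normalisation, so it is an approximation and not how J-Alice is implemented. The helper names pattern_to_regex, normalise and matches are hypothetical.

```python
import re


def pattern_to_regex(pattern: str):
    """Turn a rule such as '_ PRICE * APPLE' into a compiled regex."""
    parts = []
    for token in pattern.split():
        if token in ("_", "*"):
            parts.append(r"\S+(?: \S+)*")   # wildcard: one or more words
        else:
            parts.append(re.escape(token))  # literal word
    return re.compile(" ".join(parts))


def normalise(text: str) -> str:
    """Upper-case the input, drop end punctuation, collapse spaces."""
    return " ".join(text.upper().rstrip("?.! ").split())


def matches(pattern: str, text: str) -> bool:
    """True if the whole normalised sentence fits the rule."""
    return pattern_to_regex(pattern).fullmatch(normalise(text)) is not None


print(matches("_ PRICE * APPLE", "What is the price of Apple?"))
print(matches("_ APPLE STOCK", "Apple stock"))
```

Because a wildcard must consume at least one word, “_ APPLE STOCK” does not match the bare input “Apple stock”, which is why each group also lists the plain form next to its wildcard variants.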