Coffee Talk
A Peer to Peer Protocol for Instant Messaging and Chat

Contents
--------

Introduction
Protocol Overview
Peer Session State Protocol
State Desynchronization
Simple Messaging Protocol
Conversations

Introduction
------------

Several chat and instant messaging (IM) protocols are in use today.
However, most of them need a central server to route messages and manage
the chat network. A good example is IRC. When a client wants to
participate in an IRC chat session, it must first connect to one of the
several thousand IRC servers around the world that belong to a particular
IRC network. For example, to connect to the Undernet Network
(http://www.undernet.net), the client must connect to eu.undernet.net to
reach a random Undernet server in Europe. The servers are connected to
one another; each relays messages from the clients connected to it to
other servers, and relays messages from other servers to its clients. The
backbone of such a network is therefore the set of servers that relay
messages to and from clients and other servers.

A problem with such a model is that the network suffers from a single
point of failure -- the server. In the case of IRC, this is quite evident
when the network suffers from what is known as a net split. A net split
occurs when the link between two or more IRC servers is severed (because
of a hardware or software failure), which blocks the relaying of
messages. Messages from one side of the split cannot reach the other
side, and vice versa.

In recent years, there has been a shift in network models to what is now
known as peer-to-peer (P2P) computing. An example of a P2P network is the
Gnutella network. The Gnutella protocol is an open protocol that allows
users to share files. There is no central point of connection -- a
Gnutella client does not need to connect to a central server.
In this model, the network becomes more robust, as each node (called a
peer in P2P parlance) is in a sense both a server and a client in the
network. A Gnutella network is therefore more ad hoc than a network based
on a client-server architecture.

Coffee Talk is a P2P protocol that allows users to exchange messages over
a TCP/IP-based network. Version 0.1 of the protocol uses UDP broadcast or
multicast to discover peers, and a TCP connection to each peer to send
messages. Coffee Talk is not meant to be used in any mission-critical
infrastructure; rather, it is a proof-of-concept protocol for
peer-to-peer services.

Protocol Overview
-----------------

The Coffee Talk protocol is subdivided into two component protocols: the
peer session state protocol and the peer messaging protocol.

The peer session state protocol is a datagram-oriented, connectionless
protocol over UDP that manages the state of peers in a Coffee Talk
network. It uses either a multicast address or the network broadcast
address.

The peer messaging protocol is a stream-based, connection-oriented
protocol over TCP that relays messages from one peer to another; in the
case of IM, a peer relays its messages directly to the intended peer. The
peer messaging protocol creates a dynamic network of connected hosts by
assigning each host a dynamic sequence number, unique to a particular
multi-peer session (called a conversation in the protocol specification).

Peer Session State Protocol
---------------------------

The Peer Session State Protocol is a low-level, datagram-oriented,
connectionless protocol used by Coffee Talk peers to keep track of the
state of other peers and to inform peers of changes in a peer's own
state. As of version 0.1 of the protocol, three state messages are
supported: JOIN, LEAVE, and RESOURCE.

A session state message is a coded packet containing information about
the peer broadcasting the state.
It has the following format:

    Offset  Content
    0       (short) 0xADF0, the packet descriptor
    2       (byte) Major version of the protocol
    3       (byte) Minor version of the protocol
    4       (integer) Message descriptor
    8       (integer) Length of peer name string
    12      (array of bytes) Bytes of peer name string
    12+n    Start of message data

Here n is the length of the peer name string in bytes. All multi-byte
numeric values are stored backwards. Strings are encoded with their
integer length prefixing the actual bytes of the string; note that
Unicode is used by default in Java, and is what the protocol uses to
represent strings.

The message descriptors used by the protocol are as follows:

    Message Descriptor  Message Name
    0x00000001          Session JOIN
    0xFFFFFFFF          Session LEAVE
    0x00000064          Peer RESOURCE advertisement

Each packet is exactly 256 bytes in length, padded with 0x00. Note that
this length may change in future versions of this protocol.

Each session state message can be accompanied by message data specific to
that particular message:

    JOIN:
    Offset  Contents
    0       (integer) Session ID

    LEAVE:
    Offset  Contents
    0       (integer) Session ID

    RESOURCE:
    Offset  Contents
    0       (string) Resource ID

When a peer wants to join the network, it must announce its state by
broadcasting or multicasting a JOIN message. Peers already in the network
must respond with a RESOURCE state message specifying the peer existence
resource. However, if the joining peer's name conflicts with the name of
a peer already in the network, the existing peer must broadcast a
RESOURCE state message specifying the conflict in names. It is up to each
peer to resolve the conflict in its own tables.

Session IDs

The JOIN and LEAVE messages specify a session ID integer. This session ID
may be used in future versions of the protocol for peer groups. There are
1025 reserved IDs, from 0 to 1024. The session ID 0 denotes the global
session, which all peers can be a member of. Session ID 1024 is reserved
for use by diagnostic peers.

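The session state packet layout described above can be sketched in Java
as follows. This is only an illustration under stated assumptions, not
the reference implementation: the class and method names are
hypothetical, the peer name is assumed to be encoded as UTF-8, and the
byte order is assumed to be big-endian (the default of Java's
ByteBuffer), since the text does not spell out which byte order is meant.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch only; the class and method names are hypothetical. */
final class SessionStatePacket {
    static final short PACKET_DESCRIPTOR = (short) 0xADF0;
    static final int JOIN = 0x00000001;
    static final int LEAVE = 0xFFFFFFFF;
    static final int PACKET_LENGTH = 256;
    static final int NAME_OFFSET = 12;

    /** Encodes a JOIN or LEAVE message, whose message data is a session ID. */
    static byte[] encode(int descriptor, String peerName, int sessionId) {
        byte[] name = peerName.getBytes(StandardCharsets.UTF_8); // assumption: UTF-8
        if (NAME_OFFSET + name.length + Integer.BYTES > PACKET_LENGTH) {
            throw new IllegalArgumentException("peer name too long for one packet");
        }
        // allocate() zero-fills the buffer, which provides the 0x00 padding.
        ByteBuffer buf = ByteBuffer.allocate(PACKET_LENGTH)
                .order(ByteOrder.BIG_ENDIAN);  // assumption: big-endian
        buf.putShort(PACKET_DESCRIPTOR);       // offset 0: packet descriptor
        buf.put((byte) 0);                     // offset 2: major version (0.1)
        buf.put((byte) 1);                     // offset 3: minor version
        buf.putInt(descriptor);                // offset 4: message descriptor
        buf.putInt(name.length);               // offset 8: length of peer name
        buf.put(name);                         // offset 12: peer name bytes
        buf.putInt(sessionId);                 // offset 12+n: message data
        return buf.array();
    }

    /** Reads the peer name back out of a received packet. */
    static String peerName(byte[] packet) {
        ByteBuffer buf = ByteBuffer.wrap(packet).order(ByteOrder.BIG_ENDIAN);
        if (buf.getShort(0) != PACKET_DESCRIPTOR) {
            throw new IllegalArgumentException("not a Coffee Talk packet");
        }
        int n = buf.getInt(8);
        if (n < 0 || n > packet.length - NAME_OFFSET) {
            throw new IllegalArgumentException("bad peer name length");
        }
        return new String(packet, NAME_OFFSET, n, StandardCharsets.UTF_8);
    }

    /** Reads the session ID that follows the peer name. */
    static int sessionId(byte[] packet) {
        ByteBuffer buf = ByteBuffer.wrap(packet).order(ByteOrder.BIG_ENDIAN);
        return buf.getInt(NAME_OFFSET + buf.getInt(8));
    }

    public static void main(String[] args) {
        byte[] packet = encode(JOIN, "Barako", 0); // 0 is the global session
        System.out.println(peerName(packet) + " joins session " + sessionId(packet));
    }
}
```

The resulting byte array could be handed to a datagram socket for
broadcast or multicast; the socket handling is left out because the text
does not fix a port or multicast address.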
Resources

The RESOURCE state message identifies a resource held by the peer that
may be of use to other peers. Resources that may be exchanged by peers
include conversation resources, file resources, and tunneling resources.

Conversation resources: A peer that has started a conversation or is part
of a conversation notifies peers of this resource through a RESOURCE
state message. See Conversations for more details.

Other resources: In the future, peers may also notify other peers in the
Coffee Talk network of other available resources, such as files for
sharing. A possible network resource is a tunneling resource, which is a
multi-homed peer that may be used to relay messages from one network
segment to another.

State Desynchronization
-----------------------

It may be possible for a peer's current network state to be out of sync
with another peer's record of that state. In such a case, the peers
involved are said to be state desynchronized. For example, a peer that
has not properly notified its peers of its exit from the network (either
because the peer has suddenly become unreachable, or because the peer's
client has terminated unexpectedly) will become desynchronized with the
peers currently in the network.

In such cases, the desynchronization is ignored by the current
implementation of the protocol; i.e. the unreachable/dead peer's state
will not and cannot be reflected back to the network. However, peers
that discover such a change of state in another peer may notify other
peers via a RESOURCE state message. Note that sending this RESOURCE
state message is not a requirement, and that participating peers may
ignore it.

Simple Messaging Protocol
-------------------------

Two peers in the network communicate with each other through a TCP
connection. The simple messaging protocol is a stream-oriented,
connection-based protocol dependent on TCP. It is a line-oriented message
protocol, based on verb-noun lines.
Each line is a UTF-8-formatted string with two parts: a single-word verb,
and a noun. The following verbs are supported as of v0.1 of the protocol:

    Verb     Meaning
    peer-id  Redundant ID of the connected peer
    message  In-band text message
    multi    Multi-recipient relay message; used for conversations

Verbs other than message are considered meta-data verbs. Meta-data verbs
describe details about the dialogue or about the message payload of the
dialogue.

Nouns under the simple messaging protocol are simply the rest of the
line. Below are the nouns corresponding to each verb:

    Verb     Noun
    peer-id  The ID of the peer involved
    message  The in-band text message
    multi    A compound noun containing the intended conversation ID as
             well as the raw message to be relayed; note that the latter
             part of this compound noun is in fact a raw peer-id or
             message verb-noun pair.

Examples below:

    peer-id Barako!
        - identifies that the peer's (redundant) ID is 'Barako!'

    message Hello, World!
        - the message "Hello, World!" was relayed to the peer

    multi #coffee message Hello, Conversation!
        - the raw verb-noun pair 'message Hello, Conversation!' is
          destined for multiple recipients in the #coffee conversation.

Connections

All connections under the simple messaging protocol are duplex, and must
be reused; i.e. two peers involved in both a dialogue and a conversation
must use the same connection to relay messages.

Conversations
-------------

Conversations are simple messaging protocol sessions involving multiple
peers and message relaying. Peers called conversation loci are
responsible for relaying messages and for selecting the next locus. Each
peer involved in a conversation can act as the conversation locus.

The conversation starts when a single peer creates a conversation
channel. The initiating peer notifies other peers of this channel with a
RESOURCE message identifying the channel. Once the channel has been
advertised in this manner, the conversation is said to be live.

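Advertising a conversation channel through a RESOURCE state message
might look like the sketch below. It is a minimal illustration under
several assumptions that the text does not confirm: the resource ID is
taken to be the conversation ID itself (for example '#coffee'), strings
are encoded as UTF-8, and multi-byte values are written big-endian. The
class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch only; the class and method names are hypothetical. */
final class ConversationAdvert {
    static final int RESOURCE = 0x00000064;
    static final int PACKET_LENGTH = 256;

    /** Writes a string as its integer length followed by its bytes. */
    static void putString(ByteBuffer buf, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8); // assumption: UTF-8
        buf.putInt(bytes.length);
        buf.put(bytes); // throws BufferOverflowException if the packet is full
    }

    /** Reads a length-prefixed string from the buffer's current position. */
    static String getString(ByteBuffer buf) {
        int n = buf.getInt();
        if (n < 0 || n > buf.remaining()) {
            throw new IllegalArgumentException("bad string length");
        }
        byte[] bytes = new byte[n];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Builds a RESOURCE packet whose resource ID names a conversation. */
    static byte[] advertise(String peerName, String conversationId) {
        ByteBuffer buf = ByteBuffer.allocate(PACKET_LENGTH)
                .order(ByteOrder.BIG_ENDIAN); // assumption: big-endian
        buf.putShort((short) 0xADF0);         // packet descriptor
        buf.put((byte) 0).put((byte) 1);      // protocol version 0.1
        buf.putInt(RESOURCE);                 // message descriptor
        putString(buf, peerName);             // name length + name bytes
        putString(buf, conversationId);       // message data: resource ID
        return buf.array();                   // remainder stays 0x00
    }

    /** Extracts the advertised resource ID from a RESOURCE packet. */
    static String resourceId(byte[] packet) {
        ByteBuffer buf = ByteBuffer.wrap(packet).order(ByteOrder.BIG_ENDIAN);
        buf.position(4);
        if (buf.getInt() != RESOURCE) {
            throw new IllegalArgumentException("not a RESOURCE message");
        }
        getString(buf);        // skip the peer name
        return getString(buf); // the resource ID follows it
    }

    public static void main(String[] args) {
        byte[] packet = advertise("Barako", "#coffee");
        System.out.println("advertised resource: " + resourceId(packet));
    }
}
```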
Any number of peers can participate in a live conversation. A peer
wishing to join a live conversation first connects to a peer already
participating in that conversation. If the joining peer is already
connected to a peer that is participating in the conversation, the two
peers must reuse the connection between them.

Distinguishing between dialogues and conversations: A peer may take part
in a conversation while also holding a dialogue with one of the peers in
that conversation. Such a peer should have a way to distinguish between
messages relayed to it from the conversation and messages sent to it
directly by the peer it has a dialogue with. It is up to the client to
make this distinction -- specifically, by processing message verbs
differently from multi verbs.

Message relaying: A conversation locus must act as a relaying center;
that is, when a conversation locus receives a message from a
participating peer, it must relay this message to all other peers
participating in the same conversation. It must identify the source peer
of the message by prefixing the peer's name to the message.
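The relaying rule can be sketched as follows. This is a minimal sketch
rather than the protocol's own implementation: the class and method
names are hypothetical, participants are tracked by peer name only, and
the 'name: text' form of the prefix is an assumption, since the text
says only that the peer's name is prefixed to the message.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; the class and method names are hypothetical. */
final class ConversationLocus {
    // Conversation ID -> names of the peers participating in it.
    private final Map<String, Set<String>> participants = new LinkedHashMap<>();

    void join(String conversationId, String peer) {
        participants.computeIfAbsent(conversationId, k -> new LinkedHashSet<>())
                .add(peer);
    }

    /** Splits a line into its verb and its noun (the rest of the line). */
    static String[] split(String line) {
        int space = line.indexOf(' ');
        if (space < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, space), line.substring(space + 1) };
    }

    /**
     * Handles one 'multi' line received from the given sender and returns
     * the line to send to each other participant, keyed by recipient name.
     */
    Map<String, String> relay(String sender, String line) {
        String[] outer = split(line);
        if (!outer[0].equals("multi")) {
            throw new IllegalArgumentException("not a multi line");
        }
        String[] compound = split(outer[1]); // conversation ID + raw pair
        String conversationId = compound[0];
        String raw = compound[1];
        String[] inner = split(raw);         // raw peer-id or message pair
        if (inner[0].equals("message")) {
            // Assumed prefix format: "<sender>: <text>".
            raw = "message " + sender + ": " + inner[1];
        }
        Map<String, String> out = new LinkedHashMap<>();
        Set<String> peers =
                participants.getOrDefault(conversationId, Collections.emptySet());
        for (String peer : peers) {
            if (!peer.equals(sender)) {      // never echo back to the source
                out.put(peer, "multi " + conversationId + " " + raw);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ConversationLocus locus = new ConversationLocus();
        locus.join("#coffee", "Barako");
        locus.join("#coffee", "Arabica");
        locus.relay("Barako", "multi #coffee message Hello, Conversation!")
                .forEach((peer, out) -> System.out.println(peer + " <- " + out));
    }
}
```

A receiving client can apply the same split to tell dialogue traffic from
conversation traffic: a line whose verb is message belongs to the
dialogue, while a line whose verb is multi belongs to the conversation
named in its noun.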