The Gnutella flooding search algorithm has a fatal flaw: it sends too many messages for widely distributed files while sending too few messages for rare files. Common queries consume the processing and bandwidth resources of nodes, diminishing network performance. "GUESS" is a technique for dramatically improving the searching architecture to alleviate these problems. In GUESS, nodes perform iterative unicast searches of Ultrapeers, or "Ultrapeer crawling." The crawl terminates as soon as a desired number of results is achieved, limiting the horizon of searches for widely distributed content. In addition to improving search results for rare files, switching to GUESS should reduce the number of messages passing through Ultrapeers by several orders of magnitude. This substantially reduces the bandwidth, memory, and CPU costs of remaining an Ultrapeer, making it more likely users will keep their nodes running instead of turning them off to free resources.
The use of broadcast searches with high Time To Live (TTL)s on the Gnutella network uses a great deal of bandwidth and provides little control over the propagation of messages.[1] This document seeks to alleviate both problems through the use of iterative unicast searches of Gnutella Ultrapeers.[2] In this scheme, a client continuously queries Ultrapeers with a TTL of 1 until the desired number of search results is achieved. Due to the number of nodes that may be dynamically queried in this model, these messages are sent over UDP in the absence of static TCP connections. This proposal is not intended to replace work done in areas such as query meshes. (See [3] and [4]) It does not, for example, easily allow existing web servers to be modified to service Gnutella queries. Rather, it combines aspects of several powerful ideas from different parties, including the importance of carefully controlling query propagation and the potential for queries and hits to be sent over UDP, making such a system feasible.
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.[5] An implementation is not compliant if it fails to satisfy one or more of the MUST or REQUIRED level requirements for the protocols it implements. An implementation that satisfies all the MUST or REQUIRED level and all the SHOULD level requirements for its protocols is said to be “unconditionally compliant”; one that satisfies all the MUST level requirements but not all the SHOULD level requirements for its protocols is said to be “conditionally compliant.”
The current broadcast search model consumes excessive bandwidth and produces a high load on nodes. This occurs because:
- The number of nodes queried per search is uncontrolled.
- Even if the number of nodes queried per search were constant, the number of query hits generated per search would still be highly variable.
The first problem is a result of the volatile, ad-hoc nature of Gnutella. Nodes come and go unpredictably, making the connectivity of different parts of the network highly variable, or at least potentially so. The current query model accounts for this volatility by flooding -- it always takes whatever it can, broadcasting each query to every neighbor with a high TTL.
As a result, searches for common keywords in highly connected areas of the network have a disproportionate impact on network load, while searches for less common keywords in less connected areas are not able to reach enough nodes to obtain a satisfactory number of query hits.
While the unpredictable nature of queries presents the first half of the problem, the unpredictable nature of query hits has a comparable debilitating effect. The problem with queries spills over into hits -- a variable number of nodes queried results in a variable number of query hits. The problem is more serious than this, however. The number of query hits generated per search also varies independently because:
- Some searches match a far higher percentage of files than others (a search for "txt" produces more results than a search for "The_Gettysburg_Address.txt").
- Some nodes share more files than others, so the query hits depend not only on the number of nodes queried, but on which nodes the query happens to reach.
With this system, users frequently get more results than they need for popular content while they have a difficult time finding files that are not as widely distributed. The current system provides no mechanism for dynamically adjusting the search depending on these factors -- the central issue GUESS addresses.
This proposal mitigates these issues by reducing the TTL to 1 on outgoing queries and by sending queries to one Ultrapeer at a time until some desired number of results is received or a limit on the number of Ultrapeers queried is reached. Such a change grants the client initiating a query substantially more control over the number of nodes the query reaches and over the number of query hits generated. As such, it takes a significant step towards solving the primary two problems with the current query model noted above. It does not eliminate these issues because Ultrapeers have varying numbers of leaves, nodes still share varying numbers of files, and some searches will still return far more results from given Ultrapeers than others. Nevertheless, this change dramatically mitigates the effects of these problems and makes the Gnutella network far more scalable.
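The iterative unicast search sketched above reduces to a simple loop. In this sketch, `query_ultrapeer` is a caller-supplied stand-in for sending a TTL=1 UDP query to one Ultrapeer and collecting its hits; the limits are the illustrative values used later in this document, and the mandatory pause between queries is omitted for brevity.

```python
# Sketch of the GUESS client-side search loop. query_ultrapeer(host)
# is assumed to send one TTL=1 query over UDP and return the hits
# received in reply; all limits here are illustrative, not normative.

def guess_search(ultrapeers, query_ultrapeer,
                 enough_results=100, max_ultrapeers=1000):
    """Query Ultrapeers one at a time until enough results arrive
    or the Ultrapeer budget is exhausted."""
    results = []
    for queried, host in enumerate(ultrapeers):
        if queried >= max_ultrapeers or len(results) >= enough_results:
            break
        results.extend(query_ultrapeer(host))
    return results
```

Because each Ultrapeer is queried individually, the client can stop the moment either termination condition is met, which is exactly the control that flooding lacks.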
To implement GUESS, clients must send queries iteratively
to known Ultrapeers supporting GUESS,
stopping when enough results are received. If a TCP connection
to an Ultrapeer is available, it MAY be sent a query with TTL=1
regardless of whether or not that Ultrapeer supports
GUESS. Otherwise, queries are sent to GUESS Ultrapeers over
UDP. On the server side, GUESS nodes must listen on a
UDP port, and they must send their responses over the same
UDP port.
The following sections discuss the details of these changes. In
this discussion, the "client" is the node initiating the
query, either on its own behalf or on behalf of one of its
leaves. The client can be either a leaf or an Ultrapeer,
although, again, leaf-controlled queries are OPTIONAL. The
"server" is the receiver of the query, which is always an
Ultrapeer. Developers implementing this proposal MUST implement
both the client side and the server side. This means that any
developers wishing to implement GUESS also MUST implement
the Ultrapeer proposal[2].
Clients send queries to Ultrapeers one by one until one of the following occurs:
- The desired number of results is received.
- The limit on the total number of Ultrapeers to query is reached.
If TCP connections to Ultrapeers are available, the client
SHOULD first send TTL=1 queries to those Ultrapeers. This makes
it possible to search Ultrapeers that do not implement GUESS. To
make sure queries do not flood the network with too
much traffic, the client MUST pause for a reasonable amount
of time between each query, perhaps about 200 milliseconds. This
pause accounts for network latency, as it takes a variable
amount of time to receive results from an Ultrapeer
and its leaves. The interval allocates time to receive these
hits. During the interval, the desired number of results may
be reached, making sending the query to another Ultrapeer
unnecessary. The optimal
interval between searches should be determined by
experimentation, but developers MUST send as few queries
as possible without degrading user experience and without
prohibitively increasing the load on participating
Ultrapeers. Implementations may also vary the interval
between queries depending on how many Ultrapeers
have already been reached. For example, a simple algorithm would
be to set the initial interval between queries to 1,500
milliseconds and multiply that interval by .8 on each
iteration. This allows the query to stop quickly if it receives
enough results from the first few Ultrapeers searched. In
this scheme, the interval could eventually reach some absolute
minimum interval and stay there. In any case, the interval
MUST never fall below the absolute minimum required by GUESS.
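The decaying-interval schedule described above (start at 1,500 milliseconds, multiply by 0.8 each iteration, never drop below a floor) can be computed as follows. The 200 millisecond floor is an illustrative assumption; this document leaves the actual minimum to experimentation.

```python
# Sketch of the decaying query interval: 1,500 ms initially,
# multiplied by 0.8 per iteration, clamped to a floor. The 200 ms
# floor is illustrative; GUESS leaves the exact minimum open.

def query_interval_ms(iteration, initial=1500.0, decay=0.8, floor=200.0):
    """Interval to wait before sending the next query, given how
    many Ultrapeers have already been queried."""
    return max(floor, initial * decay ** iteration)
```

This lets a query stop quickly when the first few Ultrapeers return enough results, while later iterations pack queries more tightly for searches that are clearly going to need many hosts.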
Developers also have some flexibility in how their algorithms
determine how many results are "enough." A simple
algorithm would be for the client to continue
querying until it receives 100 results or queries 1,000
Ultrapeers, for example. An alternative would be for the
number of results considered "enough" to decrease as a function
of the number of Ultrapeers searched. This algorithm would
recognize that if a search has returned no
results after querying 500 Ultrapeers, it is unlikely
to get very many results from querying the next 500,
and it may be satisfied as soon as it receives any results
at all and stop there. This would reduce the total number
of Ultrapeers required to service queries for rare files.
The details of these algorithms should also be determined
through experimentation by Gnutella developers, again keeping
in mind that the overall goal is to reduce query and query
hit traffic for everyone while maintaining current search
performance for common files and improving search performance
for rare files. Here, "search performance" is measured as
providing results for the desired file and does not necessarily
correspond with the raw number of results received. Clients
should also keep in mind that an overly aggressive
implementation will ultimately damage their own clients
through increasing everyone's overall network load.
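One way to realize a threshold that relaxes as the search proceeds is sketched below. All constants are illustrative assumptions; the point is only that a query which has reached many Ultrapeers with few hits should settle for whatever it has.

```python
# Sketch of an "enough results" threshold that decreases as a
# function of Ultrapeers searched. The constants (initial target
# of 100, relaxing every 50 Ultrapeers) are illustrative only.

def enough_results(ultrapeers_queried, initial=100, floor=1,
                   relax_after=50, step=10):
    """Return the result count considered 'enough' at this point
    in the search. The threshold drops by `step` for every
    `relax_after` Ultrapeers queried, never going below `floor`."""
    threshold = initial - step * (ultrapeers_queried // relax_after)
    return max(floor, threshold)
```

With these values, a search that has queried 500 Ultrapeers is satisfied by any result at all, matching the intuition in the text that the next 500 Ultrapeers are unlikely to do much better.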
When performing this search, there are several rules that
clients MUST follow. These are:
These numbers should be considered the absolute limits, and they are not the settings developers should use. Again, the optimal limits should be determined by experimentation, but the above rules always apply. Developers should keep in mind that Gnutella is a network that relies on the fact that other clients are not overly selfish or abusive -- Gnutella relies on trust to a large degree. The reduction in traffic should reduce the bandwidth, CPU, and memory load on all Ultrapeers, but this is only possible if developers use conservative values when writing their implementations.
Ultrapeers acting as search proxies for leaves SHOULD discontinue an individual leaf query if the leaf that initiated the query disconnects.
If leaves are able to receive incoming UDP packets, they are REQUIRED to perform their own GUESS queries. If leaves are not firewalled, they will be able to receive incoming UDP packets without a problem. Even if leaves are firewalled, however, they will likely still be able to receive incoming UDP packets. This is possible because many firewalls will allow incoming UDP packets if the firewalled host has sent an outgoing packet to the same IP and port of the incoming packet. To determine whether or not they are able to receive incoming UDP packets, leaves MUST send UDP pings to GUESS Ultrapeers upon joining the network. If the leaf receives an incoming UDP pong, it MUST perform GUESS queries on its own without going through an Ultrapeer proxy.
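The capability probe described above can be sketched with plain UDP sockets. The ping payload, one-second timeout, and probe logic here are illustrative assumptions, not the actual Gnutella ping message format.

```python
import socket

# Sketch of a leaf's UDP self-test: send a ping to a GUESS
# Ultrapeer and see whether any reply arrives. The payload and
# timeout are illustrative, not the real Gnutella ping format.

def can_receive_udp(ultrapeer_addr, ping=b"GUESS-PING", timeout=1.0):
    """Return True if a UDP reply arrives before the timeout.
    A reply means the leaf can receive incoming UDP and MUST run
    its own GUESS queries instead of using an Ultrapeer proxy."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(ping, ultrapeer_addr)
        data, _ = sock.recvfrom(1024)
        return len(data) > 0
    except socket.timeout:
        return False
    finally:
        sock.close()
```

Note that this works through many firewalls precisely because the leaf's outgoing ping opens a return path for the pong to the same IP and port.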
The changes on the server side are less significant. There are no changes for leaves, as leaves do not act as servers, although leaves MAY choose to accept incoming UDP messages. Ultrapeers MUST, however, open a port for incoming UDP traffic, and they MUST use the same port that they are using for incoming Gnutella messages over TCP. The details of this are discussed in the section on UDP. When a server receives a message over its open UDP port, it MUST send any query hits via UDP and over the same port -- it MUST NOT use an ephemeral port for sending the reply. As in the current Gnutella network, all hits are sent back to the sender. In addition, the server MUST respond with a pong, as discussed in the following section.
This pong serves the following two purposes:
- It acknowledges that the server received the query.
- It supplies host information for other GUESS Ultrapeers, giving the client additional hosts to query.
These pongs also give Ultrapeers moderate control over the number
of incoming messages. If an Ultrapeer is becoming overloaded, it
MAY choose to stop sending pongs to incoming queries, effectively
removing it from client lists of Ultrapeers to query. If developers
choose to do this, they MUST stop sending pongs regardless of the
vendor sending the incoming query -- they MUST NOT favor certain
clients over others when sending pongs, unless specific vendors are
clearly violating the requirements of GUESS. Finally, these pongs
are REQUIRED to have the same GUID as the incoming query.
As a result of these rules, all pong acknowledgements MUST have
the GGEP extension indicating GUESS
support.
In all other respects,
servers should respond to messages just as if they
received them over TCP. Servers MUST accept all of the
traditional Gnutella messages over their UDP port. These
messages are defined in
the Gnutella Protocol Specification v0.4[13].
Ultrapeers also MUST start forwarding TTL=1 queries received over UDP to leaves. Without this change, queries would have to be sent with TTL=2, which would lessen the fine-grained control over the query and forfeit benefits such as the elimination of cycles.
Finally, developers MAY choose to implement only the server side
of GUESS during their initial testing. If developers choose to
implement the client side, they are REQUIRED to implement the
server side as well.
For this scheme to work, hosts must have the ability to discover Ultrapeers that support GUESS. In fact, Ultrapeer discovery may be one of the most challenging components of this proposal, as Ultrapeers do not simply have to discover other Ultrapeers -- they have to discover LOTS of them. This section discusses the various techniques for discovering GUESS Ultrapeers. These techniques can be used together to discover enough GUESS Ultrapeers to support searches while not flooding the network with ping and pong traffic.
The first method for discovering Ultrapeers that support GUESS is to use the traditional Gnutella broadcast ping. In this method, hosts simply send broadcast pings as they normally would. The host then checks incoming pongs for the GGEP extension marking GUESS support, and adds these marked pongs to its cache. This method of host discovery has significant disadvantages, however. First, it uses a lot of bandwidth if nodes are not implementing pong caching. Second, it is quite likely that a second broadcast ping will yield many pongs for hosts that are already in the cache from previous broadcasts. Given these factors, broadcast pings should be the least preferred method of host discovery.
Over the course of a query, clients discover new servers through the pong acknowledgements they receive. These pongs contain host information for other GUESS Ultrapeers the client may not have previously known about, allowing the query to continue. This is a preferred method of discovery, as the host information is built into the acknowledgement, and so creates no extra network traffic.
Hosts wishing to refresh their cache can also send unicast pings over UDP to known hosts supporting GUESS. These pings MUST be sent with TTL=1, as they are not intended for broadcast. Upon receiving such a ping, hosts MUST reply with cached pongs for other Ultrapeers supporting GUESS. The receiver MUST send a moderate number of these pongs, if available, anywhere from 5 to 20. The best number of pongs to return should also be determined by experimentation. The host returning these pongs, however, MUST NOT include a pong for themselves, as the host sending the ping presumably already has this information.
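A server's handling of such a unicast ping can be sketched as follows. The batch bound of 20 pongs comes from the text above; the random sampling strategy is an illustrative assumption, since the document leaves selection policy to experimentation.

```python
import random

# Sketch of answering a TTL=1 unicast ping: return a moderate
# batch of cached pongs for other GUESS Ultrapeers, never
# including the responding host itself. Random sampling is an
# illustrative choice; the spec leaves selection policy open.

def pongs_for_ping(pong_cache, self_addr, max_pongs=20):
    """Pick up to max_pongs cached Ultrapeer addresses, excluding
    our own address, for inclusion in the pong reply."""
    candidates = [addr for addr in pong_cache if addr != self_addr]
    return random.sample(candidates, min(max_pongs, len(candidates)))
```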
Hosts capable of sending GUESS queries are REQUIRED to report this fact in a new Gnutella 0.6 connection header.[14] The inclusion of this header indicates that the host sending it may perform GUESS-style queries if it acts as an Ultrapeer proxy. The new header field name is "X-Guess," and the new header field value is the version number supported. The version of this document corresponds with the version of the protocol. So, for example, a complete GUESS connection header would be:
X-Guess: 0.1
This allows leaves to prefer connections to Ultrapeers supporting GUESS, or for GUESS Ultrapeers to prefer connections to other GUESS Ultrapeers.
In the current Gnutella network, all messages are
sent using TCP, so the most obvious implementation of this
proposal would use a new, transient connection also over TCP.
Opening and closing TCP connections incurs significant CPU,
bandwidth, and memory costs, however, potentially making such a
change in architecture unworkable using TCP. Moreover, Windows
95/98/Me do not allow more than 100 TCP connections. While this
setting can be changed in the Windows registry, these systems
were clearly not designed to handle large numbers of
simultaneous connections. As opposed to UDP, TCP also
uses significantly more bandwidth and increases delay
due to re-transmission.
As others have noted,
the reliability of TCP is not a requirement for Gnutella
messages.[6] If a message is lost,
who cares? In fact, these queries and their associated hits
can easily be sent over UDP. In many ways, UDP is the more
appropriate transport layer protocol, as this scheme sends a
large number of messages to a volatile set of nodes
very quickly, making performance a concern while reliability
is not a requirement. In fact, with the high transience of
Gnutella nodes, reliability cannot be expected and is an
impediment to search performance. UDP also arguably
simplifies the algorithm for searching a large number of
nodes, as you no longer need to worry about issues such
as timeouts.
Clients wishing to
implement this change MUST do so over UDP, as a TCP
implementation would incur excessive overhead for other
nodes, and would be impossible without a new, transient
connection. If a TCP connection already exists, Ultrapeers
MAY send messages just as if the connection were over UDP,
using TTL=1.
To implement this change, Ultrapeers MUST open a UDP port that listens for incoming UDP traffic, as mentioned in the section on server-side changes. It is RECOMMENDED that Ultrapeers listen on port 6346, the same port registered for Gnutella for TCP. Ultrapeers MAY, however, listen on a different port, particularly when, for example, there is another Gnutella client listening on 6346, or when another application is using that port for any reason. In all cases, clients MUST listen on the same port for both TCP and UDP traffic. While this makes the implementation slightly more rigid, the IP and TCP port are already reported in a number of Gnutella messages, headers, and extensions, and this choice makes the reuse of that information possible.
One difference between UDP and TCP is that UDP does
not perform any segmenting of datagrams on its own:
it sends a single datagram that may be split into
multiple packets at the IP layer, either at the
originating host or at an intermediate router.
This fragmentation depends upon the
Maximum Transmission Unit (MTU) of the underlying
link-layer.[7]
Fragmentation of datagrams in itself is far from
disastrous. The IP layer reassembles packets into
complete datagrams at the destination host, making
the process largely transparent to application
developers. The danger lies, however, in the
possibility that individual packets are lost. If
any fragment is lost, the entire datagram
is lost.[7] It is
therefore RECOMMENDED that clients take steps
to minimize the size of their datagrams to avoid
excessive fragmentation. The MTU of modem links
can be prohibitively small, as low as 296 bytes,
so we make no attempt to remain below this
threshold.[8] These links
should only occur on the edges of the network,
however, as long as Ultrapeer election algorithms are
correctly measuring bandwidth. This means that any
fragmentation that may occur along modem links will
likely result in little to no packet loss, so we
need not consider this barrier when determining
datagram sizes. In general, clients SHOULD limit the
size of their datagrams whenever appropriate. A limit
of 512 bytes is very conservative, and limiting datagrams
to 1,500 bytes or less should avoid fragmentation on
the vast majority of
routers.[8] This is because
1,500 bytes is the MTU for Ethernet links, which most
TCP/IP stacks take into account. Clients should stay
significantly under this limit if possible. Moreover,
IP headers are at least 20 bytes, and UDP headers are
8 bytes. In addition, there are many other
protocols, such as PPP, that can add bytes of their
own. As a result, developers should stay significantly under
the 1,500 byte limit for Gnutella message data. A limit
of 1K should suffice in all cases, with a hard upper limit
of 1,400 bytes.
It will often not be possible to keep query hits under
this limit. To address this problem,
it is RECOMMENDED that developers break up large
query hits into multiple smaller
query hits. This will increase the bandwidth required
to return results only slightly in most cases (due to sending
the same header multiple times) while reducing or eliminating
fragmentation. It also avoids the current "all or nothing"
approach where all results from a host are lost if one
packet is lost. In this scheme, some hits can still get
through when a packet from another hit is lost.
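A minimal sketch of this splitting follows. The 1,000-byte budget matches the recommendation above; the fixed per-message header overhead is an illustrative assumption standing in for the Gnutella message header plus query-hit preamble.

```python
# Sketch of splitting one large result set into several smaller
# query-hit messages, each fitting within a datagram budget. The
# 1,000-byte budget follows the text; the 34-byte header overhead
# is an illustrative stand-in for the real query-hit framing.

def split_hits(results, max_payload=1000, header_size=34):
    """Greedily pack serialized results into messages whose total
    size (header plus results) stays within max_payload."""
    messages, current, size = [], [], header_size
    for r in results:
        if current and size + len(r) > max_payload:
            messages.append(current)
            current, size = [], header_size
        current.append(r)
        size += len(r)
    if current:
        messages.append(current)
    return messages
```

The bandwidth cost is one repeated header per extra message, while the loss of any single packet now forfeits only the hits in that message rather than the whole result set.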
Another significant difference between TCP and UDP is that UDP does not provide congestion control. Given that this query scheme dramatically reduces overall message traffic, congestion may not be a concern. Particularly because clients will no longer receive the floods of query hits currently associated with queries for popular content (probably the most severe case of congestion on the current network), packet loss rates under this scheme should be significantly lower. If congestion does prove to cause a high degree of packet loss, however, clients may be forced to implement congestion control at the application layer. Currently, Ultrapeers have some control over the traffic coming into their UDP receive buffers because they have the option to stop sending pongs if they detect incoming packets are being dropped due to congestion. Beyond this simple step, however, GUESS provides no way of controlling congestion. GUESS takes the preventative approach of designing a lightweight searching architecture from the outset. While other steps to control congestion may be necessary (such as implementing flow control algorithms), they are outside the scope of this proposal.
Ultrapeers that support GUESS MUST advertise that fact in a new GGEP extension in pongs.[9] The GGEP extension should have the value "GUE" as its extension header. The extension value will be 1 byte describing the protocol revision number. The most-significant nibble will be an unsigned integer describing the major revision (the current major revision number is 0, hence 0000b). The least-significant nibble will be the minor revision number (the current minor revision number is 2, hence 0010b). Note that the nibbles represent the numbers with the most-significant bit first. Moreover, this limits revision numbers to a maximum of 15 for both major and minor revisions (there will never be a 1.16 or a 16.5 revision). This allows 256 possible unique revision numbers, which should suffice for the life of the protocol.
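The nibble packing above can be expressed directly. For the current protocol version 0.2 the encoded value is 0x02; function names here are illustrative.

```python
# Packing the GUESS revision into the 1-byte "GUE" extension
# value: major revision in the high nibble, minor in the low
# nibble. Version 0.2 therefore encodes as 0x02.

def encode_guess_version(major, minor):
    """Pack revision 'major.minor' into the 1-byte GGEP value."""
    if not (0 <= major <= 15 and 0 <= minor <= 15):
        raise ValueError("major and minor revisions are limited to 0-15")
    return bytes([(major << 4) | minor])

def decode_guess_version(value):
    """Unpack the 1-byte GGEP value back into (major, minor)."""
    return value[0] >> 4, value[0] & 0x0F
```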
GUESS makes several careful design decisions. In particular, the choice to have leaves send query hits through their Ultrapeer instead of sending them directly to the node initiating the query warrants more discussion. Sending the reply through the Ultrapeer first allows the Ultrapeer to add the leaf to its push routing tables. If the leaf is firewalled, this makes it possible for the Ultrapeer to act as a proxy for the push request. In addition, if leaves were to send replies directly back to the node initiating the query, the IP address and port of the querying node would have to somehow be added to the query itself, either through a GGEP extension or through the Ultrapeer modifying the query on the fly before sending it to the leaf. This makes the system significantly more complicated, and it could easily allow a DDoS attack by spoofing the host to reply to, depending on the implementation.
During the initial rollout, GUESS can peacefully co-exist with
the current network quite easily. Leaves not supporting
GUESS can still connect to GUESS Ultrapeers. Similarly,
leaves supporting GUESS can connect to non-GUESS
Ultrapeers. In a first implementation, hosts MAY choose to
implement a hybrid query scheme until enough nodes on the
network support GUESS. For example, a node could combine
a GUESS-style query with conservative values for the total
numbers of nodes to query and the desired number of results
along with a traditional broadcast query sent with
TTL=4. If developers decide to do this, it MUST be only a
temporary solution, as GUESS improvements will
only be fully seen if traditional broadcasts are
abandoned in favor of GUESS.
As a first step, developers also MAY choose to only
implement the server side of GUESS. This will allow GUESS
searches to be easily tested without at first using the
GUESS infrastructure.
In addition, when GUESS nodes receive incoming messages over
TCP, they SHOULD handle them just as they handled them
prior to GUESS.
In the past, a principal objection to using UDP has been that it allows anyone to easily execute a DDoS attack on any target machine. This concern has been based on the assumption that queries would require an extension listing the IP address and UDP port to reply to, however. In this proposal, this extension is not required, as responses are always sent directly back to the node that sent the query, rendering such an attack impossible.
Adoption of this proposal has several additional benefits. For example, concern for cycles in intra-Ultrapeer connections is eliminated. In the current network, cycles can be a serious problem in the worst case. As a general rule, the number of cycles increases as the connectivity of the network graph increases. This is problematic because there are significant benefits to having a more highly connected graph. These cycles result in nodes receiving many duplicate messages, wasting bandwidth, CPU, and memory.(See [1] and [11]) GUESS eliminates these duplicates except in the case where leaves are connected to multiple Ultrapeers, and two or more of their Ultrapeers are sent the same query.
This query scheme gracefully handles push downloads. In fact, it incorporates many of the ideas of the Push Proxy proposal.[12] This scheme does not, however, allow two firewalled hosts to download from each other, as in the "Download Proxy" proposal.[10] In the current network, push requests frequently fail, primarily because the node serving a file may be 7 Gnutella hops away from the node requesting a file, and the request has to travel through all intervening nodes. As a result, if any node along that path leaves the network or is otherwise unable to pass the push request, the push will not reach the intended node. With the adoption of this proposal, success rates for push requests should increase dramatically, as the node serving the file will only be from 1 to 3 hops away (depending on whether the searching and replying nodes are leaves or Ultrapeers).
Another benefit of this scheme is that the user manually "stopping" a query can, in fact, stop that query from being sent to more hosts, saving network resources. This does not apply, of course, in the case where an Ultrapeer proxies a query on a leaf's behalf.
Susheel Daswani
LimeWire LLC
EMail: sdaswani@limewire.com
URI: http://www.limewire.org

Adam A. Fisk
LimeWire LLC
EMail: afisk@limewire.com
URI: http://www.limewire.org
The authors would like to thank Christopher Rohrs and the rest of the LimeWire team. In addition, we would like to thank Gordon Mohr of Bitzi, Inc., Jakob Eriksson, Ph.D. student at the Computer Science department at the University of California, Riverside, Jason Thomas of Swapper, Inc., Raphael Manfredi of Gtk-Gnutella, Michael Stokes of Shareaza, Philippe Verdy, Sam Berlin, all participants in the Gnutella Developer Forum (GDF), and all members of the LimeWire open source initiative.