The traditional Gnutella broadcast search model handles all searches identically -- a search for "mp3" is broadcast to as many nodes as a search for "Gettysburg Address". As a result, searches for "mp3" return far more results than they need, wasting bandwidth for both query and query hit traffic, while searches for "Gettysburg Address" do not receive the desired number of results. This proposal improves upon this search model by making the query dynamic, with Ultrapeers connecting to more nodes and dynamically adjusting the TTL of outgoing queries along specific connections to use just enough bandwidth to satisfy the query.
Dynamic queries offer a way to receive the desired number of results for a given search, with the goal of maximizing the likelihood that a user will successfully download the desired file. Put another way, dynamic querying seeks to hit the minimum number of nodes necessary to obtain the desired number of results for a given search.
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.[1] An implementation is not compliant if it fails to satisfy one or more of the MUST or REQUIRED level requirements for the protocols it implements. An implementation that satisfies all the MUST or REQUIRED level and all the SHOULD level requirements for its protocols is said to be “unconditionally compliant”; one that satisfies all the MUST level requirements but not all the SHOULD level requirements for its protocols is said to be “conditionally compliant.”
On a high level, this change involves increasing the number of connections that Ultrapeers maintain, decreasing the Time to Live (TTL) of outgoing searches, and sending a search to one Ultrapeer at a time, waiting for results and adjusting the TTL on each iteration based on the number of results received, the number of connections left, and the number of nodes already theoretically reached by the search. When the desired number of results is received, the query ends. Each of these changes is discussed in detail below.
In this scheme, the Ultrapeer controls all aspects of the query. Upon receiving a query from a leaf, the Ultrapeer ignores the TTL of that query and begins a dynamic query. When an Ultrapeer receives a query from another dynamic-query Ultrapeer, it MUST broadcast that query just as it normally would.
Clients implementing dynamic querying MUST increase the degree of their Ultrapeer connections, as increasing the degree significantly increases control over the number of nodes hit in a query, maximizing its bandwidth-saving effects. Clients implementing this proposal MUST maintain a minimum of 15 connections to other Ultrapeers, and it is RECOMMENDED that clients increase their degree to 30 or more.
Clients implementing dynamic querying also MUST decrease the TTL of outgoing
queries. One of the biggest problems with the traditional broadcast search model
was that a node had little control over the horizon of a query. Once a query was
sent, it would attempt to hammer hundreds of thousands of nodes all at once. The
combination of a higher-degree network and lower TTLs dramatically reduces this
problem because each connection represents a much smaller node horizon.
For example, on a degree 6 network where each query is sent with a TTL of
7, a query sent down one connection theoretically would reach 19,531
Ultrapeers (barring cycles in the network graph). In contrast, on a degree
32 network where each query is sent with a TTL of 3, a query sent down one
connection theoretically would reach only 993 Ultrapeers. While the total
horizon is also smaller on a degree 32, TTL 3 network, the point is that
increasing the degree and lowering the TTL allows the algorithm to send the
query to far fewer nodes on each iteration, allowing finer-grained control
over the query and a more accurate dynamic calculation of what the TTL for
the next iteration should be.
In choosing the degree and the maximum TTL, clients MUST NOT
exceed 200,000 Ultrapeers for their maximum theoretical search horizon -- the
number of Ultrapeers they will hit in searches for the rarest of files. This
translates into an overall horizon of 6,000,000+ nodes when taking leaves into
account, which should be plenty for any search, and which is only possible on a
network performing close to optimally.
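As a rough illustration of the arithmetic above, the following sketch (in
Java, with illustrative names that are not part of any specified API)
computes the theoretical per-connection horizon using the hosts formula
given later in this document, and checks a degree/TTL pair against the
200,000-Ultrapeer cap:

// Sketch of the horizon arithmetic described above.
public final class HorizonMath {

    // Theoretical Ultrapeers reached down ONE connection:
    // hosts(degree, ttl) = Sum[(degree-1)^i, 0 <= i <= ttl-1]
    static long hostsPerConnection(int degree, int ttl) {
        long total = 0;
        long term = 1;                       // (degree-1)^0
        for (int i = 0; i < ttl; i++) {
            total += term;
            term *= (degree - 1);
        }
        return total;
    }

    // Maximum theoretical horizon if the query went to every connection.
    static long maxHorizon(int degree, int maxTtl) {
        return (long) degree * hostsPerConnection(degree, maxTtl);
    }

    public static void main(String[] args) {
        System.out.println(hostsPerConnection(6, 7));    // 19531
        System.out.println(hostsPerConnection(32, 3));   // 993
        // A compliant client MUST keep this below 200,000:
        System.out.println(maxHorizon(32, 3));           // 31776
    }
}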
This section discusses all aspects of a single dynamic query, including probe queries, search intervals, TTL calculation, etc.
When a client first begins a dynamic query, it has no way of knowing how
common the desired file is. As such, it's unclear what TTL is appropriate for
the first connection. To gather more information, it is necessary to send a
"probe" query that gives some idea as to the popularity of the file. A simple
approach is to send a TTL=2 query down 3 connections and wait about 6 seconds
for results. If no results are received, the next query may be sent with a
higher TTL, such as 3. If the probe returns 30 results, however, it would make
sense for the subsequent TTL to be lower, perhaps 1.
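A minimal sketch of this simple probe, assuming hypothetical Connection and
result-counting interfaces (the timings and thresholds come from the example
above):

import java.util.List;

// Illustrative probe step; Connection and QueryResultCounter are
// hypothetical stand-ins, not a specified API.
final class ProbeSketch {
    static final int PROBE_TTL = 2;
    static final int PROBE_CONNECTIONS = 3;
    static final long PROBE_WAIT_MS = 6_000;

    interface Connection { void sendQuery(int ttl); }
    interface QueryResultCounter { int count(); }

    static int chooseNextTtl(List<Connection> connections,
                             QueryResultCounter results)
            throws InterruptedException {
        // Send the TTL=2 probe down up to three connections at once.
        int n = Math.min(PROBE_CONNECTIONS, connections.size());
        for (Connection c : connections.subList(0, n)) {
            c.sendQuery(PROBE_TTL);
        }
        Thread.sleep(PROBE_WAIT_MS);   // wait ~6 seconds for query hits

        int hits = results.count();
        if (hits == 0)  return 3;      // looks rare: raise the TTL
        if (hits >= 30) return 1;      // looks common: lower the TTL
        return 2;
    }
}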
If the client implementing dynamic queries also implements Ultrapeer
query routing[2], further optimizations are possible for the probe. For
connections that have supplied query route tables, the dynamic querier MAY
choose to check those route tables for hits and to use that information both
to determine the popularity of the file more precisely and to decide which
nodes should receive the query. If, for example, a node has query route tables
from 20 connections and 10 of them have hits for the query in their tables, it
is clear that the desired content is widely distributed, and it is very likely
that the probe alone will satisfy the query. Note that the querier has reached
this conclusion before sending a single query down any connection! In this
case, the querier may choose to send the query to 5 or more connections that
have hits in their tables, knowing that each of them should return results and
likely achieving the desired number of results. Here, the query is sent only
to nodes known ahead of time to contain hits; nodes without hits never receive
the query.
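One possible shape for this route-table optimization is sketched below;
RoutedConnection and its methods are hypothetical stand-ins for whatever
query-route-table interface a client exposes:

import java.util.ArrayList;
import java.util.List;

// Selecting probe targets from query route tables, per the text above.
final class QrtProbeSelection {
    interface RoutedConnection {
        boolean hasQueryRouteTable();
        boolean tableHasHitFor(String query);
    }

    // Prefer connections whose route tables report a hit for the query.
    static List<RoutedConnection> probeTargets(List<RoutedConnection> all,
                                               String query) {
        List<RoutedConnection> withHits = new ArrayList<>();
        for (RoutedConnection c : all) {
            if (c.hasQueryRouteTable() && c.tableHasHitFor(query)) {
                withHits.add(c);
            }
        }
        // If many tables report hits, the content is widely distributed:
        // a low-TTL probe to ~5 of them will likely finish the query, and
        // hit-less connections never receive it.
        return withHits.size() > 5 ? withHits.subList(0, 5) : withHits;
    }
}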
Once the probe is sent, the standard dynamic query algorithm takes effect. In many cases, the probe will return the desired number of results, terminating the query. When the probe does not achieve the desired number of results, however, the querier must begin the process of dynamically calculating the TTL, per hop wait times, etc. for the remaining connections, as described below.
The algorithm for calculating the TTL for the next connection is one of the
most important parts of the dynamic query. This calculation must take into
account the number of hosts already queried, the number of remaining
connections, and the number of results received. On each iteration, a reasonable
algorithm is to divide the number of results still needed by the average
number of results received per Ultrapeer queried to get the number of Ultrapeers
you should hit, and then to divide this number by the number of remaining
connections to get the total number of hosts to query per connection. With the
number of hosts to query per connection, you can easily calculate the
appropriate TTL based on the degree of the connection being queried. More
formally:
Let r      = the number of results received
Let R      = the number of results needed, or the number of results desired - r
Let H      = the number of Ultrapeers theoretically queried
Let RH     = the number of results per Ultrapeer, or r/H
Let HQ     = the number of hosts to query to reach the desired number of results, or R/RH
Let C      = the number of connections that have not yet received the query
Let HQC    = the number of hosts to query per connection, or HQ/C
Let DEGREE = the degree of the next connection

TTL = the minimum TTL that will reach the desired number of hosts for this connection
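In code, this calculation might look like the following sketch (names are
illustrative; hostsPerConnection() is the hosts(degree, ttl) formula given
below):

// Sketch of the next-TTL calculation defined above.
final class NextTtl {
    static int nextTtl(int resultsReceived,   // r
                       int resultsDesired,
                       long hostsQueried,     // H (theoretical)
                       int connectionsLeft,   // C
                       int degree,            // DEGREE of the next connection
                       int xMaxTtl) {         // the connection's X-Max-TTL
        // Guard against division by zero when no results have arrived yet.
        double resultsPerHost =                                   // RH = r/H
            (double) Math.max(resultsReceived, 1) / Math.max(hostsQueried, 1);
        int needed = Math.max(resultsDesired - resultsReceived, 0);   // R
        double hostsToQuery = needed / resultsPerHost;                // HQ
        double hostsPerConn = hostsToQuery / connectionsLeft;         // HQC

        // Smallest TTL whose theoretical horizon covers HQC hosts;
        // capping at X-Max-TTL also biases the result toward lower TTLs.
        int ttl = 1;
        while (ttl < xMaxTtl
                && hostsPerConnection(degree, ttl) < hostsPerConn) {
            ttl++;
        }
        return ttl;
    }

    // hosts(degree, ttl) = Sum[(degree-1)^i, 0 <= i <= ttl-1]
    static long hostsPerConnection(int degree, int ttl) {
        long total = 0, term = 1;
        for (int i = 0; i < ttl; i++) {
            total += term;
            term *= (degree - 1);
        }
        return total;
    }
}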
Note that we calculate the TTL using the degree of the connection that
will receive the query. For our purposes, we assume that all nodes connected to
that node have the same degree -- we don't take into account the likelihood that
there is significant heterogeneity in the degrees of the connected nodes.
The TTL calculation must also take into account the new X-Max-TTL header.
The X-Max-TTL header indicates the maximum TTL for fresh queries the connection
will accept. As such, the TTL sent along a connection MUST NOT exceed the
X-Max-TTL for the given connection. X-Max-TTL is discussed in more detail in the
section on headers.
Finally, the difference between one TTL and the next is so great in terms of
the number of nodes reached that it makes sense to weight this algorithm towards
lower TTLs. Given that we are always working with incomplete data, favoring
lower TTLs mitigates the danger of choosing a TTL that is too high and wastes
network resources.
At the end of each iteration, the client must add the new number of hosts
theoretically queried to the total. For one connection, this calculation is:
hosts(degree,ttl) = Sum[(degree-1)^i, 0 <= i <= ttl-1]
Note that this only calculates the theoretical horizon for the query, and not the actual horizon. If there are cycles in the network graph, or if the nodes on subsequent hops have different degrees, this number will be inaccurate. Unfortunately, it's the best we can do without imposing overly burdensome restrictions on the overall topology, and this estimate is accurate enough for our purposes. This is the number that should be used in calculating the next TTL.
After sending a query to one connection, an Ultrapeer MUST wait for hosts to
return query hits before sending the next query to the next connection. It must
wait for results because those results are vital in determining the popularity
of the file, and therefore in calculating the TTL for sending the query down
subsequent connections. In experimental observations, 2400 milliseconds per hop
is a reasonable latency to expect. So, if a query is sent with TTL=3, the
querier would wait 7200 milliseconds for results before sending the query down
another connection. While some query hits may arrive after this wait time, 2400
milliseconds per hop is enough to capture the majority of hits that will be
returned.
As a general rule, the querier MUST wait 2400 milliseconds per
hop before sending a query down another connection. Waiting for this period of
time is vital because it's otherwise easy for a query to become "out of
control", particularly if the content is popular. If the querier does not wait
for an appropriate period before sending the query to another connection, the
algorithm may incorrectly think that the content is very rare, sending a high
TTL query as a result. For queries like "mp3", this is extremely destructive and
defeats the purpose of dynamic querying.
There are, however, two
exceptions to the 2400 millisecond rule. First, for probe queries, it makes
sense to send the query down multiple connections simultaneously without waiting
for results. Once the probe is sent, however, the querier MUST wait the number
of milliseconds appropriate for the hops of the probe. In addition, the querier
MAY scale the wait time up depending on how many connections are probed. So, if
a probe is sent to 3 connections with a TTL of 2, the querier MUST wait for a
minimum of 7200 milliseconds, but also MAY choose to wait for an extra second
per connection probed. This is due to the heterogeneity of latencies on the
network, and is particularly useful when the querier is taking advantage of
query route tables from Ultrapeer
query routing[2], where the probe may be sent to 5 or more connections with
a TTL of 1. An accurate probe is vital to avoid sending high-TTL queries
for widely distributed content, as will become clear in the course of
implementing specific dynamic query algorithms.
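One way to realize these wait-time rules is sketched below; the constants
come from the text, and the optional one-second-per-connection scaling for
probes is included:

// Wait-time rules from the text; names are illustrative.
final class WaitTimes {
    static final long PER_HOP_MS = 2_400;
    static final long PER_PROBED_CONNECTION_MS = 1_000;  // optional scaling

    // Standard rule: 2400 milliseconds per hop of the query just sent.
    static long standardWait(int ttl) {
        return PER_HOP_MS * ttl;             // e.g. TTL=3 -> 7200 ms
    }

    // Probe exception: wait for the probe's hops, optionally scaled by
    // the number of connections probed simultaneously.
    static long probeWait(int probeTtl, int connectionsProbed) {
        return PER_HOP_MS * probeTtl
             + PER_PROBED_CONNECTION_MS * connectionsProbed;
    }
}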
The second exception to the
2400 millisecond rule is an optimization for searches for particularly rare
content. When content is rare, waiting for 2400 milliseconds for every hop can
make the overall latency of the query unacceptable. If a query is sent to 32
connections at TTL=3, for example, the query would take almost 4 minutes to
complete! If a query has taken a significant length of time and the theoretical
number of nodes searched is high, the querier can be reasonably sure that the
content is quite rare, and MAY choose to become more aggressive with wait times.
For example, if the query has theoretically reached 3000 Ultrapeers and has
returned under 10 results, we may choose to decrease the per-hop wait time
by 100 milliseconds for the next iteration. Decreasing the wait time by 100
milliseconds on ten successive iterations will decrease overall latency
significantly, but implementors must take care not to be overly aggressive in
this regard. This is particularly important for content in the "middle range" --
content that is not too common, but not too rare. The danger with this type of
search is again that the algorithm will select overly aggressive TTLs and waste
network resources.
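A sketch of this optimization might track a running per-hop wait and shave
it as evidence of rarity accumulates. The floor value below is an assumption
for illustration; the text does not specify one:

// Rare-content optimization: reduce the per-hop wait over successive
// iterations when many hosts have been queried but few results returned.
final class AdaptiveWait {
    static final long BASE_PER_HOP_MS = 2_400;
    static final long MIN_PER_HOP_MS = 1_400;   // assumed floor
    static final long DECREMENT_MS = 100;

    private long perHopMs = BASE_PER_HOP_MS;

    // Call once per iteration with the query's running totals.
    long nextWait(int ttl, long hostsQueried, int resultsReceived) {
        boolean looksRare = hostsQueried >= 3_000 && resultsReceived < 10;
        if (looksRare && perHopMs > MIN_PER_HOP_MS) {
            perHopMs -= DECREMENT_MS;  // ten iterations saves 1000 ms per hop
        }
        return perHopMs * ttl;
    }
}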
There are several conditions that will terminate the query: the receipt of the desired number of results, reaching the maximum theoretical horizon of 200,000 Ultrapeers, or exhausting all of the querier's connections. The number of results to search for is covered in further detail in the "Number of Results" section. Algorithm authors also MAY choose to impose other restrictions on the query to safeguard against attack or abnormal conditions. For example, a client MAY wish to limit the total lifetime of the query to, say, 3 minutes.
Clients MUST target a maximum of 150 total results for a given query. This
many results will give the client enough sources to successfully download the
file while saving network resources on searches for popular content. When 150
results are received, clients MUST NOT send the query to any more hosts.
Because most queries are initiated by leaves, and because most clients
connect their leaves to 3 Ultrapeers, each Ultrapeer MUST attempt to get only 50
results for a query received from one of its leaves.
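Putting the pieces together, an end-to-end dynamic query might look like the
following sketch, which reuses the NextTtl helper from the earlier sketch;
all other names are illustrative:

import java.util.List;

// End-to-end sketch: probe (omitted; see earlier), per-connection TTL
// calculation, per-hop waits, and the termination conditions above.
final class DynamicQuerySketch {
    static final int RESULTS_DESIRED = 150;  // 50 for a query from a leaf
    static final long MAX_HORIZON = 200_000;

    interface Connection {
        int degree();
        int maxTtl();               // from the X-Max-TTL header
        void sendQuery(int ttl);
    }
    interface ResultCounter { int count(); }

    static void runQuery(List<Connection> connections, ResultCounter results)
            throws InterruptedException {
        long hostsQueried = 0;
        int next = 0;               // index of the next unqueried connection

        while (next < connections.size()              // connections remain
                && results.count() < RESULTS_DESIRED  // results still needed
                && hostsQueried < MAX_HORIZON) {      // horizon cap not hit
            Connection c = connections.get(next++);
            int ttl = NextTtl.nextTtl(results.count(), RESULTS_DESIRED,
                    hostsQueried, connections.size() - next + 1,
                    c.degree(), c.maxTtl());
            c.sendQuery(ttl);
            hostsQueried += NextTtl.hostsPerConnection(c.degree(), ttl);
            Thread.sleep(2_400L * ttl);   // 2400 ms per hop before the next
        }
    }
}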
There are several new connection headers that clients MUST add to indicate their support for these features. Each header is outlined below.
Clients MUST include a new "X-Max-TTL" header in their handshakes. This
header indicates that we should not send fresh queries to this connection with
TTLs higher than the X-Max-TTL. If we are routing traffic from other Ultrapeers,
the X-Max-TTL is irrelevant. The X-Max-TTL MUST NOT exceed 4, as any TTL above 4
indicates a client is allowing too much query traffic on the network. This
header is particularly useful for compatibility with future clients that may
choose to have higher degrees but that would prefer lower TTL traffic from their
neighbors. For example, if future clients connect to 200 Ultrapeers, they could
use the X-Max-TTL header to indicate to today's clients that they will not
accept TTLs above 2. A typical initial value for X-Max-TTL is 3, as in the
following example:
X-Max-TTL: 3
The X-Degree header simply indicates the number of Ultrapeer connections this
node attempts to maintain. Clients supporting this proposal MUST have an
X-Degree of at least 15, and higher values are preferable. This header takes the
following form:
X-Degree: 32
Finally, the client MUST indicate the version of dynamic queries supported.
This will allow future versions to make connections only to hosts supporting
newer dynamic query features. This header takes the following form:
X-Dynamic-Querying: 0.1
This proposal addresses one of the most glaring flaws in the traditional Gnutella search model, namely that searches for widely distributed content are handled in the same way as searches for rare content. This change uses much less bandwidth for searches for common files while doing more work for searches for rare files, improving overall network performance considerably.
[1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[2]  Fisk, A., "Gnutella Ultrapeer Query Routing, v. 0.1", May 2003.
[3]  Rohrs, C. and A. Singla, "Ultrapeers: Another Step Towards Gnutella Scalability", December 2001.
[4]  Osokine, S., "Search Optimization in the Distributed Networks", December 2002.
Adam A. Fisk
LimeWire LLC
EMail: afisk@limewire.com
URI: http://www.limewire.org/
The author would like to thank the rest of the LimeWire team and all members of the Gnutella Developer Forum (GDF).