Posts: 5,362
Threads: 2,998
Joined: Feb 2011
[attachment=9034]
BitTorrent is a peer-to-peer file sharing protocol used for distributing large amounts of data. BitTorrent is one of the most common protocols for transferring large files, and it has been estimated that it accounted for roughly 27% to 55% of all Internet traffic (depending on geographical location) as of February 2009.[1]
Programmer Bram Cohen designed the protocol in April 2001 and released a first implementation on July 2, 2001.[2] It is now maintained by Cohen's company BitTorrent, Inc. There are numerous BitTorrent clients available for a variety of computing platforms
Description
The BitTorrent protocol can distribute a large file without the heavy load on the source computer and network. Rather than downloading a file from a single source, the BitTorrent protocol allows users to join a "swarm" of hosts to download and upload from each other simultaneously . The protocol works as an alternative method to distribute data, and even small computers, like mobile phones, with low bandwidth are able to distribute files to many recipients.
A user who wants to upload a file first creates a small torrent descriptor file that he distributes by conventional means (web, email, etc.). He then makes the file itself available through a BitTorrent node acting as a seed. Those with the torrent descriptor file can give it to their own BitTorrent nodes which, acting as peers or leechers, download it by connecting to the seed and/or other peers.
The file being distributed is divided into segments called pieces. As each peer receives a new piece of the file it becomes a source of that piece to other peers, relieving the seed from having to send a copy to every peer. With BitTorrent, the task of distributing the file is shared by those who want it; it is entirely possible for the seed to send only a single copy of the file itself to an unlimited number of peers.
Each piece is protected by a cryptographic hash contained in the torrent descriptor.[3] This prevents nodes from maliciously modifying the pieces they pass on to other nodes. If a node starts with an authentic copy of the torrent descriptor, it can verify the authenticity of the actual file it has received.
When a peer completely downloads a file, it becomes an additional seed. This eventual shift from peers to seeders determines the overall "health" of the file (as determined by the number of times a file is available in its complete form).
This distributed nature of BitTorrent leads to a flood like spreading of a file throughout peers. As more peers join the swarm, the likelihood of a successful download increases. Relative to standard Internet hosting, this provides a significant reduction in the original distributor's hardware and bandwidth resource costs. It also provides redundancy against system problems, reduces dependence on the original distributor[4] and provides a source for the file which is generally temporary and therefore harder to trace than when provided by the enduring availability of a host in standard file distribution techniques.
Operation
In this animation, the colored bars beneath all of the 7 clients in the upper region above represent the file, with each color representing a individual piece of the file. After the initial pieces transfer from the seed (large system at the bottom), the pieces are individually transferred from client to client. The original seeder only needs to send out one copy of the file for all the clients to receive a copy.
A BitTorrent client is any program that implements the BitTorrent protocol. Each client is capable of preparing, requesting, and transmitting any type of computer file over a network, using the protocol. A peer is any computer running an instance of a client.
To share a file or group of files, a peer first creates a small file called a "torrent" (e.g. MyFile.torrent). This file contains metadata about the files to be shared and about the tracker, the computer that coordinates the file distribution. Peers that want to download the file must first obtain a torrent file for it and connect to the specified tracker, which tells them from which other peers to download the pieces of the file.
Though both ultimately transfer files over a network, a BitTorrent download differs from a classic download (as is typical with an HTTP or FTP request, for example) in several fundamental ways:
• BitTorrent makes many small data requests over different TCP connections to different machines, while classic downloading is typically made via a single TCP connection to a single machine.
• BitTorrent downloads in a random or in a "rarest-first"[5] approach that ensures high availability, while classic downloads are sequential.
Taken together, these differences allow BitTorrent to achieve much lower cost to the content provider, much higher redundancy, and much greater resistance to abuse or to "flash crowds" than regular server software. However, this protection, theoretically, comes at a cost: downloads can take time to rise to full speed because it may take time for enough peer connections to be established, and it may take time for a node to receive sufficient data to become an effective uploader. This contrasts with regular downloads (such as from an HTTP server, for example) that, while more vulnerable to overload and abuse, rise to full speed very quickly and maintain this speed throughout.
In general, BitTorrent's non-contiguous download methods have prevented it from supporting "progressive downloads" or "streaming playback". However, comments made by Bram Cohen in January 2007 suggest that streaming torrent downloads will soon be commonplace and ad supported streaming appears to be the result of those comments.
Creating and publishing torrents
The peer distributing a data file treats the file as a number of identically sized pieces, usually with byte sizes of a power of 2, and typically between 32 kB and 4 MB each. The peer creates a hash for each piece, using the SHA-1 hash function, and records it in the torrent file. Pieces with sizes greater than 512 kB will reduce the size of a torrent file for a very large payload, but is claimed to reduce the efficiency of the protocol.[6] When another peer later receives a particular piece, the hash of the piece is compared to the recorded hash to test that the piece is error-free.[7] Peers that provide a complete file are called seeders, and the peer providing the initial copy is called the initial seeder.
The exact information contained in the torrent file depends on the version of the BitTorrent protocol. By convention, the name of a torrent file has the suffix .torrent. Torrent files have an "announce" section, which specifies the URL of the tracker, and an "info" section, containing (suggested) names for the files, their lengths, the piece length used, and a SHA-1 hash code for each piece, all of which are used by clients to verify the integrity of the data they receive.
Torrent files are typically published on websites or elsewhere, and registered with at least one tracker. The tracker maintains lists of the clients currently participating in the torrent.[7] Alternatively, in a trackerless system (decentralized tracking) every peer acts as a tracker. Azureus was the first[citation needed] BitTorrent client to implement such a system through the distributed hash table (DHT) method. An alternative and incompatible DHT system, known as Mainline DHT, was later developed and adopted by the BitTorrent (Mainline), µTorrent, Transmission, rTorrent, KTorrent, BitComet, and Deluge clients.
After the DHT was adopted, a "private" flag — analogous to the broadcast flag — was unofficially introduced, telling clients to restrict the use of decentralized tracking regardless of the user's desires.[8] The flag is intentionally placed in the info section of the torrent so that it cannot be disabled or removed without changing the identity of the torrent. The purpose of the flag is to prevent torrents from being shared with clients that do not have access to the tracker. The flag was requested for inclusion in the official specification in August, 2008, but has not been accepted.[9] Clients that have ignored the private flag were banned by many trackers, discouraging the practice.[10]
Downloading torrents and sharing files
Users browse the web to find a torrent of interest, download it, and open it with a BitTorrent client. The client connects to the tracker(s) specified in the torrent file, from which it receives a list of peers currently transferring pieces of the file(s) specified in the torrent. The client connects to those peers to obtain the various pieces. If the swarm contains only the initial seeder, the client connects directly to it and begins to request pieces.
Clients incorporate mechanisms to optimize their download and upload rates; for example they download pieces in a random order to increase the opportunity to exchange data, which is only possible if two peers have different pieces of the file.
The effectiveness of this data exchange depends largely on the policies that clients use to determine to whom to send data. Clients may prefer to send data to peers that send data back to them (a tit for tat scheme), which encourages fair trading. But strict policies often result in suboptimal situations, such as when newly joined peers are unable to receive any data because they don't have any pieces yet to trade themselves or when two peers with a good connection between them do not exchange data simply because neither of them takes the initiative. To counter these effects, the official BitTorrent client program uses a mechanism called "optimistic unchoking", whereby the client reserves a portion of its available bandwidth for sending pieces to random peers (not necessarily known good partners, so called preferred peers) in hopes of discovering even better partners and to ensure that newcomers get a chance to join the swarm.[11]
The community of BitTorrent users frowns upon the practice of disconnecting from the network immediately upon success of a file download, and encourages remaining as another seed for as long as practical, which may be days – especially when there are a lot of downloading peers and when the ratio of seeders to downloading peers is low