You have probably used torrents, or at least heard of torrenting, The Pirate Bay, Kickass Torrents, seeders, leechers and so on. Torrents have a poor reputation because they have long been used to share copyrighted material. The technology itself is still great, though. Facebook has reportedly used it to push updates to its own servers, and so has Twitter.
Torrents let files be shared over a Peer-to-Peer (P2P) network without any centralized intermediary. So there is no single server that you download from. Instead, any ordinary user can create a torrent, which can then be shared over the P2P network.
In brief, a user (or a group of users) has some file. They create a torrent for that file and then send the file in small chunks to whoever wants to receive it. Once a receiver has the file, they can also help provide it to other users. So a user's torrent client may download the file from thousands of different senders, and it may also upload it to thousands of different receivers.
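To make the "small chunks" idea concrete, here is a minimal Python sketch of cutting a file's bytes into fixed-size pieces and joining them back together. The function name and the tiny piece size are made up for illustration; real clients use far larger pieces and read from disk rather than from memory.

```python
def split_into_pieces(data: bytes, piece_length: int) -> list:
    """Cut data into fixed-size pieces; the last piece may be shorter."""
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


payload = b"abcdefghij"                 # stand-in for a real file's bytes
pieces = split_into_pieces(payload, 4)  # three pieces: 4 + 4 + 2 bytes
reassembled = b"".join(pieces)          # the receiver glues the pieces back in order
```

Because every piece has a known position, the pieces can arrive from different senders and in any order, as long as the receiver puts them back by index.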
The BitTorrent Protocol
The BitTorrent protocol lets its users send and receive files over a P2P network using BitTorrent clients. A BitTorrent client is just software that speaks the BitTorrent protocol for you. I recommend qBittorrent, an open-source client for PCs that is free to use and has every feature you are likely to need from a BitTorrent client.
So now we have a BitTorrent client with which we can download and upload files. But how do we search for torrents? You can't just head over to Google and search for a torrent. I mean, you can, but Google will probably not give you the torrent file.
This is where torrent indexes come in. These are websites that host a large number of torrent files. They do not hold the data itself; instead they hold the ".torrent" file or a magnet link (a compact way to uniquely identify a torrent). A user can download the ".torrent" file or copy the magnet link from one of these sites and feed it into the torrent client, which then downloads the file from the other clients on the network. The Pirate Bay, 1337x, RARBG and LimeTorrents are some of the well-known torrent indexes.
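If you are curious what a magnet link actually carries, here is a small Python sketch that pulls one apart with the standard library. The link is made up (the hash and the tracker address are placeholders), and `parse_magnet` is my own helper name, not part of any client. It only handles the common `urn:btih:` form.

```python
from urllib.parse import urlparse, parse_qs


def parse_magnet(uri: str) -> dict:
    """Extract the torrent identifier, display name and trackers from a magnet link."""
    parsed = urlparse(uri)
    if parsed.scheme != "magnet":
        raise ValueError("not a magnet link")
    params = parse_qs(parsed.query)      # also undoes percent-encoding
    xt = params.get("xt", [""])[0]       # "exact topic": identifies the torrent
    prefix = "urn:btih:"
    if not xt.startswith(prefix):
        raise ValueError("no BitTorrent info hash found")
    return {
        "info_hash": xt[len(prefix):].lower(),
        "name": params.get("dn", [None])[0],   # optional display name
        "trackers": params.get("tr", []),      # optional tracker addresses
    }


# A made-up link for illustration only.
link = ("magnet:?xt=urn:btih:0123456789ABCDEF0123456789ABCDEF01234567"
        "&dn=example.iso&tr=udp%3A%2F%2Ftracker.example.org%3A6969")
info = parse_magnet(link)
```

The important part is the info hash: it is what lets the client ask the network for the torrent's metadata without ever downloading a ".torrent" file.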
Once a user adds the torrent file to their torrent client, the client tries to connect to the tracker(s) listed in the file. But what is a tracker? A tracker is a remote host that keeps a list of all the other torrent clients that are currently downloading or uploading that torrent. It keeps a record of the progress each client reports, so that when a new user asks it about the torrent, it simply sends back the list of these clients and the user can connect to them and start downloading the file.
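The tracker's job can be sketched in a few lines of Python. This is a toy, in-memory stand-in: real trackers are reached over the network and exchange more fields, and the class name, method name and addresses below are all invented for the example.

```python
class ToyTracker:
    """In-memory stand-in for a tracker: remembers who is in each swarm."""

    def __init__(self):
        # torrent identifier -> {peer address: bytes that peer still needs}
        self._swarms = {}

    def announce(self, info_hash: str, peer_addr: str, left: int) -> list:
        """Record the caller's progress and return the other peers in the swarm."""
        swarm = self._swarms.setdefault(info_hash, {})
        others = [addr for addr in swarm if addr != peer_addr]
        swarm[peer_addr] = left
        return others


tracker = ToyTracker()
first = tracker.announce("abc123", "10.0.0.1:6881", left=0)      # a seeder joins first
second = tracker.announce("abc123", "10.0.0.2:6881", left=4096)  # a new downloader joins
```

Note that the tracker never touches the file itself. It only introduces peers to each other, and the actual chunks travel directly between them.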
Important Terms
A peer is a client connected to the network. Peers can send data to and receive data from other clients. A peer can be a client that has parts of a torrent, or any client in the swarm.
Seeding is the process of providing chunks of a file from your system to other clients connected to you. Seeding is an important characteristic of torrents: if no client is actively seeding a torrent, no one will ever be able to download it. Once a peer has a certain portion of a torrent, they can start seeding it. However, how much to seed, how long to seed, the maximum upload speed and so on can all be set by the peer as they choose.
Super-seeding is a great way to share a file efficiently. Suppose you create a torrent that is being downloaded by 9 other people, and you are on a data cap, so you do not want to upload the same file to 9 people separately. When super-seeding mode is enabled, you only share those chunks of the file that no other uploading peer in the network has yet. So if your file is divided into 3 chunks and you have already seeded 2 of them to another peer (who is now seeding those chunks), you will not send those 2 chunks to other clients. Instead, you will seed only the remaining chunk, which is currently available only from you.
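The selection rule in that example can be written as a short Python sketch. It captures only the idea described above (offer the chunks nobody else holds), not how any particular client implements super-seeding. The function name and the peer names are hypothetical.

```python
def pieces_only_i_have(total_pieces: int, peer_pieces: dict) -> list:
    """Return the indexes of pieces that no other peer in the swarm holds yet."""
    held_elsewhere = set().union(*peer_pieces.values())
    return [i for i in range(total_pieces) if i not in held_elsewhere]


# A 3-chunk file: one peer already holds (and seeds) chunks 0 and 1.
swarm = {"peer_a": {0, 1}, "peer_b": set()}
to_seed = pieces_only_i_have(3, swarm)  # [2]
```

Here `peer_b` would fetch chunks 0 and 1 from `peer_a`, so the original seeder spends its limited upload only on chunk 2.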
Hit-and-run is the term for leechers who download a torrent while seeding as little as possible.
Torrents are not anonymous by nature, i.e. your IP address is visible to the whole swarm while you are downloading a torrent. A VPN can be used with torrents for some degree of anonymity if necessary. Various ISPs (cough, Comcast) have been known to throttle users who were seeding torrents by actively monitoring the communication between peers. Clients then started to encrypt the data chunks as well as the protocol header, to make it hard for an ISP to separate torrent packets from the ordinary encrypted data its users send over the internet.
Since the torrent client internally manages the splitting and merging of a torrent's chunks, a torrent can be paused for an indefinite amount of time. When it is resumed, if any seeder is available, the download simply continues as if there had been no pause: the client connects to an available seeder and asks for the next chunk it needs. There is no time limit whatsoever for finishing the download.
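One way to picture pause and resume is a list of flags, one per chunk, that survives between runs. The Python sketch below is an assumption about how such bookkeeping could look, with the state stored as JSON text; real clients keep their own resume data, and the function names are invented.

```python
import json


def save_state(have: list) -> str:
    """Serialize which chunks are already downloaded (True) or still missing (False)."""
    return json.dumps({"have": have})


def load_state(text: str) -> list:
    """Restore the per-chunk flags saved by save_state."""
    return json.loads(text)["have"]


def missing_pieces(have: list) -> list:
    """Indexes of the chunks the client still has to request."""
    return [index for index, done in enumerate(have) if not done]


saved = save_state([True, False, True, False])  # paused with chunks 0 and 2 done
restored = load_state(saved)                    # ... any amount of time later ...
todo = missing_pieces(restored)                 # [1, 3]
```

Nothing in the saved state depends on which peer a chunk came from, which is why the download can pick up again with whoever happens to be seeding at that moment.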
Torrenting itself is not illegal. You can head over to Ubuntu's website and there will probably be an option to download the ISO using a torrent. Why?
- Well, torrents are fast: there is little to no overhead, and chunks can arrive from many peers at once.
- Torrents are efficient (remember super-seeding mode).
- Torrents are easy to work with, since there is no time limit for a download, nor does a server need to remember the active sessions for all the downloads.
- You can probably get away with downtime for maintenance etc., provided there are enough seeders to seed the torrents while one of your systems is down.
Torrenting copyrighted material without permission from the rights holder is illegal in most places. Also, when downloading torrents, double-check the source. A malicious user may try to seed you malware by making it look like the file you actually need. Malicious torrents are often relatively easy to spot, though, since the torrent client reads the torrent's metadata and shows you everything it would download before you start. You can even download only certain file(s) from a torrent by simply unchecking the ones you do not need.
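To show where that file list comes from, here is a minimal Python sketch that decodes a tiny hand-written piece of torrent metadata and lists its files. ".torrent" files use an encoding called bencode, and multi-file torrents keep their entries under `info` and `files` with a `length` and a `path` each. The sample bytes, file names and helper names are made up, and the decoder does no serious error handling.

```python
def bdecode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos; return (value, next position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    if lead.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        start = colon + 1
        length = int(data[pos:colon])
        return data[start:start + length], start + length
    raise ValueError("malformed bencoded data")


def list_files(metadata: dict) -> list:
    """Return (path, size in bytes) for every file a multi-file torrent describes."""
    return [("/".join(part.decode() for part in entry[b"path"]), entry[b"length"])
            for entry in metadata[b"info"][b"files"]]


# Hand-written sample metadata describing two small files.
sample = (
    b"d4:infod"
    b"5:filesl"
    b"d6:lengthi10e4:pathl5:a.txtee"
    b"d6:lengthi20e4:pathl3:sub5:b.binee"
    b"e"
    b"4:name4:demo"
    b"ee"
)
metadata, _ = bdecode(sample)
listing = list_files(metadata)
```

For this sample, `listing` contains `a.txt` (10 bytes) and `sub/b.bin` (20 bytes). A list like this is what lets you notice an unexpected executable sitting next to the file you wanted.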
Last but not least: how does a torrent client know that the chunk of data it has just downloaded from some random seeder has not been tampered with?
The answer is a great cryptographic technique called hashing; we will call the resulting hash a checksum here. When the torrent is created, a checksum is computed for each chunk of data, and these checksums are stored in the torrent file itself. When a chunk of data is downloaded, its checksum is calculated and compared with the one in the torrent. If they match, the chunk's integrity is confirmed; otherwise the chunk is rejected. A good cryptographic hash function makes it computationally infeasible to find two different chunks of data with the same hash value. So the data is validated.
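Here is a minimal Python sketch of that check using the standard `hashlib` module. It assumes SHA-1, the hash used by the original BitTorrent specification; the function names and the sample chunks are made up for illustration.

```python
import hashlib


def piece_checksums(pieces: list) -> list:
    """Checksums the torrent's creator would store, one per chunk."""
    return [hashlib.sha1(piece).digest() for piece in pieces]


def verify_piece(index: int, data: bytes, checksums: list) -> bool:
    """Accept a downloaded chunk only if its checksum matches the stored one."""
    return hashlib.sha1(data).digest() == checksums[index]


original = [b"first chunk", b"second chunk"]
checksums = piece_checksums(original)           # shipped inside the torrent

ok = verify_piece(0, b"first chunk", checksums)        # untouched chunk
tampered = verify_piece(1, b"second chunK", checksums)  # one byte changed
```

Here `ok` is `True` and `tampered` is `False`. The client would throw the second chunk away and request it again from a different peer.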
Do point out any mistakes and share your thoughts. Until next time!