In the 15th century, the printing press brought about an information revolution; in the 20th century, radio and television did the same. Today, however, this role is played by the Internet. What is this global network, without which it is increasingly difficult to imagine our daily lives, and how does it work?
The Internet was born during the Cold War, which began just two years after the end of World War II. The period from 1947 to 1991 was a time of ideological, political and military tension between the Eastern Bloc, consisting of the USSR and allied communist countries, and the Western Bloc led by the United States. We associate this period with constant fear, uncertainty about the future and an arms race between the two superpowers. The Internet, or rather its ancestor, was one element of this race.
In 1960, Joseph Licklider proposed the concept of a global computer network. Just two years later, Paul Baran, an American computer scientist of Polish origin, published a comprehensive design for a computer network without central points. Thanks to this architecture, the destruction of one part of the network would not bring down the whole system; data transmission could continue. It was therefore a potentially powerful tool in the hands of the Western Bloc in its rivalry with the Eastern Bloc.
Licklider’s concept and Baran’s design were noticed by ARPA, the US agency responsible for developing military technology. The agency funded a project to connect American universities through an experimental network. On October 29, 1969, the first message was sent from the University of California, Los Angeles to the Stanford Research Institute as part of the ARPANET experiment. More universities joined the ARPANET over time. The term “internet” appeared in the 1970s, when academic users began using it for the ARPANET. In the early 1980s the military part of the network was separated from the academic part, which by then was officially called the Internet. The ARPANET project was wound down at the end of the 1980s, and the Internet remained at the disposal of universities and scientific organizations. In 1991 the Internet was finally opened to commercial use.
The Internet gained popularity very quickly. In the 1990s more and more companies wanted to advertise on it, and any company with its own website was seen as very modern. Investors were eager to buy shares in such companies, which caused their prices to rise rapidly. Within five years the Nasdaq index quadrupled. This was the period of the so-called speculative “Internet bubble”. When the bubble burst, the Nasdaq index fell by 78% within two years, and it took about 15 years for the index to climb back to its previous peak. It was the biggest financial crisis to date directly related to the Internet.
So how big is the Internet today? According to Internet World Stats, in March 2021 as much as 65.6% of the world’s population had access to the Internet. The undisputed leader is North America, where 93.9% of the population had access. Europe is in second place with 88.2%. Within the European Union, Poland ranks 5th in terms of the number of citizens using the Internet. In terms of the percentage of citizens using the Internet, however, its result of 78.6% is much less impressive. The record holders within the EU are Estonia, Luxembourg and Sweden, each with over 96%. The Internet’s impressive career has brought it extraordinary growth. From 2000 to 2021, the number of Internet users increased by as much as 1330%. The human population also grew during that time, of course, but only by about 30%. The Internet is reaching ever more remote corners of the world, so it is very likely that the trend will continue.
The word “Internet” means a network of networks, or a network between networks. When we pick up our home router, we see that its sockets for network cables are labeled with the abbreviations LAN and WAN. Sometimes they are exactly the same type of jack. LAN stands for Local Area Network, which means our local network. WAN stands for Wide Area Network. Although the WAN socket is our exit to the world, it is not at all certain that the other end of the cable plugged into it is directly connected to the Internet. What is certain is that the other end of the cable goes into a larger network, which brings together networks like our home one. This “larger network” may itself be one element of an even larger network, and so on, until we finally arrive at the Internet. As you can see, the Internet is a network made up of smaller, nested networks.
To understand how devices on a network communicate, we need to know the basics of how networks work. I purposely said “networks” here, not “the Internet,” because the general principles apply to both the smaller and the larger networks described earlier. Each device operating in a network gets a unique number called an IP address (IP stands for “Internet Protocol”). This number becomes the identifier that allows the device to be recognized within that network. Just as telephones have their numbers, and senders and recipients of mail have their addresses, devices operating in a network have IP addresses.
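To make the idea of an IP "number" concrete, here is a minimal Python sketch using the standard `ipaddress` module. The address 192.168.1.10 and the network 192.168.1.0/24 are made-up examples of what a home router might hand out, not values taken from any real setup.

```python
import ipaddress

def ip_to_number(text: str) -> int:
    """Convert a dotted IPv4 address into the 32-bit number it stands for."""
    return int(ipaddress.ip_address(text))

def number_to_ip(number: int) -> str:
    """Convert a 32-bit number back into dotted notation."""
    return str(ipaddress.ip_address(number))

# Hypothetical address of a laptop in a home network.
laptop = "192.168.1.10"
print(ip_to_number(laptop))        # 3232235786
print(number_to_ip(3232235786))    # 192.168.1.10

# Hypothetical home LAN: does the laptop's address belong to it?
home_network = ipaddress.ip_network("192.168.1.0/24")
print(ipaddress.ip_address(laptop) in home_network)  # True
```

The dotted form is only a human-friendly way of writing the number; network devices work with the numeric value.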
When we need to establish a connection with another device, our smartphone, tablet or laptop sends one or more so-called “packets” into the network to which it has access. A packet is a sequence of bits arranged in a specific structure, consisting of a header and the message content (the body). The header contains all the data that network devices need to deliver the packet correctly, including the IP addresses of the sender and the recipient. Devices that make up the network infrastructure, such as switches and routers, read each header and act on it. Please note that anyone with the right access to these devices would in principle be able to read both the header and the body of the message. I am leaving aside here the issue of message encryption.
This kind of eavesdropping on communication is called “sniffing”. It can be done by virtually anyone who is able to pose as a network device between our laptop and the target server and who has freely available software.
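The header and body structure described above can be sketched in Python. This is a simplified illustration, assuming a plain 20-byte IPv4 header with no options and with the checksum left at zero; the function names `build_ipv4_packet` and `parse_ipv4_packet` and the addresses are invented for the example.

```python
import socket
import struct

# Layout of a 20-byte IPv4 header without options (network byte order).
IPV4_HEADER = "!BBHHHBBH4s4s"

def build_ipv4_packet(src: str, dst: str, body: bytes) -> bytes:
    """Build a minimal IPv4 packet: a 20-byte header followed by the body."""
    version_and_length = (4 << 4) | 5   # version 4, header of 5 * 4 = 20 bytes
    header = struct.pack(
        IPV4_HEADER,
        version_and_length,
        0,                      # type of service
        20 + len(body),         # total packet length in bytes
        0,                      # identification
        0,                      # flags and fragment offset
        64,                     # time to live (maximum number of hops)
        17,                     # protocol number of the body (17 = UDP)
        0,                      # header checksum (not computed in this sketch)
        socket.inet_aton(src),  # sender IP address
        socket.inet_aton(dst),  # recipient IP address
    )
    return header + body

def parse_ipv4_packet(packet: bytes) -> dict:
    """Read the header fields the way a device along the route could."""
    fields = struct.unpack(IPV4_HEADER, packet[:20])
    header_length = (fields[0] & 0x0F) * 4
    return {
        "version": fields[0] >> 4,
        "ttl": fields[5],
        "src": socket.inet_ntoa(fields[8]),
        "dst": socket.inet_ntoa(fields[9]),
        "body": packet[header_length:],
    }

packet = build_ipv4_packet("192.168.1.10", "203.0.113.5", b"hello")
print(parse_ipv4_packet(packet))
```

Nothing in the parsing step requires special privileges beyond access to the raw bytes, which is why an unencrypted body is as readable as the header.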
There are many types of packets, corresponding to different protocols. Some are designed for speed (UDP, for example), others for tasks where high reliability is important (TCP), and still others mainly for network diagnostics (ICMP). What they all have in common is that their headers must remain readable by every device along the way. Otherwise, it would be impossible to know where to send a packet next. So the sender and receiver of a message are known to anyone who reads the packet’s header at any point on its route. The contents of the message can be encrypted, but the header remains public.
When packets leave our home network, their headers are modified. The header of a packet going out to the WAN through our home router is changed so that the sender is no longer our computer’s IP address but the router’s, or more precisely, the IP address the router uses to identify itself in the “larger” network. The packet travels on to other networks, and its header may be rewritten again each time it crosses from one network to another. When the server we are trying to contact replies, it sends its packet to the address from which it received ours. This raises a problem: since the header of our packet has changed along the way, how does the reply find its way back to us? The answer lies in the routers themselves. A router that rewrites headers also keeps track of the state of the connection and, as long as the connection is open, knows exactly which incoming packets are replies and which device they belong to. This address-rewriting mechanism is called network address translation (NAT); it works alongside routing, which is the task of deciding where to forward each packet.
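The bookkeeping a home router does when it rewrites headers can be sketched as a small translation table. This is a toy model under stated assumptions: the class `HomeRouter`, the `Packet` fields and all addresses are hypothetical, and the use of port numbers to tell connections apart is a detail the text above does not go into.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    body: bytes

class HomeRouter:
    """Toy address-translating router: rewrites senders and remembers them."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private ip, private port) -> public port
        self.inbound = {}   # public port -> (private ip, private port)

    def send_out(self, packet: Packet) -> Packet:
        """Replace the private sender with the router's own public address."""
        key = (packet.src_ip, packet.src_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return Packet(self.public_ip, self.outbound[key],
                      packet.dst_ip, packet.dst_port, packet.body)

    def receive_reply(self, packet: Packet):
        """Restore the original recipient, or drop packets nobody asked for."""
        target = self.inbound.get(packet.dst_port)
        if target is None:
            return None
        private_ip, private_port = target
        return Packet(packet.src_ip, packet.src_port,
                      private_ip, private_port, packet.body)

router = HomeRouter("198.51.100.7")
request = Packet("192.168.1.10", 51000, "203.0.113.5", 443, b"request")
sent = router.send_out(request)
reply = Packet("203.0.113.5", 443, sent.src_ip, sent.src_port, b"response")
print(router.receive_reply(reply))
```

The server only ever sees the router's address; the table is what lets the reply reach the right device inside the home network.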
Routing mechanisms in a home network are very simple. However, if we look at the structure of the Internet on a slightly larger scale, we see that the Internet has no central points. It is a distributed network. This means that when we want to contact a website, our packets can take very different routes, depending on the load on particular sections of the global network. How is it that there are so many possible routes, and yet packets do not get lost on the way to their destination?
The answer is to divide the Internet into Autonomous Systems. An Autonomous System is usually a large organization’s network that has its own pool of unique IP addresses. Internet providers, data centers, research centers and corporations can register as Autonomous Systems. Autonomous Systems are responsible for monitoring and optimizing traffic on the Internet. Of course, they usually do this in order to maximize their own benefits. Autonomous Systems establish connections among themselves. Each system can have many such “neighboring” systems, to which it knows the routes and whose traffic it monitors on the fly so that it can optimize it. So whenever you type a web page address into your browser’s address bar, a request in the form of a set of packets is sent to the closest Autonomous System, which is usually your Internet provider. The ISP, knowing the routes to the target server, chooses one of them and sends the packets to the next system. The procedure is repeated in each subsequent system until the packets reach the system to which the destination server is connected. Finally, the packets reach the server’s network interface and are interpreted. On their way through the Internet, packets almost always pass through several intermediary devices, and often many more.
Of course, dividing the Internet into Autonomous Systems would make no sense without a specification of how they connect to one another. The Border Gateway Protocol (BGP for short) is the actual foundation of the Internet. Thanks to this protocol, Autonomous Systems exchange routes and optimize Internet traffic. The list of active systems changes all the time: new systems are registered and some cease to operate. Keeping track of so many dynamically changing routes by hand would be extremely difficult. BGP automates this process and optimizes traffic.
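The way routes spread from one Autonomous System to its neighbors can be illustrated with a heavily simplified Python sketch. It assumes that every system simply keeps the shortest path it hears about; real BGP applies local policies before path length, and the function name `propagate_routes`, the link list and the AS numbers (taken from the range reserved for private use) are all invented for the example.

```python
from collections import deque

def propagate_routes(links, origin):
    """Return, for every AS, the list of systems on its path to `origin`.

    Simplification: announcements spread outward one hop at a time, and
    each AS keeps the first path it hears, which in this breadth-first
    order is always one of the shortest.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    best_path = {origin: [origin]}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for neighbor in sorted(neighbors.get(current, ())):
            if neighbor in best_path:
                continue  # this AS already has a route at least as short
            # The neighbor prepends itself to the path it was told about.
            best_path[neighbor] = [neighbor] + best_path[current]
            queue.append(neighbor)
    return best_path

# Hypothetical topology: pairs of directly connected Autonomous Systems.
links = [
    (64512, 64513), (64513, 64514),
    (64512, 64515), (64515, 64516), (64516, 64514),
]
for system, path in sorted(propagate_routes(links, origin=64514).items()):
    print(system, "->", path)
```

Each system only needs to talk to its direct neighbors, yet every system ends up with a usable route to the origin. This is also why one bad announcement can travel far.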
Of course BGP, like any other protocol, needs to be properly configured. Unfortunately, with this protocol configuration errors can have serious consequences. In 2004, a Turkish Internet provider accidentally sent incorrect routes to neighboring Autonomous Systems. The neighboring systems concluded that the ISP in question was the best route for all connections. The bad routes then propagated to other Autonomous Systems, resulting in a huge load on the Turkish ISP’s network and ultimately cutting off many users worldwide from the Internet for a day.
In 2008, a Pakistani ISP used BGP to try to cut off Pakistani citizens from YouTube. By mistake, it announced routes that cut off users worldwide from the service. The crisis was resolved after a few hours.
Hackers are not indifferent to BGP either. Many of them are well aware of how critical this protocol is to the Internet. In April 2018, by hijacking BGP routes, a hacking group took control of traffic to the DNS service managed by Amazon. The hackers stole cryptocurrencies worth about $100,000.
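One common way a wrong announcement can capture traffic is by advertising a narrower (more specific) address range than the legitimate owner, because routers prefer the most specific matching route. The sketch below illustrates only that general rule; the prefixes, labels and the function `pick_route` are hypothetical and are not the actual data from the incidents described above.

```python
import ipaddress

def pick_route(routing_table, destination):
    """Return the announcer of the most specific prefix covering `destination`."""
    address = ipaddress.ip_address(destination)
    matches = [
        (network, announcer)
        for network, announcer in routing_table
        if address in network
    ]
    if not matches:
        return None
    # Longest-prefix match: the larger the prefix length, the narrower the range.
    _, announcer = max(matches, key=lambda match: match[0].prefixlen)
    return announcer

# Legitimate announcement by a hypothetical video service.
table = [(ipaddress.ip_network("198.51.100.0/24"), "video service")]
print(pick_route(table, "198.51.100.20"))   # video service

# A misconfigured ISP announces a narrower range covering the same address.
table.append((ipaddress.ip_network("198.51.100.0/25"), "misconfigured ISP"))
print(pick_route(table, "198.51.100.20"))   # misconfigured ISP
```

The legitimate route is still in the table, but it is no longer chosen, so traffic flows to the wrong network until the bad announcement is withdrawn or filtered.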
The moral of this story is quite simple. Using the Internet calls for a great deal of caution in many respects. Whether we like it or not, the Internet is founded on trust. We trust the administrators of Autonomous Systems, the security settings of our own operating systems, and even the people who manage the router at the coffee shop where we drink our morning coffee. Encrypting data, choosing strong passwords and secure connections, limiting our trust in other users and minimizing the private data we leave on the Internet should be standard practice for us. Remember that never having fallen victim to a hacking attack does not mean that nobody has obtained our data by standing between us and a service we wanted to use. Data, once acquired, can be stored for years. Administrators managing the infrastructure in the office buildings where we work apply various security policies, but we must remember that the weakest link in the security of IT systems is never the software, hardware or security policy, but us, the users. We decide how we connect to the Internet, what data we put on the Internet and how we secure it. We do it at our own risk.