Experiment: How to disguise the use of Tor to bypass blocking

Internet censorship is a growing problem worldwide. It fuels an arms race: government agencies and private corporations in various countries try to block content and to defeat the tools used to circumvent those blocks, while developers and researchers work to build effective anti-censorship tools.

Researchers from Carnegie Mellon University, Stanford University, and SRI International ran an experiment in which they built a service that masks the use of Tor, one of the most popular tools for bypassing blocking. This article summarizes their work.

Tor against blocking

Tor provides anonymity by routing traffic through relays, that is, intermediate servers between the user and the site they want to reach. There are usually several relays between the user and the site, and each of them can decrypt only a small part of the packet being sent: just enough to learn the next point in the chain and forward the packet there. As a result, even if a relay controlled by attackers or censors ends up in the chain, they cannot learn both the source and the destination of the traffic.
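The layering described above can be illustrated with a toy sketch. It is not Tor's actual cell format or cryptography: the XOR keystream built from SHA-256, the JSON layer layout, and the names build_onion and peel are all assumptions made for illustration only. The point is that each relay, using only its own key, learns the next hop and nothing more.

```python
import hashlib
import json

def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (NOT secure): XOR data with a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def build_onion(relays, destination: str, payload: bytes) -> bytes:
    """Wrap payload in one layer per relay; relays is a list of (name, key) in path order."""
    next_hop, blob = destination, payload
    for name, key in reversed(relays):
        layer = json.dumps({"next": next_hop, "data": blob.hex()}).encode()
        blob = _keystream_xor(key, layer)   # only this relay's key opens this layer
        next_hop = name
    return blob  # handed to the first relay in the path

def peel(key: bytes, blob: bytes):
    """What a single relay does: remove its own layer, learn only the next hop."""
    layer = json.loads(_keystream_xor(key, blob))
    return layer["next"], bytes.fromhex(layer["data"])

if __name__ == "__main__":
    path = [("guard", b"k1"), ("middle", b"k2"), ("exit", b"k3")]
    blob = build_onion(path, "example.org", b"hello")
    for _, key in path:
        hop, blob = peel(key, blob)
        print("forward to", hop)
```

In this sketch the first relay sees who sent the packet but only the name of the second relay, while the last relay sees the destination but not the sender.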

Tor works effectively as an anti-censorship tool, but censors still have the opportunity to block it completely. Iran and China have conducted successful blocking campaigns. They were able to identify Tor traffic by scanning TLS handshakes and other distinctive characteristics of Tor.

Subsequently, the developers managed to adapt the system to bypass the blocks. The censors responded by blocking HTTPS connections to a variety of sites, including Tor. The project developers then created obfsproxy, a program that adds another layer of encryption to the traffic. This contest is ongoing.

Initial data of the experiment

The researchers decided to develop a tool that would mask the use of Tor in order to make its use possible even in regions where the system is completely blocked.

The researchers started from the following assumptions:
The censor controls an isolated internal segment of the network that connects to the external uncensored Internet.
The blocking authorities control the entire network infrastructure within the censored network segment, but not the software on end-user computers.
The censor seeks to prevent users from accessing materials that are undesirable from the censor's point of view. It is assumed that all such materials are located on servers outside the controlled network segment.
Routers on the perimeter of this segment analyze the unencrypted data of all packets in order to block unwanted content and prevent the corresponding packets from entering the perimeter.
All Tor relays are located outside the perimeter.

How it works

To mask the use of Tor, the researchers created the StegoTorus tool. Its main task is to improve Tor's ability to resist automated protocol analysis. The tool sits between the client and the first relay in the chain and uses its own encryption protocol and steganography modules to make Tor traffic difficult to identify.

First, a module called the chopper converts the traffic into a sequence of blocks of varying length, which are then sent out of order.
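A minimal sketch of the chopping idea, assuming nothing about StegoTorus's real implementation: the functions chop and reassemble, the length limits, and the use of a plain counter as the sequence number are all hypothetical. It shows why a sequence number is enough to let the receiver restore the stream even when blocks arrive in a different order.

```python
import random

def chop(data: bytes, min_len: int = 1, max_len: int = 64, rng=None):
    """Split data into (sequence_number, chunk) pairs of randomly varying length."""
    rng = rng or random.Random()
    blocks, offset, seq = [], 0, 0
    while offset < len(data):
        size = rng.randint(min_len, max_len)  # hypothetical length policy
        blocks.append((seq, data[offset:offset + size]))
        offset += size
        seq += 1
    return blocks

def reassemble(blocks):
    """Restore the original byte stream by ordering blocks on sequence number."""
    ordered = sorted(blocks, key=lambda block: block[0])
    return b"".join(chunk for _, chunk in ordered)

if __name__ == "__main__":
    rng = random.Random(7)
    payload = b"GET / HTTP/1.1\r\nHost: example.org\r\n\r\n" * 3
    blocks = chop(payload, rng=rng)
    rng.shuffle(blocks)  # simulate out-of-order delivery
    assert reassemble(blocks) == payload
```

In the real tool the block lengths are chosen by the steganography modules rather than at random, so the random policy here only stands in for that decision.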

The data is encrypted with AES in GCM mode. The block header contains a 32-bit sequence number, two length fields (d and p) that indicate the amount of data, a special field F, and a 56-bit check field whose value must be zero. The minimum block length is 32 bytes, and the maximum is 2^17 + 32 bytes. The length is controlled by the steganography modules.
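The header layout can be sketched with Python's struct module. The text gives only the widths of the sequence number (32 bits) and the check field (56 bits); the 16-bit width of d and p, the 8-bit width of F, the big-endian byte order, and the 16-byte authentication tag are assumptions chosen so that the header fills 128 bits and an empty block comes out at the stated 32-byte minimum. Under those assumptions the largest block also stays within the stated maximum, read here as 2^17 + 32 bytes.

```python
import struct

HEADER_FORMAT = ">IHHB7s"   # seq (32) + d (16) + p (16) + F (8) + check (56) bits
HEADER_LEN = struct.calcsize(HEADER_FORMAT)   # 16 bytes
TAG_LEN = 16                # assumed 128-bit AES-GCM authentication tag
MAX_FIELD = 2**16 - 1       # largest value of an assumed 16-bit length field

def pack_header(seq: int, d: int, p: int, f: int) -> bytes:
    """Serialize a header; the trailing 7 bytes are the all-zero check field."""
    return struct.pack(HEADER_FORMAT, seq, d, p, f, bytes(7))

def unpack_header(raw: bytes):
    """Parse a header and reject it if the check field is not zero."""
    seq, d, p, f, check = struct.unpack(HEADER_FORMAT, raw)
    if check != bytes(7):
        raise ValueError("check field is not zero")
    return seq, d, p, f

def block_len(d: int, p: int) -> int:
    """Total block size: header, d + p bytes of body, and the tag."""
    return HEADER_LEN + d + p + TAG_LEN

if __name__ == "__main__":
    header = pack_header(seq=1, d=100, p=20, f=0)
    print(len(header), unpack_header(header), block_len(100, 20))
```

A receiver that decrypts a header and finds a nonzero check field can discard the block immediately, which is the purpose of reserving bits that must be zero.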

When a connection is established, the first few bytes are a handshake message that tells the server whether the connection belongs to an existing link or a new one. For a new link, the server replies with its own handshake, and both participants derive session keys from the exchange. The system also implements a rekeying mechanism: it works like session key derivation but uses blocks instead of handshake messages. Rekeying changes the sequence number but leaves the link ID untouched.

Once both participants have sent and received the fin block, the link is closed. To guard against replay attacks and delayed block delivery, both participants must remember the link ID for some time after the link is closed.
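One way to picture "remembering the ID for some time" is a small cache with an expiry time. This is a sketch under assumptions, not the StegoTorus code: the class name ClosedLinkCache, the 300-second default, and the injectable clock are all hypothetical.

```python
import time

class ClosedLinkCache:
    """Remembers IDs of closed links for a limited time (hypothetical helper)."""

    def __init__(self, ttl_seconds: float = 300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._closed = {}  # link_id -> time at which the link was closed

    def mark_closed(self, link_id: bytes) -> None:
        self._closed[link_id] = self._clock()

    def is_recently_closed(self, link_id: bytes) -> bool:
        """True if a block for this ID should be treated as a replay or a late arrival."""
        self._expire()
        return link_id in self._closed

    def _expire(self) -> None:
        now = self._clock()
        stale = [k for k, closed_at in self._closed.items() if now - closed_at > self._ttl]
        for k in stale:
            del self._closed[k]

if __name__ == "__main__":
    cache = ClosedLinkCache(ttl_seconds=60)
    cache.mark_closed(b"link-42")
    print(cache.is_recently_closed(b"link-42"))
```

A handshake or block that names a recently closed link can then be dropped instead of being mistaken for the start of a new link.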

The built-in steganography module hides Tor traffic inside a peer-to-peer protocol, imitating the way Skype carries encrypted VoIP traffic. The HTTP steganography module simulates unencrypted HTTP traffic, so that the system looks like a real user with a regular browser.

Resistance to attacks

To test how much the proposed method improves Tor's resistance to detection, the researchers developed two types of attack.
The first separates Tor streams from other TCP streams based on fundamental characteristics of the Tor protocol; this method was used for blocking by the Chinese government. The second analyzes streams already known to be Tor in order to extract information about which sites the user visited.

The researchers confirmed the effectiveness of the first type of attack against "vanilla" Tor. To do so, they collected traces of visits to the ten most popular sites in the Alexa.com ranking, twenty times each, through regular Tor, obfsproxy, and StegoTorus with the HTTP steganography module. The CAIDA dataset restricted to port 80 was used as the reference for comparison, since those connections are almost certainly all HTTP.

The experiment showed that ordinary Tor is quite easy to identify. The Tor protocol is too distinctive and has a number of easily recognized characteristics; for example, its TCP connections last 20-30 seconds. Obfsproxy does nothing to hide these obvious features. StegoTorus, in turn, generates traffic that is much closer to the CAIDA reference.

For the attack that identifies visited sites, the researchers compared how likely such disclosure is with "vanilla" Tor and with their own solution, StegoTorus. The AUC (Area Under Curve) metric was used for the assessment.
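For readers unfamiliar with the metric, AUC can be read as the probability that a classifier ranks a randomly chosen positive example above a randomly chosen negative one: 1.0 means the attacker's classifier is perfect, and 0.5 means it does no better than guessing. The sketch below computes it directly from that definition. The function name auc and the sample scores are illustrative and are not data from the experiment.

```python
def auc(scores, labels):
    """AUC as the chance that a positive outranks a negative (ties count as 1/2)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

if __name__ == "__main__":
    # Made-up classifier scores; label 1 means "this trace is a visit to the target site".
    scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
    labels = [1, 1, 1, 0, 0, 0]
    print(auc(scores, labels))
```

From the defender's point of view, a lower AUC for the attacker's classifier means less information about visited sites leaks through the traffic.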

The analysis showed that the probability of disclosing information about visited sites is significantly higher for ordinary Tor without additional protection.

Conclusion

The history of the confrontation between authorities that censor the Internet and developers of circumvention systems suggests that only comprehensive protection measures can be effective. A single tool cannot guarantee access to the data you need, nor can it guarantee that censors will not learn that a block is being bypassed.

Therefore, when using any tools to ensure privacy and access to content, it is important not to forget that there are no ideal solutions, and if possible, combine different methods to achieve the greatest efficiency.
