This is a republishing of an old article I wrote as a companion piece to a short talk I delivered at the Leeds Beckett Ethical Hacking Society in 2015. I’m restoring it because the RAPTOR site produced by the team behind the academic paper still points to it and the idea of a broken link bothers me.
Presenting itself as the latest iteration in an increasingly long line of poorly conceived backronyms, RAPTOR represents an up-scaled version of the existing traffic confirmation attacks against the Tor platform.
Traffic analysis attacks have long been known to represent a fundamental threat to Tor users, primarily because the service doesn’t (or rather cannot) prevent monitoring of traffic as it enters the network at the guard relay and leaves it at the exit relay. The traffic leaving an exit node is actively “anonymised” by Tor but, due to its focus on delivering low-latency connections, Tor doesn’t engage in packet obfuscation. As such, it cannot protect against end-to-end correlation.
This structural vulnerability represents a grave threat to the Tor model, since being able to positively identify and correlate entry and exit traffic subverts the very anonymity that Tor is designed to protect.
What is RAPTOR?
The work behind RAPTOR was carried out by several PhD students and professors at Princeton University, who have successfully implemented the attacks they outline in the 17-page report, which is available from arXiv.org. The report covers three methods of attack: asymmetric analysis of traffic over the Tor network, exploitation of the natural “churn” of BGP paths, and direct manipulation of BGP inter-domain routing through hijacking and interception of IP prefixes.
Asymmetric Traffic Analysis
Tor has long been susceptible to statistical correlation attacks. If traffic between the client and the entry relay is routed through the same autonomous system as traffic between the exit relay and the destination, it can be compared to determine correlating patterns and potentially identify the destination that a particular client has communicated with.
The RAPTOR report demonstrates that if an adversary can observe the traffic, in either direction, at both ends of the communication, they can potentially deanonymise the client. The figures above show how an attack which focuses only on forward traffic would be limited to comparing traffic routed through AS5. Alternatively, an attack that analyses both forward and backward traffic can collate traffic from AS3, AS4, and AS5.
Traffic captured from both path segments of a user’s communication over Tor can be correlated very accurately, as demonstrated in the graphs below produced by the Princeton team. Below them is a table displaying the statistical accuracy of positive client-to-destination identification using the data in the graphs.
The results are pretty staggering and serve to really emphasise just how dramatically a strategic AS-level adversary could undermine the Tor service. I don’t think there’s much need for further exposition on this particular section - the graphs really speak for themselves.
The graphs in the above picture illustrate the enormous difference between the traffic of an unmatched pair, which shows very poor correlation, and the extremely clear correlation in the graphs from the matched pair.
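To make the idea of matched and unmatched pairs a little more concrete, here is a minimal sketch of correlating two traffic traces. It is not the Princeton team’s code: it uses a plain Pearson correlation over bytes-per-second counts as a simple stand-in for whatever statistic the report applies, and every number and name in it is invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def best_match(client_trace, exit_traces):
    """Name of the exit-side trace that correlates most strongly with the client."""
    return max(exit_traces, key=lambda name: pearson(client_trace, exit_traces[name]))


# Hypothetical bytes-per-second counts, invented for this example.
client_side = [120, 900, 450, 30, 1500, 700, 60, 1100]
exit_traces = {
    "matched": [130, 880, 470, 40, 1450, 720, 55, 1080],   # tracks the client
    "unmatched": [500, 520, 480, 510, 490, 530, 470, 505],  # unrelated flow
}

for name, trace in exit_traces.items():
    print(name, round(pearson(client_side, trace), 3))
print("best match:", best_match(client_side, exit_traces))
```

The point is only that an observer holding both traces needs nothing more exotic than basic statistics to pick out the matching flow.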
Natural Churn (of BGP paths)
As a natural by-product of changes in the physical network topology, traffic over the Internet can take slightly (and sometimes drastically) different routes over time, even if the source and destination do not change. This is usually a result of hardware failures or recoveries, and the rollout of new infrastructure, e.g. routers and links. Other influential factors can be the implementation of new policies at the AS-level and the impact of new business relationships. This fluctuation in routes is referred to as BGP churn.
For users who continue to communicate with the same destination recipients, each instance of communication has the potential to be compromised. In order to address this threat, the Tor platform utilises fixed entry guard relays for a set period of time (usually 9 months) which reduces the opportunity for malicious relays to gain a foothold in the chain of communications. This does not, however, address the threat posed by adversarial ASes.
The path between the client and its guard relay can change over time, as a result of the aforementioned churning in BGP routes. As the images below exemplify, the collapse of the AS5 to AS4 link results in AS3 joining the path between the exit relay and the destination allowing for further traffic analysis.
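The effect of churn can be sketched in a few lines: any AS that has appeared on the entry side and on the exit side at some point during the observation window is in a position to correlate. The topology below is hypothetical and is not the figure from the report; the AS names and paths are assumptions chosen only to show a link failure pulling a new AS onto the exit-side path.

```python
def observers(paths):
    """All ASes that appear on any of the paths seen over time."""
    seen = set()
    for path in paths:
        seen.update(path)
    return seen


def compromising_ases(entry_paths, exit_paths):
    """ASes that saw both the entry side and the exit side at some point."""
    return observers(entry_paths) & observers(exit_paths)


# Hypothetical AS-level paths (invented for illustration).
entry_path = ["AS1", "AS3", "AS2"]            # client -> guard relay
exit_before = ["AS6", "AS5", "AS4", "AS7"]    # exit relay -> destination
exit_after = ["AS6", "AS5", "AS3", "AS7"]     # same segment after a link failure

print(compromising_ases([entry_path], [exit_before]))
print(compromising_ases([entry_path], [exit_before, exit_after]))
```

Before the route change no single AS sits on both segments; once the exit-side path shifts, AS3 does, even though neither the client nor the destination did anything differently.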
The results from the team at Princeton showed that, over a period of one month, natural churn has the potential to increase the number of compromised Tor circuits by up to 50%. Further statistical analysis, which is thoroughly laid out in the report, showed that some of the largest ASes saw traffic from up to 90% of all entry and exit pairs.
In contrast to the passive nature of the first two attacks, there is also the possibility of an active attack which seeks to manipulate the trust model of BGP routing by engaging in dishonest practices. This kind of attack is well known already but hasn’t previously been applied to anonymity providing services like Tor.
A brief deviation
There is a really great overview of the broader topic of Internet routing, and BGP in general, by Dr Richard Mortier from the University of Nottingham, produced by the YouTube channel Computerphile, which you can find here. If you’re interested in a more thorough explanation, I highly recommend watching it.
Relatedly, as a quick deviation, I want to just go over how BGP routing works to explain the efficacy of this attack and how it actually affects Tor traffic.
RAPTOR includes an attack which relies on hijacking or intercepting the BGP routes by which packets are directed around the internet. In particular, the attack requires that an adversarial AS advertises itself with the IP prefix of the AS that an attacker wishes to siphon traffic from. An IP prefix is a shorthand for representing a range of IP addresses. So, all the addresses between 192.168.1.0 and 192.168.1.255 can be represented simply with 192.168.1/24 - you may be familiar with its counterpart subnet mask, which is 255.255.255.0. The /24 simply represents the first 24 bits in the 32-bit structure of an IPv4 address.
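If you want to check the prefix arithmetic yourself, Python’s standard `ipaddress` module will do it. This is just a small illustration of the /24 example above; note that the module wants the prefix written out in full as 192.168.1.0/24 rather than the 192.168.1/24 shorthand.

```python
import ipaddress

# The /24 from the text, written with all four octets.
net = ipaddress.ip_network("192.168.1.0/24")

first = net.network_address    # lowest address in the range
last = net.broadcast_address   # highest address in the range
mask = net.netmask             # the counterpart subnet mask
count = net.num_addresses      # 2 ** (32 - 24) addresses

print(first, last, mask, count)

# Membership test: does a given address fall inside the prefix?
print(ipaddress.ip_address("192.168.1.77") in net)
print(ipaddress.ip_address("192.168.2.77") in net)
```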
Since the IPv4 address range has a pool of over 4 billion addresses (2^32), it would be nonsensical for every single AS to maintain a list of routes for every IPv4 address with 32-bit accuracy. A router which is potentially hundreds of network hops away from the destination has no need to know its exact route and can instead simply retain a prefix for a hop closer to it. Traffic for an example IP address such as 192.168.1.1 can be sent via the route which advertises for 192.168.1/24 instead. It might even only have 192.168/16 if it’s sufficiently far away from the recipient.
This concept is a key feature in the active attack portion of RAPTOR which I will explain further after outlining one other important point about IP prefixes.
When an AS receives a packet and checks its destination, it will try to find the most accurate IP prefix route by which to send it onwards. So if it has a packet destined for 192.168.1.1 and within the BGP table it has an entry for 192.168/16 and an entry for 192.168.1/24, the packet will be sent via the second entry due to the method by which BGP operates - which is to find the prefix that best matches the destination IP.
An important note on this is that most ASes will not look for anything more specific than a 24-bit prefix, and will most likely route the packet to the first matching 24-bit prefix they find in the table.
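The “best match” rule described above is usually called longest-prefix matching, and it can be sketched with the standard `ipaddress` module. This is a toy lookup over a dictionary, not how a real router stores its table, and the next-hop names are made up for the example.

```python
import ipaddress


def longest_prefix_match(destination, table):
    """Return (prefix, next_hop) for the most specific matching entry, or None."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network, next_hop))
    if not matches:
        return None
    # The entry with the longest prefix length is the most specific one.
    network, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(network), next_hop


# Hypothetical routing table: prefix -> next hop (names are invented).
table = {
    "192.168.0.0/16": "AS-far",
    "192.168.1.0/24": "AS-near",
}

print(longest_prefix_match("192.168.1.1", table))   # both match, /24 wins
print(longest_prefix_match("192.168.7.9", table))   # only the /16 matches
print(longest_prefix_match("10.0.0.1", table))      # nothing matches
```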
Now back to the show.
This attack takes two forms - hijacking and interception.
Hijacking occurs when the adversarial AS advertises a more accurate IP prefix than the AS(es) on the desired route, so traffic intended for an AS between the client and entry relay is diverted via the undesirable AS, where it can be further analysed. Interception occurs when the desired route is through an AS which is logically further away from the client than the attacker’s compromised AS, and the attacker’s advertisements of the more accurate IP prefix are simply reached first.
An interception-style attack is displayed in the images from the report included below.
The report splits the suggested countermeasures into two main groups: (a) approaches with the goal of reducing the ability of any AS-level adversary to observe both segments of the anonymous communication, and (b) approaches that focus on reducing the likelihood of correlation in the case that an adversary successfully observes both segments of the communication.
In order to minimise the potential for traffic analysis at an AS-level, the Tor network could implement monitoring of path dynamics on the client-entry and exit-destination segments which could be utilised when selecting which relays to include in the circuit. The report suggests that Tor clients could select relays in a way that ensures the first and last segment are not routed via any matching ASes.
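As a rough sketch of what AS-aware relay selection could look like, the snippet below keeps only those guard and exit combinations whose two outer segments share no AS. It assumes the client already has an estimate of which ASes sit on each segment, which is the hard part in practice; the relay names and AS sets are hypothetical, and this is not how Tor’s path selection is actually implemented.

```python
from itertools import product


def safe_circuits(guards, exits):
    """Pairs (guard, exit) whose client-side and destination-side ASes do not overlap.

    guards: relay name -> set of ASes on the client-to-guard segment
    exits:  relay name -> set of ASes on the exit-to-destination segment
    """
    return [
        (guard, exit_relay)
        for (guard, guard_ases), (exit_relay, exit_ases)
        in product(guards.items(), exits.items())
        if guard_ases.isdisjoint(exit_ases)
    ]


# Hypothetical path estimates (invented for illustration).
guards = {
    "guardA": {"AS1", "AS3"},
    "guardB": {"AS1", "AS2"},
}
exits = {
    "exitX": {"AS3", "AS7"},
    "exitY": {"AS6", "AS7"},
}

for circuit in safe_circuits(guards, exits):
    print(circuit)
```

Here guardA paired with exitX is rejected because AS3 would see both ends, while the other three combinations pass.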
From a mitigation standpoint, there are a couple of suggested measures. The first of these is a monitoring framework that would be used to detect routing attacks. This would allow the Tor project to potentially identify problem ASes and, hopefully, announce them publicly so that they can be held to account. Additionally, clients can be informed, allowing them to make a conscious decision about whether or not to suspend their usage of Tor or select a new relay.
The second method for mitigation focuses on attempting to prevent instances of routing attacks. The approaches required for this include enforcing that Tor operators advertise /24 prefixes, that Tor clients favour logically closer guard relays, and the application of secure inter-domain routing.
Enforcing /24 IP prefixes would have a profound effect on BGP hijacking. As mentioned in the short deviation above, ASes will filter out routes that are more specific than 24 bits, so AS-level attackers will be unable to launch a more specific hijack. This can be combined with the favouring of closer guard relays. If the chosen guard relay is topologically closer than the adversarial AS then the attacker will be unable to intercept the route of the client’s traffic - this has the added benefit of further protecting the communication against hijacks as well as mitigating the opportunity for asymmetric traffic analysis and the effects of BGP churn.
The final suggested improvement is better securing of inter-domain routing. This is a solution that requires investment, uptake, and agreement from many different parties that make up the infrastructure and ecosystem of the Internet. As the Princeton report suggests, the highlighting of attacks such as RAPTOR is hugely important in driving interest around this subject and helping to accelerate the development in this area.
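The reasoning behind the /24 recommendation can be shown with a toy model. It assumes every AS drops announcements more specific than /24 and then picks the most specific remaining prefix; the origin names and addresses are hypothetical. It deliberately ignores what happens when two origins announce an equally specific prefix, which real BGP resolves using other route attributes.

```python
import ipaddress

MAX_PREFIX_LEN = 24  # assumption: announcements longer than /24 are filtered


def chosen_origin(destination, announcements, max_len=MAX_PREFIX_LEN):
    """Origin AS whose accepted announcement most specifically covers destination."""
    dest = ipaddress.ip_address(destination)
    candidates = []
    for prefix, origin in announcements:
        network = ipaddress.ip_network(prefix)
        if network.prefixlen > max_len:
            continue  # filtered out as too specific
        if dest in network:
            candidates.append((network, origin))
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[0].prefixlen)[1]


relay_ip = "192.168.1.10"  # hypothetical relay address

# Relay's AS announces a /16: the attacker can win with a more specific /24.
weak = [("192.168.0.0/16", "relay-AS"), ("192.168.1.0/24", "attacker-AS")]

# Relay's AS announces a /24: the attacker's /25 is filtered and never competes.
strong = [("192.168.1.0/24", "relay-AS"), ("192.168.1.0/25", "attacker-AS")]

print(chosen_origin(relay_ip, weak))
print(chosen_origin(relay_ip, strong))
```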
I want to sign this off with a quick mention of an ethical concern raised towards the end of the report.
Colluding adversaries. ASes that fall inside the same jurisdiction as one another could very easily be instructed to monitor traffic from the Tor network and report it to law enforcement. Looking at the list of providers who observed the top ten percentages of Tor traffic makes the previous statement a very concerning possibility. This concern is further compounded when you consider how many nation states currently engage in data sharing for law enforcement purposes.
I hope this wasn’t too dense a read for those of you who took the time to read it. A little nod, too, to those who suffered through my rather fast summation of this topic in my talk at the Leeds Ethical Hacking Society.