Attackers on the Internet often use intermediate systems as stepping stones to hide their true origin. Detecting stepping stones (that is, linking an incoming connection to a system with an outgoing one) can help detect ongoing attacks and investigate past intrusions. Such detection cannot rely on communication contents, since they are usually cryptographically transformed; it must instead rely on the timing and volume of network traffic on the two connections. To make detection more efficient and robust, a router may embed a watermark into the network traffic timings, to be recognized at a later point.
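As a rough illustration of what embedding a watermark into traffic timings can look like, the sketch below quantizes inter-packet delays so that each delay encodes one bit. It is not the scheme studied in this project. The function names (`watermark_bits`, `embed`, `match_fraction`), the 20 ms quantization step, the key string, and the synthetic traffic are all assumptions made for the example.

```python
import math
import random

def watermark_bits(key, n):
    """Derive n pseudo-random watermark bits from a shared secret key (hypothetical helper)."""
    rng = random.Random(key)
    return [rng.randrange(2) for _ in range(n)]

def embed(arrivals, bits, step):
    """Return departure times whose inter-packet delays (IPDs) encode the bits.

    Packets are only ever delayed, never sent early. Each IPD is stretched to
    the centre of the next quantization cell of width `step` whose index has
    the same parity as the bit being embedded.
    """
    out = [arrivals[0]]
    for i in range(1, len(arrivals)):
        bit = bits[(i - 1) % len(bits)]
        # Smallest IPD we can produce without sending the packet early.
        min_ipd = max(arrivals[i] - out[-1], 0.0)
        q = math.floor(min_ipd / step)
        # Advance to the first cell with the right parity whose centre
        # is not earlier than the minimum feasible IPD.
        while q % 2 != bit or (q + 0.5) * step < min_ipd:
            q += 1
        out.append(out[-1] + (q + 0.5) * step)
    return out

def match_fraction(times, bits, step):
    """Decode one bit per IPD and return the fraction that agree with the watermark."""
    hits = 0
    for i in range(1, len(times)):
        decoded = math.floor((times[i] - times[i - 1]) / step) % 2
        hits += decoded == bits[(i - 1) % len(bits)]
    return hits / (len(times) - 1)

# Synthetic flow: 201 packets with exponential inter-arrival times (mean 100 ms).
rng = random.Random(1)
t, arrivals = 0.0, []
for _ in range(201):
    t += rng.expovariate(10.0)
    arrivals.append(t)

STEP = 0.02  # 20 ms quantization step (assumed)
bits = watermark_bits("shared-secret", 32)
marked = embed(arrivals, bits, STEP)

# Network jitter of up to 4 ms per packet shifts each IPD by less than STEP / 2,
# so every embedded bit remains decodable downstream.
jittered = [x + rng.uniform(-0.004, 0.004) for x in marked]

print("marked flow:  ", match_fraction(jittered, bits, STEP))
print("unmarked flow:", match_fraction(arrivals, bits, STEP))
```

A detector would flag a flow as watermarked when the match fraction exceeds a threshold chosen for the desired false-positive rate. An unmarked flow is expected to match only about half the bits. The step size trades off robustness to jitter against the delay added to each packet.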
We aim to investigate watermarking schemes with several research goals in mind. We want to improve the efficiency of watermarks in terms of the number of packets marked, error rates, the amount of data that can be embedded in a watermark, and robustness to attacks. We also want to study system properties of watermarking techniques, such as the feasibility of large-scale watermark detection, the communication costs between markers and detectors, and how limits on systems' ability to measure and alter network timings affect the effectiveness of watermarking techniques.
In addition to stepping stone attacks, we want to study how watermarking can be applied to anonymous communication networks to link the streams that such networks relay. In this context, it may be necessary to correlate watermarks on thousands of different streams marked at many different sources, so questions of efficiency and scalability are paramount. An additional question is whether the presence of a watermark can be hidden so that only authorized parties can detect it, since otherwise users will avoid routers that watermark their streams. Studying these questions will let us better understand the limits of anonymous networking and investigate potential countermeasures. Large-scale traffic analysis also has other applications, such as tracking attackers as they step across foreign systems that may not be willing to cooperate, and identifying large-scale coordinated activity, such as that present in botnets. In these applications, hidden watermarks have the advantage that they impose minimal delay costs on the traffic of legitimate users.