Traceroute processing

mappel edited this page Aug 26, 2021 · 1 revision

See also the Measurements Results API Reference.

Filtering broken traceroutes

If any of the following checks fail, the traceroute is not included in the calculation.

  1. Check dst_addr field. If it is missing, the DNS resolution of the topology4.dyndns.atlas.ripe.net hostname has most likely failed.
  2. Look up the AS number corresponding to dst_addr.
  3. Look up the prefix corresponding to dst_addr.
  4. Check from field. Even though there should be no reason for this field to be missing or empty, sometimes it is.
  5. Look up the AS number corresponding to from.
  6. Check if the (from, destination prefix) combination was already seen in previous traceroutes for this time interval. We process traceroutes in daily intervals and a single probe should not target the same prefix twice during one day. Some probes, however, do so, sometimes multiple times in succession, possibly due to DNS caching.
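
The checks above can be sketched as a single predicate. This is a minimal illustration, not the actual implementation: `accept_traceroute`, `ip_to_asn`, and `ip_to_prefix` are hypothetical names, the lookups are assumed to return `None` when they fail, and `seen` is assumed to be reset at the start of each daily interval.

```python
from typing import Callable, Optional, Set, Tuple


def accept_traceroute(
    result: dict,
    ip_to_asn: Callable[[str], Optional[int]],     # hypothetical lookup helper
    ip_to_prefix: Callable[[str], Optional[str]],  # hypothetical lookup helper
    seen: Set[Tuple[str, str]],                    # (from, prefix) pairs of this interval
) -> bool:
    """Return True if the traceroute passes all six filter checks."""
    dst_addr = result.get("dst_addr")
    if not dst_addr:                     # 1. missing: DNS resolution likely failed
        return False
    if ip_to_asn(dst_addr) is None:      # 2. no AS number for the destination
        return False
    prefix = ip_to_prefix(dst_addr)
    if prefix is None:                   # 3. no prefix for the destination
        return False
    src = result.get("from")
    if not src:                          # 4. from field missing or empty
        return False
    if ip_to_asn(src) is None:           # 5. no AS number for the probe address
        return False
    key = (src, prefix)
    if key in seen:                      # 6. probe already targeted this prefix
        return False
    seen.add(key)
    return True
```

Only the first traceroute for a given (from, destination prefix) pair is kept; later ones in the same interval are rejected.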

Hop processing

The following steps are applied to each hop in the top-level result list. This process generates a raw AS and IP path with one entry for each hop. This path includes AS 0 as a placeholder if an IP cannot be mapped to an AS. The raw path is later reduced to a valid AS path.

  1. Check error field. If it is present, some critical error (e.g., packet_send failed) has occurred and the traceroute was aborted. We ignore the traceroute in this case.
  2. Check each reply in the result list of the hop.
    1. If x is in the reply, the request timed out. Record * as reply address.
    2. Check from field. Technically, this field should always exist if the result is not a timeout, but for some reason it sometimes does not.
      1. If present: Record IP as reply address.
      2. If not: Record * as reply address.
    3. If err is in the reply, there was an ICMP error (e.g., Network unreachable).
      1. If the IP in the reply was not seen before, include this reply.
      2. If we saw this IP before, ignore this reply. The rationale behind this is to include new information once, but exclude repeated errors. For example, a router that repeatedly replies with Network unreachable creates "valid" hops (reply from a router), but actually does not move the traceroute forward.
  3. Check if we have any reply addresses (including *). If not, this traceroute is broken and ignored.
  4. Check if we have multiple different reply addresses. In this case, discard * from the set of reply addresses, if it is present.
  5. Check again if we still have multiple reply addresses (after removal of *). If this is the case, we add an AS set to the path and are finished with this hop.
  6. If we now (or originally) only have one reply address, we add it and the corresponding AS to the path. If the address is *, add 0 as the AS number.
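
The per-hop steps above might look roughly like the following sketch. It is an illustration under assumptions, not the actual code: `process_hop` and `ip_to_asn` are hypothetical names, "seen before" is interpreted as "seen in any earlier reply of the same traceroute" (tracked in `seen_ips`), and a return value of `None` means the whole traceroute is ignored.

```python
from typing import Callable, FrozenSet, Optional, Set, Tuple, Union

IPEntry = Union[str, Tuple[str, ...]]
ASEntry = Union[int, FrozenSet[int]]


def process_hop(
    hop: dict,
    ip_to_asn: Callable[[str], Optional[int]],  # hypothetical lookup helper
    seen_ips: Set[str],                         # IPs seen so far in this traceroute
) -> Optional[Tuple[IPEntry, ASEntry]]:
    """Return (IP entry, AS entry) for one hop, or None to ignore the traceroute."""
    if "error" in hop:                       # 1. critical error, traceroute aborted
        return None
    addresses = []
    for reply in hop.get("result", []):      # 2. inspect each reply
        if "x" in reply:                     # 2.1 timeout
            addr = "*"
        else:                                # 2.2 from may be missing
            addr = reply.get("from") or "*"
        if addr != "*":
            repeated = addr in seen_ips
            seen_ips.add(addr)
            if "err" in reply and repeated:  # 2.3 skip repeated ICMP errors
                continue
        if addr not in addresses:
            addresses.append(addr)
    if not addresses:                        # 3. no reply addresses at all
        return None
    if len(addresses) > 1 and "*" in addresses:
        addresses.remove("*")                # 4. drop * next to real addresses
    if len(addresses) > 1:                   # 5. several addresses: AS set
        as_set = frozenset(ip_to_asn(a) or 0 for a in addresses)
        return tuple(addresses), as_set
    addr = addresses[0]                      # 6. single address, AS 0 if unmapped
    asn = 0 if addr == "*" else (ip_to_asn(addr) or 0)
    return addr, asn
```

Note that an AS set produced in step 5 may still contain 0 or only a single value; the sketch leaves that cleanup to the path reduction.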

Path reduction / modification

  1. Remove hops with AS 0, i.e., * hops and hops for which we could not resolve the IP to an AS number. Convert AS sets that contain only a single value after this to normal hops.
  2. Remove duplicate AS numbers. We only include the first occurrence of each AS number in the path.
  3. If we are left with an empty path, this traceroute is ignored.
  4. If we have at least one AS number, we check if the expected peer and destination AS numbers are at the start and the end of the path. If not, we add them.
  5. If this process results in an AS path of length 1, this traceroute is ignored. Otherwise, we include it in the calculation.
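
A minimal sketch of the reduction steps, assuming the raw AS path is a list whose entries are either an integer AS number or a set of AS numbers. `reduce_path`, `peer_asn`, and `dst_asn` are hypothetical names. How AS sets take part in duplicate removal is not specified above; here a set is simply treated as its own entry. The peer and destination AS numbers are added literally as described, without re-checking for duplicates afterwards.

```python
from typing import FrozenSet, List, Optional, Union

ASEntry = Union[int, FrozenSet[int]]


def reduce_path(
    raw_as_path: List[ASEntry], peer_asn: int, dst_asn: int
) -> Optional[List[ASEntry]]:
    """Return the reduced AS path, or None if the traceroute is ignored."""
    # Step 1: drop AS 0 and convert single-valued AS sets to normal hops.
    hops: List[ASEntry] = []
    for entry in raw_as_path:
        if isinstance(entry, (set, frozenset)):
            members = frozenset(entry) - {0}
            if not members:
                continue
            hops.append(next(iter(members)) if len(members) == 1 else members)
        elif entry != 0:
            hops.append(entry)

    # Step 2: keep only the first occurrence of each entry.
    seen = set()
    path: List[ASEntry] = []
    for entry in hops:
        if entry not in seen:
            seen.add(entry)
            path.append(entry)

    # Step 3: an empty path means the traceroute is ignored.
    if not path:
        return None

    # Step 4: make sure the path starts at the peer AS and ends at the destination AS.
    if path[0] != peer_asn:
        path.insert(0, peer_asn)
    if path[-1] != dst_asn:
        path.append(dst_asn)

    # Step 5: a path of length 1 carries no information.
    if len(path) == 1:
        return None
    return path
```

For example, a raw path of `[0, 64500, 64500, 0, 64501]` with peer AS 64500 and destination AS 64501 reduces to `[64500, 64501]`.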

Removing trailing * hops

This step is independent of the reduction above, which removes * hops anyway because they map to AS 0. Here we modify the raw path itself by collapsing trailing * hops into a single hop. If a traceroute did not reach its destination, the measurement result is already truncated, and the last hop has the number 255. However, a varying number of * hops can precede this last hop. To normalize this, we replace the tail with a single hop.
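
As an illustration, collapsing the tail of a raw IP path could be sketched as follows. The function name `collapse_trailing_timeouts` and the representation of the path as a list of address strings with `"*"` for timeouts are assumptions made for this example.

```python
from typing import List


def collapse_trailing_timeouts(ip_path: List[str]) -> List[str]:
    """Replace a trailing run of "*" hops with a single "*" hop."""
    end = len(ip_path)
    # Walk backwards over the trailing "*" entries.
    while end > 0 and ip_path[end - 1] == "*":
        end -= 1
    if end == len(ip_path):
        # No trailing "*" hops: return an unchanged copy.
        return list(ip_path)
    return list(ip_path[:end]) + ["*"]
```

Timeouts in the middle of the path are left untouched; only the tail is normalized.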

Special cases

While we keep track of the following cases, we do not remove them during preprocessing since we are interested in differences between BGP and traceroute data.

  • IXP AS: Additional entry in the AS path caused by an IXP.
  • Sibling AS: Different ASN caused by organizations that own multiple ASes but do not announce all of them.
  • AS "loops": Traceroute hops can result in AS paths of the form A B A, which would be invalid for BGP. An actual routing loop, however, would lead to the traceroute not terminating. Furthermore, we do not care about the order of ASes, only whether they occur in the path. For this purpose, we can simply remove the duplicate.
  • Private ASNs: This is a bit of a chicken-and-egg problem. These should not occur in BGP announcements, but they will only occur in the traceroute data if they occur in BGP announcements, since the IP-to-AS mapping is based on BGP data.
  • Hops after destination: Similar to the AS "loops" it can happen that the traceroute reaches the destination AS that contains the target IP, but then appears to leave it temporarily. This is an artifact of multiple traceroute probes being routed over different paths (e.g., due to load balancing). Since these are valid cases (some packet entered these ASes after all), we include them.
  • Third-party addresses: Caused by routers replying with the address of the outgoing interface instead of the incoming one. This might lead to differences in the path. We will look at them when they become a problem, but according to Hyun2003 they are rather rare.