When a user starts browsing a web site, some pages are easy to reach because they are just a few clicks away, while others take considerable effort to reach.
Depth is a measure of the distance from one resource to another. It counts how many clicks it takes the user to move from the start page to any given URL.
The number of clicks necessary to reach a given resource is called its level. The start page resides on level zero, since no clicks are needed. All URLs that are linked directly from the start page have a level of one. URLs that are linked by pages on the first level have a level of two, and so on.
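The level definition above amounts to a shortest-path computation over the link graph. The following is a minimal sketch using a breadth-first search; the function name `compute_levels` and the sample URLs are assumptions made for illustration and are not part of the Audisto Crawler.

```python
from collections import deque

def compute_levels(links, start):
    """Return {url: level}, where level is the minimum number of clicks from start."""
    levels = {start: 0}          # the start page resides on level zero
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in levels:             # first visit is the shortest path
                levels[target] = levels[page] + 1
                queue.append(target)
    return levels

# Hypothetical site structure: page -> pages it links to
links = {
    "/": ["/products", "/about"],
    "/products": ["/products/widget", "/about"],
    "/products/widget": ["/"],
}
levels = compute_levels(links, "/")
print(levels)
```

Because breadth-first search visits pages in order of increasing distance, the first level assigned to a URL is always its lowest possible level, even if the URL is also linked from deeper pages.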
The Audisto Crawler provides two different perspectives on depth:
Users do not see all references to other resources on a page. Their interaction with a site is mainly limited to
The Audisto Crawler therefore considers only these kinds of references when building the User Graph. It ignores references such as <link>, <img>, and <script>.
When it comes to redirects, the Audisto Crawler regards them as being invisible to a user, so they do not increment the level. For example:
Both B and C would be assigned a user level of 5:
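The rule that redirects do not increment the user level can be sketched as a shortest-path search in which links cost one click and redirects cost nothing. The graph below is a hypothetical structure chosen so that B and C both end up on level 5; the page names, the edge kinds `"link"` and `"redirect"`, and the function `user_levels` are assumptions for illustration only.

```python
from collections import deque

def user_levels(edges, start):
    """Minimum number of clicks per page; redirects are followed at no cost."""
    levels = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target, kind in edges.get(page, []):
            cost = 0 if kind == "redirect" else 1   # a redirect is invisible to the user
            new_level = levels[page] + cost
            if target not in levels or new_level < levels[target]:
                levels[target] = new_level
                # 0-1 BFS: zero-cost edges go to the front of the queue
                if cost == 0:
                    queue.appendleft(target)
                else:
                    queue.append(target)
    return levels

# Hypothetical chain that places page A on level 4
edges = {
    "start": [("p1", "link")],
    "p1": [("p2", "link")],
    "p2": [("p3", "link")],
    "p3": [("A", "link")],
    "A": [("B", "link")],        # A links to B
    "B": [("C", "redirect")],    # B redirects to C
}
levels = user_levels(edges, "start")
print(levels["B"], levels["C"])
```

In this sketch the click from A to B is the only step that raises the level, so the redirect target C inherits the level of B.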
From a bot's perspective, things are different. A bot can see and understand meta references such as a <link>, and takes them into consideration when building the Bot Graph. Which kinds of references the crawler follows can be configured when creating a crawl.
In contrast to the user view, the bot view is centered on requests, not clicks: the level is the number of requests necessary to reach a given resource. The requests must be legal, that is, the bot must not violate the rules given by
Given the above example of a redirect:
Page B would get a bot level of 5, and page C a bot level of 6.
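The request-based counting can be sketched in the same way, except that every followed reference, including a redirect, costs one request. The graph, the edge kinds, and the `followed_kinds` parameter (standing in for the crawl configuration) are assumptions for illustration and do not reflect the crawler's actual settings.

```python
from collections import deque

def bot_levels(edges, start, followed_kinds=("link", "redirect", "meta")):
    """Minimum number of requests per resource, following only the given reference kinds."""
    levels = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target, kind in edges.get(page, []):
            if kind not in followed_kinds:   # reference type not enabled for this crawl
                continue
            if target not in levels:         # every followed reference costs one request
                levels[target] = levels[page] + 1
                queue.append(target)
    return levels

# Hypothetical chain that places page A on level 4
edges = {
    "start": [("p1", "link")],
    "p1": [("p2", "link")],
    "p2": [("p3", "link")],
    "p3": [("A", "link")],
    "A": [("B", "link")],        # A links to B
    "B": [("C", "redirect")],    # B redirects to C, which needs one more request
}
levels = bot_levels(edges, "start")
print(levels["B"], levels["C"])
```

Here the redirect from B to C is counted like any other request, which is why C lands one level deeper than B.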
An edge case is the handling of pages that
In this case, the bot level is set to one level below the first resource that links to the page with follow. For example:
Page D would be assigned a bot level of 7, not 6, as one might expect.
This is because the shortest legal path to page D is via B and C, not via C directly.
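The effect of ignoring nofollow links can be sketched by restricting the search to follow edges. The link structure below is a hypothetical one, chosen so that D ends up on level 7 as in the text: A (level 4) links to B with follow and to C with nofollow, B links to C with follow, and C links to D. The function `legal_levels` and its `respect_nofollow` flag are assumptions for illustration.

```python
from collections import deque

def legal_levels(edges, start, respect_nofollow=True):
    """Shortest request path per page; nofollow edges are skipped when respected."""
    levels = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target, follow in edges.get(page, []):
            if respect_nofollow and not follow:
                continue                      # a nofollow link is not a legal step
            if target not in levels:
                levels[target] = levels[page] + 1
                queue.append(target)
    return levels

# Hypothetical structure: (target, follow) pairs per page
edges = {
    "start": [("p1", True)],
    "p1": [("p2", True)],
    "p2": [("p3", True)],
    "p3": [("A", True)],
    "A": [("B", True), ("C", False)],   # A links to C with nofollow
    "B": [("C", True)],                 # first follow link to C
    "C": [("D", True)],
}
legal = legal_levels(edges, "start")
naive = legal_levels(edges, "start", respect_nofollow=False)
print(legal["D"], naive["D"])
```

With the nofollow link ignored, the shortest legal path runs through B and C, giving D a level of 7. Counting the nofollow link as well would shorten the path and give 6.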
Regarding the distribution of resources over levels, the Audisto Crawler distinguishes between