The History API provides aggregated crawl data over time. The general path is:
This returns a list of objects, each containing a history's crawl and scheduling settings and mostly resembling a crawl object.
To access a single history, use:
History data can be accessed using:
It returns a history data list, which will be described below.
Building the Path
History Cluster References
The Audisto history is tracked per cluster. Since a cluster's ID may change from crawl to crawl, clusters are referenced using a fixed string. These reference IDs are user-defined; however, a few default clusters provide a fixed reference ID:
!root: The cluster "All Pages"
!external: The cluster containing all external pages
!internal: The cluster containing all internal pages
History data is grouped by aggregates, which mostly but not necessarily correspond to properties of crawled URLs. A full list can be found below.
The JSON format returned consists of the following members:
type: An extended enumeration value
items: An array of data points
The JSON format is not fixed and may be extended at any time as new features are added. It will, however, always remain backward compatible.
The type is an extended enumeration value that has the following properties:
id: The aggregate ID
value: Name of aggregate
description: A description or explanation for this aggregate
unit_value: The unit of the value, e.g. "Level" for all level based aggregates
unit_aggregated: The unit of the aggregated value, e.g. "URLs" for counters
member_name: The name of a single data point value. This is mostly the same as the name or a singular version of it.
The extended properties are only returned for deep responses. If deep=0, only a standard enumeration is returned.
Data points have the following properties:
id: A unique ID for this data point
value: The value of this data point. The format depends on the aggregate type: it can be an integer, a floating point value, or an enumeration. For the HTTP status, for example, the value would be 200, 301, 404 or similar.
aggregated: The aggregated value, usually a counter of affected URLs. Numeric, but not necessarily an integer. For averages, for example, these are floating point values.
If a deep response is requested, the JSON is extended by:
execution_time: Date and time of the originating crawl
id_crawl: ID of originating crawl
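As a rough illustration of the structure described above, the following Python sketch parses one history data entry into typed objects. Only the member names (`type`, `items`, `id`, `value`, `aggregated`, `execution_time`, `id_crawl`) are taken from this description. The sample payload, its values, the helper names, and the assumption that the deep members are attached to each data point are made up for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class DataPoint:
    id: Any
    value: Any            # int, float, or enumeration, depending on the aggregate
    aggregated: float     # numeric, not necessarily an integer
    execution_time: Optional[str] = None  # assumed to be present only in deep responses
    id_crawl: Optional[int] = None        # assumed to be present only in deep responses


def parse_history_entry(entry: Dict[str, Any]) -> Tuple[Dict[str, Any], List[DataPoint]]:
    """Split one history data entry into its type and its data points.

    Unknown members are ignored, because the format may be extended.
    """
    points = [
        DataPoint(
            id=item["id"],
            value=item["value"],
            aggregated=item["aggregated"],
            execution_time=item.get("execution_time"),
            id_crawl=item.get("id_crawl"),
        )
        for item in entry.get("items", [])
    ]
    return entry["type"], points


# Hypothetical sample payload; all values are invented for illustration.
SAMPLE = json.loads("""
{
  "type": {"id": 2, "value": "HTTP Status"},
  "items": [
    {"id": 1, "value": 200, "aggregated": 950},
    {"id": 2, "value": 404, "aggregated": 50, "future_member": true}
  ]
}
""")

agg_type, points = parse_history_entry(SAMPLE)
total = sum(p.aggregated for p in points)  # sum of all counters in this entry
```

Reading optional members with `.get()` and ignoring unknown keys keeps such a client working when the format is extended in a backward compatible way.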
History data can be filtered.
The following filters are supported:
value: Filter by value. Values are generally integers, but their meaning may differ (e.g. enumeration ID), depending on the aggregate.
range: A range. See below.
History data can be filtered by range. A range can be
- offset- or date-based
- inclusive or exclusive
The general syntax of a range is
Boundaries mark inclusive (start or end of range is part of selection) or exclusive ranges. Allowed boundaries are:
(: Start of exclusive range
): End of exclusive range
[: Start of inclusive range
]: End of inclusive range
Inclusive boundaries are the default and can be omitted.
For range start and end, integers and dates are supported.
Integers are treated as offsets, counting from the latest (that is, most recent) data source. Offsets are zero-based.
Dates must be provided in ISO 8601 format.
Calendar dates are supported in the format YYYY-MM-DD. The shortened format YYYYMMDD is not supported.
Time and timezone are optional. A time may only be specified down to the level of seconds; milliseconds are not supported. Hours, minutes and seconds must be separated by colons, as in HH:MM:SS. The shortened format HHMMSS is not supported.
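The date rules above can be checked on the client before a request is sent. The following sketch does this with a regular expression. It assumes the ISO 8601 `T` separator between date and time, a timezone written as `Z` or `±HH:MM`, and a timezone that only appears together with a time; none of these details are spelled out here, and the function name is hypothetical.

```python
import re
from datetime import date

# Shape check only: YYYY-MM-DD, optionally followed by a time and a timezone.
_DATE_RE = re.compile(
    r"\d{4}-\d{2}-\d{2}"              # calendar date with hyphens
    r"(?:T\d{2}:\d{2}(?::\d{2})?"     # optional time, HH:MM or HH:MM:SS (assumed separator "T")
    r"(?:Z|[+-]\d{2}:\d{2})?)?"       # optional timezone (assumed notation)
)


def is_supported_date(text: str) -> bool:
    """Return True if text matches the date shape described above."""
    if not _DATE_RE.fullmatch(text):
        return False
    try:
        date.fromisoformat(text[:10])  # rejects impossible dates such as 2019-02-30
    except ValueError:
        return False
    return True
```

The check rejects the shortened forms and fractional seconds, but it does not verify that the hour, minute and second values are in range.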
Return counters for all HTTP statuses over time:
Return number of crawled URLs over time:
Return aggregated counters for all redirects:
Get data for last ten crawls:
Get data for February 2019:
Get data from 4th February 2019, 8:00 to 18th February, 9:15:
Get last ten crawls from February 2019:
Note that an inclusive offset of 9 is the same as an exclusive offset of 10.
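This equivalence follows from offsets being zero-based integers. The sketch below models it with a hypothetical helper that expands a pair of offset bounds into the offsets they select. It only illustrates the boundary semantics and does not build or parse the API's range syntax.

```python
from typing import List


def selected_offsets(start: int, end: int, *,
                     start_inclusive: bool = True,
                     end_inclusive: bool = True) -> List[int]:
    """Return the zero-based offsets covered by the bounds (0 = latest crawl)."""
    first = start if start_inclusive else start + 1
    last = end if end_inclusive else end - 1
    return list(range(first, last + 1))


# Inclusive end offset 9 and exclusive end offset 10 select the same ten crawls.
inclusive = selected_offsets(0, 9)
exclusive = selected_offsets(0, 10, end_inclusive=False)
```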
List of Aggregate IDs
The following aggregate IDs are currently in use:
2: HTTP Status
3: Response Time
4: Discovered URLs per Level
5: Crawled URLs per Level
8: Content Size Uncompressed
9: Content Size Compressed
10: Duplicate Content Groups Counter
11: Duplicate Content Pages Counter
15: Content Type
16: PageRank per Level
17: CheiRank per Level
18: Total Ranks
20: Discovered URLs per User-Level
21: Crawled URLs per User-Level
22: Level Relation
24: Check Passed URLs
25: Check Failed URLs
26: Isolation Levels
28: Check Results
29: Requirement Results
30: Check Overall Results
31: PageRank per Status
32: CheiRank per Status
33: PageRank per HTTP Status
34: CheiRank per HTTP Status
35: PageRank per Indexability
36: CheiRank per Indexability
37: PageRank per Isolation Level
38: CheiRank per Isolation Level
39: PageRank per User-Level
40: CheiRank per User-Level
41: PageRank per Host
42: CheiRank per Host
43: PageRank per Internal Host
44: CheiRank per Internal Host
45: PageRank per External Host
46: CheiRank per External Host
47: URL Rewrites
48: PageRank crawled URLs per Level
49: CheiRank crawled URLs per Level
50: PageRank crawled URLs per User-Level
51: CheiRank crawled URLs per User-Level
52: Hreflang Groups
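Client code may be easier to read when it refers to aggregates by name rather than by number. The following sketch maps a hand-picked subset of the IDs listed above to named constants. The class and function names are hypothetical, and the fallback for unknown IDs is an assumption that allows for IDs added later.

```python
from enum import IntEnum


class Aggregate(IntEnum):
    """A hand-picked subset of the aggregate IDs listed above."""
    HTTP_STATUS = 2
    RESPONSE_TIME = 3
    CRAWLED_URLS_PER_LEVEL = 5
    CONTENT_TYPE = 15
    HREFLANG_GROUPS = 52


def aggregate_name(aggregate_id: int) -> str:
    """Return the constant name for an ID, or a placeholder for IDs not in the subset."""
    try:
        return Aggregate(aggregate_id).name
    except ValueError:
        return f"UNKNOWN_{aggregate_id}"
```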