Domains
Crawl one or several domains, including or excluding subdomains
The Main Domain
Each project, monitoring, and crawl is bound to exactly one main domain. This domain cannot be changed. The starting point of a crawl must be within the main domain or one of its subdomains, but can otherwise be chosen freely.
For example, a crawl with the main domain example.com may have the following starting points:
https://example.com/categories/ (Correct)
http://www.example.com/ (Correct)
https://categories.www.example.com/overview.html (Correct)
The following starting points are not allowed, since they belong to a different domain:
https://example.de/ (Wrong)
http://www.example.org/ (Wrong)
If you use a subdomain as the main domain, such as www.example.com instead of example.com, the starting point must be within that subdomain or one of its subdomains:
https://example.com/categories/ (Wrong)
http://www.example.com/ (Correct)
https://categories.www.example.com/overview.html (Correct)
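The starting-point rule above can be sketched as a simple host check. This is only an illustration, assuming the rule is a plain match on the hostname; the function names is_within and is_valid_starting_point are hypothetical and not part of the product.

```python
from urllib.parse import urlsplit


def is_within(host: str, domain: str) -> bool:
    """Return True if host equals domain or is a subdomain of it."""
    host = host.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    # The leading dot prevents "badexample.com" from matching "example.com".
    return host == domain or host.endswith("." + domain)


def is_valid_starting_point(main_domain: str, start_url: str) -> bool:
    """Assumed rule: the URL's host must lie within the main domain."""
    host = urlsplit(start_url).hostname
    return host is not None and is_within(host, main_domain)
```

With a main domain of www.example.com, is_valid_starting_point rejects https://example.com/categories/ and accepts https://categories.www.example.com/overview.html, matching the lists above.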
Include or Exclude Subdomains
For each crawl, you can decide whether subdomains of the main domain should be included.
For example, given a main domain of example.com and a starting point of http://www.example.com, the following domains would be included in a crawl if including all subdomains is enabled:
example.com (Included)
www.example.com (Included)
help.example.com (Included)
categories.www.example.com (Included)
If, however, including all subdomains is disabled, only the domain of the starting point would be included:
example.com (Excluded)
www.example.com (Included)
help.example.com (Excluded)
categories.www.example.com (Excluded)
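The two cases above can be summarized in a small scope check. This is a sketch inferred from the examples only, not the crawler's actual implementation; the function name in_crawl_scope and its parameters are hypothetical.

```python
def in_crawl_scope(host: str, main_domain: str, start_host: str,
                   include_subdomains: bool) -> bool:
    """Decide whether a host belongs to the crawl (assumed rule).

    With subdomains included, the main domain and all of its subdomains
    are in scope. Otherwise only the exact host of the starting point is.
    """
    host = host.lower()
    main_domain = main_domain.lower()
    if include_subdomains:
        return host == main_domain or host.endswith("." + main_domain)
    return host == start_host.lower()
```

For a main domain of example.com and a starting host of www.example.com, help.example.com is in scope only when include_subdomains is True.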
Additional Domains
Additional domains are available for projects only.
Sometimes it may be necessary to crawl several domains at once. This is possible using additional domains.
Think, for example, of a site that has been localized using different top-level domains:
example.com
example.de
example.co.uk
example.fr
Or think of a site that has outsourced some functionality to satellite domains:
example.com
example-support.com
example-forums.com
Additional domains must be added in the project settings first.
This feature is only available if the main domain is verified.
To edit a project, choose Edit Settings on your project dashboard. There you can add and remove domains in the section titled Domains.
Excluded domains are on the left, included domains on the right. You can move a domain from one list to the other by clicking on it.
Only verified domains can be added as additional domains.
Every included additional domain and all of its subdomains will be included in the crawl. Given, for example, an additional domain of www.example.fr, the following domains would be treated as shown:
www.example.fr (Included)
categories.www.example.fr (Included)
help.example.fr (Excluded)
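The additional-domain rule can be sketched the same way: a host is crawled if it equals an additional domain or is one of its subdomains. This is an illustration only; the function name matches_additional_domain is hypothetical.

```python
from typing import Iterable


def matches_additional_domain(host: str, additional_domains: Iterable[str]) -> bool:
    """Return True if host equals any additional domain or is a subdomain of one."""
    host = host.lower()
    for domain in additional_domains:
        domain = domain.lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

Note that help.example.fr is not matched by an additional domain of www.example.fr, because it is a sibling of that domain rather than a subdomain of it.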
Once additional domains have been configured in the project settings, they can be enabled or disabled per crawl within the advanced settings section.
Monitoring crawls always have additional domains enabled if any are configured for the project.