SiteCrawler

Our platform originally supported only "on-platform" advertising: a dealership had to be using one of our websites to get our automated custom ad generation. The next phase was to support "off-platform" dealerships. Doing so meant gathering information about a dealership's own website, such as the URL of its inventory pages or of a specific vehicle's details page.

Since we don't own their site, we built a lightweight crawler that takes as input a description of the page we want to find, such as "new Honda Civic" or "service and parts", and determines the best-matching URL. It's a weight-based system: certain keywords are worth more than others, a match counts differently depending on whether it appears in the URL or in the link text, and some parts of a URL are considered more important than others. We also know many common URL patterns for dealership websites, so we can program those into the system, swap the variable parameters for a specific vehicle's information, and then test that the resulting page is valid before serving the ad.
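
The scoring and pattern-filling ideas described above can be sketched roughly as follows. This is a minimal illustration, not the actual SiteCrawler code: the weights, the function names (`score_link`, `best_match`, `fill_pattern`), the `dealer.example` URLs, and the choice to favor the last path segment are all assumptions made for the example.

```python
import re
from urllib.parse import urlparse

# Hypothetical weights; the real values are not given in the text.
LINK_TEXT_WEIGHT = 1.0    # keyword appears in the link text
URL_PATH_WEIGHT = 2.0     # keyword appears somewhere in the URL path
LAST_SEGMENT_BONUS = 1.5  # extra multiplier if it is in the final path segment


def tokens(text):
    """Split text into lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score_link(url, link_text, keyword_weights):
    """Score one candidate link against a dict of keyword -> weight."""
    segments = [tokens(s) for s in urlparse(url).path.split("/") if s]
    text_tokens = tokens(link_text)
    score = 0.0
    for keyword, weight in keyword_weights.items():
        if keyword in text_tokens:
            score += weight * LINK_TEXT_WEIGHT
        if segments and keyword in segments[-1]:
            score += weight * URL_PATH_WEIGHT * LAST_SEGMENT_BONUS
        elif any(keyword in seg for seg in segments[:-1]):
            score += weight * URL_PATH_WEIGHT
    return score


def best_match(links, keyword_weights):
    """Return the URL of the highest-scoring (url, link_text) pair."""
    return max(links, key=lambda l: score_link(l[0], l[1], keyword_weights))[0]


def fill_pattern(pattern, vehicle):
    """Substitute vehicle details into a known URL pattern as lowercase slugs."""
    slugs = {k: str(v).lower().replace(" ", "-") for k, v in vehicle.items()}
    return pattern.format(**slugs)


# Example inputs (made up for illustration).
weights = {"new": 1.0, "honda": 2.0, "civic": 3.0}
links = [
    ("https://dealer.example/new-inventory/honda/civic", "New Honda Civic"),
    ("https://dealer.example/service", "Service and Parts"),
    ("https://dealer.example/used/honda", "Used Honda"),
]
best = best_match(links, weights)
candidate = fill_pattern(
    "https://dealer.example/new-inventory/{make}/{model}",
    {"make": "Honda", "model": "Civic"},
)
```

The sketch leaves out the final validity check. In practice that would be a separate step, such as requesting the candidate URL and confirming the response looks like the expected page, before the ad is served.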