The generated API clients are a work in progress. Stable clients are available in the Algolia documentation.

Crawler API (1.0.0)

API to configure and manage the Algolia Crawler.

List available Crawlers.

Authorizations:
BasicAuth
query Parameters
itemsPerPage
integer [ 1 .. 100 ]
Default: 20

Change the number of items per page.

page
integer [ 1 .. 100 ]
Default: 1

Change the page number.

name
string
Example: name=MyCrawlerName

Filter by crawler name.

appId
string
Example: appId=XXXXXXX123

Filter by Application ID.

Responses

Response samples

Content type
application/json
{
  "items": [
  ],
  "itemsPerPage": 20,
  "page": 1,
  "total": 100
}
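
The query parameters above can be assembled and range-checked on the client before a request is sent. The following is a minimal sketch, not an official client: the base URL and the `/crawlers` path are placeholders (this section does not state them), the function names are hypothetical, and the Basic credentials are whatever username and password your account uses.

```python
import base64
from typing import Optional
from urllib.parse import urlencode

# Placeholder: this section does not state the base URL or the path.
BASE_URL = "https://crawler.example.com/api/1"


def basic_auth_header(username: str, password: str) -> dict:
    """Build a standard HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def list_crawlers_url(items_per_page: int = 20, page: int = 1,
                      name: Optional[str] = None,
                      app_id: Optional[str] = None) -> str:
    """Return the list URL, enforcing the documented parameter ranges."""
    if not 1 <= items_per_page <= 100:
        raise ValueError("itemsPerPage must be between 1 and 100")
    if not 1 <= page <= 100:
        raise ValueError("page must be between 1 and 100")
    params = {"itemsPerPage": items_per_page, "page": page}
    # Optional filters are only sent when the caller provides them.
    if name is not None:
        params["name"] = name
    if app_id is not None:
        params["appId"] = app_id
    return f"{BASE_URL}/crawlers?{urlencode(params)}"
```

Calling `list_crawlers_url(name="MyCrawlerName")` yields a URL with `itemsPerPage=20&page=1&name=MyCrawlerName` as its query string. Pass the result of `basic_auth_header(...)` as request headers in the HTTP client of your choice.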

Create a new Crawler with the given config.

Authorizations:
BasicAuth
Request Body schema: application/json
name
required
string (CrawlerName) <= 64 characters

The name of the Crawler.

config
required
object (Configuration)

A Crawler configuration object. See the Crawler documentation for more details.

Responses

Request samples

Content type
application/json
{
  "name": "My Crawler",
  "config": {
  }
}

Response samples

Content type
application/json
{
  "id": "string"
}
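
The request body described above can be validated locally before it is sent. This is a minimal sketch under stated assumptions: `build_create_body` is a hypothetical helper, and the contents of `config` are not described in this section, so the helper only checks that it is a JSON object.

```python
import json

MAX_NAME_LENGTH = 64  # documented limit for CrawlerName


def build_create_body(name: str, config: dict) -> str:
    """Serialize the body for creating a crawler: both fields are required."""
    if not name:
        raise ValueError("name is required")
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"name must be at most {MAX_NAME_LENGTH} characters")
    if not isinstance(config, dict):
        raise TypeError("config must be a JSON object (dict)")
    return json.dumps({"name": name, "config": config})
```

The returned string is meant to be sent with a `Content-Type: application/json` header.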

Get information about the specified Crawler and its configuration.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

query Parameters
withConfig
boolean

Whether or not the configuration should be returned in the response (in the 'config' field).

Responses

Response samples

Content type
application/json
{
  "name": "My Crawler",
  "createdAt": "2019-05-10T07:58:41.146Z",
  "updatedAt": "2019-05-10T08:16:47.920Z",
  "running": true,
  "reindexing": true,
  "blocked": false,
  "blockingError": "Error: Failed to fetch external data for source 'testCSV': 404",
  "blockingTaskId": "string",
  "lastReindexStartedAt": "2019-05-10T08:16:47.920Z",
  "lastReindexEndedAt": null,
  "config": {
  }
}

Update parts of the Crawler, either its name, its config, or both.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

Request Body schema: application/json
name
string (CrawlerName) <= 64 characters

The name of the Crawler.

config
object (Configuration)

A Crawler configuration object. See the Crawler documentation for more details.

Responses

Request samples

Content type
application/json
{
  "name": "My Crawler",
  "config": {
  }
}

Response samples

Content type
application/json
{
  "taskId": "e0f6db8a-24f5-4092-83a4-1b2c6cb6d809"
}

Update parts of the Crawler configuration.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

Request Body schema: application/json
object

A partial config object that will be injected into the current one.

Responses

Request samples

Content type
application/json
{}

Response samples

Content type
application/json
{
  "taskId": "e0f6db8a-24f5-4092-83a4-1b2c6cb6d809"
}

Request the specified Crawler to run.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

Responses

Response samples

Content type
application/json
{
  "taskId": "e0f6db8a-24f5-4092-83a4-1b2c6cb6d809"
}

Request the specified Crawler to pause itself.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

Responses

Response samples

Content type
application/json
{
  "taskId": "e0f6db8a-24f5-4092-83a4-1b2c6cb6d809"
}

Request the specified Crawler to start a reindex.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

Responses

Response samples

Content type
application/json
{
  "taskId": "e0f6db8a-24f5-4092-83a4-1b2c6cb6d809"
}

Get a summary of the current status of crawled URLs for the specified Crawler.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

Responses

Response samples

Content type
application/json
{
  "count": 0,
  "data": [
  ]
}

Test a URL against the crawler's config.

Test a URL against the given Crawler's config and see what will be processed. You can also override parts of the configuration to try out changes before updating the saved configuration.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

Request Body schema: application/json
url
required
string

The URL to test.

config
object

A partial configuration object that will be merged with the saved configuration. This lets you test changes to a configuration before saving it. Note that this is not a deep merge: every top-level field you pass replaces the saved field of the same name in its entirety.

Responses

Request samples

Content type
application/json
{}

Response samples

Content type
application/json
{}
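
The difference between the top-level merge described for the `config` override and a deep merge is easy to miss, so here is a minimal sketch of the documented behavior. It only models the merge on the client side; `apply_override` and the field names `limit` and `settings` are hypothetical and are not taken from the real configuration schema.

```python
def apply_override(saved: dict, override: dict) -> dict:
    """Top-level merge: each key in `override` replaces the saved value whole."""
    return {**saved, **override}


# Hypothetical field names, for illustration only.
saved = {"limit": 5, "settings": {"a": 1, "b": 2}}
override = {"settings": {"a": 9}}

merged = apply_override(saved, override)
# "settings" is replaced entirely, so the nested key "b" is gone;
# "limit" is untouched because the override did not mention it.
print(merged)
```

In practice this means that when you override a nested object, you should pass the complete object, not just the keys you want to change.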

Get the status of a specific task.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

tid
required
string

The Id of the targeted Task.

Responses

Response samples

Content type
application/json
{
  "pending": true
}
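
Several endpoints in this reference return a `taskId`, and the task status response exposes a `pending` flag, so a client will typically poll until the flag clears. The sketch below shows one way to do that. It assumes a task is finished once `pending` is no longer true, which this section does not spell out; `wait_for_task` and `fetch_status` are hypothetical names, and `fetch_status` stands in for whatever function performs the HTTP call and returns the decoded JSON.

```python
import time
from typing import Callable


def wait_for_task(fetch_status: Callable[[], dict],
                  interval: float = 1.0,
                  max_attempts: int = 30,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until the task is no longer pending.

    Returns True if the task finished within `max_attempts` polls,
    False otherwise.
    """
    for attempt in range(max_attempts):
        status = fetch_status()
        # Assumption: a missing or false "pending" flag means the task is done.
        if not status.get("pending", False):
            return True
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval)
    return False
```

The `sleep` parameter is injectable so that the loop can be exercised in tests without real delays.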

Cancel a blocking action on your Crawler.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

tid
required
string

The Id of the targeted Task.

Responses

Response samples

Content type
application/json
{
  "error": {
  }
}

List crawler versions.

List crawler config versions.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

query Parameters
itemsPerPage
integer [ 1 .. 100 ]
Default: 20

Change the number of versions per page.

page
integer [ 1 .. 5000 ]
Default: 1

Change the page number.

Responses

Response samples

Content type
application/json
{
  "items": [
  ],
  "itemsPerPage": 20,
  "page": 1,
  "total": 100
}
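
The paginated responses in this reference share the `items`, `itemsPerPage`, `page` and `total` fields, so one loop can walk every page. The following is a minimal sketch that assumes `total` is the total number of items across all pages (not the number of pages), which this section does not confirm. `iter_items` and `fetch_page` are hypothetical names; `fetch_page(page, items_per_page)` stands in for the function that performs the request and returns the decoded JSON.

```python
import math
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[[int, int], dict],
               items_per_page: int = 20,
               max_page: int = 5000) -> Iterator:
    """Yield every item from a paginated listing, one page at a time."""
    page = 1
    while True:
        body = fetch_page(page, items_per_page)
        yield from body["items"]
        # Assumption: "total" counts items, so this is the last page number.
        last_page = math.ceil(body["total"] / body["itemsPerPage"])
        # Stop at the last page, at the documented page limit,
        # or if the server returns an empty page.
        if page >= min(last_page, max_page) or not body["items"]:
            return
        page += 1
```

The default `max_page` of 5000 matches the range documented for the versions listing; the crawler and domain listings document a limit of 100 instead.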

Get a specific version of the configuration of a crawler.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

version
required
integer

The version of the targeted Crawler revision.

Responses

Response samples

Content type
application/json
{
  "version": 1,
  "config": {
  },
  "createdAt": "string",
  "authorId": "string"
}

Immediately crawl some URLs and update the live index.

The passed URLs will be crawled immediately, and the generated records will be pushed to the live index if no reindex is currently running. If a reindex is running, the records will be pushed to the temporary index.

Authorizations:
BasicAuth
path Parameters
id
required
string

The Id of the targeted Crawler.

Request Body schema: application/json
urls
required
Array of strings
save
boolean

If true, the given URLs will be added to the extraUrls field of the config (if not already in startUrls or sitemaps). If false, the URLs will not be saved in the config. If unspecified, the URLs will be saved to the extraUrls field of the config only if they haven't been indexed during the last reindex.

Responses

Request samples

Content type
application/json

Response samples

Content type
application/json
{
  "taskId": "e0f6db8a-24f5-4092-83a4-1b2c6cb6d809"
}
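
Because `save` behaves differently when it is true, false, or left unspecified, a client should omit the field entirely in the third case and not send `null` or `false`. The sketch below illustrates this; `build_crawl_urls_body` is a hypothetical helper name, and the example URL is a placeholder.

```python
import json
from typing import Optional, Sequence


def build_crawl_urls_body(urls: Sequence[str],
                          save: Optional[bool] = None) -> str:
    """Serialize the body for an immediate crawl of the given URLs."""
    if not urls:
        raise ValueError("urls is required and must not be empty")
    body = {"urls": list(urls)}
    # Leave "save" out when unspecified so the documented default applies.
    if save is not None:
        body["save"] = save
    return json.dumps(body)


print(build_crawl_urls_body(["https://www.example.com/page"]))
print(build_crawl_urls_body(["https://www.example.com/page"], save=False))
```

The first call produces a body with only `urls`; the second adds `"save": false`, which tells the API not to store the URLs in the config.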

List registered Domains.

Authorizations:
BasicAuth
query Parameters
itemsPerPage
integer [ 1 .. 100 ]
Default: 20

Change the number of items per page.

page
integer [ 1 .. 100 ]
Default: 1

Change the page number.

appId
string
Example: appId=XXXXXXX123

Filter by Application ID.

Responses

Response samples

Content type
application/json
{
  "items": [
  ],
  "itemsPerPage": 20,
  "page": 1,
  "total": 100
}