API Endpoint
The Retrieve API endpoint allows you to transform web pages into APIs by scraping and manipulating data. This documentation provides a comprehensive guide on how to use the Retrieve API effectively.
If you need any support, have feedback, or just want to hang out and chat with the development team and other developers in our community, join us on Discord.
Endpoint URL
To access the Retrieve API, use the following URL:
HTTP Methods
The Retrieve API supports the following HTTP methods:
- POST (form-data): Use this method for form submissions.
- POST (raw-JSON): Use this method for sending structured JSON data.
Choose the method that best fits your use case.
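The two body encodings can be sketched with Python's standard library as follows. This is a minimal illustration, not the official client: `ENDPOINT_URL` is a placeholder (substitute the real endpoint URL), the helper names are hypothetical, and the form variant assumes that "form-data" means a `multipart/form-data` body with simple field names.

```python
import json
import urllib.request
import uuid

# Placeholder only: replace with the real Retrieve API endpoint URL.
ENDPOINT_URL = "https://api.example.invalid/retrieve"


def build_json_request(params: dict) -> urllib.request.Request:
    """Build a POST request whose body is raw JSON."""
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def encode_multipart(fields: dict) -> tuple[bytes, str]:
    """Encode text fields as multipart/form-data; returns (body, content type)."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(str(value))  # form fields are always sent as text
    lines.append(f"--{boundary}--")
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_form_request(fields: dict) -> urllib.request.Request:
    """Build a POST request whose body is multipart form data."""
    body, content_type = encode_multipart(fields)
    return urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Either request object can then be passed to `urllib.request.urlopen`. The JSON variant keeps Boolean and Integer parameters typed, whereas the form variant sends every value as text.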
Parameters
The Retrieve API accepts the parameters listed below. The table gives each parameter's name, whether it is required, its description, its type, and an example value.
Parameter | Required | Description | Type | Example |
---|---|---|---|---|
webpage_url | Yes | The URL of the web page to be scraped. Also supports Google SERP. | String | https://example.com/ or https://www.google.com/search?q=search+query+here |
api_method_name | Yes | A user-defined name for the API action. This should be descriptive to guide the AI web scraper in interpreting the task. | String | getUserData |
api_response_structure | Yes | The expected structure of the API's response, defined in JSON format. Specify explicit requirements or allow the AI web scraper to infer details. | String | {"response": {"name": "<the name of the user>", "email": "<the email address of the user>"}} |
api_key | Yes | Your InstantAPI.ai API key. | String | Get your API key |
api_parameters | No | Additional user-defined parameters for the API method, defined in JSON format. These provide additional context, but do not override the API method name and response structure. | String | {"user_id": "12345"} |
country_code | No | Specifies the country code for the premium web proxy, or sets the Google SERP country. If not provided, defaults to the country of the web page URL, with ‘us’ as the fallback. | String | us |
verbose | No | If set to true, the response will include the full HTML content of the scraped webpage. | Boolean | true |
wait_for_xpath | No | A valid XPath within the web page; the scraper waits for the matching elements to load. Optional, but it can help with certain web pages, such as those with slow-loading scripts or complex JavaScript web apps. | String | .some-class-name or #some-element-id |
enable_javascript | No | If set to false, JavaScript rendering will be disabled. | Boolean | true |
link_extract | No | If set to true, the endpoint enables link extraction mode. See Links by Condition for usage example. | Boolean | true |
cache_ttl | No | The cache time-to-live (in seconds). Minimum 60, maximum 86400. | Integer | 86400 |
serp_limit | No | The number of search engine results. Minimum 1, maximum 10. Defaults to 5. | Integer | 5 |
serp_site | No | The domain name to restrict search engine results to. Defaults to none (no restriction). | String | example.com |
serp_page_num | No | The page number of the search engine results. | Integer | 1 |
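As a rough sketch of how the table translates into a request payload, the helper below assembles the four required parameters and checks the documented numeric ranges before anything is sent. The function name `build_payload` is hypothetical, and serializing `api_response_structure` and `api_parameters` with `json.dumps` is an assumption based on the table listing both as String type.

```python
import json


def build_payload(webpage_url, api_method_name, api_response_structure,
                  api_key, **optional):
    """Assemble a parameter dict, validating the ranges given in the table."""
    required = {
        "webpage_url": webpage_url,
        "api_method_name": api_method_name,
        "api_response_structure": api_response_structure,
        "api_key": api_key,
    }
    missing = [name for name, value in required.items() if not value]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")

    payload = dict(required)
    # Assumption: the structure is sent as JSON text because its type is String.
    payload["api_response_structure"] = json.dumps(api_response_structure)

    if "api_parameters" in optional:
        optional["api_parameters"] = json.dumps(optional["api_parameters"])
    if "cache_ttl" in optional and not 60 <= optional["cache_ttl"] <= 86400:
        raise ValueError("cache_ttl must be between 60 and 86400 seconds")
    if "serp_limit" in optional and not 1 <= optional["serp_limit"] <= 10:
        raise ValueError("serp_limit must be between 1 and 10")

    payload.update(optional)
    return payload


payload = build_payload(
    "https://example.com/",
    "getUserData",
    {"response": {"name": "<the name of the user>",
                  "email": "<the email address of the user>"}},
    "YOUR_API_KEY",  # placeholder, not a real key
    api_parameters={"user_id": "12345"},
    cache_ttl=86400,
)
```

Validating locally avoids spending a request on a payload the API would reject anyway.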
API Response
The API will return a JSON object based on the specified api_response_structure. If the verbose parameter is set to true, the response will also include the full HTML content under the key verbose_full_html.
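When verbose is enabled, it can be convenient to separate the structured data from the raw HTML before further processing. The sketch below assumes that verbose_full_html sits at the top level of the decoded JSON object. The helper name `split_verbose` and the sample dictionary are illustrative only, not actual API output.

```python
from typing import Optional, Tuple


def split_verbose(response: dict) -> Tuple[dict, Optional[str]]:
    """Return (structured data, full HTML or None) without mutating the input."""
    data = dict(response)  # shallow copy so the caller's dict is untouched
    html = data.pop("verbose_full_html", None)
    return data, html


# Invented sample shaped like a decoded response, for illustration only.
sample = {
    "response": {"name": "Jane Doe", "email": "jane@example.com"},
    "verbose_full_html": "<html><body>...</body></html>",
}
data, html = split_verbose(sample)
```

After the call, `data` holds only the fields defined by api_response_structure, and `html` is `None` whenever verbose was not requested.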
Example Response
If any required parameters are missing or an error occurs, the API will return a JSON object with an error message.
Example Error Response
Best Practices
Descriptive Naming
Use clear and descriptive names for api_method_name to guide the AI effectively. For example, prefer getUserData over getData.
Detailed Response Structure
Clearly define the api_response_structure to ensure the AI understands your needs. Specificity leads to more accurate responses.
Contextual Parameters
Use api_parameters to provide additional context, helping the AI generate more accurate outputs.
Optimization Tips
Minimize Token Usage
The AI model’s latency is influenced by the length of the output. Be concise in your requests to improve response time.
Use Premium Proxies Judiciously
The service defaults to the quickest scraping method. Country-specific premium web proxies add latency, so use them only when necessary.
Leveraging AI Capabilities
Creative Output Requirements
Be creative with your output requirements. The AI can handle various tasks, including summarization and sentiment analysis.
Inference and Analysis
The AI can infer information and perform analytical tasks. Specify outputs that require deeper understanding or analysis.
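To make this concrete, here is one possible pairing of api_method_name and api_response_structure that asks for a summary and a sentiment label instead of literal page content. The method name and field names are made up for illustration; the angle-bracket placeholders follow the style of the example in the parameter table.

```python
import json

# Hypothetical method name describing the analytical task.
api_method_name = "summarizeArticleWithSentiment"

# Each placeholder tells the AI what to produce for that field.
api_response_structure = {
    "response": {
        "title": "<the title of the article>",
        "summary": "<a two-sentence summary of the article>",
        "sentiment": "<the overall sentiment: positive, neutral, or negative>",
    }
}

# The parameter table lists this parameter as a String, so serialize it.
structure_string = json.dumps(api_response_structure)
```

Short, specific placeholders like these also keep the output small, which is in line with the advice on minimizing token usage.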
Limitations
Error Handling
If required parameters are missing or an error occurs, the API will return an error message. Because the service cycles in and out of premium web proxies, a request that fails once may succeed on a later attempt, so it is recommended to retry up to 5 times before treating the request as failed.
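The retry recommendation could be implemented along these lines. This is a sketch under stated assumptions: `send` is any caller-supplied function that performs one request and returns the decoded JSON, and an error is detected by a top-level "error" key, which is a guess because the exact shape of the error object is not specified here.

```python
import time
from typing import Callable


def call_with_retries(send: Callable[[], dict],
                      max_attempts: int = 5,
                      delay_seconds: float = 1.0) -> dict:
    """Call send() until it succeeds or max_attempts is reached."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = send()
        except OSError as exc:  # network failures, including urllib's URLError
            last_error = exc
        else:
            # Assumption: error responses carry a top-level "error" key.
            if "error" not in result:
                return result
            last_error = RuntimeError(str(result["error"]))
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before the next attempt
    raise RuntimeError(
        f"request failed after {max_attempts} attempts"
    ) from last_error
```

A missing required parameter will fail on every attempt, so it is worth validating the payload before entering the retry loop.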
AI Interpretations
While the AI is powerful, it may not always interpret requests perfectly. Providing clear, detailed instructions will yield the best results.