Basic API information

Minimal API link: https://scraping.services/api?user=_USER@ACCOUNT_&credentials=_CREDENTIALS_
API login: _USER@ACCOUNT_
API key: _CREDENTIALS_
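
As a minimal sketch, the link above could be assembled in Python with the standard library. The helper name build_api_url is an assumption made for illustration; only the base URL and the "user" and "credentials" parameters come from this page.

```python
from urllib.parse import urlencode

API_BASE = "https://scraping.services/api"


def build_api_url(user, credentials, **params):
    """Return an API URL carrying the login, the key and any extra parameters."""
    query = {"user": user, "credentials": credentials}
    query.update(params)
    # urlencode percent-encodes reserved characters, so "@" becomes "%40".
    return API_BASE + "?" + urlencode(query)


minimal_link = build_api_url("_USER@ACCOUNT_", "_CREDENTIALS_")
```

Extra query parameters can be passed as keyword arguments and are appended in the order given.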

Scrape Google Search

Every Google Search related API call requires the "&job_type=scrape_google_search" parameter.

The API can perform the same tasks as your Google Search dashboard.

How to scrape:
  1. Create a Job
  2. Modify the job by adding keywords, up to 100,000
  3. Start the Job
  4. Download and store results (result details are available for 14 days within our database)
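
The four steps above map onto the API tasks create_job, modify_job, start_job and get_results. The sketch below only shows the order of the calls: the `call` argument is a hypothetical transport function supplied by the caller, not part of the API, and the usage example replaces it with a recorder so nothing is sent over the network.

```python
def scrape_workflow(call, jobname, keywords):
    """Issue the four steps in order through a caller-supplied `call` function."""
    call("create_job", jobname=jobname)                      # 1. create a Job
    call("modify_job", jobname=jobname, keywords=keywords)   # 2. add keywords
    call("start_job", jobname=jobname)                       # 3. start the Job
    # 4. download results; real code would first poll get_job until finished
    return call("get_results", jobname=jobname, page=1)


# Usage with a stand-in transport that only records the tasks it receives.
log = []


def record(task, **fields):
    log.append(task)
    return {"task": task}


scrape_workflow(record, "demo_job", ["keyword 1", "keyword 2"])
```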

 

 The Job


A Job contains up to 100,000 keywords and all information required to scrape the keyword results from Google.
There is no limit on the number of Jobs that can be created, but each Job must have its own unique identifier.

Job parameters:
  • Identifier
    [Alphanumeric]
    The identifier is a unique alphanumeric string. 
  • Results per Keyword
    [Numeric] 100,200,300 up to 1000, default 100
    The number of organic results to scrape per keyword
  • Priority
    [Numeric] 0-20, default 1
    The priority your job receives within our scraping infrastructure. A higher priority reduces the time required to deliver results but increases the credit cost.
    A priority of 0 significantly reduces the credit cost, but only idle resources are allocated to your job, so the time to complete the Job can be unpredictable.
  • Localization - Country
    [Country], default 'Default'
    The Google country to scrape results from (the list of countries can be found on this manual page)
  • Localization - Language
    [Language], default 'English'
    The localized language used during scraping. The language needs to match the selected country to receive accurate results.
  • Keywords
    A number of keywords to scrape results for
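
The numeric limits listed above can be checked on the client side before a Job is created. This is a rough sketch under the assumption that the limits are exactly as stated in the list; the function name validate_job is hypothetical and the API may apply further checks of its own.

```python
MAX_KEYWORDS = 100_000


def validate_job(max_results=100, priority=1, keywords=()):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    # Results per Keyword: 100, 200, 300 ... up to 1000
    if max_results not in range(100, 1001, 100):
        problems.append("max_results must be 100, 200, ... up to 1000")
    # Priority: whole number from 0 to 20
    if priority not in range(0, 21):
        problems.append("priority must be an integer from 0 to 20")
    if len(keywords) > MAX_KEYWORDS:
        problems.append("a Job holds at most 100,000 keywords")
    return problems
```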


A job can be modified and deleted freely until it has been started.

The credit cost depends on Priority, Results per Keyword and the number of Keywords, and is deducted when the Job is started.
By using get_job it is possible to estimate the scraping timespan, which depends on the selected priority and the number of results to scrape.
Once the job has been started it cannot be modified anymore and will continue to run until finished.


 API Tasks

get_jobs: 
https://scraping.services/api?user=_USER@ACCOUNT_&credentials=_CREDENTIALS_&job_type=scrape_google_search&type=json&task=get_jobs

This API call will return up to 50 jobs. All details about the jobs are included except the keywords, result data and detailed progress.

The API task "get_job" can be used to retrieve progress details about one specific job.

 

Response example:

{
  "exceptions": [],
  "answer":
{
"jobs" :
[
  {
    "jobname": "first test",
    "created": "2016-04-11 03:20:21",
    "job_type": "scrape_google_search",
    "max_results": "100",
    "estimated_keywords": "4",
    "priority": "1",
    "finished": "2016-12-11 03:21:01",
    "credit_cost": "0.90",
    "start": "1",
    "progress": 100
  },
  {
    "jobname": "second test",
    "created": "2016-12-11 03:06:08",
    "job_type": "scrape_google_search",
    "max_results": "100",
    "estimated_keywords": "5",
    "priority": "1",
    "finished": "2016-12-11 03:13:01",
    "credit_cost": "0.90",
    "start": "1",
    "progress": 100
  }
  ]
}
}

- progress shows only 0 or 100; detailed values are available through "get_job". Aborted or stopped jobs always show 0
- finished is either null or a datetime; use this field to detect whether a job has completed
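
Detecting completed jobs through the "finished" field might look like the sketch below. It assumes that a non-empty "exceptions" array signals a failed call, which this page does not state explicitly; the function name and the job named "pending" are made up for illustration.

```python
import json


def finished_jobs(response_text):
    """Return the names of all jobs whose "finished" field holds a datetime."""
    data = json.loads(response_text)
    if data["exceptions"]:
        # Assumption: any entry in "exceptions" means the call failed.
        raise RuntimeError(f"API reported exceptions: {data['exceptions']}")
    return [
        job["jobname"]
        for job in data["answer"]["jobs"]
        if job.get("finished") not in (None, "", "null")
    ]


sample = (
    '{"exceptions": [], "answer": {"jobs": ['
    '{"jobname": "first test", "finished": "2016-12-11 03:21:01", "progress": 100},'
    '{"jobname": "pending", "finished": null, "progress": 0}]}}'
)
done = finished_jobs(sample)
```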


get_job:

https://scraping.services/api?user=_USER@ACCOUNT_&credentials=_CREDENTIALS_&job_type=scrape_google_search&type=json&task=get_job&jobname=_JOBNAME_

This call is used to receive details about a specific job, especially once it has been started.

The API responds with a live progress value, the number of scraped keywords and the number of available pages.

 

Example response:

{
"exceptions": [],
"answer":
{
"job":
{
"jobname": "demo_job",
"created": "2016-10-11 00:41:40",
"job_type": "scrape_google_search",
"max_results": "100",
"estimated_keywords": "114",
"priority": "1",
"finished": "2016-12-11 03:04:01",
"credit_cost": "0.90",
"start": "1",
"progress": 100,
"total_keywords": "114",
"total_results": "114",
"total_pages": 3,
"estimated_hours_left": 0.04,
"pages" : 3
}
}
}


- estimated_keywords is the number of keywords originally accepted
- finished is either "null" or the date and time when the last keyword was scraped
- progress is the percentage of finished keywords
- total_keywords is the actual number of keywords, which can be lower if a keyword had to be rejected
- total_results is the actual number of scraped keywords
- total_pages is the number of "pages" available once scraping is finished
- estimated_hours_left is the estimated time the job will run once started, or until it finishes (float value in hours)
- pages is the number of "pages" available through the "get_results" API call
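
The fields above can be condensed into a short status summary. The sketch below works on the "job" object of a get_job response; the function name and the keys of the returned dictionary are assumptions chosen for illustration. Numeric fields are converted explicitly because the example response mixes strings and numbers.

```python
def summarize_job(job):
    """Condense the "job" object of a get_job response into a few values."""
    finished = job.get("finished") not in (None, "", "null")
    return {
        "finished": finished,
        "progress_percent": float(job["progress"]),
        # estimated_hours_left is a float in hours; convert to minutes
        "minutes_left": round(float(job["estimated_hours_left"]) * 60, 1),
        "pages_ready": int(job["pages"]),
    }


demo = {
    "jobname": "demo_job",
    "finished": "2016-12-11 03:04:01",
    "progress": 100,
    "estimated_hours_left": 0.04,
    "pages": 3,
}
status = summarize_job(demo)
```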



get_keywords:

https://scraping.services/api?user=_USER@ACCOUNT_&credentials=_CREDENTIALS_&job_type=scrape_google_search&type=json&task=get_keywords&jobname=_JOBNAME_

This call retrieves all keywords of a job, separated into "finished" and "unfinished".

Available response types are: html, csv, json

 

Example response:

{
"exceptions": [],
"answer":
{
"keywords":
{
"jobname": "demo_job",
"job_type": "scrape_google_search",
"keywords_finished": ["keyword1","keyword2","keyword3"],
"keywords_unfinished": ["keyword4","keyword5","keyword6"]
}
}
}
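
A small sketch of how the two keyword lists could be used to measure how far a job has come. It takes the "answer" object of a JSON get_keywords response; the function name is hypothetical.

```python
def split_keywords(answer):
    """Return (finished, unfinished, share_done) from a get_keywords answer."""
    block = answer["keywords"]
    finished = block["keywords_finished"]
    unfinished = block["keywords_unfinished"]
    total = len(finished) + len(unfinished)
    share_done = len(finished) / total if total else 0.0
    return finished, unfinished, share_done


answer = {
    "keywords": {
        "jobname": "demo_job",
        "job_type": "scrape_google_search",
        "keywords_finished": ["keyword1", "keyword2", "keyword3"],
        "keywords_unfinished": ["keyword4", "keyword5", "keyword6"],
    }
}
finished, unfinished, share_done = split_keywords(answer)
```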




create_job:

https://scraping.services/api?user=_USER@ACCOUNT_&credentials=_CREDENTIALS_&job_type=scrape_google_search&type=json&task=create_job

To create a Job, an additional JSON-formatted object needs to be passed as the POST parameter "json={....}"

 

Example:

{
  "job_type" : "scrape_google_search",
  "language" : "English",
  "country" : "default",
  "max_results" : 100,
  "jobname" : "demo_job",
  "priority" : 1
}

The API responds, as usual, with a JSON-encoded object:
{
  "exceptions": [],
  "answer":
  {
"job":
{
    "jobname": "demo_job",
    "job_type": "scrape_google_search"
  }
}
}
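
A create_job request could be prepared in Python roughly as follows. The sketch assumes that the "json" POST parameter is sent as an ordinary form-encoded body, which this page does not specify in detail; the function name is hypothetical. The request is only built here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://scraping.services/api"


def build_create_job_request(user, credentials, job):
    """Build (but do not send) the POST request for the create_job task."""
    query = urlencode({
        "user": user,
        "credentials": credentials,
        "job_type": job["job_type"],
        "type": "json",
        "task": "create_job",
    })
    # Assumption: the job object travels as the form field "json".
    body = urlencode({"json": json.dumps(job)}).encode("ascii")
    return Request(API_BASE + "?" + query, data=body, method="POST")


job = {
    "job_type": "scrape_google_search",
    "language": "English",
    "country": "default",
    "max_results": 100,
    "jobname": "demo_job",
    "priority": 1,
}
request = build_create_job_request("_USER@ACCOUNT_", "_CREDENTIALS_", job)
# Sending it would be: urllib.request.urlopen(request).read()
```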


Once a Job has been created it can be populated with keywords.

Available countries and languages:

 

 

This API query can be easily tested by using the command line tool "curl":




remove_job:

https://scraping.services/api?user=_USER@ACCOUNT_&credentials=_CREDENTIALS_&job_type=scrape_google_search&type=json&task=remove_job&jobname=_JOBNAME_

This removes a selected Job by its unique identifier.

You may remove any Job that has not been started.

 

API Response:

{
"exceptions": [],
"answer":
[
true
]
}



modify_job:

https://scraping.services/api?user=_USER@ACCOUNT_&credentials=_CREDENTIALS_&job_type=scrape_google_search&type=json&task=modify_job

To modify a Job, an additional JSON-formatted object needs to be passed as the POST parameter "json={....}"
A Job may be modified, extended or deleted until it has been started.

Once a Job has been started it cannot be modified in any way.

The supplied JSON object must include "job_type" and "jobname"; all other elements are optional.

When adding keywords to a Job, the element "keyword_operation" may contain "replace" or "append".

  "replace" will remove any existing keywords from the selected Job before adding new keywords.

  "append" is the default behaviour and will append any supplied keywords to the selected Job.

Example JSON object:

{
  "job_type" : "scrape_google_search",
  "language" : "English",
  "country" : "default",
  "max_results" : 200,
  "jobname" : "demo_job",
  "priority" : 2,
"keywords" :
[
"keyword 1",
"keyword 2",
"keyword 3"
],
"keyword_operation" : "replace"
}

 

Example API response:

{
"exceptions": [],
"answer":
{
"job":
{
"jobname": "demo_job",
"max_results": 200,
"priority": 2,
"job_type": "scrape_google_search",
"estimated_keywords": 3,
"credit_cost": 0.9
}
}
}

The response reflects all applied changes and shows the credit cost that will be charged when the Job is started.

There is a minimum credit cost for any Job, currently 0.6 credits for a Priority 1 Job and 0.9 credits for a Priority 2 Job.
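
The JSON object for modify_job can be assembled with a small helper. This is a sketch only: the function name is hypothetical, and it merely enforces the two documented values of "keyword_operation" plus the two mandatory elements.

```python
import json


def build_modify_payload(jobname, keywords, operation="append", **options):
    """Return the JSON object for modify_job as a Python dict."""
    if operation not in ("append", "replace"):
        raise ValueError('keyword_operation must be "append" or "replace"')
    payload = {
        "job_type": "scrape_google_search",  # mandatory
        "jobname": jobname,                  # mandatory
        "keywords": list(keywords),
        "keyword_operation": operation,
    }
    payload.update(options)  # optional elements such as max_results or priority
    return payload


payload = build_modify_payload(
    "demo_job",
    ["keyword 1", "keyword 2", "keyword 3"],
    operation="replace",
    max_results=200,
    priority=2,
)
json_parameter = json.dumps(payload)  # value for the POST parameter "json"
```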

 



start_job:

https://scraping.services/api?user=_USER@ACCOUNT_&credentials=_CREDENTIALS_&job_type=scrape_google_search&type=json&task=start_job&jobname=_JOBNAME_

Any created job that contains keywords may be started at any time.

Once started, the job is locked from further modification and the credit cost is immediately deducted from your wallet.

This action is non-refundable and cannot be interrupted.

 

Once a job has been started it can be monitored using the "get_job" API call to see how many keywords are already finished.

It is possible to download the keyword results using "get_results" while a job is running.

Scraped keyword results are stored for 14 days; once this timespan has passed, the detailed information is removed.

 

Example response:

{
"exceptions": [],
"answer":
{
"job":
{
"jobname": "demo_job",
"start": "1",
"job_type": "scrape_google_search",
"credit_cost": "0.90"
}
}
}
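
Monitoring a started job with "get_job" could be done with a simple polling loop such as the sketch below. The `fetch_job` argument is a hypothetical caller-supplied function that returns the "job" object of a get_job response; the polling interval and the poll limit are arbitrary assumptions, not values recommended by this page.

```python
import time


def wait_for_job(fetch_job, interval_seconds=60, max_polls=1000, sleep=time.sleep):
    """Poll until the job reports a "finished" datetime, then return the job."""
    for _ in range(max_polls):
        job = fetch_job()
        if job.get("finished") not in (None, "", "null"):
            return job
        sleep(interval_seconds)
    raise TimeoutError("job did not finish within the polling budget")


# Usage with canned responses instead of real API calls.
states = iter([
    {"jobname": "demo_job", "finished": None, "progress": 40},
    {"jobname": "demo_job", "finished": "2016-12-11 03:04:01", "progress": 100},
])
result = wait_for_job(lambda: next(states), sleep=lambda seconds: None)
```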



get_results:

https://scraping.services/api?user=_USER@ACCOUNT_&credentials=_CREDENTIALS_&job_type=scrape_google_search&type=json&task=get_results&jobname=_JOBNAME_&page=_PAGE_

Scraped Google keyword results are downloaded in pages, each page contains 50 keyword results.

The estimated download size is between 1 and 10 MB per page.

* The optional parameter 'result_per_keyword_quantity' limits the number of results per keyword.
* The optional parameter 'page_quantity' adjusts the number of keyword results per page from 50 to a higher or lower value (50 keywords at 100+ results/keyword, 5000 keywords at 1 result/keyword). Every 50 keywords internally generate one API call (0.01 credits charge).

The API response object contains an array "keyword_results" with up to 50 keyword results.
Each keyword_result contains information about the keyword as well as all organic results and advertisement results, again in form of two arrays.
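
Downloading page by page might be sketched as a generator. The `fetch_page` argument is a hypothetical caller-supplied function that returns the "answer" object of a get_results call for a given page number; the loop relies on the "pages_available" field of that object to decide when to stop.

```python
def iter_keyword_results(fetch_page):
    """Yield every keyword result, requesting one page after another."""
    page = 1
    while True:
        answer = fetch_page(page)
        yield from answer["keyword_results"]
        # pages_available is re-read on every page in case more became ready
        if page >= int(answer["pages_available"]):
            break
        page += 1


# Usage with two canned pages instead of real API calls.
canned = {
    1: {"keyword_results": [{"keyword": "a"}], "pages_available": 2},
    2: {"keyword_results": [{"keyword": "b"}], "pages_available": 2},
}
keywords_seen = [item["keyword"] for item in iter_keyword_results(canned.__getitem__)]
```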

Let's take a look at a simplified API response:

{
"results":
{
"exceptions": [],
"answer": {
"jobname": "demo_test_2",
"job_type": "scrape_google_search",
"keyword_results":
[
{
"count_organic": 101,
"count_creative": 0,
"language": "en",
"country": "Default",
"keyword": "A Chip on Your Shoulder",
"resultstats": { "1": "799000", "2": "650000" },
"results_organic":
[
{
"url": "https:\/\/en.wikipedia.org\/wiki\/Chip_on_shoulder",
"title": "Chip on shoulder - Wikipedia",
"description": "To have a chip on one's shoulder refers to the act of holding a grudge or grievance that readily provokes disputation.",
"sitelinks": [],
"flags": { "hacked": 0, "harmful": 0 }, "page": 1

},
{
"url": "https:\/\/en.wikipedia.org\/wiki\/Chip_on_shoulder",
"title": "Chip on shoulder - Wikipedia",
"description": "",
"sitelinks": [],
"flags": { "hacked": 0, "harmful": 0 }, "page": 1
},
{
"url": "https:\/\/en.wikipedia.org\/wiki\/Chip_on_shoulder",
"title": "Chip on shoulder - Wikipedia",
"description": "To have a chip on one's shoulder refers to the act of holding a grudge or grievance that readily provokes disputation.",
"sitelinks": [],
"flags": { "hacked": 0, "harmful": 0 }, "page": 1

},
{
"url": "http:\/\/dictionary.cambridge.org\/us\/dictionary\/english\/have-a-chip-on-your-shoulder",
"title": "have a chip on your shoulder Definition in the Cambridge English ...",
"description": "have a chip on your shoulder definition, meaning, what is have a chip on your shoulder: to seem angry all the time because you think you have been treated ...",
"sitelinks": [],
"flags": { "hacked": 0, "harmful": 0 }, "page": 1

} /*,... additional results*/
],
"results_creative": []
} /*,... additional keywords*/
],
"page": 1,
"pages_expected": 1,
"pages_available": 1
}
}
}

 

- keyword_results: is an array of objects, each object is one keyword :
-- count_organic: the number of organic (normal) results for this keyword
-- count_creative: the number of advertisements for this keyword
-- language: the language code
-- country: the country code
-- keyword: the keyword
-- resultstats: the number of results reported for each page (array)
-- results_organic:
--- url: The URL 
--- title: The title
--- description: The result description
--- flags: (array) reports found Google flags ("this site was hacked", "this site may be harmful")
--- page: The page of this keyword on which the result was found (by default we scrape 100 results per page)
--- sitelinks: An array with possible sitelinks
---- An array containing all Google sitelinks with url, title, description each
-- results_creative: an array identical to results_organic
- page: the current page of results requested (containing up to 50 keywords each containing up to 1000 results)
- pages_available: the number of total pages currently available
- pages_expected: the number of total pages expected when the job finished
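
A common use of "results_organic" is to find the position of a site for a keyword. The sketch below assumes that the order of the array reflects the ranking order, which this page does not state explicitly; the function name is hypothetical.

```python
from urllib.parse import urlparse


def first_rank_for_domain(keyword_result, domain):
    """Return the 1-based position of the first organic result on `domain`."""
    for rank, result in enumerate(keyword_result["results_organic"], start=1):
        host = urlparse(result["url"]).hostname or ""
        # match the domain itself and any of its subdomains
        if host == domain or host.endswith("." + domain):
            return rank
    return None


keyword_result = {
    "keyword": "A Chip on Your Shoulder",
    "results_organic": [
        {"url": "https://en.wikipedia.org/wiki/Chip_on_shoulder", "page": 1},
        {"url": "http://dictionary.cambridge.org/us/dictionary/english/have-a-chip-on-your-shoulder", "page": 1},
    ],
}
rank = first_rank_for_domain(keyword_result, "cambridge.org")
```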

 

New feature - Google Knowledge
We have added Google Knowledge into our JSON result data! (CSV to be added at a later time)
You may now search for

  • Persons
  • Companies
  • Cities
  • Restaurants and shopping
  • Books and Movies
  • Attractions
  • Locations and weather information


 

Take an English search for "Obama", for example:

                    "result_knowledge": {
                        "image_logo_src": "https:\/\/lh6.googleusercontent.com\/-2lJYGtfXKwQ\/AAAAAAAAAAI\/AAAAAAAB2HQ\/QSmIw0nQN_c\/s0-c-k-no-ns\/photo.jpg",
                        "header_text": "Barack Obama",
                        "header_sub_text": "44th U.S. President",
                        "times_open": [],
                        "description": "Barack Hussein Obama II is an American politician who served as the 44th President of the United States from 2009 to 2017. He was the first African American to serve as president, as well as the first born outside the contiguous United States. Wikipedia ",
                        "items_of_interest": [
                            "Barack Obama was the 44th and Donald Trump is the 45th President of the United States.",
                            "Hillary Clinton",
                            "Barack Obama and Michelle Obama have been married since 1992.",
                            "Ann Dunham was Barack Obama's mother.",
                            "Vladimir Putin"
                        ],
                        "person_born_info": "August 4, 1961 (age 55 years), Kapiolani Medical Center for Women and Children, Honolulu, HI",
                        "person_education_info": "Harvard Law School (1988–1991), More"
                    }

 Another example, an English search for "Ferrari":

 "result_knowledge": {
                        "image_logo_src": "http:\/\/lh5.googleusercontent.com\/-SnTKdWZA8NI\/VWSg3sycQ6I\/AAAAAAAAACk\/37STRepzvJM5uuLfSs793X4Feo_CI1HJgCJkC\/s241-k-no\/",
                        "location_href": "https:\/\/www.google.com\/maps\/place\/MAG+Ferrari+In+Columbus,+Ohio\/@40.099688,-83.125521,15z\/data=!4m2!3m1!1s0x0:0x2ac116cff7ca4e58?sa=X&ved=0ahUKEwifqZeGy-XRAhUM7YMKHevBBQEQ_BII8wUwfA",
                        "location_img_src": "https:\/\/www.google.com\/maps\/vt\/data=RfCSdfNZ0LFPrHSm0ublXdzhdrDFhtmHhN1u-gM,oQRzStoSkLpDIbvDpw9QlO3VnR7-oeqWzIo7ofJbM-zACv_FTzt7ishssYouQq_AXBXkLbKUZbiHQ_7CzU4C1Dxr5Uzv4rs935AuzFkVvuhGcQ-EdBJhM_xRRlJ0QkCEHkn6P6Q",
                        "header_text": "Ferrari S.p.A.MAG Ferrari In Columbus, Ohio",
                        "header_sub_text": "Car manufacturer",
                        "website_href": "http:\/\/www.dublin.ferraridealers.com\/",
                        "address_text": "5035 Post Rd, Dublin, OH 43017",
                        "phone_text": "Phone:(614) 389-5557",
                        "times_open": [
                            {
                                "day": "Saturday",
                                "time": "9AM–6PM"
                            },
                            {
                                "day": "Sunday",
                                "time": "Closed"
                            },
                            {
                                "day": "Monday",
                                "time": "9AM–7PM"
                            },
                            {
                                "day": "Tuesday",
                                "time": "9AM–7PM"
                            },
                            {
                                "day": "Wednesday",
                                "time": "9AM–7PM"
                            },
                            {
                                "day": "Thursday",
                                "time": "9AM–7PM"
                            },
                            {
                                "day": "Friday",
                                "time": "9AM–6PM"
                            }
                        ],
                        "description": "Ferrari S.p.A. is an Italian sports car manufacturer based in Maranello. Founded by Enzo Ferrari in 1939 as Auto Avio Costruzioni, the company built its first car in 1940. Wikipedia ",
                        "items_of_interest": [
                            "Lamborghini",
                            "Maserati",
                            "Tesla Motors",
                            "Bugatti Automobiles",
                            "Porsche"
                        ],
                        "organization_founded_info": "September 13, 1939, Maranello, Italy"
                    }

 

The full list of possible values within result_knowledge:

  • image_logo_src - a link to the main image
  • location_href - a link to the Google Maps pinpoint
  • location_img_src - a link to the Google Maps image thumbnail
  • header_text - the main headline
  • header_sub_text - the sub headline
  • website_href - the website URL
  • address_text - the address
  • phone_text - the phone number
  • times_open - an array of opening times containing 'day' and 'time'.
  • description - a description which might come from Wikipedia, IMDB or somewhere else
  • weather_info - the current weather
  • population_info - the population
  • items_of_interest=[] - an array with most related keywords (more attractions, related actors, all movies or books from a series, etc.)
  • height_text  - architectural height
  • opened_info - when an attraction was opened first
  • review_rating -  a numeric review rating ("4,7")
  • currency_info - the currency used at the location
  • location_area_info - square meters or similar
  • organization_founded_info - when an organization was founded
  • location_architects_info - the architect(s) of a building/attraction
  • person_born_info - when a person was born
  • person_deceased_info - when a person has died
  • book_published_info - when a book was published
  • company_stock_info - N/A (currently disabled)

Note that the knowledge data is returned in the language used for the search, including day names, time formats and language-specific conventions.
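
Because most values in result_knowledge are optional, reading them defensively is advisable. The sketch below is one possible way to do that; the function names and the keys of the returned summary are assumptions chosen for illustration.

```python
def opening_hours(result_knowledge):
    """Map each day name to its opening time; empty if times_open is absent."""
    return {
        entry["day"]: entry["time"]
        for entry in result_knowledge.get("times_open", [])
    }


def knowledge_summary(result_knowledge):
    """Collect a few optional values, using None where a value is missing."""
    return {
        "title": result_knowledge.get("header_text"),
        "subtitle": result_knowledge.get("header_sub_text"),
        "website": result_knowledge.get("website_href"),
        "hours": opening_hours(result_knowledge),
    }


knowledge = {
    "header_text": "Barack Obama",
    "header_sub_text": "44th U.S. President",
    "times_open": [],
}
summary = knowledge_summary(knowledge)
```

Day names in the "hours" mapping appear in the search language, so they should not be compared against English names for non-English searches.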