
Basic API information

Minimal API link: https://scraping.services/api?user=_USER@ACCOUNT_&credentials=_CREDENTIALS_
API login: _USER@ACCOUNT_
API key: _CREDENTIALS_
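
As a minimal sketch in Python, the link above could be assembled from its parts like this. The helper name build_api_url is ours and not part of the service, and the sketch assumes the server decodes percent-encoded query values in the usual way, because urlencode escapes the "@" in the login.

```python
from urllib.parse import urlencode

API_BASE = "https://scraping.services/api"


def build_api_url(user, credentials, **params):
    """Return the API URL with the login, the key and any extra parameters."""
    query = {"user": user, "credentials": credentials}
    query.update(params)
    # urlencode percent-escapes reserved characters such as "@".
    return API_BASE + "?" + urlencode(query)


print(build_api_url("_USER@ACCOUNT_", "_CREDENTIALS_"))
```

Extra keyword arguments such as task="get_jobs" are appended as further query parameters.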

Scrape Google Search

Every Google Search API call requires the "&job_type=scrape_google_search" parameter.

The API supports the same tasks as your Google Search dashboard.

How to scrape:
  1. Create a Job
  2. Modify the job by adding keywords, up to 100,000
  3. Start the Job
  4. Download and store results (result details are available for 14 days within our database)
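
The four steps map onto API tasks described in the API Tasks section. The following Python sketch only builds the URLs in the order they would be called; the function name workflow_urls is hypothetical, and the JSON objects that create_job and modify_job expect as POST data are left out.

```python
from urllib.parse import urlencode

API_BASE = "https://scraping.services/api"

# Workflow steps in order, each mapped to the API task that performs it.
WORKFLOW = [
    ("Create a Job", "create_job"),
    ("Add keywords", "modify_job"),
    ("Start the Job", "start_job"),
    ("Download results", "get_results"),
]


def workflow_urls(user, credentials, jobname):
    """Return (step, url) pairs for the four steps, in calling order."""
    urls = []
    for step, task in WORKFLOW:
        query = {
            "user": user,
            "credentials": credentials,
            "job_type": "scrape_google_search",
            "type": "json",
            "task": task,
        }
        # create_job and modify_job carry the job name inside their POSTed
        # JSON object; the other two tasks take it as a query parameter.
        if task in ("start_job", "get_results"):
            query["jobname"] = jobname
        if task == "get_results":
            query["page"] = 1
        urls.append((step, API_BASE + "?" + urlencode(query)))
    return urls


for step, url in workflow_urls("_LOGIN_", "_PASSWORD_", "demo_job"):
    print(step, "->", url)
```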

 

 The Job


A Job contains up to 100,000 keywords and all the information required to scrape the results for those keywords from Google.
There is no limit on the number of Jobs that can be created; each Job is identified by a unique identifier, which is passed as "jobname" in the API calls.

Job parameters:


A Job can be modified or deleted freely until it has been started.

The credit cost depends on the priority, the results per keyword and the number of keywords. It is deducted when the Job is started.
Using get_job you can estimate how long scraping will take, which depends on the selected priority and the number of results to scrape.
Once the Job has been started it can no longer be modified and will run until it has finished.


 API Tasks

get_jobs: 
https://scraping.services/api?user=_LOGIN_&credentials=_PASSWORD_&job_type=scrape_google_search&type=json&task=get_jobs

This API call returns up to 50 jobs. All job details are included except the keywords, the result data and the detailed progress.

The API task "get_job" can be used to retrieve progress details about one specific job.

 

Response example:

{
  "exceptions": [],
  "answer": {
    "jobs": [
      {
        "jobname": "first test",
        "created": "2016-04-11 03:20:21",
        "job_type": "scrape_google_search",
        "max_results": "100",
        "estimated_keywords": "4",
        "priority": "1",
        "finished": "2016-12-11 03:21:01",
        "credit_cost": "0.60",
        "start": "1",
        "progress": 100
      },
      {
        "jobname": "second test",
        "created": "2016-12-11 03:06:08",
        "job_type": "scrape_google_search",
        "max_results": "100",
        "estimated_keywords": "5",
        "priority": "1",
        "finished": "2016-12-11 03:13:01",
        "credit_cost": "0.60",
        "start": "1",
        "progress": 100
      }
    ]
  }
}
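
Most numeric fields in this response arrive as strings. A small Python sketch of decoding the job list might look as follows; the function name parse_jobs is ours, and treating a non-empty "exceptions" array as an error is an assumption, since only the empty case is shown here.

```python
import json

# Trimmed copy of the response example above.
SAMPLE_RESPONSE = """
{
  "exceptions": [],
  "answer": {
    "jobs": [
      {"jobname": "first test", "max_results": "100",
       "estimated_keywords": "4", "credit_cost": "0.60", "progress": 100},
      {"jobname": "second test", "max_results": "100",
       "estimated_keywords": "5", "credit_cost": "0.60", "progress": 100}
    ]
  }
}
"""


def parse_jobs(response_text):
    """Decode a get_jobs response and convert string fields to numbers."""
    payload = json.loads(response_text)
    # Assumption: a non-empty "exceptions" array signals an error.
    if payload.get("exceptions"):
        raise RuntimeError("API reported exceptions: %r" % (payload["exceptions"],))
    jobs = []
    for job in payload["answer"]["jobs"]:
        jobs.append({
            "jobname": job["jobname"],
            "max_results": int(job["max_results"]),
            "estimated_keywords": int(job["estimated_keywords"]),
            "credit_cost": float(job["credit_cost"]),
            "progress": job["progress"],
        })
    return jobs


jobs = parse_jobs(SAMPLE_RESPONSE)
print([job["jobname"] for job in jobs])
```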


get_job:

https://scraping.services/api?user=_LOGIN_&credentials=_PASSWORD_&job_type=scrape_google_search&type=json&task=get_job&jobname=_JOBNAME_

This call returns details about a specific job and is especially useful once the job has been started.

The API responds with a live progress value, the number of scraped keywords and the number of available pages.

 

Example response:

{
  "exceptions": [],
  "answer": {
    "job": {
      "jobname": "demo_job",
      "created": "2016-10-11 00:41:40",
      "job_type": "scrape_google_search",
      "max_results": "100",
      "estimated_keywords": "114",
      "priority": "1",
      "finished": "2016-12-11 03:04:01",
      "credit_cost": "0.60",
      "start": "1",
      "progress": 100,
      "total_keywords": "114",
      "total_results": "114",
      "total_pages": 3,
      "estimated_hours_left": 0.04,
      "pages": 3
    }
  }
}


- estimated_keywords is the number of keywords originally accepted
- finished is either "null" or the date and time when the last keyword was scraped
- progress is the percentage of finished keywords
- total_keywords is the actual number of keywords, which can differ if a keyword had to be rejected
- total_results is the actual number of scraped keywords
- total_pages is the number of "pages" available once scraping is finished
- estimated_hours_left is the estimated time the job will run once started, or the time left until it finishes, as a float value in hours
- pages is the number of "pages" available through the "get_results" API call
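
A running job can be watched by calling get_job repeatedly until progress reaches 100. The Python sketch below shows one way to structure such a loop. The names wait_until_finished and fetch_job are hypothetical: fetch_job stands for any callable that performs the get_job request and returns the decoded "job" object, and the 60 second interval is an arbitrary choice, not a documented recommendation.

```python
import time


def wait_until_finished(fetch_job, poll_seconds=60, max_polls=1000, sleep=time.sleep):
    """Poll until the job reports 100 percent progress, then return it.

    fetch_job is a hypothetical callable that performs the get_job request
    and returns the decoded "job" object as a dict.
    """
    for _ in range(max_polls):
        job = fetch_job()
        # progress is the percentage of finished keywords.
        if float(job.get("progress", 0)) >= 100:
            return job
        sleep(poll_seconds)
    raise TimeoutError("job did not finish within %d polls" % max_polls)


# Usage with canned states instead of real API calls.
states = iter([
    {"progress": 40, "finished": None, "estimated_hours_left": 0.04},
    {"progress": 100, "finished": "2016-12-11 03:04:01", "estimated_hours_left": 0},
])
final = wait_until_finished(lambda: next(states), sleep=lambda seconds: None)
print(final["finished"])
```

Passing the sleep function as a parameter keeps the loop testable without waiting.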




create_job:

https://scraping.services/api?user=_LOGIN_&credentials=_PASSWORD_&job_type=scrape_google_search&type=json&task=create_job

To create a Job, an additional JSON-formatted object needs to be passed as the POST parameter "json={....}".

 

Example:

{
  "job_type" : "scrape_google_search",
  "language" : "English",
  "country" : "default",
  "max_results" : 100,
  "jobname" : "demo_job",
  "priority" : 1
}

The API responds, as usual, with a JSON-encoded object:
{
  "exceptions": [],
  "answer": {
    "job": {
      "jobname": "demo_job",
      "job_type": "scrape_google_search"
    }
  }
}


Once a Job has been created it can be populated with keywords.
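
The create_job request described above can be prepared in Python with the standard library only. This is a sketch under one assumption: that "POST parameter" means a form-encoded body (application/x-www-form-urlencoded) with a single field named "json". The function name create_job_request is ours, and the request is built but not sent.

```python
import json
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

API_BASE = "https://scraping.services/api"


def create_job_request(user, credentials, job):
    """Build (but do not send) the POST request for the create_job task."""
    query = urlencode({
        "user": user,
        "credentials": credentials,
        "job_type": "scrape_google_search",
        "type": "json",
        "task": "create_job",
    })
    # Assumption: the JSON object travels as the form field "json".
    body = urlencode({"json": json.dumps(job)}).encode("ascii")
    # Supplying data makes urllib issue a POST instead of a GET.
    return Request(API_BASE + "?" + query, data=body)


job = {
    "job_type": "scrape_google_search",
    "language": "English",
    "country": "default",
    "max_results": 100,
    "jobname": "demo_job",
    "priority": 1,
}
request = create_job_request("_LOGIN_", "_PASSWORD_", job)
print(request.get_method(), request.full_url)

# Round trip: the body decodes back to the original JSON object.
decoded = json.loads(parse_qs(request.data.decode("ascii"))["json"][0])
print(decoded == job)
```

Sending the request with urllib.request.urlopen(request) is left out so the sketch runs without a live account.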

Available countries and languages:

 

 

This API query can be easily tested by using the command line tool "curl":




remove_job:

https://scraping.services/api?user=_LOGIN_&credentials=_PASSWORD_&job_type=scrape_google_search&type=json&task=remove_job&jobname=_JOBNAME_

This removes the selected Job by its unique identifier.

You may remove any Job that has not been started.

 

API Response:

{
  "exceptions": [],
  "answer": [
    true
  ]
}
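
A caller may want to confirm the removal before discarding its own record of the Job. The following Python sketch checks for exactly the response shape shown above; the function name removal_succeeded is ours, and treating any other shape as a failure is an assumption.

```python
import json


def removal_succeeded(response_text):
    """Return True only for the documented success shape of remove_job."""
    payload = json.loads(response_text)
    # Assumption: success means an empty "exceptions" array and an
    # "answer" array that contains the single value true.
    return not payload.get("exceptions") and payload.get("answer") == [True]


print(removal_succeeded('{"exceptions": [], "answer": [true]}'))
```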



modify_job:

https://scraping.services/api?user=_LOGIN_&credentials=_PASSWORD_&job_type=scrape_google_search&type=json&task=modify_job

To modify a Job, an additional JSON-formatted object needs to be passed as the POST parameter "json={....}".
A Job may be modified, extended or deleted until it has been started.

Once a Job has been started it cannot be modified in any way.

The supplied JSON object must include "job_type" and "jobname"; all other elements are optional.

When adding keywords to a Job, the element "keyword_operation" may contain "replace" or "append".

  "replace" removes any existing keywords from the selected Job before adding the new keywords.

  "append" is the default behaviour and appends the supplied keywords to the selected Job.

Example json object:

{
  "job_type" : "scrape_google_search",
  "language" : "English",
  "country" : "default",
  "max_results" : 200,
  "jobname" : "demo_job",
  "priority" : 2,
  "keywords" : [
    "keyword 1",
    "keyword 2",
    "keyword 3"
  ],
  "keyword_operation" : "replace"
}

 

Example API response:

{
  "exceptions": [],
  "answer": {
    "job": {
      "jobname": "demo_job",
      "max_results": 200,
      "priority": 2,
      "job_type": "scrape_google_search",
      "estimated_keywords": 3,
      "credit_cost": 0.9
    }
  }
}

The response reflects all applied changes and shows the credit cost that will be charged when the Job is started.

There is a minimum credit cost for any Job, currently 0.6 credits for a Priority 1 Job and 0.9 credits for a Priority 2 Job.
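
Large keyword lists can be uploaded in several modify_job calls by combining "replace" and "append". The Python sketch below builds one JSON object per batch: the first batch replaces whatever the Job holds, later batches append. The function name keyword_payloads and the batch size of 1000 are our assumptions, as no per-request limit is stated here; only the 100,000 keyword limit per Job comes from the text.

```python
MAX_KEYWORDS_PER_JOB = 100000  # limit stated for a single Job


def keyword_payloads(jobname, keywords, batch_size=1000):
    """Split keywords into modify_job JSON objects, one per batch."""
    keywords = list(keywords)
    if len(keywords) > MAX_KEYWORDS_PER_JOB:
        raise ValueError("a Job holds at most %d keywords" % MAX_KEYWORDS_PER_JOB)
    payloads = []
    for start in range(0, len(keywords), batch_size):
        payloads.append({
            "job_type": "scrape_google_search",
            "jobname": jobname,
            "keywords": keywords[start:start + batch_size],
            # The first batch clears existing keywords, later ones extend.
            "keyword_operation": "replace" if start == 0 else "append",
        })
    return payloads


batches = keyword_payloads("demo_job", ["k1", "k2", "k3", "k4", "k5"], batch_size=2)
print([batch["keyword_operation"] for batch in batches])
```

Each object would then be sent as the "json" POST parameter of its own modify_job call.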

 



start_job:

https://scraping.services/api?user=_LOGIN_&credentials=_PASSWORD_&job_type=scrape_google_search&type=json&task=start_job&jobname=_JOBNAME_

Any created job that contains keywords may be started at any time.

Starting a job immediately deducts the credit cost from your wallet and locks the job against further modification.

This action is non-refundable and cannot be interrupted.

Once a job has been started, it can be monitored with the "get_job" API call to see how many keywords are already finished.

Keyword results can be downloaded with "get_results" while the job is still running.

Scraped keyword results are stored for 14 days; after that the detailed result data is removed.

 

Example response:

{
  "exceptions": [],
  "answer": {
    "job": {
      "jobname": "demo_job",
      "start": "1",
      "job_type": "scrape_google_search",
      "credit_cost": "0.90"
    }
  }
}
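
Because starting a job is non-refundable, it can be worth comparing the reported credit cost against a budget before calling start_job. This is a minimal Python sketch; the function name within_budget and the idea of a client-side budget are ours and not a feature of the API.

```python
def within_budget(job, max_credits):
    """Return True if the job's credit cost does not exceed max_credits.

    job is the "job" object from a modify_job or get_job response.
    """
    # credit_cost appears both as a number (0.9) and as a string ("0.90")
    # in the response examples, so convert before comparing.
    return float(job["credit_cost"]) <= max_credits


print(within_budget({"jobname": "demo_job", "credit_cost": "0.90"}, 1.0))
```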



get_results:

https://scraping.services/api?user=_LOGIN_&credentials=_PASSWORD_&job_type=scrape_google_search&type=json&task=get_results&jobname=_JOBNAME_&page=_PAGE_

Scraped Google keyword results are downloaded in pages; each page contains up to 50 keyword results.

The estimated download size is between 1 and 10 MB per page.

The API response object contains an array "keyword_results" with up to 50 keyword results.
Each keyword result contains information about the keyword as well as all organic results and all advertisement results, each in its own array.

Let's take a look at a simplified API response:

{
  "results": {
    "exceptions": [],
    "answer": {
      "jobname": "demo_test_2",
      "job_type": "scrape_google_search",
      "keyword_results": [
        {
          "count_organic": 101,
          "count_creative": 0,
          "language": "en",
          "country": "Default",
          "keyword": "A Chip on Your Shoulder",
          "results_organic": [
            {
              "url": "https:\/\/en.wikipedia.org\/wiki\/Chip_on_shoulder",
              "title": "Chip on shoulder - Wikipedia",
              "description": "To have a chip on one's shoulder refers to the act of holding a grudge or grievance that readily provokes disputation.",
              "sitelinks": []
            },
            {
              "url": "https:\/\/en.wikipedia.org\/wiki\/Chip_on_shoulder",
              "title": "Chip on shoulder - Wikipedia",
              "description": "",
              "sitelinks": []
            },
            {
              "url": "https:\/\/en.wikipedia.org\/wiki\/Chip_on_shoulder",
              "title": "Chip on shoulder - Wikipedia",
              "description": "To have a chip on one's shoulder refers to the act of holding a grudge or grievance that readily provokes disputation.",
              "sitelinks": []
            },
            {
              "url": "http:\/\/dictionary.cambridge.org\/us\/dictionary\/english\/have-a-chip-on-your-shoulder",
              "title": "have a chip on your shoulder Definition in the Cambridge English ...",
              "description": "have a chip on your shoulder definition, meaning, what is have a chip on your shoulder: to seem angry all the time because you think you have been treated ...",
              "sitelinks": []
            } /*,... additional results*/
          ],
          "results_creative": []
        } /*,... additional keywords*/
      ],
      "page": 1,
      "pages_expected": 1,
      "pages_available": 1
    }
  }
}

 

- keyword_results: an array of objects, one object per keyword:
-- count_organic: the number of organic (normal) results for this keyword
-- count_creative: the number of advertisements for this keyword
-- language: the language code
-- country: the country code
-- keyword: the keyword
-- results_organic: an array of objects, one per organic result:
--- url: the URL
--- title: the title
--- description: the result description
--- sitelinks: an array containing all Google sitelinks, each with url, title and description
-- results_creative: an array with the same structure as results_organic
- page: the requested page of results (each page contains up to 50 keywords, each with up to 1000 results)
- pages_available: the total number of pages currently available
- pages_expected: the total number of pages expected once the job has finished
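
The nested structure can be flattened into one row per organic result, and the page counters tell a client whether another page is ready. The Python sketch below illustrates both. The function names are ours, and two points are assumptions: that the order of results_organic reflects the ranking, and that the outer "results" wrapper seen in the simplified example may or may not be present.

```python
def organic_rows(response):
    """Flatten a decoded get_results response into result rows."""
    # The simplified example wraps everything in "results", while the
    # other responses do not, so accept both shapes.
    payload = response.get("results", response)
    rows = []
    for entry in payload["answer"]["keyword_results"]:
        organic = entry.get("results_organic", [])
        # Assumption: array order corresponds to the ranking position.
        for position, result in enumerate(organic, start=1):
            rows.append((entry["keyword"], position, result["url"], result["title"]))
    return rows


def next_page(answer):
    """Return the next page number that is already available, or None."""
    if answer["page"] < answer["pages_available"]:
        return answer["page"] + 1
    return None


sample = {
    "exceptions": [],
    "answer": {
        "keyword_results": [
            {
                "keyword": "A Chip on Your Shoulder",
                "results_organic": [
                    {"url": "https://en.wikipedia.org/wiki/Chip_on_shoulder",
                     "title": "Chip on shoulder - Wikipedia",
                     "description": "", "sitelinks": []},
                ],
                "results_creative": [],
            }
        ],
        "page": 1,
        "pages_expected": 1,
        "pages_available": 1,
    },
}
rows = organic_rows(sample)
print(rows[0][1], rows[0][2])
print(next_page(sample["answer"]))
```

While a job is still running, pages_available can be lower than pages_expected, so a None result does not necessarily mean the download is complete.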

 

New feature - Google Knowledge
We have added Google Knowledge into our JSON result data! (CSV to be added at a later time)
You may now search for

This beta feature is ready to use, although some field names may still change in the future.

 

Take an English search for "Obama", for example:

                    "result_knowledge": {
                        "image_logo_src": "https:\/\/lh6.googleusercontent.com\/-2lJYGtfXKwQ\/AAAAAAAAAAI\/AAAAAAAB2HQ\/QSmIw0nQN_c\/s0-c-k-no-ns\/photo.jpg",
                        "header_text": "Barack Obama",
                        "header_sub_text": "44th U.S. President",
                        "times_open": [],
                        "description": "Barack Hussein Obama II is an American politician who served as the 44th President of the United States from 2009 to 2017. He was the first African American to serve as president, as well as the first born outside the contiguous United States. Wikipedia ",
                        "items_of_interest": [
                            "Barack Obama was the 44th and Donald Trump is the 45th President of the United States.",
                            "Hillary Clinton",
                            "Barack Obama and Michelle Obama have been married since 1992.",
                            "Ann Dunham was Barack Obama's mother.",
                            "Vladimir Putin"
                        ],
                        "person_born_info": "August 4, 1961 (age 55 years), Kapiolani Medical Center for Women and Children, Honolulu, HI",
                        "person_education_info": "Harvard Law School (1988–1991), More"
                    }

Another example, an English search for "Ferrari":

                    "result_knowledge": {
                        "image_logo_src": "http:\/\/lh5.googleusercontent.com\/-SnTKdWZA8NI\/VWSg3sycQ6I\/AAAAAAAAACk\/37STRepzvJM5uuLfSs793X4Feo_CI1HJgCJkC\/s241-k-no\/",
                        "location_href": "https:\/\/www.google.com\/maps\/place\/MAG+Ferrari+In+Columbus,+Ohio\/@40.099688,-83.125521,15z\/data=!4m2!3m1!1s0x0:0x2ac116cff7ca4e58?sa=X&ved=0ahUKEwifqZeGy-XRAhUM7YMKHevBBQEQ_BII8wUwfA",
                        "location_img_src": "https:\/\/www.google.com\/maps\/vt\/data=RfCSdfNZ0LFPrHSm0ublXdzhdrDFhtmHhN1u-gM,oQRzStoSkLpDIbvDpw9QlO3VnR7-oeqWzIo7ofJbM-zACv_FTzt7ishssYouQq_AXBXkLbKUZbiHQ_7CzU4C1Dxr5Uzv4rs935AuzFkVvuhGcQ-EdBJhM_xRRlJ0QkCEHkn6P6Q",
                        "header_text": "Ferrari S.p.A.MAG Ferrari In Columbus, Ohio",
                        "header_sub_text": "Car manufacturer",
                        "website_href": "http:\/\/www.dublin.ferraridealers.com\/",
                        "address_text": "5035 Post Rd, Dublin, OH 43017",
                        "phone_text": "Phone:(614) 389-5557",
                        "times_open": [
                            {
                                "day": "Saturday",
                                "time": "9AM–6PM"
                            },
                            {
                                "day": "Sunday",
                                "time": "Closed"
                            },
                            {
                                "day": "Monday",
                                "time": "9AM–7PM"
                            },
                            {
                                "day": "Tuesday",
                                "time": "9AM–7PM"
                            },
                            {
                                "day": "Wednesday",
                                "time": "9AM–7PM"
                            },
                            {
                                "day": "Thursday",
                                "time": "9AM–7PM"
                            },
                            {
                                "day": "Friday",
                                "time": "9AM–6PM"
                            }
                        ],
                        "description": "Ferrari S.p.A. is an Italian sports car manufacturer based in Maranello. Founded by Enzo Ferrari in 1939 as Auto Avio Costruzioni, the company built its first car in 1940. Wikipedia ",
                        "items_of_interest": [
                            "Lamborghini",
                            "Maserati",
                            "Tesla Motors",
                            "Bugatti Automobiles",
                            "Porsche"
                        ],
                        "organization_founded_info": "September 13, 1939, Maranello, Italy"
                    }

 

The full list of possible values within result_knowledge:

Note that the knowledge data is returned in the language used for the search, including day names, time formats and other language-specific conventions.
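
Since the set of fields varies per search and the feature is still in beta, client code is safer when it treats every field as optional. The Python sketch below reads a few fields that appear in the two examples. The function names are ours, and placing "result_knowledge" inside each keyword result is an assumption based on the indentation of the examples.

```python
def opening_hours(knowledge):
    """Map day name to opening time; empty if no times_open are given."""
    return {slot["day"]: slot["time"] for slot in knowledge.get("times_open", [])}


def knowledge_summary(keyword_result):
    """Pick a few knowledge fields, tolerating missing ones."""
    # Assumption: result_knowledge sits inside each keyword result.
    knowledge = keyword_result.get("result_knowledge") or {}
    return {
        "title": knowledge.get("header_text"),
        "subtitle": knowledge.get("header_sub_text"),
        "hours": opening_hours(knowledge),
    }


sample = {
    "keyword": "Ferrari",
    "result_knowledge": {
        "header_sub_text": "Car manufacturer",
        "times_open": [
            {"day": "Sunday", "time": "Closed"},
            {"day": "Monday", "time": "9AM–7PM"},
        ],
    },
}
summary = knowledge_summary(sample)
print(summary["subtitle"], summary["hours"]["Sunday"])
```

Day names are used as dictionary keys exactly as delivered, so they follow the search language.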