


Copyscape Premium API

The Copyscape Premium API allows your developers to seamlessly integrate Copyscape Premium into your Content Management System, enabling you to automatically check the originality of new content as it arrives from your writers. If you have created a private index, the API also lets you add content to your private index or check new content against it.

Technically speaking, the API (Application Programming Interface) allows your developers to write scripts on your server that query the Copyscape Premium service and receive results in XML or HTML format.

To use the API, please sign up for Copyscape Premium.


Using the API

To begin using the API, write a simple script in a server-side language such as PHP, Java or ASP that runs on your server and queries the API. The API cannot be accessed by JavaScript/Ajax running within a web page, since web browsers do not allow cross-domain Ajax requests.

The following sections provide the information you need to use the Copyscape Premium API:


Sample Code - Some examples which demonstrate how to use the API.


URL Search Request - Explains how to check for copies of a web page via the API.

Text Search Request - Explains how to check for copies of some text via the API.

XML Search Response - Describes XML responses that you receive from API search requests.

HTML Search Response - Describes HTML responses that you receive from API search requests.


URL Add to Private Index Request - Explains how to add the content from a URL to your private index.

Text Add to Private Index Request - Explains how to add some text to your private index.

Add to Private Index Response - Describes responses from API requests that add to your private index.

Delete from Private Index Request and Response - Explains how to delete content from your private index.


Check Balance Request and Response - Describes how to check your account balance.


Sample Code

Sample code for accessing the Copyscape API is available for download in PHP, Python, Ruby, Java, Perl, ColdFusion and these flavors of ASP.NET: C#, Visual Basic, C# (Razor syntax), VB (Razor syntax).


URL Search Request

To check for copies of a web page via the Copyscape API, send an HTTP GET request to this URL:

http://www.copyscape.com/api/

Parameters are specified on the URL (using ? and &) as follows:

Parameter | Explanation | Value | Required? | Default
u | Your username | [your username] | Yes | -
k | Your API key | [your API key] | Yes | -
o | API operation | csearch (or psearch or cpsearch if you create a private index) | Yes | -
q | Source URL | [urlencoded URL] | Yes | -
c | Full comparisons | 0 to 10 | No | 0
f | Response format | xml or html | No | xml
x | Example test | 1 or omitted | No | -

API operation (o): Use csearch to search against the public Internet or psearch to search against your private index. You can also use cpsearch to search against both the Internet and your private index, for the cost of two search credits.

Source URL (q): As per the HTTP specification, this must be urlencoded. For example, ? should be replaced by %3F and & replaced by %26. Most languages provide a built-in function for urlencoding - see examples in PHP, Java and ASP.

Full comparisons (c): Set to a value between 1 and 10 to request a full text-on-text comparison (with an exact count of matching words) between the query text and the top (one to ten) results found. Note that full comparisons may add a delay of a few seconds.

Response format (f): If omitted or set to xml, the API will respond in XML. If set to html, the API will respond in basic HTML.

Example test (x): If set to 1, the API will search the Internet for copies of this page and you will not be charged.
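
As a minimal sketch of the request described above, the following Python builds the GET URL from the documented parameters. The function name, its argument names and the sample credentials are illustrative assumptions, not part of the API; only the endpoint and the one-letter parameter names come from this page. The standard library's urlencode handles the urlencoding of the source URL.

```python
from urllib.parse import urlencode

API_ENDPOINT = "http://www.copyscape.com/api/"  # endpoint given in this section


def build_url_search(username, api_key, source_url, operation="csearch",
                     comparisons=0, response_format="xml", example=False):
    """Return the full GET URL for a URL search request (hypothetical helper)."""
    params = {
        "u": username,
        "k": api_key,
        "o": operation,        # csearch, psearch or cpsearch
        "q": source_url,       # urlencode() escapes ?, & and other reserved characters
        "c": comparisons,      # 0 to 10 full comparisons
        "f": response_format,  # xml or html
    }
    if example:
        params["x"] = 1        # example test
    return API_ENDPOINT + "?" + urlencode(params)


# "myuser" and "mykey" are placeholder credentials.
url = build_url_search("myuser", "mykey",
                       "http://example.com/page?id=1&lang=en", comparisons=3)
print(url)
```

The resulting URL can be fetched with any HTTP client, for example urllib.request.urlopen(url).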


Text Search Request

To check for copies of some text via the Copyscape API, send an HTTP POST request to this URL:

http://www.copyscape.com/api/

The text to be searched and other parameters can be specified in one of two ways:

  • Form Encoded. Provide all parameter values as form-urlencoded data within the HTTP POST payload. This is how web browsers submit forms over HTTP, and will usually be easiest. If your scripting language lets you set up an HTTP POST request with a list of parameter values, it will probably build this form-urlencoded payload automatically.

  • Raw POST. Provide all parameters except the text to be searched (parameter t) on the URL (using urlencoding and ? and &, as if this were an HTTP GET). Provide the text itself in the raw HTTP POST payload data with no parameter name and no urlencoding. This method may be easier if you are building HTTP requests at a lower level, or using a command-line tool such as curl.

The parameters are as follows:

Parameter | Explanation | Value | Required? | Default
u | Your username | [your username] | Yes | -
k | Your API key | [your API key] | Yes | -
o | API operation | csearch (or psearch or cpsearch if you create a private index) | Yes | -
e | Text encoding | [encoding name] | Yes | -
t | Text to be searched | [the text] | Yes | -
c | Full comparisons | 0 to 10 | No | 0
f | Response format | xml or html | No | xml
x | Example test | 1 or omitted | No | -

API operation (o): Use csearch to search against the public Internet or psearch to search against your private index. You can also use cpsearch to search against both the Internet and your private index, for the cost of two search credits.

Text encoding (e): Use an IANA name, such as UTF-8 (Unicode), ISO-8859-1 (Latin-1) or WINDOWS-1251 (Cyrillic).

Text to be searched (t): If you are using the Raw POST method, as described above, the raw text should be supplied in the POST payload without a parameter name and without any urlencoding.

Full comparisons (c): Set to a value between 1 and 10 to request a full text-on-text comparison (with an exact count of matching words) between the query text and the top (one to ten) results found. Note that full comparisons may add a delay of a few seconds.

Response format (f): If omitted or set to xml, the API will respond in XML. If set to html, the API will respond in basic HTML.

Example test (x): If set to 1, the API will search the Internet for copies of the text on this page and you will not be charged.
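
The Form Encoded method could look roughly like the sketch below, which prepares (but does not send) a POST request using Python's standard library. The helper name and sample values are assumptions made for illustration; the parameter names are the ones listed in the table above.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_ENDPOINT = "http://www.copyscape.com/api/"  # endpoint given in this section


def build_text_search(username, api_key, text, encoding="UTF-8",
                      operation="csearch", comparisons=0):
    """Build a form-encoded POST request for a text search (hypothetical helper)."""
    fields = {
        "u": username,
        "k": api_key,
        "o": operation,
        "e": encoding,     # IANA encoding name declared to the API
        "t": text,         # the text to be searched
        "c": comparisons,
    }
    # Percent-encode every value, using the declared encoding for the text.
    body = urlencode(fields, encoding=encoding).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(API_ENDPOINT, data=body, headers=headers, method="POST")


# Placeholder credentials; nothing is sent until the request is opened.
request = build_text_search("myuser", "mykey", "When in the Course of human events")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

To send the request, pass it to urllib.request.urlopen(request) and read the response body.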


XML Search Response

For searches with XML responses, the API returns UTF-8 encoded XML enclosed by a <response> element, with the following subelements:

Element name | Explanation | Present? | Example
<query> | URL searched | If a URL search | http://mydomain.com/page.html
<error> | Reason for API request failure | If request failed | No search credits remaining
<querywords> | Number of words checked | If succeeded | 583
<count> | Number of results found | If succeeded | 6
<allwordsmatched> | Number of source words matched | If succeeded and c>=3 and o is not cpsearch | 387
<allpercentmatched> | Percentage of source words matched | If succeeded and c>=3 and o is not cpsearch | 56
<alltextmatched> | Full extract of source text matched | If succeeded and c>=3 and o is not cpsearch | When in the Course of human events...
<allviewurl> | URL for viewing found results | If succeeded and o is csearch | http://view.copyscape.com/search/a1b2c3d4e5

The <query> value may differ from the original URL you supplied if there was a frameset or redirection.

The <allwordsmatched>, <allpercentmatched> and <alltextmatched> values are based on full comparisons performed between the source text and the top (up to 10) results found. They summarize the portion of the source text that was matched in any of these full comparisons. They are present if the c parameter is 3 or more, and the search was not performed against both the Internet and your private index simultaneously.

The <allviewurl> value can be used to display the list of results in an iframe or window. If used, the contents of this page must be displayed in full, without modification.


If the search request succeeded, the <response> element also contains zero or more <result> subelements, each describing one result that was found, with the following subelements:

Element name | Explanation | Present? | Example
<index> | Position in results | Yes | 1
<url> | URL of found page or source URL of page in private index | Yes | http://www.law.indiana.edu/uslawdocs/declaration.html
<handle> | Handle of found article | If private index | SIA_1_4487334_3978624
<id> | ID of found article | If private index | MY_ARTICLE_123
<articlewords> | Number of words in found article | If private index | 639
<added> | When the found article was added (GMT) | If private index | 2014-11-22 20:50:21
<title> | Title of the found web page or article in your private index | Yes | Declaration of Independence
<textsnippet> | Text snippet showing some of the matching text | Yes | ... separate and equal station to which ...
<htmlsnippet> | HTML version of snippet for display in web pages | Yes | <font color="#777777">...</font> <font color="#000000">separate and equal station</font>
<minwordsmatched> | Minimum number of words matching | Yes | 96
<viewurl> | URL for viewing found page | If Internet result | http://view.copyscape.com/compare/a1b2c3d4e5/1

The <minwordsmatched> value is an approximate and relative measure of the amount of matching content found for each result. For an exact count of matching words in the top results, use the c API parameter to request full comparisons.

The <viewurl> value can be used to display the found page, with the matching content highlighted, in an iframe or window. If used, the contents of this page must be displayed in full, without modification.


If a full text-on-text comparison was performed for a result, its <result> subelement may also contain:

Element name | Explanation | Present? | Example
<urlwords> | Number of words in found page | If Internet page retrieved OK | 950
<wordsmatched> | Exact number of words matching | If page retrieved OK | 133
<percentmatched> | Percentage of submitted content matched on page | If page retrieved OK | 13
<textmatched> | Matching text in full | If page retrieved OK | When in the Course of human events...
<urlerror> | Error retrieving URL | If Internet page not retrieved | The document could not be retrieved - error code 404


Please note that additional XML elements may be added in future, so your XML parser must safely ignore any elements or subelements which are not recognized.
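
One way to read such a response is sketched below with Python's xml.etree.ElementTree. Looking elements up by name means unrecognized elements are skipped, which is the tolerant behavior requested above. The sample XML is hand-written from the tables in this section (including a made-up <futureelement>), not captured from the service, and the function name and the choice of fields to extract are assumptions.

```python
import xml.etree.ElementTree as ET

# Hand-written sample assembled from the element tables; not real service output.
SAMPLE_RESPONSE = """<response>
  <query>http://mydomain.com/page.html</query>
  <querywords>583</querywords>
  <count>1</count>
  <futureelement>not recognized, so ignored</futureelement>
  <result>
    <index>1</index>
    <url>http://www.law.indiana.edu/uslawdocs/declaration.html</url>
    <title>Declaration of Independence</title>
    <textsnippet>... separate and equal station to which ...</textsnippet>
    <minwordsmatched>96</minwordsmatched>
  </result>
</response>"""


def parse_search_response(xml_text):
    """Summarize a search response; raise RuntimeError if <error> is present."""
    root = ET.fromstring(xml_text)
    error = root.findtext("error")
    if error is not None:
        raise RuntimeError(error)
    results = []
    for result in root.findall("result"):
        # <wordsmatched> exists only after a successful full comparison.
        exact = result.findtext("wordsmatched")
        results.append({
            "index": int(result.findtext("index")),
            "url": result.findtext("url"),
            "title": result.findtext("title"),
            "minwordsmatched": int(result.findtext("minwordsmatched")),
            "wordsmatched": int(exact) if exact is not None else None,
        })
    return {
        "querywords": int(root.findtext("querywords")),
        "count": int(root.findtext("count")),
        "results": results,
    }


parsed = parse_search_response(SAMPLE_RESPONSE)
print(parsed["count"], parsed["results"][0]["title"])
```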


HTML Search Response

For HTML responses, the API returns UTF-8 encoded content with minimal HTML formatting.

If the search request succeeded, the title of the HTML page contains the URL queried (if appropriate) and the number of results found. The body of the page includes a series of paragraphs, one for each result, e.g.:

Declaration of Independence : Indiana Law
... for opposing with manly firmness his invasions on the rights of the people. ... For transporting us beyond Seas to be tried for pretended offences: ... He has plundered our seas, ravaged our Coasts, burnt our towns, and destroyed the ... He has excited domestic insurrections amongst us, and has endeavoured to bring on ... the merciless Indian Savages, whose known rule of warfare, ... by their legislature to extend an unwarrantable jurisdiction over us. ... which, would inevitably interrupt our connections and correspondence. ... by the Authority of the good People of these Colonies, solemnly publish and declare, ...
http://www.law.indiana.edu/uslawdocs/declaration.html

If the API request failed, the HTML response will contain some red text describing the error.

The HTML format may change in the future, so you should not rely on its structure. The HTML also contains less information than the XML format, and excludes full text-on-text comparisons. To show more information or ensure consistent formatting, please use the XML response format and build your own HTML.


URL Add to Private Index Request

This API operation requires a private index to be created for your account.

To add the content from a URL to your private index, send an HTTP GET request to this URL:

http://www.copyscape.com/api/

Parameters are specified on the URL (using ? and &) as follows:

Parameter | Explanation | Value | Required? | Default
u | Your username | [your username] | Yes | -
k | Your API key | [your API key] | Yes | -
o | API operation | pindexadd | Yes | -
q | Source URL | [urlencoded URL] | Yes | -
i | Article ID | [ID for private index] | No | [none]
f | Response format | xml or html | No | xml

Source URL (q) and Article ID (i): These parameters must be urlencoded. For example, ? should be replaced by %3F, & by %26 and space by + or %20. Most languages provide a built-in function for urlencoding - see examples in PHP, Java and ASP.

The title of the article in your private index is taken from the web page at the URL provided. The request returns a response confirming if the operation was successful.


Text Add to Private Index Request

This API operation requires a private index to be created for your account.

To add some text to your private index, send an HTTP POST request to this URL:

http://www.copyscape.com/api/

The text to be added and other parameters can be specified in one of two ways:

  • Form Encoded. Provide all parameter values as form-urlencoded data within the HTTP POST payload. This is how web browsers submit forms over HTTP, and will usually be easiest. If your scripting language lets you set up an HTTP POST request with a list of parameter values, it will probably build this form-urlencoded payload automatically.

  • Raw POST. Provide all parameters except the text to be added (parameter t) on the URL (using urlencoding and ? and &, as if this were an HTTP GET). Provide the text itself in the raw HTTP POST payload data with no parameter name and no urlencoding. This method may be easier if you are building HTTP requests at a lower level, or using a command-line tool such as curl.

The parameters are as follows:

Parameter | Explanation | Value | Required? | Default
u | Your username | [your username] | Yes | -
k | Your API key | [your API key] | Yes | -
o | API operation | pindexadd | Yes | -
e | Text encoding | [encoding name] | Yes | -
t | Text to be added | [the text] | Yes | -
a | Article title | [title for private index] | No | [none]
i | Article ID | [ID for private index] | No | [none]
f | Response format | xml or html | No | xml

Text encoding (e): Use an IANA name, such as UTF-8 (Unicode), ISO-8859-1 (Latin-1) or WINDOWS-1251 (Cyrillic).

Text to be added (t): If you are using the Raw POST method, as described above, the raw text should be supplied in the POST payload without a parameter name and without any urlencoding.

The request returns a response confirming if the operation was successful.
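
For the Raw POST method, a sketch along these lines prepares the two pieces separately: the URL carrying every parameter except t, and the raw payload carrying the text itself. The helper name and sample values are hypothetical. It also assumes the IANA encoding name is one Python recognizes as a codec, which holds for the names mentioned above.

```python
from urllib.parse import urlencode

API_ENDPOINT = "http://www.copyscape.com/api/"  # endpoint given in this section


def build_raw_text_add(username, api_key, text, encoding="UTF-8",
                       title=None, article_id=None):
    """Return (url, body) for a Raw POST add request (hypothetical helper)."""
    params = {"u": username, "k": api_key, "o": "pindexadd", "e": encoding}
    if title is not None:
        params["a"] = title        # optional article title
    if article_id is not None:
        params["i"] = article_id   # optional article ID
    url = API_ENDPOINT + "?" + urlencode(params)
    # The payload is the text alone: no parameter name and no urlencoding,
    # only conversion to bytes in the declared encoding.
    body = text.encode(encoding)
    return url, body


# Placeholder credentials and content.
url, body = build_raw_text_add("myuser", "mykey", "Café society",
                               title="My Title", article_id="MY_ARTICLE_123")
print(url)
print(body)
```

The pair can then be handed to whichever HTTP client you use, sending body as the POST payload to url.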


Add to Private Index Response

For XML responses, the API returns UTF-8 encoded XML enclosed by a <response> element.

If the request to add to your private index succeeded, the <response> element contains the subelements <url> (for URL requests only), <words>, <handle>, <id> and <title>.

Please note that additional XML elements may be added in future, so your XML parser must safely ignore any elements or subelements which are not recognized.

For URL requests, the subelement <url> specifies the URL whose content was added. This may differ from the original URL you supplied if there was a frameset or redirection.

The subelement <words> specifies the number of words that were added.

The subelement <handle> provides a reference for the article created in the private index, which you may use to delete the article in the future. It consists of up to 32 ASCII characters.

The subelement <id> contains the article ID that you provided in the request (if any).

The subelement <title> contains the title of the article added to the private index. For URL requests, this is obtained from the title of the web page. For text requests, it contains the title that you provided (if any).

If the request to add to your private index failed, the <response> element contains a subelement <error> explaining the problem and, if appropriate, a subelement <query>.

For HTML responses, the API will return a message confirming whether the content was added successfully.
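
As an illustration, an XML add response might be read as in the sketch below, keeping the <handle> so the article can be deleted later. The sample XML is hand-written from the description above rather than taken from the service, and the function name is an assumption.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for a text add request (so no <url> element).
SAMPLE_ADD_RESPONSE = (
    "<response>"
    "<words>639</words>"
    "<handle>SIA_1_4487334_3978624</handle>"
    "<id>MY_ARTICLE_123</id>"
    "<title>My Article</title>"
    "</response>"
)


def parse_index_add_response(xml_text):
    """Return the documented subelements; raise RuntimeError on <error>."""
    root = ET.fromstring(xml_text)
    error = root.findtext("error")
    if error is not None:
        raise RuntimeError(error)
    return {
        "url": root.findtext("url"),      # None for text requests
        "words": int(root.findtext("words")),
        "handle": root.findtext("handle"),
        "id": root.findtext("id"),        # None if no ID was supplied
        "title": root.findtext("title"),
    }


added = parse_index_add_response(SAMPLE_ADD_RESPONSE)
print(added["handle"], added["words"])
```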


Delete from Private Index Request and Response

This API operation requires a private index to be created for your account.

To delete an item of content from your private index, send an HTTP GET request to this URL:

http://www.copyscape.com/api/

Parameters are specified on the URL (using ? and &) as follows:

Parameter | Explanation | Value | Required? | Default
u | Your username | [your username] | Yes | -
k | Your API key | [your API key] | Yes | -
o | API operation | pindexdel | Yes | -
h | Handle | Handle of article | Yes | -
f | Response format | xml or html | No | xml

For XML responses, a <response> element is returned. If the request succeeded, it contains:

Element name | Explanation | Example
<handle> | Handle of deleted article | SIA_1_4487334_3978624
<id> | ID of deleted article | MY_ARTICLE_123

If the delete request failed, the <response> element contains a subelement <error> explaining the problem.

For HTML responses, a textual description of the result of the request is returned as basic HTML.

There is no charge for deleting articles from your private index.
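
A delete request could be assembled as in this sketch. Because handles are described as up to 32 ASCII characters, the helper rejects anything else before building the URL. That local check is a convenience assumed here, not something the API requires, and the function name is hypothetical.

```python
from urllib.parse import urlencode

API_ENDPOINT = "http://www.copyscape.com/api/"  # endpoint given in this section


def build_index_delete_url(username, api_key, handle):
    """Return the GET URL that deletes one article (hypothetical helper)."""
    # Handles are documented as up to 32 ASCII characters.
    if not handle or len(handle) > 32 or not handle.isascii():
        raise ValueError("handle must be 1 to 32 ASCII characters")
    params = {"u": username, "k": api_key, "o": "pindexdel", "h": handle}
    return API_ENDPOINT + "?" + urlencode(params)


# Placeholder credentials; the handle is the example value from the table.
print(build_index_delete_url("myuser", "mykey", "SIA_1_4487334_3978624"))
```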


Check Balance Request and Response

To check how much credit you have remaining, send an HTTP GET request to this URL:

http://www.copyscape.com/api/

Parameters are specified on the URL (using ? and &) as follows:

Parameter name | Explanation | Value | Required?
u | Your username | [your username] | Yes
k | Your API key | [your API key] | Yes
o | Name of operation | balance | Yes
f | Response format | xml or html | No - xml by default

For XML responses, a <remaining> element is returned containing three subelements:

Element name | Explanation | Example
<value> | Monetary value of your remaining credit in dollars | 999.50
<total> | Total number of search credits remaining | 19990
<today> | Number of Internet searches remaining today | 9990

For HTML responses, a textual description of your balance is returned as basic HTML.
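
The balance response might be parsed as in the sketch below. This section does not say whether <remaining> is the document root or sits inside another element, so the helper accepts either. The fallback to an <error> element is likewise an assumption carried over from the other operations. The sample XML is hand-written from the table above.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hand-written sample built from the example values in the table.
SAMPLE_BALANCE = (
    "<remaining>"
    "<value>999.50</value><total>19990</total><today>9990</today>"
    "</remaining>"
)


def parse_balance(xml_text):
    """Return value, total and today from a balance response (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Assumption: <remaining> is either the root or a direct child of it.
    remaining = root if root.tag == "remaining" else root.find("remaining")
    if remaining is None:
        # Assumption: failures are reported in <error>, as for other operations.
        raise RuntimeError(root.findtext("error") or "no <remaining> element found")
    return {
        "value": Decimal(remaining.findtext("value")),  # dollars, kept exact
        "total": int(remaining.findtext("total")),
        "today": int(remaining.findtext("today")),
    }


balance = parse_balance(SAMPLE_BALANCE)
print(balance["value"], balance["total"], balance["today"])
```

Decimal is used for the monetary value so that an amount such as 999.50 is not subject to binary floating-point rounding.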


Getting Assistance

If you have any questions or problems regarding the API, please contact us.

 

Copyscape © 2014 Indigo Stream Technologies, providers of Giga Alert and Siteliner. All rights reserved.