API Specification for Antiplagiarism Server
Server request:
Address: http://{xxx.xxx.xxx.xxx:yyy}/etxt_antiplagiat
POST parameters for the various actions:
1) Getting the current server status:
try=1
2) Sending a package for originality checking:
xmlUrl={address of the XML package containing the texts to check}
xmlAnswerUrl={address of the script that accepts the originality checking results; the script must return "ok" at the end of its response, in Latin letters and without quotation marks}
Notes:
- the package has been placed in the sequential queue for originality checking if the server response code is Code=1
- if the script accepting the originality checking results does not return "ok" at the end, the antiplagiarism server will keep retrying to send the results
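The two request types above could be assembled roughly as follows. This is a minimal sketch, not part of the original client: buildStatusRequest() and buildCheckRequest() are hypothetical helper names, and it assumes the server decodes standard form-encoded POST bodies (the PHP class later in this document concatenates the raw addresses instead).

```php
<?php
// Hypothetical helpers that build the POST body for each action.

function buildStatusRequest(): string
{
    // Action 1: ask the server for its current status.
    return http_build_query(array('try' => 1));
}

function buildCheckRequest(string $xmlUrl, string $xmlAnswerUrl): string
{
    // Action 2: queue a package; http_build_query() URL-encodes both addresses.
    return http_build_query(array(
        'xmlUrl'       => $xmlUrl,
        'xmlAnswerUrl' => $xmlAnswerUrl,
    ));
}

echo buildStatusRequest(), "\n"; // prints: try=1
echo buildCheckRequest(
    'http://www.site.ru/antiplagiat/xml/package.xml',
    'http://www.site.ru/antiplagiat/upload.php'
), "\n";
```

The resulting string can be passed to CURLOPT_POSTFIELDS unchanged.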
Server response (JSON):
Code - the server response code:
0 - Unidentified error
1 - Package processing task accepted
3 - Incorrect or missing address of the XML file containing the package of documents to process
4 - Incorrect address of server request (should look like http://{xxx.xxx.xxx.xxx:yyy}/etxt_antiplagiat)
5 - Incorrect or missing address of the script accepting the originality checking results
6 - Server status request processed successfully
7 - No Internet access
10 - Request denied because the client's access to the server could not be verified. Try again later.
11 - Request denied because the server is inaccessible to the client (most likely due to a zero account balance)
12 - Request denied because the maximum number of packages in the queue (50 by default) has been exceeded
13 - Request denied because the time limit for accessing the texts has been exceeded
Description - a textual description of the server response
NumPacketsInQueue - the current number of packages in the queue
AvgDocumentTime - the average document processing time, in minutes. This number is below zero if no data is available at the moment
CurrentPacketTime - the time spent so far on processing the current package, in minutes (integer)
Response example:
{"Code":3,"Description":"There is no address of package to proceed in request","NumPacketsInQueue":2,"AvgDocumentTime":-1.000000E+000, "CurrentPacketTime":0}
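A response like the one above could be interpreted along these lines. This is only a sketch: parseServerResponse() and the keys of the array it returns are hypothetical, and the stripping of "\," sequences simply mirrors what the PHP class later in this document does before decoding.

```php
<?php
// Hypothetical helper: decodes the JSON response into a small summary array.
function parseServerResponse(string $json): ?array
{
    // The client class in this document removes "\," sequences before decoding.
    $data = json_decode(str_replace('\,', ',', $json), true);
    if (!is_array($data) || !isset($data['Code'])) {
        return null; // not a recognizable server response
    }
    $code = (int) $data['Code'];
    $avg  = isset($data['AvgDocumentTime']) ? (float) $data['AvgDocumentTime'] : -1.0;

    return array(
        'code'        => $code,
        // Code 1: package queued; Code 6: status request processed.
        'ok'          => $code === 1 || $code === 6,
        'description' => isset($data['Description']) ? (string) $data['Description'] : '',
        'queue'       => isset($data['NumPacketsInQueue']) ? (int) $data['NumPacketsInQueue'] : 0,
        // A negative AvgDocumentTime means no data is available yet.
        'avgMinutes'  => $avg >= 0 ? $avg : null,
    );
}

$sample = '{"Code":3,"Description":"There is no address of package to proceed in request",'
        . '"NumPacketsInQueue":2,"AvgDocumentTime":-1.000000E+000, "CurrentPacketTime":0}';
$parsed = parseServerResponse($sample);
echo $parsed['ok'] ? 'accepted' : 'rejected: ' . $parsed['description'], "\n";
```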
Getting results:
The script whose address was passed in the xmlAnswerUrl POST parameter of the server request receives the result. The following POST variables will be available:
XmlFileName - the name of the file containing the XML package of documents
NumDocsInPacket - the number of documents in the package
PacketTime - the package processing time, in minutes
DocumentTime - the average processing time of a document in the package, in minutes
ServerType - the serverType value passed by the client in the request XML package
TotalWords - the total number of words in the package of documents after canonicalization
Xml - the result of checking the XML package, encrypted and then Base64-encoded
Error - the error message, Base64-encoded, if the XML package was not checked (in this case all string POST variables except XmlFileName are empty and all numeric POST variables are zero)
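The accepting script could start by separating the error case from the success case, as in the sketch below. summarizeResult() and the keys of its return value are hypothetical, the sample input is made up for illustration, and decryption of Xml is deliberately left out.

```php
<?php
// Hypothetical helper; $post stands in for $_POST in the accepting script.
function summarizeResult(array $post): array
{
    $file = isset($post['XmlFileName']) ? (string) $post['XmlFileName'] : '';
    if (isset($post['Error']) && $post['Error'] !== '') {
        // Error is Base64-encoded; the other variables are then empty or zero.
        return array(
            'file'  => $file,
            'ok'    => false,
            'error' => (string) base64_decode($post['Error']),
        );
    }
    return array(
        'file'    => $file,
        'ok'      => true,
        'docs'    => isset($post['NumDocsInPacket']) ? (int) $post['NumDocsInPacket'] : 0,
        'words'   => isset($post['TotalWords']) ? (int) $post['TotalWords'] : 0,
        'minutes' => isset($post['PacketTime']) ? (float) $post['PacketTime'] : 0.0,
        // Still encrypted and Base64-encoded at this point.
        'xml'     => isset($post['Xml']) ? (string) $post['Xml'] : '',
    );
}

// Made-up sample input, for illustration only.
$failed = summarizeResult(array(
    'XmlFileName' => 'package.xml',
    'Error'       => base64_encode('Package could not be downloaded'),
));

// The accepting script must finish its response with "ok",
// otherwise the server keeps resending the results.
echo 'ok';
```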
Xml-package (Request)
Notes:
- the symbols [] denote the optional presence of nodes/attributes
<?xml version="1.0" encoding="UTF-8"?>
<root>
<serverType></serverType>
[<exceptions>
<url>{Url ignored while checking originality of every document in a package, coded as Base64}</url>
...
<url>...</url>
<domain>{domain ignored while checking originality of every document in a package, coded as Base64}</domain>
...
<domain>...</domain>
</exceptions>]
<entry>
<id>{ Number }</id>
<type>{ String }</type>
<uservars>{Any user data in XML}</uservars>
<name>{Text name coded as Base64}</name>
[<text>{Text for checking coded as Base64}</text>]
[<exceptions>
<url>{Url ignored while checking originality of every document in a package, coded as Base64}</url>
...
<url>...</url>
<domain>{domain ignored while checking originality of every document in a package, coded as Base64}</domain>
...
<domain>...</domain>
</exceptions>]
</entry>
<entry>
...
</entry>
</root>
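A request package of this shape could be generated with PHP's XMLWriter, as sketched below. buildPackageXml() is a hypothetical helper, the optional exceptions nodes are omitted, uservars is written only when supplied (as the PHP class later in this document does), and the input strings are assumed to be UTF-8 already.

```php
<?php
// Hypothetical helper: builds the request XML for a list of items.
// Each item is an array with id, type, name and optionally text and uservars.
function buildPackageXml(string $serverType, array $items): string
{
    $w = new XMLWriter();
    $w->openMemory();
    $w->startDocument('1.0', 'UTF-8');
    $w->startElement('root');
    $w->writeElement('serverType', $serverType);

    foreach ($items as $item) {
        $w->startElement('entry');
        $w->writeElement('id', (string) $item['id']);
        $w->writeElement('type', (string) $item['type']);
        if (!empty($item['uservars']) && is_array($item['uservars'])) {
            // Assumption: keys are valid XML element names.
            $w->startElement('uservars');
            foreach ($item['uservars'] as $key => $value) {
                $w->writeElement((string) $key, (string) $value);
            }
            $w->endElement(); // uservars
        }
        // Name and text are Base64-encoded, as the specification requires.
        $w->writeElement('name', base64_encode($item['name']));
        if (isset($item['text']) && $item['text'] !== '') {
            $w->writeElement('text', base64_encode($item['text']));
        }
        $w->endElement(); // entry
    }

    $w->endElement(); // root
    $w->endDocument();
    return $w->outputMemory();
}

$packageXml = buildPackageXml('server', array(
    array('id' => 1, 'type' => 'text', 'name' => 'Sample title', 'text' => 'Sample text'),
));
echo $packageXml;
```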
Xml-package (Response)
Notes:
- the symbols [...] denote the optional presence of nodes/attributes
- if the originality of a text is 100%, the node ftext is absent
- the url nodes list (at most 5) the URLs of the pages with the highest percentage of similarity
<?xml version="1.0" encoding="UTF-8"?>
<root>
<numPacketsInQueue>{current number of packets in a queue}</numPacketsInQueue>
[<exceptions>
<url>{Url ignored while checking originality of every document in a package, coded as Base64}</url>
...
<url>...</url>
<domain>{domain ignored while checking originality of every document in a package, coded as Base64}</domain>
...
<domain>...</domain>
</exceptions>]
<entry>
<id>{ Number }</id>
<type>{ String }</type>
<uservars>{Any user data in XML}</uservars>
<name>{Name of text coded as Base64}</name>
[<ftext uniq={Originality(integer)}>
{Checked text with highlighted matching parts, coded as Base64}
</ftext>]
[<urls>
<url unique={originality of the checked text relative to the content of the given URL (integer from 0 to 100)}
color={colour of highlighted matching shingles in ftext, shown as #xxxxxx}
[title={content of title tag, coded as Base64}]>
{Url containing matches with the checked text, coded as Base64}
</url>
<url ...> ...</url>
...
</urls>]
<statistics>
<download_ratio>{Ratio of total amount of downloaded pages to all pages (per cent)}</download_ratio>
<total_pages>{Total amount of pages to download}</total_pages>
</statistics>
[<exceptions>
<url>{Url ignored while checking originality of every document in a package, coded as Base64}</url>
...
<url>...</url>
<domain>{domain ignored while checking originality of every document in a package, coded as Base64}</domain>
...
<domain>...</domain>
</exceptions>]
</entry>
<entry>
...
</entry>
...
</root>
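Once decrypted, a response package could be read with SimpleXML roughly as follows. readCheckedEntries() is a hypothetical helper, the sample XML is made up for illustration, and the sketch assumes that attribute values arrive quoted as in any well-formed XML document.

```php
<?php
// Hypothetical helper: extracts id, name, originality and matching URLs per entry.
function readCheckedEntries(string $xmlString): array
{
    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        return array();
    }
    $entries = array();
    foreach ($xml->entry as $entry) {
        // ftext is absent when the text is 100% original.
        $originality = isset($entry->ftext) ? (int) $entry->ftext['uniq'] : 100;
        $urls = array();
        if (isset($entry->urls)) {
            foreach ($entry->urls->url as $url) {
                $urls[] = array(
                    'url'    => (string) base64_decode(trim((string) $url)),
                    'unique' => (int) $url['unique'],
                    'color'  => (string) $url['color'],
                );
            }
        }
        $entries[] = array(
            'id'          => (string) $entry->id,
            'name'        => (string) base64_decode((string) $entry->name),
            'originality' => $originality,
            'urls'        => $urls,
        );
    }
    return $entries;
}

// Made-up sample response, for illustration only.
$sampleResponse = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<root><numPacketsInQueue>0</numPacketsInQueue><entry>'
    . '<id>7</id><type>text</type><name>' . base64_encode('Sample title') . '</name>'
    . '<ftext uniq="85">' . base64_encode('checked text') . '</ftext>'
    . '<urls><url unique="85" color="#ff0000">' . base64_encode('http://example.com/page') . '</url></urls>'
    . '</entry></root>';

$checked = readCheckedEntries($sampleResponse);
echo $checked[0]['id'], ': ', $checked[0]['originality'], "%\n";
```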
PHP class for Antiplagiarism server communication
<?php
define('CHECK_KEY', 'YOUR KEY');
class EtxtAntiPlagiat
{
// address to Antiplagiarism server
private $serverUrl = array (
'server' => 'xxx.xxx.xxx.xxx:yyy/etxt_antiplagiat',
);
// type of server by default
public $serverType = 'server';
// address to the web-part of checking
private $localServer = 'http://www.site.ru/';
// address for getting results
public $localUrl = 'http://www.site.ru/antiplagiat/upload.php';
// array of objects for checking
private $ItemsToCheck = array();
// folder for xml-tasks
private $xmlPath = '/home/user/site.ru/htdocs/antiplagiat/xml/';
// types of objects for checking
private $typesObjects = array ('text');
// cipher usage key
public $useCrypt = 1;
// flag of antiplagiarism server connection
public $isConnect = false;
// error message
public $Error = '';
// denoting the package priority
public $sort = 0;
// constructor; $path is the XML file name, $my_crypt enables encryption, $serverType selects the server
public function __construct($path = '', $my_crypt = 1, $serverType = 'server')
{
if ($path) $this->xmlPath .= $path;
$this->useCrypt = $my_crypt;
if ($this->useCrypt == 1) $this->useCrypt = CHECK_KEY;
$this->serverType = $serverType;
// ping server
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $this->serverUrl[$this->serverType]);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "try=1");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
$json = curl_exec($ch);
curl_close($ch);
// storing the ping result
$this->isConnect = $json ? $json : false;
$this->isConnect = str_replace('\,', ',', $this->isConnect);
if (($tmp = json_decode($this->isConnect)) && $tmp->Code == 7) {
$this->isConnect = false;
$this->Error = $tmp->Description;
}
}
// function of adding an object for checking
public function addItemToCheck($data)
{
if (!$data['id'] || !in_array($data['type'], $this->typesObjects)) return false;
$this->ItemsToCheck[] = array ('id' => $data['id'], 'text' => (isset($data['text']) ? $this->codeText($data['text']) : ''), 'type' => $data['type'], 'name' => $this->codeText($data['name']), 'uservars' => isset($data['uservars']) ? $data['uservars'] : array());
return true;
}
// function of text converting
private function codeText($text)
{
return base64_encode(@iconv('WINDOWS-1251', 'UTF-8//IGNORE', $text));
}
// function of processing request for checking
public function execRequest($create = 1)
{
// trying to create XML-file containing tasks
if ($create && !$this->createXml()) return false;
// Sending request to Antiplagiarism server
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $this->serverUrl[$this->serverType]);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "xmlUrl=".$this->localServer.'antiplagiat/xml/'.basename($this->xmlPath)."&xmlAnswerUrl=".$this->localUrl.($this->sort ? '&sort='.$this->sort : ''));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
$json = curl_exec($ch);
curl_close($ch);
return str_replace('\,', ',', $json);
}
// Function of creating XML-file containing tasks
private function createXml()
{
$string = '<?xml version="1.0" encoding="UTF-8" ?'.'><root>';
$string .= '<serverType>'.$this->serverType.'</serverType>';
foreach ($this->ItemsToCheck as $item) {
$string .= '
<entry>
<id>'.$item['id'].'</id>
<type>'.$item['type'].'</type>';
if (isset($item['uservars']) && $item['uservars'] && is_array($item['uservars'])) {
$string .= '
<uservars>';
foreach ($item['uservars'] as $key => $uservar)
$string .= '
<'.$key.'>'.$uservar.'</'.$key.'>';
$string .= '
</uservars>';
}
$string .='
<name>'.$item['name'].'</name>'.
(isset($item['text']) && $item['text'] ? '<text>'.$item['text'].'</text>' : '').'
</entry>';
}
$string .= '</root>';
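// removing a stale file with the same name, if any
if (is_file($this->xmlPath)) unlink($this->xmlPath);
// encrypting data (note: the mcrypt extension was removed from PHP core in 7.2 and is available only via PECL)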
if ($this->useCrypt) {
$td = mcrypt_module_open (MCRYPT_RIJNDAEL_128, '', MCRYPT_MODE_ECB, '');
mcrypt_generic_init ($td, $this->useCrypt, $this->useCrypt);
$string = mcrypt_generic ($td, $string);
mcrypt_generic_deinit ($td);
mcrypt_module_close ($td);
}
// saving file
$f = @fopen($this->xmlPath, 'w') or die("Failed to open file # ".$this->xmlPath." # for writing");
fwrite($f, $string, strlen($string));
fclose($f);
if (is_file($this->xmlPath)) return true;
return false;
}
}
?>
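Because mcrypt is no longer bundled with current PHP versions, the encryption step could be approximated with the OpenSSL extension. The sketch below is an assumption, not the vendor's method: encryptPackage() and decryptPackage() are hypothetical names, the key is assumed to be exactly 16 bytes (so that MCRYPT_RIJNDAEL_128 corresponds to AES-128), and compatibility with the server has not been verified.

```php
<?php
// Hypothetical OpenSSL-based stand-ins for the mcrypt calls (AES-128 in ECB mode).
// Assumption: $key is exactly 16 bytes long.

function encryptPackage(string $plain, string $key): string
{
    // mcrypt pads the last block with zero bytes; reproduce that by hand,
    // because OPENSSL_ZERO_PADDING only disables OpenSSL's own padding.
    $pad = (16 - strlen($plain) % 16) % 16;
    $plain .= str_repeat("\0", $pad);
    $cipher = openssl_encrypt($plain, 'aes-128-ecb', $key, OPENSSL_RAW_DATA | OPENSSL_ZERO_PADDING);
    if ($cipher === false) {
        throw new RuntimeException('Encryption failed');
    }
    return $cipher;
}

function decryptPackage(string $cipher, string $key): string
{
    $plain = openssl_decrypt($cipher, 'aes-128-ecb', $key, OPENSSL_RAW_DATA | OPENSSL_ZERO_PADDING);
    if ($plain === false) {
        throw new RuntimeException('Decryption failed');
    }
    // Strip the zero-byte padding added during encryption.
    return rtrim($plain, "\0");
}

$demoKey    = '0123456789abcdef'; // placeholder 16-byte key, not a real one
$ciphertext = encryptPackage('<root></root>', $demoKey);
echo decryptPackage($ciphertext, $demoKey), "\n"; // prints: <root></root>
```

For the result callback, the same decryptPackage() call would be applied to base64_decode($_POST['Xml']).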
Making a task for checking
// creating an object to form the request for checking
$etxtPlagiat = new EtxtAntiPlagiat(md5(time()).'.xml', 1, 'server');
// checking server availability
if (!$etxtPlagiat->isConnect || !($tmp = json_decode($etxtPlagiat->isConnect))) {
logging('*** No connection to Antiplagiarism server');
exit();
}
// choosing objects for checking
$items = getItemsToCheck(); // hypothetical helper returning the rows of a query such as "SELECT * FROM table"
// adding all the objects for checking
foreach ($items as $item) {
// adding an object for checking
$etxtPlagiat->addItemToCheck(array('id' => $item['id'], 'text' => $item['text'], 'type' => $item['type'], 'name' => $item['title'], 'uservars' => $item['usersvars']));
}
// sending request for checking to the server
if ($data = $etxtPlagiat->execRequest()) {
if ($data = json_decode($data)) {
logging(iconv('UTF-8', 'WINDOWS-1251//IGNORE', $data->Description));
exit();
}
}
Getting server response
// checking data in response
if (!isset($_POST['Xml']) || !$_POST['Xml']) {
logging('No data found in server response');
exit();
}
$_POST['Xml'] = str_replace(' ', '+', $_POST['Xml']);
// decrypting data
$td = mcrypt_module_open (MCRYPT_RIJNDAEL_128, '', MCRYPT_MODE_ECB, '');
mcrypt_generic_init ($td, CHECK_KEY, CHECK_KEY);
$_POST['Xml'] = mdecrypt_generic ($td, base64_decode($_POST['Xml']));
// releasing the cipher handle
mcrypt_generic_deinit ($td);
mcrypt_module_close ($td);
// enabling internal handling of libxml errors
libxml_use_internal_errors(true);
// trying to convert xml into an object
if ($xml = simplexml_load_string($_POST['Xml'])) {
foreach ($xml->entry as $item) {
...
}
}