Send Data

Description

Send data to an existing job or to multiple jobs. If the comma-separated list of jobIds form is used, the data is duplicated and sent to each job for analysis; the jobs should all expect data in the same format. The send operation returns a JSON object containing counts of the records, fields and bytes processed, among other details. See Data Response for a full description.

Important

The Engine API can only accept data from a single connection. Do not attempt to access the data endpoint from different threads at the same time. Use a single connection synchronously to send data to, and to close, flush or delete, a single job.

Definition

http://localhost:8080/engine/v2/data/<jobId>[,<jobId2>,<jobId3>...]
jobId: A comma-separated list of the identifier(s) of the job(s) that the data is destined for. At least one jobId must be specified.

Parameters

ignoreDowntime: Controls whether, after a job restart, gaps in the data are treated as anomalous or as a maintenance window. See Ignore Downtime.

Method

POST

Returns

Status code 202 is returned once the upload has finished.
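
For illustration, here is a minimal sketch of an upload using Python's requests library (requests is an assumption, not a client shipped with the API; the job ID and file name are placeholders):

# Minimal sketch: POST a CSV file to a single job.
# "records.csv" and the job ID are illustrative placeholders.
import requests

with open("records.csv", "rb") as data:
    response = requests.post(
        "http://localhost:8080/engine/v2/data/super-important-analysis",
        data=data,
    )

response.raise_for_status()   # expect 202 Accepted
result = response.json()      # the upload summary shown below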

EXAMPLE RESPONSES

When the single job form is used and the upload is successful, the response will contain these fields:

{
  "responses" : [ {
    "jobId" : "super-important-analysis",
    "uploadSummary" : {
      "processedRecordCount" : 6275,
      "processedFieldCount" : 25100,
      "inputRecordCount" : 6293,
      "inputBytes" : 88538,
      "inputFieldCount" : 25172,
      "invalidDateCount" : 18,
      "missingFieldCount" : 0,
      "outOfOrderTimeStampCount" : 0,
      "failedTransformCount" : 0,
      "latestRecordTimeStamp" : "2015-06-25T23:59:56.000+0000"
    }
  } ]
}

In this example, only 6275 of the 6293 input records were processed because 18 had date fields that couldn’t be parsed.

If the data is sent to multiple jobs then multiple responses are returned:

{
  "responses" : [ {
    "jobId" : "joba",
    "uploadSummary" : {
      "processedRecordCount" : 86275,
      "processedFieldCount" : 172550,
      "inputRecordCount" : 86275,
      "inputBytes" : 3760538,
      "inputFieldCount" : 258825,
      "invalidDateCount" : 0,
      "outOfOrderTimeStampCount" : 0,
      "missingFieldCount" : 0,
      "failedTransformCount" : 0,
      "latestRecordTimeStamp" : "2014-06-27T23:59:56.000+0000"
    }
  }, {
    "jobId" : "jobb",
    "error" : {
      "cause" : null,
      "message" : "No known job with id 'jobb'",
      "errorCode" : 20101
    }
  }, {
    "jobId" : "jobc",
    "uploadSummary" : {
      "processedRecordCount" : 86275,
      "processedFieldCount" : 172550,
      "inputRecordCount" : 86275,
      "inputBytes" : 3760538,
      "inputFieldCount" : 258825,
      "invalidDateCount" : 0,
      "outOfOrderTimeStampCount" : 0,
      "missingFieldCount" : 0,
      "failedTransformCount" : 0,
      "latestRecordTimeStamp" : "2014-06-27T23:59:56.000+0000"
    }
  } ]
}

In this case there was an error uploading to jobb, as the job does not exist.
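
When uploading to multiple jobs it is therefore worth checking each element of the responses array individually. A sketch, assuming result holds the parsed JSON body from the earlier example:

# Sketch: report the outcome of a multi-job upload per job.
# Assumes "result" is the parsed JSON response body.
for item in result["responses"]:
    if "error" in item:
        print(item["jobId"], "failed:", item["error"]["message"])
    else:
        summary = item["uploadSummary"]
        print(item["jobId"], "processed", summary["processedRecordCount"], "records")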

Ignore Downtime

Gaps in your data, which may occur naturally as a result of downtime or a maintenance window, can be regarded as anomalous by the Engine API. When restarting a job after a period of downtime, setting the ignoreDowntime flag instructs the Engine not to generate anomalies for the period between closing the job and restarting it. This parameter only takes effect at the job restart after the job has been closed; in all other cases it is ignored.

For example:

http://localhost:8080/engine/v2/data/job-with-gap-in-data?ignoreDowntime=true

ignoreDowntime is a Boolean value: the strings ‘True’ and ‘true’ evaluate to true, and all other values evaluate to false. The default is false.
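
Continuing the earlier sketch, the flag could be passed as a query parameter (again assuming the requests library; the file name is a placeholder):

# Sketch: set ignoreDowntime on the first POST after restarting a job.
import requests

with open("records.csv", "rb") as data:
    response = requests.post(
        "http://localhost:8080/engine/v2/data/job-with-gap-in-data",
        params={"ignoreDowntime": "true"},  # only takes effect at a job restart
        data=data,
    )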

Errors

See the Error Codes documentation for the full list of errors that may be returned by the API.

NOTES

The API will not send the final 202 response to the POST request until all the data has been copied into the Engine. At the time the server responds, the Engine may not have finished processing all the data sent, but it is ready to accept more. If the data is POSTed to multiple jobs, the request will not return until all the jobs are ready. If the client and server are geographically separate and share an unreliable connection, or if your data is in a very large file, it is prudent to stream the data in chunks; if an upload fails, only a single chunk then needs to be resent. The Web Service can handle gzipped data, but you cannot split a gzipped file into chunks and upload them piecemeal - the entire compressed file has to be sent in one message. See Using compressed data for details.
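
A sketch of such a chunked upload, assuming newline-delimited CSV records and the requests library (the chunk size, file name, and the choice to repeat the CSV header in every chunk are assumptions, not API requirements):

# Sketch: upload a large CSV file as several smaller POSTs so that a
# failed chunk can be resent on its own. Chunks break on line (record)
# boundaries; each chunk repeats the CSV header (an assumption about
# the job's data format).
import requests

URL = "http://localhost:8080/engine/v2/data/super-important-analysis"
LINES_PER_CHUNK = 100000

def chunked(path, lines_per_chunk):
    with open(path, "rb") as f:
        header = f.readline()
        while True:
            lines = []
            for _ in range(lines_per_chunk):
                line = f.readline()
                if not line:
                    break
                lines.append(line)
            if not lines:
                return
            yield header + b"".join(lines)

for body in chunked("records.csv", LINES_PER_CHUNK):
    requests.post(URL, data=body).raise_for_status()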

Important

The Engine API prefers time series data to be in ascending chronological order. If the order of the data cannot be guaranteed, a latency window can be specified in the job analysis configuration. Further notes on handling out-of-sequence data are available.

Where latency has not been specified and data is streamed in chunks with multiple POSTs, this ordering must be maintained, with the earliest data in the first POST. If a job is restarted after being finished and persisted, the new data must be temporally ordered after the last data processed by the job before it was finished.
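
Where the ordering of a file is not guaranteed and no latency window has been configured, one option is to sort the records before upload. A sketch assuming CSV data with a timestamp column in a lexically sortable format such as ISO 8601 (the field and file names are illustrative):

# Sketch: sort CSV records into ascending time order before uploading.
# The "timestamp" field name and ISO 8601 format are assumptions.
import csv

with open("unsorted.csv", newline="") as f:
    reader = csv.DictReader(f)
    rows = sorted(reader, key=lambda r: r["timestamp"])

with open("sorted.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=reader.fieldnames)
    writer.writeheader()
    writer.writerows(rows)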