Jobs are the core resource in Harmstack. A job runs a specific benchmark against your model endpoint and produces a score. You can submit a single job, submit multiple jobs in one batch, list your past jobs, and retrieve a specific job’s status and results. All endpoints require authentication.
Authorization
string
required
Your Harmstack API key, sent as a bearer token: Bearer YOUR_API_KEY

Submit a job

POST https://api.harmstack.com/v0/jobs
Submit a single benchmarking job. The job is queued immediately and begins processing asynchronously. Poll GET /v0/jobs/{id} to check progress and retrieve results.

Request body

benchmark_id
number
required
The ID of the benchmark to run. Retrieve available benchmark IDs from GET /v0/benchmarks.
endpoint_url
string
required
The URL of your model’s chat completions endpoint (e.g. https://your-model.example.com/v1/chat/completions).
api_key
string
required
The API key used to authenticate requests to your model endpoint.
provider
string
The API shape of your model endpoint. Accepted values: openai, openai_responses, gemini, raw. Defaults to openai.
model
string
The model name or identifier (e.g. "gpt-4o", "claude-3-5-sonnet"). Optional; used for logging.
headers
object
Additional HTTP headers to include with every request to your model endpoint. Provide as key-value pairs.
benchmark_count
number
default:"1"
Number of benchmark units to run. Must be between 1 and 10. Each unit costs one credit.
seed
number
Optional random seed for reproducible unit sampling.

Example request

curl --request POST \
  --url https://api.harmstack.com/v0/jobs \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "benchmark_id": 1,
    "endpoint_url": "https://your-model.example.com/v1/chat/completions",
    "api_key": "sk-your-model-api-key",
    "provider": "openai",
    "model": "gpt-4o",
    "benchmark_count": 3
  }'

Response

Returns 202 Accepted.
job_id
string
required
UUID of the created job. Use this to poll for results.
status
string
required
Initial status of the job. Always "pending" on creation.
message
string
required
A human-readable message with the polling URL.
{
  "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "pending",
  "message": "job submitted; poll GET /jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890 for progress"
}
Submitting a job immediately deducts credits equal to benchmark_count from your account balance. Check your balance with GET /v0/me before submitting to make sure you have enough credits.
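The request above can also be assembled in Python. The following is a minimal sketch using only the standard library, not an official client: the helper name build_job_request and its argument names are hypothetical, and the request is built but not sent, so the 1 to 10 range check on benchmark_count runs before any credits are spent.

```python
import json
import urllib.request

API_BASE = "https://api.harmstack.com/v0"


def build_job_request(harmstack_key, benchmark_id, endpoint_url, model_api_key,
                      benchmark_count=1, **optional):
    """Build (but do not send) a POST /v0/jobs request. Hypothetical helper."""
    # The docs state benchmark_count must be between 1 and 10.
    if not 1 <= benchmark_count <= 10:
        raise ValueError("benchmark_count must be between 1 and 10")
    payload = {
        "benchmark_id": benchmark_id,
        "endpoint_url": endpoint_url,
        "api_key": model_api_key,
        "benchmark_count": benchmark_count,
    }
    # Optional fields (provider, model, headers, seed) are included only when given.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        f"{API_BASE}/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {harmstack_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_job_request(
    "YOUR_API_KEY",
    1,
    "https://your-model.example.com/v1/chat/completions",
    "sk-your-model-api-key",
    benchmark_count=3,
    provider="openai",
    model="gpt-4o",
)
# To actually submit the job (this deducts credits):
#   with urllib.request.urlopen(request) as resp:
#       job_id = json.load(resp)["job_id"]
```

Keeping request construction separate from sending makes it possible to inspect or log the payload before anything is charged.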

Submit a batch of jobs

POST https://api.harmstack.com/v0/jobs/batch
Submit multiple benchmarking jobs in a single request. All jobs in the batch share the same model endpoint configuration. Credits for all jobs in the batch are deducted atomically: if your balance is insufficient, the entire batch is rejected.

Request body

endpoint_url
string
required
The URL of your model’s chat completions endpoint.
api_key
string
required
The API key used to authenticate requests to your model endpoint.
provider
string
The API shape of your model endpoint. Accepted values: openai, openai_responses, gemini, raw. Defaults to openai.
model
string
The model name or identifier.
headers
object
Additional HTTP headers to include with every request to your model endpoint.
seed
number
Optional random seed for reproducible unit sampling across all jobs in the batch.
jobs
object[]
required
Array of job definitions. Each item specifies the benchmark to run (benchmark_id) and the number of units to run (benchmark_count).

Example request

curl --request POST \
  --url https://api.harmstack.com/v0/jobs/batch \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "endpoint_url": "https://your-model.example.com/v1/chat/completions",
    "api_key": "sk-your-model-api-key",
    "provider": "openai",
    "model": "gpt-4o",
    "jobs": [
      { "benchmark_id": 1, "benchmark_count": 5 },
      { "benchmark_id": 2, "benchmark_count": 3 }
    ]
  }'

Response

Returns 202 Accepted.
batch_id
string
required
UUID identifying the batch.
job_ids
string[]
required
Array of UUIDs for each created job. Poll GET /v0/jobs/{id} for each to retrieve results.
status
string
required
Initial status of all jobs in the batch. Always "pending" on creation.
message
string
required
A human-readable message describing how to poll for results.
{
  "batch_id": "f9e8d7c6-b5a4-3210-fedc-ba9876543210",
  "job_ids": [
    "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "b2c3d4e5-f6a7-8901-bcde-f12345678901"
  ],
  "status": "pending",
  "message": "job batch submitted; poll GET /jobs/{id} for each job for progress"
}
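Because a batch is rejected as a whole when credits run short, it can help to total the cost before submitting. The sketch below assumes the batch cost is the sum of each job's benchmark_count (one credit per unit, as stated for single jobs) and that a missing benchmark_count defaults to 1 with the same 1 to 10 limit; neither is confirmed for the batch endpoint. The helper names are hypothetical, and the balance is passed in as a plain number because the shape of the GET /v0/me response is not described here.

```python
def batch_credit_cost(jobs):
    """Total credits a batch would deduct, assuming one credit per unit."""
    total = 0
    for job in jobs:
        # Assumption: same default (1) and range (1-10) as POST /v0/jobs.
        count = job.get("benchmark_count", 1)
        if not 1 <= count <= 10:
            raise ValueError(
                f"benchmark_count for benchmark {job['benchmark_id']} "
                "must be between 1 and 10"
            )
        total += count
    return total


def can_afford(jobs, balance):
    """True if the whole batch fits within the given credit balance."""
    return batch_credit_cost(jobs) <= balance


jobs = [
    {"benchmark_id": 1, "benchmark_count": 5},
    {"benchmark_id": 2, "benchmark_count": 3},
]
print(batch_credit_cost(jobs))  # 8
print(can_afford(jobs, 7))      # False
```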

List jobs

GET https://api.harmstack.com/v0/jobs
Returns a list of jobs for your account. By default, it returns your 10 most recent completed jobs.

Query parameters

status
string
default:"completed"
Filter jobs by status. Accepted values: pending, running, completed, failed.
limit
number
default:"10"
Maximum number of jobs to return.

Example request

curl "https://api.harmstack.com/v0/jobs?status=completed&limit=5" \
  --header "Authorization: Bearer YOUR_API_KEY"
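The same query string can be built in Python. This is a small sketch with a hypothetical helper, list_jobs_url, that validates the status filter against the documented values before encoding the parameters; the defaults mirror the ones listed above.

```python
from urllib.parse import urlencode

# Status values accepted by the status filter, per the parameter list above.
VALID_STATUSES = {"pending", "running", "completed", "failed"}


def list_jobs_url(status="completed", limit=10):
    """Build the GET /v0/jobs URL for the given filter. Hypothetical helper."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    query = urlencode({"status": status, "limit": limit})
    return f"https://api.harmstack.com/v0/jobs?{query}"


print(list_jobs_url("completed", 5))
# https://api.harmstack.com/v0/jobs?status=completed&limit=5
```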

Response

Returns 200 OK with a jobs array.
jobs
object[]
required
Array of job objects.
{
  "jobs": [
    {
      "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "status": "completed",
      "module": "Haystack",
      "benchmark_id": 1,
      "benchmark_name": "haystack-medical-v1",
      "endpoint_url": "https://your-model.example.com/v1/chat/completions",
      "needle_annnoted_count": 3,
      "hay_non_annotated_count": 30,
      "total_prompts": 33,
      "created_at": "2024-06-01T12:00:00Z",
      "completed_at": "2024-06-01T12:05:42Z",
      "passed_count": 2,
      "failed_count": 1,
      "score_pct": 66.67,
      "total_scored": 3
    }
  ]
}
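The numbers in the sample job are consistent with score_pct being passed_count divided by total_scored, expressed as a percentage and rounded to two decimals. That formula is an inference from the example, not a documented definition, and the function name below is hypothetical.

```python
def expected_score_pct(job):
    """Recompute the score, assuming score_pct = passed_count / total_scored * 100."""
    if job["total_scored"] == 0:
        return 0.0
    return round(100 * job["passed_count"] / job["total_scored"], 2)


# Values copied from the sample response above.
sample_job = {"passed_count": 2, "failed_count": 1, "total_scored": 3, "score_pct": 66.67}
print(expected_score_pct(sample_job))  # 66.67
```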

Get a job

GET https://api.harmstack.com/v0/jobs/{id}
Retrieve the full status and results for a specific job. Poll this endpoint after submitting a job to track progress and retrieve your score.

Path parameters

id
string
required
The UUID of the job, returned from POST /v0/jobs or POST /v0/jobs/batch.

Example request

curl https://api.harmstack.com/v0/jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  --header "Authorization: Bearer YOUR_API_KEY"

Response

Returns 200 OK. Fields are the same as in the list response, with additional real-time progress fields available while the job is running.
job_id
string
required
UUID of the job.
status
string
required
Current job status: pending, running, completed, or failed.
module
string
required
The evaluation module used.
benchmark_id
number
required
ID of the benchmark that was run.
benchmark_name
string
required
Name of the benchmark that was run.
endpoint_url
string
required
The model endpoint URL that was evaluated.
needle_annnoted_count
number
required
Number of annotated needle prompts used.
hay_non_annotated_count
number
required
Number of non-annotated hay prompts used.
total_prompts
number
required
Total number of prompts sent to your model endpoint.
price
number
required
Credits deducted for this job.
created_at
string
required
ISO 8601 timestamp of when the job was created.
completed_at
string
ISO 8601 timestamp of when the job completed. null if still in progress.
passed_count
number
Number of evaluation units your model passed. Present when the job is completed.
failed_count
number
Number of evaluation units your model failed. Present when the job is completed.
score_pct
number
Your model’s score as a percentage (0–100). Present when the job is completed.
total_scored
number
Total number of units scored. Present when the job is completed.
current
number
Number of prompts processed so far. Present while the job is running.
total
number
Total number of prompts to process. Present while the job is running.
message
string
A progress message from the runner. Present while the job is running.
{
  "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "completed",
  "module": "Haystack",
  "benchmark_id": 1,
  "benchmark_name": "haystack-medical-v1",
  "endpoint_url": "https://your-model.example.com/v1/chat/completions",
  "needle_annnoted_count": 3,
  "hay_non_annotated_count": 30,
  "total_prompts": 33,
  "price": 3.0,
  "created_at": "2024-06-01T12:00:00Z",
  "completed_at": "2024-06-01T12:05:42Z",
  "passed_count": 2,
  "failed_count": 1,
  "score_pct": 66.67,
  "total_scored": 3
}
When a job is still in progress (status: "running"), the response also includes current, total, and optionally message fields so you can track real-time progress.
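A polling loop for this endpoint might look like the sketch below. It is illustrative only: wait_for_job, the polling interval, and the poll limit are assumptions, and the HTTP call is injected as fetch_job (any callable that returns the parsed JSON for GET /v0/jobs/{id}) so the loop can be exercised without network access. The demo uses a fake fetcher that replays canned responses.

```python
import time

# Statuses after which the job will not change again.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_job, job_id, interval_s=5.0, max_polls=120, sleep=time.sleep):
    """Poll until the job is completed or failed, then return the final job object."""
    for _ in range(max_polls):
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        # current and total are only present while the job is running.
        if job["status"] == "running" and "current" in job and "total" in job:
            print(f"{job['current']}/{job['total']} prompts processed")
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


# Demo with canned responses standing in for the real API.
canned = iter([
    {"status": "pending"},
    {"status": "running", "current": 10, "total": 33},
    {"status": "completed", "score_pct": 66.67},
])
final = wait_for_job(lambda _job_id: next(canned), "a1b2c3d4", interval_s=0)
print(final["status"], final["score_pct"])  # completed 66.67
```

In real use, fetch_job would issue the authenticated GET request shown in the example above and return the decoded JSON body.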