Jobs

Jobs are the core resource in Harmstack. A job runs a specific benchmark against your model endpoint and produces a score. You can submit a single job, submit multiple jobs in one batch, list your past jobs, and retrieve a specific job’s status and results. All endpoints require authentication.

Authorization

string

required

Bearer YOUR_API_KEY

Submit a job

POST https://api.harmstack.com/v0/jobs

Submit a single benchmarking job. The job is queued immediately and begins processing asynchronously. Poll GET /v0/jobs/{id} to check progress and retrieve results.

Request body

benchmark_id

number

required

The ID of the benchmark to run. Retrieve available benchmark IDs from GET /v0/benchmarks.

endpoint_url

string

required

The URL of your model’s chat completions endpoint (e.g. https://your-model.example.com/v1/chat/completions).

api_key

string

required

The API key used to authenticate requests to your model endpoint.

provider

string

The API shape of your model endpoint. Accepted values: openai, openai_responses, gemini, raw. Defaults to openai.

model

string

The model name or identifier (e.g. "gpt-4o", "claude-3-5-sonnet"). Optional; used for logging.

headers

object

Additional HTTP headers to include with every request to your model endpoint. Provide as key-value pairs.

benchmark_count

number

default:"1"

Number of benchmark units to run. Must be between 1 and 10. Each unit costs one credit.

seed

number

Optional random seed for reproducible unit sampling.

Example request

curl --request POST \
  --url https://api.harmstack.com/v0/jobs \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "benchmark_id": 1,
    "endpoint_url": "https://your-model.example.com/v1/chat/completions",
    "api_key": "sk-your-model-api-key",
    "provider": "openai",
    "model": "gpt-4o",
    "benchmark_count": 3
  }'
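The same request can be sketched in Python using only the standard library. This is an illustrative sketch, not an official client: the helper name build_submit_request and its parameter names are assumptions, and the only validation it performs is the documented 1 to 10 range for benchmark_count. It builds the request without sending it.

```python
import json
import urllib.request

API_BASE = "https://api.harmstack.com/v0"


def build_submit_request(harmstack_key, benchmark_id, endpoint_url, model_api_key,
                         provider="openai", model=None, benchmark_count=1, seed=None):
    """Build (but do not send) a POST /v0/jobs request. Hypothetical helper."""
    # The docs state benchmark_count must be between 1 and 10.
    if not 1 <= benchmark_count <= 10:
        raise ValueError("benchmark_count must be between 1 and 10")

    body = {
        "benchmark_id": benchmark_id,
        "endpoint_url": endpoint_url,
        "api_key": model_api_key,  # key for YOUR model endpoint, not Harmstack
        "provider": provider,
        "benchmark_count": benchmark_count,
    }
    # Optional fields are only included when provided.
    if model is not None:
        body["model"] = model
    if seed is not None:
        body["seed"] = seed

    return urllib.request.Request(
        f"{API_BASE}/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {harmstack_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen and decode the JSON body of the 202 response to read job_id.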

Response

Returns 202 Accepted.

job_id

string

required

UUID of the created job. Use this to poll for results.

status

string

required

Initial status of the job. Always "pending" on creation.

message

string

required

A human-readable message with the polling URL.

{
  "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "pending",
  "message": "job submitted; poll GET /jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890 for progress"
}

Submitting a job immediately deducts credits equal to benchmark_count from your account balance. Ensure you have sufficient credits before submitting; you can check your balance with GET /v0/me.

Submit a batch of jobs

POST https://api.harmstack.com/v0/jobs/batch

Submit multiple benchmarking jobs in a single request. All jobs in the batch share the same model endpoint configuration. Credits for all jobs in the batch are deducted atomically: if your balance is insufficient, the entire batch is rejected.

Request body

endpoint_url

string

required

The URL of your model’s chat completions endpoint.

api_key

string

required

The API key used to authenticate requests to your model endpoint.

provider

string

The API shape of your model endpoint. Accepted values: openai, openai_responses, gemini, raw. Defaults to openai.

model

string

The model name or identifier.

headers

object

Additional HTTP headers to include with every request to your model endpoint.

seed

number

Optional random seed for reproducible unit sampling across all jobs in the batch.

jobs

object[]

required

Array of job definitions. Each item specifies which benchmark to run and how many units.

jobs[].benchmark_id

number

required

The ID of the benchmark to run for this job.

jobs[].benchmark_count

number

required

Number of benchmark units for this job. Must be between 1 and 10.

Example request

curl --request POST \
  --url https://api.harmstack.com/v0/jobs/batch \
  --header "Authorization: Bearer YOUR_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "endpoint_url": "https://your-model.example.com/v1/chat/completions",
    "api_key": "sk-your-model-api-key",
    "provider": "openai",
    "model": "gpt-4o",
    "jobs": [
      { "benchmark_id": 1, "benchmark_count": 5 },
      { "benchmark_id": 2, "benchmark_count": 3 }
    ]
  }'
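Because the whole batch is rejected when credits run short, it can help to total the cost before submitting. The sketch below assembles a batch body and sums its benchmark_count values. The helper names are hypothetical, and the cost calculation assumes the one-credit-per-unit pricing described for single jobs also applies to batches.

```python
def build_batch_payload(endpoint_url, model_api_key, jobs,
                        provider="openai", model=None, seed=None):
    """Build the JSON body for POST /v0/jobs/batch. Hypothetical helper.

    `jobs` is a list of (benchmark_id, benchmark_count) pairs.
    """
    items = []
    for benchmark_id, benchmark_count in jobs:
        # Each job item has the same documented 1..10 limit.
        if not 1 <= benchmark_count <= 10:
            raise ValueError(
                f"benchmark_count for benchmark {benchmark_id} must be between 1 and 10"
            )
        items.append({"benchmark_id": benchmark_id, "benchmark_count": benchmark_count})

    payload = {
        "endpoint_url": endpoint_url,
        "api_key": model_api_key,
        "provider": provider,
        "jobs": items,
    }
    if model is not None:
        payload["model"] = model
    if seed is not None:
        payload["seed"] = seed
    return payload


def batch_credit_cost(payload):
    """Total credits the batch would need, ASSUMING one credit per unit."""
    return sum(item["benchmark_count"] for item in payload["jobs"])
```

For the example request above, batch_credit_cost returns 8 (5 + 3), which could be compared against the balance reported by GET /v0/me before sending.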

Response

Returns 202 Accepted.

batch_id

string

required

UUID identifying the batch.

job_ids

string[]

required

Array of UUIDs for each created job. Poll GET /v0/jobs/{id} for each to retrieve results.

status

string

required

Initial status of all jobs in the batch. Always "pending" on creation.

message

string

required

A human-readable message describing how to poll for results.

{
  "batch_id": "f9e8d7c6-b5a4-3210-fedc-ba9876543210",
  "job_ids": [
    "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "b2c3d4e5-f6a7-8901-bcde-f12345678901"
  ],
  "status": "pending",
  "message": "job batch submitted; poll GET /jobs/{id} for each job for progress"
}

List jobs

GET https://api.harmstack.com/v0/jobs

Returns a list of jobs for your account. By default, it returns your 10 most recent completed jobs.

Query parameters

status

string

default:"completed"

Filter jobs by status. Accepted values: pending, running, completed, failed.

limit

number

default:"10"

Maximum number of jobs to return.

Example request

curl "https://api.harmstack.com/v0/jobs?status=completed&limit=5" \
  --header "Authorization: Bearer YOUR_API_KEY"
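The query string for this endpoint can also be built programmatically. The following is a minimal sketch in which list_jobs_url is a hypothetical helper; it only checks status against the accepted values listed above and mirrors the documented defaults.

```python
from urllib.parse import urlencode

API_BASE = "https://api.harmstack.com/v0"
ACCEPTED_STATUSES = ("pending", "running", "completed", "failed")


def list_jobs_url(status="completed", limit=10):
    """Return the GET /v0/jobs URL for the given filters. Hypothetical helper."""
    if status not in ACCEPTED_STATUSES:
        raise ValueError(f"status must be one of {ACCEPTED_STATUSES}")
    # urlencode keeps the insertion order of the dict.
    query = urlencode({"status": status, "limit": limit})
    return f"{API_BASE}/jobs?{query}"
```

Calling list_jobs_url("completed", 5) yields the same URL as the curl example above.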

Response

Returns 200 OK with a jobs array.

jobs

object[]

required

Array of job objects.

job_id

string

required

UUID of the job.

status

string

required

Current job status: pending, running, completed, or failed.

module

string

required

The evaluation module used (e.g. "Haystack").

benchmark_id

number

required

ID of the benchmark that was run.

benchmark_name

string

required

Name of the benchmark that was run.

endpoint_url

string

required

The model endpoint URL that was evaluated.

needle_annnoted_count

number

required

Number of annotated needle prompts used in the evaluation.

hay_non_annotated_count

number

required

Number of non-annotated hay prompts used in the evaluation.

total_prompts

number

required

Total number of prompts sent to your model endpoint.

created_at

string

required

ISO 8601 timestamp of when the job was created.

completed_at

string

ISO 8601 timestamp of when the job completed. null if still in progress.

passed_count

number

Number of evaluation units your model passed. Present when job is completed.

failed_count

number

Number of evaluation units your model failed. Present when job is completed.

score_pct

number

Your model’s score as a percentage (0–100). Present when job is completed.

total_scored

number

Total number of units that were scored. Present when job is completed.

{
  "jobs": [
    {
      "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "status": "completed",
      "module": "Haystack",
      "benchmark_id": 1,
      "benchmark_name": "haystack-medical-v1",
      "endpoint_url": "https://your-model.example.com/v1/chat/completions",
      "needle_annnoted_count": 3,
      "hay_non_annotated_count": 30,
      "total_prompts": 33,
      "created_at": "2024-06-01T12:00:00Z",
      "completed_at": "2024-06-01T12:05:42Z",
      "passed_count": 2,
      "failed_count": 1,
      "score_pct": 66.67,
      "total_scored": 3
    }
  ]
}
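As a sanity check when consuming this response, the score can be recomputed from the counts. The sketch below assumes score_pct equals passed_count divided by total_scored, rounded to two decimals. That formula is not stated in this reference; it is only consistent with the example above (2 of 3 gives 66.67). The function name is hypothetical.

```python
def recompute_score_pct(job):
    """Recompute a job's score from its counts.

    ASSUMPTION: score_pct = 100 * passed_count / total_scored, rounded to
    two decimals. Returns None when the job has no score yet.
    """
    if job.get("status") != "completed" or not job.get("total_scored"):
        return None
    return round(100 * job["passed_count"] / job["total_scored"], 2)


# Values taken from the example response above.
example_job = {
    "status": "completed",
    "passed_count": 2,
    "failed_count": 1,
    "score_pct": 66.67,
    "total_scored": 3,
}
recomputed = recompute_score_pct(example_job)
```

Here recomputed matches the score_pct value in the example response.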

Get a job

GET https://api.harmstack.com/v0/jobs/{id}

Retrieve the full status and results for a specific job. Poll this endpoint after submitting a job to track progress and retrieve your score.

Path parameters

id

string

required

The UUID of the job, returned from POST /v0/jobs or POST /v0/jobs/batch.

Example request

curl https://api.harmstack.com/v0/jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890 \
  --header "Authorization: Bearer YOUR_API_KEY"

Response

Returns 200 OK. The response contains the same fields as the list response plus price, with additional real-time progress fields available while the job is running.

job_id

string

required

UUID of the job.

status

string

required

Current job status: pending, running, completed, or failed.

module

string

required

The evaluation module used.

benchmark_id

number

required

ID of the benchmark that was run.

benchmark_name

string

required

Name of the benchmark that was run.

endpoint_url

string

required

The model endpoint URL that was evaluated.

needle_annnoted_count

number

required

Number of annotated needle prompts used.

hay_non_annotated_count

number

required

Number of non-annotated hay prompts used.

total_prompts

number

required

Total number of prompts sent to your model endpoint.

price

number

required

Credits deducted for this job.

created_at

string

required

ISO 8601 timestamp of when the job was created.

completed_at

string

ISO 8601 timestamp of when the job completed. null if still in progress.

passed_count

number

Number of evaluation units your model passed. Present when job is completed.

failed_count

number

Number of evaluation units your model failed. Present when job is completed.

score_pct

number

Your model’s score as a percentage (0–100). Present when job is completed.

total_scored

number

Total number of units scored. Present when job is completed.

current

number

Number of prompts processed so far. Present while the job is running.

total

number

Total number of prompts to process. Present while the job is running.

message

string

A progress message from the runner. Present while the job is running.

{
  "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "completed",
  "module": "Haystack",
  "benchmark_id": 1,
  "benchmark_name": "haystack-medical-v1",
  "endpoint_url": "https://your-model.example.com/v1/chat/completions",
  "needle_annnoted_count": 3,
  "hay_non_annotated_count": 30,
  "total_prompts": 33,
  "price": 3.0,
  "created_at": "2024-06-01T12:00:00Z",
  "completed_at": "2024-06-01T12:05:42Z",
  "passed_count": 2,
  "failed_count": 1,
  "score_pct": 66.67,
  "total_scored": 3
}

When a job is still in progress (status: "running"), the response also includes current, total, and optionally message fields so you can track real-time progress.
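The polling flow described above can be sketched as a small loop. This is an illustrative sketch under stated assumptions: poll_job and make_http_fetcher are hypothetical names, the 5-second interval and 120-poll cap are arbitrary choices, and "completed" and "failed" are treated as the only terminal statuses. The fetch function is injected so the loop can be exercised without network access.

```python
import json
import time
import urllib.request

API_BASE = "https://api.harmstack.com/v0"
TERMINAL_STATUSES = {"completed", "failed"}  # assumed to be final states


def make_http_fetcher(harmstack_key):
    """Return a function that calls GET /v0/jobs/{id}. Hypothetical helper."""
    def fetch_job(job_id):
        req = urllib.request.Request(
            f"{API_BASE}/jobs/{job_id}",
            headers={"Authorization": f"Bearer {harmstack_key}"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return fetch_job


def poll_job(fetch_job, job_id, interval_s=5.0, max_polls=120, sleep=time.sleep):
    """Poll until the job reaches a terminal status, then return the job dict."""
    for _ in range(max_polls):
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        # current/total are only present while the job is running.
        if job["status"] == "running" and "current" in job and "total" in job:
            note = job.get("message", "")
            print(f"{job['current']}/{job['total']} prompts processed {note}".rstrip())
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

A typical call would be poll_job(make_http_fetcher("YOUR_API_KEY"), job_id), after which score_pct can be read from the returned dict when status is "completed".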
