If you are deploying Label Studio Enterprise or Label Studio Starter on-prem and are upgrading from a pre-2.34 release, you will need to run a script to backfill changes to annotator agreement.

What is agreement and how has it changed

When you have multiple annotators working on a task, agreement shows how much overlap there is between their submissions.

Label Studio 2.34 introduces a number of changes and enhancements, including:

Consensus methodology for scoring
The ability to configure agreement metrics for each control tag
The ability to view per-control-tag agreement in the Data Manager

See the Label Studio 2.34 release notes for a complete overview of the changes.

What this script does

With Label Studio 2.34, agreement is backed by a new data model.

Because of this new model, existing annotation data must be reprocessed. This script will migrate all your old data to this new model.

Who needs to run this script

This script only needs to be run once, and is only needed for organizations that are:

On-prem: Running Label Studio Enterprise or Label Studio Starter
Upgrading: Upgrading from a pre-2.34 release. New organizations do not need to migrate.

What happens if you do not run this script

If you do not migrate, nothing in your deployment will break.

However, existing projects will show empty or zero per-control-tag agreement scores for all historical tasks and annotators.

Effects while running the script

Zero Downtime: You do not need to pause labeling operations. The system remains fully accessible to all users.
Background Resource Usage: The migration is designed to trigger asynchronous, parallel background jobs on a per-project basis. During the execution period, you may observe an increase in background processing load.

Agreement scores will start appearing project by project. You do not need to wait for the entire organization to finish before results are visible.

If a project job fails, you can retry without reprocessing completed projects. Projects that have already been backfilled will be automatically skipped on subsequent runs.

How to migrate

You can trigger and monitor the migration programmatically without needing to use the Label Studio UI.

The migration operates on your entire organization simultaneously. See Trigger agreement backfill for organization in our API reference.

You must have Administrator or Owner permissions. You can approach this migration in several ways:

Batched rollout
Per-project
All projects at once

Option 1: Batched rollout (Recommended)

Use this to process a controlled number of projects per call. This is the safest approach for large organizations — it keeps the background job queue from filling up and avoids blocking other async work (storage syncs, ML backend calls, etc.).

Using the Python SDK

from label_studio_sdk import LabelStudio

ls = LabelStudio(
    base_url="https://your-label-studio-instance.com",
    api_key="your-api-key",
)

BATCH_SIZE = 10  # adjust based on your queue capacity and urgency

result = ls.dimensions.trigger_backfill(num_projects=BATCH_SIZE)
print(
    f"Queued:{result.jobs_queued}, "
    f"Skipped (already done):{result.projects_skipped}, "
    f"Remaining:{result.projects_remaining}"
)

if result.projects_remaining == 0:
    print("All projects queued or complete.")

Using requests

import requests

BASE_URL = "https://your-label-studio-instance.com"
HEADERS = {"Authorization": "Token your-api-key", "Content-Type": "application/json"}

BATCH_SIZE = 10  # adjust based on your queue capacity and urgency

response = requests.post(
    f"{BASE_URL}/api/dimensions/backfill/",
    headers=HEADERS,
    json={"num_projects": BATCH_SIZE},
)
data = response.json()
print(
    f"Queued:{data['jobs_queued']}, "
    f"Skipped (already done):{data['projects_skipped']}, "
    f"Remaining:{data['projects_remaining']}"
)

if data["projects_remaining"] == 0:
    print("All projects queued or complete.")
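Each call above queues a single batch. To work through the whole organization, you can repeat the call until projects_remaining reaches 0. The following is a minimal sketch rather than an official script: run_batched_backfill, the trigger callable, pause_seconds, and max_calls are hypothetical names introduced here, and the fixed pause between calls is an assumption about pacing, not documented behavior.

```python
import time
from typing import Callable, Mapping


def run_batched_backfill(
    trigger: Callable[[int], Mapping[str, int]],
    batch_size: int = 10,
    pause_seconds: float = 60.0,
    max_calls: int = 1000,
) -> int:
    """Call trigger(batch_size) until no projects remain; return total jobs queued.

    `trigger` is a hypothetical wrapper around the backfill call shown above.
    It must return a mapping with `jobs_queued` and `projects_remaining`.
    """
    total_queued = 0
    for _ in range(max_calls):
        result = trigger(batch_size)
        total_queued += result["jobs_queued"]
        if result["projects_remaining"] == 0:
            return total_queued
        # Assumed pacing: give the background queue time to drain
        # before queuing the next batch.
        time.sleep(pause_seconds)
    raise RuntimeError(f"Projects still remaining after {max_calls} calls")
```

Here, trigger could wrap the requests.post call above and return response.json(). Tune pause_seconds to how quickly your background workers clear each batch.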

Option 2: Single project

Use this when you want to backfill one specific project, for example to validate the migration before rolling it out broadly.

Using the Python SDK

from label_studio_sdk import LabelStudio

ls = LabelStudio(
    base_url="https://your-label-studio-instance.com",
    api_key="your-api-key",
)

result = ls.dimensions.trigger_backfill(project_id=42)
print(result)
# jobs_queued=1, projects_skipped=0, projects_remaining=0, ...

Using requests

import requests

BASE_URL = "https://your-label-studio-instance.com"
HEADERS = {"Authorization": "Token your-api-key", "Content-Type": "application/json"}

response = requests.post(
    f"{BASE_URL}/api/dimensions/backfill/",
    headers=HEADERS,
    json={"project_id": 42},
)
print(response.json())
# {"jobs_queued": 1, "projects_skipped": 0, "projects_remaining": 0, ...}

Option 3: All projects at once

Warning: This cancels all in-flight backfill jobs and immediately enqueues every unprocessed project in your organization. On large instances this can flood the background job queue and delay other async operations (storage syncs, webhooks, ML backend predictions) for an extended period. Use batched mode unless you are intentionally prioritizing the backfill above all other background work.

Using the Python SDK

from label_studio_sdk import LabelStudio

ls = LabelStudio(
    base_url="https://your-label-studio-instance.com",
    api_key="your-api-key",
)

result = ls.dimensions.trigger_backfill(all_projects=True)
print(result)
# jobs_queued=150, projects_skipped=12, projects_remaining=0, ...

Using requests

import requests

BASE_URL = "https://your-label-studio-instance.com"
HEADERS = {"Authorization": "Token your-api-key", "Content-Type": "application/json"}

response = requests.post(
    f"{BASE_URL}/api/dimensions/backfill/",
    headers=HEADERS,
    json={"all_projects": True},
)
print(response.json())
# {"jobs_queued": 150, "projects_skipped": 12, "projects_remaining": 0, ...}

Migration time

Total duration will scale according to the number of tasks, number of annotations per task, and number of control tags in your projects.

Approximate time per project:

Organization Size	Approximate Duration
Small (<10K entities)	< 1 minute
Medium (10K-100K entities)	1-5 minutes
Large (100K-1M entities)	5-10 minutes
Very Large (1M+ entities)	10-60 minutes

Monitor progress

Check overall organization status

Using the Python SDK

status = ls.dimensions.get_backfill_status()
print(status)
# org_status={"completed": 42, "pending": 8, "failed": 1, ...}

Using requests

response = requests.get(f"{BASE_URL}/api/dimensions/backfill/status/", headers=HEADERS)
print(response.json())
# {"org_status": {"completed": 42, "pending": 8, "failed": 1, ...}}
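If you want a one-line progress readout instead of the raw response, you could summarize the counts. This sketch assumes org_status is a mapping of state names to project counts, as in the sample output above. summarize_org_status is a hypothetical helper, and it reads only the three keys shown in this article, so any keys in the elided part of the response are ignored.

```python
from typing import Mapping

# Keys taken from the sample response in this article; other keys are ignored.
KNOWN_STATES = ("completed", "pending", "failed")


def summarize_org_status(org_status: Mapping[str, int]) -> str:
    """Return a one-line progress summary from the org_status counts."""
    counts = {state: int(org_status.get(state, 0)) for state in KNOWN_STATES}
    total = sum(counts.values())
    # Guard against an empty organization to avoid dividing by zero.
    percent = 100.0 * counts["completed"] / total if total else 0.0
    return (
        f"{counts['completed']} completed, {counts['pending']} pending, "
        f"{counts['failed']} failed ({percent:.1f}% of {total} counted)"
    )
```

With the requests example above, you could pass in response.json()["org_status"].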

Check a specific project

Using the Python SDK

status = ls.dimensions.get_backfill_status(project_id=42)
print(status)
# job_id=17, status="COMPLETED", ...

Using requests

response = requests.get(
    f"{BASE_URL}/api/dimensions/backfill/status/",
    headers=HEADERS,
    params={"project_id": 42},
)
print(response.json())
# {"job_id": 17, "status": "COMPLETED", ...}
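When validating a single project, it can be convenient to wait until its job finishes. The sketch below polls a status callable until the job reports COMPLETED or FAILED. wait_for_project, get_status, poll_seconds, and timeout_seconds are hypothetical names, and treating only those two values as final is an assumption based on the status options listed in this article.

```python
import time
from typing import Callable, Mapping

# Assumed terminal states, based on the status options listed in this article.
TERMINAL_STATES = {"COMPLETED", "FAILED"}


def wait_for_project(
    get_status: Callable[[], Mapping[str, object]],
    poll_seconds: float = 15.0,
    timeout_seconds: float = 3600.0,
) -> str:
    """Poll get_status() until the job reaches a terminal state; return that state.

    `get_status` is a hypothetical wrapper around the status call shown above.
    It must return a mapping that contains a `status` key.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = str(get_status()["status"])
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Backfill still {status} after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

Here, get_status could wrap the requests.get call above and return response.json(). The timeout keeps the loop from running forever if the job ends in a state this sketch does not recognize.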

List all jobs with filtering

Using the Python SDK

jobs = ls.dimensions.list_backfills(status="FAILED")
# status options: PENDING, QUEUED, RUNNING, COMPLETED, FAILED
for job in jobs.results:
    print(job)

Using requests

response = requests.get(
    f"{BASE_URL}/api/dimensions/backfill/jobs/",
    headers=HEADERS,
    params={"status": "FAILED"},  # PENDING, QUEUED, RUNNING, COMPLETED, FAILED
)
print(response.json())
# {"count": 2, "results": [...]}
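Because completed projects are skipped on later runs, one way to retry failures is to list the FAILED jobs and trigger a single-project backfill for each affected project. This is a sketch under assumptions: retry_failed_projects and trigger_project are hypothetical names, and the project_id field on each job record is a guess at the response shape, which this article does not show. Check the API reference for the actual field name.

```python
from typing import Callable, Iterable, List, Mapping


def retry_failed_projects(
    failed_jobs: Iterable[Mapping[str, object]],
    trigger_project: Callable[[int], object],
    project_key: str = "project_id",  # assumed field name, not confirmed here
) -> List[int]:
    """Trigger one backfill per distinct project found in failed_jobs.

    `trigger_project` is a hypothetical wrapper around the single-project
    backfill call shown earlier. Returns the project IDs that were retried.
    """
    retried: List[int] = []
    seen = set()
    for job in failed_jobs:
        project_id = job.get(project_key)
        # Skip records without a usable ID and projects already retried.
        if not isinstance(project_id, int) or project_id in seen:
            continue
        seen.add(project_id)
        trigger_project(project_id)
        retried.append(project_id)
    return retried
```

With the requests example above, failed_jobs could be response.json()["results"], and trigger_project could post {"project_id": ...} to the backfill endpoint.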

Cancel the migration

Cancel all jobs for the organization

Using the Python SDK

result = ls.dimensions.cancel_backfill()
print(result)
# cancelled_count=5, ...

Using requests

response = requests.delete(f"{BASE_URL}/api/dimensions/backfill/", headers=HEADERS)
print(response.json())
# {"cancelled_count": 5, "message": "Successfully cancelled 5 Agreement V2 backfill job(s)"}

Cancel jobs for a specific project

Using the Python SDK

result = ls.dimensions.cancel_backfill(project_id=42)
print(result)

Using requests

response = requests.delete(
    f"{BASE_URL}/api/dimensions/backfill/",
    headers=HEADERS,
    params={"project_id": 42},
)
print(response.json())
