If you are deploying Label Studio Enterprise or Label Studio Starter on-prem and are upgrading from a pre-2.34 release, you will need to run a script to backfill project and task states.
What are project and task states
Project and task states were introduced to Label Studio Enterprise on-prem in 2.34.
These states allow you to accurately model and manage the comprehensive lifecycle of your projects and tasks.
What this script does
Task and project states are automatically implemented for any new projects and tasks you create after upgrading.
However, for these states to be reflected in legacy projects, you will need to run a backfill script.
Who needs to run this script
This script only needs to be run once, and is only needed for organizations that are:
- On-prem: Running Label Studio Enterprise or Label Studio Starter
- Upgrading: Upgrading from a pre-2.34 release. New organizations do not need to migrate.
What happens if you do not run this script
If you do not migrate, nothing in your deployment will break. However:
- Inconsistent UI indicators: Legacy projects and tasks will lack the visual indicators that new projects and tasks have.
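- Future feature development: We will continue to build new, advanced functionality on top of these states, and that functionality may not be available for legacy projects and tasks that have not been backfilled.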
Effects while running the script
- Zero Downtime: You do not need to pause labeling operations. The system remains fully accessible to all users.
- Background Resource Usage: The migration is designed to trigger asynchronous, parallel background jobs on a per-project basis. During the execution period, you may observe an increase in background processing load as historical tables are evaluated and passively synchronized to the new state models.
How to migrate
You can trigger and monitor the migration programmatically without needing to use the Label Studio UI.
The migration operates on your entire organization simultaneously. See Trigger state backfill for organization in our API reference.
Method A: Using the Python SDK (Recommended)
The Label Studio Python SDK provides a simplified, programmatic interface to trigger the migration.
Step 1. Trigger the Backfill
Initialize the client and execute the backfill command.
from label_studio_sdk import Client
# Initialize the client with your LSE instance URL and Admin API key
ls = Client(url="https://app.labelstud.io", api_key="YOUR_ADMIN_API_KEY")
# Trigger the backfill process for the organization
response = ls.fsm.trigger_backfill()
# Store the returned job ID for monitoring
job_id = response['job_id']
print(f"Migration started. Job ID: {job_id}")
Step 2. Monitor Progress
Use the stored job_id to poll the API and check the status of the migration.
import time

while True:
    status_response = ls.fsm.get_backfill_status(job_id=job_id)
    status = status_response['status']

    if status == 'COMPLETED':
        print("Migration successfully completed!")
        progress = status_response.get('progress_data', {})
        print(f"Projects Migrated: {progress.get('successful_projects')} / {progress.get('total_projects')}")
        break
    elif status in ['FAILED', 'CANCELED']:
        print(f"Migration stopped with status: {status}")
        break

    print(f"Migration in progress... Current status: {status}")
    time.sleep(30)  # Poll every 30 seconds
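The loop above polls until a terminal status arrives, with no upper bound on how long it waits. A minimal sketch of a bounded variant follows, assuming the same status values (`COMPLETED`, `FAILED`, `CANCELED`). The names `wait_for_backfill` and `fetch_status` are hypothetical helpers introduced for illustration and are not part of the SDK.

```python
import time

# Terminal status values taken from the polling example above.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELED"}


def wait_for_backfill(fetch_status, interval=30.0, timeout=3600.0, sleep=time.sleep):
    """Poll fetch_status() until a terminal status is returned or the timeout elapses.

    fetch_status is a hypothetical zero-argument callable that returns the
    status payload as a dict, for example {"status": "IN_PROGRESS"}.
    """
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        if payload["status"] in TERMINAL_STATUSES:
            return payload
        # Give up if the next poll would land after the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(
                f"Backfill still {payload['status']} after {timeout} seconds"
            )
        sleep(interval)
```

With the client from Step 1, this could be called as `wait_for_backfill(lambda: ls.fsm.get_backfill_status(job_id=job_id))`. Passing the fetcher in as a callable keeps the waiting logic independent of how the status is retrieved.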
Method B: Using Direct API Calls (cURL)
If you are not using the Python SDK, you can interact directly with the state endpoints.
Step 1. Trigger the Backfill
Send a POST request to the backend to start the migration.
curl -X POST "https://app.labelstud.io/api/fsm/backfill/" \
  -H "Authorization: Token YOUR_ADMIN_API_KEY" \
  -H "Content-Type: application/json"
Expected Response:
{
  "job_id": "bf_123456789",
  "status": "QUEUED"
}
Step 2. Monitor Progress
Using the job_id from the previous step, poll the status endpoint to track the migration.
curl -X GET "https://app.labelstud.io/api/fsm/backfill/status/?job_id=bf_123456789" \
  -H "Authorization: Token YOUR_ADMIN_API_KEY"
Expected Response (In Progress):
{
  "status": "IN_PROGRESS",
  "progress_data": {
    "total_projects": 150,
    "successful_projects": 45,
    "failed_projects": 0
  }
}
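If you prefer to script these two calls without the SDK or cURL, the following is a minimal sketch using only the Python standard library. It assumes the endpoint paths, the `job_id` query parameter, and the `Token` authorization header shown in the cURL examples. The function names are hypothetical, and the URL and API key are placeholders to replace with your own values.

```python
import json
import urllib.parse
import urllib.request

# Placeholder values copied from the cURL examples; replace them with your own.
BASE_URL = "https://app.labelstud.io"
API_KEY = "YOUR_ADMIN_API_KEY"


def build_trigger_request(base_url=BASE_URL, api_key=API_KEY):
    """Build the POST request that starts the backfill."""
    return urllib.request.Request(
        url=f"{base_url}/api/fsm/backfill/",
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "application/json",
        },
    )


def build_status_request(job_id, base_url=BASE_URL, api_key=API_KEY):
    """Build the GET request that checks the status of a backfill job."""
    query = urllib.parse.urlencode({"job_id": job_id})
    return urllib.request.Request(
        url=f"{base_url}/api/fsm/backfill/status/?{query}",
        method="GET",
        headers={"Authorization": f"Token {api_key}"},
    )


def send(request):
    """Send a request and decode the JSON body of the response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical sequence would be `job = send(build_trigger_request())` followed by `send(build_status_request(job["job_id"]))`. Building the requests separately from sending them makes it easy to inspect the URL and headers before anything reaches the server.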
Step 3. Verification
Continue polling until the "status" returns "COMPLETED". You can verify success by ensuring "total_projects" equals "successful_projects" in the final progress_data payload.
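The verification rule above can be expressed as a small check on the status payload. This is a sketch that assumes the payload shape shown in the example responses; `backfill_succeeded` is a hypothetical helper name, not part of the API or SDK.

```python
def backfill_succeeded(status_payload):
    """Return True when the job completed and every project was migrated."""
    progress = status_payload.get("progress_data", {})
    total = progress.get("total_projects")
    successful = progress.get("successful_projects")
    return (
        status_payload.get("status") == "COMPLETED"
        and total is not None
        and total == successful
    )
```

For example, the in-progress response shown earlier (45 of 150 projects) would return `False`, while a `COMPLETED` payload reporting 150 of 150 would return `True`.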
Migration time
The backfill takes approximately 1 minute per project on average.
Because projects are processed as parallel background jobs, total duration depends on your organization's total number of projects and overall data volume:
| Organization Size | Approximate Duration |
|---|---|
| Small (<10K entities) | < 1 minute |
| Medium (10K-100K entities) | 1-5 minutes |
| Large (100K-1M entities) | 5-10 minutes |
| Very Large (1M+ entities) | 10-15 minutes |
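For planning scripts, the table above can be encoded as a simple lookup. This is an illustrative sketch only: `estimate_backfill_duration` is a hypothetical helper, and it assumes that a count falling exactly on a boundary (for example, 10,000) belongs to the larger size category, which the table does not specify.

```python
# Upper bounds (exclusive) and durations copied from the table above.
# Assumption: a count exactly on a boundary falls into the larger category.
DURATION_BY_SIZE = [
    (10_000, "< 1 minute"),
    (100_000, "1-5 minutes"),
    (1_000_000, "5-10 minutes"),
]


def estimate_backfill_duration(entity_count):
    """Return the approximate duration listed for an organization's entity count."""
    if entity_count < 0:
        raise ValueError("entity_count must not be negative")
    for upper_bound, duration in DURATION_BY_SIZE:
        if entity_count < upper_bound:
            return duration
    return "10-15 minutes"
```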