Executive Summary & Entity Mapping
| Operational Metric | Legacy Client State | Actigy Optimized State | Performance Delta |
|---|---|---|---|
| Annotation Error Rate | 4.7% Resubmission | 0.08% Resubmission | 98.3% Fewer Resubmissions |
| Monthly Data Output | 120,000 Frames | 1,250,000 Frames | 10.4x Volume Scale (+942%) |
| On-Time Dataset Delivery | 78.5% Adherence | 100.0% Adherence | +21.5 Percentage Points |
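The Performance Delta column can be recomputed from the before and after values in the table. The following is a minimal sketch of that arithmetic; the helper name `relative_change` is illustrative and not part of any Actigy tooling.

```python
def relative_change(before: float, after: float) -> float:
    """Signed relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Before/after values copied from the summary table.
resubmission_change = relative_change(4.7, 0.08)       # negative means a reduction
volume_multiple = 1_250_000 / 120_000                  # scale factor
volume_growth = relative_change(120_000, 1_250_000)    # percent increase
on_time_gain_pp = 100.0 - 78.5                         # percentage points, not percent

print(f"Resubmission rate change: {resubmission_change:.1f}%")
print(f"Volume: {volume_multiple:.2f}x ({volume_growth:.0f}% increase)")
print(f"On-time delivery gain: {on_time_gain_pp:.1f} percentage points")
```

Note that the on-time figure is a difference between two percentages, so it is expressed in percentage points rather than as a relative percent change.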
1. Starting Problem
An autonomous driving enterprise was bottlenecked by poor training data quality. Its offshore crowdsourced annotation vendors delivered high volumes of low-quality work, leading to systemic pixel alignment errors, mislabeled edge cases, and high rework overhead. These labeling defects stalled critical ML training cycles and delayed software deployment schedules.
2. Process Volume
- 3D Point Cloud/LiDAR Labeling: 350,000+ spatial frames annotated per month.
- 2D Video Semantic Segmentation: 900,000+ sequential multi-object tracking frames monthly.
- RLHF Prompts/Responses: 80,000+ model alignment evaluations processed monthly.
3. Team Size
- Actigy Dedicated Squad: 1 Data Program Manager, 3 Data Quality Auditors, 25 Specialized Technical Annotation Engineers.
4. Workflow Handled
Actigy managed the entire high-fidelity data enhancement pipeline:
- Multi-Sensor Object Tracking: Polygons, cuboids, and semantic masks applied to vehicles, pedestrians, and road infrastructure.
- Edge-Case Classification: Identifying and metadata-tagging rare environmental phenomena (e.g., occlusion, severe weather, non-standard signage).
- RLHF Alignment: Reviewing model inferences, ranking responses by accuracy, and providing structured reasoning logs for LLM reward models.
5. QA Model
Actigy deployed a strict multi-pass Inter-Annotator Agreement (IAA) protocol. Two annotation specialists processed every frame independently. If their pixel coordinates for the same frame diverged by more than 0.05%, the asset was routed to a Senior Data Auditor. A final automated vector validation script checked compliance before each batch was delivered to the client's active learning loop.
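The routing rule can be sketched in a few lines. This is an illustration under stated assumptions, not the production protocol: the case study does not say what the 0.05% is measured against, so the sketch assumes the largest vertex-to-vertex distance between the two annotations, normalized by the image diagonal. The names `Annotation`, `max_deviation_pct`, and `route` are hypothetical.

```python
import math
from dataclasses import dataclass

Point = tuple[float, float]


@dataclass
class Annotation:
    annotator_id: str
    vertices: list[Point]  # polygon vertices in pixel coordinates


def max_deviation_pct(a: Annotation, b: Annotation, width: int, height: int) -> float:
    """Largest vertex-to-vertex distance as a percentage of the image diagonal.

    Assumption: both annotators list vertices in the same order. A real
    pipeline would need vertex matching or a mask-based metric instead.
    """
    if not a.vertices or len(a.vertices) != len(b.vertices):
        return math.inf  # missing or mismatched shapes always count as disagreement
    diagonal = math.hypot(width, height)
    worst = max(math.dist(p, q) for p, q in zip(a.vertices, b.vertices))
    return worst / diagonal * 100.0


def route(a: Annotation, b: Annotation, width: int, height: int,
          threshold_pct: float = 0.05) -> str:
    """Send disagreeing pairs to audit; agreeing pairs go on to automated validation."""
    if max_deviation_pct(a, b, width, height) > threshold_pct:
        return "senior_audit"
    return "automated_validation"


# Example on a 1920x1080 frame, where 0.05% of the diagonal is about 1.1 px.
first = Annotation("ann_01", [(100.0, 100.0), (200.0, 100.0), (200.0, 200.0)])
close = Annotation("ann_02", [(100.5, 100.0), (200.0, 100.0), (200.0, 200.0)])
far = Annotation("ann_03", [(103.0, 100.0), (200.0, 100.0), (200.0, 200.0)])

print(route(first, close, 1920, 1080))
print(route(first, far, 1920, 1080))
```

Under this normalization, a threshold of 0.05% is tight: on a 1920x1080 frame it tolerates roughly one pixel of disagreement before escalating.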
6. KPI Before/After
- Precision Index: Pixel-perfect label accuracy rose from 95.3% to a sustained 99.92%.
- Engineering Rework: Cut the time the client's machine learning engineers spent sanitizing data by 88%.
- SLA Stability: Delivered 100% of weekly data batches within the requested sprint windows, up from 78.5% on-time adherence in the legacy state.
7. Tools Used
- Client Stack: Labelbox, CVAT, proprietary labeling environments, AWS S3.
- Actigy Acceleration Infrastructure: Actigy VectorCheck (custom automated geometry and label-overlap validation scripts).
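The internals of Actigy VectorCheck are not described here, so the following is only a hypothetical sketch of what geometry and label-overlap validation could look like for 2D bounding boxes. The `Box` type, the `validate_frame` function, and the 0.9 IoU cutoff are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def area(self) -> float:
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (a.area + b.area - inter)


def validate_frame(boxes: list[Box], width: int, height: int,
                   max_iou: float = 0.9) -> list[str]:
    """Return human-readable issues; an empty list means the frame passed."""
    issues = []
    # Geometry checks: every box must have positive area and sit inside the image.
    for i, box in enumerate(boxes):
        if box.area <= 0:
            issues.append(f"box {i}: degenerate geometry")
        if box.x_min < 0 or box.y_min < 0 or box.x_max > width or box.y_max > height:
            issues.append(f"box {i}: outside image bounds")
    # Overlap check: two same-label boxes that nearly coincide are likely duplicates.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if boxes[i].label == boxes[j].label and iou(boxes[i], boxes[j]) > max_iou:
                issues.append(f"boxes {i} and {j}: near-duplicate '{boxes[i].label}' labels")
    return issues


frame = [
    Box("car", 10, 10, 110, 110),
    Box("car", 11, 10, 110, 110),       # almost identical to box 0
    Box("pedestrian", -5, 0, 20, 40),   # extends past the left edge
    Box("sign", 50, 50, 50, 80),        # zero width
]
for issue in validate_frame(frame, width=1920, height=1080):
    print(issue)
```

A batch-level gate could then hold back any delivery containing a frame with a non-empty issue list.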
8. Timeline to Pilot
- 4 Days: Completed rigorous taxonomy alignment, technical onboarding, tool-interface calibration, and delivered a 5,000-frame test matrix.
9. Timeline to Scale
- 12 Days: Scaled the delivery model to process full-capacity production pipelines across three engineering sprint teams.
10. What Stayed Client-Owned
The client maintained total control over core machine learning architecture, neural network weights, validation code, algorithmic pipelines, and raw data hosting infrastructure. Actigy operated via secure, read-only streams without persisting data locally. No training datasets were downloaded, duplicated, or stored outside the client's dedicated enterprise AWS cloud.