Shift-share decomposition of W14 (Apr 4–10) vs W13 (Mar 28–Apr 3), isolating rate effect, weight effect, and their interaction. EMEA drives 128% of the decline; APAC offsets 31%.
Each segment's total contribution is decomposed into three additive components: rate, weight, and interaction. A positive % of Δ means the segment contributed to the decline; a negative value means it offset the decline.
The rate effect (−2.64pp) alone would have caused a 2.64pp decline if the mix hadn't shifted favorably. The actual −1.63pp decline is smaller only because of favorable weight rebalancing.
Rate effect (161%) tells us that accuracy degradation within segments, holding mix constant, more than fully explains the decline. This is the "quality got worse" signal.
Weight effect (offset 47%) means the mix actually shifted favorably: segments with above-average accuracy gained share. Without this, the decline would have been ~2.4pp (−2.64 + 0.24) instead of 1.63pp.
Interaction (offset 15%) captures the joint effect — segments that lost accuracy also tended to shrink in weight, providing a small additional buffer.
The sum: −2.64 + 0.76 + 0.24 = −1.64pp, matching the observed −1.63pp global decline up to rounding of the components.
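The three components can be written out explicitly. The form below is an assumption inferred from the table values in this report (it reproduces the per-segment Rate, Weight, and Inter columns to within rounding), not a definition stated by the report. Here $w_i$ is a segment's global weight, $a_i$ its accuracy, $A$ the global accuracy, and the superscripts 13 and 14 denote the week:

```latex
\begin{aligned}
\Delta A &= \sum_i w_i^{14} a_i^{14} - \sum_i w_i^{13} a_i^{13} \\
&= \underbrace{\sum_i w_i^{13}\,\bigl(a_i^{14}-a_i^{13}\bigr)}_{\text{rate}}
 + \underbrace{\sum_i \bigl(w_i^{14}-w_i^{13}\bigr)\bigl(a_i^{13}-A^{13}\bigr)}_{\text{weight}}
 + \underbrace{\sum_i \bigl(w_i^{14}-w_i^{13}\bigr)\bigl(a_i^{14}-a_i^{13}\bigr)}_{\text{interaction}}
\end{aligned}
```

Centering the weight term on $A^{13}$ leaves the total unchanged as long as the weights sum to 1 in both weeks, because the weight changes then sum to zero. It only makes each segment's weight effect read as "gained or lost share relative to the global mean".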
Fuzzy cases are counted as errors in OEA. Global fuzzy rate rose from 2.00% to 2.35% (+0.36pp on unrounded values). This increase directly cost 0.36pp of accuracy, explaining ~22% of the total decline. The remaining 78% is genuine accuracy degradation.
| Hub | FR W14 | FR W13 | Δ FR | Acc Δ total | Fuzzy explains | Non-fuzzy Δ | Verdict |
|---|---|---|---|---|---|---|---|
| AMS | 1.76% | 1.57% | +0.19pp | −0.12pp | −0.19pp | +0.06pp | Entire decline is fuzzy-driven. Non-fuzzy accuracy actually improved. |
| APAC | 2.08% | 1.59% | +0.49pp | +0.92pp | −0.49pp | +1.41pp | Fuzzy headwind absorbed — non-fuzzy quality improved strongly (+1.41pp). |
| EMEA | 3.12% | 2.86% | +0.26pp | −6.48pp | −0.26pp | −6.21pp | 96% of EMEA's decline is non-fuzzy. Fuzzy is a minor factor here. |
| Global | 2.35% | 2.00% | +0.36pp | −1.63pp | −0.36pp | −1.28pp | Fuzzy = 22%, non-fuzzy = 78% |
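
The split in the table above can be reproduced with a few lines of arithmetic. This is a minimal sketch under the assumption, taken from the statement that fuzzy cases count as errors, that a +x pp change in fuzzy rate costs exactly x pp of accuracy. The function name `split_fuzzy` and its parameters are hypothetical, chosen for illustration.

```python
def split_fuzzy(acc_delta_pp: float, fuzzy_rate_delta_pp: float) -> dict:
    """Split an accuracy change (in pp) into fuzzy and non-fuzzy parts.

    Assumption: each additional fuzzy case is one additional error, so the
    fuzzy part of the accuracy change is minus the fuzzy-rate change.
    """
    fuzzy_part = -fuzzy_rate_delta_pp
    non_fuzzy_part = acc_delta_pp - fuzzy_part
    # Share of the total change explained by fuzzy; undefined if no change.
    share = fuzzy_part / acc_delta_pp if acc_delta_pp else float("nan")
    return {"fuzzy": fuzzy_part, "non_fuzzy": non_fuzzy_part, "fuzzy_share": share}


# Inputs are the rounded values from the EMEA and Global rows of the table.
emea = split_fuzzy(acc_delta_pp=-6.48, fuzzy_rate_delta_pp=0.26)
global_ = split_fuzzy(acc_delta_pp=-1.63, fuzzy_rate_delta_pp=0.36)

print(f"EMEA fuzzy share:   {emea['fuzzy_share']:.0%}")
print(f"Global fuzzy share: {global_['fuzzy_share']:.0%}")
```

Because the inputs are already rounded to two decimals, the non-fuzzy parts computed this way can differ from the table by about 0.01pp.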
AMS accuracy fell just 0.12pp, and the entire decline is explained by the fuzzy rate increase (+0.19pp). Once fuzzy is stripped out, AMS non-fuzzy accuracy actually improved by 0.06pp.
This means AMS's labeling quality is holding steady or improving — the headline number is being dragged by borderline cases being reclassified or new ambiguous content types entering the pipeline.
Action: Consider fuzzy calibration or policy clarification for the specific content types driving the 0.19pp fuzzy increase. This is a recoverable loss.
APAC's reported accuracy improved +0.92pp, but the underlying non-fuzzy improvement is actually +1.41pp — being partially masked by a +0.49pp fuzzy rate increase (the largest of any hub).
APAC absorbed the biggest fuzzy headwind and still delivered the best headline improvement. However, the fuzzy trend (+0.49pp WoW) needs monitoring — if it continues, it will eventually overwhelm the quality gains.
Action: Investigate whether policy updates or new content types in APAC are driving the fuzzy surge. The quality fundamentals are strong, but the fuzzy trajectory is concerning.
EMEA's fuzzy rate only increased +0.26pp, explaining just 4% of its massive −6.48pp accuracy decline. The remaining −6.21pp is pure non-fuzzy accuracy degradation.
This definitively rules out "borderline cases" as an explanation for EMEA's performance. The problem is fundamentally about labeler accuracy, policy interpretation, or operational execution — not content ambiguity.
EMEA also has the highest absolute fuzzy rate (3.12% vs 2.08% APAC, 1.76% AMS), suggesting a structural baseline of ambiguity in its content mix, but the week-over-week change is small.
EMEA's three project types account for 129% of the decline. APAC General Recall is the largest single offset (−30.6%).
EMEA Appeal fell 82.7% → 76.6% (−6.06pp) while still carrying 15.6% of global weight. The rate effect (−1.156pp) is the single largest driver in the entire decomposition. Even the weight shrinkage couldn't offset it.
| Hub | Type | Acc W14 | Acc W13 | Δ Acc | GWt W14 | GWt W13 | Rate | Weight | Inter | Total | % of Δ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| EMEA | Appeal | 76.6% | 82.7% | −6.06 | 15.6% | 19.1% | −1.156 | +0.113 | +0.212 | −0.831 | 50.9% |
| EMEA | General Recall | 84.2% | 91.6% | −7.37 | 12.1% | 10.0% | −0.734 | +0.119 | −0.156 | −0.771 | 47.1% |
| EMEA | Analytics Appeal | 77.7% | 89.7% | −12.01 | 4.7% | 3.2% | −0.387 | +0.055 | −0.174 | −0.505 | 30.9% |
| AMS | General Recall | 84.4% | 85.8% | −1.37 | 14.3% | 9.5% | −0.130 | −0.005 | −0.066 | −0.200 | 12.2% |
| APAC | Appeal | 85.4% | 85.7% | −0.27 | 15.9% | 18.8% | −0.052 | +0.006 | +0.008 | −0.037 | 2.3% |
| Negative subtotal | | | | | | | | | | −2.345 | 143.4% |
| AMS | Appeal | 72.9% | 78.7% | −5.86 | 5.0% | 10.5% | −0.614 | +0.394 | +0.320 | +0.100 | −6.1% |
| AMS | Analytics Appeal | 79.0% | 62.7% | +16.35 | 0.5% | 0.6% | +0.102 | +0.038 | −0.027 | +0.113 | −6.9% |
| APAC | General Recall | 90.1% | 88.6% | +1.49 | 27.5% | 24.1% | +0.359 | +0.091 | +0.051 | +0.501 | −30.6% |
| Positive subtotal | | | | | | | | | | +0.714 | −43.7% |
The −1.156pp rate effect is the single largest driver in this decomposition. EMEA Appeal dropped from 82.7% to 76.6%, a −6.06pp swing, while still carrying 15.6% global weight.
The weight did shrink (19.1% → 15.6%), which partially offset the damage (+0.113pp weight effect, +0.212pp interaction), but the sheer magnitude of the accuracy collapse overwhelms both offsets.
Key question: Is this driven by specific BPO sites, policy updates, or labeler calibration drift? See the "Top Projects" tab for project-level decomposition.
APAC General Recall improved from 88.6% to 90.1% (+1.49pp) while also gaining weight (24.1% → 27.5%). This is the ideal scenario: an above-average segment both improves and grows.
All three effects are positive: rate (+0.359pp), weight (+0.091pp), interaction (+0.051pp), summing to +0.501pp — the single largest offset at −30.6% of the decline.
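The per-segment numbers above can be approximated from the four inputs in each table row. The sketch below assumes the component definitions that the table values imply (rate = W13 weight × Δ accuracy, weight = Δ weight × gap to the W13 global mean, interaction = Δ weight × Δ accuracy) and uses the 85.9% W13 global mean quoted elsewhere in this report. `Segment`, `decompose`, and the constants are hypothetical names for illustration.

```python
from dataclasses import dataclass

GLOBAL_ACC_W13 = 85.9   # W13 global accuracy in %, as quoted in the report
GLOBAL_DELTA = -1.63    # observed global change in pp


@dataclass
class Segment:
    name: str
    acc_w13: float  # accuracy in %
    acc_w14: float
    wt_w13: float   # global weight in %
    wt_w14: float


def decompose(seg: Segment, global_acc_w13: float) -> dict:
    """Return rate, weight, interaction, and total contributions in pp."""
    d_acc = seg.acc_w14 - seg.acc_w13
    d_wt = (seg.wt_w14 - seg.wt_w13) / 100.0
    rate = seg.wt_w13 / 100.0 * d_acc
    weight = d_wt * (seg.acc_w13 - global_acc_w13)
    interaction = d_wt * d_acc
    total = rate + weight + interaction
    return {"rate": rate, "weight": weight, "interaction": interaction,
            "total": total, "pct_of_delta": total / GLOBAL_DELTA * 100.0}


# Rounded inputs copied from the table; results match it only approximately.
emea_appeal = Segment("EMEA Appeal", 82.7, 76.6, 19.1, 15.6)
apac_gr = Segment("APAC General Recall", 88.6, 90.1, 24.1, 27.5)

for seg in (emea_appeal, apac_gr):
    parts = decompose(seg, GLOBAL_ACC_W13)
    print(seg.name, {k: round(v, 3) for k, v in parts.items()})
```

With these rounded inputs the results land within about 0.01pp of the table's Rate, Weight, Inter, and Total columns. The residual comes from the table being computed on unrounded accuracies and weights.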
MENA1 + EN + SSA + DE + MENA2 = 110% of global decline, almost entirely rate-driven. SSA is a compounding case — weight grew while accuracy fell.
| Market | Acc W14 | Acc W13 | Δ Acc | GWt W14 | GWt W13 | Rate | Weight | Inter | Total | % of Δ |
|---|---|---|---|---|---|---|---|---|---|---|
| MENA1 | 80.5% | 90.4% | −9.89 | 6.26% | 6.46% | −0.639 | −0.009 | +0.020 | −0.628 | 38.4% |
| EN (GB) | 78.8% | 88.5% | −9.67 | 3.56% | 4.18% | −0.404 | −0.016 | +0.061 | −0.360 | 22.0% |
| SSA | 75.9% | 84.2% | −8.22 | 3.65% | 2.46% | −0.202 | −0.021 | −0.099 | −0.321 | 19.7% |
| DE | 77.4% | 86.8% | −9.42 | 2.93% | 2.90% | −0.273 | +0.000 | −0.003 | −0.276 | 16.9% |
| MENA2 | 75.8% | 80.9% | −5.08 | 4.28% | 4.37% | −0.222 | +0.005 | +0.005 | −0.213 | 13.0% |
| IT | 84.2% | 93.2% | −9.07 | 2.43% | 2.27% | −0.205 | +0.012 | −0.015 | −0.208 | 12.7% |
| IL | 67.2% | 83.8% | −16.55 | 0.36% | 0.43% | −0.071 | +0.001 | +0.011 | −0.059 | 3.6% |
| UA | 74.3% | 77.3% | −3.04 | 1.08% | 1.04% | −0.032 | −0.003 | −0.001 | −0.036 | 2.2% |
MENA1 dropped from 90.4% to 80.5% (−9.89pp) while maintaining roughly stable weight (6.46% → 6.26%). The rate effect (−0.639pp) almost entirely explains its contribution.
This is a nearly pure accuracy regression — no confounding mix shifts. The investigation should focus on what changed in MENA1 labeling quality, policy interpretation, or task distribution during W14.
Among the five largest market contributors, SSA is the only one where rate, weight, and interaction are all negative.
Rate (−0.202pp): accuracy fell from 84.2% to 75.9%, a −8.22pp drop.
Weight (−0.021pp): SSA's weight grew from 2.46% to 3.65%, but since SSA accuracy (84.2%) was below the W13 global mean (85.9%), this expansion hurts.
Interaction (−0.099pp): the weight grew AND accuracy fell simultaneously — the worst combination.
Key question: Was the SSA weight increase intentional (ramp-up)? If so, quality support did not scale with volume.
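A compounding case like this can be flagged mechanically from the market table. The sketch below copies the Rate, Weight, and Inter columns from that table and flags markets where all three are negative and the total is material. The 0.1pp materiality threshold and the name `is_compounding` are assumptions made for illustration, not part of the report's method.

```python
# (rate, weight, interaction) in pp, copied from the market table.
markets = {
    "MENA1":   (-0.639, -0.009, +0.020),
    "EN (GB)": (-0.404, -0.016, +0.061),
    "SSA":     (-0.202, -0.021, -0.099),
    "DE":      (-0.273, +0.000, -0.003),
    "MENA2":   (-0.222, +0.005, +0.005),
    "IT":      (-0.205, +0.012, -0.015),
    "IL":      (-0.071, +0.001, +0.011),
    "UA":      (-0.032, -0.003, -0.001),
}


def is_compounding(rate: float, weight: float, inter: float,
                   min_total_pp: float = 0.1) -> bool:
    """True when all three effects are negative and the total is material.

    min_total_pp is a hypothetical cutoff that filters out segments whose
    combined contribution is too small to matter globally.
    """
    all_negative = rate < 0 and weight < 0 and inter < 0
    return all_negative and (rate + weight + inter) <= -min_total_pp


compounding = [name for name, parts in markets.items() if is_compounding(*parts)]
print(compounding)
```

Lowering `min_total_pp` to zero would also flag UA, whose three effects are all negative but sum to only −0.036pp.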
IL has the most dramatic accuracy decline of any market (83.8% → 67.2%, −16.55pp), but its small weight (0.36%) limits global impact to just −0.059pp (3.6% of decline).
Still worth flagging: a 16.5pp drop likely indicates a systemic issue — new policy, labeler turnover, or task type change — that could worsen if IL weight increases.
EMEA non-GCP alone accounts for 105% of the global decline, almost purely through rate effect. EMEA GCP's weight surge (0.85%→2.98%) into a low-accuracy segment cost another −0.38pp.
With a −5.8pp accuracy drop, EMEA non-GCP generates a −1.72pp contribution (105.2% of total). Meanwhile, EMEA GCP weight more than tripled into a 74%-accuracy segment: a weight trap costing −0.38pp even though its rate effect is tiny.
| Hub | Type | Acc W14 | Acc W13 | GWt W14 | GWt W13 | Rate | Weight | Inter | Total | % of Δ |
|---|---|---|---|---|---|---|---|---|---|---|
| EMEA | non-GCP | 80.7% | 86.5% | 29.4% | 31.4% | −1.824 | −0.011 | +0.115 | −1.720 | 105.2% |
| EMEA | GCP | 69.8% | 74.0% | 2.98% | 0.85% | −0.036 | −0.254 | −0.090 | −0.379 | 23.2% |
| AMS | non-GCP | 81.9% | 83.0% | 10.9% | 10.9% | −0.128 | −0.000 | −0.000 | −0.128 | 7.8% |
| APAC | GCP | 82.4% | 85.1% | 2.69% | 3.32% | −0.090 | +0.005 | +0.017 | −0.067 | 4.1% |
| Negative subtotal | | | | | | | | | −2.295 | 140.4% |
| AMS | GCP | 80.8% | 79.8% | 8.80% | 9.66% | +0.097 | +0.053 | −0.009 | +0.141 | −8.7% |
| APAC | non-GCP | 89.0% | 87.9% | 45.2% | 43.8% | +0.477 | +0.027 | +0.015 | +0.519 | −31.7% |
| Positive subtotal | | | | | | | | | +0.661 | −40.4% |
Unlike most segments, where the rate effect dominates, EMEA GCP's primary damage vector is the weight effect (−0.254pp). Weight grew 3.5x (0.85% → 2.98%), but EMEA GCP's W13 accuracy (74.0%) is 11.9pp below the W13 global mean (85.9%).
The rate effect is tiny (−0.036pp) because the starting weight was so small. But the interaction (−0.090pp) compounds things: weight grew while accuracy also fell (74.0% → 69.8%).
Key question: Is this a deliberate GCP ramp-up or an allocation error? If it's a ramp-up, quality support needs to precede or accompany the volume increase.
APAC non-GCP is the largest single segment by weight (45.2%) and improved from 87.9% → 89.0%. All three effects are positive, delivering +0.519pp total offset (−31.7% of decline).
This segment single-handedly prevented the global decline from being ~2.15pp instead of 1.63pp. Maintaining APAC non-GCP stability is critical to holding the floor.
Top 10 projects by negative contribution. GCP-appeal-GB-en-ALR-MNL leads at 25.1%: weight surged 6x while accuracy crashed from 100% → 69.9%. Three of the top five are Appeal or AA projects whose weight expanded while their W14 accuracy sat below the global mean.
| Project | Type | Acc W14 | Acc W13 | GWt W14 | GWt W13 | Rate | Weight | Inter | Total | % of Δ |
|---|---|---|---|---|---|---|---|---|---|---|
| GCP-appeal-GB-en-ALR-MNL | Appeal | 69.9% | 100.0% | 2.25% | 0.36% | −0.108 | +0.266 | −0.568 | −0.410 | 25.1% |
| AA-MENA2-ar-T&S-CAS | AA | 67.1% | 73.5% | 2.25% | 0.51% | −0.033 | −0.215 | −0.112 | −0.360 | 22.0% |
| GR General-MENA1-ku-CNX-ANK | GR | 84.4% | 96.8% | 2.59% | 2.61% | −0.322 | −0.002 | +0.002 | −0.321 | 19.7% |
| appeal-KE/TZ/UG-sw-TP-NBO | Appeal | 69.4% | 81.4% | 1.12% | 0.77% | −0.092 | −0.016 | −0.043 | −0.151 | 9.2% |
| GCP-GR General-GB-en-TP-ALB | GR | 58.9% | 92.7% | 0.06% | 1.40% | −0.473 | −0.091 | +0.451 | −0.113 | 6.9% |
| GCP-appeal-IT-it-TP-BRV | Appeal | 84.7% | 92.3% | 0.87% | 1.60% | −0.122 | −0.046 | +0.055 | −0.113 | 6.9% |
| appeal-MENA1-other-TP-MAK | Appeal | 63.4% | 96.1% | 0.22% | 0.53% | −0.174 | −0.032 | +0.101 | −0.104 | 6.4% |
| GCP-GR General-DE-de-TLS-LEJ | GR | 75.4% | 85.4% | 1.00% | 0.46% | −0.046 | −0.003 | −0.054 | −0.102 | 6.3% |
| appeal-MENA1-ar-CNX-IBD | Appeal | 74.2% | 78.9% | 1.48% | 1.17% | −0.056 | −0.021 | −0.014 | −0.091 | 5.6% |
| GR General-MENA1-ar-TP-MAK | GR | N/A | 100% | 0.00% | 0.53% | −0.534 | −0.075 | +0.534 | −0.075 | 4.6% |
This project's weight surged 6.25x (0.36% → 2.25%) while accuracy crashed from 100% → 69.9%. The interaction effect (−0.568pp) is the largest single component — weight grew dramatically while accuracy fell dramatically.
The weight effect is actually positive (+0.266pp) because the project was above the global mean in W13 (100% vs 85.9%). But the interaction overwhelms it: expanding into what became a low-accuracy segment is a compounding failure.
Key question: Was this a deliberate ramp-up of a previously small project? If so, quality controls didn't scale with volume.
Weight grew from 0.51% → 2.25% (4.4x) while accuracy was already below the global mean (73.5% vs 85.9%) and fell further to 67.1%. The weight effect alone (−0.215pp) is the largest component, so this is a mix-shift problem, not primarily a rate problem.
All three effects are negative: rate (−0.033), weight (−0.215), interaction (−0.112). A triple headwind totaling −0.360pp (22.0% of decline).
Action: Validate whether this weight increase was intentional. Expanding a chronically below-mean segment without quality uplift compounds the global decline.
This is the cleanest rate-driven case in the top 10: weight barely moved (2.61% → 2.59%), so the rate effect (−0.322pp) almost entirely explains the −0.321pp total contribution.
Accuracy dropped from 96.8% → 84.4% (−12.4pp) — a steep fall from a high base. No mix-shift or weight excuses here; something changed in execution quality.
Action: Investigate what changed for Kurdish-language GR in MENA1 during W14 — policy update, new labeler cohort, or calibration drift.
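The three project write-ups above each turn on which component is largest. A minimal sketch of that triage, assuming "dominant" means the component with the largest absolute value, with values copied from the top-projects table. The helper name `dominant_component` is hypothetical.

```python
# Component values in pp, copied from the top three rows of the projects table.
projects = {
    "GCP-appeal-GB-en-ALR-MNL": {"rate": -0.108, "weight": +0.266, "interaction": -0.568},
    "AA-MENA2-ar-T&S-CAS": {"rate": -0.033, "weight": -0.215, "interaction": -0.112},
    "GR General-MENA1-ku-CNX-ANK": {"rate": -0.322, "weight": -0.002, "interaction": +0.002},
}


def dominant_component(parts: dict) -> str:
    """Return the name of the component with the largest magnitude."""
    return max(parts, key=lambda name: abs(parts[name]))


dominant = {name: dominant_component(parts) for name, parts in projects.items()}
for name, component in dominant.items():
    print(f"{name}: {component}-driven")
```

Grouping projects this way separates execution-quality problems (rate-driven) from allocation problems (weight- or interaction-driven), which call for different follow-up questions.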
P0 (immediate): DE-LEJ site — likely site-level outage/failure, accounts for 63.7% of decline from just two projects. Quick root cause identification could recover the most impact.
P1 (this week): Four N/A projects — verify if delivery gaps are fixable. If unplanned, restoring these could offset 167% of the decline (they overlap with rate-driven decline).
P1 (this week): MENA1 accuracy regression — 38.4% of decline, pure rate effect. Check if a policy update or labeler calibration issue occurred during W14.
P2 (track): SSA triple headwind and EMEA GCP weight surge — these are structural issues that need monitoring over W15–W16 to determine if they're transient or persistent.
P2 (track): APAC fuzzy rate surge (+0.49pp) — quality fundamentals are strong but the fuzzy trajectory needs monitoring. AMS fuzzy calibration is a quick-win candidate.