Scoring systems

As usual, any system that has reasonable sensitivity has rubbish specificity (lots of false positives, so poor positive predictive value). Retrospective analysis of 9000 UK A&E attendees <15yrs:

  • Modified Yale Observation Scale (YOS) – sensitivity of 54.0% and specificity of 63.7% at a cut-off of 10.
  • Pediatric Advanced Warning Score (PAWS) – sensitivity of 58.0% and specificity of 81.3% if any ‘red’ sign was present.
  • Alert, Voice, Pain, Unresponsive (AVPU) scale.
  • Recognising Acute Illness in Children (RAIC) score – sensitivity of 76.0% and specificity of 6.2% for ruling out serious bacterial infection at a score of 5 or less.
  • Oxford Vital Signs score – sensitivity of 80.0% and specificity of 49.3% if any sign was present.
  • 2007 version of the NICE fever guideline (now CG160) traffic light system – sensitivity of 100% but specificity of only 0.12% if any ‘amber’ or ‘red’ sign was present; sensitivity of 62% and specificity of 74.5% if any ‘red’ sign was present. But the data available covered only a selection of the red and amber features.

[Verbakel, Pediatric Emergency Care 2014; 30: 373–80]
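
What a positive or negative score means for an individual child depends on prevalence as well as on sensitivity and specificity. The sketch below converts some of the figures listed above into predictive values. The 10% prevalence of serious infection is a made-up assumption for illustration only, not a figure from the study, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) as fractions for a test used at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# ASSUMPTION: hypothetical prevalence, chosen only to illustrate the arithmetic.
ASSUMED_PREVALENCE = 0.10

# (sensitivity, specificity) pairs as quoted in the list above.
scores = {
    "PAWS, any red sign": (0.58, 0.813),
    "RAIC, score of 5 or less": (0.76, 0.062),
    "Oxford Vital Signs, any sign": (0.80, 0.493),
}

for name, (sens, spec) in scores.items():
    ppv, npv = predictive_values(sens, spec, ASSUMED_PREVALENCE)
    print(f"{name}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Changing ASSUMED_PREVALENCE shows how strongly the predictive values move with the population being tested, even when sensitivity and specificity are held constant.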

The same author applied the rules to different datasets across the UK, the Netherlands and Belgium, and found that all performed worse than in their original derivation studies, and also varied widely across datasets, e.g. NICE CG160 specificity ranged from 1% to 28.7%! The differences are hard to explain. [BMC Medicine 2013; 11: 10]

Lacour scoring system (“Laboratory-Score”), based on CRP, PCT and urinalysis, has sensitivity of 94% and specificity of 81%. Would reduce the rate of antibiotic use from 65% to 40%, but is that good enough? [Lacour, PIDJ 2008 PMID 18536624]
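
One way to think about "good enough" is to turn the quoted sensitivity and specificity into expected numbers per 1000 children. This is a rough sketch only: the 10% prevalence of serious bacterial infection is a hypothetical assumption, not a figure from the Lacour paper, and the function name is made up for illustration.

```python
def expected_counts(sensitivity, specificity, prevalence, n=1000):
    """Expected outcome counts when a test is applied to n children."""
    infected = n * prevalence
    not_infected = n - infected
    return {
        "true_pos": infected * sensitivity,
        "missed": infected * (1 - sensitivity),          # false negatives
        "false_pos": not_infected * (1 - specificity),   # flagged but not infected
        "true_neg": not_infected * specificity,
    }


# ASSUMPTION: hypothetical prevalence, for illustration only.
ASSUMED_PREVALENCE = 0.10

# Sensitivity 94% and specificity 81% as quoted above for the Laboratory-Score.
counts = expected_counts(0.94, 0.81, ASSUMED_PREVALENCE)
for label, value in counts.items():
    print(f"{label}: {value:.0f} per 1000")
```

Whether the number of missed infections is acceptable in exchange for fewer antibiotic prescriptions is a clinical judgement that the arithmetic cannot settle.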