RT Journal Article SR Electronic T1 A validation study of a new classification algorithm to identify rheumatoid arthritis using administrative health databases: case–control and cohort diagnostic accuracy studies. Results from the RECord linkage On Rheumatic Diseases study of the Italian Society for Rheumatology JF BMJ Open JO BMJ Open FD British Medical Journal Publishing Group SP e006029 DO 10.1136/bmjopen-2014-006029 VO 5 IS 1 A1 Carrara, Greta A1 Scirè, Carlo A A1 Zambon, Antonella A1 Cimmino, Marco A A1 Cerra, Carlo A1 Caprioli, Marta A1 Cagnotto, Giovanni A1 Nicotra, Federica A1 Arfè, Andrea A1 Migliazza, Simona A1 Corrao, Giovanni A1 Minisola, Giovanni A1 Montecucco, Carlomaurizio YR 2015 UL http://bmjopen.bmj.com/content/5/1/e006029.abstract AB Objectives To develop and validate a new algorithm to identify patients with rheumatoid arthritis (RA) and estimate disease prevalence using administrative health databases (AHDs) of the Italian Lombardy region. Design Case–control and cohort diagnostic accuracy study. Methods In a randomly selected sample of 827 patients drawn from a tertiary rheumatology centre (training set), clinically validated diagnoses were linked to administrative data including diagnostic codes and drug prescriptions. An algorithm in steps of decreasing specificity was developed, and its accuracy was assessed by calculating sensitivity, specificity, positive predictive value (PPV) and negative predictive value, with corresponding CIs. The algorithm was applied to two validation sets: 106 patients from a secondary rheumatology centre and 6087 participants from primary care. Alternative algorithms were developed to increase PPV at population level.
Crude prevalence estimates, and estimates adjusted for algorithm misclassification rates, were obtained for the Lombardy region. Results The algorithms included: RA certification by a rheumatologist, certification for other autoimmune diseases by specialists, RA code in the hospital discharge form, prescription of disease-modifying antirheumatic drugs and oral glucocorticoids. In the training set, a four-step algorithm identified clinically diagnosed RA cases with a sensitivity of 96.3% (95% CI 93.6% to 98.2%) and a specificity of 90.3% (87.4% to 92.7%). Both external validations showed highly consistent results. More specific algorithms achieved >80% PPV at the population level. The crude RA prevalence in Lombardy was 0.52%, and estimates adjusted for misclassification ranged from 0.31% (95% CI 0.14% to 0.42%) to 0.37% (0.25% to 0.47%). Conclusions AHDs are valuable tools for the identification of RA cases at the population level, and allow estimation of disease prevalence and selection of retrospective cohorts.