Published in The British Medical Journal - 3rd September 2021

Principal findings

In this study, we developed and validated symptom prediction scores for COVID-19 positivity, independently in children, adults and elderly patients in the Nigerian context. The best individual symptom predictors of COVID-19 positivity in children, adults and elderly patients were loss of smell (AUROC 0.56, 95% CI 0.55 to 0.56), either fever or cough (AUROC 0.57, 95% CI 0.56 to 0.58) and difficulty in breathing (AUROC 0.53, 95% CI 0.48 to 0.58), respectively. In adults, all the symptom scores showed similar performance, with the statistically weighted score (AUROC 0.65) performing slightly better than the unweighted (AUROC 0.63) and clinically derived weighted (AUROC 0.64) scores. Similar results were found in children and elderly patients. Overall, none of the symptom scores had good enough discrimination to use in practice.
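As a reading aid for the AUROC values above, the following minimal sketch shows how the AUROC of a single present/absent symptom can be computed, and that for a binary predictor it equals the average of sensitivity and specificity. The data are hypothetical and are not taken from the study; the function name `auroc` is chosen here for illustration only.

```python
def auroc(scores, labels):
    """Probability that a random positive scores higher than a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUROC).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 10 test-positive and 10 test-negative patients.
# The symptom is present (1) in 3 positives and 1 negative.
labels = [1] * 10 + [0] * 10
symptom = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0] + [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

sensitivity = 3 / 10   # positives with the symptom
specificity = 9 / 10   # negatives without the symptom
balanced = (sensitivity + specificity) / 2
area = auroc(symptom, labels)

print(round(area, 3), round(balanced, 3))
```

In this toy example both quantities come out at 0.6, which illustrates why a symptom with very high specificity but low sensitivity can still have an AUROC only slightly above 0.5.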

Strengths and limitations of this study

To the best of our knowledge, this is the first study to have developed and validated symptom prediction scores with a view to aiding prompt recognition of COVID-19 by frontline healthcare workers in the Nigerian context and possibly in sub-Saharan Africa at large. Despite the limited accuracy of the developed prediction tool, the findings are very important for a country with limited capacity for molecular diagnosis of COVID-19, as they provide the evidential basis for advocating more investment in molecular diagnostics by policy makers in Nigeria. We also adopted a transparent methodology and adhered to the TRIPOD reporting statement, hence minimising the vagueness often associated with the reporting of studies on the predictive performance of diagnostic models for COVID-19.9 The approach taken to deriving the weighted scores is also a strength of this study. In accordance with the preferred approach for building prediction models,9 the participatory approach taken to deriving the clinically weighted score can enhance the study's relevance in the medical community.22 The use of beta regression coefficients, as opposed to ORs, in deriving the statistically weighted scores has the advantage of being less prone to bias from small to moderate sample sizes.23 A common limitation of many COVID-19 diagnostic models is bias due to overfitting of the models on data that are not representative of the target population.9 By using the SORMAS database (which hosts data from all over the country) for both derivation and validation in the present study, our findings are considerably generalisable to the COVID-19 situation in Nigeria and less prone to overestimation of COVID-19 risk among individuals tested.
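The idea of weighting symptoms by regression coefficients rather than by ORs can be sketched as follows. This is only an illustration under stated assumptions: the odds ratios are hypothetical, and the scaling rule (dividing each beta coefficient by the smallest positive coefficient and rounding) is one common convention, not necessarily the procedure used in this study. The only fact relied on is that a logistic regression beta coefficient is the natural logarithm of the corresponding OR.

```python
import math

# Hypothetical odds ratios for three symptoms (not taken from the study).
odds_ratios = {"loss_of_smell": 4.0, "fever": 2.0, "cough": 1.5}

# A logistic regression beta coefficient is the natural log of the odds ratio.
betas = {name: math.log(value) for name, value in odds_ratios.items()}

# Assumed scaling rule: divide by the smallest positive beta, then round,
# so that the weakest predictor receives a weight of 1.
base = min(b for b in betas.values() if b > 0)
weights = {name: round(b / base) for name, b in betas.items()}


def weighted_score(present_symptoms):
    """Sum the weights of the symptoms a patient reports."""
    return sum(weights[s] for s in present_symptoms)


def unweighted_score(present_symptoms):
    """Count the symptoms a patient reports, ignoring weights."""
    return len(present_symptoms)


patient = {"loss_of_smell", "cough"}
print(weights, weighted_score(patient), unweighted_score(patient))
```

Because the weights are built on the log-odds scale, they add up in the same way as the terms of the underlying logistic model, which is the usual motivation for preferring beta coefficients to raw ORs when constructing an additive score.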

This study, however, has some limitations that warrant discussion. First, being an analysis of secondary data based on practical recording of routine clinical assessments, the fundamental assumption is that the data recorded on clinical symptoms are reasonably complete; for instance, we assumed that where a symptom was not recorded, it was absent rather than missing. Without any objective means of verifying this assumption, any bias caused by misclassification of individual symptoms could attenuate differences in comparisons, in which case observed differences are likely to be biased towards the null hypothesis. Second, the approach of splitting the dataset for both derivation and validation of symptom scores may have lowered the precision of estimated effects (wider 95% CIs)24 and potentially underestimated prediction performance due to loss of power.25 Moreover, evidence supporting the 10 events per outcome rule of thumb has been found by van Smeden et al26 to be weak. Third, the study lacked detailed clinical laboratory data, such as records for albumin or albumin/globin, direct bilirubin values and red cell distribution width, which have been found to be significant variables in COVID-19 diagnostic models.9 Practically, however, the time and technical requirements for testing these laboratory parameters could limit their clinical utility.

Interpretation and implications of findings

Based on the systematic and critical review of diagnostic scores by Wynants et al, the performance of a diagnostic model is influenced by its composition, with a higher number of clinical and laboratory parameters in a model indicating better predictive performance.9 For instance, studies in China,27 28 Brazil,29 Italy,30 The Netherlands31 and France32 with several clinical and laboratory parameters recorded excellent discriminatory performance, although with substantial evidence of bias9 and limited clinical utility. Conversely, a prediction model (containing a smaller number of symptoms, heart rate, and systolic and diastolic blood pressure) developed by Sun et al33 in Singapore had a poor discriminatory capacity (C statistic: 0.65; 0.57–0.73). There is further evidence to suggest that the discriminatory accuracy of a prediction model, particularly its sensitivity, can be enhanced by including certain variables, such as a combination of loss of smell or taste and fever.34 In the absence of a comparison study from a sub-Saharan African country, it is difficult to fully explain the variation between the findings of the present study and those from elsewhere. Thus, a follow-up study using both clinical and laboratory parameters in a Nigerian setting or elsewhere in sub-Saharan Africa (with a similar healthcare system and demographic structure) is recommended.

Prediction performance of the unweighted score with regard to COVID-19 positivity was better in adults than in children and elderly patients in our study, although the predictive capacity of all the scores was poor overall. This finding has an important implication for NCDC’s current definition of COVID-19 suspected cases, which emphasises acute respiratory symptoms and either travel history within 14 days prior to symptom onset or self-reported contact with a confirmed case.13 Given that our findings are indicative of age dependency of symptoms, it may be useful to review the current case definitions of COVID-19 in Nigeria. For example, we found loss of smell and either fever or cough to be better at predicting COVID-19 positivity in children and adults, respectively, while breathing difficulty was more predictive of the disease in elderly patients. Furthermore, this finding potentially has implications for the clinical utility of the existing suspected case definition in Nigeria,13 given the high proportion of asymptomatic COVID-19 cases8 and a testing system that allows persons who are concerned about their COVID-19 risk to be tested. Thus, to minimise missed diagnoses and overburdening of the healthcare system, with attendant psychological effects on health personnel,35 there is a need for more investment in molecular testing across Nigeria.

Loss of smell recorded the highest specificity with regard to COVID-19 positivity for the three age groups: 98.1% in children, 98.5% in adults and 99.1% in the elderly. However, the present study explored the predictive capacity of loss of smell and taste separately, whereas a combination of both symptoms has been shown to be more predictive.36 Thus, the potential use of both loss of smell and taste to differentiate COVID-19 from endemic febrile and respiratory illnesses with overlapping symptoms in Nigeria, such as malaria and pneumonia, warrants further study. Additionally, using both loss of smell and taste as early indicators of an emerging COVID-19 wave or surge in Nigeria could improve the COVID-19 response, for example, by guiding allocation of already limited testing resources and risk communication, and by aiding decision-making concerning lockdowns and quarantines.37 The poor predictive capacity of cough or fever alone in the present study is congruent with that in a meta-analysis.38

Clinical validity (characterised by sensitivity, specificity and AUROC values) is an important criterion for assessing a clinical prediction tool39 as it reflects the ability of the prediction tool to distinguish between who has an outcome (in this case SARS-CoV-2 infection) and who does not.40 The clinical validity of all our prediction scores was generally poor but appeared to be dependent on the number of symptoms. For instance, in our study, thresholds based on fewer symptoms on the unweighted and weighted (both statistical and clinical) predictive scores were more sensitive than thresholds based on many symptoms in children and adults; the opposite held for specificity, with the ≥4 symptoms threshold recording higher specificity values than lower symptom thresholds. The poor sensitivity of higher symptom thresholds could potentially be attributable to a high proportion of false negatives, suggesting that some symptoms have limited validity for COVID-19 in children and adults. However, the similarity in the predictive performance of various symptom thresholds on the two weighted scores in the elderly suggests that weighting has less predictive value for this population group. The high specificity of higher symptom thresholds could be indicative of a low proportion of false positives, underlining the need to accurately assess symptoms. In practice, there is a trade-off between sensitivity and specificity such that when the consequences of a false positive test are very serious, specificity is prioritised over sensitivity and vice versa.41 This is the case for the various symptom thresholds on the unweighted scale, where specificity is higher than sensitivity. Higher specificity than sensitivity is of practical relevance when the political implication of refusing to test someone with suspected COVID-19 is considered, although higher sensitivity might be given preference over specificity in the early phases of a pandemic before surge capacity is reached.
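The trade-off described above, in which raising the symptom threshold lowers sensitivity and raises specificity, can be illustrated with a minimal sketch. The symptom counts below are hypothetical and are not study data; the function name `sensitivity_specificity` is an illustrative choice.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Classify a patient as screen-positive when score >= threshold,
    then return (sensitivity, specificity) against the test result."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if label == 1 and flagged:
            tp += 1
        elif label == 1:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical unweighted scores (number of symptoms per patient):
# 6 test-positive patients followed by 8 test-negative patients.
scores = [0, 1, 1, 2, 3, 4] + [0, 0, 0, 0, 0, 1, 1, 2]
labels = [1] * 6 + [0] * 8

# Evaluate thresholds of >=1, >=2, >=3 and >=4 symptoms.
table = {k: sensitivity_specificity(scores, labels, k) for k in (1, 2, 3, 4)}
for k, (sens, spec) in table.items():
    print(f">={k} symptoms: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

As the threshold rises, fewer patients are flagged, so false positives fall (specificity rises) while false negatives increase (sensitivity falls). Which threshold is preferable depends on the relative cost of the two kinds of error.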

Given the rapid increase in community transmission of COVID-19 and the deleterious impacts of instituting another lockdown (partial or complete), large-scale surveillance for capturing the epidemiological trend of COVID-19 in Nigeria is crucial. However, Nigeria has limited SARS-CoV-2 testing capacity with an average turnaround of 2 days, making syndromic surveillance (symptomatic monitoring) a viable complementary surveillance system. As such, our findings would be relevant in informing the design of such a surveillance system, which has been demonstrated in Japan42 43 and in the USA44 to be useful in improving the understanding of COVID-19 epidemiology (often in real time), assessing the effectiveness of public health interventions and enhancing preparedness for the emergence of a COVID-19 wave or surge. For instance, an evaluation of a syndromic surveillance system in the USA found new taste/smell loss to be highly correlated with a range of COVID-19 outcomes, highlighting its usefulness in supporting the surveillance system as an early warning system for COVID-19 prevention and control. However, the feasibility (eg, considering selection bias and recall bias) and acceptability of a syndromic surveillance system first need to be ascertained given the large proportion of asymptomatic COVID-19 cases at diagnosis in Nigeria.8 PPVs across the various prediction thresholds, especially for the weighted scales, were generally low despite increasing proportionately with the thresholds. This could be attributable, in part, to the general mildness of the pandemic with resultant low incidence of mortality in Nigeria. For instance, 66% of the 12 289 confirmed COVID-19 cases in Nigeria between 27 February and 6 June 2020 were asymptomatic at diagnosis, with an overall cumulative incidence and case fatality rate of 5.6 per 100 000 population and 2.8%, respectively;8 these figures were substantially lower than those from European countries during the same period.45 As such, our predictive tools could perform differently during a more severe COVID-19 outbreak in Nigeria.
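One general reason why a score's PPV can change between settings is that PPV depends on how common the condition is among those assessed, even when sensitivity and specificity stay fixed. The minimal sketch below applies Bayes' theorem to hypothetical values; none of the numbers are taken from this study, and the function name `ppv` is an illustrative choice.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' theorem:
    true positives divided by all screen-positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical score with fixed sensitivity 0.5 and specificity 0.9,
# evaluated at increasing prevalence among those assessed.
for prevalence in (0.05, 0.10, 0.30, 0.50):
    print(f"prevalence={prevalence:.2f}  PPV={ppv(0.5, 0.9, prevalence):.2f}")
```

Under these assumed values, PPV rises steadily with prevalence, which is consistent with the expectation that the same symptom score could perform differently during a more severe outbreak.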