Table 3

Confusion matrices for the physicians (survey responses from 61 physicians classifying lung ultrasound clips into their respective causes; numbers in parentheses reflect classifications from the aggregated approach used to calculate the area under the receiver operating characteristic curve) and for model performance on the test-2 holdback set at the frame and encounter levels

Physicians		Predicted
		COVID	NCOVID	HPE	Total
Actual	COVID	173 (3)	162 (3)	34 (2)	369 (8)
	NCOVID	177 (4)	163 (1)	30 (2)	370 (7)
	HPE	138 (0)	102 (0)	302 (6)	542 (6)
	Total	488 (7)	427 (4)	366 (10)
CNN-Frames		Predicted
		COVID	NCOVID	HPE	Total
Actual	COVID	3188	256	7	3451
	NCOVID	1176	3741	3	4920
	HPE	109	1119	2771	3999
	Total	4473	5116	2781
CNN-Encounters		Predicted
		COVID	NCOVID	HPE	Total
Actual	COVID	6	0	0	6
	NCOVID	1	6	0	7
	HPE	0	3	4	7
	Total	7	9	4

‘Predicted’ is the classification given by the model or the physicians; ‘actual’ is the true label of the clip.
CNN, convolutional neural network; HPE, hydrostatic pulmonary edema.
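
As a reading aid, the counts in Table 3 can be turned into overall accuracy and per-class recall and precision. The sketch below is illustrative only: the function names are hypothetical, the parenthesised aggregate counts are left out, and the derived values are simple arithmetic on the table rather than figures reported by the authors. It does not reproduce the area under the receiver operating characteristic curve, which cannot be recovered from a confusion matrix alone.

```python
# Confusion matrices copied from Table 3 (rows: actual, columns: predicted).
# Class order is COVID, NCOVID, HPE. Parenthesised aggregate counts are omitted.
CLASSES = ["COVID", "NCOVID", "HPE"]

MATRICES = {
    "Physicians": [[173, 162, 34], [177, 163, 30], [138, 102, 302]],
    "CNN-Frames": [[3188, 256, 7], [1176, 3741, 3], [109, 1119, 2771]],
    "CNN-Encounters": [[6, 0, 0], [1, 6, 0], [0, 3, 4]],
}


def accuracy(matrix):
    """Fraction of all items on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def recall(matrix, k):
    """Correct predictions for class k divided by the actual total (row sum)."""
    return matrix[k][k] / sum(matrix[k])


def precision(matrix, k):
    """Correct predictions for class k divided by the predicted total (column sum)."""
    return matrix[k][k] / sum(row[k] for row in matrix)


for name, matrix in MATRICES.items():
    print(f"{name}: accuracy = {accuracy(matrix):.3f}")
    for k, label in enumerate(CLASSES):
        print(f"  {label}: recall = {recall(matrix, k):.3f}, "
              f"precision = {precision(matrix, k):.3f}")
```

Row sums correspond to the ‘Total’ column and column sums to the ‘Total’ row, so the same functions also serve as a check that the table's margins are consistent with its cells.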