Table 2

Patient population, reference standard, test outcomes and repeatability and validity data of all included studies featuring a functional vision test

Citation	Patient population	Functional vision test	Reference standard(s)	Test outcome(s)	Reported repeatability and validity data
Orientation and mobility (O&M)
Roman et al, 202261	10 patients with GUCY2D-associated and CEP290-associated Leber’s congenital amaurosis	Mobility test for rod-mediated vision	VA; FST	Navigation success over a fixed number of trials; Travel duration	Content validity—Mobility demonstrated a linear relationship with FST. No correlation between VA and mobility. Construct validity—No significant difference between controls and patients in suprathreshold transit time (p=0.63). At threshold and dimmer luminance levels, transit times increased for both patients and normal subjects.
Sahel et al, 202115	25 patients with retinitis pigmentosa and RPE65-associated Leber’s congenital amaurosis	StreetLab mobility course	VA; VF; CS; Dark adaptation	Course completion time; PWS; PPWS; Number of collisions; Walking initiation time; trajectory analyses/segments; Distance travelled	Construct validity—Patients performed worse than controls for PWS, PPWS, number of collisions and walking initiation time under both low and high illumination.
Bertaud et al, 2021	22 patients with glaucoma				Construct validity—No difference in mobility performance between patients and controls under photopic luminance. Under glare conditions, PWS and PPWS were significantly lower in patients than controls (p=0.049 and p=0.038, respectively). Mobility time was significantly longer in patients than controls (p=0.046). Distance travelled, mobility incidents and trajectory segmentations not significantly different between patients and controls.
Chung et al, 2018;55 Maguire et al, 201920	19 patients with RPE65-associated Leber’s congenital amaurosis	Multi-Luminance Mobility Test (MLMT)	VA; VF; FST (white light)	MLMT binocular change score (number of collisions and time to navigate course)	Content validity—Variable correlation of accuracy score with quality-of-life questionnaire (r=−0.54 to −0.7). Correlation of mean accuracy score with VA ranged from 0.75 to 0.86. Correlation between mean accuracy score and total degrees of visual field ranged from −0.37 to −0.53. Construct validity—Able to distinguish controls from patients. Repeatability—High inter-grader agreement for scoring (Cohen’s kappa=97.9%). High concordance between scores at baseline visits ranging from 86% to 98%. Sensitivity to change—Over 1-year observation period, controls had an MLMT change score of 0, representing no change, and 20 patients had an MLMT change score of 0. Few patients had an MLMT change score of −1 or −2 (ie, a worsening).
Lam et al, 2024**26	18 patients with NR2E3 and RHO-associated retinitis pigmentosa			MLMT monocular change score	Construct validity—Six out of seven RHO patients had stable or improved MLMT scores, including two patients who demonstrated a 3-luminance level improvement. Autosomal dominant-NR2E3 patients had no improvement.
Kumaran et al, 202023	19 patients with RPE65-related retinal dystrophy	Vision-guided mobility assessment	VA; CS; VF; FST; Impact of Vision Impairment Questionnaire	Completion time; error number; walking speed; PPWS	Repeatability—Large repeatability coefficient of 1.10 m/s. Content validity—Mean retinal sensitivity (p=0.022) and total hill of vision (p=0.022) predicted walking speed with significance. No correlation between walking speed and VA (p=0.340) or CS (p=0.433). Criterion validity—Walking speed approached significance (p=0.052) and was positively associated with affected subjects’ perceived difficulties with mobility.
Pierce et al, 2024;66 Pierce et al, 202467	26 patients with CEP290-associated retinal dystrophy	Ora-VNC (Visual Navigation Challenge)		Navigation time; Composite score	Content validity—Composite score was correlated with BCVA, white light FST and red light FST in both eyes, and blue light FST in the better eye (p<0.05). Construct validity—Nine participants (64%) showed a meaningful improvement from baseline. Repeatability—Mean test–retest variability from baseline to retest in the worse eye was 0.6 for VNC composite score (95% CI = −0.1 to 1.3). Sensitivity to change—Mean change from baseline to the 12-month test in the worse eye was −0.1 (−1.2 to 1.0).
Virtual reality O&M
Authie et al, 202329	30 patients with retinitis pigmentosa	MObility Standardised Test (MOST)	VA; CS; VF; Dark adaptation	Trial duration; Number of collisions; Number of steps and flags touched; Entries in the dead end; Course redirections	Construct validity—Demonstrates discrimination between patients and controls (accuracy larger than 95% in all conditions) and between early and late stages of the disease (mean accuracy of 82.3%). Content validity—Average performance score strongly correlated with VA, CS and VF. Reliability—Highly reproducible (intraclass correlation coefficient >0.98) and reliable (VR and real-life correlation r=0.98).
Aleman et al, 2021;27 Bennett et al, 202328	29 patients with choroideremia, RPE65-associated Leber’s congenital amaurosis, EYS, CNGB1, NR2E3, RPGR, CRKL, PRPH2, USH2A, PRPF31-associated retinitis pigmentosa	Virtual reality orientation and mobility	VF; FST; VA	Speed; Accuracy (obstacle identification, departures from the path, direction of movement, collisions and whether the subject missed any arrows or repeated them)	Content validity—Better performance in patients with better VA and larger VF extents. Construct validity—Significant difference in the time to complete obstacle testing between patients and controls (p=0.0027). Controls identified approximately 50% of the obstacles at the dimmest course luminance. All but two patients were able to complete the test, although they required higher luminance levels (by >2 log units) to identify 50% of the obstacles. Repeatability—Small improvement in object detection on the second test leading to positive test–retest differences. Greater test–retest values at the dimmest obstacle course luminance level suggest a minor learning effect.
Facial recognition
Hirji et al, 2020; Hirji et al, 2021	72 patients with primary open angle glaucoma with glaucomatous macular damage	The Cambridge Face Memory test	VF; CS	Percentage of correctly identified faces	Content validity—Significant correlation between facial recognition and VF mean deviation (p<0.0001).
Observer-rated performance tests
Azoulay-Sebban et al, 2020;21 Lombardi et al, 2018	32 patients with glaucoma	Homelab at StreetLab	VA; CS; VF; NEI VFQ-25	Path travel time; Mobility incidents; Movement onset; movement initiation time and duration; Localisation of people time; Face orientation recognition time	Construct validity—No significant difference in path travel time between patients and controls. Number of mobility incidents was higher in the advanced glaucoma group than in the other two groups (p=0.0126 and 0.0281, for controls and early glaucoma respectively). Content validity—Integrated binocular field and VF demonstrated significant correlation with test outcomes. Overall movement duration for small objects in reaching and grasping tasks was significantly longer in patients with glaucoma compared with controls. Mobility incidents and the reaching and grasping task parameters were not significantly correlated with quality-of-life questionnaire scores.
Visual search
Higgins et al, 202049	38 patients with non-neovascular age-related macular degeneration	Computer-based assessment (Visual search task and simulated dynamic driving scene)	VA; CS; MP; EuroQol-5D questionnaire	Total correct responses; Median response time	Construct validity—Slower performance in visual search tasks associated with more severe disease. No significant difference between groups for total correct responses (p=0.342). Significant difference in median response time between the groups (p=0.007). Median response times of the early and intermediate groups were not significantly slower than those of controls. Content validity—Response time was associated with measures of VA and CS.
Kartha et al, 2023	37 patients with ultra-low vision	Virtual reality visual performance test	Berkeley Rudimentary Vision Test	Item measure; Person measure	Content validity—Patients with poorer visual acuity had lower person measures (p=0.002, r²=0.2, mean absolute error=0.43). Construct validity—Item measures ranged from −1.09 to 0.39 in relative d′ units. Person measures ranged from −0.74 to 2.2 relative d′ units.
Zhang et al, 2022;36 Manley et al, 202237	63 patients with cerebral visual impairment	Virtual toybox and virtual hallway		Success rate; Reaction time; Gaze error; Visual search area; Off-screen per cent (an index of task compliance)	Construct validity—For the virtual toybox task, mean success rate for patients was significantly lower compared with controls (p<0.001). Significant difference with respect to mean reaction time with patients taking longer to find the target compared with controls (p<0.001). For the virtual hallway task, the mean success rate for patients was significantly lower compared with controls (p<0.001). Mean reaction time was significantly greater in patients compared with controls (p<0.001).
Driving simulators
Adrian et al, 202222	14 patients with glaucoma	Fixed base driving simulator at StreetLab		Reaction times; Longitudinal regulation; lateral control; eye and head movements; Fixation duration and number per second; Fixation duration; horizontal and vertical gaze direction; head yaw	Construct validity—Compared with controls, patients demonstrated a longer mean duration of lateral excursions (p=0.045), and more lane excursions in a wide left curve (p=0.045). Patients demonstrated a larger SD of horizontal gaze (p=0.034). No significant difference was established for the other measured outcomes.
Lee et al, 2019	31 patients with glaucoma	DriveSafe (slide recognition test)	VA; VF; CS; UFOV test	Total number of correctly identified road user features (DriveSafe score); number of fixation points; average fixation duration; average saccade amplitude; horizontal and vertical search variance	Construct validity—Compared with controls, patients had significantly worse DriveSafe scores (p=0.03), fixated on road users for shorter durations (p<0.001) and exhibited smaller saccades (p=0.02). No other significant group differences were found. Content validity—Significant relationship between clinical measures and DriveSafe scores: UFOV 2 (p=0.005), worse-eye VF mean deviation (p=0.003), CS (p=0.03) and UFOV 3 (p=0.05).

Where a genetic mutation was reported, this has been included in italics. If a form of validation evidence (eg, construct validity) is absent from the table, it was not reported in the original article.
*Indicates a conference abstract.
BCVA, best corrected visual acuity; CS, contrast sensitivity; FLORA, functional low‐vision observer rated assessment; FST, full-field stimulus testing; MP, microperimetry; NEI VFQ-25, National Eye Institute Visual Function Questionnaire-25; POAG, primary open angle glaucoma; PPWS, percentage preferred walking speed; PWS, preferred walking speed; UFOV, useful-field-of-view; VA, visual acuity; VF, visual field; VR, virtual reality.