Title: Second‐trimester sonographic soft markers: what can we learn from the experience of first‐trimester nuchal translucency screening?
Abstract: Nuchal translucency thickness (NT) measurement as a screening test for fetal Down syndrome has revolutionized obstetric ultrasound. For the first time, the performance and usefulness of a sonographic marker (NT) as a screening test for a fetal condition (trisomy 21) were tested prospectively and confirmed by many large-scale studies using the same or very similar methodology, prior to being accepted generally and incorporated into routine clinical practice. These studies made it very clear that the success of a sonographic marker as a screening tool requires the development of a precise protocol, proper operator training, certification, auditing of performance and re-certification. When any of these elements is lacking, the high accuracy of NT screening is not achievable. The importance of adherence to a strict protocol and stringent quality assurance has been recognized for decades by our colleagues in the biochemistry department through their development of maternal serum screening for open neural tube defects and Down syndrome. In contrast, many sonographic tests, which are actually far more complicated and cumbersome than NT measurement, have been incorporated widely into clinical practice well before full evaluation, leading to much confusion and mismanagement. Of these, the second-trimester ‘soft markers’ are the best example. Soft markers are qualitatively different from structural anomalies. They are detected in a variable proportion of normal fetuses. They are of interest because of their statistical association with chromosomal abnormalities, making such markers potentially useful as a screening tool for fetal aneuploidies. Since the majority of cases of trisomies 13 and 18 are associated with structural anomalies that are detectable by ultrasound, the role of soft markers in screening for these conditions is limited.
Therefore, the following discussion focuses on the clinical value of using soft markers as a screening test for trisomy 21. Unlike that of NT, the typical history of a soft marker started with the documentation of a newly proposed marker in a series of abnormal cases in a high-risk population, without controlling for confounding factors. The subsequent overwhelming enthusiasm led to the premature incorporation of assessment of the marker into routine clinical practice. Only with more experience did the denominator emerge, lowering both the incidence and the strength of the association. New ultrasound technologies and further studies then showed the marker to be present in far more normal cases than was originally thought, reducing the positive predictive value further, in some cases down to the background risk. Eventually, only after several years of use have many of these markers been found to be poor or even useless for trisomy 21 screening.
Although ultrasound is a ‘non-invasive’ procedure, it is not without risk. Pregnant women referred for further assessment because of a soft marker have significantly higher levels of anxiety than do those referred for advanced maternal age only, at a level similar to that of those referred for abnormal maternal serum biochemistry [1]. This anxiety could lead to a cascade of additional, often unnecessary, tests and is often very difficult to resolve without an invasive diagnostic test. Therefore, all sonographic tests, like any medical test, must be evaluated properly, and should only be implemented if a clear benefit is demonstrated. So, how should we assess the usefulness of soft markers? If they are being used for the purpose of screening for trisomy 21, they should be examined as such. In other words, we should be satisfied that we have a well-characterized test that provides an acceptable sensitivity and specificity.
Unfortunately, none of the soft markers used in clinical practice has undergone adequately rigorous scientific scrutiny, and the major problems are as follows.
Firstly, the second-trimester ‘genetic sonogram’, which includes both soft markers and structural anomalies, was able to detect about 70% of cases of trisomy 21, with a false-positive rate (FPR) of about 10%, when used by the most experienced operators [2]. This performance would be substantially lower if only soft markers were included, and is substantially inferior to that of other well-established screening protocols, such as first-trimester NT screening (detection rate of 76.8% at a 4.2% FPR [3]) or first-trimester combined screening (detection rate approaching 90% [4]). Therefore, it is quite clear that second-trimester soft markers alone should not be the screening test of choice for trisomy 21.
Secondly, do we have a standardized methodology in applying soft markers as a screening test for trisomy 21? The answer, quite simply, is ‘No’. There are wide variations in the definition of individual markers; the methodology of examination is not standardized; and many of the reported studies were performed in referral centers on high-risk pregnancies. It is generally agreed that assessment of multiple markers is required to produce acceptable screening performance. However, there is no consensus as to how many markers should be examined, or how to integrate these markers. Some have proposed the simple approach of defining a high-risk group by the presence of two or more markers, or using a scoring system [5], but there is much controversy over the relative importance of each marker and therefore the validity of such a simple approach. In contrast, the use of likelihood ratios (LR) appears to be a more scientific approach. The use of LR enables us to modify the background risk, taking into account all of the available information including prior screening test results.
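As a minimal sketch of how an LR modifies a background risk, the following Python example converts the risk to odds, multiplies by the LR, and converts back. The function names and the 1 in 1000 background risk are hypothetical and chosen only for illustration; the detection rate and FPR figures are those quoted above. Expressing a screen-positive result as an LR (detection rate divided by FPR) is the standard definition, and multiplying several LRs together assumes that the findings are independent, which is an assumption of this sketch and not something established for soft markers.

```python
from math import prod


def positive_lr(detection_rate: float, false_positive_rate: float) -> float:
    """LR of a screen-positive result: sensitivity / (1 - specificity)."""
    return detection_rate / false_positive_rate


def update_risk(prior_risk: float, likelihood_ratios: list[float]) -> float:
    """Convert risk to odds, multiply by each LR, convert back to risk.

    Multiplying LRs assumes the findings are independent of one another;
    that is an assumption of this sketch, not a claim from the text.
    """
    if not 0.0 < prior_risk < 1.0:
        raise ValueError("prior_risk must be strictly between 0 and 1")
    prior_odds = prior_risk / (1.0 - prior_risk)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Detection rate and FPR figures quoted in the text.
    print(f"genetic sonogram LR+: {positive_lr(0.70, 0.10):.1f}")
    print(f"first-trimester NT LR+: {positive_lr(0.768, 0.042):.1f}")

    # Hypothetical background risk of 1 in 1000, for illustration only.
    posterior = update_risk(1 / 1000, [positive_lr(0.70, 0.10)])
    print(f"risk after a positive genetic sonogram: 1 in {1 / posterior:.0f}")
```

The calculation itself is simple; the difficulty discussed in this article lies in knowing which LR values can be trusted as inputs.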
Unfortunately, there are wide variations in the reported LRs associated with each of the markers individually or in combination, due to the lack of consistent definitions and methodologies in their assessment. For example, the estimated LR for isolated echogenic intracardiac focus ranges from 1.1 (useless) to 5.4 (moderately strong marker) [6-8]. Which is correct? If we are going to use several markers, the ultimate uncertainty in the combined LR could be enormous.
Thirdly, the absence of soft markers has been used to reduce the individualized risk of trisomy 21, with a reported negative LR as low as 0.15 [9]. The reported series were in general conducted in very specialized centers. It is still unclear what the exact negative LR should be, how many markers need to be included in the assessment, and how precisely each marker should be defined. What is certain, however, is that reproducible results cannot be achieved until the same methodology is used consistently. Therefore, it is recommended that a reduction of risk should not be applied ‘on the basis of a 16–20-week ‘screening’ scan, owing to the variety of imaging locations involved’ [10], unless the scan is ‘undertaken in an established centre performing tertiary-level ultrasound’ [11]. We would add that such centers should either have the data to generate their own negative LR, or, if they are using reported negative LRs from the medical literature, should have adequate evidence to confirm that the methodology used is exactly the same as that described in the original reports, and that their performances are comparable.
Fourthly, unlike first-trimester NT screening, there is no specific structured training for the assessment of each of these soft markers, nor is there any certification or recertification process. It is unlikely that adequate quality assurance programs are present in the majority of centers to audit the outcome and performance of the center or of individuals in using these soft markers.
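To make the spread in reported LRs discussed above concrete, the sketch below applies the lowest and highest reported LRs for isolated echogenic intracardiac focus (1.1 and 5.4), and the lowest reported negative LR (0.15), to the same starting risk. The starting risk of 1 in 1000 and the function name are hypothetical, chosen only for illustration; the LR values are those quoted in the text.

```python
def posterior_one_in(prior_one_in: float, likelihood_ratio: float) -> float:
    """Return N such that the updated risk is '1 in N'.

    A risk of 1 in N corresponds to odds of 1 : (N - 1); the odds are
    multiplied by the LR and converted back to the '1 in N' form.
    """
    if prior_one_in <= 1.0 or likelihood_ratio <= 0.0:
        raise ValueError("need prior_one_in > 1 and likelihood_ratio > 0")
    prior_odds = 1.0 / (prior_one_in - 1.0)
    posterior_odds = prior_odds * likelihood_ratio
    return 1.0 + 1.0 / posterior_odds


# LR values quoted in the text; the labels are descriptive only.
REPORTED_LRS = {
    "isolated echogenic intracardiac focus, lowest reported LR": 1.1,
    "isolated echogenic intracardiac focus, highest reported LR": 5.4,
    "no soft markers seen, lowest reported negative LR": 0.15,
}

# Hypothetical starting risk, for illustration only.
HYPOTHETICAL_PRIOR_ONE_IN = 1000

if __name__ == "__main__":
    for label, lr in REPORTED_LRS.items():
        updated = posterior_one_in(HYPOTHETICAL_PRIOR_ONE_IN, lr)
        print(f"{label} (LR {lr}): 1 in {updated:.0f}")
```

With these inputs, the same isolated finding moves a 1 in 1000 risk to roughly 1 in 909 or to 1 in 186, depending on which published LR is chosen, while the negative LR of 0.15 moves it to about 1 in 6661. A report based on one value could therefore lead to quite different counselling from a report based on another.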
Experience from all screening programs, including first-trimester NT screening as the closest example, tells us that no screening program will be successful in the absence of a stringent certification and quality assurance process. It is quite clear that if a woman requests prenatal screening for trisomy 21, sonographic soft markers would not be the test of choice. Yet, there is a great temptation to use second-trimester sonographic markers to modify the individualized risk of trisomy 21 in patients who have received other forms of screening tests in early pregnancy. However, it does not seem logical to modify a risk calculated on the basis of a reliable test, which was well studied and had a clear protocol and quality assurance program, by a second test that lacks standardization or quality assurance, possibly with the exception of the few units which have considerable follow-up data of their own [12, 13].
We have no intention of disputing the association between second-trimester markers and fetal trisomy 21, in particular nasal bone hypoplasia, echogenic bowel and nuchal fold thickness. However, we are concerned that these markers are being used in routine clinical settings before their effect on screening has been fully elucidated, and therefore with an uncertain effect on the overall screening performance. With the many effective screening programs currently available in the first and early second trimesters, it is likely that only very strong soft markers, such as those with an LR of more than 10, will contribute significantly to further improving the performance of trisomy 21 screening programs.
Therefore, efforts should be focused on: (i) the development of a well-defined protocol for the evaluation of these few strong markers, including how to do it, when to do it, and who should do it; (ii) the development of an algorithm for incorporating these markers into existing screening programs, such as in the recent study by Borrell et al. [14]; and (iii) the confirmation of their efficacy as screening markers by large prospective studies in an unselected population.
Over the last few years, we have been pleased to see more discussion in the medical literature concerning training and quality assessment in obstetric sonography [15-17]. This is encouraging. It should be remembered that a service without quality is worse than no service at all. We would like to conclude with the following quote: ‘The screening procedure should not be merely a test but must be a comprehensive program… . The screening process must allow patients to decline intervention at each step throughout the process. A screening program must undertake regular clinical audit to evaluate local performance’ [11]. There is still much to be achieved before second-trimester soft markers find a definite role in fetal aneuploidy screening. The basic principle of medicine, ‘First do no harm’, should always be observed. No test, be it diagnostic or screening in nature, should be used in clinical practice unless its efficacy has been established.