GM system evaluation. Evaluation results for nine gene taggers are shown for two of the five corpora used (PennBioIE Oncology, left; Bio1, right). Each tagger was evaluated with five span-matching metrics: X, Strict (spans must match exactly); S, Sloppy (spans must overlap); L, LeftMatch (span starts must match); R, RightMatch (span ends must match); and E, EitherMatch (span start or end must match). Each graph therefore contains 45 data points (nine taggers × five metrics). Colors distinguish the taggers. F-measure contour lines are drawn in gray, with the corresponding value listed on the right, also in gray.
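The five matching criteria named in the caption can be sketched as simple predicates over a predicted span and a gold span. This is only an illustrative sketch, not the evaluation code used for the figure: it assumes spans are `(start, end)` character offsets with an exclusive end, and it assumes the contour lines use the balanced F-measure, 2PR/(P+R). All function and variable names here are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Assumption: a span is (start, end) in character offsets, end exclusive.
Span = Tuple[int, int]


def strict(pred: Span, gold: Span) -> bool:
    """X, Strict: spans must match exactly."""
    return pred == gold


def sloppy(pred: Span, gold: Span) -> bool:
    """S, Sloppy: spans must overlap by at least one character."""
    return pred[0] < gold[1] and gold[0] < pred[1]


def left_match(pred: Span, gold: Span) -> bool:
    """L, LeftMatch: span starts must match."""
    return pred[0] == gold[0]


def right_match(pred: Span, gold: Span) -> bool:
    """R, RightMatch: span ends must match."""
    return pred[1] == gold[1]


def either_match(pred: Span, gold: Span) -> bool:
    """E, EitherMatch: span start or end must match."""
    return left_match(pred, gold) or right_match(pred, gold)


# Metric letters as used in the caption.
CRITERIA: Dict[str, Callable[[Span, Span], bool]] = {
    "X": strict,
    "S": sloppy,
    "L": left_match,
    "R": right_match,
    "E": either_match,
}


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (assumed); returns 0.0 when both inputs are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, a predicted span `(10, 20)` against a gold span `(10, 25)` fails Strict and RightMatch but satisfies Sloppy, LeftMatch, and EitherMatch, which is why a single tagger can land at five different precision/recall points in one graph.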