Merlin Is Magical, but It Still Makes Mistakes

As the popularity of the app’s Sound ID feature grows, so do concerns about how imperfect artificial intelligence impacts a trove of scientific data.
Photo: Luke Franke/爆料公社

As a volunteer reviewer for eBird, Tim Carney checks bird observations logged in multiple Maryland counties to ensure they are as accurate as possible. In a typical migration season, he might receive up to 50 reports of uncommon species that require him to email users for additional documentation.

But in the past couple of years, Carney says, his workload has grown dramatically. Dubious reports have poured in without sufficient evidence to support them. Carney says that’s because more birders have been attributing their identifications to Merlin, a tool created, like eBird, by the Cornell Lab of Ornithology.

For the most part, the Merlin app’s Sound ID feature, launched in 2021, is a birder’s dream. Activate it, and it transforms the bird sounds it hears into images that depict pitch and volume. The app then renders a real-time species identification, using artificial intelligence trained to read those images, called spectrograms. The experience is so seamless, people have taken to calling Sound ID “Shazam for birds.”
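For readers curious what such an image contains, here is a minimal sketch of a spectrogram, assuming a plain discrete Fourier transform over fixed-size frames. It is not Merlin's actual pipeline, which the article does not describe; the function names, the toy 8 Hz tone, and the 64-samples-per-second rate are all invented for illustration (real birdsong recordings use far higher rates).

```python
import cmath
import math
from typing import List, Sequence


def magnitude_spectrum(frame: Sequence[float]) -> List[float]:
    """Return the loudness found at each frequency bin of one audio frame (naive DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples: Sequence[float], frame_size: int) -> List[List[float]]:
    """Cut the recording into consecutive frames; each frame becomes one column of the image."""
    return [
        magnitude_spectrum(samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]


# Toy input (hypothetical values): a steady 8 Hz tone sampled 64 times per second for 2 seconds.
sample_rate = 64
tone_hz = 8
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(128)]

columns = spectrogram(samples, frame_size=32)

# For each column, find the loudest bin and convert the bin index back to a pitch in Hz.
loudest_bins = [max(range(len(col)), key=col.__getitem__) for col in columns]
loudest_hz = [b * sample_rate / 32 for b in loudest_bins]
```

Each column records how loud each pitch was during one slice of time, which is the "pitch and volume" picture a classifier would then read.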

Yet, impressive as the tool is, Merlin Sound ID can make mistakes. And when eBird users rely solely on the technology to make identifications, reviewers are swamped with unexpected and sometimes questionable observations. The potential consequences go beyond mere irritation: The possibility of misidentifications sneaking through has experts concerned for the integrity of eBird’s high-quality data, which is valuable not only to birders but also to scientists.

When Cornell launched Sound ID, the idea was to create a space separate from eBird where beginners could learn the complexities of birding by ear. Merlin project manager Drew Weber says the feature “is a way of providing a safe playground for people to get acquainted with bird identification,” and to help them build up skills to eventually contribute to community science.

Some Merlin-based submissions to eBird, however, have raised eyebrows in the birding community. Reports of unusual species automatically populate the platform’s rare bird alerts, which are then emailed to users in the area. As a result, these errors are highly visible, and birders have taken to pointing them out. For instance, a Little Ringed Plover (native to Europe) in Arkansas and a Plush-crested Jay (native to South America) in a Michigan backyard, both misidentified by Merlin, were listed in the past few months.

It’s the slip-ups involving native species, however, that most worry experts. The Philadelphia Vireo, for instance, is an uncommon migrant over much of North America, but its song is extremely similar to that of the more common Red-eyed Vireo. Even experienced birders have trouble telling the two apart, says Wisconsin eBird reviewer Jason Thiele. And Merlin-based identifications have significantly increased the number of Philadelphia Vireo submissions that eBird reviewers see. Carney, the reviewer in Baltimore, has seen a huge spike in reports just in the past year, forcing him to spend more time tracking down evidence.

It’s not yet clear if or how these reports have affected eBird’s data quality. Jenna Curtis, project leader at eBird, says the team at Cornell is looking into what role Merlin might play in contributing to bias in the database, or in removing it. In fact, it’s likely that birders have historically under-detected Philadelphia Vireos and that Merlin is helping to correct that oversight, according to Weber. The Merlin team has also noticed increased detections of high-pitched species like Tennessee Warbler, Blackpoll Warbler, and Golden-crowned Kinglet, Weber notes, which are likely underrepresented in eBird’s data because they’re easily missed by birders with high-frequency hearing loss.

When Merlin Sound ID does make a mistake, it’s often an issue of audio length. The app analyzes sound in three-second intervals, but making an accurate identification sometimes requires a longer snippet of birdsong. Confidently distinguishing Philadelphia from Red-eyed Vireos by ear, for instance, requires tuning in to subtle differences in the cadence of their songs that unfurl over time. Merlin also struggles with mimics like Northern Mockingbirds: By analyzing vocalizations in short intervals, the tool often ends up identifying the species that the bird is mimicking. “Merlin doesn’t have that kind of memory currently, but that’s something we can investigate in the future,” says Weber.
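To picture what analyzing sound in three-second intervals means, consider this minimal sketch, which assumes the windows are consecutive and do not overlap. The article gives only the three-second length, so that layout, the function name, and the toy sample rate are assumptions rather than details of the app.

```python
from typing import List, Sequence


def split_into_windows(samples: Sequence[float], sample_rate: int,
                       window_seconds: float = 3.0) -> List[Sequence[float]]:
    """Cut a recording into consecutive, non-overlapping windows of fixed duration."""
    window_size = int(sample_rate * window_seconds)
    if window_size < 1:
        raise ValueError("sample_rate * window_seconds must be at least one sample")
    return [samples[start:start + window_size]
            for start in range(0, len(samples), window_size)]


# Toy input (hypothetical): 10 seconds of silence at 4 samples per second.
recording = [0.0] * 40
windows = split_into_windows(recording, sample_rate=4)

# Three full 3-second windows, plus a final 1-second remainder.
window_lengths = [len(w) for w in windows]
```

If each window is judged on its own, nothing connects one verdict to the next. That is why a cadence that takes longer than three seconds to reveal itself, or a mockingbird cycling through imitations, can be hard for such a scheme to recognize.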

Experts still encourage birders to submit the species they hear to eBird, as long as they can make a confident identification. This means also seeing the bird if possible, especially for easy-to-mix-up species. Curtis and Weber also urge anyone who submits Merlin-based observations to upload sound recordings from the app along with their eBird checklists. Those files not only give reviewers further evidence to check, but also help train Merlin’s algorithm to make more accurate identifications in the future. If submitting a longer recording, users should include a timestamp in the notes to specify when the vocalization in question happens, says Carney. eBird also provides guidance for using Merlin, and Thiele published his own set of similar tips last year.

The Merlin and eBird teams are also making changes within the app, designed to improve how the two platforms interact. Merlin now reminds users to turn on their phone’s location services to narrow down possible identifications to birds that occur in that area. The team has also worked to make it easier to upload Merlin recordings to eBird.
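The effect of narrowing identifications by location can be sketched as a simple filter. This is an illustration under stated assumptions, not the app's real logic: the confidence scores and the local species list below are invented toy values, and `filter_by_region` is a hypothetical helper.

```python
from typing import Dict, Set


def filter_by_region(scores: Dict[str, float], local_species: Set[str]) -> Dict[str, float]:
    """Keep only the candidate species that are known to occur in the user's area."""
    return {species: score for species, score in scores.items() if species in local_species}


# Invented confidence scores for one clip (not real Merlin output).
candidate_scores = {
    "Plush-crested Jay": 0.71,
    "Red-eyed Vireo": 0.62,
    "Philadelphia Vireo": 0.30,
}

# Hypothetical list of species expected near the user.
local_species = {"Red-eyed Vireo", "Philadelphia Vireo"}

plausible = filter_by_region(candidate_scores, local_species)
best_match = max(plausible, key=plausible.get)
```

Without the filter, the South American jay would win on raw score. With it, the out-of-range species drops out, though similar-sounding local birds such as the two vireos still have to be told apart by other means.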

Navigating these issues will be an ongoing learning experience as Merlin Sound ID evolves, Curtis notes. “It’s funny,” she says, “you think about a paper field guide, the way most people learned how to bird up until recently; there’s no pop-up messaging. There’s no banner there to tell you that you should exercise caution.” Technology is pushing birding in exciting new directions, but it’s not quite a substitute for learning to identify birds the old-fashioned way: by spending time in the field and learning from other birders.