As snails go, the rosy wolfsnail (Euglandina rosea) is a perfect predator. An invasive species in Hawaii, French Polynesia and a number of other islands in the Pacific and Indian oceans, the wolfsnail has hunted at least eight other snail species to extinction in Hawaii alone. It has probably caused many more extinctions in places where it has been introduced; one study1 suggested it was “highly probable” that it was responsible for more than 100 extinctions of mollusc species in the wider Pacific.
It’s easy to be hyperbolic about the rosy wolfsnail. But perhaps not as hyperbolic as Google’s search engine which, at the time of writing, informs users that the wolfsnail’s top speed is roughly 19 miles (31 kilometres) per hour — a touch below the motorized speed limit in many major cities and a fair bit faster than its actual top speed of 8 millimetres per second. Google sourced this error from a University of Florida blog post entitled ‘The Rosy Wolf Snail Is Fast For A Snail’.
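To put the gap in numbers, the two speeds can be converted to the same unit. The short Python sketch below is an illustration only: the variable names are our own, and the inputs are the figures quoted above (8 millimetres per second and 31 kilometres per hour).

```python
# Convert the wolfsnail's actual top speed to km/h and compare it
# with the figure shown in the search result. Names are illustrative.
MM_PER_KM = 1_000_000
SECONDS_PER_HOUR = 3_600

actual_mm_per_s = 8    # actual top speed quoted in the article
claimed_km_per_h = 31  # speed reported by the search engine

# 8 mm/s * 3,600 s/h = 28,800 mm/h = 0.0288 km/h
actual_km_per_h = actual_mm_per_s * SECONDS_PER_HOUR / MM_PER_KM

# How many times faster the claimed speed is than the real one
ratio = claimed_km_per_h / actual_km_per_h

print(f"actual: {actual_km_per_h:.4f} km/h")
print(f"claimed speed is about {ratio:.0f} times too high")
```

On these figures, the search result overstates the snail's speed by roughly a factor of a thousand.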
Most scientists — and probably most of the general public — would be able to spot the error, but mistakes such as these are examples of a wider phenomenon, one that researchers warn could severely affect efforts to improve the trawling of scientific literature. “Search engines don’t understand context,” says Jevin West, an information scientist at the University of Washington in Seattle. Instead, they rely on a simple free text search to find relevant material: whatever text is typed into the search bar is what the engine looks for.
It’s a problem that any scientist should be wary of, especially given the rapid increase in research output. “Everyone searches,” says West. “You might be a gravitational scientist; you might be an epidemiologist. No matter what field, we all use the same search tools.”
Another example of how errors can be propagated involves ‘ß-carotene’ — a biological compound that doesn’t exist, and yet appears at least 121 times in the literature, according to an investigation2 by Jaime A. Teixeira da Silva, an independent researcher based in Kagawa-ken, Japan, and Serhii Nazarovets, a bibliometrics scholar at Borys Grinchenko Kyiv University. ß-carotene is a misprint of β-carotene, an organic compound that gives vegetables their colour; the real name is spelled with the Greek letter beta, rather than the German eszett, which denotes a double-s sound.
Da Silva and Nazarovets told Nature that the most probable reason for this mistake was the incorrect selection of a special character in a word processor. “Nevertheless”, they continue, “publishers are ultimately responsible for the best possible presentation of research results, so they should check the integrity of the materials they publish.”
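The two characters look alike on the page but are unrelated code points, which is why a plain text search for one will not find the other. The Python sketch below shows the difference and one possible way to flag the misprint. It is a minimal illustration, not the method used in the investigation: the function names are hypothetical, and the rule (an eszett followed directly by a hyphen and a letter) is an assumption that would need checking against real German text.

```python
import re
import unicodedata

ESZETT = "\u00df"  # the German sharp s
BETA = "\u03b2"    # the Greek small letter beta

# Assumed heuristic: an eszett immediately before "-<letter>" is
# probably a mistyped beta, as in the misprinted carotene name.
SUSPECT = re.compile(ESZETT + r"(?=-[A-Za-z])")


def code_point(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def flag_suspect_eszett(text):
    """Return the positions of every eszett that matches the heuristic."""
    return [m.start() for m in SUSPECT.finditer(text)]


def replace_suspect_eszett(text):
    """Replace flagged eszetts with beta; other eszetts are left alone."""
    return SUSPECT.sub(BETA, text)


print(code_point(ESZETT))
print(code_point(BETA))
print(replace_suspect_eszett(ESZETT + "-carotene"))
```

A check of this kind could run before typesetting, but a human would still need to review each flagged match.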
From CARs to TRUCKs
Alongside typos, there are many other issues that can plague and corrupt academic search. One common problem is the use of acronyms.
Authors “try to create these funny, easy-to-remember names, and often they end up having an acronym”, says Eva Bugallo Blanco, a cancer immunologist at King’s College London. It’s often “not even the first letter of each word: it’s one letter from the front of a word, then the second or third from the third word, then the fourth word from somewhere else, so that they can come up with something that is easy to remember. They’re really forcing it”.
Bugallo Blanco works in chimeric antigen receptor T-cell therapy, which is commonly shortened to CAR-T therapy, and further shortened to CARs. Attempts to create new acronyms are ongoing. “There are these newer types of CARs that we call ‘armoured CARs,’ or you have TRUCKs,” she says. Armoured CARs are CAR T cells that express extra proteins that protect the T cell in a biological environment, thus armouring the CAR. TRUCKs stands for T cells Redirected for antigen-Unrestricted Cytokine-initiated Killing. “They’re getting the most out of that car metaphor,” says Bugallo Blanco.
These nicknames are easy to remember, but harder to find when searching the literature, says Bugallo Blanco. Standard search engines such as Google are unusable, because searching ‘CARs’ will produce the four-wheeled variety. Google Scholar and other academic search engines struggle, too. “Often, I’ll search ‘CAR T cells’ which might find some of what I want. But if I type ‘chimeric antigen receptor T cells’, I get a completely different set of papers. It’s searching for the keywords without understanding I’m looking for the same thing.”
Homonyms — words that sound, or are spelled, the same but mean different things — are another problem for academic searches.
In 2004, the European Space Agency (ESA) launched its Rosetta mission with the goal of landing on a comet’s surface. “I was in charge of the communications and outreach team for that mission,” says Mark McCaughrean, senior adviser for science and exploration at ESA. He remembers a friend e-mailing him a screenshot of the first page of a 2013 Google search for ‘Rosetta images’ which gave prominence to a fairy of that name featured in Disney cartoons. “He said: ‘You’ve got some work to do if people can’t find your mission,’” says McCaughrean.
In time, as news outlets covered the ESA mission and the fairy fell out of fashion, the picture changed. Multiple Rosettas now compete for the Google limelight: the ancient stone that served as the key to translating hieroglyphics, software developed by Apple for its Mac computers and ESA’s mission all jostle for attention. “There are no more fairies on that first page, let’s put it that way,” McCaughrean says.
“I often think one of the main skills of being a researcher these days is having good Google-fu,” says McCaughrean. He defines this as “being able to figure out what to ask Google that minimizes the amount of time you have to spend weeding out crap” (See ‘Good Google-Fu’).
To help others search through scientific literature effectively, West recommends that researchers work with editors to ensure relevant metadata tags are added alongside an article — search engines often prioritize these tags over the free text used in article titles or abstracts. “It becomes the responsibility of the researcher: when I submit a paper, I should spend more time thinking about the keywords. They determine whether my paper will be surfaced by a search engine,” he says.
Improvements to academic search engines have been slow to materialize because “academic search has become a little bit of a backwater”, says Paul von Hippel, who studies public policy, sociology, statistics and data science at the University of Texas at Austin. “There’s not as much money as there is in other kinds of search.” Nevertheless, a new generation of search engines aims to fix some of the contextual issues that conventional algorithms struggle with.
Many researchers expect that more sophisticated artificial intelligence (AI) tools could be used to surface and summarize literature more effectively. ChatGPT, the free-to-access large language model (LLM) operated by OpenAI, based in San Francisco, California, can already write summaries of research papers that are convincing enough to fool scientists into thinking that they were written by a human, according to a 2022 bioRxiv preprint3.
But LLMs have shown a tendency to spit out inaccuracies and create chaos and concern among scientists. In November last year, Facebook’s parent company Meta launched Galactica, an LLM designed to help academics quickly find relevant papers. It was pulled offline after two days, once scientists took to Twitter to share examples of the AI quoting from studies that didn’t exist, making up information (known as hallucination in AI parlance) and writing a variety of racist and homophobic remarks. Hallucination is common among LLMs, which essentially ‘guess’ what the next word in a series might be, on the basis of what they have seen from a very large corpus of text.
Other search engines have found more success. The search engine Perplexity AI, developed by San Francisco-based start-up Perplexity, uses conventional search methods to find results, then uses AI technology to provide a list of sources. (It still says a wolfsnail moves at 19 miles per hour but provides a neatly formatted reference back to the University of Florida blog.)
Elicit, meanwhile, claims to work as a ‘research assistant’ by using AI tools. Created by Ought, a non-profit organization in San Francisco, it’s built on a combination of conventional search techniques and a series of LLM-assisted steps and aims to help users automate some steps in a literature review.
Built on the principle that not all citations are equal, start-up Scite in New York tracks whether citations are positive or negative to distinguish between a reference that says ‘this paper is wrong’ and one that says ‘this paper is great.’
West is a board member and adviser for Consensus, a company based in Boston, Massachusetts, whose search software — of the same name — aims to extract findings from scientific research, rather than whatever other sources sneak to the top of search-engine rankings.
In January, von Hippel co-authored a paper4 that outlined, in part, some of the characteristics that future academic search engines might need to counter scholars’ biases and errors. The authors write that distinguishing between positive, negative and neutral citations (as Scite aims to do) is an important step. They also recommend a move towards semantic search — next-generation search engines that focus on the meaning behind words rather than matching just the text itself. If a user searches for papers on CAR T cells, for instance, a semantic search engine should know to search for the acronym and the full name simultaneously.
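The CAR T-cell example can be approximated, very roughly, by expanding a query with known acronyms before matching. The Python sketch below is an assumption-laden illustration rather than a description of how any of the search engines named here work. It uses a hand-written lookup table, not a model of meaning, so it is a stand-in for semantic search rather than an instance of it, and the table, function names and sample titles are all hypothetical.

```python
import re

# Hypothetical acronym table; a real system would need a curated or
# learned mapping, and many acronyms have several expansions.
ACRONYMS = {"CAR": "chimeric antigen receptor"}


def expand_query(query, acronyms=ACRONYMS):
    """Return the query plus a variant for each acronym it contains."""
    variants = [query]
    for short, full in acronyms.items():
        # Case-sensitive whole-word match, so "cars" is not expanded.
        pattern = re.compile(rf"\b{re.escape(short)}\b")
        if pattern.search(query):
            variants.append(pattern.sub(full, query))
    return variants


def matches(query, title):
    """True if any variant of the query appears in the title."""
    title_lower = title.lower()
    return any(v.lower() in title_lower for v in expand_query(query))


titles = [
    "Chimeric antigen receptor T cells in solid tumours",
    "CAR T cells: a review",
    "Electric cars and urban traffic",
]
for title in titles:
    print(matches("CAR T cells", title), title)
```

With this expansion, a search for ‘CAR T cells’ matches both the acronym and the full name, but not the four-wheeled variety.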
But although incremental improvements to search engines and new technologies for LLMs might help to improve searches, Roger Chabot, an education developer and scholar of library science at Western University in London, Canada, warns that any technology can only go so far. “Relevance is a human, subjective quality. How do you replicate that idea of relevancy artificially? It’s a hard question.”