Literature searching methods or guidance and their application to public health topics: A narrative review

This is an open access article under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.

Abstract

Background

Information specialists conducting searches for systematic reviews need to consider key questions around which and how many sources to search. This is particularly important for public health topics where evidence may be found in diverse sources.

Objectives

The objective of this review is to give an overview of recent studies on information retrieval guidance and methods that could be applied to public health evidence and used to guide future searches.

Methods

A literature search was performed in core databases and supplemented by browsing health information journals and citation searching. Results were sifted and reviewed.

Results

Seventy‐two papers were found and grouped into themes covering sources and search techniques. Public health topics were poorly covered in this literature.

Discussion

Many researchers follow the recommendations to search multiple databases. The review topic influences decisions about sources. Additional sources covering grey literature can reduce bias but are time‐consuming and difficult to search systematically. Public health searching is complex, often requiring searches in multidisciplinary sources and the use of additional methods.

Conclusions

Search planning is advisable to enable decisions about which and how many sources to search. Such decisions could be improved by further work on modelling search scenarios, particularly in public health topics, to examine where included publications were found and to guide future research.

Keywords: bibliographic databases, database searching, grey literature, information sources, information storage and retrieval, knowledge synthesis, literature searching, public health, supplementary searching, web sites

Key messages

The key questions for information specialists, namely how many databases and which databases to search, cannot be answered with a "one size fits all" approach.

Advice from the Cochrane Handbook is to search medline, embase and central as a minimum, and this advice is often but not always followed.

Combining database searching with additional techniques including website and grey literature searching reduces bias but is time‐consuming and will not necessarily produce valuable results.

Pre‐planning is important, particularly for complex topics, and consideration needs to be given to the topic, type of intervention and type of study required.

The sources, types of information and volume of results associated with public health searching mean that planning and taking iterative search steps can be beneficial.

BACKGROUND

The National Institute for Health and Care Excellence (NICE) was established in 1999 with the aim of providing national guidance and advice to improve health and care in the United Kingdom (UK). NICE public health guidance has been published since 2006, covering key areas of public health such as smoking cessation, obesity and physical activity. NICE methods, originally developed for clinical topics, were adapted to suit the different needs of public health guidance. NICE guideline recommendations are based on reviews of the best available evidence, and these evidence reviews are produced using systematic and reproducible methods.

When starting a systematic search on any topic, there are key questions, including "which sources should be searched?" and "how many sources should be searched?" The NICE methods manual advises searchers to include "a mix of databases, websites and other sources" (NICE, 2018). However, the number or identity of the databases, websites or other sources is not specified. The manual goes on to state that the sources will depend on "the subject of the review question and the type of evidence sought." Suggestions are offered for sources based on the type of question being searched. This paper is concerned with systematic reviews and other forms of evidence synthesis that can support public health recommendations.

Handbooks and methods manuals from other organisations supply some answers to these key questions. The Methodological Expectations of Cochrane Intervention Reviews (MECIR) standards state that, to identify as many relevant references as possible and to minimise bias in systematic reviews of health interventions, it should be mandatory to search the Cochrane Central Register of Controlled Trials (CENTRAL), embase and medline (Higgins et al., 2020). Generally, review methodology handbooks do not itemise exactly which databases they recommend. The SIGN Handbook (Scottish Intercollegiate Guidelines Network (SIGN), 2019) lists the core databases plus "Internet sites relevant to the topic" and the World Health Organization International Clinical Trials Registry Platform. The Campbell Collaboration explains that the decision about which topic‐specific databases to search in addition to those routinely searched (e.g., medline and embase) is influenced by "the topic of the review, access to specific databases and budget considerations" (Kugley et al., 2017).

The key search questions need to be reassessed in the context of public health, a growing discipline in evidence‐based practice that requires a different approach from that used for clinical review questions. The Centre for Reviews and Dissemination guidance (Centre for Reviews & Dissemination, 2009) for undertaking systematic reviews points out that public health topics require a wider range of databases to be searched than a clinical review, because the searching needs of public health topics differ from those of clinical topics. Public health reviews do not necessarily look for evidence of effectiveness only. Interventions may be complex, and it is therefore advisable to search additionally for evidence on "processes, mechanisms and theory" (Thomas et al., 2019). Searching for public health topics involves an understanding of "subject breadth and technical demands of the databases to be searched, the fluidity and lack of standardization of the vocabulary, and the relative scarcity of high‐quality investigations at the appropriate level of geographic specificity" (Alpi, 2005).

This narrative review has been undertaken alongside a NICE research project examining which sources identified included publications for public health topics at NICE. It is also a follow-up to and extension of an earlier paper (Levay et al., 2015) that retrospectively assessed the contribution of medline and other key sources to public health evidence reviews at NICE. Levay et al. (2015) explored two core themes of public health literature searching: the variety of databases needed to cover a multidisciplinary evidence base and the range of search techniques required to find different types of evidence. It confirmed that there is no "one size fits all" solution and recommended pre‐project planning and testing of the appropriateness of sources, the value of topic‐specific databases and the efficiency and suitability of non‐database search methods. This review explores whether these findings have been confirmed by the later literature.

Aim and objectives

The aim was to give an overview of studies published between 2015 and March 2021 on literature searching guidance and methods that could be applied to searching systematically for reviews of public health evidence.

The objectives were to identify:

studies describing theories and concepts relating to search methods, sources, systems and techniques;

studies assessing the impact of which sources were searched and how many sources were chosen;

how the retrieved studies could be applied to reviews of public health evidence;

key lessons from the literature to guide searching for public health topics; and

key gaps in the evidence on searching for public health topics.

METHODS

The Faculty of Public Health's definition of public health was used, to be consistent with Levay et al. (2015). Public health means “promoting and protecting health and well‐being, preventing ill health and prolonging life through the organised efforts of society,” which incorporates three key domains of health improvement, improving services and health protection (Faculty of Public Health, 2016).

A literature search was performed in May 2017 and updated in February 2019 in a range of bibliographic databases. Additional abbreviated updates were performed in December 2019 and March 2021. The strategies were developed by an information specialist at NICE and peer reviewed by another information specialist. See Appendix A for details of the search strategy. The databases were searched using a combination of subject headings and free‐text terms in the title and abstract fields to describe search methods, strategies, techniques, approaches and databases. For practical purposes, the search strategies were limited to English-language publications, as no resources were available to translate papers in other languages. The searches were limited to "2015 to current," because the search was intended to be a follow-up to Levay et al. (2015).

The search strategy was developed in the medline bibliographic database (Ovid interface, 1946 to February Week 2 2019) and adapted as appropriate for the following databases:

Applied Social Sciences Index & Abstracts (ASSIA)—ProQuest—to present;

embase—Ovid—1974 to 2019 week 06;

Library, Information Science & Technology Abstracts (LISTA)—EBSCO Host—to present;

medline Epub Ahead of Print—Ovid—12 February 2019; and

medline-in-Process—Ovid—13 February 2019.

The database searches were supplemented in February 2019 by browsing the tables of contents on the websites of the following information science journals:

Health Information and Libraries Journal;

IFLA Journal;

Information Retrieval Journal;

Journal of the Canadian Medical Libraries Association;

Journal of European Association of Health Information and Libraries (EAHIL);

Journal of Information Science; and

Journal of the Medical Library Association.

The Web of Science (WoS) Core Collection was also searched to check the references of the key papers (backwards citation searching) and for later papers citing these key papers (forwards citation searching). This incorporated:

Science Citation Index Expanded (1990 to present);

Social Sciences Citation Index (1990 to present); and

Emerging Sources Citation Index (2015 to present).

The abbreviated updates were performed by browsing an in‐house information science current awareness bulletin, papers discussed in a team journal club and an in‐house tool that aims to keep team members up to date. Papers in the current awareness bulletin are found through searches in embase, lisa and lista and by browsing the information science journals named above. Papers to be discussed in the team journal club are found by manually scanning the tables of contents of a range of journals on public health, information science and research methods. The in‐house tool sources papers in several ways, including monitoring of email lists, social media and library and information conferences. These sources were browsed from the date of the February 2019 update.

The 7569 results from the searches were downloaded to EndNote for initial processing, and after removing duplicates, there were 5587 results remaining in February 2019. One information specialist screened the titles and abstracts of the results and selected abstracts to consider, conferring with two other information specialists to make a final decision about which publications to order as full text. See Appendix B for the screening criteria. In total, 122 publications were obtained as full text and reviewed by one information specialist. A formal quality assessment of the papers was not performed. After screening, 72 of the publications were deemed relevant for inclusion in this review.

RESULTS AND DISCUSSION

There is a growing literature on the two key questions of "which sources" and "how many." Two main themes have emerged: papers that focus on issues regarding the contribution of specific databases or sources, and papers examining the guidance behind the search techniques required for a search. Searching for public health topics has not been covered as extensively as searching for clinical topics, and lessons learnt when searching for clinical topics are not necessarily applicable in other subject areas.

Theme one—Database choice

Database choice—Number of databases

The number of databases searched for systematic reviews and meta‐analyses has increased since the 1990s. An analysis (Lam & McDiarmid, 2016) of the number of databases searched in 1994, 2004 and 2014 reports that the mean number of databases searched grew from one in 1994 to four by 2014. It is not surprising, therefore, that the subject of how many and which databases to search is a key theme.

Database choice—MECIR compliance

The Cochrane Handbook advises that the most important databases to search for reviews of interventions are medline, embase and central, as a minimum, if searching for reports of trials (Higgins et al., 2020). This is to reduce the likelihood of bias. In practice, this advice is not always strictly followed. A recent study of a sample of systematic reviews suggested that only 10% of them had conducted a comprehensive search using a range of sources (de Kock et al., 2020).

Halladay et al. (2015) examined 50 Cochrane Reviews of therapeutic interventions that searched pubmed and embase and found that the benefit of searching embase was "modest" compared with pubmed. central is not mentioned in the study.

Another paper found that pubmed contained 70.9% of the included publications from a selection of Cochrane Reviews from different Cochrane groups, reinforcing that it is important not to rely on one database. However, it also highlighted that 70.9% is the upper limit of what could be found and that a poor search strategy may retrieve fewer (Frandsen, Eriksen, et al., 2019). In a related paper by the same authors, searching embase as well as pubmed in the same selection of Cochrane Reviews increased the coverage of included publications slightly, depending on the Cochrane group (Frandsen et al., 2021). Searching both databases still did not retrieve all relevant publications, illustrating that it is important to consider supplementing searches with additional sources.

This point is reinforced by Vassar et al. (2017), who examined neurology systematic reviews and meta‐analyses and noted that searching only medline and embase could lead to bias in the sample of primary studies used to make recommendations or summaries of effects.

In a paper looking at the optimum search strategy for core outcome sets, Gargon et al. (2015) discovered that 97% of included studies were indexed in medline. However, the search strategy found only 87% of the included records, demonstrating that it cannot be assumed that a search strategy will pick up every relevant record indexed in a database.
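This gap between what a database covers and what a strategy actually retrieves can be expressed as two simple proportions. The sketch below is illustrative only and is not taken from Gargon et al. (2015); it simply uses figures of the same order as those reported above to show how database coverage and strategy sensitivity can diverge.

def coverage(indexed_relevant, all_relevant):
    # Proportion of known relevant records that the database indexes at all
    return indexed_relevant / all_relevant

def sensitivity(retrieved_relevant, all_relevant):
    # Proportion of known relevant records actually retrieved by the strategy
    return retrieved_relevant / all_relevant

# Illustrative figures: 97 of 100 included studies are indexed in the database,
# but the strategy run in that database retrieves only 87 of them.
print(f"Database coverage:    {coverage(97, 100):.0%}")    # 97%
print(f"Strategy sensitivity: {sensitivity(87, 100):.0%}")  # 87%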

Nussbaumer‐Streit et al. (2018) and Ewald et al. (2020), in a two‐part project, experimented with the MECIR recommended databases by rerunning 60 Cochrane Review searches with combinations of medline, embase and central. In part one, they found that the abbreviated search approaches would have changed the conclusion in 8% to 27% of the Cochrane Reviews, led to the opposite conclusion in 2% to 5% and made it impossible to draw a conclusion in 5% to 12%. In part two, looking at treatment effect estimates, they found that an abbreviated search approach gave identical or similar treatment effect estimates in 47 of the 60 Cochrane Reviews, although relevant differences occurred in 6% to 13% of them. This highlights that abbreviated searches have a different impact depending on which facet of a systematic review is being considered. Ultimately, to draw conclusions with the greatest possible certainty, a comprehensive search should be performed, and the authors recommend this should include specialised databases. They acknowledge that some of the abbreviated literature searches could be an acceptable option for rapid evidence synthesis, if the "decision‐makers are willing to accept less certainty" (p. 1; Nussbaumer‐Streit et al., 2018).

These conclusions are reinforced by a later paper from some of the same authors looking at three case studies of rapid reviews (Affengruber et al., 2020). For each of three Cochrane Reviews (two clinical and one public health), an abbreviated literature search was performed in place of the comprehensive literature search. The conclusions of the two clinical topics would have been unchanged if the abbreviated rapid review search had been used. However, in the third case study, a public health topic, the abbreviated search found substantially fewer relevant results than the comprehensive search, which would have left the Cochrane Review authors unable to draw a conclusion. This suggests that, although a rapid review approach may work for some clinical topics, an abbreviated literature search may not be adequate in public health.

Three agri‐food public health case studies reached a similar finding: cross‐cutting topics do not benefit from abbreviated searches. In these case studies, "methodological shortcuts" (searching one database only or searching only bibliographic databases) resulted in relevant results being omitted (Pham et al., 2016).

Some papers found that searching only the minimum of medline, embase and central is not enough to find all relevant studies (Aagaard et al., 2016). However, searching additional databases is not necessarily the answer. A conference presentation (Posey et al., 2016) analysed 97 systematic reviews of clinical topics and found that an average of four or five databases were searched per review but that 95%–100% of included publications could be found in a combination of three databases (one medical, one general and one topic‐specific database). Searching additional databases increased volume without the reward of additional included publications. Both Aagaard et al. (2016) and Posey et al. (2016) recommend that, instead of searching additional databases, time would be better spent using additional search methods such as reference checking and citation searching.

Database choice—Topic, intervention and study

The choice of databases can depend on the topic of the review (Hartling et al., 2016). If the topic is multidisciplinary, multiple databases will need to be searched in order to find studies from each discipline involved (Harari et al., 2020). National, regional and subject‐specific databases may be relevant (Whaley et al., 2020). The type of intervention and the type of study, as well as the topic, can affect which databases are appropriate (Goossen et al., 2018; Wood et al., 2017). In the case of economic evaluations, the majority of searching has been found to be done in medline, with only some searches in embase and other databases. The key specialist database for this area was the NHS Economic Evaluation Database, but with its demise in 2015, there is a need for "methodologically appropriate strategies" to be used in searching medline and embase (Arber et al., 2018).

The importance of a robust search strategy is highlighted when searching for qualitative reviews (Wright et al., 2015). The Cumulative Index to Nursing and Allied Health Literature (CINAHL) is a good source of qualitative studies, with Rogers et al. (2018) noting that this is because it has the best controlled vocabulary for qualitative research. Rogers et al. (2018) conclude that, for qualitative dementia research, if CINAHL and PsycINFO are searched, then medline and embase are not required. Frandsen, Gildberg, et al. (2019) examined "a wide range of topics within health research" and concluded that CINAHL, along with Scopus and ProQuest Dissertations and Theses Global, provided the greatest retrieval for qualitative reviews. This illustrates the importance of considering the type of research being undertaken and the type of evidence being sought when deciding which databases to search.

Database choice—Currency

It is important to select databases that provide the most current results. Two papers, Duffy et al. (2016) and Thompson et al. (2016), compared medline and pubmed and illustrated that the choice of source affected comprehensiveness. This has changed since 2016, and medline ALL now "covers all of the available content and metadata in pubmed with a delay of one day" (Lefebvre et al., 2019). Although the databases now provide equivalent content, it is still important that searchers check the currency of the databases they search.

Database choice—Guidance versus practice

There has been some work comparing guidance to practice in database choice. Cooper, Booth, et al. (2018) performed a literature review to see if there is a consensus between guidance documents (e.g., Cochrane Handbook and NICE manual) and published studies on literature searching methods. They found that there was not a consensus on an approved number of databases to search, leading them to state that “researchers should be focused on which databases were searched and why, and which databases were not searched and why.” This means that databases should be searched if they have a demonstrable value to the review, rather than because there is an optimal number to use.

Wood et al. (2017) compared the resources searched in practice with recommendations from NICE and SuRe Info on how to conduct searches for economic evaluations. As mentioned above, they found that, although most systematic reviews conformed with the NICE and SuRe Info recommendations to search medline, only some searched embase, and little searching was done in specialist economics databases. Reviews that do not follow recommended practice are at risk of publication bias and of missing relevant studies.

Database choice—Metrics

Some papers have approached the subject of database choice in an empirical way, using metrics to assess which databases provide the most value and how much work needs to be done to find relevant studies. Demonstrating value is central to Ross‐White and Godfrey (2017), who suggest using the number needed to retrieve to decide on the value of a database. They found this a valuable way to measure how much effort is needed to retrieve each included publication.
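As a rough illustration of the metric, the number needed to retrieve can be read as the number of records that had to be screened from a source for every publication that ended up being included. The sketch below is a minimal, hypothetical example and is not drawn from Ross‐White and Godfrey (2017); the exact formulation used in that paper may differ.

def number_needed_to_retrieve(records_screened, included_publications):
    # Records screened from a source per included publication it yielded
    if included_publications == 0:
        return float("inf")  # the source contributed no included publications
    return records_screened / included_publications

# Hypothetical yields from two sources searched for the same review:
# the lower the figure, the less screening effort per included publication.
print(number_needed_to_retrieve(records_screened=3200, included_publications=8))  # 400.0
print(number_needed_to_retrieve(records_screened=450, included_publications=9))   # 50.0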

Cooper, Lovell, et al. (2018) and Cooper, Varley‐Campbell, et al. (2018) found that “Capture‐recapture” is a useful method for planning work at the beginning of a review because it can be used to estimate the potential number of studies likely to be found.
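Capture‐recapture borrows from ecology: two independent searches act as two "captures," and the overlap between them is used to estimate how many relevant studies exist in total, which in turn indicates how many remain unfound. The sketch below uses the classic Lincoln‐Petersen estimator with invented figures; Cooper, Lovell, et al. (2018) and Cooper, Varley‐Campbell, et al. (2018) may operationalise the method differently.

def lincoln_petersen(n1, n2, overlap):
    # n1: relevant records found by the first search (capture)
    # n2: relevant records found by a second, independent search (recapture)
    # overlap: relevant records found by both searches
    if overlap == 0:
        raise ValueError("Estimator is undefined when the searches share no records.")
    return (n1 * n2) / overlap

# Invented example: two searches find 40 and 30 relevant records, 20 in common,
# suggesting roughly 60 relevant studies exist, so around 10 are still unfound.
estimated_total = lincoln_petersen(n1=40, n2=30, overlap=20)
print(round(estimated_total))  # 60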

Theme two—Supplementary search sources

Supplementary search sources—Search techniques

The second theme identified in the results was the role of supplementary search techniques, including the addition of grey literature and website searching. Booth (2016b) has suggested that for qualitative research, these techniques should be the focus, rather than searching a large number of bibliographic databases. Delaney and Tamas (2018) also question reliance on databases as the principal source of evidence, finding fault with database indexing, particularly for cross‐cutting topics, and the potential bias caused by only looking at the published studies found in databases. They suggest researchers consider alternative sources and think critically about their information retrieval options. Echoing this, Boulos et al. (2021) find that a range of supplementary sources need to be searched in combination with databases to find all relevant prognostic factor studies.

Cooper, Lovell, et al. (2018) and Cooper, Varley‐Campbell, et al. (2018) provide an example of a review where supplementary searching was useful compared with bibliographic databases. They found, in a combined environmental and public health review, that the databases contributed only two minimally useful included publications out of around 21,000 results. By contrast, the supplementary search methods retrieved just 453 references for screening, which produced nine included studies, four of which made unique contributions to the quantitative and qualitative synthesis.

Although their focus was an approach to systematic searching for epidemiologic publications, Waffenschmidt et al. (2017) found that around 14% of publications could be found only by handsearching meeting websites and regional journals, which would not have been indexed in bibliographic databases.

Supplementary search sources—Value of grey literature and unpublished data

Grey literature has been defined and redefined over time, but the general consensus is that it can include publication types such as theses, government documents, and research and project reports that have not been published by a mainstream publisher (Farace & Schöpfel, 2010). It can also include unpublished data, such as clinical trials in ongoing research registries. These publication types can be found in some specialised bibliographic databases, but coverage can be sporadic, and such databases should not be relied on as the main source of grey literature. Although its unpublished status can make searchers reluctant to include such data, it should be included if it meets the objectives of a specific systematic review (Whaley et al., 2020).

Grey literature and unpublished data help to avoid publication bias, because searching sources that only cover published results may just return more of the same evidence. By contrast, grey literature searches and searches of clinical trial registries may reduce bias by retrieving evidence from a more diverse range of sources (Pradhan et al., 2018). This could influence the conclusions and consequently health care decisions (Halfpenny et al., 2016). In some cases, not including the results of unpublished trials could mean that the effects of treatments are overestimated (Bagg et al., 2020). Despite this, clinical trials registries are not a commonly searched source. Gray et al. (2019) found that, for surgery reviews, registries were used in 79.2% of Cochrane Reviews compared with 6.4% of reviews in high‐impact journals, even though the registries contained at least one additional relevant study.

Similar to database choice, the decision to search grey literature and unpublished studies may be guided by the type of research being undertaken. Farrah and Mierzwinski‐Urban (2019) illustrate this when looking specifically at non‐drug health technologies, finding that in horizon scanning reports for new and emerging technologies, almost half of the studies cited were grey literature and that clinicaltrials.gov was one of the most frequently cited sources.

Grey literature may also be a useful source in fields where it is challenging to generate evidence because the population is vulnerable or it is difficult for services to access or engage with them. Enticott et al. (2018) found that useful research had been done by agencies who had access to refugees and asylum seekers but that this work is generally disseminated via grey literature instead of in journals. A search focussed on peer‐reviewed literature would probably miss this valuable source of evidence.

Similarly, for environmental evidence reviews, Konno and Pullin (2020) warn of “unrepresentative samples of studies and biased estimates of true effects” in reviews that do not search multiple platforms or supplementary sources.

Coleman et al. (2020) discuss this in relation to searching for evidence on programme theories, a challenging area to search with evidence found in sources including websites, blogs and newspaper articles. They conclude that the optimal search for their programme theory topic involves databases like medline, embase and cinahl combined with searches of Google and Google Scholar.

Supplementary search sources—Challenges of grey literature and unpublished data

Being systematic and reproducible are key tenets of evidence‐based medicine reviews, and it can be difficult to uphold these principles in grey literature searches. For example, the nature of grey literature means that it is generally web based, not necessarily indexed in a standard way, and does not usually have a standard vocabulary. Even the titles can be misleading (Godin et al., 2015). Hanneke and Young (2017) comment on the lack of detailed information often given about grey literature in search histories. Some reviews will state that grey literature has been searched without providing detail of which database, website or search engine has been used, which means it is difficult to judge how systematic they have been or to reproduce their searches. Unlike records found in a peer reviewed source, records found in a grey literature source may also lack information on publication characteristics like date and contributor (Godin et al., 2015).

Even in a discipline like public health, where evaluation of interventions may not be reported in journal articles, the challenges of grey literature searching are noted (Adams et al., 2016). After looking at three public health case studies, the authors reflect on the importance of search methods and search efficiency and on the challenges introduced by grey literature searching, such as the replicability of searches and the time needed to perform them.

The additional time required for grey literature searching may not be rewarded by finding relevant results (Halfpenny et al., 2016). Hartling et al. (2017) explored how often systematic reviews search for unpublished literature, dissertations and non‐English reports in two clinical review topics and one psychosocial review topic. Their results show that, although most reviews in their sample searched for this material, few included publications were found, and those that were retrieved did not usually have any impact on the review findings. This conclusion is echoed by Schmucker et al. (2017) and Wilson et al. (2017), who both found that searching for unpublished data made a difference in only a small number of reviews and did not necessarily change the conclusions or the strength of evidence of recommendations.

Supplementary search sources—Website searching

Website searching may also be considered a supplementary technique, although it can be difficult to distinguish from grey literature searching. On the one hand, the aim of a web search is often to find grey literature (Briscoe et al., 2020), and it may be the route to finding unpublished reports, government papers and trials in ongoing registries. On the other hand, website search results also cover published literature such as journal articles. The two can therefore be considered separately.

The growth of the web in recent years means that a far greater amount of literature can be found than in the past (Briscoe, 2016). Google and Google Scholar are the most popular search engines. Typically, web searches are cut-down versions of bibliographic database searches (Briscoe, Nunns, et al., 2020).

Like grey literature searching, website searching can present the challenges of poor website functionality, not knowing which websites to search and the possibility of not finding unique relevant results (Stansfield et al., 2016).

Website searching can also introduce bias to a search, if not planned effectively. Curkovic and Kosec (2018) warn of the risk of a “bubble effect” in which the algorithms used in some commercial search engines to personalise results mean that many internet search results are biased and can affect the validity of reviews. Even geographical location can affect both the results and ranking of results in Google searches (Cooper et al., 2021) and Google Scholar searches (Pozsgai et al., 2020). Similarly, using language restrictions can introduce bias, particularly in topics where the evidence base is mainly in a non‐English setting (Chaabna et al., 2020).

Reproducibility of searches, a key feature of systematic searching, can also be challenging because of the transient and changing nature of the internet and website domains. Curkovic and Kosec (2018) suggest that internet searches should therefore be used only as a supplementary source for scoping out systematic review strategies.

Google Scholar is a free web search engine that can be used to find academic literature, and the subject of its suitability for this task is discussed in recent papers. Halevi et al. (2017) focus specifically on whether Google Scholar is suitable as a source of scientific information. They acknowledge that it is “essentially an enormous web crawler” and therefore has deeper coverage than WoS and Scopus. Bramer et al. (2017) explain that this is likely to be because it indexes the full text of articles, meaning that it can find studies where the context of the subject is described in the full text but not the abstract or subject terms. There is agreement, however, that Google Scholar should only be used as a supplement to other sources. This is based on “lack of quality assurance and lack of transparency about the resources it covers” and search shortcomings like a 256 character limit for search terms (Halevi et al., 2017; Harari et al., 2020).

These issues may be acceptable if the researcher is only interested in finding a specific result, but not for reproducible searches. Gusenbauer and Haddaway (2019) criticise Google Scholar as being "user friendly at any cost" and explain its popularity with users as being due to convenience and a lack of awareness of its shortcomings. They go on to say in a later paper (Gusenbauer & Haddaway, 2021) that researchers should know what can and cannot be done depending on the functional capabilities of any individual search system. They cite Google Scholar as an example of "how a system can be perfectly suited for one type of search, while failing miserably for another" (p. 3). The message is that it is important to understand the strengths and limitations of each resource when planning which resources to search.

There are also practical issues to consider in using websites efficiently. Levay et al. (2016) compared Google Scholar with WoS for citation searching for public health reviews. They recommend using WoS instead of Google Scholar, based on the reliability of WoS, and the ease of searching it and downloading results. Despite Google Scholar being free to use and WoS requiring a subscription, the time spent on finding and downloading results made WoS more cost effective. This could be why a paper examining the use of citation searching in Cochrane Reviews found that Google Scholar was the least popular source (Briscoe et al., 2020).

Supplementary search techniques—Citation searching

Supplementary search techniques and their benefits are discussed in recent papers. Their importance in finding additional relevant results leads Booth (2016a) to recommend that reference checking (i.e., backwards citation searching) should be standard practice rather than regarded as a supplementary technique.

Citation searching is particularly useful in scenarios where core concepts are hard to find using keywords (Briscoe, Bethel, et al., 2020) or where topics are broad or ill defined (Rogers et al., 2020). An example of its value is given by Bethel et al. (2021), who describe included references from medline and embase that were missed in the database searches but retrieved by citation searching. In this case, it "reaffirmed the purpose of supplementary searching" and illustrates its role in finding results that are not only unavailable in bibliographic databases but also missed by database searches.

The Cochrane Handbook lists searching reference lists as mandatory (Lefebvre et al., 2019), recommending that “review authors should use included studies and any relevant systematic reviews when conducting backward citation searching” (Briscoe, Nunns, et al., 2020, p. 171). Goossen et al. (2018) support this idea in their study of systematic reviews in surgery, in which screening citation lists in WoS and searching citation lists of related reviews contributed substantially. However, Rogers et al. (2020) introduce a caveat in their study looking at citation searching for implementation studies for dementia care. Although both Scopus and WoS found relevant studies missed by database searches, the paper noted that being able to locate a record in Scopus or WoS did not mean it would be retrieved by the search strategy. This echoes the same finding in papers discussing bibliographic databases.

There are additional benefits to citation searching as part of a systematic search. Levay et al. (2016) examined public health literature searching and found that "citation searches can be developed in a series of focussed steps that avoid unnecessary amounts of results." Unlike web searches and grey literature, a systematic approach can be maintained and recorded to aid reproducibility, as long as the required information is retained at the time of searching. Citation searching also has the potential to reveal parallel topics of interest to the research, which may not be identified by a traditional keyword search (Hinde & Spackman, 2015). Citation searching and other supplementary techniques may identify potentially relevant studies, but their value can be affected by the effort involved, and it is unclear whether they will yield additional studies when performed after extensive database searches (Wright et al., 2015).
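The distinction between backward citation searching (following a record's reference list) and forward citation searching (finding later records that cite it) can be illustrated with a minimal sketch over a small, invented corpus of records. In practice this is done through citation indexes such as WoS or Scopus; the data structure and names below are purely hypothetical.

from dataclasses import dataclass, field

@dataclass
class Record:
    record_id: str
    references: set = field(default_factory=set)  # IDs of the records this one cites

def backward_citations(seed, corpus):
    # Reference checking: records in the corpus that the seed cites
    return [r for r in corpus if r.record_id in seed.references]

def forward_citations(seed, corpus):
    # Cited-by searching: records in the corpus that cite the seed
    return [r for r in corpus if seed.record_id in r.references]

# Invented corpus: a cites b; b cites nothing; c cites a.
a = Record("a", {"b"})
b = Record("b")
c = Record("c", {"a"})
corpus = [a, b, c]
print([r.record_id for r in backward_citations(a, corpus)])  # ['b']
print([r.record_id for r in forward_citations(a, corpus)])   # ['c']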

Supplementary search techniques—Guidance versus practice

As with database choice, there has been some work comparing guidance that has been set out in handbooks versus practice in supplementary searching. Cooper et al. (2017) identified five search methods from the methodology handbooks (contacting study authors or experts, citation chasing, handsearching, trial register searching and web searching) and examined how these were applied in practice. They found that, although studies do generally follow recommended best practice, further research is needed to help understand how and when to use supplementary search strategies.

Alternative approaches to traditional searching

There are also papers that focus on alternative models of searching for topics that do not fit easily into the Patient population, Intervention, Comparison and Outcome (PICO) structure. They are useful for subjects with complex interventions or concepts, where either appropriate indexing terms are not available or the concepts cannot be expressed in a series of well‐defined subject headings. It is important to select a search approach that is appropriate to the type of review being done, the type of evidence required and the subject area. Savolainen's theoretical paper explains "exploratory search," discussing two frameworks: the berrypicking model and information foraging theory (Savolainen, 2018). Searching behaviour needs to be "open‐ended, dynamic and multi‐faceted" in these approaches, and the two frameworks provide a "different but complementary" image of the exploratory search process as a combination of focused searching and exploratory browsing.

A few papers have approached complex topic areas by taking an iterative or “stepped” approach to searching. For this approach, “searching is done in several stages, with each search taking into account the evidence that has already been retrieved” (NICE, 2018). Public health is an example of a subject area where search questions are often highly complex and do not necessarily lend themselves to methods that work for clinical reviews, meaning that alternative search methods may be particularly useful (Mathes et al., 2017).

Enticott et al. (2018) used this approach in their systematic search for grey literature on refugees and asylum seekers. They started with the included studies from an initial search of academic literature and advice from experts, which informed an initial grey literature search. This was followed by a targeted search for grey literature from 20 countries that resettle refugees, supplemented with further Google and Google Scholar searches. This targeted and stepped approach led to the discovery of almost double the number of eligible results found by the initial grey literature search.

Palliative care is another example of a challenging topic area, with concepts and terms that are heterogeneous, poorly indexed and non‐standardised. Zwakman et al. (2018) tested an approach to this challenge, describing "PALETTE," an iterative search method that involves an initial literature search to develop an understanding of the topic, gaining expert opinion and other exploratory work. The search strategy is built around "golden bullets" (key studies), which are analysed to mine key indexing terms and free text for use in the primary literature search in key databases, followed by citation tracking.

The search should be appropriate to the type of review being conducted, and this may require alternative approaches. Booth et al. (2019) give a framework for conducting literature searches for realist reviews. Realist reviews offer a theory‐driven method to evidence synthesis and “explore how a complex intervention works, for whom and under what circumstances” (p. 2). A realist literature search needs to be iterative and may include a scoping search, using grey literature sources and supplementary search methods. These realist approaches are interesting for public health reviews because they also consider complex interventions and how they might be applied across large populations.

Application to public health evidence

Coverage in public health versus clinical topics

The aim of this narrative review was to give an overview of recent studies on information retrieval guidance and methods that could be applied to public health evidence. The largest proportion of results examined clinical topics with fewer focussing on public health topics. This could be because evidence synthesis is more likely to be done on clinical topics. A survey on characteristics of Health Technology Assessment (HTA) in five countries found that, although over 80% were on drugs, devices, surgery or other clinical programmes, only 5% were on public health (Lavis et al., 2010). If most HTA topics are clinical, it is not surprising that the majority of studies on information retrieval are also on clinical subjects.

Public health research synthesis is complex and challenging compared with clinical research synthesis for a variety of reasons. Although clinical medicine can prioritise randomised controlled trials (RCTs) to provide evidence, the nature of public health interventions means that it may not be possible or ethical to conduct an RCT. Public health reviews may therefore have to rely on other study designs, such as observational studies (Frieden, 2017; Mathes et al., 2017). Evidence reviews in public health may need to search for a wider range of study types and be less able to use search filters to narrow the results. Managing this additional volume is a key task to consider in planning an evidence review for a public health topic.

Additionally, outcomes for public health interventions “do not always occur at the same operational level as the intervention,” meaning that interventions at a personal level may have outcomes at population level and vice versa (Kelly et al., 2010). This means that the search needs to be planned and scoped in the initial stages of the review, so that the right decisions can be made about where to search and which methods to employ. The complex relationships between these factors mean that it can be difficult to understand them all at the beginning of the review and the search needs to build up a picture of the area, using alternative approaches, such as the stepped methods described above.

Terminology is complicated in public health literature searching, as the concepts can be hard to define and have multiple meanings according to the context in which they are used. This is problematic for searching because it is time‐consuming to construct a search, it can be difficult to be comprehensive and it can lead to large volumes of results. This is illustrated by the search on exercise by Grande Antonio et al. (2015), who note the relevance of exercise to many disciplines. This can result in extra time spent finding all the indexing terms on exercise and assessing which are appropriate for any individual search. The authors also report on changing definitions of exercise, which can add to the relevant free‐text terms needed and therefore increase the time spent constructing strategies and the volume of results found.

The multidisciplinary nature of public health means that all the relevant evidence may not be found in one location or type of source. The search is more likely to return a higher volume of results, as it will need to be run in multiple databases (Hanneke & Young, 2017). Hanneke and Young (2017) examined information sources for obesity prevention policy and recommend searching pubmed, multidisciplinary and economics databases and grey literature, plus citation and reference searching and handsearching. This illustrates the need to tailor search resources to the topic and, to be comprehensive, to search a range of sources and use different search techniques.

Search scenarios

Although many of the studies found in the literature investigate how the publications included in a review were actually retrieved, some papers have taken this analysis a step further to model scenarios of searching a reduced number of databases without missing any of the included publications. For example, Aagaard et al. (2016) present tables on the cumulative effects of searching up to five databases, discussing the combined recall of medline, embase and central for reviews of musculoskeletal topics. For this specific clinical area, they recommend that an optimal literature search for RCTs should include the three core databases plus two additional databases and other search techniques such as grey literature and citation searching.

Bramer et al. (2017) retrospectively checked the source of each included publication in 58 systematic reviews on clinical and public health topics to discover where they were found and which resources retrieved unique included publications. The best combination of databases in terms of recall was embase, medline, WoS and Google Scholar, although it was recommended to search subject‐specific databases when relevant to the topic.

Urhan et al. (2019) evaluated the source of included publications in food science reviews. They too looked for the best combination of resources and found this to be WoS, two specialised databases and reference checking. The specialised databases and reference checking found included publications that were present in the other databases but had been missed by the strategies. As other authors have commented, this illustrates that no search is 100% sensitive and that it is worth investing time in deciding which sources to search and which techniques to use to maximise the retrieval of relevant evidence.

Goossen et al. (2020) focused on systematic reviews in a sample of 86 Overviews of Reviews and also reported on the value of reference checking to complement database searching. In their sample, medline, Epistemonikos and reference checking were the optimal combination of sources to find the majority of systematic reviews.

All these papers echo the findings of other authors in this review: a mix of core databases and supplementary sources or techniques is needed for optimal results, with the possible addition of topic‐specific resources where relevant.
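The kind of retrospective modelling these papers describe can be reduced to a simple calculation: record which sources retrieved each included publication, compute the recall of each combination of sources, and look for the smallest combination that still finds everything. The sketch below uses entirely invented data and source assignments; it illustrates the idea rather than reconstructing any of the cited analyses.

from itertools import combinations

# Invented map of included publications to the sources that retrieved them
found_in = {
    "study_01": {"MEDLINE", "Embase"},
    "study_02": {"MEDLINE"},
    "study_03": {"Embase", "CENTRAL"},
    "study_04": {"CINAHL"},
    "study_05": {"MEDLINE", "CINAHL"},
}
sources = {source for retrieved_by in found_in.values() for source in retrieved_by}

def recall(combo):
    # Proportion of included publications retrieved by at least one source in the combination
    hits = sum(1 for retrieved_by in found_in.values() if retrieved_by & set(combo))
    return hits / len(found_in)

# Smallest combination of sources that still retrieves every included publication
for size in range(1, len(sources) + 1):
    full_recall = [c for c in combinations(sorted(sources), size) if recall(c) == 1.0]
    if full_recall:
        print(f"Minimal combinations of {size} source(s): {full_recall}")
        break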

An aid to facilitate the modelling of search sources is provided by Bethel et al. (2021) in their paper describing a search summary table. They developed the table to aid decisions around which databases and supplementary search sources to search based on where evidence has been found in previous searches. The benefit of using this search summary table to work out the optimal search strategy for a topic is illustrated by Coleman et al. (2020). They use it to determine the minimum set of resources needed to find all the primary publications for their topic on programme theories on pressure ulcers.
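A search summary table is essentially a cross‐tabulation of included studies against the sources that retrieved them, from which per‐source totals and unique contributions can be read off. The sketch below is an invented, simplified illustration of that idea; the published template in Bethel et al. (2021) records additional detail.

# Invented data: which sources retrieved each included study
search_summary = {
    "study_01": {"MEDLINE", "Embase"},
    "study_02": {"MEDLINE"},
    "study_03": {"Google Scholar"},
}
sources = sorted({source for found_by in search_summary.values() for source in found_by})

# Per-source totals and unique contributions, the two figures the table is usually read for
for source in sources:
    total = sum(1 for found_by in search_summary.values() if source in found_by)
    unique = sum(1 for found_by in search_summary.values() if found_by == {source})
    print(f"{source}: retrieved {total} included studies, {unique} of them uniquely")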

Implications of the results

The results of the narrative review illustrate that researchers have, in many cases, been following the recommendations of the Cochrane Handbook and other methods manuals to search several databases, including core databases such as medline and embase. There cannot be an exact recommended number of databases to search or a defined list of databases to search, because the optimal databases depend on the topic, the type of interventions being searched for and the type of study required. The papers reviewed indicate that it is important to consider at the outset what kind of research is being done and to let this inform decisions about where and what to search.

Additional search techniques (e.g., citation searching) and sources (e.g., grey literature) have been shown to increase the likelihood of finding more relevant studies. In some cases, these are publications that would influence decisions about the effectiveness of an intervention. However, using additional techniques or sources can also increase the time spent on a search without producing anything valuable. The additional sources or search techniques may find additional relevant papers, but these will not necessarily change recommendations or conclusions. This calls into question whether the additional time and effort spent on this work is always of value. It also suggests that it is worthwhile to do some testing in the scoping stage of a review, to estimate whether additional sources will retrieve results that reward the extra time and effort required. This would involve checking a small sample of results to see if they were relevant for the review. Cooper, Lovell, et al. (2018), Cooper, Varley‐Campbell, et al. (2018) and Bethel et al. (2021) have all discussed methods for this kind of testing work.

Limiting bias is seen as a key reason for searching beyond one or two bibliographic databases and for using additional sources or techniques. In the case of grey literature, searching it can potentially find unpublished results, meaning that conclusions and recommendations are not based solely on published research.

Some sources are, by their nature, difficult to search systematically, and it is challenging to reproduce results. Web searching, particularly a source like Google Scholar, is criticised for a lack of transparency and consistency of results. Interestingly, although researchers may search additional sources to minimise bias, they may inadvertently introduce bias, unless they consider how some search engines personalise results.

A few papers made observations about the importance of the quality of search strategies, for example, Arber et al. (2018), Frandsen, Gildberg, et al. (2019) and Wright et al. (2015). A study could be indexed in a database but not found by the search strategy, so the construction of a robust search strategy is key to a good literature search. The discussion highlighted the difficulty of producing a comprehensive search strategy in public health reviews, given that the multidisciplinary nature of the search often leads to a higher volume of results.

Key gaps in the evidence on searching for public health topics

There is less direct evidence on public health literature searching compared with clinical topic literature searching. Although there are a few papers exploring the challenges and complexities of public health literature searching, there is room for more studies describing or comparing approaches to searching for public health topics.

Most of the existing studies on both clinical and public health topics are descriptive studies looking at one example. It would be interesting to see more studies covering multiple examples which could be synthesised to provide stronger recommendations.

It would also be useful to see more work on scenarios examining where included publications have been found in previous reviews, to inform future searches in similar topics. Although there has been some work done on this generally, there has been a lack of work specifically on public health reviews in recent years. This type of modelling could be useful for helping inform decisions about how to approach searches for public health topics.

Key lessons from the literature to guide searching for public health topics

Public health literature searches need to be managed effectively, according to the time and resources available to the review. The groundwork needs to be given sufficient time to explore the topic and the different contexts in which an intervention might have been used. This leads to questions about the suitability of databases and the identification of resources applicable to the topic. Once the sources have been identified, they must be searched efficiently without creating unmanageable volumes of results. This can be difficult to achieve in a multidisciplinary topic where terminology has multiple and competing meanings. The search must be planned to achieve a good balance of resources. These are the kinds of decisions that information specialists are well placed to advise on.

However, adding more databases with similar coverage may not be effective. It can be more helpful to spend time on other techniques to find other types of evidence. The searches for grey literature and unpublished data can be time‐consuming, given the difficulties of doing them in a transparent and reproducible way. Therefore, search planning and iterative steps are required to understand the evidence base in order to plan the optimal approach to that review and the types of evidence it requires. Realist review literature search methods may be a good model to follow, because of their exploratory approach to searching.

There are some key lessons from the literature that can be applied to public health reviews. In public health topics as in other fields, there is no set number or list of sources that should be searched. Public health is also a discipline that benefits, perhaps more than clinical disciplines, from pre‐search planning and consideration of the type of information being sought. Searching additional sources to retrieve grey literature may be particularly rewarding when seeking evidence on populations or interventions that are harder to find in journals. The additional sources are, however, not guaranteed to retrieve unique papers, and they need to be searched carefully to avoid introducing new sources of bias.

CONFLICT OF INTEREST

Andrea Heath has no interests to declare. Paul Levay has no interests to declare. Daniel Tuvey has no interests to declare.

ACKNOWLEDGEMENTS

The authors would like to thank Marion Spring, Liz Walton, Nicola Walsh and Lynda Ayiku.

APPENDIX A.

SEARCH STRATEGY

The search strategy was developed in the medline bibliographic database (Ovid interface, 1946 to February Week 2 2019) and adapted as appropriate.

MEDLINE

1 exp "Information Storage and Retrieval"/
2 *Information Services/
3 Medical Subject Headings/
4 *Information Systems/
5 Databases, Bibliographic/ or Databases as Topic/ or PubMed/ or Medline/
6 Search Engine/
7 Public Health Informatics/
8 Librarians/ or Libraries, Medical/ or Library Services/
9 (medline or embase or pubmed or cinahl or "Cumulative Index to Nursing and Allied Health Literature" or psycinfo or assia or "applied social sciences index and abstracts" or BNI or british nursing index or google scholar or scopus or cochrane library or ovid or ebsco or wiley).ti.
10 ((database* or source*) adj3 (select* or choos* or choice* or compar* or valu*)).tw.
11 (search* adj3 (strateg* or method* or technique* or question* or approach* or precision or effectiv* or efficien* or recall* or literature or citation* or electronic* or hand* or online* or multifacet* or multi facet* or database* or iterative* or evidence or supplementary)).ti.
12 ((grey or gray) adj3 literature*).ti.
13 (information adj1 (specialist* or scientist* or professional* or retrieval*)).ti.
14 librarian*.ti.
15 (berrypick* or berry-pick* or pearl grow* or snowball*).tw.
16 (reference adj2 (harvest* or check*)).tw.
17 ((Literature or data) adj2 source*).ti.
18 or/1-17
19 "Review Literature as Topic"/
20 Practice Guidelines as Topic/
21 exp evidence-based practice/
22 government publications as topic/ or consensus development conferences as topic/
23 Meta-Analysis as Topic/
24 Technology Assessment, Biomedical/ or Comparative Effectiveness Research/
25 (systematic reviews or qualitative studies or rcts or randomised controlled trials or observational studies or health technology assessments or guidelines or comparative effectiveness or meta analyses or metaanalyses or metanalyses).ti.
26 (evidence adj3 synthes*).tw.
27 ("evidence based" adj1 (practice or medicine or health or nursing)).ti.
28 or/19-27
29 18 and 28
30 limit 29 to english language
31 limit 30 to yr="2015-Current"

APPENDIX B.

INCLUSION/EXCLUSION CRITERIA

Inclusion criteria

Subject of study: refers to literature searching with databases; or refers to non-database or other search approaches; or refers to databases, sources or systems.

Public health relevance: refers to public health guidance; or refers to public health topics; or could be applied to public health topics.

Exclusion criteria

Types of studies: non-English language; published pre-2015.

Notes

Heath, A., Levay, P., & Tuvey, D. (2022). Literature searching methods or guidance and their application to public health topics: A narrative review. Health Information & Libraries Journal, 39, 6–21. https://doi.org/10.1111/hir.12414

Funding information

This study was conducted as part of NICE methods development, and no additional funding was received.

REFERENCES