Literature searches: what databases are available?
Posted on 6th April 2021 by Izabel de Oliveira
Many types of research require a search of the medical literature as part of the process of understanding the current evidence or knowledge base. This can be done using one or more biomedical bibliographic databases. 
Bibliographic databases make the information contained in the papers more visible to the scientific community and facilitate locating the desired literature.
This blog describes some of the main bibliographic databases which index medical journals.
PubMed was launched in 1996 and, since June 1997, has provided free and unlimited access to all users over the internet. The PubMed database contains more than 30 million references to biomedical literature from approximately 7,000 journals. Most PubMed records (95%) come from MEDLINE, which contains 25 million records from over 5,600 journals. The remaining records come from other sources, such as in-process citations, ‘Ahead of Print’ citations, and the NCBI Bookshelf.
The second largest component of PubMed is PubMed Central (PMC). Launched in 2000, PMC is a permanent collection of full-text life sciences and biomedical journal articles. PMC includes articles deposited by journal publishers, as well as author manuscripts of published articles that are submitted in compliance with the public access policies of the National Institutes of Health (NIH) and other research funding agencies. PMC contains approximately 4.5 million articles.
Some National Library of Medicine (NLM) resources associated with PubMed are the NLM Catalog and MedlinePlus. The NLM Catalog contains bibliographic records for over 1.4 million journals, books, audiovisuals, electronic resources, and other materials. It also includes detailed indexing information for journals in PubMed and other NCBI databases, although not all materials in the NLM Catalog are part of NLM’s collection. MedlinePlus is a consumer health website providing information on various health topics, drugs, dietary supplements, and health tools.
MeSH (Medical Subject Headings) is the NLM controlled vocabulary used for indexing articles in PubMed. It is used by indexers who analyze and maintain the PubMed database to reflect the subject content of journal articles as they are published. Indexers typically select 10–12 MeSH terms to describe every paper.
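As a rough illustration of how a MeSH term can be used when searching, the sketch below builds a search URL for NCBI's public E-utilities `esearch` endpoint, restricting the query to the MeSH field with the `[MeSH Terms]` tag. The function name `build_mesh_search_url` and the example term are assumptions made for this example, and the code only constructs the URL; it does not send any request.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_mesh_search_url(mesh_term, max_results=20):
    """Build a PubMed search URL limited to one MeSH term.

    build_mesh_search_url is a hypothetical helper written for this example.
    """
    params = {
        "db": "pubmed",                          # search the PubMed database
        "term": f'"{mesh_term}"[MeSH Terms]',    # restrict the term to the MeSH field
        "retmode": "json",                       # ask for a JSON response
        "retmax": max_results,                   # maximum number of IDs to return
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"

url = build_mesh_search_url("Hypertension")
print(url)
```

Opening the printed URL in a browser should return a list of PubMed IDs for records indexed with that heading.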
Embase is considered the second most popular database after MEDLINE. It is estimated to contain more than 32 million records from over 8,200 journals published in more than 95 countries, as well as ‘grey literature’ in the form of over 2.4 million conference abstracts.
Embase covers health care subtopics such as complementary and alternative medicine, prognostic studies, telemedicine, psychiatry, and health technology. It is also widely used for research on drug-related topics, as it offers better coverage of pharmaceutics-related literature than MEDLINE.
In 2010, Embase began to include all MEDLINE citations. MEDLINE records are delivered to Elsevier daily and are incorporated into Embase after de-duplication with records already indexed by Elsevier to produce ‘MEDLINE-unique’ records. These MEDLINE-unique records are not re-indexed by Elsevier. However, their indexing is mapped to Emtree terms used in Embase to ensure that Emtree terminology can be used to search all Embase records, including those originally derived from MEDLINE.
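The de-duplication step described above can be pictured with a minimal sketch. It assumes, purely for illustration, that every record carries a shared identifier in an `id` field; how Elsevier actually matches records is not described here, and the function name `merge_unique` and the sample records are invented.

```python
def merge_unique(embase_records, medline_records):
    """Return the Embase records plus any MEDLINE records not already present.

    Records are matched on a hypothetical "id" field (an assumption for this sketch).
    """
    seen = {rec["id"] for rec in embase_records}
    merged = list(embase_records)
    for rec in medline_records:
        if rec["id"] not in seen:
            # Only records missing from Embase are added, labelled as MEDLINE-unique.
            merged.append({**rec, "source": "MEDLINE-unique"})
            seen.add(rec["id"])
    return merged

# Invented sample data: "b2" appears in both sources and must not be duplicated.
embase = [{"id": "a1", "title": "Drug trial"}, {"id": "b2", "title": "Device study"}]
medline = [{"id": "b2", "title": "Device study"}, {"id": "c3", "title": "Cohort study"}]

merged = merge_unique(embase, medline)
print([rec["id"] for rec in merged])
```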
Since this coverage expansion, a search in Embase alone should, at least in theory and setting aside the different indexing practices of the two databases, cover every record in both Embase and MEDLINE, making Embase a possible “one-stop” search engine for medical research.
Emtree is a hierarchically structured, controlled vocabulary for biomedicine and the related life sciences. It includes a whole range of terms for drugs, diseases, medical devices, and essential life science concepts. Emtree is used to index all of the Embase content. This process includes full-text indexing of journal articles, which is done by experts.
LILACS, the most important index of the technical and scientific literature of Latin America and the Caribbean, was created in 1985 to record scientific and technical production in health. It is maintained and updated by a network of more than 600 education, government, and health research institutions, and is coordinated by the Latin American and Caribbean Center on Health Sciences Information (BIREME), the Pan American Health Organization (PAHO), and the World Health Organization (WHO).
LILACS contains free-access scientific and technical literature from over 908 journals in 26 countries in Latin America and the Caribbean. It holds about 900,000 records, including peer-reviewed articles, theses and dissertations, government documents, conference proceedings, and books; more than 480,000 of these records link to open-access full text.
The LILACS Methodology is a set of standards, manuals, guides, and applications in continuous development, intended for the collection, selection, description, and indexing of documents, and for the generation of databases. This centralised methodology enables Latin American and Caribbean countries to cooperate in creating local and national databases, all of which feed into the LILACS database. Currently, the LILACS, BBO, BDENF, and MEDCARIB databases, together with the national databases of the countries of Latin America, are part of the LILACS System.
Health Sciences Descriptors (DeCS) is the multilingual, structured vocabulary created by BIREME to serve as a single common language for indexing articles from scientific journals, books, congress proceedings, technical reports, and other types of materials. It is also used for searching and retrieving subjects from the scientific literature in the information sources available on the Virtual Health Library (VHL), such as LILACS and MEDLINE. DeCS was developed from MeSH to allow common terminology to be used for searching in multiple languages, and to provide a consistent environment for information retrieval. The DeCS vocabulary is dynamic and totals 34,118 descriptors and qualifiers, of which 29,716 come from MeSH and 4,402 are exclusive to DeCS.
The Cochrane Central Register of Controlled Trials (CENTRAL) is a database of reports of randomized and quasi-randomized controlled trials. Most records are obtained from the bibliographic databases PubMed and Embase, with additional records from the published and unpublished sources of CINAHL, ClinicalTrials.gov, and the WHO’s International Clinical Trials Registry Platform.
Although CENTRAL was first published in 1996, records are included irrespective of their date of publication, and the language of publication does not restrict inclusion either. You won’t find the full text of an article on CENTRAL, but there is often a summary of the article, in addition to the standard details of author, source, and year.
Within CENTRAL, there are ‘Specialized Registers’ which are collected and maintained by Cochrane Review Groups (plus a few Cochrane Fields), which include reports of controlled trials relevant to their area of interest. Some Cochrane Centres search the general healthcare literature of their countries or regions in order to contribute records to CENTRAL.
ScienceDirect is Elsevier’s most important peer-reviewed academic literature platform. It was launched in 1997 and contains 16 million records from over 2,500 journals, including over 250 Open Access publications, such as Cell Reports and The Lancet Global Health, as well as 39,000 eBooks.
ScienceDirect topics include:
- health sciences;
- life sciences;
- physical sciences;
- social sciences and humanities.
Web of Science
Web of Science (previously Web of Knowledge) is an online scientific citation indexing service created in 1997 by the Institute for Scientific Information (ISI), and currently maintained by Clarivate Analytics.
Web of Science covers several fields of the sciences, social sciences, and arts and humanities. Its main resource is the Web of Science Core Collection which includes over 1 billion cited references dating back to 1900, indexed from 21,100 peer-reviewed journals, including Open Access journals, books and proceedings.
Web of Science also offers regional databases which cover:
- Latin America (SciELO Citation Index);
- China (Chinese Science Citation Database);
- Korea (Korea Citation Index);
- Russia (Russian Science Citation Index).
To make a search more precise, we can place Boolean operators between our keywords.
We use Boolean operators to focus on a topic, particularly when the topic involves multiple search terms, and to connect various pieces of information in order to find exactly what we are looking for.
Boolean operators connect the search words to either narrow or broaden the set of results. The three basic Boolean operators are AND, OR, and NOT.
- AND narrows a search by telling the database that all keywords used must be found in the article in order for it to appear in our results.
- OR broadens a search by telling the database that any of the words it connects are acceptable (this is useful when we are searching for synonymous words).
- NOT narrows a search by telling the database to exclude from our results any article containing the terms that follow it (this is helpful when we are interested in a specific aspect of a topic or when we want to exclude a type of article).
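One way to picture the three operators is as set operations on the articles that match each keyword. The sketch below uses a toy index with invented keywords and article IDs; real databases apply the same logic to far larger indexes, and their exact query syntax varies from one database to another.

```python
# Toy index: each keyword maps to the set of article IDs that mention it.
# The keywords and IDs are invented for illustration.
index = {
    "hypertension": {1, 2, 3, 5},
    "high blood pressure": {3, 4},
    "exercise": {2, 3, 4, 6},
    "review": {3, 6},
}

def search_and(a, b):
    """AND: articles that contain both keywords (set intersection)."""
    return index[a] & index[b]

def search_or(a, b):
    """OR: articles that contain either keyword (set union)."""
    return index[a] | index[b]

def search_not(a, b):
    """NOT: articles that contain the first keyword but not the second."""
    return index[a] - index[b]

# (hypertension OR "high blood pressure") AND exercise NOT review
synonyms = search_or("hypertension", "high blood pressure")
result = (synonyms & index["exercise"]) - index["review"]
print(sorted(result))  # prints [2, 4]
```

Here OR widens the set to five articles by accepting either synonym, AND cuts it down to those that also mention exercise, and NOT removes the one remaining review.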
You may also be interested in the following blogs for further reading:
Conducting a systematic literature search
Reviewing the evidence: what method should I use?
Cochrane Crowd for students: what’s in it for you?