HTTPS at NCBI: Guidance for Users

What is happening?

To improve security and privacy, and by Federal government mandate, NCBI moved its Web sites to HTTPS only by September 30, 2016.

To give software vendors time to respond, that deadline was extended for users of NCBI Web APIs to November 9, 2016.

If you use NCBI only through a Web browser (such as Safari, Firefox, Chrome, Internet Explorer, or Opera), this change does not require any action from you. The only differences you should notice after the deadline are that a green lock icon appears in the browser's address box, and that the web addresses of the NCBI pages you visit start with https://.

If you maintain software that uses NCBI APIs or accesses NCBI servers through the Web, you should understand the change and act before the deadline to ensure uninterrupted service.

NCBI Web services include APIs such as NCBI eutilities and BLAST URLAPI that client applications use to access NCBI data. A number of them (though not a comprehensive set) are listed on or linked from our APIs page.

Applications that access NCBI web servers using http:// URLs, instead of https:// URLs, may fail partially or completely after NCBI switches to HTTPS-only.

This document explains our transition plan, and provides guidance to developers about how to update their applications (scripts, server-side applications like CGIs, browser plugins, etc.), before the switchover, to prevent failure.

NCBI is moving all web services to HTTPS

The HTTP protocol does not provide encryption, so anyone who can see web traffic between a client (for example, a web browser) and a server can intercept potentially sensitive information, and/or inject malware into users' browsers or operating systems. HTTPS solves this problem. It works just like HTTP, except that traffic is encrypted in both directions, so observers between the client and the server can't intercept or tamper with the requests or responses. It also provides authentication, ensuring that the client is communicating with the intended server given by the hostname, and not some impostor.

The Federal Office of Management and Budget requires all Federal Web sites to switch to HTTPS-only (meaning, HTTP will be disabled) by December 31, 2016. However, NCBI, being a part of the National Library of Medicine, had an earlier deadline of September 30, 2016.

All public-facing web pages at NCBI now operate exclusively over https. To give software vendors and their customers more time to update their software, NCBI extended the deadline for web service https compliance to November 9, 2016.

Update your applications as soon as possible

NCBI Web resources are all available now on HTTPS, so you can update your software immediately.

To ensure that your applications work before and after the switchover, update them so that URLs for all requests to NCBI servers start with https: instead of http:. For example, if your application searches PubMed using http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi, update it to use https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi instead. Please report any problems you encounter to info@ncbi.nlm.nih.gov.
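As a minimal sketch of this update, the Python helper below rewrites the scheme of a URL from http to https when the host belongs to NCBI. The function name to_https and the constant NCBI_DOMAIN are illustrative assumptions, not part of any NCBI library, and the helper deliberately leaves URLs with an explicit port untouched.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: any host at or under this domain is treated as an NCBI server.
NCBI_DOMAIN = "ncbi.nlm.nih.gov"


def to_https(url):
    """Return url with its scheme upgraded to https if it targets an NCBI host.

    URLs for other hosts, URLs that are already https, and URLs that carry
    an explicit port number are returned unchanged.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    is_ncbi = host == NCBI_DOMAIN or host.endswith("." + NCBI_DOMAIN)
    if parts.scheme == "http" and is_ncbi and parts.port is None:
        # Keep netloc, path, query, and fragment; change only the scheme.
        return urlunsplit(("https",) + tuple(parts[1:]))
    return url


if __name__ == "__main__":
    print(to_https("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"))
```

In practice it is usually simpler to change the base URL constant in your source code; a helper like this is mainly useful when URLs come from configuration files or user input.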

Many script authors access NCBI services using third-party libraries like biojava, bioperl, biopython, bioruby, etc. In these cases, you may be able to update your application by simply updating the library you use to the most recent version. The table below provides information on versions of libraries we know about that already use HTTPS to interact with NCBI servers.

Library        Uses HTTPS for NCBI services?   Compliant Version
BioJava        Yes                             biojava-legacy 1.9.3; biojava 4.2.4
BioPerl        Yes                             BioPerl 1.7 (pull request)
Biopython      Yes                             Biopython 1.67 (release notes)
BioRuby        Yes                             BioRuby 1.5.1 (github issue, release notes)
biogo          Yes                             most recent HEAD revision of master branch (github issue)
reutils (R)    Yes                             0.2.3, see https://github.com/gschofl/reutils (github issue)

Once you have updated and tested your application, it will continue to work as before, and no other action is required. This is the best option for scripts, CGIs, and other Web client software for which you have the source code and the ability to update it and deploy a new release before the deadline.

After November 9, 2016, NCBI HTTP servers will redirect or reject all HTTP requests.

All interactive web traffic to NCBI servers has been successfully moved to HTTPS. After the switchover date, November 9, 2016, NCBI will also begin redirecting http requests for web services such as eutilities and BLAST URLAPI to https.

If you do not update your application before the switchover date, these redirects from NCBI HTTP servers may buy you time to make the updates later.

After November 9, 2016, NCBI servers, including those that host Web services, will:

  1. respond with a server-side redirect (HTTP 301 Moved permanently) to the corresponding URL on HTTPS, only for HTTP GET and HEAD requests;
  2. respond with HTTP 403 Forbidden and an error message, to all requests other than GET and HEAD (including and especially HTTP POST);
  3. include in every HTTPS response an HTTP Strict Transport Security (HSTS) header, which instructs browsers to automatically communicate thereafter only with HTTPS on that domain. (HSTS applies only to browsers, though other Web clients like scripts are free to implement it.) The HSTS header has a 1-year expiration date.
  4. include in every HTTPS response the header Content-Security-Policy: upgrade-insecure-requests, which causes most browsers to automatically upgrade http:// links to https://, automatically avoiding most mixed content problems.
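The first two rules above can be expressed as a small Python model, which may be handy when writing tests for client code. This is a sketch of the behavior as described in this document, not NCBI's server implementation; the function name plain_http_response is an assumption made for illustration.

```python
def plain_http_response(method, url):
    """Model the described response to a request sent over plain HTTP.

    Returns a (status_code, redirect_target) pair, where redirect_target
    is None when the request is rejected instead of redirected.
    """
    prefix = "http://"
    if not url.lower().startswith(prefix):
        raise ValueError("this model only covers requests to http:// URLs")
    if method.upper() in ("GET", "HEAD"):
        # Rule 1: GET and HEAD are redirected to the same URL on HTTPS.
        return 301, "https://" + url[len(prefix):]
    # Rule 2: every other method, including POST, is rejected.
    return 403, None


if __name__ == "__main__":
    url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    print(plain_http_response("GET", url))
    print(plain_http_response("POST", url))
```

The model makes the practical consequence visible: a client that sends POST over HTTP gets no redirect to follow, so it must be changed to use an https:// URL directly.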

After switchover, the HTTP redirects will remain in place for an as-yet undetermined period, but at least until the Federal deadline of December 31, 2016.

After switchover, applications that access NCBI APIs using HTTP may fail

After the switchover date, applications that still try to access NCBI via HTTP (i.e., on port 80) may fail for a few possible reasons:

  1. Your programming environment's HTTP facility does not automatically follow redirects from HTTP to HTTPS. Some libraries follow redirections from HTTP to HTTPS; others do not. Java's URLConnection, for example, does not automatically follow HTTP-to-HTTPS redirects by design, even for safe methods like GET and HEAD.
  2. Your application uses HTTP verbs other than GET and HEAD. All other HTTP requests (including especially POST and PUT requests) to HTTP URLs at NCBI will fail unconditionally (with HTTP 403 Forbidden) after the switchover date.
  3. Your application accesses NCBI resources through a proxy. Some organizations use proxy servers to access the NCBI web site. These proxy servers must communicate with NCBI using https, which means they need valid certificates. If your application accesses NCBI through a proxy, check with the proxy vendor about https support and how to add or update certificates.
  4. Your programming environment does not support HTTPS.

In any of these cases, if the application does not work with https, the only solution is to update all your NCBI URLs to use HTTPS exclusively.
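A quick self-check along these lines can be sketched in Python. The helper names https_supported and insecure_urls are hypothetical. The first relies on the fact that Python's http.client module defines HTTPSConnection only when the ssl module is available; the second simply lists configured URLs that still use plain HTTP.

```python
import http.client


def https_supported():
    """Report whether this Python build can make HTTPS requests."""
    # http.client only defines HTTPSConnection if "import ssl" succeeded.
    return hasattr(http.client, "HTTPSConnection")


def insecure_urls(urls):
    """Return the URLs from the given iterable that still use plain HTTP."""
    return [u for u in urls if u.lower().startswith("http://")]


if __name__ == "__main__":
    configured = [
        "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
        "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
    ]
    print("HTTPS supported:", https_supported())
    print("URLs to update:", insecure_urls(configured))
```

Other languages and HTTP libraries need their own equivalent checks; this sketch only covers the standard Python runtime.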

Some requests will be temporarily exempt from redirection

For various technical reasons, certain requests will be temporarily exempt from redirection. Once the underlying technical issue is resolved, the exemption will be lifted, and redirection will begin without further public warning.

The following http requests will be temporarily exempt from redirection:

  • Requests with request-uri matching the regular expression \.(xsd|xml|dtd|ent)$
  • Requests to the hosts dtd.nlm.nih.gov and jats.nlm.nih.gov
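The two exemption rules can be checked with a short Python function, sketched below. The function name is_exempt is an assumption, and so is the choice to match the regular expression against the path plus query string; this document does not say exactly which part of the request the server tests.

```python
import re
from urllib.parse import urlsplit

# Pattern and host names are taken from the list above.
EXEMPT_URI = re.compile(r"\.(xsd|xml|dtd|ent)$")
EXEMPT_HOSTS = {"dtd.nlm.nih.gov", "jats.nlm.nih.gov"}


def is_exempt(url):
    """Return True if an http request for url matches an exemption rule."""
    parts = urlsplit(url)
    if (parts.hostname or "").lower() in EXEMPT_HOSTS:
        return True
    # Assumption: the request-uri is the path followed by any query string.
    request_uri = parts.path + ("?" + parts.query if parts.query else "")
    return EXEMPT_URI.search(request_uri) is not None


if __name__ == "__main__":
    # The first path is hypothetical and used only to exercise the pattern.
    print(is_exempt("http://www.ncbi.nlm.nih.gov/example/schema.xsd"))
    print(is_exempt("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"))
```

Because the exemptions are temporary, a check like this is useful for auditing existing traffic, not as a reason to keep using http:// URLs.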

Redirects will be maintained indefinitely

All public NCBI servers are already enabled for HTTPS, so you can update your application to use HTTPS now, and test it on our live servers. Once you have updated to HTTPS, no further action is required. Please send questions or report problems to info@ncbi.nlm.nih.gov.

In keeping with current US Federal Government policy, NCBI intends to maintain these redirects on public servers indefinitely. Nevertheless, it is to your advantage to update your applications to use https only as soon as possible, both for performance and security reasons.

About Referrers

A "referrer" is the value of the HTTP request header Referer [sic] (exposed to CGI programs as HTTP_REFERER), which contains the address of the webpage that linked to the page being retrieved. Some websites analyze referrers to better understand their incoming web traffic; for example, to find out what percentage of their traffic comes from a particular search engine. But third-party websites can also use referrer information to discover information about individual users, such as their search terms and the pages they have visited.

Because of this privacy concern, NCBI's website tells web browsers to limit the referrer to just the scheme and domain name (e.g., https://www.ncbi.nlm.nih.gov), and to omit the request URI and query string. This limitation is enforced by the Referrer-Policy HTTP header and the <meta name="referrer"> meta tag. Limiting the referrer to just the scheme and domain name balances the user's right to privacy with website owners' need to understand their web traffic. See The Meta Referrer Tag: An Advancement for SEO and the Internet for a detailed description of the problem and the solution.
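To illustrate what a limited referrer retains, the Python sketch below reduces a page address to its scheme and domain name, as described above. The function name limited_referrer and the sample page address are illustrative assumptions; browsers perform this trimming themselves, and the exact policy value NCBI sends is not stated in this document.

```python
from urllib.parse import urlsplit


def limited_referrer(page_url):
    """Return only the scheme and host portion of page_url."""
    parts = urlsplit(page_url)
    # Drop the path, query string, and fragment; keep scheme and host.
    return parts.scheme + "://" + parts.netloc


if __name__ == "__main__":
    # Illustrative address containing a search term in the query string.
    print(limited_referrer("https://www.ncbi.nlm.nih.gov/pubmed/?term=asthma"))
```

The search term in the query string never reaches the third-party site, while the site can still tell that the visit came from NCBI.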

This policy follows official cio.gov guidance on referrers; see http://bit.ly/gov-https-referrer for details.

For more information

For more on the US Federal government HTTPS-only initiative, see https://https.cio.gov.

For questions, comments, or problems, contact the NCBI service desk at info@ncbi.nlm.nih.gov.