Scan websites for HTTPS deployment best practices

Pushing HTTPS :lock:

pshtt ("pushed") is a tool to scan domains for HTTPS best practices. It saves its results to a CSV (or JSON) file.

pshtt was developed to push organizations — especially large ones like the US Federal Government :us: — to adopt HTTPS across the enterprise. Federal agencies must comply with M-15-13, a 2015 memorandum from the White House Office of Management and Budget, and BOD 18-01, a 2017 directive from the Department of Homeland Security, which require federal agencies to enforce HTTPS on their public web services. Much has been done, but there's more yet to do.

pshtt is a collaboration between the Department of Homeland Security's National Cybersecurity Assessments and Technical Services (NCATS) team and the General Services Administration's 18F team, with contributions from NASA, Lawrence Livermore National Laboratory, and various non-governmental organizations.

Getting Started

pshtt requires Python 3.4+. Python 2 is not supported.

pshtt can be installed as a module, or run directly from the repository.

Installed as a module

pshtt can be installed directly via pip:

pip install pshtt

It can then be run directly:

pshtt example.com [options]

Running directly

To run the tool locally from the repository, without installing, first install the requirements:

pip install -r requirements.txt

Then run it as a module via python -m:

python -m pshtt.cli example.com [options]

Usage and examples

pshtt [options] DOMAIN...
pshtt [options] INPUT

pshtt dhs.gov
pshtt --output=homeland.csv --debug dhs.gov us-cert.gov usss.gov
pshtt --sorted current-federal.csv

Note: if INPUT ends with .csv, domains will be read from the first column of the CSV. CSV output will always be written to disk (unless --json is specified), defaulting to results.csv.
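
As a rough illustration of the CSV input rule above, the following sketch pulls domains from the first column of a CSV file. The function name read_domains is hypothetical and is not part of pshtt; the sketch also makes no assumption about header rows, so a header line would be returned like any other row.

```python
import csv
import io


def read_domains(csv_file):
    """Return the first-column value of every non-empty row."""
    domains = []
    for row in csv.reader(csv_file):
        # csv.reader yields an empty list for blank lines; skip those.
        if row and row[0].strip():
            domains.append(row[0].strip())
    return domains


# In-memory stand-in for a file such as current-federal.csv.
sample = io.StringIO("dhs.gov,Homeland Security\n\nusss.gov,Secret Service\n")
domains = read_domains(sample)
```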

Options

  -h --help                     Show this message.
  -s --sorted                   Sort output by domain, A-Z.
  -o --output=OUTFILE           Name output file. (Defaults to "results".)
  -j --json                     Get results in JSON. (Defaults to CSV.)
  -m --markdown                 Get results in Markdown. (Defaults to CSV.)
  -d --debug                    Print debug output.
  -u --user-agent=AGENT         Override user agent.
  -t --timeout=TIMEOUT          Override timeout (in seconds).
  -c --cache-third-parties=DIR  Cache third party data, and what directory to cache it in.
  -f --ca-file=PATH             Specify custom CA bundle (PEM format).

Using your own CA Bundle

By default, pshtt relies on the root CAs that are trusted in the Mozilla root store. If you work behind a corporate proxy or have your own certificates that aren't publicly trusted, you can specify your own CA bundle:

pshtt --ca-file=/etc/ssl/ca.pem server.internal-location.gov

Using Docker (optional)

./run [opts]

opts are the same arguments that would get passed to pshtt.

What's Checked?

A domain is checked on its four endpoints:

  • http://
  • http://www
  • https://
  • https://www
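
The four endpoints can be spelled out for a concrete domain with a few lines of Python. This is only a sketch: endpoint_urls is a hypothetical helper, not a pshtt function, and it assumes the input is a bare domain without a scheme or a leading "www.".

```python
def endpoint_urls(domain):
    """Build the four endpoint URLs that are checked for a bare domain."""
    return [
        "http://" + domain,
        "http://www." + domain,
        "https://" + domain,
        "https://www." + domain,
    ]


urls = endpoint_urls("dhs.gov")
```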

The following values are returned in results.csv:

Domain and redirect info

  • Domain - The domain you're scanning!
  • Base Domain - The base domain of Domain. For example, for a Domain of sub.example.com, the Base Domain will be example.com. Usually this is the second-level domain, but pshtt will download and factor in the Public Suffix List when calculating the base domain. (To cache the Public Suffix List, use --cache-third-parties as documented above.)
  • Canonical URL - One of the four endpoints described above; a judgment call based on the observed redirect logic of the domain.
  • Live - The domain is "live" if any endpoint is live.
  • Redirect - The domain is a "redirect domain" if at least one endpoint is a redirect, and all endpoints are either redirects or down.
  • Redirect to - If a domain is a "redirect domain", where does it redirect to?
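
The Live and Redirect definitions above reduce to two boolean rules over the four endpoints. The sketch below restates them in Python; the Endpoint tuple and both function names are hypothetical and only mirror the wording of this list, not pshtt's internal data model.

```python
from collections import namedtuple

# Hypothetical per-endpoint observation: is it reachable, and is it a redirect?
Endpoint = namedtuple("Endpoint", ["url", "live", "redirect"])


def is_live(endpoints):
    """A domain is live if any endpoint is live."""
    return any(e.live for e in endpoints)


def is_redirect_domain(endpoints):
    """At least one endpoint redirects, and every endpoint redirects or is down."""
    any_redirect = any(e.redirect for e in endpoints)
    all_redirect_or_down = all(e.redirect or not e.live for e in endpoints)
    return any_redirect and all_redirect_or_down


observed = [
    Endpoint("http://example.com", live=True, redirect=True),
    Endpoint("http://www.example.com", live=False, redirect=False),
    Endpoint("https://example.com", live=True, redirect=True),
    Endpoint("https://www.example.com", live=False, redirect=False),
]
```

With the sample data, every endpoint is either a redirect or down, so the domain counts as both live and a redirect domain.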

Landing on HTTPS

  • Valid HTTPS - A domain has "valid HTTPS" if it responds on port 443 at the hostname in its Canonical URL with an unexpired valid certificate for the hostname. This can be true even if the Canonical URL uses HTTP.
  • Defaults to HTTPS - A domain "defaults to HTTPS" if its canonical endpoint uses HTTPS.
  • Downgrades HTTPS - A domain "downgrades HTTPS" if HTTPS is supported in some way, but its canonical HTTPS endpoint immediately redirects internally to HTTP.
  • Strictly Forces HTTPS - This is different from whether a domain "defaults" to HTTPS. A domain "Strictly Forces HTTPS" if one of the HTTPS endpoints is "live", and if both HTTP endpoints are either down or redirect immediately to any HTTPS URI. An HTTP redirect can go to HTTPS on another domain, as long as it's immediate. (A domain with an invalid cert can still be enforcing HTTPS.)
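
The "Strictly Forces HTTPS" rule can be read as a small predicate over the four endpoints. The following is a sketch of that reading only; the Observation tuple, its field names, and strictly_forces_https are hypothetical names chosen for illustration.

```python
from collections import namedtuple

# Hypothetical observation of a single endpoint.
Observation = namedtuple("Observation", ["live", "redirects_immediately_to_https"])


def strictly_forces_https(http, httpwww, https, httpswww):
    """Apply the rule: some HTTPS endpoint is live, and each HTTP endpoint
    is either down or redirects immediately to an https:// URI."""
    any_https_live = https.live or httpswww.live

    def http_ok(endpoint):
        return (not endpoint.live) or endpoint.redirects_immediately_to_https

    return any_https_live and http_ok(http) and http_ok(httpwww)


result = strictly_forces_https(
    http=Observation(live=True, redirects_immediately_to_https=True),
    httpwww=Observation(live=False, redirects_immediately_to_https=False),
    https=Observation(live=True, redirects_immediately_to_https=False),
    httpswww=Observation(live=False, redirects_immediately_to_https=False),
)
```

Certificate validity is deliberately absent from the predicate, matching the note that a domain with an invalid cert can still be enforcing HTTPS.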

Common errors

  • HTTPS Bad Chain - A domain has a bad chain if either HTTPS endpoint contains a bad chain.
  • HTTPS Bad Hostname - A domain has a bad hostname if either HTTPS endpoint fails hostname validation.
  • HTTPS Expired Cert - A domain has an expired certificate if either HTTPS endpoint has an expired certificate.

HSTS

  • HSTS - A domain has HTTP Strict Transport Security enabled if its canonical HTTPS endpoint has HSTS enabled.
  • HSTS Header - This field provides a domain's HSTS header at its canonical endpoint.
  • HSTS Max Age - A domain's HSTS max-age is its canonical endpoint's max-age.
  • HSTS Entire Domain - A domain has HSTS enabled for the entire domain if its root HTTPS endpoint (not the canonical HTTPS endpoint) has HSTS enabled and uses the HSTS includeSubDomains flag.
  • HSTS Preload Ready - A domain is HSTS "preload ready" if its root HTTPS endpoint (not the canonical HTTPS endpoint) has HSTS enabled, has a max-age of at least 18 weeks, and uses the includeSubDomains and preload flags.
  • HSTS Preload Pending - A domain is "preload pending" when it appears in the Chrome preload pending list with the include_subdomains flag equal to true. The intent of pshtt is to make sure that the user is fully protected, so it only counts domains as HSTS preloaded if they are fully HSTS preloaded (meaning that all subdomains are included as well).
  • HSTS Preloaded - A domain is HSTS preloaded if its domain name appears in the Chrome preload list with the include_subdomains flag equal to true, regardless of what header is present on any endpoint. The intent of pshtt is to make sure that the user is fully protected, so it only counts domains as HSTS preloaded if they are fully HSTS preloaded (meaning that all subdomains are included as well).
  • Base Domain HSTS Preloaded - A domain's base domain is HSTS preloaded if its base domain appears in the Chrome preload list with the include_subdomains flag equal to true. This is subtly different from HSTS Entire Domain, which inspects headers on the base domain to see if HSTS is set correctly to encompass the entire zone.
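
To make the header-based HSTS fields concrete, here is a minimal sketch that parses a Strict-Transport-Security header value and applies the "preload ready" thresholds described above. The names parse_hsts and preload_ready are hypothetical, and the parsing is simplified; it is not pshtt's own parser. In pshtt's terms the header would be the one served by the root HTTPS endpoint.

```python
EIGHTEEN_WEEKS = 18 * 7 * 24 * 60 * 60  # 18 weeks expressed in seconds


def parse_hsts(header):
    """Return (max_age, include_subdomains, preload) from an HSTS header value."""
    max_age = None
    include_subdomains = False
    preload = False
    for directive in header.split(";"):
        name, _, value = directive.strip().partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        value = value.strip().strip('"')
        if name == "max-age" and value.isdecimal():
            max_age = int(value)
        elif name == "includesubdomains":
            include_subdomains = True
        elif name == "preload":
            preload = True
    return max_age, include_subdomains, preload


def preload_ready(header):
    """max-age of at least 18 weeks, plus includeSubDomains and preload."""
    max_age, include_subdomains, preload = parse_hsts(header)
    return (
        max_age is not None
        and max_age >= EIGHTEEN_WEEKS
        and include_subdomains
        and preload
    )


ready = preload_ready("max-age=31536000; includeSubDomains; preload")
```

Note that this covers only the header-derived fields. HSTS Preloaded and HSTS Preload Pending depend on the Chrome lists, not on any header.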

Scoring

These three fields use the previous results to come to high-level conclusions about a domain's behavior.

  • Domain Supports HTTPS - A domain 'Supports HTTPS' when it doesn't downgrade and has valid HTTPS, or when it doesn't downgrade and has a bad chain but not a bad hostname (a bad hostname makes it clear the domain isn't actively attempting to support HTTPS, whereas an incomplete chain is just a mistake). Domains with a bad chain "support" HTTPS, but user-side errors can be expected.
  • Domain Enforces HTTPS - A domain that 'Enforces HTTPS' must 'Support HTTPS' and default to HTTPS. For websites (where Redirect is false) they are allowed to eventually redirect to an https:// URI. For "redirect domains" (domains where the Redirect value is true) they must immediately redirect clients to an https:// URI (even if that URI is on another domain) in order to be said to enforce HTTPS.
  • Domain Uses Strong HSTS - A domain 'Uses Strong HSTS' when its HSTS max-age is at least 31536000 seconds (one year).
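
Two of the scoring fields translate directly into boolean expressions over the earlier results. The sketch below shows 'Supports HTTPS' and 'Uses Strong HSTS' only; the function names and parameters are hypothetical, and 'Enforces HTTPS' is left out because it also depends on redirect timing that a few booleans cannot capture faithfully.

```python
ONE_YEAR = 31536000  # seconds, the threshold quoted for strong HSTS


def supports_https(valid_https, downgrades, bad_chain, bad_hostname):
    """No downgrade, and either valid HTTPS or a bad chain without a bad hostname."""
    if downgrades:
        return False
    return valid_https or (bad_chain and not bad_hostname)


def uses_strong_hsts(max_age):
    """True when an HSTS max-age is present and at least one year."""
    return max_age is not None and max_age >= ONE_YEAR


example = supports_https(
    valid_https=False, downgrades=False, bad_chain=True, bad_hostname=False
)
```

The example domain has an incomplete chain but a matching hostname, so under the rule above it still counts as supporting HTTPS.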

Troubleshooting

DNS Blackhole / DNS Assist

One issue that can occur when running pshtt, particularly on home/residential networks with standard ISPs, is the use of "DNS Assist" features, a.k.a. "DNS Blackholes".

In these environments, you may see inconsistent results from pshtt owing to the fact that your ISP is attempting to detect a request for an unknown site without a DNS record and is redirecting you to a search page for that site. This means that an endpoint which should resolve as "not-alive", will instead resolve as "live", owing to the detection of the live search result page.
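
One way to check whether your network behaves this way is to look up a name that should never resolve and see whether an address comes back anyway. The sketch below is not part of pshtt; dns_assist_suspected and the probe label are hypothetical, and it assumes that names under the reserved .invalid top-level domain do not resolve on an unmodified resolver.

```python
import socket
import uuid


def dns_assist_suspected(resolve=socket.gethostbyname):
    """Return True if a name that should not exist resolves anyway."""
    # Random label under the reserved .invalid TLD, so no real record can match.
    probe = "pshtt-probe-{}.invalid".format(uuid.uuid4().hex)
    try:
        resolve(probe)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True
```

Calling dns_assist_suspected() with no arguments performs a real DNS lookup. The resolve parameter exists so the check can be exercised without network access.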

If you would like to disable this "feature", several ISPs offer the ability to opt out of this service and maintain their own instructions for doing so.

Who uses pshtt?

Acknowledgements

This code was modeled after Ben Balter's site-inspector, with significant guidance from Eric Mill.

Public domain

This project is in the worldwide public domain.

This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication.

All contributions to this project will be released under the CC0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.
