har.fyi 🧪

Technology struct

Appears in: pages table

Technologies are detected by Wappalyzer. Refer to HTTP Archive’s fork of the Wappalyzer repository on GitHub to request a new technology detection or to browse the source code of existing detections.

Example queries

Pages using WordPress in the top 1k

As the technologies field is a repeated struct, we need to use UNNEST to query it.

SELECT DISTINCT
  root_page
FROM
  `httparchive.all.pages`,
  UNNEST(technologies) AS t
WHERE
  date = '2023-09-01' AND
  rank = 1000 AND
  t.technology = 'WordPress'
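
The filter on rank can be widened to take in more sites. The following is a minimal sketch, assuming that rank stores the upper bound of a site's popularity bucket (1000, 5000, 10000, and so on) rather than an exact position. That assumption is not stated on this page, so verify it against the pages table documentation before relying on it.

```sql
-- Sketch: WordPress sites ranked within the top 5,000.
-- ASSUMPTION: rank holds a bucket upper bound (1000, 5000, 10000, ...),
-- so `rank <= 5000` covers both the 1000 and 5000 buckets.
SELECT DISTINCT
  root_page
FROM
  `httparchive.all.pages`,
  UNNEST(technologies) AS t
WHERE
  date = '2023-09-01' AND
  rank <= 5000 AND
  t.technology = 'WordPress'
```

Under the same assumption, rank = 1000 matches only the most popular bucket, while rank = 5000 on its own would exclude the sites in that first bucket.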

Top 10 CMSs

Within the technologies field, the categories field is also repeated. We can use UNNEST to query it as well.

It’s straightforward to detect whether a single page uses a technology. To generalize that to an entire website (or origin), we treat the site as using the technology if either its home page or a secondary page uses it. To handle this in the query, we count the distinct root_page values rather than the individual pages.

SELECT
  t.technology AS cms,
  COUNT(DISTINCT root_page) AS sites
FROM
  `httparchive.all.pages`,
  UNNEST(technologies) AS t,
  UNNEST(t.categories) AS category
WHERE
  date = '2023-09-01' AND
  category = 'CMS'
GROUP BY
  cms
ORDER BY
  sites DESC
LIMIT 
  10
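
To see how much the site-level definition matters, the two ways of counting can be put side by side. This is a sketch only: it assumes the pages table exposes a boolean is_root_page column that is true for home pages, which this page does not describe.

```sql
-- Sketch: compare home-page-only detection with whole-site detection.
-- ASSUMPTION: `is_root_page` is a BOOL column on the pages table.
SELECT
  t.technology AS cms,
  -- Sites where the CMS was detected on the home page itself
  COUNT(DISTINCT IF(is_root_page, root_page, NULL)) AS sites_home_page_only,
  -- Sites where the CMS was detected on the home page or a secondary page
  COUNT(DISTINCT root_page) AS sites_any_page
FROM
  `httparchive.all.pages`,
  UNNEST(technologies) AS t,
  UNNEST(t.categories) AS category
WHERE
  date = '2023-09-01' AND
  category = 'CMS'
GROUP BY
  cms
ORDER BY
  sites_any_page DESC
LIMIT
  10
```

The second column can never be smaller than the first, because every home page detection also counts toward its site.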

Top 10 WordPress versions

There is usually only one version of a technology on a given page, but in some cases a page loads the same technology more than once. For example, multiple widgets may each load a different version of jQuery.

To account for these edge cases, the info field is also repeated, so we need to use UNNEST to query it as well.

Also note that some pages omit version numbers, so you may see empty or null values in the results.

Regular expressions can be used to parse major version numbers, for example REGEXP_EXTRACT(version, r'^(\d+)'). Beware of garbage values, as the version info is extracted from the source HTML. For example, you may encounter a subset of pages with a version number that hasn’t even been released yet.

SELECT
  APPROX_TOP_COUNT(version, 10)
FROM
  `httparchive.all.pages`,
  UNNEST(technologies) AS t,
  UNNEST(t.info) AS version
WHERE
  date = '2023-09-01' AND
  t.technology = 'WordPress'
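
The major-version parsing described above can be combined with the site-level counting used for the CMS query. The sketch below is one possible arrangement, not a canonical query. The aliases major_version and sites are illustrative names.

```sql
-- Sketch: WordPress sites grouped by major version number.
SELECT
  -- Leading run of digits, e.g. '6' from '6.3.1'; NULL when the value
  -- does not start with a digit (one kind of garbage value)
  REGEXP_EXTRACT(version, r'^(\d+)') AS major_version,
  COUNT(DISTINCT root_page) AS sites
FROM
  `httparchive.all.pages`,
  UNNEST(technologies) AS t,
  UNNEST(t.info) AS version
WHERE
  date = '2023-09-01' AND
  t.technology = 'WordPress' AND
  -- Drops empty strings; NULL versions are dropped too, because the
  -- comparison evaluates to NULL
  version != ''
GROUP BY
  major_version
ORDER BY
  sites DESC
LIMIT
  10
```

Values that do not begin with a digit collapse into a single NULL group. Implausibly high major versions still appear as their own rows and need to be judged by eye.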

Schema

Field name | Type | Description
technology | STRING | Name of the detected technology
categories | ARRAY<STRING> | List of categories to which this technology belongs
info | ARRAY<STRING> | Additional metadata about the detected technology, e.g. version number

technology

Name of the detected technology

categories

List of categories to which this technology belongs

info

Additional metadata about the detected technology, e.g. version number
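
To see how the three fields fit together without scanning the real table, the struct can be mocked up inline. Everything in this sketch is hypothetical: the URL, the category names, and the version strings are made-up sample values chosen only to mirror the schema.

```sql
-- Sketch: one fake page carrying two Technology structs.
-- All literal values below are invented for illustration.
WITH pages AS (
  SELECT
    'https://example.com/' AS root_page,
    [
      STRUCT(
        'WordPress' AS technology,
        ['CMS', 'Blogs'] AS categories,
        ['6.3.1'] AS info
      ),
      STRUCT(
        'jQuery' AS technology,
        ['JavaScript libraries'] AS categories,
        ['3.7.0', '1.12.4'] AS info
      )
    ] AS technologies
)
SELECT
  root_page,
  t.technology,
  category,
  version
FROM
  pages,
  UNNEST(technologies) AS t,
  UNNEST(t.categories) AS category,
  UNNEST(t.info) AS version
```

Each UNNEST multiplies the rows. The WordPress entry yields two rows (two categories, one version) and the jQuery entry yields two rows (one category, two versions). This is why the example queries count distinct root_page values instead of counting rows.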