CAPO function
The httparchive.fn.CAPO function takes an HTML response body and returns an array of objects containing the relative performance weighting for each element in the static HTML <head>.
Learn more about using Capo on BigQuery in the capo.js docs.
Input
html
The HTML response body.
Type: STRING
Response bodies can be sourced from the response_body field of the requests table, or the body field of legacy response_bodies tables.
The HTML does not need to be complete and can contain only the <head> element. For faster results, and to avoid hitting the memory limits of BigQuery functions, it's recommended to extract everything before the opening <body> tag using a regular expression like this:
REGEXP_EXTRACT(response_body, r'(?s)(.*?)<body.*?>')
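One way to combine this extraction with the function call is sketched below. It reuses the table and filters from the live input example on this page. The pattern keeps a single capturing group, because BigQuery's REGEXP_EXTRACT rejects patterns with more than one. The COALESCE fallback is an assumption added for illustration: REGEXP_EXTRACT returns NULL when no <body> tag is found, so the sketch falls back to the full response body in that case.

```sql
SELECT
  page,
  httparchive.fn.CAPO(
    COALESCE(
      -- Everything before the first opening <body> tag (single capturing group).
      REGEXP_EXTRACT(response_body, r'(?s)(.*?)<body.*?>'),
      -- Assumption: fall back to the whole body when no <body> tag matches.
      response_body
    )
  ) AS capo
FROM
  `httparchive.all.requests` TABLESAMPLE SYSTEM (0.001 PERCENT)
WHERE
  date = '2023-05-01' AND
  client = 'desktop' AND
  is_main_document
LIMIT
  1
```

Falling back to the full body trades the memory savings for coverage. Dropping the COALESCE would instead pass NULL for pages without a <body> tag.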
Output
An array of Capo objects, one for each element in the <head>.
Type: ARRAY<STRUCT<vizWeight STRING, weight INT64, element STRING>>
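Because the result is an array of structs, it can be flattened into one row per element with UNNEST. The following is a minimal sketch using a small static input. The `capo` and `position` aliases are illustrative names, and the sketch assumes the array preserves the order in which elements appear in the <head>.

```sql
SELECT
  position,        -- zero-based index of the struct within the returned array
  capo.vizWeight,
  capo.weight,
  capo.element
FROM
  UNNEST(httparchive.fn.CAPO('''
<html>
<head>
<title>Example</title>
<meta charset="utf-8">
</head>
</html>
''')) AS capo WITH OFFSET AS position
ORDER BY
  position
```

Once flattened, the weight column can be filtered or aggregated like any other INT64 column.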
Example usage
Static input
SELECT httparchive.fn.CAPO('''
<html>
<head>
<title>Example</title>
<link rel="manifest" href="/manifest.json">
<style></style>
<script defer src="script.js"></script>
<meta charset="utf-8">
</head>
</html>
''')
vizWeight | weight | element |
---|---|---|
██████████ | 9 | <title>Example</title> |
█ | 0 | <link rel="manifest" href="/manifest.json"> |
█████ | 4 | <style></style> |
███ | 2 | <script defer="" src="script.js"></script> |
███████████ | 10 | <meta charset="utf-8"> |
Live input
SELECT
page,
httparchive.fn.CAPO(response_body) AS capo
FROM
`httparchive.all.requests` TABLESAMPLE SYSTEM (0.001 PERCENT)
WHERE
date = '2023-05-01' AND
client = 'desktop' AND
is_main_document
LIMIT
1
page | vizWeight | weight | element |
---|---|---|---|
https://www.example.com/ | ██████████ | 9 | <title>Example Domain</title> |
https://www.example.com/ | ███████████ | 10 | <meta charset="utf-8"> |
https://www.example.com/ | ███████████ | 10 | <meta http-equiv="Content-type" content="text/html; charset=utf-8"> |
https://www.example.com/ | ███████████ | 10 | <meta name="viewport" content="width=device-width, initial-scale=1"> |
https://www.example.com/ | █████ | 4 | <style type="text/css">...</style> |