WebKit Results Database

API

Commits

Commit endpoints allow commits from multiple repositories to be queried. These commits are sorted by UUID, which is a combination of the commit timestamp and the commit order. Commit order is defined as a commit's ordering within a specific timestamp. Most commits have an order of '0', unless they are part of a patch series which was landed at exactly the same timestamp. UUIDs allow commits to be transparently sorted even if they are in different repositories. Commits are represented as objects within the results database, and all endpoints which return commits will represent commits like this:
{
    "repository_id": <string representing repository identifier>,
    "branch": <branch commit is registered on>,
    "identifier": <commit identifier>,
    "hash": <git hash>,
    "revision": <svn revision>,
    "timestamp": <integer UTC timestamp when commit was committed>,
    "order": <order of commit within patch series>,
    "author": {
        "name": <name of person who authored change>,
        "emails": [<list of emails associated with change author>]
    },
    "message": <commit message or changelog associated with commit>
}
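As a rough illustration of the ordering described above, the sketch below sorts commit objects from two repositories by timestamp and then by order, oldest first. The helper name sort_commits and the sample commits (including the "safari" repository) are made up for this example; only the field names come from the commit object.

```python
def sort_commits(commits):
    """Return commits oldest-first, ordered by timestamp and then by order."""
    return sorted(commits, key=lambda c: (c["timestamp"], c["order"]))


# Hypothetical commits from two repositories; the values are illustrative only.
commits = [
    {"repository_id": "webkit", "identifier": "3@main", "timestamp": 1562952473, "order": 1},
    {"repository_id": "safari", "identifier": "7@main", "timestamp": 1562951491, "order": 0},
    {"repository_id": "webkit", "identifier": "2@main", "timestamp": 1562952473, "order": 0},
]

ordered = [commit["identifier"] for commit in sort_commits(commits)]
print(ordered)  # ['7@main', '2@main', '3@main']
```

Sorting on the (timestamp, order) pair gives the same ordering as sorting on the UUID, since the UUID is built from those two values.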
/api/commits
GET
POST
Endpoint for finding and registering commits. The GET behavior is identical to /api/commits/find. The POST behavior is identical to /api/commits/register.
Supported Parameters
/api/commits/find
GET
Return a list of commit objects satisfying the query. This list will be ordered, with the oldest commit first and the newest last.
Supported Parameters
/api/commits/repositories
GET
Return a list of repositories tracked by this instance of the results database. The output is of the form:
[
    <repository-id (a)>,
    <repository-id (b)>
]
Supported Parameters
/api/commits/representations
GET
Return a dictionary of prioritized valid commit representations for repositories tracked by this instance of the results database. The output is of the form:
{
    <repository-id (a)>: ["hash", "identifier"],
    <repository-id (b)>: ["identifier", "revision"]
}
Supported Parameters
/api/commits/branches
GET
Returns a dictionary of lists of branches associated with each repository. The output is of the form:
{
    <repository-id (a)>: ["master", "branch-a", "branch-b"],
    <repository-id (b)>: ["main", "branch-a", "branch-c"]
}
Supported Parameters
/api/commits/siblings
GET
With multiple repositories, every commit has at least one other commit which was the tip of the tree on the other repository (or repositories) while the primary commit was the tip of its repository. We refer to these commits as the 'sibling' commits. Given a query which refers to a single commit, this endpoint will return all sibling commits associated with that commit. The result will be a dictionary of lists formatted like this:
{
    <repository-id (a)>: [<commit-a2>, <commit-a1>],
    <repository-id (b)>: [<commit-b2>, <commit-b1>]
}
where <commit-*> are commit objects. These lists are sorted, with the first commit in the list being the latest and the last commit in the list being the oldest. Note that while the sibling endpoint accepts the standard UUID query parameters, this endpoint will return an error if the query parameters refer to multiple commits.
Supported Parameters
/api/commits/next
GET
Return a list containing the single commit object which occurred immediately after the commit specified by the provided query. Note that while the next endpoint accepts the standard UUID query parameters, this endpoint will return an error if the query parameters refer to multiple commits.
Supported Parameters
/api/commits/previous
GET
Return a list containing the single commit object which occurred immediately before the commit specified by the provided query. Note that while the previous endpoint accepts the standard UUID query parameters, this endpoint will return an error if the query parameters refer to multiple commits.
Supported Parameters
/api/commits/register
POST
Register a single commit in the results database. This commit must be associated with a repository already known by the results database. While a commit object can be uploaded to this endpoint, it is recommended that commits registered outside of automation let the results database leverage your source control's API. Such a request looks like this:
/api/commits/register?repository_id=webkit&branch=main&id=247355
More generally, any definition of a commit which defines the repository_id, id and branch provides enough information for the results database to query your source control's API and retrieve commit information.
Supported Parameters
/commits/info
GET
Redirect to the source-control URL with more information about the specified commit. Note that while the info endpoint accepts the standard UUID query parameters, this endpoint will return an error if the query parameters refer to multiple commits.
Supported Parameters

Uploads

Uploads are the input to the results database. Uploads are sorted by configuration and UUID. Uploads are JSON dictionaries with test results organized in a trie, which looks like this:
{
    "commits": [<commit-a>, <commit-b>],
    "configuration": <configuration-object>,
    "suite": <suite>,
    "timestamp": <UTC timestamp of test run>,
    "test_results": {
        "details": {
            "build-number": "5285",
            "buildbot-master": "build.webkit.org",
            "buildbot-worker": "bot198",
            "builder-name": "Apple-Mojave-Release-WK2"
        },
        "run_stats": {
            "start_time": <UTC timestamp test run started>,
            "end_time": <UTC timestamp test run ended>,
            "tests_skipped": <Number of tests not run>
        },
        "results": {
            "dir-a": {
                "dir-b": {
                    "test-1": {"actual": "FAIL"},
                    "test-2": {}
                },
                "test-3": {"actual": "TIMEOUT", "expected": "TIMEOUT"}
            },
            "dir-c": {
                "test-4": {"actual": "CRASH", "expected": "FAIL"}
            }
        }
    }
}
where <commit-a> and <commit-b> are both commit objects and <configuration-object> is a configuration object. The 'details' dictionary contains information needed to link a specific upload to a run inside a continuous integration system. All test result information is derived directly from uploads.
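To make the trie layout concrete, here is a minimal sketch that walks the 'results' dictionary from the example above and yields one entry per test. It rests on two assumptions that the text does not state: a node with no dictionary-valued children is treated as a test, and path components are joined with '/'. The function name flatten_results is hypothetical.

```python
def flatten_results(node, prefix="", separator="/"):
    """Yield (test_name, actual, expected) for every test in a results trie.

    Assumption: a node with no dictionary-valued children is a test;
    anything else is treated as a directory.
    """
    for name, child in node.items():
        path = f"{prefix}{separator}{name}" if prefix else name
        if any(isinstance(value, dict) for value in child.values()):
            yield from flatten_results(child, path, separator)
        else:
            # Missing 'actual' or 'expected' values are treated as PASS.
            yield path, child.get("actual", "PASS"), child.get("expected", "PASS")


results = {
    "dir-a": {
        "dir-b": {
            "test-1": {"actual": "FAIL"},
            "test-2": {},
        },
        "test-3": {"actual": "TIMEOUT", "expected": "TIMEOUT"},
    },
    "dir-c": {
        "test-4": {"actual": "CRASH", "expected": "FAIL"},
    },
}

flat = list(flatten_results(results))
for entry in flat:
    print(entry)
```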
/api/upload
GET
POST
GET requests against the upload endpoint will return a list of upload objects. This endpoint can be used to transfer results from one results database to another, which is especially useful for testing.
POST requests against the upload endpoint will take the uploaded file and parse it as JSON, expecting an upload object. Uploading results will register the commits associated with those results and process the result. Note that the POST endpoint does not accept any query parameters.
/api/upload/process
POST
Every upload must be processed to create individual database entries for each test result. The results database conceptually separates this processing so that uploads can be reprocessed by a POST request to this endpoint. The parameters to this endpoint should be the same parameters you would send to the /api/upload endpoint.
This endpoint will queue the processing and return before the processing has completed. It returns a list of dictionaries looking like this:
{
    "commits": [<commit-a>, <commit-b>],
    "configuration": <configuration-object>,
    "suite": <suite>,
    "timestamp": <UTC timestamp of test run>,
    "processing": {
        "ci-urls": {"status": "Queued"},
        "suite-results": {"status": "Queued"},
        "test-result": {"status": "Queued"}
    }
}
where <commit-a> and <commit-b> are both commit objects and <configuration-object> is a configuration object. The data inside the 'processing' dictionary indicates any failures which occurred when attempting to process the upload.

Test Lists

Most endpoints on the results database require information about the suite, test or configuration that results are associated with. Because these configurations, suites or tests may change over time, the results database exposes some endpoints allowing this data to be retrieved in an automated way.
/api/suites
GET
This endpoint returns a list of configuration/suite pairs matching the provided parameters. The /api/suites endpoint is also used to generate a list of valid configurations, and will return a list of the form:
[
    [
        <configuration-object-a>,
        ["test-suite-a", "test-suite-b"]
    ], [
        <configuration-object-b>,
        ["test-suite-a"]
    ]
]
where <configuration-object-a> and <configuration-object-b> are both configuration objects.
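A client often wants the inverse view of this list: which configurations report a given suite. The sketch below shows one way to build that index from the pair list. The function name configurations_by_suite and the trimmed-down configuration dictionaries are assumptions for the example, not part of the API.

```python
from collections import defaultdict


def configurations_by_suite(pairs):
    """Map each suite name to the list of configurations that report it."""
    by_suite = defaultdict(list)
    for configuration, suites in pairs:
        for suite in suites:
            by_suite[suite].append(configuration)
    return dict(by_suite)


# Hypothetical response; real configuration objects carry more fields.
response = [
    [{"platform": "mac", "style": "release"}, ["test-suite-a", "test-suite-b"]],
    [{"platform": "ios", "style": "debug"}, ["test-suite-a"]],
]

index = configurations_by_suite(response)
print(sorted(index))  # ['test-suite-a', 'test-suite-b']
```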
Supported Parameters
/api/<suite>/tests
GET
Returns a list of tests associated with a specific suite. This list is useful for mapping a partial string to a test name.
Supported Parameters

Test Results

The results database performs post-processing on every upload to sort test results. Each test has its results saved independently, and high-level results for a test suite are also stored. The results database classifies every test failure with the following mapping:
{
    "CRASH": 0,
    "TIMEOUT": 8,
    "IMAGE": 16,
    "AUDIO": 24,
    "TEXT": 32,
    "FAIL": 40,
    "ERROR": 48,
    "WARNING": 56,
    "PASS": 64
}
Results which have lower numbers will take precedence over those with higher ones. For example, if a test has a result which is simultaneously a TIMEOUT and a FAIL, the results database would treat that test as a TIMEOUT.
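The precedence rule amounts to picking the result with the lowest number. A minimal sketch, using the mapping above and a hypothetical helper named worst_result:

```python
RESULT_IDS = {
    "CRASH": 0,
    "TIMEOUT": 8,
    "IMAGE": 16,
    "AUDIO": 24,
    "TEXT": 32,
    "FAIL": 40,
    "ERROR": 48,
    "WARNING": 56,
    "PASS": 64,
}


def worst_result(results):
    """Return the result that takes precedence, i.e. the lowest-numbered one."""
    return min(results, key=RESULT_IDS.__getitem__)


print(worst_result(["FAIL", "TIMEOUT"]))  # TIMEOUT
```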
Tests may also define an 'expected' result. By default, all tests are expected to pass. If a test defines an expected result that is not PASS, most facilities within the results database will treat that test as passing so long as its result matches the expected result. This leads to an idea of expected results versus actual results. Endpoints which collapse results from multiple tests handle this idea like so:
{
    "tests_crashed": <number of tests which crashed or worse>,
    "tests_timedout": <number of tests which timed-out or worse>,
    "tests_failed": <number of tests which failed or worse>,
    "tests_unexpected_crashed": <number of tests which unexpectedly crashed or worse>,
    "tests_unexpected_timedout": <number of tests which unexpectedly timed-out or worse>,
    "tests_unexpected_failed": <number of tests which unexpectedly failed or worse>,
    "tests_skipped": <number of tests which were skipped>,
    "tests_run": <number of tests run>
}
It's important to note that when aggregating test results, the aggregation of timeouts will also include results worse than timeouts (namely, crashes) and the aggregation of failures will also include results worse than failures (so crashes and timeouts).
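The "or worse" counting can be sketched as follows. This is an illustration under stated assumptions, not the database's implementation: a result is counted as unexpected when the actual result ranks worse (has a lower number) than the expected one, and tests_skipped is left out because it comes from the run statistics rather than from individual results. The names THRESHOLDS and summarize are hypothetical.

```python
RESULT_IDS = {
    "CRASH": 0, "TIMEOUT": 8, "IMAGE": 16, "AUDIO": 24, "TEXT": 32,
    "FAIL": 40, "ERROR": 48, "WARNING": 56, "PASS": 64,
}

# A result counts toward a bucket when its number is at or below the threshold.
THRESHOLDS = {
    "crashed": RESULT_IDS["CRASH"],
    "timedout": RESULT_IDS["TIMEOUT"],
    "failed": RESULT_IDS["FAIL"],
}


def summarize(results):
    """Aggregate (actual, expected) pairs into 'or worse' counters.

    Assumption: a result is unexpected when the actual result has a lower
    number than the expected result.
    """
    summary = {"tests_run": 0}
    for name in THRESHOLDS:
        summary[f"tests_{name}"] = 0
        summary[f"tests_unexpected_{name}"] = 0
    for actual, expected in results:
        actual_id = RESULT_IDS[actual]
        unexpected = actual_id < RESULT_IDS[expected]
        summary["tests_run"] += 1
        for name, threshold in THRESHOLDS.items():
            if actual_id <= threshold:
                summary[f"tests_{name}"] += 1
                if unexpected:
                    summary[f"tests_unexpected_{name}"] += 1
    return summary


runs = [("FAIL", "PASS"), ("PASS", "PASS"), ("TIMEOUT", "TIMEOUT"), ("CRASH", "FAIL")]
summary = summarize(runs)
print(summary)
```

With these four results, the single crash is counted in all three buckets, the timeout in two, and the plain failure only in tests_failed, giving 1 crashed, 2 timed out and 3 failed.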
/api/results/<suite>
GET
Endpoint which returns results for a specific test run. On this endpoint, results are aggregated in a dictionary formatted like this:
{
    "start_time": <UTC time test run started>,
    "uuid": <UUID for test run>,
    "details": {
        "build-number": "5285",
        "buildbot-master": "build.webkit.org",
        "buildbot-worker": "bot198",
        "builder-name": "Apple-Mojave-Release-WK2"
    },
    "stats": {
        "start_time": <UTC time build started>,
        "end_time": <UTC time build ended>,
        "tests_crashed": <number of tests which crashed or worse>,
        "tests_timedout": <number of tests which timed-out or worse>,
        "tests_failed": <number of tests which failed or worse>,
        "tests_unexpected_crashed": <number of tests which unexpectedly crashed or worse>,
        "tests_unexpected_timedout": <number of tests which unexpectedly timed-out or worse>,
        "tests_unexpected_failed": <number of tests which unexpectedly failed or worse>,
        "tests_skipped": <number of tests which were skipped>,
        "tests_run": <number of tests run>
    }
}
These results are returned as a list of dictionaries organized like so:
[
    {
        "configuration": <configuration-object-a>,
        "results": [
            <run-a1>,
            <run-a2>
        ]
    }, {
        "configuration": <configuration-object-b>,
        "results": [
            <run-b1>,
            <run-b2>
        ]
    }
]
where <configuration-object-a> and <configuration-object-b> are both configuration objects and <run-a1>, <run-a2>, <run-b1> and <run-b2> are all the aforementioned aggregated result dictionaries.
/api/results/<suite>/<test>
GET
Access results for a specific test on a specific commit with a specific configuration. This endpoint only returns results for a single test. Each result is stored in a dictionary formatted like this:
{
    "start_time": <UTC time test run started>,
    "uuid": <UUID for test run>,
    "actual": <result of run>,
    "expected": <expected result of run>,
    "time": <milliseconds it took test to run>
}
The 'time' element is optional. If 'actual' or 'expected' are undefined, they are assumed to be PASS. These results are returned as a list of dictionaries organized like so:
[
    {
        "configuration": <configuration-object-a>,
        "results": [
            <run-a1>,
            <run-a2>
        ]
    }, {
        "configuration": <configuration-object-b>,
        "results": [
            <run-b1>,
            <run-b2>
        ]
    }
]
where <configuration-object-a> and <configuration-object-b> are both configuration objects and <run-a1>, <run-a2>, <run-b1> and <run-b2> are all the aforementioned single test result dictionaries.
/api/results-summary/<suite>/<test>
GET
Compute the combined results of a given test by aggregating results from a set of runs surrounding a specific commit:
{
    "pass": 80,
    "fail": 15,
    "crash": 5
}
These results are always the weighted aggregation of results for the provided configurations, with weights to be understood as the percent likelihood that a test will have a certain result on a given revision.

Failure Analysis

Results databases provide a few APIs to assist in the investigation of test failures. These analysis endpoints aggregate data from multiple test runs for consumption by both humans and automated systems.
/api/failures/<suite>
GET
Returns a list of tests which failed during test runs matching the specified criteria. When collapsed, these results will be a sorted list looking like this:
[
    "suite.sub-1.test-1",
    "suite.sub-1.test-2",
    "suite.sub-2.test-1"
]
When uncollapsed, these results will be separated by the upload that generated them. These results are laid out much like the /api/results/<suite> and /api/results/<suite>/<test> endpoints.
[
    {
        "configuration": <configuration-object-a>,
        "results": [
            {
                "start_time": <UTC time test run started>,
                "uuid": <UUID for test run>,
                "suite.sub-1.test-1": "FAIL",
                "suite.sub-1.test-2": "FAIL"
            }
        ]
    }, {
        "configuration": <configuration-object-b>,
        "results": [
            {
                "start_time": <UTC time test run started>,
                "uuid": <UUID for test run>,
                "suite.sub-1.test-1": "FAIL",
                "suite.sub-1.test-2": "CRASH"
            }
        ]
    }
]
where <configuration-object-a> and <configuration-object-b> are both configuration objects.
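The relationship between the two forms can be sketched on the client side: the collapsed list is the sorted set of test names that appear in any uncollapsed run. The sketch assumes every key other than 'start_time' and 'uuid' in a run dictionary is a test name, and the sample data and the name collapse_failures are invented for illustration.

```python
def collapse_failures(uncollapsed):
    """Merge per-run failure dictionaries into one sorted list of test names."""
    metadata = {"start_time", "uuid"}
    tests = set()
    for entry in uncollapsed:
        for run in entry["results"]:
            # Assumption: every non-metadata key is the name of a failing test.
            tests.update(key for key in run if key not in metadata)
    return sorted(tests)


# Hypothetical uncollapsed response with trimmed-down configuration objects.
uncollapsed = [
    {
        "configuration": {"platform": "mac"},
        "results": [{
            "start_time": 1562952473,
            "uuid": 156295247300,
            "suite.sub-1.test-1": "FAIL",
            "suite.sub-1.test-2": "FAIL",
        }],
    },
    {
        "configuration": {"platform": "ios"},
        "results": [{
            "start_time": 1562952473,
            "uuid": 156295247300,
            "suite.sub-1.test-1": "FAIL",
            "suite.sub-2.test-1": "CRASH",
        }],
    },
]

collapsed = collapse_failures(uncollapsed)
print(collapsed)
```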

CI Links

Results database instances usually store test results from some sort of continuous integration system. While the results database doesn't assume any particular continuous integration system, it does make some basic assumptions. The results database assumes that every upload has a URL associated with it and that every upload was run on a specific machine. Additionally, the results database assumes that for a given configuration, there is a corresponding 'queue' that all continuous integration runs with that specific configuration are associated with.
If these assumptions aren't true for a particular instance of the results database, or if URLs are not included in upload data, the continuous integration endpoints may be dead links. That should not affect the operation or usage of the results database.
/api/url/queue
GET
Returns a list of dictionaries which associate configuration objects to links to queues. This list is organized like so:
[
    {
        "configuration": <configuration-object-a>,
        "url": <url to queue a>
    }, {
        "configuration": <configuration-object-b>,
        "url": <url to queue a>
    }
]
where <configuration-object-a> and <configuration-object-b> are both configuration objects.
Supported Parameters
/api/urls
GET
Returns a list of dictionaries which associate configuration objects and UUIDs with specific queue, worker and build links. The list is organized like this:
[
    {
        "configuration": <configuration-object-a>,
        "urls": [
            {
                "start_time": <UTC time build started>,
                "end_time": <UTC time build ended>,
                "queue": <url to queue a>,
                "worker": <url to worker a>,
                "build": <url to build a>
            }
        ]
    }, {
        "configuration": <configuration-object-b>,
        "urls": [
            {
                "start_time": <UTC time build started>,
                "end_time": <UTC time build ended>,
                "queue": <url to queue a>,
                "worker": <url to worker a>,
                "build": <url to build a>
            }
        ]
    }
]
where <configuration-object-a> and <configuration-object-b> are both configuration objects.
/urls/queue
GET
Redirect to the continuous integration URL for a specific queue associated with the provided parameters. Note that while this endpoint accepts the standard configuration query parameters, this endpoint will return an error if the query parameters refer to multiple queues.
Supported Parameters
/urls/worker
GET
Redirect to the continuous integration URL for a specific worker associated with the provided parameters. Note that while this endpoint accepts the standard configuration and UUID query parameters, this endpoint will return an error if the query parameters refer to multiple workers.
/urls/build
GET
Redirect to the continuous integration URL for a specific build associated with the provided parameters. Note that while this endpoint accepts the standard configuration and UUID query parameters, this endpoint will return an error if the query parameters refer to multiple builds.

Query Parameters

Aggregation

Some endpoints in the results database aggregate data from multiple test runs. Such endpoints accept query parameters that control how this aggregation is performed. The first of these is the collapsed parameter, which is set to True by default in aggregation endpoints:
collapsed=False
The collapsed parameter indicates that a single result will be returned for all uploads which match the specified criteria. If false, aggregation endpoints will return the results which would have otherwise been aggregated.
Because the results database distinguishes between expected and unexpected failures, endpoints performing aggregation will often filter out expected failures, and flag unexpected passes. To modify the behavior of these algorithms, these endpoints will support the unexpected flag:
unexpected=False
By default, this flag is 'True' and endpoints will ignore tests which matched their expected behavior. If set to 'False', endpoints will return results for all failing tests, regardless of what their expectation is.

Branch

Most data in the results database is partitioned by branch, and it is generally expected that results on separate branches are independent of one another. By default, endpoints that support the branch query will assume that the branch is master or main for git repositories and trunk for SVN repositories if no value is specified. If multiple values for branch are specified, only the first will be respected. A request intended to search for results only on the safari-607 branch would use a query like this:
branch=safari-607

Configuration

Configurations are the key which defines a specific row within the results database. Uploads which share a configuration, but not a UUID, will appear in the same row on a timeline. Configurations are represented within the results database as an object:
{
    "architecture": <string representing architecture, e.g. x86_64, arm64>,
    "platform": <string representing platform family, e.g. mac, ios>,
    "is_simulator": <boolean which is true if the configuration was simulating an embedded device>,
    "version": <string of the form x.x.x representing the OS version>,
    "flavor": <wild-card string allowing for custom configurations>,
    "style": <debug, release, guard-malloc, etc.>,
    "model": <iPhone 7, Macmini6,2, etc.>,
    "version_name": <Mojave, Catalina, iOS 13, etc.>,
    "sdk": <18A391, 15A432, etc.>
}
Any of the variables which make up a configuration are valid query parameters to an endpoint which supports configuration queries. For example, a query containing:
platform=mac&style=release
will only return data associated with Macs running release binaries. If the same variable is provided multiple times in a single query, like this:
model=iPhone%20SE&model=iPhone%207
the resulting query will return data associated with iPhone SE or iPhone 7.
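Query strings with repeated variables can be produced with Python's standard library. The sketch below uses urllib.parse.urlencode with doseq=True so that each value of a list becomes its own key=value pair, and quote_via=quote so that spaces are encoded as %20. The helper name build_query is hypothetical.

```python
from urllib.parse import quote, urlencode


def build_query(**parameters):
    """Encode parameters, repeating the key for each value of a list."""
    return urlencode(parameters, doseq=True, quote_via=quote)


query = build_query(platform="mac", model=["iPhone SE", "iPhone 7"])
print(query)  # platform=mac&model=iPhone%20SE&model=iPhone%207
```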
Endpoints which support querying by configurations are optimized to only search the last 2 weeks of configurations. This means that if a configuration has not reported results in more than 2 weeks, its data will not be accessible by default. To search all historic configurations, add this:
recent=False
to your query. The downside of searching all configurations is that the default behavior (searching all recent configurations when no query parameters are provided) is disabled, because the results database cannot search an unbounded number of configurations.

Include Expectations

Some endpoints return different results if the caller requests expectations be taken into consideration. By default, this flag is disabled, but may be enabled on supporting endpoints with this query:
include_expectations=True

Limit

The underlying architecture of the results database does not allow unlimited query sizes. Most endpoints accept a limit query which looks like this:
limit=150
By default, the limit of most queries is 100. Because of the architecture of the results database's backend, the limit will actually operate on each partitioning key separately. For the /api/commits/find endpoint, for example, the limit will operate on each repository separately. So a request like this:
/api/commits/find?limit=150
on a results database instance tracking 2 repositories could potentially return 300 commits, instead of the 150 that you might expect.

Repository

The 'repository_id' query argument allows a request to be limited to a specific repository. By default, endpoints which support the repository_id query argument will search all repositories if no repository_id is provided. A request which only searches for commits in the WebKit repository would have a query argument like this:
repository_id=webkit

Suite

In most cases, the suite is defined by the path of an endpoint, not the parameters. In some cases, however, suite is defined by the parameters. In these cases, the query parameters will look like this:
suite=layout-tests
It is also valid to define multiple suites in a single query on endpoints that support suite queries, like this:
suite=layout-tests&suite=webkitpy-tests
A query like this will return data associated with the layout-tests suite and the webkitpy-tests suite. If no suite is provided, data for all available suites will be returned.

Time

The time query is separate from the UUID query. The time query allows data to be sorted and retrieved by the UTC timestamp of the specific run that data is associated with. For example, a request provided with this query:
after_time=1562952473&before_time=1562961006
will only return data associated with runs that occurred after 7/12/2019, 5:27:53 PM UTC but before 7/12/2019, 7:50:06 PM UTC.
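The timestamps in this query are seconds since the Unix epoch in UTC. A small sketch for converting in both directions with the standard datetime module (the helper names describe and to_timestamp are made up for this example):

```python
from datetime import datetime, timezone


def describe(timestamp):
    """Render a UTC timestamp (seconds since the epoch) as readable text."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


def to_timestamp(year, month, day, hour=0, minute=0, second=0):
    """Convert a UTC calendar time to an integer timestamp."""
    moment = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return int(moment.timestamp())


print(describe(1562952473))  # 2019-07-12 17:27:53 UTC
print(describe(1562961006))  # 2019-07-12 19:50:06 UTC
print(to_timestamp(2019, 7, 12, 17, 27, 53))  # 1562952473
```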

Test

In most cases, the test is defined by the path of an endpoint, not the parameters. In some cases, however, test is defined by the parameters. In these cases, the query parameters will look like this:
test=test.name
It is also valid to define multiple tests in a single query on endpoints that support test queries, like this:
test=test.name&test=other.name
A query like this will return test names that start with either 'test.name' or 'other.name'.

Ref

Although UUIDs are the primary mechanism by which commits are identified in the results database, many APIs allow callers to specify a more generic commit ref instead. A commit ref is a string representation of a commit that will be converted to a UUID. This ref will be different depending on the underlying repository, but should be a commit revision, identifier or hash. Some examples of commit refs in queries are:
ref=r275886
ref=57015967fef9
ref=236452@main
For both revision and identifier representations, it may be necessary to specify a repository if the results database instance has multiple repositories, because an identifier or revision may exist in both repositories.
ref=236452@main&repository_id=webkit
Much like UUIDs, commit refs can be prefixed by 'before_' and 'after_' to provide a range:
after_ref=22a1e116cb25&before_ref=57015967fef9
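For client code that needs to tell the three representations apart, the sketch below guesses the kind of a ref from its shape. The patterns are assumptions inferred only from the three examples above (an 'r' prefix for revisions, 'number@branch' for identifiers, hexadecimal digits for hashes); the results database's own parsing may differ, and classify_ref is a hypothetical name.

```python
import re

# Assumed shapes, inferred from the example refs only.
PATTERNS = (
    ("revision", re.compile(r"r\d+")),
    ("identifier", re.compile(r"\d+@\S+")),
    ("hash", re.compile(r"[0-9a-f]{7,40}")),
)


def classify_ref(ref):
    """Return 'revision', 'identifier' or 'hash', or None if nothing matches."""
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(ref):
            return kind
    return None


for ref in ("r275886", "57015967fef9", "236452@main"):
    print(ref, classify_ref(ref))
```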

UUID

Ultimately, most data in the results database is sorted by UUID. As mentioned in the commits section, UUIDs are defined by the timestamp of a commit and the commit order, where the commit order is the order a commit appears in its patch series. Since most commits are not in a patch series, most commits have an order of 0. Commit UUIDs are calculated with the following equation:
commit.uuid = commit.timestamp * 100 + commit.order
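A minimal sketch of this equation and its inverse follows. The function names are hypothetical, and the inverse assumes the order is always below 100, which the equation implies if UUIDs are to decompose uniquely.

```python
UUID_MULTIPLIER = 100


def commit_uuid(timestamp, order=0):
    """Combine a commit timestamp and order into a UUID."""
    return timestamp * UUID_MULTIPLIER + order


def split_uuid(uuid):
    """Recover (timestamp, order) from a UUID, assuming order < 100."""
    return divmod(uuid, UUID_MULTIPLIER)


print(commit_uuid(1562952473))       # 156295247300
print(commit_uuid(1562952473, 1))    # 156295247301
print(split_uuid(156295149100))      # (1562951491, 0)
```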
All endpoints which accept time queries allow data to be retrieved by a UUID with a query like this:
uuid=156295247300
Since UUIDs are integers, endpoints which accept time queries also accept UUID ranges. A query looking for data between UUIDs 156295149100 and 156295247300 would be formatted like this:
after_uuid=156295149100&before_uuid=156295247300
We also know that timestamps can be easily converted to UUIDs. Endpoints which support querying by UUID also support querying by UTC timestamp. Our previous query could be instead written like this:
after_timestamp=1562951491&before_timestamp=1562952473
Commits can also be translated to timestamp, although with a bit more work required from the back-end. Endpoints which support querying by UUID also support querying by commit information. In the first example in this section, we queried by UUID 156295247300. This corresponds to r247391. We could instead query by the commit information:
id=247391&repository_id=webkit&branch=main
But, as outlined in the overviews about branches and repositories, endpoints which support those queries have defaults which cover many cases, so the previous query could be simplified to:
id=247391
Because endpoints supporting time convert everything to UUIDs on the backend, the queries to these endpoints are quite flexible. The following are all examples of valid time queries:
after_id=247391&repository_id=webkit
after_id=247390&before_timestamp=1562952473
before_id=247391&after_uuid=156295149100&before_branch=main