fixup! Introduce links to CDEvents
Signed-off-by: benjamin-j-powell <benjamin_j_powell@apple.com>
benjamin-j-powell committed Jul 17, 2023
1 parent bdf09bd commit fb3b5e0
Showing 1 changed file with 42 additions and 50 deletions.
92 changes: 42 additions & 50 deletions links.md
@@ -4,12 +4,14 @@

This proposal will outline how to connect individual CDEvents to each other.
Right now there's no way of associating events to one another without needing
to backtrack across certain context attributes, eg [id](https://github.com/CDEvents/spec/blob/main/spec.md#id-context).
There are limitations, however, in that we don't know when an event begins
or finishes.
to backtrack across certain context attributes, eg
[id](https://github.com/CDEvents/spec/blob/main/spec.md#id-context). While
that does give us the ability to construct some graph, we do not know when a
particular span starts or finishes.

This proposal will outline a new approach that will allow for
connecting CDEvents to one another.
This proposal will outline a new approach that will allow for connecting
CDEvents to one another and give a clear distinction of when a span starts and
finishes.

## Semantics

@@ -21,7 +23,7 @@ assumptions being made when we talk about linking events
* **CDEvents Span** - A CDEvents' span is an end-to-end representation of the
CI/CD lifecycle of some entity
* **Link** - A link is some relation to some thing, eg event, grouping, etc.
* **Global ID** - A global ID is some overarching ID for a given event.
* **Global ID** - A global ID is some overarching ID for a given span.

## Goals

@@ -40,26 +42,27 @@ can be linked to one another and be described in a way where it represents the
complete picture of the whole CI/CD span. The second portion will address the
goal of scalability.

### New Field(s)
### New Headers

When determining a course of action, it is generally important to consider what
the parent event had done. To cater to this very common use case, we will
introduce a new field called `parent` in each event.
To allow for connecting events and proper propagation, we will add two new HTTP
headers:
* `X-CDEVENTS-GLOBAL-ID`
* `X-CDEVENTS-PARENT-IDS`

The new `parent` field will be an array of the immediate previous parent
event(s).

`global_id` will also be added to the context of events which will allow for
querying all links from a given global id.

The new parent field will be an added optional field, but the `global_id` will be
a required field in **all** cdEvent types looking something like:
The reasoning for using headers instead of adding new fields to the payload is
that some services may not adhere to CDEvents. It is generally easier for
a service to forward certain headers than to adapt or restructure its
payload to accommodate a standard like CDEvents.

```json
# Headers
# X-CDEVENTS-GLOBAL-ID: "00000000-0000-0000-0000-000000000001"
# X-CDEVENTS-PARENT-IDS: "271069a8-fc18-44f1-b38f-9d70a1695819"

# Payload remains the same
{
"context": {
"version": "0.3.0-draft",
"global_id": "00000000-0000-0000-0000-000000000001", // new field
"id": "505b31c2-8bc8-47b3-a1a0-269d7a8530ac",
"source": "dev/jenkins",
"type": "dev.cdevents.testsuite.finished.0.1.1",
@@ -70,33 +73,23 @@ a required field in **all** cdEvent types looking something like:
"source": "/tests/com/org/package",
"type": "testSuite",
"content": {}
},
"parent": [ // new proposed field here
{
"context": {
"global_id": "00000000-0000-0000-0000-000000000001",
"version": "0.4.0-draft",
"id": "271069a8-fc18-44f1-b38f-9d70a1695819",
"source": "/event/source/123",
"type": "dev.cdevents.change.merged.0.1.2",
"timestamp": "2023-03-20T14:27:05.315384Z"
},
"subject": {
"id": "mySubject123",
"source": "/event/source/123",
"type": "change",
"content": {
"repository": {
"id": "TestRepo/TestOrg",
"source": "https://example.org"
}
}
}
}
]
}
}
```

## IDs

With the introduction of the two headers, it's important to establish what they
are and their respective formats.

The `X-CDEVENTS-GLOBAL-ID` is an ID that is generated when a new CDEvent span
is wanted or if no CDEvent span is present. This ID will follow the [UUID](https://datatracker.ietf.org/doc/html/rfc4122)
format.

The `X-CDEVENTS-PARENT-IDS` header holds any number of immediate parent IDs to
satisfy the fan-out, fan-in use case. Each ID is the `context.id` of a parent
CDEvent, which is in UUID format.
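
As a rough illustration of the two header formats, here is a minimal Python
sketch. The helper names `build_link_headers` and `parse_parent_ids` are
hypothetical, and the comma separator for multiple parent IDs is an assumption:
the proposal does not say how several IDs are encoded in one header value.

```python
import uuid

GLOBAL_ID_HEADER = "X-CDEVENTS-GLOBAL-ID"
PARENT_IDS_HEADER = "X-CDEVENTS-PARENT-IDS"


def build_link_headers(global_id, parent_ids):
    """Return the link headers for an outgoing event (hypothetical helper)."""
    # uuid.UUID raises ValueError on malformed input, so bad IDs fail early.
    headers = {GLOBAL_ID_HEADER: str(uuid.UUID(global_id))}
    if parent_ids:
        # Assumption: multiple parent IDs are joined with commas.
        headers[PARENT_IDS_HEADER] = ",".join(
            str(uuid.UUID(parent_id)) for parent_id in parent_ids
        )
    return headers


def parse_parent_ids(header_value):
    """Split a parent-IDs header value back into a list of UUID strings."""
    if not header_value:
        return []
    return [
        str(uuid.UUID(part.strip()))
        for part in header_value.split(",")
        if part.strip()
    ]
```

An event with two parents (fan in) would carry both `context.id` values in the
single parent header, while an event that starts a span would carry only the
global ID header.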

## Client APIs and Links Storage

So far we've only talked about what a service may receive when expecting a
@@ -165,10 +158,9 @@ simply turning on linking payload aggregation, will send all links in the
payload. Mind you, this can make the payload very large, but may be good for
debugging.

The `global_id` field will use whatever is in the context, unless the user
explicitly starts a new global event. If there is no `global_id` set, the
client will generate one and that will be used for the lifetime of the whole
events span
The global ID header will continue to propagate, unless the user explicitly
starts a new CDEvent span. If there is no global ID header, the client will
generate one and that will be used for the lifetime of the whole events span.
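
The propagation rule above could look roughly like the following Python sketch.
`resolve_global_id` and its `start_new_span` flag are hypothetical names used
only for illustration; they are not part of any existing CDEvents client.

```python
import uuid

GLOBAL_ID_HEADER = "X-CDEVENTS-GLOBAL-ID"


def resolve_global_id(incoming_headers, start_new_span=False):
    """Pick the global ID to send on outgoing events (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.upper(): value for name, value in incoming_headers.items()}
    existing = normalised.get(GLOBAL_ID_HEADER)
    if existing and not start_new_span:
        # Keep propagating the ID of the span we are already part of.
        return existing
    # No header present, or the user asked for a new span: generate a new ID.
    return str(uuid.uuid4())
```

Under these assumptions, a client calls this once per incoming request and
reuses the returned value on every event it emits for that request.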

![link flow](images/links_flow.jpeg)

@@ -264,16 +256,16 @@ Scalability is one of the bigger goals in this proposal and we wanted to ensure
fast lookups. This section is going to describe how the proposed links format
will be scalable and also provide tactics on how DB read/writes can be done.

The purpose of the `global_id` was to ensure very fast lookups no matter the
The purpose of the global ID is to ensure very fast lookups no matter the
database. We could say that only graph DBs could be used to do a full span
lookup without a `global_id` but that poses two problems:
lookup without a global ID but that poses two problems:

* Slower lookups as the graph DB needs to backtrack to find the full span
* Requires either graph DBs or using SQL databases as if they were graph DBs.

Instead, a link service that processes and agnostically stores to some DB is
much preferred as it gives companies and developers options to choose from. When
using an SQL database, the `global_id` could be the secondary key to easily
using an SQL database, the global ID could be the secondary key to easily
retrieve indexed entities. Links could be easily sorted by timestamp which
should roughly correspond to their linked neighbors, parent and child.
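
To make the SQL case concrete, here is a small sketch using Python's built-in
`sqlite3` module. The `links` table, its column names, and the helper functions
are all assumptions made for illustration; the proposal does not define a
schema.

```python
import sqlite3


def open_link_store():
    """Create an in-memory link store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE links (
            global_id TEXT NOT NULL,  -- span the link belongs to
            parent_id TEXT,           -- context.id of the parent event, if any
            event_id  TEXT NOT NULL,  -- context.id of the linked event
            timestamp TEXT NOT NULL   -- ISO 8601 UTC timestamp of the event
        )
        """
    )
    # Secondary index so a whole span can be fetched without a table scan.
    conn.execute("CREATE INDEX idx_links_global_id ON links (global_id, timestamp)")
    return conn


def add_link(conn, global_id, parent_id, event_id, timestamp):
    """Store one link row."""
    conn.execute(
        "INSERT INTO links (global_id, parent_id, event_id, timestamp) "
        "VALUES (?, ?, ?, ?)",
        (global_id, parent_id, event_id, timestamp),
    )


def span_links(conn, global_id):
    """Return every link in a span, oldest first."""
    cursor = conn.execute(
        "SELECT parent_id, event_id, timestamp FROM links "
        "WHERE global_id = ? ORDER BY timestamp",
        (global_id,),
    )
    return cursor.fetchall()
```

Sorting on the text column works here only because the timestamps are assumed
to share one ISO 8601 UTC format, which orders correctly as plain strings.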
