At Netflix, we have hundreds of microservices, each with its own data models or entities. For example, one service stores a movie entity’s metadata and another stores metadata about images. At some point, all of these services want to annotate their objects or entities. Our team, Asset Management Platform, decided to create a generic service called Marken, which allows any microservice at Netflix to annotate its entities.
Annotations
Sometimes people describe annotations as tags, but that is a limited definition. In Marken, an annotation is a piece of metadata that can be attached to an object from any domain. Our client applications want to generate many different kinds of annotations. A simple annotation, like the one below, states that a particular movie has violence.
- Movie Entity with id 1234 has violence.
But there are more interesting cases where users want to store temporal (time-based) data or spatial data. In Pic 1 below, we have an example of an application which is used by editors to review their work. They want to change the color of gloves to rich black so they want to be able to mark up that area, in this case using a blue circle, and store a comment for it. This is a typical use case for a creative review application.
An example of storing both time-based and space-based data would be an ML algorithm that identifies characters in a frame and wants to store the following for a video:
- In a particular frame (time)
- In some area in image (space)
- A character name (annotation data)
Goals for Marken
We wanted to create an annotation service with the following goals:
- Allow annotating any entity. Teams should be able to define their own data model for annotations.
- Annotations can be versioned.
- The service should be able to serve real-time, aka UI, applications so CRUD and search operations should be achieved with low latency.
- All data should also be available for offline analytics in Hive/Iceberg.
Schema
Since the annotation service would be used by anyone at Netflix, we needed to support different data models for the annotation object. A data model in Marken can be described using a schema, just like how we create schemas for database tables.
Our team, Asset Management Platform, owns a different service that has a JSON-based DSL to describe the schema of a media asset. We extended this service to also describe the schema of an annotation object.
{
  "type": "BOUNDING_BOX", ❶
  "version": 0, ❷
  "description": "Schema describing a bounding box",
  "keys": {
    "properties": { ❸
      "boundingBox": {
        "type": "bounding_box",
        "mandatory": true
      },
      "boxTimeRange": {
        "type": "time_range",
        "mandatory": true
      }
    }
  }
}
In the above example, the application wants to represent a rectangular area in a video that spans a range of time.
- ❶ The schema’s name is BOUNDING_BOX.
- ❷ Schemas can have versions. This allows users to add or remove properties in their data model. We don’t allow incompatible changes; for example, users cannot change the data type of a property.
- ❸ The data stored is represented in the “properties” section. In this case, there are two properties:
  - boundingBox, with type “bounding_box”. This is basically a rectangular area.
  - boxTimeRange, with type “time_range”. This allows us to specify the start and end time for this annotation.
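To make the two schema rules above concrete, here is a minimal Python sketch of a mandatory-property check and a compatibility check between schema versions. It is not Marken's implementation: the function names `missing_mandatory` and `is_compatible` are hypothetical, and the compatibility rule is only the one stated above (properties may be added or removed, but a property's data type may not change).

```python
from typing import Any

# The BOUNDING_BOX schema from the example above, as a plain dict.
BOUNDING_BOX_V0: dict[str, Any] = {
    "type": "BOUNDING_BOX",
    "version": 0,
    "keys": {
        "properties": {
            "boundingBox": {"type": "bounding_box", "mandatory": True},
            "boxTimeRange": {"type": "time_range", "mandatory": True},
        }
    },
}


def missing_mandatory(schema: dict[str, Any], metadata: dict[str, Any]) -> list[str]:
    """Return the sorted names of mandatory properties absent from metadata."""
    props = schema["keys"]["properties"]
    return sorted(
        name
        for name, spec in props.items()
        if spec.get("mandatory", False) and name not in metadata
    )


def is_compatible(old: dict[str, Any], new: dict[str, Any]) -> bool:
    """Hypothetical rule: a property present in both versions keeps its type.

    Adding or removing properties is allowed; changing a type is not.
    """
    old_props = old["keys"]["properties"]
    new_props = new["keys"]["properties"]
    return all(
        new_props[name]["type"] == spec["type"]
        for name, spec in old_props.items()
        if name in new_props
    )
```

A caller could run `missing_mandatory` before accepting an annotation and `is_compatible` before registering a new schema version.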
Geometry Objects
To represent spatial data in an annotation, we use the Well-Known Text (WKT) format. We support the following objects:
- Point
- Line
- MultiLine
- BoundingBox
- LinearRing
Our model is extensible, allowing us to easily add more geometry objects as needed.
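For readers unfamiliar with WKT, the sketch below builds a few WKT strings by hand. The helper names are hypothetical. Standard WKT has no dedicated bounding-box type, so encoding a box as a closed POLYGON ring is an assumption made here for illustration, not a description of how Marken stores a BoundingBox.

```python
from typing import Iterable, Tuple

Coord = Tuple[float, float]


def _coords(points: Iterable[Coord]) -> str:
    """Format coordinates as 'x y, x y, ...' without trailing zeros."""
    return ", ".join(f"{x:g} {y:g}" for x, y in points)


def point_wkt(x: float, y: float) -> str:
    """WKT for a single point."""
    return f"POINT ({x:g} {y:g})"


def line_wkt(points: Iterable[Coord]) -> str:
    """WKT for a line through the given points."""
    return f"LINESTRING ({_coords(points)})"


def bounding_box_wkt(left: float, top: float, right: float, bottom: float) -> str:
    """Assumed encoding: a box as a polygon whose ring returns to its start."""
    ring = [(left, top), (right, top), (right, bottom), (left, bottom), (left, top)]
    return f"POLYGON (({_coords(ring)}))"


print(point_wkt(20, 30))                 # POINT (20 30)
print(line_wkt([(0, 0), (10, 5)]))       # LINESTRING (0 0, 10 5)
print(bounding_box_wkt(20, 30, 40, 60))  # POLYGON ((20 30, 40 30, 40 60, 20 60, 20 30))
```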
Temporal Objects
Several applications need to store annotations for videos that include a time component. We allow applications to store time as frame numbers or nanoseconds.
To store data in frames, clients must also store the frames per second. We call this a SampleData, with the following components:
- sampleNumber aka frame number
- sampleNumerator
- sampleDenominator
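The two time representations can be related with a little arithmetic. The sketch below assumes that sampleNumerator and sampleDenominator together express the frame rate as a rational number (for example 24000/1001 for 23.976 fps). The text above only says that frames per second must be stored, so this reading, along with the `SampleData` class and its `to_nanos` method, is hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction

NANOS_PER_SECOND = 1_000_000_000


@dataclass(frozen=True)
class SampleData:
    sample_number: int       # frame number
    sample_numerator: int    # assumed: frame-rate numerator, e.g. 24000
    sample_denominator: int  # assumed: frame-rate denominator, e.g. 1001

    def to_nanos(self) -> int:
        """Offset of this frame from the start of the video, in nanoseconds.

        Uses exact rational arithmetic and truncates to a whole nanosecond.
        """
        fps = Fraction(self.sample_numerator, self.sample_denominator)
        seconds = Fraction(self.sample_number) / fps
        return int(seconds * NANOS_PER_SECOND)


# Frame 24 at 24 fps is exactly one second into the video.
print(SampleData(24, 24, 1).to_nanos())  # 1000000000
```

Using `Fraction` rather than floating point avoids accumulating rounding error at fractional frame rates.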
Annotation Object
Just like a schema, an annotation object is also represented in JSON. Here is an example of an annotation for the BOUNDING_BOX schema discussed above.
{
  "annotationId": { ❶
    "id": "188c5b05-e648-4707-bf85-dada805b8f87",
    "version": "0"
  },
  "associatedId": { ❷
    "entityType": "MOVIE_ID",
    "id": "1234"
  },
  "annotationType": "ANNOTATION_BOUNDINGBOX", ❸
  "annotationTypeVersion": 1,
  "metadata": { ❹
    "fileId": "identityOfSomeFile",
    "boundingBox": {
      "topLeftCoordinates": {
        "x": 20,
        "y": 30
      },
      "bottomRightCoordinates": {
        "x": 40,
        "y": 60
      }
    },
    "boxTimeRange": {
      "startTimeInNanoSec": 566280000000,
      "endTimeInNanoSec": 567680000000
    }
  }
}
- ❶ The first component is the unique id of this annotation. An annotation is an immutable object, so the identity of the annotation always includes a version. Whenever someone updates this annotation, we automatically increment its version.
- ❷ An annotation must be associated with some entity which belongs to some microservice. In this case, this annotation was created for a movie with id “1234”.
- ❸ We then specify the schema type of the annotation. In this case it is BOUNDING_BOX.
- ❹ The actual data is stored in the metadata section of the JSON. As discussed above, there is a bounding box and a time range in nanoseconds.
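The immutability and versioning behavior described above can be illustrated with a toy in-memory store. This is only a sketch under stated assumptions: the `Annotation` and `InMemoryAnnotationStore` names are hypothetical, versions are kept as integers for simplicity (the JSON example shows a string), and nothing here reflects Marken's actual storage layer.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class Annotation:
    annotation_id: str
    version: int
    entity_type: str
    entity_id: str
    annotation_type: str
    metadata: dict[str, Any]


class InMemoryAnnotationStore:
    """Toy store: an update never mutates; it appends a new version."""

    def __init__(self) -> None:
        self._versions: dict[str, list[Annotation]] = {}

    def create(self, entity_type: str, entity_id: str,
               annotation_type: str, metadata: dict[str, Any]) -> Annotation:
        ann = Annotation(str(uuid.uuid4()), 0, entity_type, entity_id,
                         annotation_type, dict(metadata))
        self._versions[ann.annotation_id] = [ann]
        return ann

    def update(self, annotation_id: str, metadata: dict[str, Any]) -> Annotation:
        latest = self._versions[annotation_id][-1]
        # Copy the latest version with an incremented version number.
        new = replace(latest, version=latest.version + 1, metadata=dict(metadata))
        self._versions[annotation_id].append(new)
        return new

    def get(self, annotation_id: str, version: Optional[int] = None) -> Annotation:
        versions = self._versions[annotation_id]
        return versions[-1] if version is None else versions[version]
```

Because every version is retained, a reader can fetch either the latest annotation or any earlier version by its (id, version) pair.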
Base schemas
Just like in Object Oriented Programming, our schema service allows schemas to be inherited from each other. This allows our clients to create an “is-a-type-of” relationship bet