Mangle is a programming language for deductive database programming. It
is an extension of Datalog, with various extensions like aggregation, function
calls and optional type-checking.
Deductive database is useful for bringing data from multiple data sources
together since it enables us to represent and query that data in a uniform way.
It can also be used to model domain knowledge, similar to machine-readable
ontology but without being restricted to binary predicates.
Datalog is an expressive declarative language similar to relational calculus
(think SQL and relational views). Unlike relational calculus, it also supports
recursive rules and program structuring in a straightforward way.
Mangle contains Datalog as a fragment and adds extensions that make its use
more practical. Some of the good properties like guaranteed termination are
lost when extensions are used.
The goal of Mangle as an open source project is to convey the concepts in
a way that is accessible to developers and lends itself to easy experimentation.
This repository contains an implementation of Mangle as a go library that can be
easily embedded into applications.
This is not an officially supported Google product. The Mangle maintainers
welcome external contributions to spec, documentation and this
implementation (see CONTRIBUTING.md) and also other
implementations. Pull requests will be handled like for tensorflow,
to ensure our internal use cases and tests continue to work.
Table of Contents
- Building(#building)
Examples
Simple Queries
Imagine you were asked to spot software affected by the
log4j vulnerability discovered in late 2021.
We do this by looking for projects that contain a Java archive (jar file) of
log4j that is not patched version.
projects_with_vulnerable_log4j(P) :- projects(P), contains_jar(P, "log4j", Version), Version != "2.17.1", Version != "2.12.4", Version != "2.3.2".
This is a Mangle rule: conceptually, the implementation retrieve all
possible values for variables P
and Version
that make all the subgoals true.
Simple Mangle rules like this correspond to select-project-join relational
queries. The same query in SQL would look like this:
SELECT projects.id as P FROM projects JOIN contains_jar ON projects.id = contains_jar.project_id WHERE contains_jar.version NOT IN ("2.17.1", "2.12.4", "2.3.2")
Unlike SQL, our Mangle rule projects_with_vulnerable_log4j
has a name
and can be referenced in other queries.
(If translating non-recursive Datalog into SQL queries sounds interesting, you
should check out the Logica open source project.)
Aggregation
In practice, querying is rarely enough and we also need grouping and
aggregation.
The example does not specify what contains_jar
does. Here is a possible
implementation for contains_jar
that walks a dependency graph.
This shows that Mangle rules can be recursive.
contains_jar(P, Name, Version) :-
contains_jar_directly(P, Name, Version).
contains_jar(P, Name, Version) :-
project_depends(P, Q),
contains_jar(Q, Name, Version).
The two rules correspond to two cases in which a project may “contain” a jar:
either directly, or through some dependency.
Knowledge Graphs, Property Graphs
When engineering requirements, it is useful to model a slice of the real world
through a domain model and contro