Graph transformations
Graph transformation are mathematical objects that transform graphs.
Definition of an labeled directed graph
A graph is a mathematical object that contains nodes attached by edges. It is noted:
- G = {N, E} where N is a set of nodes, and E a set of pair of nodes n ∈ N
An labeled graph is a graph where nodes and edges can have labels. Because we are in IT, we will consider that each node and edge has at least one label that is its unique identifier named id
.
In all the pages of this site, we will often consider that the graph label is an attribute with a value, what could be represented as follows, for a node n
∈ N and for an edge e
∈ E:
- n{id:value, attribute1:value1, ..., attributen,valuen}
- e{id:value, sourcenode, destination node, attribute1:value1, ..., attributen,valuen}
Note that the edges are directed in the graphs we are considering which means that they have a source and a destination. That enables to exhibit the notions of:
- Incoming relationships to a node: All edges pointing to that node
- Outgoing relationships from a node: All edges have the node as their source
Graphs can be included in each other. At the level of attributed directed graphs, the inclusion can be defined as follows.
Let A = {N1, E1} and B = {N2, E2} two graphs, we will note:
- B ⊂ A if and only if ∀ n ∈ N2, n ∈ N1, and ∀ e ∈ E2, e ∈ E1
Note: That means that all labels are the same.
Definition of a semantic graph
A semantic graph is defined with RDF triples manipulating and linking resources (URI).
We will often think about resources as bearing 3 kinds of information:
- Their domain
- Their type
- Their identifier
For instance, the following URI http://example.com/2021/07/product#item12345
can be split in:
- Domain: http://example.com/2021/07/
- Type: product
- Id: item12345
Some cases are more complex but we will address them as they come.
In a semantic graph, URIs can be used as nodes and edges, which make the semantic graph not a graph actually, and makes it complicated to manipulate without some specific hypothesis on an IT standpoint.
Definition of a graph transformation
Let g
be a graph transformation. g
is define as follows:
- g: D → R, D and R being sets of graphs, potentially reduced to a single graph
- For G ∈ D: g(G) = G' (1)
- If G ⊂ G', then g is said to be non destructive.
- If G ⊄ G', then g is said to be destructive.
Note: If some labels were modified, then g is qualified as destructive even if the node or edge is preserved.
Non destructive graph transformations preserve the full input graph and all the labels attached to its nodes and edges. We can call this process graph enrichment. It is used in the process of data migration.
That leads us to highlight the two kinds of graph transformations:
- Enrichment
- Modify, which has two subtypes:
- Modify destructive
- Modify non destructive
This split is quite important in IT where, implicitly, most graph transformation are in the modify mode to keep the same graph in memory. But, for operations that need to track all previous steps, we can use enrichment graph transformations.
Naturally, graph transformations can be composed:
- Let f: D → R and g: R → P
- g o f is a graph transformation from D → P
This defines:
- The basis of graph transformation reusability;
- The possible existence of a "graph transformation base" of basic graph transformations.
See after.
Definition of a root node
We will define a root node as a node of G that is preserved by g
at least in its existence (maybe its labels were modified, all except its id). Using a root note r
instead of the graph itself, we can write by extension of (1):
- g(r) = r with the first r ∈ G and the second r ∈ G'
Note: Qualifying "root node" a node of G is relative and can only be done in the context of a graph transformation.
This enables to create graph transformations with an homogeneous IT signature that enables cascades of function calls. Indeed, let's suppose r
is a root node for g1, g2 and gp, we can define h
the composition of all graph transformations gi by:
- h = gp o gp-1 o ... o g1 if h(r) = gp(gp-1(... (g1(r)...)) = r
But warning, at each step, r
is representing a different graph.
Limit conditions
In the IT context, some graph transformations may be not applicable to a certain graph because some conditions are not met. Instead of being an error, we can enrich a bit the semantics by defining the NOT_APPLICABLE
keyword.
For f
a graph transformation that is not applicable to a graph G, we can note:
- f: D → R
- G → NOT_APPLICABLE
Even if it is only another way of expressing that G ∉ D, it can be very useful in IT because D is never really defined if f
contains business rules.
To make the model homogeneous, we can postulate, in order to have the composition of graph transformation work, that:
- ∀ g, graph transformation, g(NOT_APPLICABLE) = NOT_APPLICABLE
Basic set of graph transformations
See the special pages:
Applicability of graph transformations
See:
Note: to be declined in Sparql.
Grammar for graph transformation
In the context of the page on industry data, please have a look at the grammar of graph transformation.
(Last update: July 2021)