Working with RDF graphs in JavaScript can be cumbersome compared to working with JSON data structures. This post explores different techniques for handling RDF in JavaScript less tedious and more idiomatic.
For much of the 2000s and early 2010s, Java and then Python were the preferred programming languages for manipulating data serialized in RDF, SPARQL endpoints, and other Semantic Web technologies. There are mature libraries in both languages, including Apache Jena and RDF4J in Java and RDFLib in Python.
Since that time JavaScript has largely supplanted Java and Python as the language of choice for handling RDF. The shift can be ascribed to the maturation of JavaScript as a language; the creation of the RDF/JS specifications, which have become de facto standards for programmatic access to RDF in JavaScript; and the Solid project's choice of JavaScript as its main implementation language, among other reasons.
JavaScript, like Java and Python, lacks built-in primitives for manipulating graph data. All three languages are object-oriented and work best with record-like data structures (i.e., product types).
In the absence of explicit language support there are various library-based techniques for handling RDF graph data in JavaScript, described in the sections below.
Low-level RDF/JS
The RDF/JS specifications represent the lowest common denominator, record-oriented way of working with RDF in JavaScript. The following example from the N3.js library documentation employs simple, object-oriented JavaScript to iterate over a set of hard-coded RDF quads using an RDF/JS DatasetCore
implementation (N3.Store
):
const store = new N3.Store();
store.add(
namedNode('http://ex.org/Pluto'),
namedNode('http://ex.org/type'),
namedNode('http://ex.org/Dog')
);
store.add(
namedNode('http://ex.org/Mickey'),
namedNode('http://ex.org/type'),
namedNode('http://ex.org/Mouse')
);
// Retrieve all quads
for (const quad of store)
console.log(quad);
// Retrieve Mickey's quads
for (const quad of store.match(namedNode('http://ex.org/Mickey'), null, null))
console.log(quad);
The DatasetCore
match
method is the workhorse of the interface, extracting subsets of quads in the dataset that match a pattern. The pattern consists of exact (Term equals
) or wildcard (anything accepted) matches on the components of a quad (subject, predicate, object, graph). More sophisticated matching such as partial IRIs or literal comparisons is left to higher-level libraries.
Domain-specific languages
Traversing subgraphs with match
quad patterns quickly becomes involved. Even a simple SPARQL WHERE
clause such as ?s foaf:knows ?t . ?t foaf:knows ?z .
would require two match
loops.
Libraries such as clownface simplify RDF manipulation with an embedded domain-specific language. Under the hood, clownface uses RDF/JS interfaces as primitives.
The following abridged example from the clownface documentation demonstrates the technique:
const peopleStuartKnows = stuartBloom
.out(ns.schema.knows)
.map((person) => {
const personalInformation = person.out([
ns.schema.givenName,
ns.schema.familyName
])
return personalInformation.values.join(' ')
})
.join(', ')
Starting from the RDF node stuartBloom
, which represents the person Stuart Bloom:
- Find all
(stuartBloom, schema:knows, person)
triples, where everyperson
object corresponds to a person Stuart Bloom knows - For each
person
, map theperson
to their full name by- Finding the
person
's given name by matching the triple pattern (person
,schema:givenName
,givenName
) - Finding the
person
's family name in a similar manner - Concatenating the resulting literal values with a space separator
- Finding the
- Join all the full names of all the people with a
,
Graph to record projection
SPARQL SELECT
queries can be used to project nodes and edges in the RDF graph into JSON data structures. The SPARQL 1.1 Query Language specification includes the following example SELECT
query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?nameX ?nameY ?nickY
WHERE
{ ?x foaf:knows ?y ;
foaf:name ?nameX .
?y foaf:name ?nameY .
OPTIONAL { ?y foaf:nick ?nickY }
}
with example results in JSON:
{
"head": {
"vars": [ "nameX" , "nameY" , "nickY" ]
} ,
"results": {
"bindings": [
{
"nameX": { "type": "literal" , "value": "Alice" } ,
"nameY": { "type": "literal" , "value": "Bob" }
} ,
{
"nameX": { "type": "literal" , "value": "Alice" } ,
"nameY": { "type": "literal" , "value": "Clare" } ,
"nickY": { "type": "literal" , "value": "CT" }
}
]
}
}
Manipulating the JSON results in JavaScript is straightforward:
for (const binding of outer.results.bindings) {
console.log(binding.nameX.value);
}
Validating schemas for the JSON data structures can also be defined with yup, https://github.com/colinhacks/zod, or similar libraries, and TypeScript types inferred at compile-time from the schemas.
SPARQL projection can be used to create nested records, too. For example, a Person
record may have a list of friends
in the form of nested Person
records. There is no nesting in the graph, only triples connecting the various Person
nodes to each other. Converting references (the triples) to nested records creates redundancy in the resulting data but simplifies processing in JavaScript, since there are language primitives for accessing nested records (e.g., object[key1][key2]
).
Conclusion
Although libraries for handling RDF in JavaScript long predate the emergence of the RDF/JS specifications, the latter have provided a foundation for higher-level interfaces such as clownface. In the absence of JavaScript language primitives for manipulating RDF graph data, concise and idiomatic domain-specific languages and libraries are the preferred technique for working with RDF in JavaScript.