Semantic Markdown Specifications

Markdown (MD) has become the de facto standard syntax for writing on the web, pushed by GitHub and Stack Overflow. It is heavily used every time one needs to enter a comment or write a simple (document-style) HTML page. What if we could embed semantic annotations in a Markdown document? We would get Semantic Markdown! Imagine the best of both worlds between human-readable/writable documents and machine-readable/writable (RDF) structured data. We could feed an RDF knowledge graph that is coupled with our set of MD documents, and we would have an easy way to put structure in content.

I see a lot of potential in this, and already see some use cases. Unfortunately I don't have the bandwidth, nor the full skills, to make this happen. So I am just writing this in the hope that someone implements the idea, or that someone tells me it is total nonsense…

Here are the semantic annotation use cases I see for such a Semantic Markdown:

  1. Annotate a span or title that corresponds to an entity;
  2. Annotate a piece of text with an existing URI for an entity;
  3. Create some statements about an entity.

Note that I am not necessarily looking for a way to produce RDFa annotations on the generated HTML, although that would be nice for a schema.org use-case. Any conversion route from the original semantically annotated markdown to a set of triples would be fine.

My source of inspiration is essentially "Span Inline Attribute Lists" from the Kramdown syntax.

Annotate a span that corresponds to an entity

This piece of Semantic Markdown:

Tomorrow I am travelling to _Berlin_ {.schema:Place}

when interpreted by a Semantic Markdown parser, would produce this set of triples:

_:1 a <http://schema.org/Place> .
_:1 rdfs:label "Berlin" .

The span immediately preceding the "{.xxxx}" annotation is taken as the label of the entity. The use of rdfs:label to store that label could be a parser configuration option.
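To make the rule concrete, here is a minimal Python sketch of how a parser could turn an annotated span into triples. Everything in it is an assumption of mine, not an existing tool: the regular expression, the `span_triples` function, the `label_property` option and the hard-coded `schema:` prefix are only there for illustration.

```python
import re

# Hypothetical pattern: an emphasised span (_text_ or *text*), an optional
# space, then a {.prefix:Term} annotation.
SPAN_ANNOTATION = re.compile(
    r"[_*](?P<label>[^_*]+)[_*]\s*\{\.(?P<curie>[\w-]+:[\w-]+)\}"
)

# Assumption: only the schema: prefix is known, as in the example above.
PREFIXES = {"schema": "http://schema.org/"}


def span_triples(text, label_property="rdfs:label"):
    """Return Turtle-like triple strings for each annotated span in `text`."""
    triples = []
    for index, match in enumerate(SPAN_ANNOTATION.finditer(text), start=1):
        prefix, term = match.group("curie").split(":", 1)
        node = f"_:{index}"  # one blank node per annotated span
        triples.append(f"{node} a <{PREFIXES[prefix]}{term}> .")
        triples.append(f'{node} {label_property} "{match.group("label")}" .')
    return triples


for triple in span_triples("Tomorrow I am travelling to _Berlin_ {.schema:Place}"):
    print(triple)
```

The `label_property` argument stands in for the configuration option mentioned above: passing `"schema:name"` instead would store the label with another property.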

One could imagine that a Semantic Markdown parser relies on the RDFa Initial Context to interpret the "schema:" prefix without further declaration. But what about other ontologies? We would need some kind of prefix / vocab declaration somewhere in the document, just like in RDFa.
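A rough sketch of that prefix resolution, in Python. The dictionary below is only a small excerpt of the RDFa Initial Context, and since the syntax for declaring prefixes in the document is still an open question, the hypothetical `expand_curie` function simply receives the declared prefixes as a dictionary.

```python
# A small excerpt of the RDFa Core Initial Context (not the full list).
INITIAL_CONTEXT = {
    "schema": "http://schema.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand_curie(curie, declared_prefixes=None):
    """Expand 'prefix:Term' to a full IRI.

    Prefixes declared in the document (passed in as a dict, because the
    declaration syntax itself is undecided) override the defaults.
    """
    prefixes = {**INITIAL_CONTEXT, **(declared_prefixes or {})}
    prefix, _, term = curie.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"Undeclared prefix: {prefix}")
    return prefixes[prefix] + term


print(expand_curie("schema:Place"))
print(expand_curie("ex:Meeting", {"ex": "http://example.org/ns#"}))
```

Raising an error on an unknown prefix is one possible choice; a more lenient parser could ignore the annotation instead.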

Note also that a Markdown parser supporting the "{.xxxxx}" syntax will also insert this value as a CSS class on the corresponding span, so we win on both the CSS level and the semantic level.

Annotate a title

Similarly, we could annotate a title:

### European Semantic Web Conference {.schema:Event}
Lorem ipsum...

In that case, the full content of the title is interpreted as the label of the entity:

_:1 a <http://schema.org/Event> .
_:1 rdfs:label "European Semantic Web Conference" .
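The title case could be handled along the same lines. In this sketch the `HEADING_ANNOTATION` pattern and the `heading_triples` function are hypothetical names of mine, and the `schema:` prefix is again hard-coded.

```python
import re

# Hypothetical pattern: an ATX heading whose text ends with {.prefix:Term}.
HEADING_ANNOTATION = re.compile(
    r"^#{1,6}\s+(?P<label>.+?)\s*\{\.(?P<curie>[\w-]+:[\w-]+)\}\s*$"
)
PREFIXES = {"schema": "http://schema.org/"}  # assumption: single known prefix


def heading_triples(line, node="_:1"):
    """Return triples for one annotated heading line, or [] if it has no annotation."""
    match = HEADING_ANNOTATION.match(line)
    if match is None:
        return []
    prefix, term = match.group("curie").split(":", 1)
    return [
        f"{node} a <{PREFIXES[prefix]}{term}> .",
        # The whole heading text, minus the annotation, becomes the label.
        f'{node} rdfs:label "{match.group("label")}" .',
    ]


for triple in heading_triples("### European Semantic Web Conference {.schema:Event}"):
    print(triple)
```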

Annotate with a known URI

Tomorrow I am travelling to [Berlin](https://www.wikidata.org/wiki/Q64) {.schema:Place}

would yield:

<https://www.wikidata.org/wiki/Q64> a <http://schema.org/Place> .
<https://www.wikidata.org/wiki/Q64> rdfs:label "Berlin" .
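Here the only difference is the subject: the link target replaces the blank node. A small sketch, with a hypothetical `LINK_ANNOTATION` pattern and `link_triples` function, and the `schema:` prefix assumed known:

```python
import re

# Hypothetical pattern: a Markdown link [label](uri) followed by {.prefix:Term}.
LINK_ANNOTATION = re.compile(
    r"\[(?P<label>[^\]]+)\]\((?P<uri>[^)\s]+)\)\s*\{\.(?P<curie>[\w-]+:[\w-]+)\}"
)
PREFIXES = {"schema": "http://schema.org/"}  # assumption: single known prefix


def link_triples(text):
    """Use the link target as subject and the link text as label."""
    triples = []
    for match in LINK_ANNOTATION.finditer(text):
        prefix, term = match.group("curie").split(":", 1)
        subject = f"<{match.group('uri')}>"
        triples.append(f"{subject} a <{PREFIXES[prefix]}{term}> .")
        triples.append(f'{subject} rdfs:label "{match.group("label")}" .')
    return triples


text = "Tomorrow I am travelling to [Berlin](https://www.wikidata.org/wiki/Q64) {.schema:Place}"
for triple in link_triples(text):
    print(triple)
```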

Describe an entity

If a list follows an annotated entity, then it should be interpreted as a set of predicates with this entity as subject:

### Specifications Meeting {.schema:Event}

* Date : _11/10_{.schema:startDate}
* Place {.schema:location} : Our office, Street name, 75014 Paris
* Meeting participants : 
  {.schema:attendee}
  * Thomas Francart{.schema:Person}
  * [Someone else](https://www.wikidata.org/wiki/Q80)
  * Tim Foo
* Description : Some information not annotated

### Next title
Lorem ipsum...

should yield:

_:1 a <http://schema.org/Event> .
_:1 rdfs:label "Specifications Meeting" .
_:1 <http://schema.org/startDate> "11/10" .
_:1 <http://schema.org/location> "Our office, Street name, 75014 Paris" .
_:1 <http://schema.org/attendee> _:2 , <https://www.wikidata.org/wiki/Q80>, _:3 .

# attendee that is annotated : we know a type and a name
_:2 a <http://schema.org/Person> .
_:2 rdfs:label "Thomas Francart" .

# attendee that is annotated with a URI : we keep the URI and add a label to it (?)
<https://www.wikidata.org/wiki/Q80> rdfs:label "Someone else" .

# attendee that is not annotated - but we know he was an attendee
_:3 rdfs:label "Tim Foo" .

In other words:

  1. If a list follows a title or a paragraph that contains an annotated entity…
  2. Then items in this list correspond to properties of this entity…
  3. And can be annotated with a property.
  4. The property annotation can be placed on an inline text, or right before or after a `:` or `=` character.
  5. If the property annotation immediately precedes a list, then all items in this list are considered values for that property. In that case they can be entities annotated with a type, entities identified by a URI, or entities that are not annotated (which we would treat as blank nodes with only a label).
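As a rough illustration of rules 1 to 4, here is a Python sketch that attaches annotated top-level list items to the entity of the preceding annotated title. It is deliberately incomplete: nested lists (rule 5) and the `=` separator are left out, all values are emitted as plain literals, and the names `describe`, `iri`, `ANNOTATION` and `INLINE_VALUE` are hypothetical.

```python
import re

ANNOTATION = re.compile(r"\{\.(?P<curie>[\w-]+:[\w-]+)\}")
# An emphasised value directly followed by its property annotation.
INLINE_VALUE = re.compile(r"[_*](?P<value>[^_*]+)[_*]\s*\{\.(?P<curie>[\w-]+:[\w-]+)\}")
PREFIXES = {"schema": "http://schema.org/"}  # assumption: single known prefix


def iri(curie):
    prefix, term = curie.split(":", 1)
    return f"<{PREFIXES[prefix]}{term}>"


def describe(markdown):
    """Return triples for annotated titles and the top-level lists that follow them."""
    triples, subject, count = [], None, 0
    for line in markdown.splitlines():
        if line.startswith("#"):
            match = ANNOTATION.search(line)
            if match is None:
                subject = None  # an unannotated title ends the description
                continue
            count += 1
            subject = f"_:{count}"
            label = ANNOTATION.sub("", line).lstrip("#").strip()
            triples.append(f"{subject} a {iri(match.group('curie'))} .")
            triples.append(f'{subject} rdfs:label "{label}" .')
        elif subject and line.startswith("* "):
            inline = INLINE_VALUE.search(line)
            plain = ANNOTATION.search(line)
            if inline:  # annotation placed on an inline text
                prop, value = inline.group("curie"), inline.group("value")
            elif plain:  # annotation placed next to the ':' separator
                prop = plain.group("curie")
                value = ANNOTATION.sub("", line).partition(":")[2].strip()
            else:
                continue  # item without annotation: no triple
            triples.append(f'{subject} {iri(prop)} "{value}" .')
    return triples


document = """### Specifications Meeting {.schema:Event}

* Date : _11/10_{.schema:startDate}
* Place {.schema:location} : Our office, Street name, 75014 Paris
* Description : Some information not annotated

### Next title
Lorem ipsum...
"""
for triple in describe(document):
    print(triple)
```

Supporting the attendee list would mean tracking indentation and remembering the property that precedes a nested list, which is probably where a real Markdown parser with an AST becomes preferable to line-by-line regular expressions.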

Related works

Metadata for Markdown, a Python extension to generate JSON-LD from a YAML section in a Markdown document.

EDIT: Pandoc divs and spans: https://pandoc.org/MANUAL.html#divs-and-spans

I like the <span> syntax:

[This is *some text*]{.class key="val"}

This is close! But it still would not produce triples, unless one explicitly writes RDFa:

My name is [Thomas Francart]{typeof="schema:Person"}
