The Art of Nested Code Fencing in Markdown
In case you didn't know, it is easy to nest code block delimiters in CommonMark. Since GitHub Flavored Markdown (GFM) is a strict superset of CommonMark, whatever we discuss in this post applies equally well to both CommonMark and GFM.
Why would we ever need to nest code block delimiters within a pair of code block delimiters? Let us start with a simple example.
Basic Code Fences
Say we are writing a README.md where we want to teach how to write Markdown. Naturally, we would write the Markdown examples within fenced code blocks so that the raw Markdown source text is preserved. For example, we might write:
To add emphasis to some text, wrap it in a pair of asterisks.
For example:
```
hello, *world*
```
A CommonMark renderer converts the above Markdown text to the following HTML:
<p>To add emphasis to some text, wrap it in a pair of asterisks.
For example:</p>
<pre><code>hello, *world*
</code></pre>
The rendered output looks like this:
To add emphasis to some text, wrap it in a pair of asterisks. For example:
hello, *world*
But what if we want to use triple backticks within the fenced code block? Perhaps we want to teach how to use code fences in our README.md. This presents a difficulty. We cannot write triple backticks within a fenced code block that is started with triple backticks. The second occurrence of triple backticks would be interpreted as the closing code fence, thereby terminating the code block. CommonMark offers a few solutions to resolve this difficulty.
Fancy Code Fences
There are mainly two ways to include triple backticks within fenced code blocks. First, we can use tildes as the code fence:
To write a code block, wrap it in a pair of triple backticks.
For example:
~~~
```
echo 'hello, world'
```
~~~
In fact, the code fence need not be exactly three backticks or tildes. Any number of backticks or tildes is allowed as long as the number is at least three and the closing fence is at least as long as the opening fence. So the following is equivalent:
To write a code block, wrap it in a pair of triple backticks.
For example:
~~~~~~~~~~~~~~~~~~~
```
echo 'hello, world'
```
~~~~~~~~~~~~~~~~~~~
So is this, which shows the second way, a backtick fence longer than the run of backticks inside it:
To write a code block, wrap it in a pair of triple backticks.
For example:
```````````````````
```
echo 'hello, world'
```
```````````````````
All three examples above produce the following HTML:
<p>To write a code block, wrap it in a pair of triple backticks.
For example:</p>
<pre><code>```
echo 'hello, world'
```
</code></pre>
The rendered output looks like this:
To write a code block, wrap it in a pair of triple backticks. For example:
```
echo 'hello, world'
```
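If we generate Markdown from a script, the fence rule above can be applied mechanically. The following is a minimal Python sketch, not part of any Markdown library. The names `fence_for` and `wrap_in_fence` are made up for this post, and the sketch is deliberately conservative: it makes the fence longer than every run of the fence character in the content, even though only a run that sits alone on a line could close the block.

```python
import re


def fence_for(content: str, char: str = "`") -> str:
    """Return a fence made of `char` that is safe to wrap `content` in."""
    if char not in ("`", "~"):
        raise ValueError("a fence must be made of backticks or tildes")
    # Find the longest run of the fence character anywhere in the content.
    runs = re.findall(re.escape(char) + "+", content)
    longest = max((len(run) for run in runs), default=0)
    # A fence needs at least three characters; we also make it longer
    # than any run inside the content (a conservative assumption).
    return char * max(3, longest + 1)


def wrap_in_fence(content: str, char: str = "`") -> str:
    """Wrap `content` in a fenced code block with a suitable fence."""
    fence = fence_for(content, char)
    return f"{fence}\n{content}\n{fence}"


if __name__ == "__main__":
    triple = "`" * 3
    inner = f"{triple}\necho 'hello, world'\n{triple}"
    print(wrap_in_fence(inner))       # wrapped in a longer backtick fence
    print(wrap_in_fence(inner, "~"))  # wrapped in a tilde fence
```

With a tilde fence, three tildes are enough for the example above because the content contains no tildes at all. With a backtick fence, the sketch picks four backticks.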
Basic Code Spans
A similar problem arises with inline code spans as well. Most Markdown users know to use backticks to delimit code spans. For example:
The `echo` command writes arguments to standard output.
This produces the following HTML:
<p>The <code>echo</code> command writes arguments to standard output.</p>
The rendered HTML looks like this:
The echo command writes arguments to standard output.
However, what do we do when we need to write a backtick within an inline code span?
Fancy Code Spans
A code span delimiter need not be exactly one backtick. It can be any number of backticks, as long as the opening and closing delimiters have the same length. So `echo` and ``echo`` produce identical HTML. There is another important but lesser-known detail. When the text within an inline code span both begins and ends with a space, a single space is removed from both ends. So `echo` and ` echo ` are equivalent. Therefore, when we need to put backticks within an inline code span, we can delimit the span with multiple backticks, placing a space just inside each delimiter. Here is an example:
The string `` `${a}` `` is an example of a template literal.
Here is the HTML produced:
<p>The string <code>`${a}`</code> is an example of a template literal.</p>
Here is the rendered HTML:
The string `${a}` is an example of a template literal.
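The same trick can be automated when code spans are generated by a program. Here is a small Python sketch under a few assumptions: `code_span` is a hypothetical helper name, the text is non-empty, and it contains no line endings. It picks the shortest delimiter whose length differs from every backtick run in the text, and pads with spaces where the delimiter would otherwise merge with the text.

```python
import re


def code_span(text: str) -> str:
    """Wrap `text` in an inline code span, choosing a safe delimiter."""
    # Lengths of every backtick run that occurs inside the text.
    run_lengths = {len(run) for run in re.findall(r"`+", text)}
    # Use the shortest delimiter whose length matches no run in the text,
    # so that no run inside can be mistaken for the closing delimiter.
    n = 1
    while n in run_lengths:
        n += 1
    delimiter = "`" * n
    # Pad with one space per side when the text starts or ends with a
    # backtick, or when it both starts and ends with a space (a renderer
    # would otherwise strip one space from each end of such text).
    needs_padding = (
        text.startswith("`")
        or text.endswith("`")
        or (text.startswith(" ") and text.endswith(" ") and text.strip(" ") != "")
    )
    if needs_padding:
        text = f" {text} "
    return f"{delimiter}{text}{delimiter}"


if __name__ == "__main__":
    print(code_span("echo"))
    print(code_span("`${a}`"))
```

For the template literal above, the sketch produces the same double-backtick, space-padded form as the hand-written example.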
Specification
In this section, I'll note down excerpts from the CommonMark Spec Version 0.30, which is by now over four years old.
From section 4.5 Fenced Code Blocks:
A code fence is a sequence of at least three consecutive backtick characters (`) or tildes (~). (Tildes and backticks cannot be mixed.)
The content of the code block consists of all subsequent lines, until a closing code fence of the same type as the code block began with (backticks or tildes), and with at least as many backticks or tildes as the opening code fence.
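To make the closing rule concrete, here is a rough Python sketch of the check a parser might perform on each line. The function name `is_closing_fence` is invented for illustration. The sketch also assumes two details from the same section of the spec that are not quoted above: a closing fence may be indented by up to three spaces, and it may be followed only by spaces.

```python
import re


def is_closing_fence(line: str, fence_char: str, fence_len: int) -> bool:
    """Report whether `line` closes a block opened by `fence_len` copies of `fence_char`."""
    # Up to three spaces of indentation, a run of backticks or tildes,
    # then nothing but spaces (assumed from the spec, see the note above).
    match = re.fullmatch(r" {0,3}(`+|~+) *", line)
    if match is None:
        return False
    run = match.group(1)
    # The run must be of the same type as the opening fence and at least as long.
    return run[0] == fence_char and len(run) >= fence_len


if __name__ == "__main__":
    triple = "`" * 3
    # A triple-backtick line closes a triple-backtick block ...
    print(is_closing_fence(triple, "`", 3))
    # ... but not a block opened with tildes or with a longer backtick fence.
    print(is_closing_fence(triple, "~", 3))
    print(is_closing_fence(triple, "`", 19))
```

This is why the inner triple backticks in the earlier examples are treated as ordinary content: they are either of the wrong type or too short to close the outer fence.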
From section 6.1 Code Spans:
A backtick string is a string of one or more backtick characters (`) that is neither preceded nor followed by a backtick.
A code span begins with a backtick string and ends with a backtick string of equal length. The contents of the code span are the characters between these two backtick strings, normalized in the following ways:
- First, line endings are converted to spaces.
- If the resulting string both begins and ends with a space character, but does not consist entirely of space characters, a single space character is removed from the front and back. This allows you to include code that begins or ends with backtick characters, which must be separated by whitespace from the opening or closing backtick strings.
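The two normalization steps quoted above can be sketched in a few lines of Python. This is only an illustration of the quoted rules, not code taken from any CommonMark implementation, and `normalize_code_span` is a name chosen for this post.

```python
def normalize_code_span(content: str) -> str:
    """Apply the two code span normalization steps to `content`."""
    # Step 1: line endings are converted to spaces.
    content = content.replace("\r\n", " ").replace("\r", " ").replace("\n", " ")
    # Step 2: if the string both begins and ends with a space, but is not
    # made up entirely of spaces, remove one space from the front and back.
    if (
        len(content) >= 2
        and content.startswith(" ")
        and content.endswith(" ")
        and content.strip(" ") != ""
    ):
        content = content[1:-1]
    return content


if __name__ == "__main__":
    # The content between the double-backtick delimiters in the example above.
    print(normalize_code_span(" `${a}` "))
```

Note that only one space is removed from each end, so content that needs to keep a leading and trailing space can be padded with two spaces on each side.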
I hope this will be useful to you some day.