test
All checks were successful
continuous-integration/drone/push Build is passing

Simon Petit 2024-11-14 16:05:52 +01:00
parent cd22d9c8d2
commit f6a71fe353
3 changed files with 22 additions and 22 deletions

@@ -15,4 +15,4 @@ steps:
target: /var/www/html/blog/
source:
- index.html
- - posts/*.html
+ - posts

@@ -23,18 +23,18 @@ AWK works as follow : it takes an optional regex and execute some code between b
For example :
/^#/ {
- print "<h1>" $0 "</h1>"
+ print "&lt;h1&gt;" $0 "&lt;/h1&gt;"
}
Although `$n` refers to the n-th records in the line (according to a delimiter, like in a csv), the special `$0` refers to the whole line.
- In this case, for each line starting with `#`, awk will print (to the standard output), `<h1> [content of the line] </h1>`.
+ In this case, for each line starting with `#`, awk will print (to the standard output), `&lt;h1&gt; [content of the line] &lt;/h1&gt;`.
This is the beginning to parse headers in markdown.
However, by trying this, we immediatly see that `#` is part of the whole line, hence it also appear in the html whereas it sould not.
AWK has a way to prevent this, as it is a complete scripting language, with built-in functions, that enable further manipulations.
`substr` acts as its name indicates, it return a substring of its argument.
/^#/ {
- print "<h1>" substr($0, 3) "</h1>"
+ print "&lt;h1&gt;" substr($0, 3) "&lt;/h1&gt;"
}
In the example above, as per the [documentation](https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html#index-substr_0028_0029-function)
@@ -46,11 +46,11 @@ and allows the script to dynamically determine which depth of header it parses :
/^#+ / {
match($0, /#+ /);
n = RLENGTH;
- print "<h" n-1 ">" substr($0, n + 1) "</h" n-1 ">"
+ print "&lt;h" n-1 "&gt;" substr($0, n + 1) "&lt;/h" n-1 "&gt;"
}
Reproducing this technique to parse the rest proves to be difficult, as lists for example, are not contained in a single line, hence
- how to know when to close it with `</ul>` or `</ol>`
+ how to know when to close it with `&lt;/ul&gt;` or `&lt;/ol&gt;`
## Introducing a LIFO stack
@@ -82,7 +82,7 @@ Turns out it came out to be easy, I only needed a pointer to track the size of t
}
The stack does not have to be strictly declared. The value of inside the LIFO correspond to the current markdown environment.
- This is a clever trick, because when I need to close an html tag, I use the poped element between a `</` and a `>` instead of having a matching table.
+ This is a clever trick, because when I need to close an html tag, I use the poped element between a `&lt;/` and a `&gt;` instead of having a matching table.
I also used a simple `last()` function to return the last pushed value in the stack without popping it out :
@@ -99,11 +99,11 @@ This way, parsing lists became trivial :
env = last()
if (env == "ul" ) {
# In a unordered list block, print a new item
- print "<li>" substr($0, 3) "</li>"
+ print "&lt;li&gt;" substr($0, 3) "&lt;/li&gt;"
} else {
# Otherwise, init the unordered list block
push("ul")
- print "<ul>\n<li>" substr($0, 3) "</li>"
+ print "&lt;ul&gt;\n&lt;li&gt;" substr($0, 3) "&lt;/li&gt;"
}
}
@@ -122,7 +122,7 @@ I have no idea if this is the best solution, but so far it proved to work:
env = last()
if (env == "none") {
# If no block, print a paragraph
- print "<p>" replaceEmAndStrong($0) "</p>"
+ print "&lt;p&gt;" replaceEmAndStrong($0) "&lt;/p&gt;"
} else if (env == "blockquote") {
print $0
}
@@ -136,7 +136,7 @@ It only is a while loop, until the last environement is "none", as it way initia
env = last()
while (env != "none") {
env = pop()
- print "</" env ">"
+ print "&lt;/" env "&gt;"
env = last()
}
}

@@ -30,17 +30,17 @@
<p>AWK works as follow : it takes an optional regex and execute some code between bracket, as a function, at each line of the text input.</p>
<p>For example :</p>
<pre><code>/^#/ {
- print "<h1>" $0 "</h1>"
+ print "&lt;h1&gt;" $0 "&lt;/h1&gt;"
}
</code>
</pre>
<p>Although `$n` refers to the n-th records in the line (according to a delimiter, like in a csv), the special `$0` refers to the whole line.</p>
- <p>In this case, for each line starting with `#`, awk will print (to the standard output), `<h1> [content of the line] </h1>`.</p>
+ <p>In this case, for each line starting with `#`, awk will print (to the standard output), `&lt;h1&gt; [content of the line] &lt;/h1&gt;`.</p>
<p>This is the beginning to parse headers in markdown.</p>
<p>However, by trying this, we immediatly see that `#` is part of the whole line, hence it also appear in the html whereas it sould not.</p>
<p>AWK has a way to prevent this, as it is a complete scripting language, with built-in functions, that enable further manipulations.</p>
<pre><code>/^#/ {
- print "<h1>" substr($0, 3) "</h1>"
+ print "&lt;h1&gt;" substr($0, 3) "&lt;/h1&gt;"
}
</code>
</pre>
@@ -51,12 +51,12 @@
<pre><code>/^#+ / {
match($0, /#+ /);
n = RLENGTH;
- print "<h" n-1 ">" substr($0, n + 1) "</h" n-1 ">"
+ print "&lt;h" n-1 "&gt;" substr($0, n + 1) "&lt;/h" n-1 "&gt;"
}
</code>
</pre>
<p>Reproducing this technique to parse the rest proves to be difficult, as lists for example, are not contained in a single line, hence </p>
- <p>how to know when to close it with `</ul>` or `</ol>`</p>
+ <p>how to know when to close it with `&lt;/ul&gt;` or `&lt;/ol&gt;`</p>
<h2>Introducing a LIFO stack</h2>
<p>Since according to the markown syntax, it is possible to have nested blocks such as headers and lists withing blockquotes, or lists withing lists, I came with the simple idea to track to current environnement in a stack in AWK.</p>
<p>Turns out it came out to be easy, I only needed a pointer to track the size of the lifo, a fonction to push an element, an another one to pop one out :</p>
@@ -88,7 +88,7 @@ function pop() {
</code>
</pre>
<p>The stack does not have to be strictly declared. The value of inside the LIFO correspond to the current markdown environment.</p>
- <p>This is a clever trick, because when I need to close an html tag, I use the poped element between a `</` and a `>` instead of having a matching table.</p>
+ <p>This is a clever trick, because when I need to close an html tag, I use the poped element between a `&lt;/` and a `&gt;` instead of having a matching table.</p>
<p>I also used a simple `last()` function to return the last pushed value in the stack without popping it out :</p>
<pre><code># Function to get last value in LIFO
function last() {
@@ -102,12 +102,12 @@ function last() {
env = last()
if (env == "ul" ) {
# In a unordered list block, print a new item
- print "<li>" substr($0, 3) "</li>"
+ print "&lt;li&gt;" substr($0, 3) "&lt;/li&gt;"
} else {
# Otherwise, init the unordered list block
push("ul")
- print "<ul>
- <li>" substr($0, 3) "</li>"
+ print "&lt;ul&gt;
+ &lt;li&gt;" substr($0, 3) "&lt;/li&gt;"
}
}
</code>
@@ -124,7 +124,7 @@ function last() {
env = last()
if (env == "none") {
# If no block, print a paragraph
- print "<p>" replaceEmAndStrong($0) "</p>"
+ print "&lt;p&gt;" replaceEmAndStrong($0) "&lt;/p&gt;"
} else if (env == "blockquote") {
print $0
}
@@ -138,7 +138,7 @@ function last() {
env = last()
while (env != "none") {
env = pop()
- print "</" env ">"
+ print "&lt;/" env "&gt;"
env = last()
}
}
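
The change in this commit escapes `<` and `>` by hand in the post's code samples so that they survive inside `<pre><code>`. As a sketch only, and not something this commit contains, the same escaping could be done by a small awk filter. `escape_html` is a hypothetical name, and the assumption is that code-sample lines would be piped through it before being written to the HTML page.

```shell
# Hypothetical helper (not part of this commit): escape HTML special
# characters on every line read from standard input.
escape_html() {
  awk '{
    # Escape "&" first so the entities added below are not escaped again.
    # In a gsub() replacement a bare "&" stands for the matched text,
    # so a literal ampersand is written as "\\&".
    gsub(/&/, "\\&amp;")
    gsub(/</, "\\&lt;")
    gsub(/>/, "\\&gt;")
    print
  }'
}

# Prints: print "&lt;h1&gt;" $0 "&lt;/h1&gt;"
printf '%s\n' 'print "<h1>" $0 "</h1>"' | escape_html
```

With a filter like this the Markdown source could keep the raw `<h1>` form, at the cost of one more pass over code-block lines.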