test
All checks were successful
continuous-integration/drone/push Build is passing

Simon Petit 2024-11-14 16:05:52 +01:00
parent cd22d9c8d2
commit f6a71fe353
3 changed files with 22 additions and 22 deletions

@@ -15,4 +15,4 @@ steps:
 target: /var/www/html/blog/
 source:
 - index.html
-- posts/*.html
+- posts

@@ -23,18 +23,18 @@ AWK works as follow : it takes an optional regex and execute some code between b
 For example :
 /^#/ {
-print "<h1>" $0 "</h1>"
+print "&lt;h1&gt;" $0 "&lt;/h1&gt;"
 }
 Although `$n` refers to the n-th records in the line (according to a delimiter, like in a csv), the special `$0` refers to the whole line.
-In this case, for each line starting with `#`, awk will print (to the standard output), `<h1> [content of the line] </h1>`.
+In this case, for each line starting with `#`, awk will print (to the standard output), `&lt;h1&gt; [content of the line] &lt;/h1&gt;`.
 This is the beginning to parse headers in markdown.
 However, by trying this, we immediatly see that `#` is part of the whole line, hence it also appear in the html whereas it sould not.
 AWK has a way to prevent this, as it is a complete scripting language, with built-in functions, that enable further manipulations.
 `substr` acts as its name indicates, it return a substring of its argument.
 /^#/ {
-print "<h1>" substr($0, 3) "</h1>"
+print "&lt;h1&gt;" substr($0, 3) "&lt;/h1&gt;"
 }
 In the example above, as per the [documentation](https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html#index-substr_0028_0029-function)
@@ -46,11 +46,11 @@ and allows the script to dynamically determine which depth of header it parses :
 /^#+ / {
 match($0, /#+ /);
 n = RLENGTH;
-print "<h" n-1 ">" substr($0, n + 1) "</h" n-1 ">"
+print "&lt;h" n-1 "&gt;" substr($0, n + 1) "&lt;/h" n-1 "&gt;"
 }
 Reproducing this technique to parse the rest proves to be difficult, as lists for example, are not contained in a single line, hence
-how to know when to close it with `</ul>` or `</ol>`
+how to know when to close it with `&lt;/ul&gt;` or `&lt;/ol&gt;`
 ## Introducing a LIFO stack
@@ -82,7 +82,7 @@ Turns out it came out to be easy, I only needed a pointer to track the size of t
 }
 The stack does not have to be strictly declared. The value of inside the LIFO correspond to the current markdown environment.
-This is a clever trick, because when I need to close an html tag, I use the poped element between a `</` and a `>` instead of having a matching table.
+This is a clever trick, because when I need to close an html tag, I use the poped element between a `&lt;/` and a `&gt;` instead of having a matching table.
 I also used a simple `last()` function to return the last pushed value in the stack without popping it out :
@@ -99,11 +99,11 @@ This way, parsing lists became trivial :
 env = last()
 if (env == "ul" ) {
 # In a unordered list block, print a new item
-print "<li>" substr($0, 3) "</li>"
+print "&lt;li&gt;" substr($0, 3) "&lt;/li&gt;"
 } else {
 # Otherwise, init the unordered list block
 push("ul")
-print "<ul>\n<li>" substr($0, 3) "</li>"
+print "&lt;ul&gt;\n&lt;li&gt;" substr($0, 3) "&lt;/li&gt;"
 }
 }
@@ -122,7 +122,7 @@ I have no idea if this is the best solution, but so far it proved to work:
 env = last()
 if (env == "none") {
 # If no block, print a paragraph
-print "<p>" replaceEmAndStrong($0) "</p>"
+print "&lt;p&gt;" replaceEmAndStrong($0) "&lt;/p&gt;"
 } else if (env == "blockquote") {
 print $0
 }
@@ -136,7 +136,7 @@ It only is a while loop, until the last environement is "none", as it way initia
 env = last()
 while (env != "none") {
 env = pop()
-print "</" env ">"
+print "&lt;/" env "&gt;"
 env = last()
 }
 }
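
Reviewer note: the header rule touched in the hunk at -46,11 can be tried in isolation. The following is a minimal sketch, not the post's actual script: the shell wrapper `md_headers` and the variable `level` are hypothetical names introduced here, and the pass-through rule for non-header lines is an assumption. It relies only on the POSIX awk built-ins `match`, `RLENGTH` and `substr`.

```sh
# Hypothetical wrapper (not from the commit): reads markdown on stdin,
# rewrites header lines as HTML, passes every other line through unchanged.
md_headers() {
  awk '
    /^#+ / {
      match($0, /^#+ /)       # RLENGTH = count of leading hashes plus one space
      level = RLENGTH - 1     # header depth: one hash gives 1, two give 2, ...
      print "<h" level ">" substr($0, RLENGTH + 1) "</h" level ">"
      next
    }
    { print }                 # assumption: non-header lines are left untouched
  '
}

printf '%s\n' '# Title' '### Sub section' 'plain text' | md_headers
```

The emitted tags use raw `<` and `>` because this output is meant to be HTML. The escaping to `&lt;` and `&gt;` in this commit applies to the code samples embedded in the post, which must be displayed rather than interpreted.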

@@ -30,17 +30,17 @@
 <p>AWK works as follow : it takes an optional regex and execute some code between bracket, as a function, at each line of the text input.</p>
 <p>For example :</p>
 <pre><code>/^#/ {
-print "<h1>" $0 "</h1>"
+print "&lt;h1&gt;" $0 "&lt;/h1&gt;"
 }
 </code>
 </pre>
 <p>Although `$n` refers to the n-th records in the line (according to a delimiter, like in a csv), the special `$0` refers to the whole line.</p>
-<p>In this case, for each line starting with `#`, awk will print (to the standard output), `<h1> [content of the line] </h1>`.</p>
+<p>In this case, for each line starting with `#`, awk will print (to the standard output), `&lt;h1&gt; [content of the line] &lt;/h1&gt;`.</p>
 <p>This is the beginning to parse headers in markdown.</p>
 <p>However, by trying this, we immediatly see that `#` is part of the whole line, hence it also appear in the html whereas it sould not.</p>
 <p>AWK has a way to prevent this, as it is a complete scripting language, with built-in functions, that enable further manipulations.</p>
 <pre><code>/^#/ {
-print "<h1>" substr($0, 3) "</h1>"
+print "&lt;h1&gt;" substr($0, 3) "&lt;/h1&gt;"
 }
 </code>
 </pre>
@@ -51,12 +51,12 @@
 <pre><code>/^#+ / {
 match($0, /#+ /);
 n = RLENGTH;
-print "<h" n-1 ">" substr($0, n + 1) "</h" n-1 ">"
+print "&lt;h" n-1 "&gt;" substr($0, n + 1) "&lt;/h" n-1 "&gt;"
 }
 </code>
 </pre>
 <p>Reproducing this technique to parse the rest proves to be difficult, as lists for example, are not contained in a single line, hence </p>
-<p>how to know when to close it with `</ul>` or `</ol>`</p>
+<p>how to know when to close it with `&lt;/ul&gt;` or `&lt;/ol&gt;`</p>
 <h2>Introducing a LIFO stack</h2>
 <p>Since according to the markown syntax, it is possible to have nested blocks such as headers and lists withing blockquotes, or lists withing lists, I came with the simple idea to track to current environnement in a stack in AWK.</p>
 <p>Turns out it came out to be easy, I only needed a pointer to track the size of the lifo, a fonction to push an element, an another one to pop one out :</p>
@@ -88,7 +88,7 @@ function pop() {
 </code>
 </pre>
 <p>The stack does not have to be strictly declared. The value of inside the LIFO correspond to the current markdown environment.</p>
-<p>This is a clever trick, because when I need to close an html tag, I use the poped element between a `</` and a `>` instead of having a matching table.</p>
+<p>This is a clever trick, because when I need to close an html tag, I use the poped element between a `&lt;/` and a `&gt;` instead of having a matching table.</p>
 <p>I also used a simple `last()` function to return the last pushed value in the stack without popping it out :</p>
 <pre><code># Function to get last value in LIFO
 function last() {
@@ -102,12 +102,12 @@ function last() {
 env = last()
 if (env == "ul" ) {
 # In a unordered list block, print a new item
-print "<li>" substr($0, 3) "</li>"
+print "&lt;li&gt;" substr($0, 3) "&lt;/li&gt;"
 } else {
 # Otherwise, init the unordered list block
 push("ul")
-print "<ul>
-<li>" substr($0, 3) "</li>"
+print "&lt;ul&gt;
+&lt;li&gt;" substr($0, 3) "&lt;/li&gt;"
 }
 }
 </code>
@@ -124,7 +124,7 @@ function last() {
 env = last()
 if (env == "none") {
 # If no block, print a paragraph
-print "<p>" replaceEmAndStrong($0) "</p>"
+print "&lt;p&gt;" replaceEmAndStrong($0) "&lt;/p&gt;"
 } else if (env == "blockquote") {
 print $0
 }
@@ -138,7 +138,7 @@ function last() {
 env = last()
 while (env != "none") {
 env = pop()
-print "</" env ">"
+print "&lt;/" env "&gt;"
 env = last()
 }
 }
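
Reviewer note: the bodies of `push()`, `pop()` and `last()` fall outside the hunks shown in this diff, so the following is only a sketch of how the LIFO described in the post might fit together. Everything except the three function names is an assumption: the `stack` array, the `size` counter, the `close_all()` helper, the `md_lists` shell wrapper, and the choice to return "none" for an empty stack instead of storing "none" as the bottom element.

```sh
# Hypothetical wrapper (not from the commit): handles unordered lists and
# paragraphs only, tracking the open block on a small stack.
md_lists() {
  awk '
    # Assumed implementations; the post only shows how they are called.
    function push(v) { stack[++size] = v }
    function pop()   { return (size > 0) ? stack[size--] : "none" }
    function last()  { return (size > 0) ? stack[size] : "none" }

    # Close every open block; the popped name doubles as the tag name.
    function close_all(   env) {
      while (last() != "none") {
        env = pop()
        print "</" env ">"
      }
    }

    /^- / {
      if (last() != "ul") {       # not yet inside a list: open one
        push("ul")
        print "<ul>"
      }
      print "<li>" substr($0, 3) "</li>"
      next
    }
    /^$/ { close_all(); next }    # a blank line ends the current block
    {
      close_all()                 # a paragraph cannot sit inside the list
      print "<p>" $0 "</p>"
    }
    END { close_all() }           # close whatever is still open at end of input
  '
}

printf '%s\n' '- first' '- second' '' 'a paragraph' | md_lists
```

Because the stack stores tag names directly, closing a block needs no lookup table, which is the point made in the post. Nested lists and blockquotes would need extra rules that this sketch leaves out.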