parent cd22d9c8d2
commit f6a71fe353
@@ -15,4 +15,4 @@ steps:
 target: /var/www/html/blog/
 source:
 - index.html
-- posts/*.html
+- posts
@@ -23,18 +23,18 @@ AWK works as follow : it takes an optional regex and execute some code between b
For example:

    /^#/ {
        print "<h1>" $0 "</h1>"
    }

Although `$n` refers to the n-th field of the line (according to a delimiter, as in a CSV), the special `$0` refers to the whole line.
In this case, for each line starting with `#`, awk prints `<h1>[content of the line]</h1>` to the standard output.
This is a first step toward parsing Markdown headers.
However, trying it out immediately shows that `#` is part of the whole line, so it also appears in the HTML output, where it should not.
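To see the problem concretely, here is a quick check that can be run in a shell. It is only a sketch: the input line `# Hello` is made up for illustration, and the rule is the one from the example above.

```shell
# Feed one made-up Markdown header line to the rule shown above.
out=$(printf '# Hello\n' | awk '/^#/ { print "<h1>" $0 "</h1>" }')
echo "$out"
```

The output still carries the leading `# ` inside the `<h1>` tag, which is exactly what needs to be stripped.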
AWK has a way to prevent this: it is a complete scripting language whose built-in functions enable further manipulation.
`substr` does what its name suggests: it returns a substring of its argument.

    /^#/ {
        print "<h1>" substr($0, 3) "</h1>"
    }

In the example above, as per the [documentation](https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html#index-substr_0028_0029-function)
@@ -46,11 +46,11 @@ and allows the script to dynamically determine which depth of header it parses :
    /^#+ / {
        match($0, /#+ /);
        n = RLENGTH;
        print "<h" n-1 ">" substr($0, n + 1) "</h" n-1 ">"
    }

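As a sanity check, the rule above can be tried on a couple of header lines. This is a minimal sketch: the two input lines are invented for the example, and the parentheses around `n - 1` are added here only to make the grouping explicit.

```shell
# Two made-up header lines; the rule derives the level from the length of the "#+ " match.
out=$(printf '# Title\n## Section\n' | awk '
/^#+ / {
    match($0, /#+ /)
    n = RLENGTH                      # number of "#" characters plus the trailing space
    print "<h" (n - 1) ">" substr($0, n + 1) "</h" (n - 1) ">"
}')
echo "$out"
```

Because `RLENGTH` counts the trailing space as well, `n - 1` is the header level and `n + 1` is the position of the first character of the title.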
Reproducing this technique to parse the rest proves difficult: lists, for example, are not contained in a single line, so
how do we know when to close them with `</ul>` or `</ol>`?

## Introducing a LIFO stack

@@ -82,7 +82,7 @@ Turns out it came out to be easy, I only needed a pointer to track the size of t
    }

The stack does not have to be explicitly declared. The values inside the LIFO correspond to the current Markdown environment.
This is a clever trick: when I need to close an HTML tag, I put the popped element between `</` and `>` instead of maintaining a lookup table.

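The closing trick can be illustrated with a small stand-alone sketch. The function bodies below are an assumption written for this example, not necessarily the code of the post: the array name `stack` and the counter `size` are hypothetical, and the `"none"` fallback in `pop()` is a guess at how an empty stack could be handled.

```shell
out=$(awk '
# Hypothetical names: "stack" is the array, "size" the pointer to its top.
function push(env) { stack[++size] = env }
function pop()     { return (size > 0) ? stack[size--] : "none" }
BEGIN {
    push("blockquote")      # entering a blockquote
    push("ul")              # then a list nested inside it
    print "</" pop() ">"    # closes the innermost block first
    print "</" pop() ">"
}')
echo "$out"
```

Since the stored value is the tag name itself, the closing tag is built directly from whatever `pop()` returns, and the innermost block is closed first.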
I also used a simple `last()` function to return the last pushed value in the stack without popping it:

@@ -99,11 +99,11 @@ This way, parsing lists became trivial :
        env = last()
        if (env == "ul" ) {
            # In an unordered list block, print a new item
            print "<li>" substr($0, 3) "</li>"
        } else {
            # Otherwise, init the unordered list block
            push("ul")
            print "<ul>\n<li>" substr($0, 3) "</li>"
        }
    }

@@ -122,7 +122,7 @@ I have no idea if this is the best solution, but so far it proved to work:
        env = last()
        if (env == "none") {
            # If no block, print a paragraph
            print "<p>" replaceEmAndStrong($0) "</p>"
        } else if (env == "blockquote") {
            print $0
        }
@@ -136,7 +136,7 @@ It only is a while loop, until the last environement is "none", as it way initia
        env = last()
        while (env != "none") {
            env = pop()
            print "</" env ">"
            env = last()
        }
    }
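To show how the pieces fit together, here is a self-contained sketch that combines the list rule and the closing loop. Several parts are assumptions made for the example rather than the post's actual code: the bodies of `push()`, `pop()` and `last()`, the `closeAll()` helper name, the `"none"` sentinel stored at index 0, the choice of a blank line (and `END`) as the trigger for closing, and the simplified paragraph rule that skips `replaceEmAndStrong`.

```shell
out=$(printf '%s\n' '- apples' '- pears' '' 'plain text' | awk '
function push(env) { stack[++size] = env }
function pop()     { return stack[size--] }
function last()    { return stack[size] }
function closeAll(   env) {             # hypothetical helper wrapping the loop above
    env = last()
    while (env != "none") {
        env = pop()
        print "</" env ">"
        env = last()
    }
}
BEGIN { size = 0; stack[0] = "none" }   # assumed initial state
/^- / {
    if (last() == "ul") {
        print "<li>" substr($0, 3) "</li>"
    } else {
        push("ul")
        print "<ul>\n<li>" substr($0, 3) "</li>"
    }
    next
}
/^$/ { closeAll(); next }               # assumption: a blank line ends open blocks
{ print "<p>" $0 "</p>" }               # simplified paragraph rule
END { closeAll() }')
echo "$out"
```

With this input, the two items end up inside a single `<ul>` that is closed at the blank line, and the last line is printed as a paragraph.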
@@ -30,17 +30,17 @@
<p>AWK works as follows: it takes an optional regex and executes some code between braces, like a function, on each line of the text input.</p>
<p>For example:</p>
<pre><code>/^#/ {
    print "<h1>" $0 "</h1>"
}
</code>
</pre>
<p>Although `$n` refers to the n-th field of the line (according to a delimiter, as in a CSV), the special `$0` refers to the whole line.</p>
<p>In this case, for each line starting with `#`, awk prints `<h1>[content of the line]</h1>` to the standard output.</p>
<p>This is a first step toward parsing Markdown headers.</p>
<p>However, trying it out immediately shows that `#` is part of the whole line, so it also appears in the HTML output, where it should not.</p>
<p>AWK has a way to prevent this: it is a complete scripting language whose built-in functions enable further manipulation.</p>
<pre><code>/^#/ {
    print "<h1>" substr($0, 3) "</h1>"
}
</code>
</pre>
@@ -51,12 +51,12 @@
<pre><code>/^#+ / {
    match($0, /#+ /);
    n = RLENGTH;
    print "<h" n-1 ">" substr($0, n + 1) "</h" n-1 ">"
}
</code>
</pre>
<p>Reproducing this technique to parse the rest proves difficult: lists, for example, are not contained in a single line, so</p>
<p>how do we know when to close them with `</ul>` or `</ol>`?</p>
<h2>Introducing a LIFO stack</h2>
<p>Since the Markdown syntax allows nested blocks, such as headers and lists within blockquotes or lists within lists, I came up with the simple idea of tracking the current environment in a stack in AWK.</p>
<p>It turned out to be easy: I only needed a pointer to track the size of the LIFO, a function to push an element, and another one to pop one out:</p>
@@ -88,7 +88,7 @@ function pop() {
</code>
</pre>
<p>The stack does not have to be explicitly declared. The values inside the LIFO correspond to the current Markdown environment.</p>
<p>This is a clever trick: when I need to close an HTML tag, I put the popped element between `</` and `>` instead of maintaining a lookup table.</p>
<p>I also used a simple `last()` function to return the last pushed value in the stack without popping it:</p>
<pre><code># Function to get last value in LIFO
function last() {
@@ -102,12 +102,12 @@ function last() {
    env = last()
    if (env == "ul" ) {
        # In an unordered list block, print a new item
        print "<li>" substr($0, 3) "</li>"
    } else {
        # Otherwise, init the unordered list block
        push("ul")
        print "<ul>\n<li>" substr($0, 3) "</li>"
    }
}
</code>
@@ -124,7 +124,7 @@ function last() {
    env = last()
    if (env == "none") {
        # If no block, print a paragraph
        print "<p>" replaceEmAndStrong($0) "</p>"
    } else if (env == "blockquote") {
        print $0
    }
@@ -138,7 +138,7 @@ function last() {
    env = last()
    while (env != "none") {
        env = pop()
        print "</" env ">"
        env = last()
    }
}