This commit is contained in:
parent
cd22d9c8d2
commit
f6a71fe353
@ -15,4 +15,4 @@ steps:
|
|||||||
target: /var/www/html/blog/
|
target: /var/www/html/blog/
|
||||||
source:
|
source:
|
||||||
- index.html
|
- index.html
|
||||||
- posts/*.html
|
- posts
|
||||||
|
@ -23,18 +23,18 @@ AWK works as follow : it takes an optional regex and execute some code between b
|
|||||||
For example :
|
For example :
|
||||||
|
|
||||||
/^#/ {
|
/^#/ {
|
||||||
print "<h1>" $0 "</h1>"
|
print "<h1>" $0 "</h1>"
|
||||||
}
|
}
|
||||||
|
|
||||||
Although `$n` refers to the n-th field of the line (split on a delimiter, as in a CSV), the special `$0` refers to the whole line.
|
Although `$n` refers to the n-th field of the line (split on a delimiter, as in a CSV), the special `$0` refers to the whole line.
|
||||||
In this case, for each line starting with `#`, awk will print (to the standard output), `<h1> [content of the line] </h1>`.
|
In this case, for each line starting with `#`, awk will print (to the standard output), `<h1> [content of the line] </h1>`.
|
||||||
This is a first step toward parsing headers in markdown.
|
This is a first step toward parsing headers in markdown.
|
||||||
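A quick way to try the rule above, sketched as a shell one-liner. The two-line sample input is made up for illustration, and any POSIX awk should behave the same way.

```sh
# Feed a made-up two-line sample to the first rule.
# Only the line starting with "#" matches; the other line produces no output.
out=$(printf '# Title\nplain text\n' | awk '/^#/ { print "<h1>" $0 "</h1>" }')
printf '%s\n' "$out"
# prints: <h1># Title</h1>
```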
However, trying this immediately shows that `#` is part of the whole line, so it also appears in the HTML, where it should not.
|
However, trying this immediately shows that `#` is part of the whole line, so it also appears in the HTML, where it should not.
|
||||||
AWK can avoid this: it is a complete scripting language whose built-in functions enable further manipulation.
|
AWK can avoid this: it is a complete scripting language whose built-in functions enable further manipulation.
|
||||||
`substr` does what its name suggests: it returns a substring of its argument.
|
`substr` does what its name suggests: it returns a substring of its argument.
|
||||||
|
|
||||||
/^#/ {
|
/^#/ {
|
||||||
print "<h1>" substr($0, 3) "</h1>"
|
print "<h1>" substr($0, 3) "</h1>"
|
||||||
}
|
}
|
||||||
|
|
||||||
In the example above, as per the [documentation](https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html#index-substr_0028_0029-function)
|
In the example above, as per the [documentation](https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html#index-substr_0028_0029-function)
|
||||||
@ -46,11 +46,11 @@ and allows the script to dynamically determine which depth of header it parses :
|
|||||||
/^#+ / {
|
/^#+ / {
|
||||||
match($0, /#+ /);
|
match($0, /#+ /);
|
||||||
n = RLENGTH;
|
n = RLENGTH;
|
||||||
print "<h" n-1 ">" substr($0, n + 1) "</h" n-1 ">"
|
print "<h" n-1 ">" substr($0, n + 1) "</h" n-1 ">"
|
||||||
}
|
}
|
||||||
|
|
||||||
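To check the depth computation, the header rule can be run on a small sample. This is only a sketch: `render_headers` and the `level` variable are hypothetical names introduced here, and the input lines are made up.

```sh
# Hypothetical wrapper around the header rule shown above.
render_headers() {
  awk '/^#+ / {
    match($0, /#+ /)        # sets RLENGTH to the length of the "#... " prefix
    level = RLENGTH - 1     # number of "#" characters, i.e. the header depth
    print "<h" level ">" substr($0, RLENGTH + 1) "</h" level ">"
  }'
}

printf '# Title\n## Section\n' | render_headers
# prints:
# <h1>Title</h1>
# <h2>Section</h2>
```

`RLENGTH` counts the trailing space as well, which is why the depth is `RLENGTH - 1` and the text starts at `RLENGTH + 1`.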
Reproducing this technique to parse the rest proves difficult: lists, for example, are not contained in a single line, so
|
Reproducing this technique to parse the rest proves difficult: lists, for example, are not contained in a single line, so
|
||||||
how do we know when to close them with `</ul>` or `</ol>`?
|
how do we know when to close them with `</ul>` or `</ol>`?
|
||||||
|
|
||||||
## Introducing a LIFO stack
|
## Introducing a LIFO stack
|
||||||
|
|
||||||
@ -82,7 +82,7 @@ Turns out it came out to be easy, I only needed a pointer to track the size of t
|
|||||||
}
|
}
|
||||||
|
|
||||||
The stack does not have to be declared explicitly. Each value in the LIFO corresponds to a markdown environment, the top one being the current environment.
|
The stack does not have to be declared explicitly. Each value in the LIFO corresponds to a markdown environment, the top one being the current environment.
|
||||||
This is a clever trick: when I need to close an HTML tag, I put the popped element between `</` and `>` instead of keeping a lookup table.
|
This is a clever trick: when I need to close an HTML tag, I put the popped element between `</` and `>` instead of keeping a lookup table.
|
||||||
|
|
||||||
I also used a simple `last()` function to return the last pushed value in the stack without popping it out :
|
I also used a simple `last()` function to return the last pushed value in the stack without popping it out :
|
||||||
|
|
||||||
@ -99,11 +99,11 @@ This way, parsing lists became trivial :
|
|||||||
env = last()
|
env = last()
|
||||||
if (env == "ul" ) {
|
if (env == "ul" ) {
|
||||||
# In an unordered list block, print a new item
|
# In an unordered list block, print a new item
|
||||||
print "<li>" substr($0, 3) "</li>"
|
print "<li>" substr($0, 3) "</li>"
|
||||||
} else {
|
} else {
|
||||||
# Otherwise, init the unordered list block
|
# Otherwise, init the unordered list block
|
||||||
push("ul")
|
push("ul")
|
||||||
print "<ul>\n<li>" substr($0, 3) "</li>"
|
print "<ul>\n<li>" substr($0, 3) "</li>"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -122,7 +122,7 @@ I have no idea if this is the best solution, but so far it proved to work:
|
|||||||
env = last()
|
env = last()
|
||||||
if (env == "none") {
|
if (env == "none") {
|
||||||
# If no block, print a paragraph
|
# If no block, print a paragraph
|
||||||
print "<p>" replaceEmAndStrong($0) "</p>"
|
print "<p>" replaceEmAndStrong($0) "</p>"
|
||||||
} else if (env == "blockquote") {
|
} else if (env == "blockquote") {
|
||||||
print $0
|
print $0
|
||||||
}
|
}
|
||||||
@ -136,7 +136,7 @@ It only is a while loop, until the last environement is "none", as it way initia
|
|||||||
env = last()
|
env = last()
|
||||||
while (env != "none") {
|
while (env != "none") {
|
||||||
env = pop()
|
env = pop()
|
||||||
print "</" env ">"
|
print "</" env ">"
|
||||||
env = last()
|
env = last()
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
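The pieces above can be put together in one small runnable sketch. The bodies of `push`, `pop` and `last`, the `stack` and `size` names, the `/^- /` list pattern and the wrapper `render_list` are all assumptions made for illustration, since this diff does not show the post's own definitions.

```sh
# Sketch only: the helper bodies below are guesses, not the post's code.
render_list() {
  awk '
  function push(v) { stack[++size] = v }     # size is the stack pointer
  function pop()   { return stack[size--] }
  function last()  { return stack[size] }    # peek without popping

  BEGIN { push("none") }                     # sentinel: no block is open

  /^- / {
    if (last() != "ul") {                    # first item: open the list
      push("ul")
      print "<ul>"
    }
    print "<li>" substr($0, 3) "</li>"
  }

  END {
    while (last() != "none") {               # close every block still open
      print "</" pop() ">"
    }
  }'
}

printf '%s\n' '- one' '- two' | render_list
# prints:
# <ul>
# <li>one</li>
# <li>two</li>
# </ul>
```

Because the popped value is the tag name itself, the closing loop needs no table mapping environments to closing tags.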
|
@ -30,17 +30,17 @@
|
|||||||
<p>AWK works as follows: it takes an optional regex and executes some code between braces, like a function, on each line of the text input.</p>
|
<p>AWK works as follows: it takes an optional regex and executes some code between braces, like a function, on each line of the text input.</p>
|
||||||
<p>For example :</p>
|
<p>For example :</p>
|
||||||
<pre><code>/^#/ {
|
<pre><code>/^#/ {
|
||||||
print "<h1>" $0 "</h1>"
|
print "<h1>" $0 "</h1>"
|
||||||
}
|
}
|
||||||
</code>
|
</code>
|
||||||
</pre>
|
</pre>
|
||||||
<p>Although `$n` refers to the n-th field of the line (split on a delimiter, as in a CSV), the special `$0` refers to the whole line.</p>
|
<p>Although `$n` refers to the n-th field of the line (split on a delimiter, as in a CSV), the special `$0` refers to the whole line.</p>
|
||||||
<p>In this case, for each line starting with `#`, awk will print (to the standard output), `<h1> [content of the line] </h1>`.</p>
|
<p>In this case, for each line starting with `#`, awk will print (to the standard output), `<h1> [content of the line] </h1>`.</p>
|
||||||
<p>This is a first step toward parsing headers in markdown.</p>
|
<p>This is a first step toward parsing headers in markdown.</p>
|
||||||
<p>However, trying this immediately shows that `#` is part of the whole line, so it also appears in the HTML, where it should not.</p>
|
<p>However, trying this immediately shows that `#` is part of the whole line, so it also appears in the HTML, where it should not.</p>
|
||||||
<p>AWK can avoid this: it is a complete scripting language whose built-in functions enable further manipulation.</p>
|
<p>AWK can avoid this: it is a complete scripting language whose built-in functions enable further manipulation.</p>
|
||||||
<pre><code>/^#/ {
|
<pre><code>/^#/ {
|
||||||
print "<h1>" substr($0, 3) "</h1>"
|
print "<h1>" substr($0, 3) "</h1>"
|
||||||
}
|
}
|
||||||
</code>
|
</code>
|
||||||
</pre>
|
</pre>
|
||||||
@ -51,12 +51,12 @@
|
|||||||
<pre><code>/^#+ / {
|
<pre><code>/^#+ / {
|
||||||
match($0, /#+ /);
|
match($0, /#+ /);
|
||||||
n = RLENGTH;
|
n = RLENGTH;
|
||||||
print "<h" n-1 ">" substr($0, n + 1) "</h" n-1 ">"
|
print "<h" n-1 ">" substr($0, n + 1) "</h" n-1 ">"
|
||||||
}
|
}
|
||||||
</code>
|
</code>
|
||||||
</pre>
|
</pre>
|
||||||
<p>Reproducing this technique to parse the rest proves difficult: lists, for example, are not contained in a single line, so </p>
|
<p>Reproducing this technique to parse the rest proves difficult: lists, for example, are not contained in a single line, so </p>
|
||||||
<p>how do we know when to close them with `</ul>` or `</ol>`?</p>
|
<p>how do we know when to close them with `</ul>` or `</ol>`?</p>
|
||||||
<h2>Introducing a LIFO stack</h2>
|
<h2>Introducing a LIFO stack</h2>
|
||||||
<p>Since the markdown syntax allows nested blocks, such as headers and lists within blockquotes, or lists within lists, I came up with the simple idea of tracking the current environment in a stack in AWK.</p>
|
<p>Since the markdown syntax allows nested blocks, such as headers and lists within blockquotes, or lists within lists, I came up with the simple idea of tracking the current environment in a stack in AWK.</p>
|
||||||
<p>It turned out to be easy: I only needed a pointer to track the size of the LIFO, a function to push an element, and another one to pop one out:</p>
|
<p>It turned out to be easy: I only needed a pointer to track the size of the LIFO, a function to push an element, and another one to pop one out:</p>
|
||||||
@ -88,7 +88,7 @@ function pop() {
|
|||||||
</code>
|
</code>
|
||||||
</pre>
|
</pre>
|
||||||
<p>The stack does not have to be declared explicitly. Each value in the LIFO corresponds to a markdown environment, the top one being the current environment.</p>
|
<p>The stack does not have to be declared explicitly. Each value in the LIFO corresponds to a markdown environment, the top one being the current environment.</p>
|
||||||
<p>This is a clever trick: when I need to close an HTML tag, I put the popped element between `</` and `>` instead of keeping a lookup table.</p>
|
<p>This is a clever trick: when I need to close an HTML tag, I put the popped element between `</` and `>` instead of keeping a lookup table.</p>
|
||||||
<p>I also used a simple `last()` function to return the last pushed value in the stack without popping it out :</p>
|
<p>I also used a simple `last()` function to return the last pushed value in the stack without popping it out :</p>
|
||||||
<pre><code># Function to get last value in LIFO
|
<pre><code># Function to get last value in LIFO
|
||||||
function last() {
|
function last() {
|
||||||
@ -102,12 +102,12 @@ function last() {
|
|||||||
env = last()
|
env = last()
|
||||||
if (env == "ul" ) {
|
if (env == "ul" ) {
|
||||||
# In an unordered list block, print a new item
|
# In an unordered list block, print a new item
|
||||||
print "<li>" substr($0, 3) "</li>"
|
print "<li>" substr($0, 3) "</li>"
|
||||||
} else {
|
} else {
|
||||||
# Otherwise, init the unordered list block
|
# Otherwise, init the unordered list block
|
||||||
push("ul")
|
push("ul")
|
||||||
print "<ul>
|
print "<ul>
|
||||||
<li>" substr($0, 3) "</li>"
|
<li>" substr($0, 3) "</li>"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
</code>
|
</code>
|
||||||
@ -124,7 +124,7 @@ function last() {
|
|||||||
env = last()
|
env = last()
|
||||||
if (env == "none") {
|
if (env == "none") {
|
||||||
# If no block, print a paragraph
|
# If no block, print a paragraph
|
||||||
print "<p>" replaceEmAndStrong($0) "</p>"
|
print "<p>" replaceEmAndStrong($0) "</p>"
|
||||||
} else if (env == "blockquote") {
|
} else if (env == "blockquote") {
|
||||||
print $0
|
print $0
|
||||||
}
|
}
|
||||||
@ -138,7 +138,7 @@ function last() {
|
|||||||
env = last()
|
env = last()
|
||||||
while (env != "none") {
|
while (env != "none") {
|
||||||
env = pop()
|
env = pop()
|
||||||
print "</" env ">"
|
print "</" env ">"
|
||||||
env = last()
|
env = last()
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|