blog/posts/awk_static_blog_generator.html
simonpetit 318f2799c5
2025-12-10 17:54:49 +00:00


<!DOCTYPE html>
<html lang="fr" dir="ltr">
<head>
<meta charset="utf-8">
<title>simpet</title>
<meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover">
<link href="https://fonts.googleapis.com/css?family=Cutive+Mono|IBM+Plex+Mono&display=swap" rel="stylesheet">
<link rel="stylesheet" type="text/css" href="../css/poststyle.css">
</head>
<body>
<h1 class='title'><a href="../index.html">simpet</a></h1>
<article>
<div class='dates'>
<p>Created at: <time datetime="2025-12-03 17:52:35">2025-12-03 17:52:35</time></p>
<p>Updated at: <time datetime="2025-12-10 17:11:41">2025-12-10 17:11:41</time></p>
</div>
<h1>Bob, a static blog generator</h1>
<h2>The blog engine</h2>
<p>Starting from my markdown AWK parser, which was literally written to power this blog engine, I've added an extra layer to turn it into a static blog generator.
Of course, the parser is only one of the several components a blog generator requires, but I shall start from the beginning.
Initially I wanted to blog for myself and, as described <a href="https://simonpetit.top/posts/awk_for_static_site_generation.html">here</a>, mostly to talk about tech.
The desire to make everything from scratch and reinvent the wheel is very strong, but we'll see how this evolves in the future.</p>
<p>Now that I have my markdown to HTML converter, not much is missing to turn it into <code>bob</code>, my blog generator.</p>
<h2>The boilerplate</h2>
<p>After thinking about it, I did want to rely on git to store my drafts and posts, and have a CI listening to my blog repository that would do all the publishing work on the actual webserver. Hence the need for a self-hosted git instance and CI (reinventing the wheel, I said).
Maybe I shall post about <code>gitea</code> and <code>drone CI</code> later on.</p>
<p>For this to happen, <code>bob</code> shall be a simple CLI and, screw it, a docker image as well.</p>
<p>I also wanted to only handle the markdown files, and let the html build itself.</p>
<p>I came up with a very simple folder layout:</p>
<ul>
<li>a <code>css</code> folder containing...css files</li>
<li>a <code>drafts</code> folder containing...drafts written in markdown. These are not published yet.</li>
<li>a <code>drafts/published</code> subfolder, where all the published posts live, still in the markdown format</li>
<li>a <code>posts</code> folder containing the actual HTML files generated from the posts in <code>drafts/published</code></li>
</ul>
<p>The idea is as simple as it gets: I write my drafts in the folder of the same name and, when I want to publish one, I simply move it into the <code>published</code> subfolder. <code>bob</code> and the CI handle the rest.</p>
<p>But the markdown converter does not create a full html page, so here comes the need for some boilerplate:
I made an <code>index.html</code> template for the home page, and a <code>post.html</code> one for the actual articles.</p>
<p>Once again this is very simple. The body of the post page template looks like this:</p>
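<p>As a rough illustration of that workflow, here is a minimal sketch run in a throwaway directory. The scratch directory and the file name <code>my_first_post.md</code> are made up for the example; only the folder names come from the list above.</p>

```shell
#!/bin/sh
set -eu

# Assumption: a scratch directory stands in for the blog repository
blog_root=$(mktemp -d)
cd "$blog_root"

# The folders described above
mkdir -p css drafts/published posts

# Write a draft (hypothetical file name)
printf '# My first post\n' > drafts/my_first_post.md

# Publishing is nothing more than a move into the published subfolder
mv drafts/my_first_post.md drafts/published/

# Show what is now waiting to be turned into html
ls drafts/published
```

<p>Nothing here generates any html yet: the move only marks the post as ready, and the conversion is left to <code>bob</code>.</p>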
<pre><code>&lt;body&gt;
&lt;h1 class='title'&gt;&lt;a href="../index.html"&gt;simpet&lt;/a&gt;&lt;/h1&gt;
&lt;article&gt;
{{article}}
&lt;footer&gt;
&lt;div&gt;&lt;/div&gt;
&lt;/footer&gt;
&lt;/article&gt;
&lt;/body&gt;
</code>
</pre>
<p>and I use <code>awk</code> to replace <code>{{article}}</code> with the actual content of the post, like so:</p>
<pre><code>publish_one()
{
    # Path of the post/article to publish
    # The path is supposed to have this format: "./drafts/published/&lt;article&gt;.&#42;"
    article_path=$1
    # From the relative path, keep only the name of the article (no directory, no file extension)
    article_name=$(basename "$article_path")
    article_name=${article_name%.&#42;}
    # Convert the markdown draft into an html article and store it in a variable
    post=$(awk -f "${BOB_LIB}/markdown.awk" "$article_path")
    # Retrieve the html article template
    template="${BOB_LIB}/template/post.html"
    # Escape every & so that gsub does not expand it to the matched text in the next step
    escaped_post=$(echo "$post" | sed 's/&/\\&/g')
    # In the template, replace the string {{article}} with the actual content parsed above
    awk -v content="$escaped_post" '{gsub(/{{article}}/, content); print}' "$template" &gt; "./posts/$article_name.html"
}
</code>
</pre>
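<p>To see the placeholder substitution in isolation, here is a small self-contained sketch. It is a variant, not the function above: the template and the post are hypothetical one-liners, the braces in the pattern are escaped, and the content is handed to <code>awk</code> through the environment instead of <code>-v</code>, because <code>-v</code> interprets backslash escapes in the value while <code>ENVIRON</code> does not.</p>

```shell
#!/bin/sh
set -eu

# Scratch directory, only for this demo
work=$(mktemp -d)

# Hypothetical stand-in for template/post.html
printf '%s\n' '<article>' '{{article}}' '</article>' > "$work/post.html"

# Hypothetical converted post containing an ampersand
post='<p>Tom &amp; Jerry</p>'

# Prefix every backslash and ampersand with a backslash so gsub() keeps them literal
escaped_post=$(printf '%s\n' "$post" | sed 's/[\\&]/\\&/g')

# ENVIRON values reach awk untouched, so the escaping done by sed survives as is
result=$(CONTENT="$escaped_post" awk '{ gsub(/\{\{article\}\}/, ENVIRON["CONTENT"]); print }' "$work/post.html")

printf '%s\n' "$result"
```

<p>Without the <code>sed</code> step, <code>gsub</code> would treat each <code>&amp;</code> in the replacement as "the text that was matched" and put <code>{{article}}</code> back in the middle of the post.</p>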
<p>The home page template is similar:</p>
<pre><code>&lt;body&gt;
&lt;h1 class='title'&gt;simpet&lt;/h1&gt;
{{articles}}
&lt;/body&gt;
</code>
</pre>
<p>and updated this way:</p>
<pre><code>update_index()
{
    # List all posts, most recent first, and build an html list of links out of them
    posts=$(ls -t ./posts | awk '
    BEGIN {
        print "&lt;ul&gt;"
    }
    {
        # Link text: the file name without its extension, with _ and - turned into spaces
        ref=$0
        sub(/\.html$/, "", ref)
        gsub(/[_-]/, " ", ref)
        print "&lt;li&gt;&lt;a href=\"./posts/" $0 "\"&gt;" ref "&lt;/a&gt;&lt;/li&gt;"
    }
    END {
        print "&lt;/ul&gt;"
    }')
    # Retrieve the template for the index.html
    template="${BOB_LIB}/template/index.html"
    # Replace {{articles}} in the template with the list of articles built above
    awk -v content="$posts" '{gsub(/{{articles}}/, content); print}' "$template" &gt; "./index.html"
}
</code>
</pre>
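<p>The way a file name becomes a link is easier to see on a single entry. This sketch feeds one file name (the name of this very post, used purely as an example) through the same kind of <code>awk</code> rule, anchoring the extension removal on a trailing <code>.html</code>:</p>

```shell
#!/bin/sh
set -eu

# One example file name goes in, one list item comes out
link=$(printf '%s\n' 'awk_static_blog_generator.html' | awk '{
    ref = $0
    sub(/\.html$/, "", ref)    # drop the extension
    gsub(/[_-]/, " ", ref)     # underscores and dashes become spaces
    print "<li><a href=\"./posts/" $0 "\">" ref "</a></li>"
}')

printf '%s\n' "$link"
```

<p>The link target keeps the original file name, while the visible text becomes "awk static blog generator". The title shown on the home page is therefore entirely driven by how the markdown file is named.</p>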
<p>Whenever an article is added to or removed from the <code>drafts/published</code> folder, <code>update_index()</code> adjusts the home page, because it is called by this function:</p>
<pre><code>publish_all()
{
    # List all drafts to be published
    published=$(ls -1 ./drafts/published)
    # Turn the list into an array (this splits on whitespace, so file names must not contain spaces)
    published_array=($published)
    # Remove all html articles in case a previously published one was removed
    # (-f keeps rm quiet when there is nothing to delete yet)
    rm -f ./posts/&#42;.html
    # Publish them one by one (ie turning md into html)
    for file in "${published_array[@]}"; do
        publish_one "./drafts/published/$file"
    done
    # Update the index.html, as new articles may be present and some may have been removed
    update_index
}
</code>
</pre>
<p>which basically only reads the posts that are ready to be published, turns each of them into an html file using the template, and then updates the <code>index.html</code>.</p>
<p>That's it!</p>
<h2>To sum up</h2>
<p>I've made a very simple, not very customisable static blog generator, mostly using awk. It clearly is not optimized, as it regenerates all the articles every time, but awk is quite efficient and, for a few posts, I don't think it really matters.</p>
<p>The real benefit is that I only handle markdown files; the CI and <code>bob</code> do the rest...</p>
<p>Also, a static site loads blazing fast in the browser, and since I use neither images (yet) nor javascript, I get a very, very fast blog.</p>
<p>To be continued...</p>
<footer>
<div></div>
</footer>
</article>
</body>
</html>