blog/drafts/published/awk_static_blog_generator.md
simonpetit f6781edad8
2025-12-10 17:11:41 +00:00

Bob, a static blog generator

The blog engine

Starting from my markdown AWK parser, which was literally written to make this blog engine possible, I've added an extra layer to turn it into a static blog generator. Of course, the parser is only one of the several components required for a blog generator, but I shall start from the beginning. Initially I wanted to blog for myself and, as described here, mostly to talk about tech. The desire to make everything from scratch and reinvent the wheel is very strong, but we'll see how this evolves in the future.

Now that I have my markdown to HTML converter, I don't lack much to turn it into bob, my blog generator.

the boilerplate

After thinking about it, I decided to rely on git to store my drafts and posts, and to have a CI listening to my blog repository that does all the publishing work on the actual webserver. Hence the need for a self-hosted git instance and CI (reinventing the wheel, I said). Maybe I shall post about gitea and drone CI later on.

For this to happen, bob shall be a simple CLI and, screw it, a docker image as well.

I also wanted to only handle the markdown file, and let the html build itself.

I came up with a very simple folder architecture:

  • a css folder containing... css files
  • a drafts folder containing... drafts written in markdown. These shall not be published yet.
  • a drafts/published subfolder, where all the published posts shall be, still in the markdown format
  • a posts folder containing the actual HTML files generated from the posts in drafts/published

The idea is as simple as it gets: I write my drafts in the folder of the same name and, when I want to publish them, I simply move them into the published subfolder; bob and the CI handle the rest.
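
The layout and the "move to publish" step can be sketched in a few shell commands. This is only an illustration: `init_layout` and `publish_draft` are hypothetical helper names that are not part of bob, and `hello.md` is a made-up draft.

```shell
# Hypothetical helpers illustrating the folder layout; not part of bob itself.

# Create the folders under the given blog root.
init_layout() {
    root=$1
    mkdir -p "$root/css" "$root/drafts/published" "$root/posts"
}

# "Publishing" a draft is just moving it into the published subfolder.
publish_draft() {
    root=$1
    draft=$2
    mv "$root/drafts/$draft" "$root/drafts/published/$draft"
}

# Demo in a throwaway directory.
blog=$(mktemp -d)
init_layout "$blog"
printf '# Hello\n' > "$blog/drafts/hello.md"
publish_draft "$blog" hello.md
ls "$blog/drafts/published"
```

In the real repository the move would presumably be followed by a commit and a push, which is what wakes up the CI.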

But the markdown converter does not create a full html page, hence the need for some boilerplate: I made an index.html template for the home page, and a post.html one for the actual articles.

Once again this is very simple: the post page template's body looks like this:

<body>
    <h1 class='title'><a href="../index.html">simpet</a></h1>
    <article>
        {{article}}
        <footer>
            <div></div>
        </footer>
    </article>
</body>

and I use awk to replace {{article}} with the actual content of the post, like so:

publish_one()
{ 
    # Storing the path of the post/article to publish
    # The path is supposed to have this format: "./drafts/published/<article>.*"
    article_path=$1

    # from the relative path, only retrieving the name of the article (without file extension)
    article_name=$(echo "$article_path" | cut -d '/' -f 4 | cut -d '.' -f 1)

    # Convert the markdown draft into an html article and storing it locally
    post=$(awk -f "${BOB_LIB}/markdown.awk" "./$article_path")

    # Retrieving the html article template
    template="${BOB_LIB}/template/post.html"

    # Escaping the & for next step to not confuse awk
    escaped_post=$(echo "$post" | sed 's/&/\\\\&/g')

    # In the template, replacing the string {{article}} by the actual content parsed above
    awk -v content="$escaped_post" '{gsub(/{{article}}/, content); print}' "$template" > "./posts/$article_name.html"
}
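
To see why the `&` escaping step matters, here is a small standalone experiment. In the replacement string of awk's `gsub`, a bare `&` stands for the matched text, so unescaped HTML entities get mangled. The function names and the sample HTML are mine, and I write the placeholder as `[{][{]article[}][}]` only as a portability precaution, since some awk implementations may treat braces in a regex as repetition operators.

```shell
# Fill a one-line template with awk, without escaping the content first.
fill_naive() {
    printf '%s\n' '<article>{{article}}</article>' |
        awk -v content="$1" '{ gsub(/[{][{]article[}][}]/, content); print }'
}

# Same, but escape every & first, with the same sed command as publish_one.
fill_escaped() {
    escaped=$(printf '%s\n' "$1" | sed 's/&/\\\\&/g')
    printf '%s\n' '<article>{{article}}</article>' |
        awk -v content="$escaped" '{ gsub(/[{][{]article[}][}]/, content); print }'
}

html='<p>Tom &amp; Jerry</p>'
naive=$(fill_naive "$html")
escaped_out=$(fill_escaped "$html")

# naive:   <article><p>Tom {{article}}amp; Jerry</p></article>
# escaped: <article><p>Tom &amp; Jerry</p></article>
printf '%s\n' "$naive" "$escaped_out"
```

The two backslashes produced by sed are needed because the value is unescaped twice: once when awk reads the `-v` assignment, and once when `gsub` interprets the replacement.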

The home page template is similar:

<body>
    <h1 class='title'>simpet</h1>
    {{articles}}
</body>

and updated this way:

update_index()
{
    # Listing all posts and making an html list (with their links) out of them
    posts=$(ls -t ./posts | awk '
        BEGIN {
            print "<ul>"
        } 
        {
            ref=$0
            # strip the extension: the dot is escaped and the match anchored at the end
            sub(/\.html$/, "", ref)
            gsub(/[_-]/, " ", ref)
            print "<li><a href=\"./posts/" $0 "\">" ref "</a></li>"
        } 
        END { 
            print "</ul>"
        }')

    # retrieving the template for the index.html
    template="${BOB_LIB}/template/index.html"

    # replacing {{articles}} in the template with the actual list of articles from above
    awk -v content="$posts" '{gsub(/{{articles}}/, content); print}' "$template" > "./index.html"

}
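
The list-building part can be tried on its own by feeding file names to the awk program instead of the output of `ls`. This is a standalone variant for experimenting: `posts_to_list` is a name I made up, `my-first-post.html` is a hypothetical file, and the extension is stripped with an anchored `sub(/\.html$/, ...)`.

```shell
# Turn a list of html file names (one per line on stdin) into an html list.
posts_to_list() {
    awk '
        BEGIN { print "<ul>" }
        {
            title = $0
            sub(/\.html$/, "", title)   # drop the extension
            gsub(/[_-]/, " ", title)    # underscores and dashes become spaces
            print "<li><a href=\"./posts/" $0 "\">" title "</a></li>"
        }
        END { print "</ul>" }'
}

list=$(printf '%s\n' awk_static_blog_generator.html my-first-post.html | posts_to_list)
printf '%s\n' "$list"
# <ul>
# <li><a href="./posts/awk_static_blog_generator.html">awk static blog generator</a></li>
# <li><a href="./posts/my-first-post.html">my first post</a></li>
# </ul>
```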

Whenever an article is added to or removed from the drafts/published folder, update_index() adjusts the home page, because it is called by this function:

publish_all()
{
    # List all drafts to be published
    published=$(ls -1 ./drafts/published)

    # turning it into an array
    published_array=($published)
    
    # Remove all html articles in case a previously published one was removed
    # (-f so that an empty posts folder is not an error)
    rm -f ./posts/*.html
    
    # Publish them one by one (ie turning md into html)
    for file in "${published_array[@]}"; do
        publish_one "./drafts/published/$file"
    done

    # updating the index.html as new articles are supposedly present and some may be removed
    update_index
}

which basically just reads the posts that are ready to be published, turns each of them into an html file using the template, and then updates the index.html.
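
One possible hardening, not something bob does today: iterating with a shell glob instead of turning the output of `ls` into an array keeps file names with spaces in one piece. A minimal sketch, where `for_each_published` and the demo file names are hypothetical and the `printf` stands in for the call to publish_one:

```shell
# Visit every regular file in a directory using a glob rather than ls.
for_each_published() {
    dir=$1
    for file in "$dir"/*; do
        # With an empty folder the glob stays unexpanded, so skip non-files.
        [ -f "$file" ] || continue
        printf 'would publish: %s\n' "$file"
    done
}

# Demo in a throwaway directory.
demo=$(mktemp -d)
touch "$demo/first_post.md" "$demo/post with spaces.md"
result=$(for_each_published "$demo")
printf '%s\n' "$result"
```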

That's it!

To sum up

I've made a very simple, not very customisable static blog generator, mostly using awk. It clearly is not optimized, as it regenerates all the articles every time, but awk is quite efficient and, for a few posts, I don't think it really matters.
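
If regenerating everything ever becomes a problem, one way out could be to rebuild a post only when its markdown source is newer than the generated HTML. The sketch below is an idea, not bob's behaviour: `needs_rebuild` is a hypothetical helper, and it relies on the `-nt` test, which bash and most other shells provide but which is not strictly POSIX. In a fresh CI checkout the modification times may not be meaningful, so this would mostly help local runs.

```shell
# Succeeds when the html file is missing or older than its markdown source.
needs_rebuild() {
    src=$1
    out=$2
    [ ! -e "$out" ] || [ "$src" -nt "$out" ]
}

# Demo with explicit timestamps: the html is one month older than the source.
work=$(mktemp -d)
touch -t 202501010000 "$work/old.html"
touch -t 202502010000 "$work/post.md"

if needs_rebuild "$work/post.md" "$work/old.html"; then
    echo "post.md changed, regenerate"
fi
```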

The real benefit is that I only handle markdown files; the CI and bob do the rest...

Also, a static site loads blazing fast in the browser, and since I use neither images (yet) nor javascript, I get a very, very fast blog.

To be continued...