From c8a549cc7cb7745a83dd08a4d3fc32a3eade6f32 Mon Sep 17 00:00:00 2001
From: Simon Petit
Date: Sat, 16 Nov 2024 00:10:58 +0100
Subject: [PATCH] testing again

---
 drafts/markdown_testing_suite.md              |  0
 .../awk_for_static_site_generation.md         | 13 +++++------
 posts/awk_for_static_site_generation.html     | 22 +++++++++----------
 3 files changed, 15 insertions(+), 20 deletions(-)
 create mode 100644 drafts/markdown_testing_suite.md

diff --git a/drafts/markdown_testing_suite.md b/drafts/markdown_testing_suite.md
new file mode 100644
index 0000000..e69de29
diff --git a/drafts/published/awk_for_static_site_generation.md b/drafts/published/awk_for_static_site_generation.md
index 2305c5b..02d85f2 100644
--- a/drafts/published/awk_for_static_site_generation.md
+++ b/drafts/published/awk_for_static_site_generation.md
@@ -14,7 +14,7 @@ AWK, from the intials of its creator, is an old an powerful text file maniulatio
 Its [wikipedia page](https://en.wikipedia.org/wiki/AWK) sums up nicely its story.
 I thought it was clever to use it for a site generator, to parse markdown files and generate html ones.
 However, according to this [listing](https://jamstack.org/generators/) of static site generator programs, another one has had the same idea.
-Hence, the following, as well as my code is heavily inspired by [Zodiac](https://github.com/nuex/zodiac) (even though the repo has not been touched for 8years).
+Hence, the following, as well as my code, is heavily inspired by [Zodiac](https://github.com/nuex/zodiac) (even though the repo has not been touched for 8 years).
 
 ## Parsing markdown
 
@@ -41,7 +41,7 @@ In the example above, as per the [documentation](https://www.gnu.org/software/ga
 it returns the substring of `$0` starting at position 3 (position 1 being `#` and position 2 the whitespace following it) to the end of the line.
 This is better, and we can now generalize it to all headers.
 Another function, `match`, can return the number of characters matched by a regex,
-and allows the script to dynamically determine which depth of header it parses :
+and allows the script to dynamically determine which depth of header it parses. This length is stored in the global variable `RLENGTH`:
 
     /^#+ / {
         match($0, /#+ /);
@@ -146,11 +146,8 @@ Of course I am aware that is lacks emphasis, strong and code within a line of te
 However, I did implement it, and it may be explained in another edit of this post.
 Nonetheless the code can still be consulted on [github](https://github.com/SiwonP/bob).
 
-# A testing suite for markdown parser
-
-Having a markdown parser is cool, having one well tested id better.
-I embarked in writing a testing suite for markdown parsers. I wanted it to be generic, meaning you only had to provide a parsing program,
-that takes markdown in the standard input, and returns html in the standard output.
-All tests would be provided by the test suite.
+## Parsing in-line functionalities
 
+So far we have seen a way to parse blocks, but markdown also handles strong, emphasis and links. However, these tags can appear anywhere in a line.
+Hence we need to be able to parse these lines independently of the block itself: indeed, a header can contain a strong and a link.
 
diff --git a/posts/awk_for_static_site_generation.html b/posts/awk_for_static_site_generation.html
index 90f88b5..f67b356 100644
--- a/posts/awk_for_static_site_generation.html
+++ b/posts/awk_for_static_site_generation.html
@@ -21,12 +21,12 @@

 Anyway, writing this static site generator from scratch is also the perfect excuse to explore a not-so-widely-known technology for manipulating text files.
 Introduction to AWK
 AWK, named after the initials of its creators, is an old and powerful text-manipulation tool. Syntactically close to C, it is a scripting language for manipulating text input.
-Its [wikipedia page](https://en.wikipedia.org/wiki/AWK) sums up nicely its story.
+Its wikipedia page sums up nicely its story.
 I thought it was clever to use it for a site generator, to parse markdown files and generate html ones.
-However, according to this [listing](https://jamstack.org/generators/) of static site generator programs, another one has had the same idea.
-Hence, the following, as well as my code is heavily inspired by [Zodiac](https://github.com/nuex/zodiac) (even though the repo has not been touched for 8years).
+However, according to this listing of static site generator programs, another one has had the same idea.
+Hence, the following, as well as my code, is heavily inspired by Zodiac (even though the repo has not been touched for 8 years).
 Parsing markdown
-Following the official [syntax](https://daringfireball.net/projects/markdown/syntax), is a good start for a parser.
+Following the official syntax is a good start for a parser.
 AWK works as follows: it takes an optional regex and executes the code between braces, like a function, on each line of the text input that matches.
 For example:
 /^#/ {
@@ -44,10 +44,10 @@
 }
 
 
-In the example above, as per the [documentation](https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html#index-substr_0028_0029-function)
+In the example above, as per the documentation
 it returns the substring of `$0` starting at position 3 (position 1 being `#` and position 2 the whitespace following it) to the end of the line.
 This is better, and we can now generalize it to all headers. Another function, `match`, can return the number of characters matched by a regex,
-and allows the script to dynamically determine which depth of header it parses :
+and allows the script to dynamically determine which depth of header it parses. This length is stored in the global variable `RLENGTH`:
 /^#+ / {
     match($0, /#+ /);
     n = RLENGTH;
@@ -147,12 +147,10 @@ function last() {
 

 This way we are able to simply parse markdown and turn it into an HTML file.
 Of course I am aware that it lacks emphasis, strong and code within a line of text.
 However, I did implement it, and it may be explained in another edit of this post.
-Nonetheless the code can still be consulted on [github](https://github.com/SiwonP/bob).
-A testing suite for markdown parser
-Having a markdown parser is cool, having one well tested id better.
-I embarked in writing a testing suite for markdown parsers. I wanted it to be generic, meaning you only had to provide a parsing program,
-that takes markdown in the standard input, and returns html in the standard output.
-All tests would be provided by the test suite.
+Nonetheless the code can still be consulted on github.
+Parsing in-line functionalities
+So far we have seen a way to parse blocks, but markdown also handles strong, emphasis and links. However, these tags can appear anywhere in a line.
+Hence we need to be able to parse these lines independently of the block itself: indeed, a header can contain a strong and a link.
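
As a rough illustration of parsing a line independently of its block, here is a minimal sketch that only handles `**strong**` spans. It reuses `match`, `RSTART`, `RLENGTH` and `substr` from the header example. The names `render_strong` and `strong` are hypothetical helpers made up for this sketch, not the functions used in the actual repository, and emphasis and links are deliberately left out.

```shell
#!/bin/sh
# Hypothetical helper (not from the original code): reads lines on stdin
# and wraps every **...** span in <strong> tags, leaving the rest untouched.
render_strong() {
  awk '
  # strong(s): returns s with each **text** replaced by <strong>text</strong>.
  # "out" is a local variable (extra parameter, as is customary in AWK).
  function strong(s,    out) {
      out = ""
      # match() sets RSTART (start position) and RLENGTH (match length).
      while (match(s, /\*\*[^*]+\*\*/)) {
          # Text before the match, then the inner text without the 4 stars.
          out = out substr(s, 1, RSTART - 1) \
                "<strong>" substr(s, RSTART + 2, RLENGTH - 4) "</strong>"
          # Continue scanning after the span that was just handled.
          s = substr(s, RSTART + RLENGTH)
      }
      return out s
  }
  { print strong($0) }
  '
}

# Usage: the same function could be applied to the text of a header
# before it is wrapped in its block-level tag.
printf '%s\n' 'A **bold** title' | render_strong
```

Because the function works on a plain string, a block rule (header, paragraph, list item) could call it on its content before printing, which keeps block parsing and in-line parsing separate.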