# A test suite for a markdown parser

As I implemented my own markdown parser for [bob](https://git.simonpetit.top/simonpetit/bob), my static site (blog) generator, I also wanted to make sure it was parsing markdown correctly.

Hence I thought about a custom test suite. After all, this blog is also about making things from scratch to get a better understanding of how they work overall.

## The concept

Although this is a custom script, I still wanted to make it somewhat generic. In other words, I wanted it to be usable against any markdown parser (assuming it follows certain input/output constraints).

For example, if the test script is `test_parser` and the parser is `parser`, then testing the parser should be the command:

    test_parser parser

The one condition is that `parser` takes the markdown string on standard input and prints the rendered html to standard output.
This way the test script can feed the parser custom markdown, for which it knows the expected html, and directly compare that expected html with the output of the parser.

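To make that contract concrete, here is a minimal sketch of a single check. It is not the actual `test_parser` script: `check` and `toy_parser` are names made up for this illustration, and `toy_parser` is a stand-in parser that only understands level-one headings.

```shell
#!/bin/sh
# toy_parser: hypothetical stand-in for a real parser. It reads markdown on
# standard input and prints html on standard output (level-one headings only).
toy_parser() {
    sed 's|^# \(.*\)$|<h1>\1</h1>|'
}

# check MARKDOWN EXPECTED PARSER...
# Feeds MARKDOWN to the parser command and compares its output with EXPECTED.
check() {
    markdown=$1
    expected=$2
    shift 2
    actual=$(printf '%s\n' "$markdown" | "$@")
    [ "$actual" = "$expected" ]
}

if check '# Title' '<h1>Title</h1>' toy_parser; then
    echo "ok"
else
    echo "failed"
fi
```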
One more thing: my markdown parser is written as an `awk` script, but others may be `bash` scripts or even executables. This means I need an extra argument to specify the interpreter (if needed).
In my case this looks like:

    test_parser parser.awk awk

and for a bash script it would be:

    test_parser parser.sh bash

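One way the optional interpreter argument could be handled is sketched below. This is an assumption about the dispatch, not the actual `test_parser` code: `run_parser` is a hypothetical helper, and the special case is there because `awk` needs the `-f` flag to read its program from a file, whereas `bash` takes the script path directly.

```shell
#!/bin/sh
# run_parser PARSER [INTERPRETER]
# Runs the parser with standard input and output left untouched, so the
# caller can pipe markdown in and capture html out.
run_parser() {
    parser=$1
    interpreter=${2:-}
    case $interpreter in
        '')  "$parser" ;;                 # executable: run it directly
        awk) awk -f "$parser" ;;          # awk reads its program with -f
        *)   "$interpreter" "$parser" ;;  # e.g. bash parser.sh
    esac
}
```

With this helper, `printf '# Title\n' | run_parser parser.awk awk` would run `awk -f parser.awk` on the piped markdown. An executable sitting in the current directory would have to be passed as `./parser` so the shell can find it.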
## Unit testing

The purpose of the test suite is to compare an expected output with the actual output for typical markdown syntax.
I started by making an array of size `3n`, `n` being the number of tests. Indeed, for display purposes each test has:
- a title: a quick description of the kind of syntax being tested
- a markdown input: a piece of valid markdown
- an expected output: the corresponding html output

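A minimal sketch of walking such a flat list of `3n` entries three at a time follows. It assumes plain POSIX `sh`, where the positional parameters stand in for the array; `run_tests` and `toy_parser` are hypothetical names, and `toy_parser` only handles level-one headings.

```shell
#!/bin/sh
# toy_parser: hypothetical stand-in parser (level-one headings only).
toy_parser() {
    sed 's|^# \(.*\)$|<h1>\1</h1>|'
}

# run_tests TITLE MARKDOWN EXPECTED [TITLE MARKDOWN EXPECTED ...]
# Consumes the arguments three at a time and prints one line per test.
# Returns the number of failed tests.
run_tests() {
    failed=0
    while [ "$#" -ge 3 ]; do
        title=$1
        markdown=$2
        expected=$3
        shift 3
        actual=$(printf '%s\n' "$markdown" | toy_parser)
        if [ "$actual" = "$expected" ]; then
            echo "PASS: $title"
        else
            echo "FAIL: $title"
            failed=$((failed + 1))
        fi
    done
    return "$failed"
}

run_tests \
    'heading'    '# Title'   '<h1>Title</h1>' \
    'plain text' 'just text' 'just text'
```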
This approach obviously has flaws, the biggest one being that the same html can be written in several ways. Indeed, this html:

    <h1>Title</h1>

is, for rendering purposes, equivalent to:

    <h1>
    Title
    </h1>

even though the two strings are not equal.

The most naive approach I came up with (I wanted a quick prototype, so I didn't think much about it) was to remove all newlines from the parser output, using `tr -d '\n'`.

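As a small illustration of that normalization, the sketch below strips newlines before comparing. The names `normalize` and `same_html` are hypothetical, and unlike the description above it normalizes both strings rather than only the parser output, so that the expected html may also be written over several lines.

```shell
#!/bin/sh
# normalize: delete every newline from standard input.
normalize() {
    tr -d '\n'
}

# same_html A B: succeeds when A and B are equal once newlines are removed.
same_html() {
    a=$(printf '%s' "$1" | normalize)
    b=$(printf '%s' "$2" | normalize)
    [ "$a" = "$b" ]
}

multiline='<h1>
Title
</h1>'

if same_html '<h1>Title</h1>' "$multiline"; then
    echo "equal after normalization"
fi
```

The limit of this approach is that it also glues together lines whose line break was meaningful, for instance inside a `<pre>` block, and it does nothing about indentation.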