parent 7880437196
commit 18e27dc258
@ -148,15 +148,15 @@ This way we are able to simply parse markdown and turn it into an HTML file.
So far we have seen a way to parse blocks, but markdown also handles strong, emphasis and links. However, these tags can appear anywhere in a line.
Hence we need to be able to parse these lines apart from the block itself: indeed, a header can contain a strong and a link.

A very useful function in awk is `match` : it literally is a regex engine, looking for a pattern in a string.
The previously introduced but very useful function `match` fits this need: it is essentially a regex search, looking for a pattern in a string.
Whenever the pattern is found, two global variables are filled:
- RSTART: the index of the first character of the whole match (delimiters included, not only the *group*)
- RLENGTH: the length of the whole match

In what follows, `line` is the line processed by the function: the `while` loops shown here are all part of a single function.

This way `match(line, /\*([^*]+)\*/)` matches a string surrounded by two `*`, corresponding to an emphasis text.
The `*` are espaced are thez are special characters, and the *group* is inside the parenthesis.
This way `match(line, /\*([^*]+)\*/)` matches a non-empty run of characters other than `*`, surrounded by two `*`, corresponding to an emphasis text.
The `*` are escaped as they are special characters, and the *group* is delimited by the parentheses.
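To make the role of `RSTART` and `RLENGTH` concrete, here is a minimal sketch that runs the emphasis pattern on a sample line. The wrapper name `show_match` and the sample text are assumptions made for illustration; only `match`, `RSTART`, `RLENGTH` and `substr` come from awk itself.

```shell
# Hypothetical helper name; match, RSTART, RLENGTH and substr are standard awk.
show_match() {
    awk '{
        if (match($0, /\*([^*]+)\*/)) {
            # RSTART and RLENGTH cover the whole match, asterisks included
            printf "RSTART=%d RLENGTH=%d\n", RSTART, RLENGTH
            # Skipping one character on each side leaves the text of the group
            print substr($0, RSTART + 1, RLENGTH - 2)
        }
    }'
}

printf '%s\n' 'some *emphasis* here' | show_match
```

On this sample the match starts at the first `*` (index 6) and spans 10 characters, so the second line printed is `emphasis`, without its delimiters.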
To match several instances of emphasis text within a line, a simple `while` will do the trick.
We now only have to insert the html tags `<em>` at the right place around the matched text, and we are good to go.
We can save the global variables `RSTART` and `RLENGTH` for further use, in case they were to be changed. Using them we can also extract the
@ -169,6 +169,8 @@ matched substrings and reconstruct the actual html string :
# Build the result: before match, <em>, content, </em>, after match
line = substr(line, 1, start-1) "<em>" substr(line, start+1, RLENGTH-2) "</em>" substr(line, end+1)
}

The while loop enables us to repeat this process as many times as this pattern is encountered within the line.

We can now repeat the pattern for all inline features, e.g. strong and code.
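As an illustration of how the same loop might be reused for strong and code, here is a sketch under a few assumptions: the names `render_inline` and `wrap` are hypothetical, the delimiters are taken to be `**` for strong and a backtick for code, and patterns are passed as strings (hence the doubled backslashes) because an awk regex literal cannot be handed to a function as a pattern. It is not necessarily how the actual parser is organised.

```shell
# Hypothetical wrapper: reads markdown lines on stdin, prints them with inline tags.
render_inline() {
    awk '
    # Wrap every match of re in <tag>, dropping n delimiter characters on each side.
    # start and len are local copies of RSTART and RLENGTH.
    function wrap(line, re, tag, n,    start, len) {
        while (match(line, re)) {
            start = RSTART
            len = RLENGTH
            line = substr(line, 1, start - 1) "<" tag ">" \
                   substr(line, start + n, len - 2 * n) "</" tag ">" \
                   substr(line, start + len)
        }
        return line
    }
    {
        # Strong goes first, otherwise the emphasis pattern would eat one star of each pair
        $0 = wrap($0, "\\*\\*[^*]+\\*\\*", "strong", 2)
        $0 = wrap($0, "\\*[^*]+\\*", "em", 1)
        $0 = wrap($0, "`[^`]+`", "code", 1)
        print
    }'
}

printf '%s\n' 'a **b** and *c* and `d`' | render_inline
```

The loop always terminates because each pass removes the delimiters it matched and the inserted tags contain neither `*` nor a backtick. Note that this sketch does not protect the content of code spans from the strong and emphasis passes.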