parent 7880437196
commit 18e27dc258
@@ -148,15 +148,15 @@ This way we are able to simply parse markdown and turn it into an HTML file.
 So far we have seen a way to parse blocks, but markdown also handles strong, emphasis and links. However, these tags can appear anywhere in a line.
 Hence we need to be able to parse these lines apart from the block itself: indeed, a header can contain a strong and a link.
 
-A very useful function in awk is `match` : it literally is a regex engine, looking for a pattern in a string.
+The previously introduced but very useful function `match` fits this need: it literally is a regex engine, looking for a pattern in a string.
 Whenever the pattern is found, two global variables are set:
 - RSTART: the index of the first character of the whole match (delimiters included, not only the *group*)
 - RLENGTH: the length of the whole match
 
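As a minimal sketch of how `match` fills these two variables, the snippet below wraps a one-line awk program in a shell function. The function name `show_match` and the sample input are assumptions made for illustration; they do not come from the original script.

```shell
# Print RSTART, RLENGTH and the matched text for the first emphasis span.
# show_match is a hypothetical helper, used only for this illustration.
show_match() {
    printf '%s\n' "$1" | awk '{
        if (match($0, /\*([^*]+)\*/)) {
            # RSTART and RLENGTH describe the whole match, asterisks included
            print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
        } else {
            # With no match, awk sets RSTART to 0 and RLENGTH to -1
            print RSTART, RLENGTH
        }
    }'
}

show_match 'some *emphasis* here'   # prints: 6 10 *emphasis*
```

Since the two variables cover the asterisks as well, the inner text has to be taken with `substr(line, RSTART + 1, RLENGTH - 2)`.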
 In what follows, `line` represents the line processed by the function, as the `while` loops are actually all part of a single function.
 
-This way `match(line, /\*([^*]+)\*/)` matches a string surrounded by two `*`, corresponding to an emphasis text.
+This way `match(line, /\*([^*]+)\*/)` matches a string (that does not start with a `*`) surrounded by two `*`, corresponding to an emphasis text.
-The `*` are espaced are thez are special characters, and the *group* is inside the parenthesis.
+The `*` are escaped as they are special characters, and the *group* is delimited by the parentheses.
 To match several instances of emphasis text within a line, a simple `while` will do the trick.
 We now only have to insert the HTML `<em>` tags at the right places around the matched text, and we are good to go.
 We can save the global variables `RSTART` and `RLENGTH` for further use, in case they were to be changed. Using them, we can also extract the
@@ -170,6 +170,8 @@ matched substrings and reconstruct the actual html string :
 line = substr(line, 1, start-1) "<em>" substr(line, start+1, RLENGTH-2) "</em>" substr(line, end+1)
 }
 
+The while loop enables us to repeat this process as many times as this pattern is encountered within the line.
+
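The hunk above only shows the last statement of the loop, so here is a self-contained sketch of what the whole emphasis pass could look like. The wrapper name `emphasize` and the variable `len` are assumptions; the original code uses an `end` variable whose definition is not visible in this diff.

```shell
# Replace every *text* span in a line with <em>text</em>.
# emphasize is a hypothetical wrapper, not the name used in the original script.
emphasize() {
    printf '%s\n' "$1" | awk '{
        line = $0
        while (match(line, /\*([^*]+)\*/)) {
            # Save the globals before any other call can overwrite them
            start = RSTART
            len = RLENGTH
            # Text before the match, the match without its two asterisks, text after it
            line = substr(line, 1, start - 1) "<em>" substr(line, start + 1, len - 2) "</em>" substr(line, start + len)
        }
        print line
    }'
}

emphasize 'a *b* and *c*'   # prints: a <em>b</em> and <em>c</em>
```

The loop terminates because each iteration removes two asterisks from `line` and the inserted tags contain none.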
 We can now repeat the pattern for all inline features, e.g. strong and code.
 
 The case of URLs is a bit more involved, as we need to match two groups: the link text and the URL itself.
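One possible way to recover both groups with POSIX awk, whose `match` only reports the position of the whole match, is to match the complete `[text](url)` construct and then split it on the `](` separator with `index`. This is a sketch under assumptions, not necessarily the approach the post goes on to use: the name `linkify`, the variables `whole` and `sep`, and the `example.org` URL are all hypothetical.

```shell
# Turn every [text](url) span in a line into an HTML anchor.
# linkify is a hypothetical wrapper written for this illustration.
linkify() {
    printf '%s\n' "$1" | awk '{
        line = $0
        while (match(line, /\[[^]]+\]\([^)]+\)/)) {
            start = RSTART
            len = RLENGTH
            whole = substr(line, start, len)             # the full [text](url) span
            sep = index(whole, "](")                     # boundary between the two groups
            text = substr(whole, 2, sep - 2)             # between "[" and "]"
            url = substr(whole, sep + 2, len - sep - 2)  # between "(" and ")"
            line = substr(line, 1, start - 1) "<a href=\"" url "\">" text "</a>" substr(line, start + len)
        }
        print line
    }'
}

linkify 'see [awk](https://example.org) now'
# prints: see <a href="https://example.org">awk</a> now
```

GNU awk also accepts an array as a third argument to `match` and fills it with the captured groups, but that extension is not portable to other awk implementations.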