I’ve started to explore the Red programming language (based on Rebol) as an alternative to Python for some text-processing applications.
Red contains its own text-processing dialect called Parse. It’s rather fascinating as it does away with regex, and instead offers a rules-based method for searching text. I’m still exploring the basics. In this short article, I’ll summarise some of my understanding.
To follow along, you’ll need to install Red, which bundles its REPL and compiler into one convenient executable, available for Windows, Mac or Linux.
Simple search and replace
Open up the REPL, and specify a string simply by typing the variable name followed by a colon, then the string in quotes.
s: "this is a great day"
Upon pressing Enter, the REPL echoes the string:
== "this is a great day"
To search for “great” and replace it with “fantastic”, use the parse function:
parse s [to "great" change "great" "fantastic"]
If you then print the value of s, you’ll see:
>> print s
this is a fantastic day
So how does this work?
Firstly, parse is the function that does the parsing. It takes two parameters: the string to evaluate and a block containing rules. Blocks are enclosed in square brackets. At the start of the parse, the string pointer is at the head of the string.
Rules are evaluated from left to right. So where are the rules? They are the contents of the block supplied as parse’s second parameter. Here the block holds two rules: to "great" and change "great" "fantastic".
The first rule, to "great", means ‘advance the pointer until “great” is found, and then put the pointer at the start of the found string’, i.e. just before the “g” of “great”.
Then the next rule, change "great" "fantastic", is evaluated. It looks for a match on “great” at the current position, which succeeds since the pointer is at the start of “great”, and then replaces that content with “fantastic”. This gives the result shown before:
this is a fantastic day
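To see where the pointer actually sits after the to rule, one option is to capture the position in a word and print it from inside the rule block. This is only a small sketch: mark is a hypothetical word name chosen for illustration, and the snippet assumes it is pasted into the Red REPL.

```red
s: "this is a great day"

; A set-word inside a rule block records the current input position,
; and code in parentheses is evaluated as ordinary Red at that point.
parse s [to "great" mark: (print mark)]
; prints: great day
```

Printing mark shows the string from the pointer to the tail, which confirms that to stops at the start of the match rather than after it.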
This of course works for contracting a string:
s: "this is a fantastic day"
parse s [to "fantastic" change "fantastic" "slack"]
print s
this is a slack day
Removing or altering a duplicate word
Replacing to with thru performs the same search but changes where the pointer ends up after a match. Consider the following:
s: "this is a greatgreat day"
parse s [thru "great" change "great" " and fantastic"]
print s
this is a great and fantastic day
This works since thru places the pointer just after the match, i.e. between the first “great” and the second one.
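The difference between the two keywords can be checked directly by printing the remaining input after each one. As a hedged sketch (mark is again a hypothetical word used only for illustration):

```red
s: "this is a greatgreat day"

; to stops at the start of the first match
parse s [to "great" mark: (print mark)]
; prints: greatgreat day

; thru stops just after the first match
parse s [thru "great" mark: (print mark)]
; prints: great day
```

Neither call modifies s, since no change rule is involved, so both can be run against the same string.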
Rules can be chained, so the following should be possible:
s: "this is a greatgreat day"
parse s [thru "great" change "great" " and fantastic" change "day" "event"]
print s
However, if you run this, you’ll get the unexpected result of:
this is a great and fantastic day
lacking the change from “day” to “event”, because after the second rule is done the pointer sits immediately after the inserted “ and fantastic”, on the space before “day”,
so the last rule matching “day” will fail at that point since the pointer is on the space.
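One way to confirm this diagnosis is to probe the remaining input right after the change rule. The sketch below assumes a fresh copy of the string and uses a hypothetical word, mark, to hold the position; probe is used instead of print so that the leading space is visible inside the quotes.

```red
s: "this is a greatgreat day"

parse s [
    thru "great"
    change "great" " and fantastic"
    mark: (probe mark)    ; probe shows: " day"
]
```

The remaining input starts with a space, so a rule that expects “day” at the pointer cannot match.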
There are two solutions. One is to include the leading space in both the match and the replacement:
parse s [thru "great" change "great" " and fantastic" change " day" " event"]
or the more elegant one:
parse s [thru "great" change "great" " and fantastic" skip change "day" "event"]
where a skip rule is added before the last change rule.
Either version produces the expected result:
this is a great and fantastic event
skip advances the pointer by one character, here past the space, so the following change rule finds “day” at the pointer and the replacement succeeds.
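The effect of skip can be checked the same way, by probing the input just after it. This is a sketch under the same assumptions as before: a fresh string, and mark as a hypothetical word introduced only for inspection.

```red
s: "this is a greatgreat day"

parse s [
    thru "great"
    change "great" " and fantastic"
    skip mark: (probe mark)    ; probe shows: "day"
    change "day" "event"
]

print s
; prints: this is a great and fantastic event
```

With the space consumed by skip, the remaining input begins with “day” and the last change rule applies.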
This article showed some simple find-and-replace examples using Red’s Parse. Admittedly, they have limitations (for example, only the first match in a string is found), and they are probably overkill for parse, since other Red functions are better suited to plain replacement. The examples simply show the basic rules-based nature of Parse, which makes for readable code and powerful text-processing. I hope to write up some more experiences in the near future.
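As a possible way around the first-match limitation, the pair of rules can be wrapped in any, which repeats a rule until it no longer matches. For plain replacement, Red’s built-in replace function with the /all refinement is likely the simpler choice. The strings and the word names s and t below are made up for illustration.

```red
; Repeat "find the next match, then change it" until `to` fails
s: "a great day, a great night"
parse s [any [to "great" change "great" "fantastic"]]
print s
; prints: a fantastic day, a fantastic night

; The same result without parse
t: "a great day, a great night"
replace/all t "great" "fantastic"
print t
; prints: a fantastic day, a fantastic night
```

Each pass through the any loop advances the pointer past the text it has just inserted, so the loop ends once no further “great” is found.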