ES6 introduced template strings, also called template literals, to JavaScript. They make it easy to embed JavaScript expressions within a string and to have those expressions evaluated at runtime. To take a trivial example, instead of having JavaScript code like the following:
const name = "Aang";
const age = 112;
console.log("Hi, my name is " + name + " and I'm " + age + " years old");
You could reduce the cognitive load of all those pluses and quotes by having code like this instead:
console.log(`Hi, my name is ${name}, and I'm ${age} years old`);
Wouldn’t it be nice if we could do the same in Q/kdb? Yes, it would be. Can it be done in Q/kdb? Why, yes, it can.
Opening up the Q toolbox
In order to process template strings in Q, we will need the following functionality from the language:
- Parsing a given string to identify expressions enclosed in ${}
- Evaluating the expression, much like the eval() function in JavaScript:
eval("1 + 1")
-> 2
The second piece of functionality can be achieved using the value function in Q:
x: 12
value "x"
-> 12
value "x + 12"
-> 24
What about the first one? Erm, there’s some disappointment up ahead. Q has a function called ssr (short for string-search-and-replace) that can be used to replace parts of a string with another string. The following are some examples of how it can be used:
// simple string match (case-sensitive)
ssr["Aang is the Avatar";"a";"@"]
-> "A@ng is the Av@t@r"
// function execution on each simple string match
ssr["But Zuko is arguably cooler";"But";upper]
-> "BUT Zuko is arguably cooler"
// function execution on each regex match
ssr["And Iroh has a great story arc";"?o";upper]
-> "And IROh has a great sTOry arc"
The last example seems to possess the seed of a solution for our problem. The ? wildcard matches any single character, so ?o matches an o together with the character before it: in our case r and t. If there were a wildcard that matched any number of characters, then we could make use of it like so, assuming that wildcard was *:
x: 12
ssr["The value of x is ${x}.";"${*}";<function to evaluate string>]
The * wildcard does in fact match any number of characters when used with the like function:
"${someVariableName}" like "${*}"
-> 1b
But, and here’s the disappointment I mentioned earlier, the operator doesn’t work with the ssr function:
ssr["The value of x is ${x}.";"${*}";upper]
-> 'length
[0] ssr["The value of x is ${x}.";"${*}";upper]
^
The Kx Wiki leaves us this cryptic message about the use of * with ssr:
Ah well, back to the toolbox.
Rummaging around
One other way to identify expressions enclosed in ${} could be to split a string into a list of strings such that we can reason about items in the list.
The vs function (short for vector-from-scalar) can be employed to split a string into a list of strings based on a delimiter:
"," vs "one,two,three"
-> "one"
"two"
"three"
We could use a two-step split to achieve a list of strings in a format that could help us distinguish string expressions from strings:
- Split the string on ${
- Split each of the resulting strings on }
To take an example:
// Split the string on ${
"${" vs "The value of x is ${x}."
-> "The value of x is "
"x}."
Using this split, we can identify that an item in the obtained list is a string expression if it contains }, which is the case for the last item in the list shown above. But we cannot call the value function on the item just yet; we’ll need to run step 2 on each item of this list. We have two options:
- Use the each iterator
- Use the each-both (') iterator
Why consider each-both at all? Because of a little inconvenience with each, which has to accept a unary function:
// This will count the length of each string
// and return a list of lengths
count each ("Hello";"multi-verse")
-> 5 11
whereas vs is a binary function. If we try to invoke the function with an each, things fall apart:
"}" vs each "${" vs "The value of x is ${x}."
-> '
[0] "}" vs each "${" vs "The value of x is ${x}."
^
I said that vs is a binary function, but that is not entirely true all the time. We can change it to a unary function by harnessing a projection:
vs["}";] each "${" vs "The value of x is ${x}."
-> ,"The value of x is "
(,"x";,".")
Note how we’re calling vs differently now? We’re using bracket notation this time whereas we were using what’s called an infix notation before. By using bracket notation we’re explicitly preventing Q’s right-to-left interpreter from interpreting the token preceding each as a function, which was happening when the interpreter encountered vs to the left of each previously. Also note that we’re binding } as the first argument to vs and leaving its second argument empty by terminating the function expression right after the semi-colon with ]. The result is a unary projection that gets evaluated for each item of the string list.
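To make the projection a little more tangible, here is a minimal sketch. The name splitOnBrace is hypothetical and used only for illustration; the rest is the same vs call as above.

```q
/ splitOnBrace is a hypothetical name for the projection of vs onto "}"
splitOnBrace: vs["}";]
splitOnBrace "ab}cd"                 / ("ab";"cd")
splitOnBrace each ("ab}cd";"ef}gh")  / (("ab";"cd");("ef";"gh"))
```

Because the projection takes a single argument, each can apply it to every string in the list.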
It is more convenient to switch to the multi-valent form of each, the ' iterator. This can be used in conjunction with an infix vs function call to achieve the same result as above:
"}" vs ' "${" vs "The value of x is ${x}."
-> ,"The value of x is "
(,"x";,".")
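As a side note, and only as a small sketch: each-both works here with a single delimiter on the left because an atom is paired with every item of the list on the right. The each-right iterator (/:) expresses the same pairing explicitly. The names pieces, eachBoth and eachRight are illustrative.

```q
/ pieces is an illustrative name for the result of the first split
pieces: ("The value of x is ";"x}.")
eachBoth: "}" vs' pieces      / the atom "}" is paired with every item of pieces
eachRight: "}" vs/: pieces    / each-right applies vs with "}" to each item on the right
eachBoth ~ eachRight          / 1b
```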
Eureka
The nested list that we’ve obtained has the following key properties:
- Each item of the list will either be a single-item list or a two-item list.
- If an item of the list is a single-item list, then we can be sure that it is not a string expression; otherwise the second split would’ve caused it to split into two elements.
- If an item of the list is a two-item list, then we can be sure that its first item is a string expression. This is because the string would’ve had the } delimiter before the second split. We can also be sure that the second (and last) item is not a string expression because it would’ve followed the } delimiter before the second split.
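To check these properties on a string with more than one expression, here is a small sketch. The names s and parts are illustrative, and the template is the one from the JavaScript example at the top.

```q
/ s and parts are illustrative names
s: "Hi, my name is ${name}, and I'm ${age} years old"
parts: "}" vs' "${" vs s
count each parts                             / 1 2 2
first each parts where 1 < count each parts  / ("name";"age")
```

The single-item entry is plain text, and the first item of each two-item entry is an expression waiting to be evaluated.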
These properties lead us to the following algorithm:
For each item i of the nested list l
    If i is a single-item list
        Return i
    If i is a two-item list
        e <- Evaluate the first item
        c <- Concatenate e with the second item
        Return c
Flatten the list
All of this can be achieved using a single line of code in Q:
interpolate: {raze {$[1 < count[x];raze string[value[first[x]]],1_x;raze x]} each "}" vs ' "${" vs x}
But we can split it into multiple lines for easier readability:
interpolate:{
  firstSplit: "${" vs x;
  secondSplit: "}" vs ' firstSplit;
  : raze {$[1 < count[x];raze string[value[first[x]]],1_x;raze x]} each secondSplit
  }
Ok, I won’t claim that it’s any better.
The following are examples of our function in action:
x: 22
interpolate "The value of x is ${x}"
-> "The value of x is 22"
.ak.x: 2
.tst.y: 4
z: 8
interpolate "${.ak.x} * 2 is ${.tst.y}. ${.tst.y} * 2 is ${z}"
-> "2 * 2 is 4. 4 * 2 is 8"
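One detail worth spelling out, as a sketch rather than anything definitive: value evaluates a string against global variables, so the x inside ${x} refers to the global x and not to interpolate’s own parameter, which also happens to be called x. The helper f below is hypothetical and exists only to show this. The one-line interpolate is repeated from above so the snippet stands alone.

```q
interpolate: {raze {$[1 < count[x];raze string[value[first[x]]],1_x;raze x]} each "}" vs ' "${" vs x}

x: 22
f: {[x] value "x"}   / hypothetical helper: ignores its argument and looks up the global x
f 99                 / 22, not 99

/ anything that value can evaluate may appear between the braces
interpolate "next year x will be ${x+1}"   / "next year x will be 23"
```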
Deconstruction
Here’s what happens in the last line: we iterate through each item of the nested list. If the current item is a two-item list, then we evaluate the first string of the list using value[first[x]], convert the result to a string using string, join it with the rest of the list using 1_x, and then call raze to flatten the two-item list into a single string. If the item is a single-item list, then we just call raze on it to convert it from a list to a string. The following are examples of using raze in both ways:
raze ("one";"two")
-> "onetwo"
raze enlist "one"
-> "one"
We have to make one final call to raze to flatten the outermost list (what used to be the nested list stored in secondSplit) back into one string. Note the plethora of square brackets here:
raze string[value[first[x]]],1_x
This is a necessity due to the right-to-left nature of the Q interpreter. If we had left out the brackets to the string function like so:
raze string value[first[x]],1_x
then the join (,) operation between value[first[x]] and 1_x would’ve been performed first, yielding unexpected results depending on the type of the evaluated value:
x:1
1,"test"
-> 1
"t"
"e"
"s"
"t"
And the string function would’ve been executed in turn on the above result. Using square brackets for string, we forced the Q interpreter to only perform the join operation after the evaluated value had been converted to a string.
What if we left out the brackets to the value function, like so?
raze string[value first[x]],1_x
Turns out that’s quite alright, since the order of operations is still the same as far as the join operation is concerned. In fact, we can even eliminate the square brackets for first for the same reason:
raze string[value first x],1_x
So the new shape of our function is as follows:
interpolate:{
  firstSplit: "${" vs x;
  secondSplit: "}" vs ' firstSplit;
  : raze {$[1 < count[x];raze string[value first x],1_x;raze x]} each secondSplit
  }
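Are we using one too many razes? Yeah, I agree. Instead of first razing the inner list and then razing the outer list, we can combine raze with the over iterator. With a unary function like raze, over keeps applying the function until the result stops changing, which flattens every level of nesting in one go: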
raze over ("one";("two";"three");"four")
-> "onetwothreefour"
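For comparison, here is a small sketch of what a single raze does to the same list. The name nested is illustrative.

```q
/ nested is an illustrative name
nested: ("one";("two";"three");"four")
count raze nested         / 9: the chars of "one" and "four", plus "two" and "three" still intact
count raze over nested    / 15: every level flattened into one string
raze over nested          / "onetwothreefour"
```

A single raze removes only one level of nesting, which is why the earlier version needed a raze on the inside and another on the outside.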
This simplifies our function slightly:
interpolate:{
  firstSplit: "${" vs x;
  secondSplit: "}" vs ' firstSplit;
  : raze over {$[1 < count[x];string[value first x],1_x;x]} each secondSplit
  }
And achieves the same result:
.ak.x: 3
.tst.y: 6
z: 12
interpolate "${.ak.x} * 2 is ${.tst.y}. ${.tst.y} * 2 is ${z}"
-> "3 * 2 is 6. 6 * 2 is 12"
This works for other data types too, like dates and timestamps:
// .z.D and .z.P are system variables
// .z.D represents the current date (local)
// .z.P represents the current time (local), down to nanoseconds
interpolate "date: ${.z.D}, timestamp: ${.z.P}"
-> "date: 2020.06.13, timestamp: 2020.06.13D16:56:13.002316000"
There we have it: ES6-style template literals in Q.