Simple Karplus-Strong Synthesis and Note Length

Ever waste a whole bunch of time trying to understand something, where the only way to feel less guilty about wasting so much time on it is to tell someone about it? Me too… here goes.

In between more important things, I was thinking about Karplus-Strong (KS) plucked string synthesis again the other day — more than just the general concept of a filtered feedback network creating something that sounds like a plucked string — specifically, how high notes are quick to decay, and how low notes sustain for longer.

This happens because of the amplitude scaling that is applied on each pass through the feedback loop. Each feedback recursion multiplies the amplitude by some value less than (but close to) 1. Essentially, the note lasts until the resulting geometric sequence has decayed to nothing. Because the sequence effectively reaches zero after a fixed number of steps (well, at least at some point in a fixed-bit number system anyway), the number of times a grain repeats before fading to silence is exactly the same for a given scaling factor, whether it’s a low or high note. Since the high notes have a short grain-length, they die quicker.
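To make that concrete, here is a rough Python sketch of an unfiltered KS loop. It is not the Max patch from this post, just an illustration; the function names, the 44100 Hz sample rate, and the fixed 0.99 scaling factor are all assumptions made for the example. It plays a low and a high note with the same scaling factor and compares how much level is left in the last 50 ms of each.

```python
import random

SAMPLE_RATE = 44100  # assumed sample rate, not taken from the patch


def pluck(frequency_hz, scaling_factor, duration_s, seed=0):
    """Unfiltered Karplus-Strong sketch: a noise grain recirculating in a delay line."""
    rng = random.Random(seed)
    grain_len = max(1, round(SAMPLE_RATE / frequency_hz))  # delay-line length in samples
    delay_line = [rng.uniform(-1.0, 1.0) for _ in range(grain_len)]
    out = []
    for i in range(int(duration_s * SAMPLE_RATE)):
        sample = delay_line[i % grain_len]
        out.append(sample)
        # Feed the sample back, attenuated, ready for the next pass.
        delay_line[i % grain_len] = sample * scaling_factor
    return out


def peak(samples):
    return max(abs(s) for s in samples)


low = pluck(110.0, 0.99, 1.0)   # long grain, few passes per second
high = pluck(880.0, 0.99, 1.0)  # short grain, many passes per second
tail = int(0.05 * SAMPLE_RATE)  # last 50 ms of each note
print("110 Hz tail peak:", peak(low[-tail:]))
print("880 Hz tail peak:", peak(high[-tail:]))
```

With the same factor, the high note has been through the scaling many more times after one second, so its tail should come out far quieter than the low note's.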

I set out to figure out a way for one to specify note length, and have the algorithm figure out the scaling factor based on the desired pitch (the higher the pitch, the more often the scaling is applied per second, so the closer the factor needs to be to 1). Not sure if someone has written about this before (there’s probably plenty published about it, and to be honest, I didn’t look), but it was an excuse to kick around a couple of under-used neurons in my brain.

Where to begin…

When we think about sound in a medium (eg. in air, water, or earth), pitch and wavelength are inversely proportional, with the speed of sound in the medium as the constant that links them. The pitch is the number of oscillations per second, and the wavelength would be some value specified in metres, feet, etc. If something has a frequency x, the length of one cycle of the wave is the speed of sound in the medium divided by x. So if sound travels 343 metres per second in air, something with a frequency of 343 Hz would have a wavelength of 1 metre. A frequency of 440 Hz would have a wavelength of about 0.78 metres, etc. etc.

It’s more useful to think of wavelength in the temporal sense though, when it comes to KS synthesis.
‘Wavelength’ in KS synthesis is really just ‘time in ms’ of a repeating grain/cycle. I’ll refer to this from here onwards as cycle-time, or grain-width… ehhh, I can’t decide. Grain-width is probably better. Essentially, in KS synthesis, we’re more interested in the time it takes for a cycle of the wave to complete. Therefore, in the KS algorithm, the delay-line length of the repeating wave (the grain-width) in ms is 1000 / frequency.

Why? Because the frequency is the number of times it repeats each second. For example, a frequency of 440 Hz would have a grain-width of 1000 / 440 = ~2.27ms. A frequency of 343 Hz would have a cycle-time… I mean grain-width… of ~2.915ms. This is the ‘length’ of the repeated noise burst.
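As a quick sanity check, the same arithmetic in Python. The helper names are made up for this example, and the 44100 Hz sample rate used to turn grain-width into a delay-line length in samples is an assumption, not something taken from the patch.

```python
def grain_width_ms(frequency_hz):
    """Time taken by one cycle of the note, in milliseconds."""
    return 1000.0 / frequency_hz


def delay_line_samples(frequency_hz, sample_rate=44100):
    """Nearest whole number of samples for one cycle (assumed sample rate)."""
    return round(sample_rate / frequency_hz)


print(round(grain_width_ms(440.0), 3))  # 2.273
print(round(grain_width_ms(343.0), 3))  # 2.915
print(delay_line_samples(440.0))        # 100
```

Rounding to a whole number of samples nudges the pitch slightly off, which matters more for high notes where the delay line is short.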

Calculating the Scaling Factor

Firstly, let’s lock the desired length of the plucked sound to be 2 seconds. In order for something to play for 2 seconds, we need to know the pitch we want to hear, and consequently adjust the multiplication factor so that for a given cycle-time/grain-width, a repeated wave will terminate after a specific number of recursions.

OK… so let’s say we want to play a 440 Hz note. The cycle-time is 2.27ms. The sound should repeat 880 times (given that it repeats 440 times per second).

It wasn’t immediately obvious to me how I should figure out how to set a multiplication value such that a geometric sequence (starting at 1., ending at 0.) should terminate after 880 recursions. The answer to this question might be fairly straightforward to someone out there with a background in number manipulation (or quantisation) in fixed-bit number systems, but it seemed like the only way for me to understand this was to plot out the ‘recursions’ given a series of fixed scaling factors.

So, to figure this out, I did something really dull… (I’ll call it brute-force “modelling” to make it sound more interesting than it is). I set up a really basic system to count the number of times that a number (1.) is recursively scaled (with values < 1.) until it was less than 0.000001*. At an amplitude of 0.000001, I figured the sound was essentially silent (for 16-bit audio anyway). The way I did it was pretty rough, but that didn’t matter. I just wanted to glean a sense of how the scaling factor affected the number of recursions.

*Disclaimer: Having a poor understanding of number representation in the computer, I couldn’t count the number of times the function ‘actually’ happened because if I set the counter function to stop when the geometric sequence ‘equalled’ zero, it kept counting. I figured it had something to do with subnormal numbers — or that the internal mechanics of the ‘accum’ object had a finer resolution than the upper world of Max (or maybe it’s that I just upgraded to Max 6.1, where it appears that objects are working internally with 64bit precision… I think).

I therefore used an arbitrary small value, namely 0.000001 (which seemed to be the smallest number in magnitude that I could use as an argument in an object or write in a message box).

This is what I learned:

Starting with 1, the following values were less than 0.000001 after being recursively multiplied n times:

factor  recursions    factor  recursions    factor  recursions
0.90    131           0.990   1374          0.9990  13808
0.91    146           0.991   1528          0.9991  15343
0.92    165           0.992   1720          0.9992  17262
0.93    190           0.993   1966          etc.
0.94    223           0.994   2295
0.95    269           0.995   2756
0.96    338           0.996   3446
0.97    453           0.997   4598
0.98    683           0.998   6900
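The counting experiment can be reproduced outside Max with a few lines of Python. This is only a sketch of the idea, not what the patch does internally, and `count_recursions` is a name invented for the example. Depending on exactly when the counter stops, the results can differ by one from the figures in the table.

```python
def count_recursions(scaling_factor, threshold=0.000001):
    """Count multiplications by scaling_factor until 1.0 drops below threshold."""
    value = 1.0
    count = 0
    while value >= threshold:
        value *= scaling_factor
        count += 1
    return count


for factor in (0.90, 0.98, 0.99, 0.998, 0.999):
    print(factor, count_recursions(factor))
```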

The pattern I spotted was that each extra decimal place of precision (as we tend toward 1.) makes the number of recursions jump roughly tenfold. Seems somewhat straightforward. I thought this over, and came to the conclusion that we can model this ‘precision’ by:

subtracting the scaling factor from 1, and then dividing 1 by that number.

eg. for 0.98:

1. – 0.98 = 0.02
1. / 0.02 = 50.

and 0.998:

1. – 0.998 = 0.002
1. / 0.002 = 500.

This produced values that shared the same ratios as the figures I got from my tests, but the numbers themselves were proportionally out by another value. Dividing the numbers produced by my recursive scaling tests by these precision ‘ratios’ I derived, I get a value somewhere around 13.75. I don’t know what this means… I’ll try to understand this later.
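One possible explanation for that number, offered as a sketch rather than something worked out in the original tests: the count n is the point where r^n falls to the threshold, so n = ln(threshold) / ln(r). For r close to 1, ln(r) is approximately -(1 - r), which gives n ≈ -ln(threshold) / (1 - r). With a threshold of 0.000001, -ln(threshold) is ln(1,000,000), roughly 13.82, which is close to the ~13.75 that fell out of the data. The function names below are invented for the example.

```python
import math

THRESHOLD = 0.000001  # the 'silence' level used in the counting tests

# Candidate for the magic number: ln(1 / threshold) = ln(1,000,000)
magic = -math.log(THRESHOLD)
print(round(magic, 4))  # 13.8155


def exact_recursions(r):
    """Solve r**n == THRESHOLD for n."""
    return math.log(THRESHOLD) / math.log(r)


def approx_recursions(r):
    """Approximation using ln(r) ~ -(1 - r), valid when r is close to 1."""
    return magic / (1.0 - r)


for r in (0.98, 0.998, 0.9990):
    print(r, exact_recursions(r), approx_recursions(r))
```

If this reading is right, the constant depends only on the chosen silence threshold: a different threshold would give a different number.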

Firstly though, in order for me to calculate the scaling factor for a given pitch, I should do some kind of lookup (or reversal of the above process): find a value that terminates after n recursions, and set that as the scaling factor.

So, back to the example of making a note ring on for two seconds, let’s pick a pitch and try it out… Say 440Hz.

OK. 440Hz will recur 880 times in two seconds.

If we want 880 recursions, our scaling factor should be somewhere between 0.98 and 0.99. Given that 0.98 produces 683 recursions, and 0.99 produces 1374, there’s quite a bit of leeway there.

Let’s try using the reverse of the formula (with the mystical ~13.75 ) to try to derive the scaling factor. [Strangely, I only just noticed the 1374 on the line above… maybe it’s just a coincidence.]

880 (number of recursions) / 13.75 = 64

1 / 64 = 0.015625

1 – 0.015625 = 0.984375 (Fits the conditions of being between 0.98 and 0.99. Good).

The formula to calculate scaling factor therefore appears to be,

1 minus the inverse of ((pitch * length in seconds) / 13.75)

or (since 13.75 is just a constant)

1. – (1. / ((frequency * length in seconds) / magicNumber))
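Written out as a small Python function, purely as an illustration of the formula above. The function and parameter names are made up here, and the guard against very short, low notes is an addition for the example rather than something in the patch.

```python
def scaling_factor(frequency_hz, length_s, magic_number=13.75):
    """Feedback scaling factor for a note of the given pitch and length."""
    recursions = frequency_hz * length_s  # how many times the grain repeats
    if recursions <= magic_number:
        # The formula would give a factor of zero or below, which is meaningless.
        raise ValueError("note is too short or too low for this formula")
    return 1.0 - (1.0 / (recursions / magic_number))


print(scaling_factor(440.0, 2.0))  # 0.984375
```

Higher notes of the same length come out with a factor closer to 1, since the scaling gets applied more often over the same stretch of time.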

It’s not perfect, but notes now seem to sustain for similar lengths, whether played low or high.

Other thoughts

Up to this point (in order to remove variables) I’d been working with an ideal situation, where no low-pass filter existed in the KS feedback network. Depending on the cutoff (or amount that is attenuated), the filter plays a big part in how long the note appears to resonate. Notes down the bottom still start with a broader spectrum, so a proportional filter cut-off (based on the fundamental of the synthesised note) might be a good idea too. I’ll think about that another day. After adding a fixed cut-off it doesn’t sound bad as it is though.

What have I learned from this?

  1. If something is percolating away in your brain, it’s faster to go about something in a laborious way and put pen to paper with the data you acquire, than it is for something to become self-evident.
  2. That producing something useful with the Karplus-Strong algorithm is much more complicated than scaling/filtering feedback of noise.

What am I still to learn?

With regard to different scaling factors, I haven’t yet looked in depth at how different pitches decay with different values (basically, whether a geometric sequence is the same at different scales). From some basic tests in a spectrogram, the geometric sequence appears to be similar at different scales. Notes seem to decay at similar rates when the grain is smaller (and the scaling factor is consequently closer to 1). It isn’t immediately obvious to me that it should or shouldn’t be this way.

At the end of this all… what’s with the magicNumber constant (~13.75)? I still have no idea. Ultimately, though, the ~13.75 is irrelevant. I arrived at this number from the data I acquired just by counting the number of times 1 (when recursively multiplied by a value less than 1) tended to 0. In the actual synthesis implementation, setting it around 4.5 with a low-pass filter around 8000Hz makes notes appear to last around the desired length. It seems somewhat related to the filter cutoff-value in adjusting note length (eg. a lower filter value causes the note to appear to fade more quickly, as we’re filtering out a lot of the note’s energy). It’s a little bit subjective though, and I only timed the length of the plucked synth note by ear (and on laptop speakers).

Patch

Here’s the patch. Adjust the note length number box to produce longer or shorter notes. Make sure you switch between the ‘enable/disable scaling adjustment’ to hear the difference in note sustain lengths (ie. compensating for different grain-widths to produce similar decay lengths).


----------begin_max5_patcher----------
2143.3oc0bksbaaCE8Y6uBTN9g3TGEhMtjY5CY5zz9T9ApyjghBRlIbQkjJ1
MYh+1KVn1rDAgkHgYGOVxjhB7hCt2ycCz+3xKblV7.qxA7NveCt3heb4EWHO
k3DWzb7ENYQODmFUIuLmb18ES+hyMpOpl8Ps7zoEQyxXUU.2Iggq+z7UYI4o
rZ42D1bx4E40UIemIOGZh61qsXU8Su3kQ0w2kju3ykr3ZkX5QI7uD.EREuED
JOvchK3SMemjYRIhKkuIv0Ym6ZdTl7t579xjnz0excIylwx28lpji5+cIScG
cb.eR7I+7xKEubigHUbQVFKu9.n5Oxillxd6rjJw6fp3nT9LDDM6Kqpp28a7
bfukkrJ9WMpNoHeGvhDPlfn2.vdRzxs4kMX09n95Aa5pobIqi0ARHdBzyyKv
WL7p0DW0pA43qF9AcuZLubwTonLwsGg7zh6WxuFv7jzZVIHdUcw74fjbve88
dErgS74fMwE26fc.VtN1LzPOWs.s2KEPmEkuJJciN87n35hx9DhoHID2nv0m
PLkDpFZjTIFRfZgX5PCwKptOgKjGE7vOWRSbXfhGfFHYM8l3FfwDB+L3VlfD
mMCWIe5wsa9LSxbsKx1STkySK3Sk9TOwENAialgdlXNhzX7wIO2LXX43DPNv
sCWlSx3SB0B9Q.TT2ZL5fZNfsc7CCCaaEfikQbCSfyzn7E8qmqy2rtGrOQgx
kyv.slmvWJFv+rLh6WIkkun9NgGlrpSAmTHRUxL1d5lGWYGgnRjAR6cm7HWp
j2.5imHdCJ06wgGG08BeoP8OTx9mUr73+0xJkPLQoTB81gVnEkRuAO9mtBNG
NnQlKzCUAoH0C8Q5BMmR5FL5IuKchJH+AEWftRiGLQkwBUKrfrFrTsjKgkEU
wEKYOtdrqhKKRSeRLFagj0LQUE4E6NqMFKnPEKNgCFRUEUvVAnVLYripvu9n
itY64p.DHW+oXoKKLRbPaq+99mv5eUxhb9GeZHxprordvWtt3m3S4c3Ent5L
.7vmWXRG.NI4mYzPmdvoFiPHtGUjzMRP2HDMrmQndHhwUbuxqNEHJolk0T+I
meOsnhMSHH2Hd4Cxj02b3GKdSyYZIvFrAdlTrNPW8.r+vnBJ+sOospXoLQj3
OBPZSSr2brGpxIlp2ElmkovZKnv2yWuc6unuOmvnC8k5cxfDUEVnsXnoCdgE
r.elxgmYrY3wGalE7IpPHHBY.BAGadDagNpFLm+yTvzyG4HZJkEJTBcXYjiD
jmtvoHgm.WzFMn8UkF6AQraYpV2cDhNEKRvHzzqib0BF1dKAUoyC875NCVh2
nICVxD5fhK9TnpsCppbnGVnVCVrfQ05YdnAD0Dx3ydxBPzFiFivHz+a3bZF4
Art.9XeIVo53DFqC3v9mrmrdMn5rnEIweTFfDHtHupN5zZX8YDMcfKRYSJ0z
fTsMpCidopIcE32.Pva.uBBdK3UyAuFTy+irqu917ayu+NVI6cfayEW098U4
1747yMecEsAEyA4E0rayq4mtowBaNWF+bxkjCVL3iHKtXUtTX7s6JjOUsQ.T
coEF3pHHbaaMxfsog59trHQMg11hsAoh4CusumKby1mvHB.TfcI.ZAYXOrrD
.mrQu9p4Ptl8UyQ7CtZN95qGzJQrWLtpdYi3z.bbzi5oA6rWLIKhRx0WM4mY
NhX+.waApM9Apk10iNs102T1kaTIEdRSX12mEE+blwtcVn7.IwtpcIhWO5DF
6zeJ0E4blkT1ibWKttssof5KKeLAuewHCzZ1ircCANND85Gaeq806HSSCD0l
8JBNN.lEQ0Laz9HLTFmiGTW8NfgVFTDoeFsfc.pH3p.t7euR7lqEvGDUFLcf
uJjiVvmfW7b14gk0CpKcW7UkCCrmNaH3YTp9AqBhCZxpMUODgvc2KVH87KdX
u1tmjEONnk4Qv+5twRRe3nPx3f9sNZI+1L73BZ6tNsIgy1.F7nAX3QhJ50oq
U10OX01aPuSa3oDMCelvStMWdG6yRY7QdFzqSn9UI4fJd9x4ypt9nYPiN69H
h6g8.YytmApJrenZ2342BVCeg1tY+xakZcSF1XEaZCjxMmdSR5nHy4r5h4Cp
kXSOEoJWa5a8yKcXPbtouNrj1PjJzGkBBQaUosMk8WqR4jBkZMPZYy95tYK9
hUw1QwmxdZuISaZaCztIZcNMY9fxJHuCRp0m7XmIERw42G2pJVUFudPW+LcA
1JnyXU0I4RTZmKRrY624hN7Y7p4ouRNKON0moxinMfcJOj1jG2dWdD05pa7g
S.vSNqc4IKYlrVqMKRDta8f00cKPVPJBANAEBUEvTbps2xdatHEytlKhBAnc
tzixCxTrEYE4QjgxNS8iKOPh0z8fln6IxA2RxiQhC0dvSnIpy1a4hZh4kX2P
ZKpKnIxiWGlWOg5B27HCRvpcbtu61CG.JKDxz4fcnrnl3dB1k6omfonlG6EJ
hp5DqJu38NBiTgb0bo38h8xxyOaZS4a55ukjGz3BeHlvISrm7fLgCz2ebIOd
OOyU0SLov7TXyBcm.QzCOln1QkTj5YQfr2NTtGigxDCDD1Z.t4xicHr8vlHO
H64.wDmxjQUHczP6QmYRHcT6JN3NDGruckGTWJyA1Ud5JgIjEWuLgsWtdYGi
chQQKDXO4wDqK64qvDwgXOxGrQttnVqbDl34BFZMkmmvqnAdfimJyQBFWoai
rXkKMwXGNtpVCzdvCzD0Yn8xrD5Yp4tkjGSXmg3mWpShpGI+mNi7A3bcsjjG
N.4FAwlVuaKAon9.RaSdTsBJZ4xuwJqZFSon3jE8khRwgd2bo5esRpCkcuwo
j8sj0Wu5LQkw2kTyhqWUpZc2CAdNWJtO+7x+CLx3iHB
-----------end_max5_patcher-----------
