
Structuring JSON data with the [dict] object in Max.

{
	"perceivedComplexity": "beastly",
	"actualComplexity": "manageable"
}
Setting and getting content from dictionaries in Max seems straightforward enough, but grouping data into a well-structured form can be a little tricky.

Structured Data

Recently I had a need to create a way to store some fairly complex data in Max. In short, I wanted to map out and find similarities in a bunch of audio files. I’m not a computer scientist or real programmer, so I had no idea how I should do this or how I should store this information in a manageable way. [As it turns out, I needed to create a hash table.]

Traditionally, the [coll] object was the go-to object to do this kind of stuff (and still is to an extent). It’s a simple way to store a list of values at a numerical (or symbolic) index.

1, 100 72 64 forward 7.43 delay 85 0;
2, 60 160 62 forward 5.0 bypass 51 1;
3, 82 10 114 backward 0.2 delay 15 1;
4, 155 97 98 backward 8.2 delay 99 0;

Send the [coll] object a 2, and the corresponding data (60 160 62 forward 5.0 bypass 51 1) will come out the first outlet.

When trying to encode lots of data though, a more descriptive index would be more helpful.

While [coll] supports indexes that are symbols, I was keen to use something that allowed me to look up or retrieve particular ‘atoms’ of the information I was storing. With [coll], if you request the data stored at an index, you retrieve all the data stored at that index. As [coll] data is stored as a list, the order of the data stored at that index is important, and it can be a little difficult to see what each value represents. Furthermore, as I was interested in storing and retrieving data based on some kind of shared similarity (ie. separate arrays of data that should be grouped under the same index), I wanted to store it in a more descriptive and extensible way. What I needed was something like an associative array.

Associative arrays store every piece of information as key and value pairs. This data structure goes by many differing names (dictionaries, hashes, maps, symbol tables, hash tables, collections). In the JavaScript world, these kinds of data storage structures are referred to as objects. I’ll refer to them as objects for the time being. (Just to confuse things more, key-value pairs are also sometimes termed name-value pairs, index-value pairs, and attribute-value pairs.)

The key would describe the bit of information I was interested in storing, and the value would be the number/setting representing that information.

Essentially, objects represent structured data like this:

{
	name: "Alex",
	sex: "male",
	age: 35,
	coffee: "espresso",
	coffeeTimes: [7, 9, 11, 16]
}
An example of object notation

Keys sit on the left, values on the right. There’s a colon after each key, and a comma after every key-value pair except the last. Strings are surrounded by quotes (eg. "espresso"), and arrays are lists of comma-separated values in square brackets (eg. [7, 9, 11, 16]).

The cool thing about object notation is that values can be strings, numbers, lists/arrays, or even objects themselves. Even better is that you can insert a new key-value pair at any point within your object and it won’t break anything, because you retrieve values by their key (in contrast to a [coll], where you’d have to keep track of where each value sat in the stored list).

Combined with arrays, objects are a very flexible way to store and format data. Scott Murray’s D3 tutorial chapter on data types illustrates the power of objects and arrays really well: “You can combine these two structures to create arrays of objects, or objects of arrays, or objects of objects or, well, basically whatever structure makes sense for your data set.”

What do ‘arrays of objects’, ‘objects of arrays’, and ‘objects of objects’ mean?

Well, many things. If an array is a list of items, and an object is a collection of named properties grouped together, you could combine them in ways to:

  • create a list of data structures that were all related in some way and assign them all to one keyed list (and access info about each one by its index in the array); or
  • nest specific bits of information within the context in which they are relevant; or
  • have a collection of properties that had their own groups of sub properties, and so on.

Example 1:

{
	animals: [
		{ name: "Alex", sex: "male", age: 35, species: "human" },
		{ name: "Benny", sex: "male", age: 3, species: "cat" },
		{ name: "Mench", sex: "male", age: 6, species: "cat" }		
	]
}
The ‘animals’ key contains an array of objects.

Example 2:

{
	series1: [ 0, 1, 3, 7, 15, 31, 63 ],
	series2: [ 1, 4, 9, 16, 25, 36, 49 ],
	series3: [ 1, 2, 4, 7, 11, 16, 22 ],
	series4: [ 1, 1, 2, 3, 5, 8, 13 ]
}
An object of arrays.

Example 3:

{
	name: "Alex",
	sex: "male",
	age: 35,
	coffee: {
		type: "espresso",
		specs: {
			shots: 2,
			milk: 1,
			sugar: 0
		}
	},
	coffeeTimes: [ 7, 9, 11, 16 ]
}
The ‘coffee’ key contains an object with two keys (‘type’ and ‘specs’), and that ‘specs’ itself also contains an object.

Essentially, [] indicates an array, and {} an object. In JavaScript, you access objects’ values by their key, and arrays’ values by appending their numerical index (starting at 0) in square brackets. If an object is contained within another object, you use ‘dot’ notation to indicate the ‘path’ to the desired named element.

age			// Returns 35
coffee.type		// Returns "espresso"
coffee.specs.shots	// Returns 2
coffeeTimes[2]		// Returns 11
Retrieving properties of keys and arrays in JavaScript

JavaScript Object Notation

JavaScript Object Notation (or JSON) is a specific syntax for organising data as JavaScript objects. Essentially keys are wrapped in double quotes, as are the values if they are strings/symbols.

{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    }
  ],
  "children": [],
  "spouse": null
}
[From the JSON entry on Wikipedia]

Note again that the value stored under ‘address’ is itself an object that contains its own key-value pairs, and that ‘phoneNumbers’ contains an array of objects.


Dictionaries in Max: The [dict] object

The [dict] object emerged in Max 6 as a way to store structured data like this. As the term ‘object’ in Max refers to elements within a patch that perform a function, object data structures are referred to as dictionaries in Max.

{
	"key": "value",
	"anotherKey": "anotherValue"
}

Why are dictionaries good?

Apart from the fact that data can be structured in a more meaningful and readable way, the order of the key-value pairs they contain doesn’t matter. As alluded to above, with the [coll] object, changing the order of the values in a stored list would likely break something in your patch (as the position of each item in the list carries some kind of associative meaning), whereas in a dictionary the order is irrelevant, because you request the value stored at a key (as opposed to the nth item in a list).

In a [coll]:

1, 100 72 64 forward 7.43 delay 85 0;

… is different to:

1, 64 forward 100 72 7.43 delay 85 0;

whereas in a [dict]:

{
	"key1": 54,
	"key2": 95,
	"key3": 8
}

…is equivalent to:

{
	"key1": 54,
	"key3": 8,
	"key2": 95
}

Building dictionary content

The [dict] object allows us to programmatically build up content in a JSON-like way. There are a few ways of setting content in a [dict] object. set, append, and replace messages allow you to:

  • set a string (symbol), number (int/float), or array at a particular key;
  • append values to a specified key to turn it into an array (or insert the key-value pair if it does not exist within the dictionary); and
  • replace the value at an existing key (or insert the key-value pair if it does not exist within the dictionary).
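
For example, given an empty [dict], the following messages (a minimal sketch in the same message syntax used later in this post; the keys and values are made up) build up some simple content:

set name Alex // store the symbol 'Alex' at the key 'name'
set coffeeTimes 7 9 11 // store an array of ints at 'coffeeTimes'
append coffeeTimes 16 // extend the array to 7 9 11 16
replace age 35 // 'age' doesn't exist yet, so the key-value pair is inserted
replace age 36 // 'age' now exists, so its value is overwritten

…leaving the dictionary as:

{
	"name": "Alex",
	"coffeeTimes": [ 7, 9, 11, 16 ],
	"age": 36
}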


Before we get to nesting dictionaries within dictionaries, let’s look at how to retrieve content.

Retrieving content from a dictionary

{
	"name": "Alex",
	"sex": "male",
	"age": 35,
	"coffee": {
		"type": "espresso",
		"specs": {
			"shots": 2,
			"milk": 1,
			"sugar": 0
		}
	},
	"coffeeTimes": [ 7, 9, 11, 16 ]
}

There are a few methods that allow you to get information from a dictionary: get, gettype, getsize, and getkeys. Given the dictionary above, the following is an example of what gets output with these get methods.

Method    Example                   Output
get       get name                  name Alex
          get sex                   sex male
          get coffee                coffee dictionary u504001192
          get coffee::type          coffee::type espresso
          get coffeeTimes           coffeeTimes 7 9 11 16
gettype   gettype name              name symbol
          gettype age               age int
          gettype coffee            coffee dictionary
          gettype coffeeTimes       coffeeTimes array
getsize   getsize name              name 1 [ie. 1 string]
          getsize age               age 1 [ie. 1 int]
          getsize coffee            coffee 1 [ie. 1 dictionary]
          getsize coffee::specs     coffee::specs 1 [ie. 1 dictionary]
          getsize coffeeTimes       coffeeTimes 4
getkeys   getkeys                   [outputs a list of all the top-level keys]

Note that to access nested dictionary content (eg. ‘specs’), you use a double-colon separator (::).

So we can retrieve nested dictionary content, but how do we set it?

Setting key-value pairs is easy, but setting nested dictionary content (ie. a dictionary at a key, or an array of dictionaries at a key) requires a few little steps to do correctly. Let’s build a complex set of nested content like GeoJSON data as an example:

{
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [ 150.1282427, -24.471803 ]
            },
            "properties": {
                "type": "town"
            }
        }
    ]
}
[From Scott Murray’s Types of Data]

The setparse message

There’s not very much about setparse in the Max help patches, but it’s one of the most important messages for constructing dictionaries within dictionaries.

setparse allows you to set content as a dictionary at a specified key.

Let’s go back to a simple example:

{
	"name": "Alex",
	"sex": "male",
	"age": 35
}

The syntax for setparse goes like this:

setparse coffee type: espresso

The first word after ‘setparse’ is the key at which you wish to add some dictionary value. If the second word has a trailing colon (eg. as in ‘type:’), it creates a dictionary with that key (type) within the first key (coffee). Re-read that if it didn’t make sense.

If you list a value after the second word (eg. ‘espresso’), it sets the value at the second word’s key (ie. the value of the nested dictionary’s key).

The dictionary would now look like this:

{
	"name": "Alex",
	"sex": "male",
	"age": 35,
	"coffee": {
		"type": "espresso"
	}
}

You can specify as many words with trailing colons as you like and it will create those keys, eg.

The message:

setparse coffee origin: roast: age:

…would create:

{
	"name": "Alex",
	"sex": "male",
	"age": 35,
	"coffee": {
		"origin": "*",
		"roast": "*",
		"age": "*"
	}
}

…and Max will store placeholder text (“*”) at those keys (if no value is listed after each key). Note, though, that the type key has disappeared. When you set content (and this includes setparse), it overwrites any existing content at that key. It is sometimes best to first create a key with setparse,

{
	"name": "Alex",
	"sex": "male",
	"age": 35
}
setparse coffee type: espresso  

…then append the elements one at a time like this:

append coffee::origin *
append coffee::roast *
append coffee::age *

This retains all four keys (type, origin, roast, and age).
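
Assuming each append inserts its missing key with the placeholder value (as described above), the dictionary should end up as:

{
	"name": "Alex",
	"sex": "male",
	"age": 35,
	"coffee": {
		"type": "espresso",
		"origin": "*",
		"roast": "*",
		"age": "*"
	}
}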

Making a key store an array of dictionaries

Lastly, if you want the item stored at a key to be an array of dictionaries, there is a cool way to achieve this (that, as far as I can see, is undocumented in the help patches).

Let’s try to create this structure:

{
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [ 150.1282427, -24.471803 ]
            },
            "properties": {
                "type": "town"
            }
        }
    ]
}

Here is a list of messages (with a comment explaining what each does):

set type FeatureCollection // create a key called 'type' and assign it the value 'FeatureCollection'
set features // create an 'empty key' called 'features'
append features // this is a crucial step - this turns features' value into an empty array
setparse features[0] type: geometry: properties: // create a dictionary with three keys as the first item of the 'features' array
set features[0]::type Feature // as with the last step, we need to ensure we address the items with square bracket notation now that it's an array

setparse features[0]::geometry type: coordinates: // set 'geometry' within the first features item to a dictionary with its own two keys
set features[0]::geometry::type Point // set the value of 'type' within the geometry dictionary
set features[0]::geometry::coordinates 150.12825 -24.471804 // set the value of 'coordinates' within the geometry dictionary to an array of floats

// or the previous 3 lines all in one step
setparse features[0]::geometry type: Point coordinates: 150.12825 -24.471804

setparse features[0]::properties type: town // create a new key 'properties' and set its content as a dictionary

Optional: should you wish to extend the length of the ‘features’ array, try:

append features * // append some dummy data to the 'features' array, then...
setparse features[1] type: geometry: properties: // add the keys
append features * // again, extend the 'features' array
setparse features[2] type: geometry: properties: // add keys to the third item in the array
append features * // and again, extend the 'features' array
setparse features[3] type: geometry: properties: // ...you get the idea.
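
If you’d rather script this sort of construction, the [js] object exposes dictionaries through a Dict API whose methods mirror these messages. A minimal sketch (I’m assuming Max 6’s Dict object in js, that parse/get/stringify behave as documented, and that the :: and [] path syntax works as it does in the message interface):

// Bind to (or create) a named dictionary, eg. one shared with [dict geo].
var d = new Dict("geo");

function build() {
	// parse() imports a whole JSON string, sidestepping the
	// set/append/setparse dance needed with plain Max messages.
	d.parse('{ "type": "FeatureCollection", "features": [ { "type": "Feature", "geometry": { "type": "Point", "coordinates": [ 150.1282427, -24.471803 ] }, "properties": { "type": "town" } } ] }');
	post(d.get("features[0]::geometry::type"), "\n"); // -> Point
	post(d.stringify(), "\n"); // dump the whole dictionary as JSON
}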

Building GeoJSON data example patch

A comprehensive tutorial from Cycling ’74 (beyond this vignette) would still be very welcome, but in the meantime check out the help patch below for some examples of how to create complex dictionary structures.

Perspective Simulation

I was recently watching some videos about using ‘displacement maps’ in After Effects, which is a way to give 2D images the appearance of being 3D. It’s a beautifully simple idea and I wanted to see if I could recreate this effect in Processing.

In short, the 3D appearance is simulated by offsetting the position of each pixel by some (varying) value. The offset amount is dictated by a depth map, which can be as simple as a series of greyscale layers that represent ‘planes’ of extrusion, much like how elevation is represented on a topographical map.


The following video simulates a moving perspective, where the only source material is a still image, and a greyscale image representing topography.

This effect is achieved by using the brightness of the pixels in the depth map to offset the location of pixels from the input source. Sarah created a detailed depth map for this (below).

[Input source: Face of the Man by George Hodan]
[Depth map (detailed)]

My initial approach used a simple lookup of the depth map to offset the pixels of the input source to create the output. This is a fairly efficient implementation, but produces a low-quality output, as using a one-pass lookup to extrude points can leave some pixels in the output array blank. Why? Because bright areas can offset pixels by a large value, and dark values offset by a small value. Some pixels in the source input can therefore be mapped to the same pixel in the output (and therefore some output pixels are left unmapped).

You can see the result of this in the video below:

Download the PerspectiveLite Processing sketch.

Additionally, a single iteration through the list of pixels can cause pixels later in the array to overwrite previously set pixels (which has the potential to bring the top lip in front of the bottom lip when shifting the perspective upwards, for example).


A higher quality result (as shown in the first video) can be produced by deducing which pixel is likely to be mapped to a specific point in the output — this also allows brighter values in the depth map to take precedence. This is done by walking through the output array and calculating which source pixel (given the magnitude of the perspective transform) meets the conditions of that amount of offset. Stepping through the pixels on the depth map (from 0 to the maximum offset, multiplied by the brightness of the pixel in the depth map) also lets you calculate which pixel from the input should be the frontmost (in the event that two pixels are mapped to the same point in the output).
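
In code, my reading of that reverse lookup looks something like this (a simplified, one-dimensional JavaScript sketch handling a horizontal shift only; depth values are assumed to be normalised to 0..1, and holes are left unfilled):

// For each output pixel, test every candidate source pixel within range,
// keeping the candidate whose depth-driven offset lands it exactly here.
// Brighter (nearer) candidates win any collision.
function shiftRow(src, depth, maxOffset) {
	var out = new Array(src.length);
	for (var x = 0; x < src.length; x++) {
		var best = -1;
		for (var d = 0; d <= maxOffset; d++) {
			var sx = x - d; // candidate source pixel, d pixels behind
			if (sx < 0) break;
			var offset = Math.round(maxOffset * depth[sx]); // how far this pixel extrudes
			if (offset === d && depth[sx] > best) {
				best = depth[sx]; // frontmost candidate so far
				out[x] = src[sx];
			}
		}
	}
	return out;
}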

The following sketch demonstrates this approach. This implementation is a bit too computationally expensive to run in the browser at useful frame rates unfortunately, but you can download the Perspective Processing sketch to play around with offline.

Rainbow Apple Logo from WWDC2014 in Processing

Watching WWDC 2014 recently, the Apple graphic that was projected behind the presenters caught my eye.
WWDC2014
I was playing with Processing before watching this, so I started thinking about how such a graphic could be created. If a low resolution image of the Apple logo was used as input, the brightness of the pixels could be mapped to the size of the squares.

Here’s a rough Processing sketch to recreate the projected ‘rainbow’ Apple logo from http://www.apple.com/au/apple-events/june-2014/. The source image has a gaussian blur applied to it before downscaling to soften the gradient. This has the effect of scaling the squares adjacent to the logo’s edge.
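
The essence of the mapping is tiny. A hedged JavaScript sketch (not the Processing source itself; the names are my own):

// For each pixel of a low-resolution source image, emit a square whose
// side length tracks the pixel's brightness (0-255).
function squaresFromPixels(brightness, w, h, cell) {
	var squares = [];
	for (var y = 0; y < h; y++) {
		for (var x = 0; x < w; x++) {
			var b = brightness[y * w + x];
			squares.push({
				cx: x * cell + cell / 2, // centre of this grid cell
				cy: y * cell + cell / 2,
				side: (b / 255) * cell // brighter pixel, bigger square
			});
		}
	}
	return squares;
}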

View source. Download the AppleLogo sketch for Processing.

Fix .mxo Max externals displaying as a folder

Under Mavericks, I’ve noticed that quite a number of Max .mxo objects show up in the Finder as folders. Max won’t load these even if they’re in your search path.

Vade wrote about this back in 2006 — MXO externals showing up as Folders in Max/MSP? — and offered a solution.

This fix is easy for the odd external, but cumbersome for a collection of objects. Here’s a Max patch to fix up individual/multiple objects quickly.

Requires Jasch’s [fscopy], [strrchr], [strcut], and [strcat] objects.
Available from: http://www.jasch.ch/dl/default.htm

Problem: Max objects showing up as folders.
Fix: Drag a .mxo folder into the dropfile object in the following patch.

OSynC 1.1 (32/64bit) VST plugin

OSynC is a way to synchronise performers, computers, and applications with OSC. It is designed to be a stateless system, where performers can join or leave a performance, and receive tempo/time signature and other positional information on a number of time scales to remain in sync effortlessly with a host. It can also be used as a ReWire-like way to synchronise note-generating events across applications.

Typically, an OSynC host transmits transport information, and a client (either on the same machine, or on another networked machine) receives the stream of descriptors. OSynC VST-[32|64]bit.vst grabs transport information from Cubase (and other VST capable DAW hosts like Digital Performer and Notion Music) and formats a number of timing messages as OSC packets (sent over UDP port 10101).

This VST plugin was created to synchronise Max with a VST host. This allows Max to be an event generator, and have Cubase score/record synchronised MIDI events.

Order of messages within an OSynC datagram:

/osync/timestamp 773495082. 23582014.
/osync/play 1
/osync/bpm 125.7
/osync/timesig 4 4
/osync/fractime 59.665871
/osync/barcount 12
/osync/bar 4
/osync/beat 3
/osync/fraction 18
/osync/ramp 0.578125


Download for OS X


OSynC 1.1 (OSynC VST 32 & 64bit plugins/Max abstraction & help) – 1.1, January 2014

The download contains a VST plugin (for Cubase) and a Max abstraction for unpacking OSynC packets and dispatching descriptors. Check out the OSynC-route.maxhelp patch for capturing and dispatching OSynC messages.

PS. You need to compile liblo and put liblo.7.dylib in /usr/local/bin. You can also find a precompiled version.

Simple Karplus-Strong Synthesis and Note Length

Ever waste a whole bunch of time trying to understand something, where the only way to feel less guilty about wasting so much time on it is to tell someone about it? Me too… here goes.

In between more important things, I was thinking about Karplus-Strong (KS) plucked string synthesis again the other day — more than just the general concept of a filtered feedback network creating something that sounds like a plucked string — specifically, how high notes are quick to decay, and how low notes sustain for longer.

The reason this happens is the amplitude scaling that is applied through each feedback recursion. Each feedback recursion multiplies the amplitude by some value less than (but close to) 1, so the note lasts until this recursive geometric sequence decays to zero. Because the sequence terminates at zero after a fixed number of recursions (well, at least at some point in a fixed bit-depth number system anyway), the number of times a grain repeats before fading to silence is exactly the same whether it’s a low or high note. Since the high notes have a short grain-length, they die quicker.

I set out to figure out a way to specify note length, and have the algorithm work out the scaling factor from the desired pitch (the higher the pitch, the more recursions per second, so the closer the factor needs to sit to 1). Not sure if someone has written about this before (there’s probably plenty published about it, and to be honest, I didn’t look), but it was an excuse to kick around a couple of under-used neurons in my brain.

Where to begin…

When we think about sound in a medium (eg. in air, water, or earth), pitch and wavelength are inversely proportional, related through the speed of sound in that medium. The pitch is the number of oscillations per second, and the wavelength is some distance specified in metres, feet, etc. If something has a frequency x, the length of one cycle of the wave is the speed of sound in the medium divided by x — so if sound travels 343 metres per second in air, something with a frequency of 343 Hz has a wavelength of 1 metre. A frequency of 440 Hz has a wavelength of 0.78 metres, etc. etc.

It’s more useful to think of wavelength in the temporal sense though, when it comes to KS synthesis.
‘Wavelength’ in KS synthesis is really just ‘time in ms’ of a repeating grain/cycle. I’ll refer to this from here onwards as cycle-time, or grain-width… ehhh, I can’t decide. Grain-width is probably better. Essentially, in KS synthesis, we’re more interested in the time it takes for a cycle of the wave to complete. Therefore, in the KS algorithm, the delay-line length of the repeating wave (the grain-width) in ms is 1000 / frequency.

Why? Because the frequency is the number of times it repeats each second. For example, a frequency of 440 Hz would have a grain-width of 1000 / 440 = ~2.27ms. A frequency of 343 Hz would have a cycle-time… I mean grain-width… of ~2.915ms. This is the ‘length’ of the repeated noise burst.

Calculating the Scaling Factor

Firstly, let’s lock the desired length of the plucked sound to be 2 seconds. In order for something to play for 2 seconds, we need to know the pitch we want to hear, and consequently adjust the multiplication factor so that for a given cycle-time/grain-width, a repeated wave will terminate after a specific number of recursions.

OK… so let’s say we want to play a 440 Hz note. The cycle-time is 2.27ms. The sound should repeat 880 times (given that it repeats 440 times per second).

It wasn’t immediately obvious to me how to set a multiplication value such that a geometric sequence (starting at 1., ending at 0.) terminates after 880 recursions. The answer to this question might be fairly straightforward to someone out there with a background in number manipulation (or quantisation) in fixed bit-depth number systems, but it seemed like the only way for me to understand this was to plot out the ‘recursions’ given a series of fixed scaling factors.

So, to figure this out, I did something really dull… (I’ll call it brute-force “modelling” to make it sound more interesting than it is). I set up a really basic system to count the number of times that a number (1.) is recursively scaled (with values < 1.) until it was less than 0.000001*. At an amplitude of 0.000001, I figured the sound was essentially silent (for 16-bit audio anyway). The way I did it was pretty rough, but that didn’t matter. I just wanted to glean a sense of how the scaling factor affected the number of recursions.

*Disclaimer: Having a poor understanding of number representation in the computer, I couldn’t count the number of times the function ‘actually’ happened because if I set the counter function to stop when the geometric sequence ‘equalled’ zero, it kept counting. I figured it had something to do with subnormal numbers — or that the internal mechanics of the ‘accum’ object had a finer resolution than the upper world of Max (or maybe it’s that I just upgraded to Max 6.1, where it appears that objects are working internally with 64bit precision… I think).

I therefore used an arbitrary small value, namely 0.000001 (which seemed to be the smallest number in magnitude that I could use as an argument in an object or write in a message box).

This is what I learned:

Starting with 1, the following values were less than 0.000001 after being recursively multiplied n times:

0.90 = 131 recursions 0.990 = 1374 recursions 0.9990 = 13808 recursions
0.91 = 146 recursions 0.991 = 1528 recursions 0.9991 = 15343 recursions
0.92 = 165 recursions 0.992 = 1720 recursions 0.9992 = 17262 recursions
0.93 = 190 recursions 0.993 = 1966 recursions etc.
0.94 = 223 recursions 0.994 = 2295 recursions
0.95 = 269 recursions 0.995 = 2756 recursions
0.96 = 338 recursions 0.996 = 3446 recursions
0.97 = 453 recursions 0.997 = 4598 recursions
0.98 = 683 recursions 0.998 = 6900 recursions

The pattern I spotted was that with each extra digit of precision (as the scaling factor tends toward 1.), the recursion count jumps roughly tenfold. Seems somewhat straightforward. I thought this over, and came to the conclusion that we can model this ‘precision’ by:

subtracting the scaling factor from 1, and then dividing 1 by that number.

eg. for 0.98:

1. – 0.98 = 0.02
1. / 0.02 = 50.

and 0.998:

1. – 0.998 = 0.002
1. / 0.002 = 500.

This produced values that shared the same ratios as the figures I got from my tests, but the numbers themselves were proportionally out by another value. Dividing the numbers produced by my recursive scaling tests by these precision ‘ratios’ I derived, I get a value somewhere around 13.75. I don’t know what this means… I’ll try to understand this later.

Firstly though, in order for me to calculate the scaling factor for a given pitch, I should do some kind of lookup (or reversal of the above process): find a value that terminates after n recursions, and set that as the scaling factor.

So, back to the example of making a note ring on for two seconds, let’s pick a pitch and try it out… Say 440Hz.

OK. 440Hz will recur 880 times in two seconds.

If we want 880 recursions, our scaling factor should be somewhere between 0.98 and 0.99. Given that 0.98 produces 683 recursions, and 0.99 produces 1374, there’s quite a bit of leeway there.

Let’s try using the reverse of the formula (with the mystical ~13.75 ) to try to derive the scaling factor. [Strangely, I only just noticed the 1374 on the line above… maybe it’s just a coincidence.]

880 (number of recursions) / 13.75 = 64

1 / 64 = 0.015625

1 – 0.015625 = 0.984375 (Fits the conditions of being between 0.98 and 0.99. Good).

The formula to calculate scaling factor therefore appears to be,

1 minus the inverse of ((pitch * length in seconds) / 13.75)

or (since 13.75 is just a constant)

1. – (1. / ((frequency * length in seconds) / magicNumber))
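
For what it’s worth, the lookup can be collapsed into a closed-form function (a JavaScript sketch; note that ln(1/0.000001) ≈ 13.8 is suspiciously close to the empirical ~13.75, which suggests the ‘magic number’ is just the natural log of the silence threshold):

// Scaling factor r such that r^n first drops below the silence floor
// after n = frequency * length recursions. From r^n = floor:
// n * ln(r) = ln(floor), and since ln(r) ≈ -(1 - r) for r near 1,
// r ≈ 1 - ln(1/floor) / n.
function scalingFactor(freq, seconds, floor) {
	var n = freq * seconds; // desired number of recursions
	return 1 - Math.log(1 / floor) / n;
}

// scalingFactor(440, 2, 0.000001) -> ~0.98430 (cf. 0.984375 above)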

It’s not perfect, but notes now seem to sustain for similar lengths, whether played low or high.

Other thoughts

Up to this point (in order to remove variables) I’d been working with an ideal situation, where no low-pass filter existed in the KS feedback network. Depending on the cutoff (or amount that is attenuated), the filter plays a big part in how long the note appears to resonate. Notes down the bottom still start with a broader spectrum, so a proportional filter cut-off (based on the fundamental of the synthesised note) might be a good idea too. I’ll think about that another day. After adding a fixed cut-off it doesn’t sound bad as it is though.

What have I learned from this?

  1. If something is percolating away in your brain, it’s faster to go about it in a laborious way and put pen to paper with the data you acquire than to wait for it to become self-evident.
  2. That producing something useful with the Karplus-Strong algorithm is much more complicated than scaling/filtering feedback of noise.

What am I still to learn?

With regard to different scaling factors, I haven’t yet looked in depth at how different pitches decay with different values (basically, whether a geometric sequence is the same at different scales). From some basic tests in a spectrogram, the geometric sequence appears to be similar at different scales. Notes seem to decay at similar rates when the grain is smaller (and the scaling factor is consequently closer to 1). It isn’t immediately obvious to me that it should or shouldn’t be this way.

At the end of this all… what’s with the magicNumber constant (~13.75)? As the sketch above suggests, it’s most likely just ln(1/0.000001) ≈ 13.8, the natural log of my arbitrary silence threshold. Ultimately, though, the exact value is irrelevant. I arrived at this number from the data I acquired just by counting the number of times 1 (when recursively multiplied by a value less than 1) tended to 0. In the actual synthesis implementation, setting it around 4.5 with a low-pass filter around 8000Hz makes notes appear to last around the desired length. It seems somewhat related to the filter cutoff value in adjusting note length (eg. a lower filter value causes the note to appear to fade more quickly, as we’re filtering out a lot of the note’s energy). It’s a little bit subjective though, and I only timed the length of the plucked synth notes by ear (and on laptop speakers).

Patch

Here’s the patch. Adjust the note length number box to produce longer or shorter notes. Make sure you switch between the ‘enable/disable scaling adjustment’ to hear the difference in note sustain lengths (ie. compensating for different grain-widths to produce similar decay lengths).


----------begin_max5_patcher----------
2143.3oc0bksbaaCE8Y6uBTN9g3TGEhMtjY5CY5zz9T9ApyjghBRlIbQkjJ1
MYh+1KVn1rDAgkHgYGOVxjhB7hCt2ycCz+3xKblV7.qxA7NveCt3heb4EWHO
k3DWzb7ENYQODmFUIuLmb18ES+hyMpOpl8Ps7zoEQyxXUU.2Iggq+z7UYI4o
rZ42D1bx4E40UIemIOGZh61qsXU8Su3kQ0w2kju3ykr3ZkX5QI7uD.EREuED
JOvchK3SMemjYRIhKkuIv0Ym6ZdTl7t579xjnz0excIylwx28lpji5+cIScG
cb.eR7I+7xKEubigHUbQVFKu9.n5Oxillxd6rjJw6fp3nT9LDDM6Kqpp28a7
bfukkrJ9WMpNoHeGvhDPlfn2.vdRzxs4kMX09n95Aa5pobIqi0ARHdBzyyKv
WL7p0DW0pA43qF9AcuZLubwTonLwsGg7zh6WxuFv7jzZVIHdUcw74fjbve88
dErgS74fMwE26fc.VtN1LzPOWs.s2KEPmEkuJJciN87n35hx9DhoHID2nv0m
PLkDpFZjTIFRfZgX5PCwKptOgKjGE7vOWRSbXfhGfFHYM8l3FfwDB+L3VlfD
mMCWIe5wsa9LSxbsKx1STkySK3Sk9TOwENAialgdlXNhzX7wIO2LXX43DPNv
sCWlSx3SB0B9Q.TT2ZL5fZNfsc7CCCaaEfikQbCSfyzn7E8qmqy2rtGrOQgx
kyv.slmvWJFv+rLh6WIkkun9NgGlrpSAmTHRUxL1d5lGWYGgnRjAR6cm7HWp
j2.5imHdCJ06wgGG08BeoP8OTx9mUr73+0xJkPLQoTB81gVnEkRuAO9mtBNG
NnQlKzCUAoH0C8Q5BMmR5FL5IuKchJH+AEWftRiGLQkwBUKrfrFrTsjKgkEU
wEKYOtdrqhKKRSeRLFagj0LQUE4E6NqMFKnPEKNgCFRUEUvVAnVLYripvu9n
itY64p.DHW+oXoKKLRbPaq+99mv5eUxhb9GeZHxprordvWtt3m3S4c3Ent5L
.7vmWXRG.NI4mYzPmdvoFiPHtGUjzMRP2HDMrmQndHhwUbuxqNEHJolk0T+I
meOsnhMSHH2Hd4Cxj02b3GKdSyYZIvFrAdlTrNPW8.r+vnBJ+sOospXoLQj3
OBPZSSr2brGpxIlp2ElmkovZKnv2yWuc6unuOmvnC8k5cxfDUEVnsXnoCdgE
r.elxgmYrY3wGalE7IpPHHBY.BAGadDagNpFLm+yTvzyG4HZJkEJTBcXYjiD
jmtvoHgm.WzFMn8UkF6AQraYpV2cDhNEKRvHzzqib0BF1dKAUoyC875NCVh2
nICVxD5fhK9TnpsCppbnGVnVCVrfQ05YdnAD0Dx3ydxBPzFiFivHz+a3bZF4
Art.9XeIVo53DFqC3v9mrmrdMn5rnEIweTFfDHtHupN5zZX8YDMcfKRYSJ0z
fTsMpCidopIcE32.Pva.uBBdK3UyAuFTy+irqu917ayu+NVI6cfayEW098U4
1747yMecEsAEyA4E0rayq4mtowBaNWF+bxkjCVL3iHKtXUtTX7s6JjOUsQ.T
coEF3pHHbaaMxfsog59trHQMg11hsAoh4CusumKby1mvHB.TfcI.ZAYXOrrD
.mrQu9p4Ptl8UyQ7CtZN95qGzJQrWLtpdYi3z.bbzi5oA6rWLIKhRx0WM4mY
NhX+.waApM9Apk10iNs102T1kaTIEdRSX12mEE+blwtcVn7.IwtpcIhWO5DF
6zeJ0E4blkT1ibWKttssof5KKeLAuewHCzZ1ircCANND85Gaeq806HSSCD0l
8JBNN.lEQ0Laz9HLTFmiGTW8NfgVFTDoeFsfc.pH3p.t7euR7lqEvGDUFLcf
uJjiVvmfW7b14gk0CpKcW7UkCCrmNaH3YTp9AqBhCZxpMUODgvc2KVH87KdX
u1tmjEONnk4Qv+5twRRe3nPx3f9sNZI+1L73BZ6tNsIgy1.F7nAX3QhJ50oq
U10OX01aPuSa3oDMCelvStMWdG6yRY7QdFzqSn9UI4fJd9x4ypt9nYPiN69H
h6g8.YytmApJrenZ2342BVCeg1tY+xakZcSF1XEaZCjxMmdSR5nHy4r5h4Cp
kXSOEoJWa5a8yKcXPbtouNrj1PjJzGkBBQaUosMk8WqR4jBkZMPZYy95tYK9
hUw1QwmxdZuISaZaCztIZcNMY9fxJHuCRp0m7XmIERw42G2pJVUFudPW+LcA
1JnyXU0I4RTZmKRrY624hN7Y7p4ouRNKON0moxinMfcJOj1jG2dWdD05pa7g
S.vSNqc4IKYlrVqMKRDta8f00cKPVPJBANAEBUEvTbps2xdatHEytlKhBAnc
tzixCxTrEYE4QjgxNS8iKOPh0z8fln6IxA2RxiQhC0dvSnIpy1a4hZh4kX2P
ZKpKnIxiWGlWOg5B27HCRvpcbtu61CG.JKDxz4fcnrnl3dB1k6omfonlG6EJ
hp5DqJu38NBiTgb0bo38h8xxyOaZS4a55ukjGz3BeHlvISrm7fLgCz2ebIOd
OOyU0SLov7TXyBcm.QzCOln1QkTj5YQfr2NTtGigxDCDD1Z.t4xicHr8vlHO
H64.wDmxjQUHczP6QmYRHcT6JN3NDGruckGTWJyA1Ud5JgIjEWuLgsWtdYGi
chQQKDXO4wDqK64qvDwgXOxGrQttnVqbDl34BFZMkmmvqnAdfimJyQBFWoai
rXkKMwXGNtpVCzdvCzD0Yn8xrD5Yp4tkjGSXmg3mWpShpGI+mNi7A3bcsjjG
N.4FAwlVuaKAon9.RaSdTsBJZ4xuwJqZFSon3jE8khRwgd2bo5esRpCkcuwo
j8sj0Wu5LQkw2kTyhqWUpZc2CAdNWJtO+7x+CLx3iHB
-----------end_max5_patcher-----------

Tathagata

The base-chords and note progression that forms Tathagata is something that must have been going through Fredrik Thordendal’s head for some time.  This progression is something that has emerged at least 3 times in both Meshuggah and Fredrik’s solo work, and can be heard as the basis of Tathagata (Sol Niger Within, 1997; Sol Niger Within version 3.33, 1999), the close [3:50] of Sublevels (Destroy Erase Improve, 1995) and the outro [9:06] of Fredrik’s Secrets of the Unknown demo.

Tathagata is a beautiful track that seems to will the listener into innately feeling or predicting the next series of notes.  I was wondering why I felt this way about the track, until I sat down and started to score out the notes.  What emerged was a pattern, or a recursive, self-pitch-clipping algorithm.  (While it is a progression that continues to spiral upward, it has octave subtraction protection to stop it going on infinitely upward).  This is a very cool progression that, while it isn’t an auditory illusion, for some reason makes me think of a Shepard-Risset glissando, Risset’s rhythmic accelerando auditory illusion, or Autechre’s Fold4,Wrap5 (LP5, 1998).

At first, I thought the track progressed in measures of 5, with the notes being played on the 1, 2 and 3. Sarah argues that each measure is divided into 16ths with stresses on 1, 4 and 7, which I now think is actually correct.

In lay terms, the track can be represented as such:

The pattern progresses in a count of 16.

The ‘root’ note starts at MIDI pitch 56, and adds 5 semitones after each loop.  If ‘root’ exceeds 62 (or the initial ‘root’ value + 6), then subtract 12.

‘root’ plays a chord (root, root + 7, root + 12 [they are transposed down an octave in the code below]) on the 1 that rings out over the course of the measure.

‘note[1-3]’ is the melodic progression of the plucked parts.  They play on positions 1, 4, and 7 respectively of the measure.

note[1] = root + 2

note[2] = root + 3

note[3] = root + 11

One possible implementation of Tathagata for JS in Max


// Simple representation of pitches and key modulation in Fredrik Thordendal's Tathagata.
// Tathagata.js by Alex Mesker / alex@x37v.com / www.x37v.com
// Save as Tathagata.js and put it in your Max search path.
outlets = 2;
setoutletassist(0, "Melodic progression (MIDI pitch)");
setoutletassist(1, "Root chord note/s (MIDI pitch)");
var root = 56;
var cutoff = root + 6;
function msg_int(i) {
	if (i == 1) {
		// chord on the 1: (root, root+7, root+12), transposed down an octave
		outlet(1, root - 12);
		outlet(1, root - 5);
		outlet(1, root);
		outlet(0, root + 14); // note[1] = root + 2, an octave up
	} else if (i == 4) {
		outlet(0, root + 15); // note[2] = root + 3, an octave up
	} else if (i == 7) {
		outlet(0, root + 23); // note[3] = root + 11, an octave up
		root += 5; // add 5 semitones after each loop...
		if (root > cutoff) {
			root -= 12; // ...subtracting an octave to stop it spiralling upward
		}
	}
}

Save the above JavaScript as Tathagata.js and put it in your Max search path.

A Max patch that uses it can be downloaded here:

 ----------begin_max5_patcher----------
1034.3oc0YtzbaBCDG+r8mBMb1MC5EO5jKc54dqSuzloCATrIEPd.4ooMS9t
WP.wOBTDBaYxgPRDfz+8m1c0hzyKWXcO+IVgE3ifuCVr34kKVHappgEM++Bq
zfmBSBJjOlUJqnHXMyZU88DrmDx1QXZaaayYErLQfHlm8ybVnnt+gT3M1q.X
mpq1MW.207RY6RiyRXB4nfZ6o.Q3l3r0G0KxWkhj8E5F5J.z639guSz1QvlV
iijhje+ie.ha04C7LQVPJSdqOkGGjzdm5dP7msr5A0x509u5kJh+q7FvRQT0
5KKWVcYkhHLjmlVxm2fvuvR3Qwg+H6a73vWIbRbFKjuKSbDX5FwXBsBKHj2v
LF1OiadeHVNgQIRTS5Ex1c.Y3vP9BRxOugmGoBGUkHXa3ADg3Odh3YDfjw9c
4f8Fdb6s.h0HCzv1R+GBQ9Kbsik8XhzvtZDoEWMAc2EmH4fhsLVTmPwdXnXK
CPpSnMRn3bsS+zCQJ9ODANX9XnsqjH3AHRGQF6W33hFZz6BWtiMx.gHSaIHL
7Z6CzKL7GML7mHL1Ojyr.hDdPz8AYqsF8pmRh.kIGbrGeJBjN4MkJ8rRkco2
yx0z3cqyFLjwi5v3oZtnwpKAC51yX65Ji..0LwQStRG33yUhbLSYD5M26O04
dx6l4djl4ImxbO7ZVBYZvuXYbAqracAXWpcm..2O.7jqPPbq+zBn63cOf5rR
wCkowkNH0+gA7PlLoP1SGUtuKPUEl5OZZXDQgd5FMActlQSsFNTunHJjpscS
ul1cAKAXO1Tmv5O9D5Q1u+QiKbfpaAUqN2kZVjDG0yJqnAc2QT4FO3ViCeuw
f.xPKfdzGdmxipMSyYidS1DQlyD0p9nZKrwMVq5iPyo5iNhAkDHNcWpr+bza
KDlPci1y+5Fka8HKGXCff8q8bDhnC55Tu8t9dCfHRGo8m.hZ+4bu0bB950IL
8hhH0AQdsWU+qq8GXqGOiEBxD4bvAGLx3RE19oBZryqyiMPnap7XA3qAhMAq
CDA27XgdS+HjrzGOMJMVqi+Ykg1+svDVf1UGfP6SfNt8f6hsmzxmQdzKmbHi
Rynp8igTAeWdX6HztWXf8VRDqPDmIO1qCdnpiz6fGZSbTDK6v5eSii1xKCua
DQOA4ppoSFtt0T0ZRFTSUkFLnlLKl7UPQtFUQJ4KY34MaEoDzbt2pHoS72tz
ZRoHNiqH3.JhXTE4n.ileJxryZTETjiYWLoJZCMfjnFURDURSZ1L2PkRc6Xd
MAUQSlK2MRk4tpCRvjbRkBAfTyqInJZxfycpjbBYVebUpp7j42KtjTIU.ZFp
IrYC6NY35gSlMrCoRp.DY9UAN1rbBqRkSXytDL1U0xBPyLM4OAMU9Our7ez3
5PsB
-----------end_max5_patcher-----------

Pitched Synthesis with Noise Driven Feedback Delay Lines

The other day I was listening to Autechre’s Quaristice and thinking about my first perceptions of the album.  I remember thinking “This album has a very physical-modeling feel to it”.   I had forgotten about this, and recently I was playing around with Karplus-Strong string-modelling synthesis, after reading about the methodology behind it.

The opening of Autechre’s 90101-5l-l has these gorgeous rich resonating tones, which I accidentally seemed to be able to replicate with a Karplus-Strong model.  It sounds like they are capturing an incoming audio stream and then thrusting a noise-grain into a Karplus-Strong algorithm. (There’s more going on in there, but this seems to be the basis.)

Autechre – 90101-5l-l (Quaristsice)

Example of Noise-Grain Feedback using Karplus Strong Plucked String Synthesis Methods

In simple terms, Karplus-Strong tones are generated by taking a burst of noise, and creating a feedback loop that repeatedly lowpass-filters the noise and scales the volume back slightly.

The combination of each of these factors plays a part in the resulting tone:

  • The brighter the burst of noise, the brighter the attack and note.
  • The loop time determines the pitch (shorter = higher, longer = lower).
  • The amount of filtering determines the length and life of the note.

Kind of like concatenative synthesis with a recursive function applied to each grain.
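
To make that concrete, here’s the whole loop in a few lines of plain JavaScript (an offline sketch rather than a Max patch; the two-point average stands in for the low-pass filter, and the frequency is assumed to be well below half the sample rate):

// Karplus-Strong in miniature: a delay line seeded with noise,
// recirculated through an averaging filter and a decay multiplier.
function pluck(freq, seconds, sr, decay) {
	var N = Math.round(sr / freq); // delay-line length in samples sets the pitch
	var line = [];
	for (var i = 0; i < N; i++) line.push(Math.random() * 2 - 1); // noise burst
	var out = [];
	for (var n = 0; n < sr * seconds; n++) {
		var x = line.shift();
		out.push(x);
		line.push(decay * 0.5 * (x + line[0])); // low-pass + amplitude scaling
	}
	return out; // raw samples in the range -1..1
}

// eg. pluck(440, 2, 44100, 0.996)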

So, for fun I was trying to make a synth that would generate tones using this noise-based feedback model using Max.

In order to generate specific tones, I was basing my pitches on the relationship:

F0 = FS / L

That is, the fundamental frequency (F0) is determined by the sample rate (FS) divided by the length of the delay line in samples (L, ie. the Z^-L delay term).  Put simply, the number of times the delay line is read through per second is its frequency in Hz.

My first versions suffered from a very strange thing where the pitch accuracy was shockingly bad.  I was getting weird results where tones were getting more out of tune the higher they went.

The first problem was that I was using a [delay~] delay line with integer sample-length delays.  Who would have thought that something ‘sample accurate’ could be a bad thing?

I realised later that it was bad because the generated tones are inversely proportional to the delay length, and the delay values were being truncated to whole-sample lengths.  When the delay was long, the pitch was moderately accurate.  However, as the sample length got shorter, the pitch went way out.

The reason why this is bad is quite interesting.  Consider that we are working at a sampling rate of 44100 Hz.

What tone will be generated if we use a delay line consisting of 5 samples?

44100 / 5 = 8820Hz

What about a delay line of 4 samples?

44100 / 4 = 11025Hz

This is a big problem.  The higher the frequency we are trying to generate, the fewer discrete tones we have available.
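
The quantisation is easy to see in a couple of lines (a JavaScript sketch; I’m rounding where [delay~] apparently truncated, but the effect is the same):

// The pitch you actually get when the delay length is forced to a
// whole number of samples.
function integerDelayPitch(target, sr) {
	var samples = Math.round(sr / target); // delay line snapped to whole samples
	return sr / samples; // pitch actually produced
}

// integerDelayPitch(10000, 44100) -> 11025 Hz (4 samples): ~169 cents sharp
// integerDelayPitch(9000, 44100)  -> 8820 Hz  (5 samples): ~35 cents flat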

At first, this appeared to be a sample rate issue.  If we increase the sample rate we should be able to get finer tone control.  However, doubling it means that we can get only one new pitch between the two we got at 44100.

88200 / 9 = 9800Hz

This is still not that good.  My solution was to make an abstraction and try some ridiculous upsampling with [poly~].  With this, understandably, came a hit to my CPU.  The solution was still not all that good, as only a single new pitch could be attained between the previously possible tones each time the sampling rate was doubled.

This had been puzzling me for about a week until I decided to drop [delay~] for [tapin~].  Amazingly,  [tapin~] allows fractional sample length delays, as it does subsample interpolation when given a signal as its delay time.

The following is an example of the nature of the low-pass roll-off effect.  The clip starts with a fairly moderate low-pass filter (centred around 1500Hz) that is slowly opened up until it is barely filtering the delay line.  As the filter starts out strongly attenuating, the note is quickly damped, and demonstrates a transition from:

Muted > Damped > Koto > Plucked Bass > Slap Bass > Harpsichord > Non-Realistic Model of a Resonating String.
[Examples of Simple Plucked String Synthesis with a Relatively Bright Pink Noise Excitation.]

A simple version of the patch can be downloaded here: Karplus-Strong example