James Holden: On Human Timing
Making electronic music has become a precise science. But what’s missing in this perfect world of grids, clips and quantization? Often it feels like a track is lacking a certain something, but it’s hard to put your finger on it. More often than not, the answer lies in the fine art of groove and swing. It's the errors and inconsistencies that give a beat its vibrancy, and a new patch from James Holden, the Group Humanizer, can bring that much-needed human feel into your productions.
Based on research from Harvard scientists, Holden has built a Max for Live device which automatically shapes the timing of your audio and MIDI channels, injecting the organic push-pull feel you can only get from human performance. In fact, Holden has just introduced the patch into his live show, allowing his modular synthesiser to follow the shifting tempos of his live drummer. Now he’s made it available publicly to show how minute shifts of timing can turn a stale groove into something full of life and energy.
A whole lot of thought, preparation and development went into the Group Humanizer. Before you download the patch and try it out for yourself, you can read the background from James Holden himself, giving a detailed account of the theories and challenges behind turning his concept into a reality. It gets down to the complex topic of human perception and makes for a deep read for anyone interested in the finer points of groove and rhythm.
On Human Timing by James Holden
"But what made Black Sabbath Black Sabbath was the way each of them interpreted what the others were playing. Those reactions create tension – they create the band’s sound. Technology makes it easy to get everything ‘right.’ But if you rely on technology to get it right, you’re removing all of the human drama. The way most music is made today is parts are created and then played perfectly and then copied and pasted. Everything’s in time, everything’s in tune, but it’s not a performance. My goal was to get Black Sabbath back to performing together – to jamming – because they are experts at it." - Rick Rubin.
That quotation has stuck in my head ever since I read Andrew Romano’s interview with the legendary producer and former Columbia Records co-president Rick Rubin in Newsweek last year. When it was published it felt like every musician I knew was referencing it – Rubin managed to explain something about making records real that seemed to strike a nerve with everyone. I personally felt like he had perfectly expressed an essential truth – a hunch that had been growing in the back of my mind for years that the unfakeable magic of a live performance is vitally important to the enjoyment of music. But it turns out it’s not just me and Rick: the idea now seems to be corroborated by research from Harvard scientist Holger Hennig, published in the Proceedings of the National Academy of Sciences.
The Harvard scientists focussed on one aspect of musical performance – the fine (millisecond-level) details of timing when two people play together. What they found was that the timing of each individual note depends on every single note both players have already played – a minor timing hiccup near the start of a piece will continue to affect every note after it, right up to the last. And when you play a duet, every note your partner plays affects your playing, and every note you play affects your partner: a two-directional information transfer is happening.
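For readers who want to experiment, here is a minimal Python sketch of the two ingredients the model combines – long-range correlated ("fractal") noise within each player, and a mutual correction term coupling the two players together. The function names and parameter values (beta, scale, coupling) are my own illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def fractal_noise(n, beta=1.0, scale=5.0, rng=None):
    """1/f^beta Gaussian noise via spectral synthesis; scale is the
    per-beat deviation in milliseconds. An illustrative helper, not
    the paper's code."""
    rng = np.random.default_rng() if rng is None else rng
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]  # avoid dividing by zero at DC
    phases = np.exp(2j * np.pi * rng.random(len(freqs)))
    noise = np.fft.irfft(freqs ** (-beta / 2.0) * phases, n)
    return scale * noise / noise.std()

def coupled_duet(n_beats=64, period=500.0, coupling=0.3, rng=None):
    """Onset times (ms) for two 'players': each has its own fractal
    timing noise, and each nudges its next beat toward the other's
    last beat - the two-directional information transfer."""
    rng = np.random.default_rng() if rng is None else rng
    xi_a = fractal_noise(n_beats, rng=rng)
    xi_b = fractal_noise(n_beats, rng=rng)
    t_a = np.zeros(n_beats)
    t_b = np.zeros(n_beats)
    for i in range(1, n_beats):
        # correct a fraction of the previous asynchrony between players
        t_a[i] = t_a[i-1] + period + xi_a[i] - coupling * (t_a[i-1] - t_b[i-1])
        t_b[i] = t_b[i-1] + period + xi_b[i] - coupling * (t_b[i-1] - t_a[i-1])
    return t_a, t_b
```

Set coupling to zero and the two parts drift apart like separately overdubbed takes; raise it and they hold together, wandering as a pair.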
Dr Hennig’s paper also references other research suggesting this back-and-forth information transfer occurs at a deep and fundamental level. When measured in experiments, the patterns of electrical activity in the brains of duetting musicians correspond almost exactly. Some neuroscientists think that rhythm – not just in music but in movement and speech – is how we spot the 'uncanny', the unnatural, even how infants recognise other members of their own species. In short, human timing is very important.
In the thousands of years before the invention of recorded music, the only type of music people ever heard was live performance. The first few decades of recordings were broadly similar – get a group of good musicians together in a nice-sounding room and keep recording until they all get it right in the same take. But as technology advanced it became possible to record the musicians separately, overdubbing over mistakes if needed. This vastly reduced the cost of recording, and also gave rise to a new idea: that the goal of recorded music was to capture the “perfect” performance of an individual. And once digital studio equipment appeared this went even further – a bass player no longer needs to make it through a whole verse without a mistake; as long as s/he gets the bassline right just once, the producer can copy-paste it wherever needed. And in the process of rendering audio information into something that can be easily manipulated on a computer screen, music software has pinned most modern music rigidly to a constant, inflexible grid.
Thus, over the years recorded music has gradually edged further away from simply capturing a live performance to evolve into a completely different beast. If musicians aren't playing together at the same time at any point in the recording process, there can't be a two-directional information transfer between them. At best, there's a one-directional transfer of timing information from what's already on the tape to the musician overdubbing a new layer.
By way of example, the Harvard team also produced three versions of ‘Billie Jean’. In all three the scale of the random errors (i.e., the average number of milliseconds each beat could be out) was kept the same, but the scientists varied the amount of correlation between the individual errors.
The first has had completely random timing errors inserted, with no link between any previous timing error and the current one, and no link between the errors in different parts. The result sounds unmistakably unmusical and inhuman.
The second mimics a recording where each musician has recorded their contribution in a separate take along to a click track. In each part every error is linked to all the preceding ones but there is no causal link between the timing errors of the individual parts. This version sounds like it has been played by a group of inept musicians, sloppy and unconvincing.
Finally, the third version uses the model developed in the paper (known as stochastic fractal linkage) to mimic how real musicians actually play together. And although the average error size is identical across all three recordings, this version feels noticeably less sloppy. It's actually quite hard to pinpoint which notes are off, because the individual parts are moving around together, naturally.
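To make the distinction concrete, here is a rough sketch of how the three sets of per-beat deviations could be generated – same average error size, different correlation structure. It reuses the hypothetical fractal_noise() helper from the earlier sketch, and the parameter values are again illustrative:

```python
import numpy as np  # assumes fractal_noise() from the earlier sketch is in scope

def three_versions(n_beats=64, scale=5.0, coupling=0.3, rng=None):
    """Per-beat timing deviations (ms) for two parts under the three
    regimes described above. Parameter values are illustrative."""
    rng = np.random.default_rng() if rng is None else rng
    # Version 1: white noise - every error independent of every other
    v1 = rng.normal(0.0, scale, (2, n_beats))
    # Version 2: fractal noise within each part, but no link between parts
    v2 = np.stack([fractal_noise(n_beats, scale=scale, rng=rng)
                   for _ in range(2)])
    # Version 3: fractal noise within each part, plus mutual correction
    a = fractal_noise(n_beats, scale=scale, rng=rng)
    b = fractal_noise(n_beats, scale=scale, rng=rng)
    for i in range(1, n_beats):
        a[i] -= coupling * (a[i-1] - b[i-1])
        b[i] -= coupling * (b[i-1] - a[i-1])
    v3 = np.stack([a, b])
    return v1, v2, v3
```

Apply each set of deviations to the note onsets of a quantized two-part loop and the same contrast emerges: version one jitters randomly, version two wanders apart, version three moves as a unit.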
The point here is: if everything is recorded together in the same take, then quite large variations in timing are no problem – they don't sound like errors, just the natural movement of the music. But if the parts are multi-tracked, or sequenced parts are mixed with human parts, the timing errors are glaringly obvious: they sound wrong because they are unnatural, and our capacity for spotting the uncanny marks them out as unpleasant and undesirable.
As studio technology has evolved the unintended result is that the size of timing error – and the degree of rhythmic fluidity – tolerated in recordings has had to shrink; if it isn’t played (or faked) to a really tight grid it sticks out like a sore thumb. The science may not be able to prove that this is an inherently bad thing but it is clear that something has been lost along the way. Would more natural musical conversations connect better with audiences? And once all the human timing errors are removed from a piece does it still represent a meaningful musical interaction between human musicians?
And beyond the limited scope of the science – beyond just the timing and back to all the unquantifiable expressive nuances of musicians responding to one another that Rick Rubin was talking about – this thesis seems even less of a stretch; it’s downright self-evident. To me, the joy of a live show is watching those interactions, seeing something that is actually happening in the moment. And I can’t be alone in having enjoyed a band live and then been sadly disappointed that their records fail to capture the same spirit. When electronic musicians (through fear or lack of ability) reduce their live shows to performing minor embellishments while a pre-prepared wav file plays through the speakers, the sense of lifelessness in the result seems painfully obvious. And listening through your record collection, the stark difference in (intangible) feeling between any band’s jammed-out breakthrough LP and their painstakingly-constructed-in-an-expensive-studio final LP is depressingly predictable.
In my own musical work, despite coming from a computer-based music background (that being the easiest entry point for a teenager), I’ve spent years experimenting with ways to make my music sound real: playing everything I could live, as well as building chaotic systems (in software and in my modular synthesiser) to simulate the kind of expressive feedback that occurs between musicians and to generate similar levels of non-designed fine detail. But convincing timing was always a difficult thing to achieve without actual musicians being involved.
Now, using the model proposed in Holger Hennig’s research, I’ve developed a set of Max for Live devices which are able to inject realistic timing into multiple computer-generated parts, as if they were being played by musicians performing together. It can even listen to input from a real musician (for example, the jazz drummer who plays together with me in my live shows) and respond to his or her timing errors in a natural manner. This is the first time such a facility has been available for computer-sequenced music: as of now there’s no excuse for over-straightened airbrushed fakery. It’s a free download at the link below – consider it my contribution to the resistance.
Download James Holden’s Group Humanizer from MaxforLive.com.
Keep up with James Holden on Facebook and Soundcloud.