Generative Notation and Score with Tone.js

The concept: Notation System for Generative Music

For a long time, I have been trying to find a bridge between generative music, which usually relies on synthesized sound, and acoustic (or amplified) music playing and composition.

It felt like creating a notation system for generative music could be the right place to start.

MIDI

The effect that the invention of the MIDI protocol had on the evolution of generative (synthesized) music is well known and far beyond the scope of this blog post.

Since the MIDI protocol includes information about the notes that should be played and the way these notes should be played, it served as an inspiration behind my notation system for generative music.

My Generative Music Notation System

The notation system includes two main components:

Legend

The legend is a sheet that defines all possible notes (aka “objects”) that could be played during the generative score. Each note includes the following properties:

  • Color – A unique hex color code that will be used for visualization purposes to represent the note.
  • Frequency – The frequency of the note. Can be expressed in Hz (e.g. 440) or as a note name (e.g. A#3).
  • Amplitude – Volume / velocity, in a range between 0 and 1 (loudest).
  • Duration – The duration of the note, including an envelope if needed.
  • Loops – The number of times the note should be repeated in a row on the score.
  • Connected_notes – This is the main difference from the MIDI protocol. The connected_notes property holds a list of notes that should be played with or after this note. Each item on the list, which refers to a connected note, should include the following properties (see the sketch after this list):
    • Color/index number of the connected note according to the legend.
    • The time on which the connected note should be initiated, including maximum and minimum values for silence after the initial timestamp (e.g. if the connected note should be played after the original note, the time will be <the_original_note_duration>+<silence_if_any>).
    • Probability value that will represent the chances that the connected note will be played. The probability values of all connected notes together should not exceed 1 (== 100% chance).
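
To make this concrete, a single legend entry could be written as a JavaScript object like the sketch below (the values are invented for illustration; the demo code further down flattens each connected note into a plain array):

{
  color: '7D82B8',        // unique hex color used for visualization
  frequency: 'A#3',       // or 466.16 (Hz)
  amplitude: 0.8,         // 0 (silent) to 1 (loudest)
  duration: 1.5,          // seconds (an envelope could be attached here)
  loops: 0,               // number of times the note repeats in a row
  connected_notes: [
    {
      index: 2,           // color / index of the connected note in the legend
      time: 1.5,          // e.g. the original note's duration, for "play after"
      min_silence: 0.1,   // minimum silence after that timestamp
      max_silence: 0.4,   // maximum silence after that timestamp
      probability: 0.9    // 90% chance that this connected note will be played
    }
  ]
}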
Generative Music Notation: Legend
Generative Music Notation: Potential Score

What’s Missing?

Two major properties are missing from the note objects:

  • Instrument (or timbre) – The note object is a set of instructions that could be applied by any instrument. Since I believe that the process of generating music will include the usage of computers (or digital devices), the score can be played with a variety of instruments. The decision about the sound of the piece is left in the hands of the performer.
  • Timing – Again, since the note object is a set of instructions, these instructions can be initiated and applied anytime during the score, by the performer or by the score itself. The decisions about the timing also remain in the hands of the performer. The only timed notes are the connected notes, which hold instructions specifying whether the note should be initiated with the original note, after it, during it, etc.

Example

For example, if we use the legend above and start the score with the first two notes (7D82B8 & B7E3CC), we get the following result –

Demo

Using Tone.js, I was able to experiment with generating music based on the legend and score shown above.

The project can be seen here – http://www.projects.drorayalon.com/flickering/.

The current limitations of this demo are:

  • No instrumentation: All notes are played using the same instrument.
  • No dynamics: One of the most likable elements of a music performance is the dynamics and tension the performer creates while playing the piece. The current implementation doesn’t support any dynamics :\
  • No probability: The current implementation presents a linear and predictable score. Each note has only one connected note, and no code was written to support the probability factor, which would utilize the notation system to its full potential and (in my opinion) make this generative music more interesting (see the sketch after this list).
  • Low-tech visualization: The notation system I described above sets up the foundation for a readable visual representation of the score. This visual representation has not been implemented yet.
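
For reference, supporting the probability factor would only require comparing the probability value against a random draw before scheduling the connected note. A minimal sketch, assuming the flat connected-note array used in the demo code below (its last element is the probability):

// inside noteObject.trigger(), before scheduling the connected note
var probability = connected[4];           // last element of the connected-note array
if (Math.random() < probability) {
  Tone.Transport.schedule(function(time){
    noteArray[nextIndex].trigger(time);
  }, nextTime);
}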

Some Code. Why Not

This is the code I’m using to run the demo shown above –

//-----------------------------
// play / stop procedures
//-----------------------------
var playState = false;

$("body").click(function() {
  if (playState === false) {
    play();
  } else {
    stop();
  }
});

function play(){
  playState = true;
  $("#click").html("i told you. it is now flickering really badclick anywhere to stop");
  console.log('playing...');
  Tone.Transport.schedule(function(time){
  	noteArray[0].trigger(time);
  }, 0.1);
  Tone.Transport.schedule(function(time){
  	noteArray[1].trigger(time);
  }, 0.4);

  // Tone.Transport.loopEnd = '1m';
  // Tone.Transport.loop = true;

  Tone.Transport.start('+0.1');
  setTimeout(backColorSignal, 100);
}

function stop(){
  playState = false;
  $("#click").html("it is probably still flicker really bad, but it will stop eventuallyclick anywhere to keep it going");
  console.log('stopping...!');
  console.log(Tone.Transport.seconds);
  Tone.Transport.stop();
  Tone.Transport.cancel(0);
}

//-----------------------------
// creating an array of note objects (noteArray)
//-----------------------------

// array of manually added notes
var noteArray = [];

// note constructor
function noteObject(index, color, frequency, amplitude, duration, loops, connected_notes_arry) {
  this.index = index;
  this.color = color;
  this.frequency = frequency;
  this.amplitude = amplitude;
  this.duration = duration;
  this.loops = loops;
  this.connected_notes = connected_notes_arry;
  this.trigger = function(time, index=this.index, frequency=this.frequency, duration=this.duration, connected=this.connected_notes){
    // console.log('time: ' + time);
    // console.log('index: ' + index);
    console.log('');
    console.log('------------');
    console.log('it is ' + Tone.Transport.seconds);
    console.log('playing: ' + index);
    console.log('frequency: ' + frequency);
    console.log('duration: ' + duration);

  	synthArray[index].triggerAttackRelease(frequency, duration, time);

    if (connected !== null) {
      // schedule the connected note: current transport time + the connected note's
      // time offset + a random silence between the min and max values
      var nextIndex = connected[0];
      var nextTime = 0.01 + Tone.Transport.seconds + connected[1] + parseFloat((Math.random() * (connected[2] - connected[3]) + connected[3]).toFixed(4));
      console.log('generated: ' + nextIndex);
      console.log('at: ' + nextTime);
      Tone.Transport.schedule(function(time){
        noteArray[nextIndex].trigger(time);
      }, nextTime);
    }
  };
}

// starting notes
noteArray.push(new noteObject(0, '7D82B8', 'c3', 1, 1.520*5, 0, [2,1.520*5,0.020*5,0.020*5,0.9]));
noteArray.push(new noteObject(1, 'B7E3CC', 'e2', 1, 6.880*5, 0, null));

// the rest of the notes
noteArray.push(new noteObject(2, 'C4FFB2', 'b2', 1, 1.680*5, 0, [3,1.520*5,0.40,0.80,1]));
noteArray.push(new noteObject(3, 'D6F7A3', 'c#2', 1, 3.640*5, 0, [4,0,0.8,1,1]));
noteArray.push(new noteObject(4, 'ADD66D', 'b2', 1, 0.650*10, 0, [5,0.650*10,0.2,0.2,1]));
noteArray.push(new noteObject(5, 'A4FF7B', 'a2', 1, 1.800*5, 0, [6,0,0,0,1]));
noteArray.push(new noteObject(6, '7BFFD2', 'f#2', 0.2, 1.800*5, 0, [0, 1.800*5, 1, 2, 1]));


//-----------------------------
// creating an array of synth objects (synthArray), based on note objects (noteArray)
//-----------------------------

var synthArray = [];

for (var i=0;i<noteArray.length;i++){
  var options = {
    vibratoAmount:1,
    vibratoRate:5,
    harmonicity:4,
    voice0:{
      volume:-30,
      portamento:0,
      oscillator:{
        type:"sine"
      },
      filterEnvelope:{
        attack:0.01,
        decay:0,
        sustain:0.5,
        release:1,
      },
      envelope:{
        attack:0.1,
        decay:0,
        sustain:0.5,
        release:1,
      },
    },
  voice1:{
    volume:-30,
    portamento:0,
    oscillator:{
      type:"sine"
    },
    filterEnvelope:{
      attack:0.01,
      decay:0,
      sustain:1,
      release:0.5,
    },
    envelope:{
      attack:0.01,
      decay:0,
      sustain:0.5,
      release:1,
    }
  }
  };
  synthArray.push(new Tone.DuoSynth(options).toMaster());
}

//-----------------------------
// low-tech visualization
//-----------------------------
var b = new Tone.Meter("signal");
synthArray[1].connect(b);
// synthArray[2].connect(b);

function backColorSignal(){
  if (b.value === 0){
    setTimeout(backColorBlue, 100);
  } else {
    var color = "rgba(0, 0, 255," + b.value + ")";
    $("html").css("background-color", color);
    setTimeout(backColorSignal, 100);
    // console.log('b.value: ' + b.value + " " + color);
  }
}

function backColorBlue(){
  var color = "rgba(0, 0, 255,1)";
  $("html").css("background-color", color);
  setTimeout(backColorSignal, 100);
}

 

MANIFESTO

Even though I’ve spent most of my time up until now in creating new content — from short stories, articles, plays, songs, and drawings, to digital experiences and commercial products — I’ve never sat down to think about my manifesto. So now I did, and it felt just right.

At first, I felt that writing my manifesto could be a process of reinventing my creative self. As it turned out, writing my manifesto was all about clearing the dust off my original intentions and creative needs. It felt like a return to my inner creative studio, where all my inspirations are still hanging on the wall, and the stereo is still playing the great old CDs.

I guess that my present-day manifesto could be summarized in a single sentence — “Keep on seeking your own voice that will carry your words and your ideas across mediums.”

Having said that, here is a more detailed version of what I’ll try to achieve during this semester, and hopefully, forever, as a list of creative principles:

  • Aesthetics – Aesthetics could take many forms. It could be seen as a visual concept or heard as an idea. Aesthetics could be felt in the work process or received as an inspiration. To me, aesthetics is an invitation to look beyond it. It is like a clear glass of red wine that makes a person focus on the red tones of the wine, not on the glass. It is like a magic shower that makes you feel mentally clean after experiencing it. I would like my works to be aesthetic in a way that would invite a viewer or a listener and would influence his / her identity and self-esteem.
  • Surprising, and sometimes unpredictable – The expectation is what leads the viewer / listener to pay attention to my work during its presentation. To keep the work ‘alive’ with the viewer / listener after its presentation, the work should be surprising. I want my work not only to do what is definitely expected from it, but also what is beyond any expectations.
  • Emotional and humorous – To me, humor is an opportunity to cross the line and to experiment with new shapes and forms. I would like my work not only to be light and humorous, but also emotional, expressive and satiric.
  • Generative, model driven – I love patterns that change repeatedly. I want to unlock the model or the system behind my work, and to utilize it to its maximum potential and beyond.
  • Open the imagination – I expect my work to present a solution, but also to shed light on new problems and possible further development.
  • Calmness and balance – I want my work to form from my inner self, my thoughts, and my own imagination. All my inspirations and previous experiences should take the form of calmness and balance during the creation of the work. This inner balance should be present in the work itself.
  • Clarity, honesty and humbleness – I would like my work to come from an honest and humble place. It should be clear and transparent. It doesn’t mean that it has to be an open source project, but its content must be understandable. The viewer / listener should be able to know what the work is doing, and possibly, how it does it and what was the process of making it.
  • Crazy storytelling – As a result of the above principles, I would want my work to tell a crazily beautiful story, in a beautifully crazy way. Such story might only exist within the context of the work, but could serve as an inspiration for the work to come.

NOMNOM 2: The Video Machine – The Physical Computing Aspects of the Project

 

NOMNOM: The Video Machine

Intent

The purpose of this project was to allow users to play music (or a DJ set) using videos from YouTube.

NOMNOM is an advanced version of The Video Machine presented for the mid-term. It controls the playback of videos presented on a web browser.
By pressing a button on the controller, the corresponding video is played on the screen and heard through the speakers. The videos are played in sync with one another, and only the videos that are playing are heard.

In the new version, The Video Machine controller offers four functions that allow making changes to the way the videos are played:

  • Repetition – Affects the number of times a video is played during a single loop (1-4 times).
  • Volume – Affects the volume of the selected video.
  • Speed – Changes the speed of the selected video.
  • Trim – Trims the length of the selected video.

The first prototype of the new version in action –

NomNom: The Video Machine

Main Objectives

The goal was to make a few critical improvements over the previous version of the product. After brainstorming possible improvements and reviewing the feedback we had received, the following objectives were chosen:

  • To keep it simple, while introducing more functionality – One of the major strengths of the original version was its simplicity. We were able to achieve a design that allowed a simple, self-explanatory interaction that was enjoyable for both experienced DJs and users with zero experience.
    For the new version, we added new features such as a consistent and predictable playback sequence, an automatic beat-sync between the played videos, the ability to change the number of times a video is played over a single loop, and the ability to change the playback properties of each video while it is being played.
    These new features allow the user to achieve great results more easily, using the same simple controls of the old version. A total of 6 new features and improvements were added to the product while adding only a single rotary switch to the previous layout.

  • To make it feel solid – The first impression the user has of a product comes from looking at it. NOMNOM was built from solid materials in order to make the user feel free to physically interact with it. The solidity of the controls freed users from thinking about the physical interaction and let them concentrate on the content (the video and the sound).

NOMNOM: The new version

  • To smoothen the controls – An enjoyable interaction cannot be achieved only by providing a fast and easy way to complete a task. The time the user spends using the product should be enjoyable as well.
    In order to build a smooth and fun tangible interaction, we researched different potentiometers, buttons, and switches. Eventually, we chose the controls that provided the best ‘feel’ and were the most accurate.

  • To take further development into consideration – In most cases, the ability to innovate comes from a deep understanding of the way a certain system works. To allow further development, we needed to build the product in a way that would be easy to learn and to understand, both for us and for future contributors. Therefore, an effort was made to design and build the inner parts of the box in a way that would be very understandable for anyone who opens it.
    The structured design of the internal electronic parts not only brought clarity during the debugging stages, but also allowed fast analysis and understanding of the implications of any change or addition.
NOMNOM: Designing the inner structure


NOMNOM: In the making of

Decision-Making and Challenges

Design Overview

Leaning on the design of the previous version, we made a few improvements to our electric circuits, and a few major improvements to our physical interface design.

NOMNOM: Schematic

Doing More With the Same Buttons

One of the major limitations of the first version was that in order to change the playback mode (properties / attributes) of a video, the user had to stop the playback, make the changes using the knobs, and start the playback again. Therefore, one of the most important features of the new version was the ability to change the playback mode (properties / attributes) of a single video while the video is being played.

To avoid adding a series of knobs for each one of the videos, the existing buttons are used for two functions:

NOMNOM: A single press to start / stop
NOMNOM: Press & hold to make changes to the video playback

The component we used for the buttons is the Adafruit Trellis, a single PCB that connects 16 push buttons.

The Trellis PCB and its Arduino library support two modes:

MOMENTARY – A mode in which a button press event is detected only while the button is being held down.
LATCHING – A mode in which a button press event toggles the state of the button (e.g. from ON to OFF).

NOMNOM: One of the challenges was to make the Trellis PCB support both of its different modes at the same time

One problem was that, by default, the Trellis can operate in only one of these modes at a time.
Another challenge was to find an efficient way (in terms of performance) to read the button states, so that the controller would be very responsive to the user’s actions — the changes on the screen, and on the LEDs on the controller, should be immediate.

After 3-4 weeks of research into the way the Trellis PCB works and coding different experiments, the following Arduino code allowed us to support the two modes simultaneously.


#include <Wire.h>
#include "Adafruit_Trellis.h"

Adafruit_Trellis matrix0 = Adafruit_Trellis();
Adafruit_TrellisSet trellis =  Adafruit_TrellisSet(&matrix0);

#define NUMTRELLIS 1
#define numKeys (NUMTRELLIS * 16)
#define INTPIN A2

int LEDstatus[16] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
int blinkStatus = 1;
int blinkTime = 0;
int buttonPress[16] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
int oldStatus[16] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

void setup() {
  Serial.begin(9600);
  pinMode(INTPIN, INPUT);
  pinMode(5, INPUT);
  pinMode(6, INPUT);
  pinMode(7, INPUT);
  pinMode(8, INPUT);
  digitalWrite(INTPIN, HIGH);

  trellis.begin(0x70);  // only one trellis is connected

  // light up all the LEDs in order
   for (uint8_t i = 0; i < numKeys; i++) {
     trellis.setLED(i);
     trellis.writeDisplay();
     delay(50);
   }

  // then turn them off
  for (uint8_t i = 0; i < numKeys; i++) {
    trellis.clrLED(i);
    trellis.writeDisplay();
    delay(50);
  }
  while (Serial.available() <= 0) {
    Serial.println("hello"); // send a starting message
    delay(300);              // wait 1/3 second
  }
}

void loop() {
  delay(80); // a short delay is required for the Trellis, don't remove me!


  /*************************************
  // SENDING DATA TO P5.JS
  *************************************/
  if (Serial.available() > 0) {

      // reading serial from p5.js
      int incoming = Serial.read();

      // print current status
      for (int i = 0; i < 16; i++) {
        Serial.print(LEDstatus[i]);
        Serial.print(",");
      }

      // step knob
      int pot1Value = 0;
      if (digitalRead(5) == HIGH) {
        pot1Value = 4;
      } else if (digitalRead(6) == HIGH) {
        pot1Value = 3;
      } else if (digitalRead(7) == HIGH) {
        pot1Value = 2;
      } else if (digitalRead(8) == HIGH) {
        pot1Value = 1;
      }
      Serial.print(pot1Value);
      Serial.print(",");

      // volume knob
      int pot2Value = analogRead(A1);
      int pot2ValueMapped = map(pot2Value, 0, 1020, 0, 100);
      Serial.print(pot2ValueMapped);
      Serial.print(",");

      // speed knob
      int pot3Value = analogRead(A0);
      int pot3ValueMapped = map(pot3Value, 0, 1020, 0, 100);
      Serial.print(pot3ValueMapped);
      Serial.print(",");

      // cut knob
      int pot4Value = analogRead(A3);
      int pot4ValueMapped = map(pot4Value, 0, 1020, 0, 100);
      Serial.print(pot4ValueMapped);
      Serial.print(",");

      // blink data
      Serial.print(blinkTime);

      Serial.println("");
  }

  /*************************************************
  // CHANGING BUTTON STATES BASED ON BUTTON PRESSES
  **************************************************/
  blinkTime = blinkTime + 1;
  if (blinkTime == 5) {
    blinkTime = 0;
  }

  trellis.readSwitches();
  for (uint8_t n = 0; n < numKeys; n++) {

    // state 3 = the button is currently being held down ("being pressed")
    if (trellis.justPressed(n)) {
      LEDstatus[n] = 3;

      continue;
    }

      // while the button is held down, count how long it has been pressed and blink its LED
      if (LEDstatus[n] == 3) {
        buttonPress[n]++;
        if (blinkTime >= 4) {
          if (trellis.isLED(n)) {
            trellis.clrLED(n);
            trellis.writeDisplay();
            } else {
              trellis.setLED(n);
              trellis.writeDisplay();
            }
        }
      }

    // on release: a long press (held for more than 8 loop cycles) leaves the button ON,
    // while a short press toggles it between ON (1) and OFF (0)
    if (trellis.justReleased(n)) {
      if (buttonPress[n] > 8) {
        LEDstatus[n] = 1;
        oldStatus[n] = 1;
        buttonPress[n] = 0;
        trellis.setLED(n);
        trellis.writeDisplay();
      } else {
        buttonPress[n] = 0;
        if (oldStatus[n] == 1) {
          LEDstatus[n] = 0;
          oldStatus[n] = 0;
          trellis.clrLED(n);
          trellis.writeDisplay();
        } else {
          LEDstatus[n] = 1;
          oldStatus[n] = 1;
          trellis.setLED(n);
          trellis.writeDisplay();
        }
      }
    }
  }
}


At first glance, this code looks simple, but it includes a fast and efficient protocol to read the different states (“ON”, “OFF”, and “Being pressed”, a state that was used to make changes to the video playback) from the Trellis board using a single read command (trellis.readSwitches()), and to communicate them to the web browser using ‘handshaking’.
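
In practice, the ‘handshaking’ is simple: the Arduino sends a new frame of data only after it hears back from the browser, and the browser answers every parsed frame with a single byte. On the browser side it looks roughly like this (a sketch, not the exact project code; the full parser is shown in the post about the programming behind NOMNOM):

// called by the p5.serialport library whenever new serial data arrives
function serialEvent() {
  var data = serial.readLine();   // one comma-separated frame sent by the Arduino
  if (data.length > 0) {
    parseData(data);              // update button states, knobs, and the screen
  }
}

function parseData(data) {
  // ... parse the 16 button states and the 4 knob values ...
  serial.write(1);                // handshake: tell the Arduino to send the next frame
}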

More about the programming behind NOMNOM can be found on this blog post, and on the project’s GitHub repository.

Finding the Right Potentiometers

As satisfying as the Trellis board was for our push buttons, the movement of the potentiometers needed an upgrade. We ran long research and multiple experiments with different types of potentiometers and knobs (mostly from Adafruit and DigiKey). It turned out that the knobs and potentiometers offered by Mammoth Electronics were the smoothest to turn, were built from high-quality materials, and fit best with our design vision.

Fabrication

One of the major objectives for the new version was to make the physical interface feel as stable as the software that supports it. We wanted to build the box from more solid materials, which do not feel breakable like wood or delicate like thin acrylic. Therefore, a solid metal enclosure was used to add a sense of strength and stability to the overall interaction.

To avoid any ‘shaky’ feeling when interacting with the product, the design of the drilled holes on the enclosure had to be very accurate and tight to the size of the electronic components.

NOMNOM: Design sketch before the drilling process

User Testing

After building the first fully functional prototype, a user testing phase shed some light on the strengths and weaknesses of the product.

Luckily, the physical interaction worked well and was largely understood by the users. A few changes were made to the terminology – the term “Steps”, which described the number of times a video will be played within a single loop, was changed to “Repetitions”, and the term “Cut”, which described the ability to trim the video, was changed to “Trim”.

The rest of the changes, based on the users’ feedback, were made to the graphical user interface, which now includes much simpler and more straightforward indications for each video’s status.

Presenting the Project to a New Audience

As part of the process, I presented the product in front of a new audience, outside of the ITP community. This experience allowed us to get feedback from an audience that is closer to our target audience, and helped us to be more prepared for the (intense) presentation at the ITP Winter Show.


The ITP Winter Show

NOMNOM: The Video Machine was presented at the ITP Winter Show 2016.

NOMNOM: The Video Machine @ ITP Winter Show 2016

NOMNOM 2: The Video Machine – The Programming Behind the Project

Credit: This project was developed together with Mint. Thank you :))

For my ICM final, I worked on an improved version of my mid-term pcomp project.

This time the computational challenges were even greater.
Here is the outcome after long weeks of intensive coding –

NomNom: The Video Machine

NOMNOM’s github repository can be found here – https://github.com/dodiku/the_video_machine_v2

Synching the videos

As a conclusion from the mid-term project, we wanted to give users the ability to play cohesive music. In order to do that, we knew we had to find a way to make sure that all the videos are played in sync (automatically).

There are many ways to make sure media is played synchronously, but none of them deal with video. To work around that, we repurposed two objects from the p5.js sound library — Phrase and Part.
We used these objects to handle our playback as a loop divided into bars. Since a callback function can be called at any point in the loop, we can use them to time our play and stop functions (and many others) based on the user’s actions.


/*********************************************
SETUP FUNCTION (P5.JS)
*********************************************/
function setup() {
  noCanvas();

  // setting up serial communication
  serial = new p5.SerialPort();
  serial.on('connected', serverConnected);
  serial.on('open', portOpen);
  serial.on('data', serialEvent);
  serial.on('error', serialError);
  serial.list();
  serial.open(portName);

  // creating a new 'part' object (http://p5js.org/reference/#/p5.Part)
  allVideosPart = new p5.Part();
  allVideosPart.setBPM(56.5);

  // adding general phrase (http://p5js.org/reference/#/p5.Phrase) to the 'part'
  var generalSequence = [1,0,0,0, 0,0,0,0, 1,0,0,0, 0,0,0,0, 1,0,0,0, 0,0,0,0, 1,0,0,0, 0,0,0,0];
  generalPhrase = new p5.Phrase('general', countSteps, generalSequence);
  allVideosPart.addPhrase(generalPhrase);

  for (var i = 0; i<16; i++){
    allVideosPart.addPhrase(new p5.Phrase(i, videoSteps, [0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0]));
  }

  // console.log(allVideosPart);
  allVideosPart.loop();

}

In the setup function, we initiate the Part, one Phrase per video, and a general Phrase that is used as a clock.

The ‘countSteps’ callback function is used to store the current step in a global variable, and the ‘videoSteps’ callback function is used to play and stop each video at the right time.

First success with the beat-sync feature – 

Improving the UI

We really wanted to make it easier for users to understand what is going on on the screen, and to provide a better sense of control over the videos.

In order to achieve that, we used the NexusUI JS library and added four graphical elements to every video, each of which indicates a different property of the video (number of repetitions, volume, speed, and trim).

The graphical elements are shown to the user only when the video is being played.

Also, we added a grayscale CSS filter to videos that are not being played. This way, it is easier for the user to focus on the videos that are playing and making sound.
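
That effect is just a CSS filter toggled from JavaScript. Roughly, inside the parsing loop shown below (a sketch; the #video1…#video16 selector convention is the one the parser uses):

var vidID = "#video" + (i + 1);
if (videos[i].status === 0) {
  $(vidID).css('filter', 'grayscale(100%)');   // videos that are not playing turn gray
} else {
  $(vidID).css('filter', 'none');              // playing videos keep their color
}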

Built to perform

While designing the technical architecture for the project, I faced many limitations, mostly because of the slow nature of the ASCII serial communication protocol. Therefore, I had to develop a very efficient internal communication protocol to compensate for the delay we had when pressing the buttons on the box. That was the only way to achieve a fast-responding controller that changes the video states on the screen immediately.

This was the first time I was required to write efficient code (and not just for the fun of it). After 2 weeks of re-writing the code, shaving off a few milliseconds every time, I came up with the following lines:

Reading data from controller (Arduino side) –


trellis.readSwitches();
for (uint8_t n = 0; n < numKeys; n++) {
  if (trellis.justPressed(n)) {
    LEDstatus[n] = 3;
    continue;
  }
    
    if (LEDstatus[n] == 3) {
        buttonPress[n]++;
        if (blinkTime >= 4) {
          if (trellis.isLED(n)) {
            trellis.clrLED(n);
            trellis.writeDisplay();
            } else {
              trellis.setLED(n);
              trellis.writeDisplay();
            }
        }
      }

    if (trellis.justReleased(n)) {
      if (buttonPress[n] > 8) {
        LEDstatus[n] = 1;
        oldStatus[n] = 1;
        buttonPress[n] = 0;
        trellis.setLED(n);
        trellis.writeDisplay();
      } else {
        buttonPress[n] = 0;
        if (oldStatus[n] == 1) {
          LEDstatus[n] = 0;
          oldStatus[n] = 0;
          trellis.clrLED(n);
          trellis.writeDisplay();
        } else {
          LEDstatus[n] = 1;
          oldStatus[n] = 1;
          trellis.setLED(n);
          trellis.writeDisplay();
        }
      }
    }
}

Parsing the data on the browser (JavaScript side) – 


/*********************************************
PARSER: PARSE DATA THAT ARRIVES FROM
ARDUINO, AND APPLY CHANGES IF NEEDED
*********************************************/
function parseData(data){

  // parsing the data by ','
  var newStatus = data.split(",");

  // turning strings into integers
  for (var x = 0; x < newStatus.length; x++){
    newStatus[x] = parseInt(newStatus[x]);
  }

  // knob values arrive right after the 16 button statuses
  var vol = newStatus[17];
  var speed = newStatus[18];
  var cut = newStatus[19];

  // going over all 16 videos.
  // if the video status did not change --> continue
  for (var i = 0; i < 16; i++){
    if ((newStatus[i] !== 3) && (newStatus[i] === videos[i].status)){
      var vidID = i+1;
      vidID = "#video" + vidID;
      $(vidID).css('border-color', "rgba(177,15,46,0)");
      continue;
    }
    else {

      // getting the relevant phrase
      var phraseIndex = i;
      var updatedPhrase = allVideosPart.getPhrase(phraseIndex);

      if (newStatus[i] === 3){

        if (videos[i].originStep === null) {
          videos[i].originStep = currentStep;
        }

        changeColor(i, 1);
        showKnobs(i);

        videos[i].volume = vol;
        videos[i].cut = cut;
        videos[i].speed = speed;
        videos[i].steps = newStatus[16];
        changeKnobs(i);

        // making the video border blink
        var vidID = i+1;
        vidID = "#video" + vidID;
        if (newStatus[20] === 2) {
          if (($(vidID).css('border-color')) === "rgba(177, 15, 46, 0)"){
            $(vidID).css('border-color', "rgba(255,255,255,0.9)");
          }
          else {
            $(vidID).css('border-color', "rgba(177, 15, 46, 0)");
          }
        }


        // clearing the sequence
        for (var n=0; n<32; n++){
          updatedPhrase.sequence[n] = 0;
        }

        // applying steps changes, if any
        var stepNum = videos[i].originStep;
        // marking the active steps: spacing the repetitions evenly across the 32-step sequence
        for (var m = 0; m < videos[i].steps; m++){
          updatedPhrase.sequence[stepNum] = 1;
          stepNum = stepNum + Math.floor(32 / videos[i].steps);
          if (stepNum > 31) {
            stepNum = stepNum - 32;
          }
        }

      }

      else if (newStatus[i] === 1) {
        videos[i].status = 1;
        changeColor(i, videos[i].status);
        var vidID = i+1;
        vidID = "#video" + vidID;
        $(vidID).css('border-color', "rgba(177,15,46,0)");
      }

      else if (newStatus[i] === 0) {
        videos[i].status = 0;
        hideKnobs(i);
        changeColor(i, videos[i].status);
        var vidID = i+1;
        vidID = "#video" + vidID;
        $(vidID).css('border-color', "rgba(177,15,46,0)");

        // clearing the sequence
        for (var n=0; n<32; n++){
          updatedPhrase.sequence[n] = 0;
        }

        videos[i].originStep = null;

      }
    }
  }
  serial.write(1);
}


When I review this code now, it all seems so simple (LOL!), but this is one of the pieces of code I'm most proud of.

After looong hours of coding, we are very happy with what we achieved 🙂

The MusicSystem Explained

Background: Why do artists still compose music as 3-5 minute songs?

Ever since popular music started being broadcast by radio stations (somewhere between the 1920s and the 1930s) and consumed by listeners all over the world, artists have recorded most of their music as 3-5 minute songs.

This convention was born out of a technical limitation – the Phonograph, an early version of the record players we use today, could only play 12” vinyl records. Moreover, when an artist recorded a new album or a new single, the only way to ship it to the local or national radio station was by sending it through the US Post Office. The biggest box one could send at that time, for a reasonable price, could hold only a 12” record. As you can probably guess, a 12” vinyl record can hold a tune no longer than 5 minutes.

A century later, music production, consumption, and distribution have gone completely digital. Even though most of the music we listen to today is basically bits of data that can be manipulated using simple algorithms, we still consume it in the 3-5 minute linear format. Unlike other mediums, such as text or video, which in many cases are consumed in a non-linear form, audio is still consumed (and composed) in short linear sprints.

I believe that in the age of data, we can do more than that.

Let’s Record Data

The MusicSystem will allow musicians to record their musical ideas, and will help them turn those ideas into an endless flow of music, structured around their own core concept.

The software will capture a live recording, extract its musical features, and format these features into a reusable data structure. Using this new data structure, the software will create countless versions and combinations that all carry the essence of the original piece.

The MusicSystem will use the data that will be extracted from the original recording to compose new music. The original recording could be handled as one musical version generated from the data, or as the main piece of the entire tune.

The artist will be able to control the way the music is being interpreted and recomposed, as well as to set rules about the way the music will change according to a variety of inputs, such as sensors.

More about all of that in the sections below.

The System and Its Parts

Microphone: Recording Analog Signal

The initiator of the entire composition will be a recorded sound. An artist will play an acoustic instrument or amplify an electric instrument, and the analog sound will be captured by a microphone.

The microphone will be connected to a computer, which will run an analog-to-digital conversion to generate a digital file. The digital file will hold all the raw data about the analog recording (using this data, computers are able to play digital music files, such as .wav or .mp3 files).

Digital Audio Analysis

The purpose of The MusicSystem is to use the recorded sound as data, in order to generate new music out of it (instead of playing the recorded data itself).

The software will try to retrieve musical information from the recorded sound — from beat detection to musical structure, notes, tone, repetition, and any other feature that can be extracted from the file.

Using the Recording as a Practice Dataset

The captured and analyzed data will be fed into a neural network that will identify the relations within it. Using these relations, The SoundSystem will be able to generate a huge variety of compositions that encapsulate the same relations.

Since we are dealing with generative music, composed by a machine learning algorithm with a small dataset to practice on, the artist and the machine will have to ‘converse’ in order to help the machine focus faster on the expected results. The feedback from the artist will be used as a second dataset that will be fed into the neural network.

Just like at the beginning, at a certain point the artist will be able to decide whether the music will be recorded and saved as a (very) long file, or saved as a set of rules and configurations. These rules and configurations will be saved as a file, which will be used by The SoundSystem player to generate music based on the artist’s recordings and decisions.

Playing Infinitely

Once the data has been analyzed, The SoundSystem will generate digital sound based on this data, infinitely.

The infinite playing mode will allow the artist to experiment with different aspects of the musical piece and with the effects of changes (see below) or new recordings, and to capture snippets of the infinite loop and make them permanent (played in a loop, which means that these pieces will no longer be randomly generated).

The end user will listen to the music in that exact infinite form. The artist will be able to decide where the infinite playing starts, but not where it ends.

Controlling the New Composition

If we use the recorded sound as a data feed and not as part of the desired outcome, we start to lose the connection with the original recording. The original recording only ‘inspires’ the end result, but does not strictly dictate it.

If the captured data can be interpreted and used to generate new music, we can assume that one of the outcomes could be a tune that is identical to the original recording. The probability that the software will play the original recording will be controlled by the artist. The artist will be able to control the way the software will handle the analog recording:

  1. As a final result that will be played as recorded
  2. As data that will teach the software how to generate new music
  3. As a combination – The recorded audio will be played entirely, and the data extracted from it will be used to generate new music.

Besides that, the artist will be able to control the generative outcome in a variety of ways, such as:

  • Highlighting specific recordings – The artist will be able to decide which of the recordings will be handled as a ‘major’ recording (will have more influence on the end result), and which ones will be handled as a ‘minor’ recording.
  • Use the generative sound as an input – The artist will be able to mark a specific part of the generative music, and use it as a new input for The SoundSystem.
  • Strict vs. loose music generation – The artist will be able to decide how ‘close’ the generative music will be to the original narrative enclosed in the recorded parts.
  • Sensors – The artist will be able to use sensors to change the musical outcome. For example, when the user is walking, in a dark room, or breathing heavily, the music will be played differently.
  • 3rd party data (rules) – The artist will be able to use 3rd party APIs and datasets to affect the music. For example, the music will be heard differently on holidays, or on a night when the Phoenix Suns win a basketball game.

Recording Some More

At this point of the interaction, the cycle can start to repeat itself in order to expand the results or to focus them on a specific musical idea.

The artist will be able to record more and more analog sounds, each of which will be turned into a new dataset that will make The SoundSystem more educated about the artist’s direction.

Commits and rollbacks

To allow better communication with the musical piece, I would like the artist to feel free to make decisions, and then change them. In order to do that, I would like to implement a git-like versioning mechanism that allows the artist to ‘commit’ changes and to roll back to an older version of the musical piece.

Open Questions

This broad concept raises some unsolved questions:

Which Data Should Be Analyzed by the Software?

The software can analyze the DSP data that is generated through the analog-to-digital conversion of the recorded sound. This is the data that is used to create and play the digital music file.

On the other hand, the software can analyze the digital file itself, and retrieve information from that analysis.

It is currently unclear which data could be more relevant to create automatically generated (new) music, based on this data.

What is the Relevant Data?

Many types of data can be extracted from a digital music file. Which data is relevant for this specific project? How can this data be manipulated or iterated on so it can be used to generate data that is relevant for music creation (or music synthesis)?

How to Capture the Essence of the Original Recording?

It is critical to isolate the data that is most indicative of the ‘original essence’ of the recorded piece. The questions ‘what is an essence?’ and ‘what determines the essence of a musical piece?’ can be raised as well.

What is the relation between the software and the composition itself?

Let’s assume that we use data A, which was extracted from the original recording, to produce data B, which will be used to generate new music. Isn’t the decision to produce data B, instead of data C, a composition decision? Will the neural network make these decisions in a ‘trivial’ way, or is it the developer who is actually pulling the composition strings?

How to Create an Infinite Interaction?

In order to create an infinite piece of music, it could be assumed that an infinite creative process should be applied, or at least a procedure that allows such creative process.

The current system design will require the musician to put the instrument down in order to interact with the software.

Inspirations

There are two major inspirations to this project:

  • The Echo Nest API – A music information retrieval API that was used to extract musical features from a recorded track. The API, which is currently closed to the public, inspired the technical possibilities in the field.
  • The Infinite Jukebox, developed by Paul Lamere – this web application inspired the creative applications that are currently possible using musical data, such as the data provided by the Echo Nest API.

 

 

Final Project Proposal – The SoundSystem

Overview

Ever since popular music started being broadcast by radio stations (somewhere between the 1920s and the 1930s) and consumed by listeners all over the world, artists have recorded most of their music as 3-5 minute songs.

This convention was born out of a technical limitation – the Phonograph, an early version of the record players we use today, could only play 12” vinyl records. Moreover, when an artist recorded a new album or a new single, the only way to ship it to the local or national radio station was by sending it through the US Post Office. The biggest box one could send at that time, for a reasonable price, could hold only a 12” record. As you can probably guess, a 12” vinyl record can hold a tune no longer than 5 minutes.

A century later, music production, consumption, and distribution have gone completely digital. Even though most of the music we listen to today is basically bits of data that can be manipulated, we still consume it in the 3-5 minute linear format. Unlike other mediums, such as text or video, which in many cases are consumed in a non-linear form, audio is still consumed in short linear sprints.

I believe that in the age of data, we can do more than that.

Inspirations

The inspiration for the problem, and for the first steps of the solution, came to me from watching and interacting with The Infinite Jukebox project, built by Paul Lamere. Lamere published a blog post that tells about the process of making this project.

The Infinite Jukebox – user interface


 

Project proposal – The SoundSystem

I would want to build a system that will liberate music creators from composing their musical ideas into 3-5 minute songs.
Instead, artists will be able to focus on their musical idea and record it, and the system will generate an infinite, interactive, and dynamic piece of music, “conducted” by the artist.

Since I won’t be able to build the entire project for the ICM course final, I plan to build the first part of this project. The specifications of this part are highlighted in the text.

This is how I imagine the interaction (at least for the prototype):

Recording and analysing the recorded sound:

  • Artist will record a short snippet of audio.
  • The system will identify the tempo of the recorded snippet (beat detection).
  • The system will analyse the recorded snippet to get frequency data, timbre, etc. (and maybe in order to identify notes and / or chords?).
  • The system will suggest a rhythmic tempo to go along with the snippet.
  • The system will play the recorded snippet as an infinite loop, along with the rhythmic tempo.
  • The system will try to find new ‘loop opportunities’ within the snippet, in order to play the loop in a non-linear way.
  • The artist will be able to record more musical snippets.
  • The artist will be able to choose which parts will be played constantly (background sounds), and which parts will be played periodically.
  • The system will suggest new and interesting combinations of the recording snippets, and play these combinations infinitely.

The listener interacts with the played tune:

  • Since the tune can be played infinitely, some controls will be given to the listener. Each and every artist will be able to configure these controls differently. For example, one can decide that the controls will include 2 knobs, one that changes the tune from ‘dark’ to ‘bright’, and another that changes the tune from ‘calm’ to ‘noisy’. The artist will decide what happens when each one of these knobs is turned.
  • For the ICM final, a generic user interface will be provided to the listener. The interface will include a visual representation of the played tune, and will allow the listener to change the rhythmic tempo.

Applying machine learning algorithms:

  • The system will try to generate new music, based on the recorded snippets, and earlier decisions by the same user. This new music will stretch the length of the recorded tune.

Modifying the system’s decisions:

  • The artist will be able to affect the system’s decisions about the looped tune, and about the new music it generates. For example, the user will be able to decide when a specific part enters, or which algorithmic rules will not be used to generate new music.

Applying sensors and automations

  • The artist will be able to set rules based on 3rd party data or sensors. For example, the tune can be played differently if it is rainy on the first day of the month, if it is currently Christmas, if it is exactly 5:55am, or if the light in the room was dimmed to certain level. These rules will apply to each tune separately.

Formatting

  • There should be a new music format that could hold the tune (or the snippets) and the data necessary for playing it correctly (see the sketch after this list). In the same way, a new player should be introduced in order to read the data and to play the tune correctly.
  • This format should allow the artist to update the tune configuration or the musical snippets at any time, after the tune was distributed to the listeners.
  • For the ICM final (and probably for the end product as well), the tune will be played in the web browser.
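
As a thought experiment, such a format could be a simple container that bundles the recorded snippets with the playback rules and the listener controls described above. A minimal sketch, with all field names invented for illustration:

{
  "title": "untitled tune",
  "snippets": [
    { "id": "guitar-1", "file": "guitar-1.wav", "role": "background" },
    { "id": "melody-1", "file": "melody-1.wav", "role": "periodic" }
  ],
  "listener_controls": ["dark-bright", "calm-noisy"],
  "rules": [
    { "when": "first day of the month && rainy", "then": "change playback" },
    { "when": "time == 5:55am", "then": "change playback" }
  ]
}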

 

Controlling video playback features

Overview

For the first time in my ITP history, I was able to combine a home assignment for ICM with a home assignment for Pcomp.

I created a video manipulation interface that can be controlled by a physical controller.
The entire project was built with Mint for our Physical Computing class mid-term.

The Video Machine – Web Interface

 

The Video Machine – Web Interface from Dror Ayalon on Vimeo.

Functionality

I used the JavaScript video API to do the following manipulations on the video playback:

  • Loop – Playing the video in an infinite loop.
  • Volume – Changing the volume of the sound.
  • Cut – Trimming the length of the video.
  • Speed – Changing the speed of the video playback.
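
Concretely, these four manipulations map onto the HTML5 video element with just a few properties and one event listener. A minimal sketch (the element id and the numeric values are only illustrative):

var video = document.getElementById('video1');

// Loop – play the video in an infinite loop
video.loop = true;

// Volume – 0.0 (silent) to 1.0 (full volume)
video.volume = 0.6;

// Speed – playback rate relative to normal speed
video.playbackRate = 1.5;

// Cut – trim the playback by jumping back to the start
// whenever the video passes a chosen end point
var cutEnd = 2.0; // seconds
video.addEventListener('timeupdate', function () {
  if (video.currentTime >= cutEnd) {
    video.currentTime = 0;
  }
});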
The Video Machine

Code

Here is the JavaScript code I used for this project –

The Video Machine

Overview

The Video Machine is a video controller, powered by an Arduino, that controls the playback of videos presented on a web browser. By pressing a button on the controller, the corresponding video is played on the screen and heard through the speakers.
Videos are being played in an infinite loop.
Only the videos that are being played, are being heard.

I was lucky enough to work on this project with the super talented Mint for our Physical Computing class mid-term.
Working with Mint not only was a great learning experience, but also a lot of fun! I hope I’ll be able to work with her again on our next project (more on that below).

The Video Machine from Dror Ayalon on Vimeo.

Many thanks to Joe Mango, our beloved resident, who assisted a lot with finding the right technologies for the project, and helped us at one critical moment when suddenly nothing worked.

The Video Machine – Split Screen from Dror Ayalon on Vimeo.

The building process

The process of building The Video Machine went through the following stages:

  • Prototyping – Once we had a broad idea about what we wanted to make, we wanted to test how hard it would be to build such an interaction, and whether the interaction felt ‘right’ to us.
  • Understanding the complications – The prototyping stage helped us understand what the possible complications of this product could be, and what its limitations might be. We analysed the possible limitations of the serial communication between the Arduino board and our laptop computer, and what types of video manipulations could easily be achieved using JavaScript.
    Understanding what’s possible helped us shape our final design and its different features.
  • Designing the architecture – Before we started to build the final product, we talked about the technical design of the product under the hood. These decisions basically defined the way the end product would operate, and the way users would interact with it.
  • Picking the technologies – To apply our technical design, we needed to find the right tools.
    For the video manipulations, we decided to use vanilla JavaScript, because of its easy-to-use video API. The biggest discussion was around the implementation of the buttons the user needs to press in order to play the videos. After some research, and brainstorming with Joe Mango, we decided to use the Adafruit Trellis. That was probably the most important decision we took, and one that made this project possible given the short amount of time we had at that point (four days).
  • Building, and making changes – We started to assemble the project and to write the needed code. While doing that, we changed our technical design a few times in order to overcome some limitations we learned about along the way. And then came the moment when everything worked smoothly.
The Video Machine – Final product

Some code

The entire code can be viewed on our GitHub repository.

Reactions

The reactions to The Video Machine were amazing. The signals started to arrive at the prototyping stage, when people constantly wanted to check it out.

When we showed the final project to people on the ITP floor, it seemed that everyone wanted to get their hands on our box.

The Video Machine

People were experimenting, listening, looking, clicking, laughing, some of them even lined up to use our product.

The Video Machine

Further work

I hope that Mint and I will be able to continue to work on this project for our final term.
I cannot wait to see the second version of The Video Machine.
I believe that the goals for the next version would be:

  • To add more functionality that will allow better and easier video/sound manipulation.
  • To make playing very easy for people with no knowledge of music or playing live music. Beat sync could be a good start. The product should allow anyone to create nice tunes.
  • To find a new way to interact with the content using the controller. This new interaction needs to be something that allows some kind of manipulation of the video or the sound that is not possible (or less convenient) using the current, typical controller interface.
  • To improve the content so all videos will be very useful for as many types of music as possible.
  • To improve the web interface to make it adjustable for different screen sizes.
The Video Machine – Controller

The ITP GitHub Hall of Fame

To practice my web API skills, I got some data from the GitHub API and used it to create a list of the most starred (== most popular) ITP-related repositories.

See it here.
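
Getting such a list boils down to a single call to GitHub’s search endpoint. The query below is only an illustration (not necessarily the exact one this page uses):

// illustrative only: fetch the most-starred repositories matching a search term
$.getJSON('https://api.github.com/search/repositories?q=itp&sort=stars&order=desc', function (result) {
  result.items.forEach(function (repo) {
    console.log(repo.full_name + ' - ' + repo.stargazers_count + ' stars');
  });
});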

The ITP GitHub Hall of Fame

The expected (unpleasant) surprise

The process of making this page wasn’t smooth at all. In fact, this project cannot be further away from my original intentions. Here’s the truth:

  • I decided to let users enter a word (as an input), and to use this word to form a Haiku poem from three different poems.
  • I found poetry.db, which at first glance looked like the perfect source for my project.
  • I started to work on the logic that will form new Haiku poems out of the text I’ll get from poetry.db.
  • The logic was partially ready, and I was eager to test it on real data.
  • Oh no! My browser is shouting at me that I have a cross-origin request error, and it is blocking my request to the API. I’m starting an endless search for a solution, going over countless Stack Overflow threads that appear to be somewhat useless in my poor situation.
  • Aha! Apparently, JSONP should be JS’s workaround for this problem. My project could be saved!
  • Oh no! poetry.db’s server does not support JSONP. My options are:
    • To setup a web server and to hope for the best.
    • To try to find a new project that has better chances of success. — Chosen.
  • A new project! This time I’ll go for an API that was built by great developers, for the average developer, something that is ‘one size fits all’, and that I can assume is stable and well maintained – GitHub’s API!

Some conclusions

GitHub’s API is well structured, fast, and reliable. I will surely recommend it to everyone.

Eventually, this process taught me (again) that when I deal with web APIs, the hassle is just part of the game. I’m sure that if I had some more time (a few extra days) I could have found a solution for the poetry.db API, and I could have completed my original project.

The cross-origin problem is one I face often, and I hope that we will be able to talk about it in class. I really feel that mastering the logic behind cross-origin workarounds is critical for web development.

(if you got to this point, feel free to scroll up and click that ‘See it here’ link again).