Sound to MIDI

A device that listens to incoming music, then transmits an in-tune MIDI note.

The code in this article can be found here: https://github.com/pfeiffer3000/Sound-to-MIDI

I wanted a way to have synthesizers in the DJ booth play notes that were in tune with the music being played during a live mix. This device attempts to do that, and it works reasonably well if you don’t listen too closely. 🙂 I know I could just use my own two hands to play the synths, but that sounds waaay too easy. I figured I could make it waaay harder if I slogged through programming it all, building the circuits, and 3D printing the case. In the end, I was right, but I was also wrong. The final product works pretty well, making it easier to play the synths with no hands. I’ve used it live, and it helps create a weird ambiance in the set by injecting pads and drones that belong to no one.

Originally the microcontroller used a microphone to listen to sounds in the room. But that quickly became awkward when the device was listening to itself via a feedback loop that mixed with the DJ music. Depending on the levels of the synths and music, the MIDI notes would produce some interesting results that didn’t always sound good. So I replaced the microphone with an RCA jack that listens to the main output from the DJ mixer. That way it only listens to the music while it tries to play along.

The microcontroller performs a fast Fourier transform (FFT) on the incoming audio, then focuses on the lowest, and hopefully strongest, of the frequencies. The FFT is an algorithm that breaks a sound into a series of sine waves representing the relative frequencies and amplitudes present in it; in other words, it breaks the sound into analyzable building blocks. In a lot of music, the lowest frequencies tend to be the root or tonal center. That’s not always the case, and there have been a lot of cool experiments (musical and scientific) that explore this feature of the way we prefer to absorb music. For the drum & bass music that we play on the air, the bass notes are usually the most important, and are usually the tonal center of the overall music. That turns out to be super helpful in this project.
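To make the FFT step a little more concrete, here is a minimal sketch in plain C++ (no Arduino headers, so it also compiles on a desktop) of how an FFT bin index maps back to a frequency. It assumes the settings used in the sketch further down, 256 samples at 1024 Hz. The helper names `binWidthHz`, `binToFrequency`, and `captureWindowSeconds` are made up for illustration and are not part of any FFT library.

```cpp
#include <cstddef>

// Values taken from the project's sketch; change them to match your own setup.
constexpr std::size_t kSamples = 256;          // FFT size (a power of 2)
constexpr double kSamplingFrequency = 1024.0;  // samples per second

// Hypothetical helper: width of one FFT bin in Hz.
constexpr double binWidthHz(double samplingFrequency, std::size_t samples) {
  return samplingFrequency / static_cast<double>(samples);
}

// Hypothetical helper: frequency at the start of a given bin.
constexpr double binToFrequency(std::size_t bin, double samplingFrequency,
                                std::size_t samples) {
  return static_cast<double>(bin) * binWidthHz(samplingFrequency, samples);
}

// Hypothetical helper: how long it takes to record one full FFT frame.
constexpr double captureWindowSeconds(double samplingFrequency,
                                      std::size_t samples) {
  return static_cast<double>(samples) / samplingFrequency;
}
```

With these settings each bin is 4 Hz wide, only bins below 512 Hz (half the sampling rate) carry usable information, and recording one frame takes a quarter of a second. Down in bass territory, neighboring semitones are only about 3 Hz apart (A1 is 55 Hz, A#1 is about 58.27 Hz), so the raw bin spacing is coarser than a semitone there. That is one reason the detected note only "hopefully" matches.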

After analyzing the sound, the microcontroller then uses an Arduino MIDI library to send out a single note that hopefully matches the fundamental frequency from the music. Because of nerd reasons, I often have to program things to the tempo of 175 bpm, which is our preferred tempo for drum & bass music. Specifically, I use 343 ms as a time measurement in my code because it’s the time for one beat to elapse at 175 bpm (60,000 ms / 175 ≈ 342.86 ms, rounded up). A measure takes 1372 ms, an eighth note lasts 171.5 ms, and so on in that fashion until you realize that you’re waaay too deep in the genre for rational thought, lol. It’ll be a sad day if I ever switch to playing house music… All my code will be wonked!
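For anyone curious about the "matches the fundamental frequency" part, here is a rough standalone sketch of the standard equal-temperament conversion from a frequency to a MIDI note number (A4 = 440 Hz = note 69). The project's code leans on the mtof library for this step; the function below is not that library's code, and `frequencyToMidiNote` is a hypothetical name used only for this example.

```cpp
#include <cmath>

// Hypothetical helper: nearest MIDI note number for a frequency in Hz,
// assuming equal temperament with A4 (MIDI note 69) tuned to 440 Hz.
// Returns -1 when the input does not map to a valid MIDI note (0-127).
int frequencyToMidiNote(double frequencyHz) {
  // Reject NaN, infinity, zero, and negative values up front.
  if (!std::isfinite(frequencyHz) || frequencyHz <= 0.0) {
    return -1;
  }
  // 12 semitones per octave, one octave per doubling of frequency.
  double note = 69.0 + 12.0 * std::log2(frequencyHz / 440.0);
  long rounded = std::lround(note);
  if (rounded < 0 || rounded > 127) {
    return -1;
  }
  return static_cast<int>(rounded);
}
```

As an example, 55 Hz (three octaves below A4) comes out as note 33. Returning -1 for out-of-range input is just one possible design choice; it gives the caller a way to skip sending a note when nothing useful was detected.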

The device uses a variable resistor to set the time each MIDI note should last. The knob’s analog reading is mapped to a range of 1 to 32 beats, then multiplied by 343 ms to get the total duration of each note. The device sends a “note on” MIDI message, then sends the “note off” message an eighth note (172 ms) before the end of that duration, and waits out the remainder before starting the next note. This way, if the synth is playing sounds with long attack and release times, it kind of sounds like it’s in time with the music.
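The knob-to-duration arithmetic can be sketched on its own, outside the Arduino environment. This assumes a 12-bit analog reading (0 to 4095), which is what the project's code maps from. `mapRange` is a plain C++ stand-in that follows the integer formula documented for Arduino's `map()`, and `NoteTiming` and `noteTimingFromKnob` are hypothetical names for this example only.

```cpp
// Plain C++ stand-in for Arduino's integer map() function.
long mapRange(long x, long inMin, long inMax, long outMin, long outMax) {
  return (x - inMin) * (outMax - outMin) / (inMax - inMin) + outMin;
}

// Hypothetical container for the two delays used per note.
struct NoteTiming {
  long onMs;   // time between "note on" and "note off"
  long offMs;  // gap after "note off" before the next note starts
};

// Hypothetical helper: turn a raw knob reading into note timing.
NoteTiming noteTimingFromKnob(int adcValue) {
  const long kBeatMs = 343;    // one beat at 175 bpm, rounded
  const long kEighthMs = 172;  // eighth note at 175 bpm (171.5 ms), rounded up
  long beats = mapRange(adcValue, 0, 4095, 1, 32);
  long totalMs = beats * kBeatMs;
  NoteTiming timing;
  timing.onMs = totalMs - kEighthMs;
  timing.offMs = kEighthMs;
  return timing;
}
```

A reading of 2048 (roughly the middle of the knob) works out to 16 beats, so the note is held for 5316 ms and followed by a 172 ms gap. One quirk of the integer math: the value 32 only appears when the reading is exactly 4095, so the top setting lives at the very end of the knob's travel.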

An inherent flaw in long notes is that they inevitably lag the music. If the microcontroller were fast enough, it could process and send MIDI messages faster than a listener could perceive. But I wanted this to produce drones and atmosphere, which means longer notes, lots of reverb and delay, and a sense of space. So, I just have to be judicious about when I let the automatically controlled synth notes come into the mix, like in the middle of a phrase, or at a transition to an atonal section such as a drum intro.

I tested the device using an Arduino Leonardo because of its ATmega32U4, which allows pretty easy MIDI implementation over USB. Eventually, I used an ESP32 because of the increased memory and ability to perform the FFT faster. The MIDI output is on a DIN connector on the back of the unit with data coming from one of the ESP32’s GPIO pins. It’s serial data, and the MIDI library takes care of most of the work for you.
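Since MIDI over a DIN connector is just serial bytes (31250 baud, 8 data bits, no parity, 1 stop bit), it may help to see what a note message actually looks like on the wire. This is a small illustrative sketch in plain C++; `makeNoteOn` and `makeNoteOff` are hypothetical helpers, not functions from any MIDI library.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: build a 3-byte MIDI "note on" message.
// channel is 0-15 (0 means MIDI channel 1). Note and velocity are masked
// to 7 bits so a data byte can never be mistaken for a status byte.
std::array<std::uint8_t, 3> makeNoteOn(std::uint8_t channel, std::uint8_t note,
                                       std::uint8_t velocity) {
  return {{static_cast<std::uint8_t>(0x90 | (channel & 0x0F)),
           static_cast<std::uint8_t>(note & 0x7F),
           static_cast<std::uint8_t>(velocity & 0x7F)}};
}

// Hypothetical helper: build an explicit 3-byte "note off" message
// (status 0x80) with a release velocity of 0.
std::array<std::uint8_t, 3> makeNoteOff(std::uint8_t channel,
                                        std::uint8_t note) {
  return {{static_cast<std::uint8_t>(0x80 | (channel & 0x0F)),
           static_cast<std::uint8_t>(note & 0x7F),
           static_cast<std::uint8_t>(0x00)}};
}
```

On the device, each of the three bytes would be written to the serial port in order. There are two accepted ways to end a note: an explicit note off (status 0x80), or a note on (status 0x90) with velocity 0, which receivers treat as note off. The project's code uses the second approach.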

The code below is borrowed heavily (stolen) from many sources online, and I’m kicking myself now for not documenting it better. I hadn’t planned on sharing the results, so I didn’t keep track of from whom I had stolen…I mean ‘borrowed.’ I apologize in advance for the inconsistencies in variable naming conventions — I still haven’t picked one that I stick with, despite our family adage of, “Pick one and stick with it!” Thanks to the great minds that figured out the FFT in Arduino, the MIDI libraries, and Adafruit for developing the libraries for the OLED screen!

Below is the Arduino code used in this project.

Sound Input To MIDI

/*  Listens to mic or audio line input, does FFT, turns that into MIDI output note
 *  
 *  OLED Screen pins on Uno: 
 *  SCL --> A5
 *  SDA --> A4
 *  
 *  OLED Screen pins on Leonardo:
 *  SCL --> D3 (digital pin3)
 *  SDA --> D2 (digital pin2)
 *  
 *  OLED Screen pins on ESP32:
 *  SCL --> 22  
 *  SDA --> 21
 *    
 */

#include <Arduino.h>
#include <Wire.h>
#include <Adafruit_GFX.h>
#include <Adafruit_SSD1306.h>
#include <mtof.h>
#include <HardwareSerial.h>
#include "arduinoFFT.h"

#define SCREEN_WIDTH 128 // OLED display width, in pixels
#define SCREEN_HEIGHT 64 // OLED display height, in pixels

// esp32 start TX2 for MIDI.
HardwareSerial SerialPort(2);  //if using UART2
#define RX_pin 16  
#define TX_pin 17  
#define MIC_pin 34 

#define LED_PIN 2 
#define POT_PIN 4

String start_msg = "FFT - MIDI";

int potValue = 0;

// Declaration for an SSD1306 display connected to I2C (SDA, SCL pins)
Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, -1);

// FFT stuff======================================
#define SAMPLES 256              // FFT size in samples. Must be a power of 2. Max 128 for Arduino Uno.
#define SAMPLING_FREQUENCY 1024  // Sampling rate in Hz. Per Nyquist, must be at least 2 times the highest expected frequency. We only want frequencies below ~500 Hz for long pad notes from synths.

arduinoFFT FFT = arduinoFFT();
 
unsigned int samplingPeriod;
unsigned long microSeconds;
   
double vReal[SAMPLES]; //create vector of size SAMPLES to hold real values
double vImag[SAMPLES]; //create vector of size SAMPLES to hold imaginary values
 

void setup() {
  // Various setup==============
  pinMode(LED_PIN, OUTPUT);

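  // USB serial for debug output (115200 baud is an assumed rate; match your serial monitor)
  Serial.begin(115200);

  // esp32 start TX2 for MIDI====
  SerialPort.begin(31250, SERIAL_8N1, RX_pin, TX_pin);  // MIDI baud rate, 8N1 framing, rx_pin, tx_pin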

  // FFT setup===================
  samplingPeriod = round(1000000*(1.0/SAMPLING_FREQUENCY)); //Period in microseconds   

  // OLED display setup==========
  if(!display.begin(SSD1306_SWITCHCAPVCC, 0x3c)) { // I2C address 0x3C here; some 128x64 modules use 0x3D instead
    Serial.println(F("SSD1306 allocation failed"));
    for(;;);
  }
  delay(1000);
  display.clearDisplay();
  display.setTextSize(2);
  display.setTextColor(WHITE);
  display.setCursor(0, 30);
  display.println(start_msg); 
  display.display();
  delay(1000);
}

void loop() {
  potValue = analogRead(POT_PIN);  // the pot determines the length of the note
  potValue = map(potValue, 0, 4095, 1, 32);  // maps the pot value to a range of 1-32 beats at 175bpm

  int note_total_time = potValue * 343; // 343 is one beat at 175bpm

  int note_off_time = 172; // eighth note at 175bpm is 171.5ms, rounded up to 172
  int note_on_time = note_total_time - note_off_time;
    
  double freq = FFT_calculation();

  int midi_note = int(mtof.toPitch(freq));
  
  Serial.print(freq);
  Serial.print(" -- ");
  Serial.println(midi_note);
 
  display_info(freq, midi_note, potValue);
  
  noteOn(0x90, midi_note, 0x45);
  digitalWrite(LED_PIN, HIGH);
  delay(note_on_time);
  noteOff(0x90, midi_note);
  digitalWrite(LED_PIN, LOW);
  delay(note_off_time);
}

void display_info(int value1=0, int value2=0, int value3=0){
  display.clearDisplay();
  display.setCursor(0, 0);
  display.print("Freq: ");  
  if (value1 >= 10000){   // If nothing is connected to the input pin, the freq and midi values display a really large number
    display.print("NA");
  }
  else{
    display.print(value1);
  }
  display.setCursor(0, 20);
  display.print("MIDI: ");
  if (value2 >= 10000){
    display.print("NA");
  }
  else{
    display.print(value2);
  }
  display.setCursor(0, 40);
  display.print("Time: ");
  display.print(value3);
  display.print("b");
  display.display();
}

double FFT_calculation(){
  for(int i=0; i<SAMPLES; i++) {
    microSeconds = micros();
  
    vReal[i] = analogRead(MIC_pin); // Read the quantized value from the analog input pin and store it as the real term.
    vImag[i] = 0; // The imaginary term is always 0 for real-valued audio samples

    /* Wait out the rest of the sampling period. Subtracting timestamps stays correct when micros() rolls over. */
    while(micros() - microSeconds < samplingPeriod) {
        //do nothing
    }
  }

  /*Perform FFT on samples*/
  FFT.Windowing(vReal, SAMPLES, FFT_WIN_TYP_HAMMING, FFT_FORWARD);
  FFT.Compute(vReal, vImag, SAMPLES, FFT_FORWARD);
  FFT.ComplexToMagnitude(vReal, vImag, SAMPLES);

  /*Find peak frequency and print peak*/
  double peak = FFT.MajorPeak(vReal, SAMPLES, SAMPLING_FREQUENCY);
  //Serial.println(peak);     //Print out the most dominant frequency.
  return peak;
}

void noteOn(int cmd, int pitch, int velocity) {
  // send MIDI note-on message
  SerialPort.write(cmd);
  SerialPort.write(pitch);
  SerialPort.write(velocity);
}

void noteOff(int cmd, int pitch) {
  // send MIDI note-off message; when cmd is 0x90 (note on), velocity 0 is treated as note off
  int velocity = 0x00;
  SerialPort.write(cmd);
  SerialPort.write(pitch);
  SerialPort.write(velocity);
}
