Mark Matamoros ITP Blog

ITP Courses

Immersive Listening

Week 5: Sketch 5 - Unreal Engine Audio Spatialization - Sound Cues and VCV Rack. October 15, 2019


For week 5, our class was introduced to the creation of Sound Cues for audio playback within the Unreal Engine. Through the utilization of this particular object, one can organize, manipulate, and play back multiple audio files within a single entity. This ability allows further flexibility in the implementation of audio inside this programming environment, as one would not need to re-render multiple tracks within an audio workstation if individual audio samples (within a rendered file) needed to be readjusted.

Additionally, the class received a lecture pertaining to the virtual modular synthesizer environment VCV Rack. While my background in audio synthesis lacks experience in the modular area, this information provided a great introduction to a virtual version of this type of hardware.

Our assignment for the week required the utilization of VCV Rack for audio generation, where the captured sounds would be utilized within an Unreal map.


Reflecting upon the nature of the prior week's project, I chose to focus on the musical aspects of audio generation. Within the VCV Rack environment, I approached the sound generation in a manner similar to my experiences in patch creation for synthesizers. Specifically, I typically create a patch of strong interest. Thereafter, I "mutate" this patch to find other sounds that might complement the original. With this method, I am able to create an audio palette containing complementary sounds. After 20 modification iterations of the original patch, I independently captured each one for editing.

VCV Rack GUI. VCV patch loader with patch files.

After handling the sound renderings, I loaded each audio file into a Logic session, where I began editing the sounds. As the manner of capturing audio from VCV did not cater to grabbing a solid "start" and "end" point of the generated sound, I needed to establish these points within Logic. Upon finding these points, I duplicated and tested each file to confirm a solid looping capability within Unreal. It must be noted that my processing approach within Logic was rather minimal, solely involving equalization. This notion derived from my desire to let the generated sounds "speak for themselves." Furthermore, I find that my workflow is more efficient with constraints (such as this one).
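The loop test described above can also be approximated offline. The following Python sketch is only an illustration and was not part of the Logic workflow: it assumes a mono clip held as a list of float samples, and the helper names (`snap_to_crossing`, `seam_jump`) and the test tone are hypothetical. It moves rough edit points to the nearest upward zero crossing and measures the jump heard when the loop wraps around.

```python
import math

def upward_zero_crossings(samples):
    """Indices i where the signal passes from negative (or zero) to positive."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] <= 0.0 < samples[i]]

def snap_to_crossing(samples, index):
    """Move a rough edit point to the nearest upward zero crossing."""
    crossings = upward_zero_crossings(samples)
    if not crossings:
        return index  # nothing to snap to; keep the rough point
    return min(crossings, key=lambda i: abs(i - index))

def seam_jump(samples, start, end):
    """Size of the jump heard when the loop [start, end) wraps around."""
    return abs(samples[start] - samples[end - 1])

# Hypothetical test tone: 10 cycles of a sine wave, 100 samples per cycle.
tone = [math.sin(2 * math.pi * (n + 0.5) / 100) for n in range(1000)]

# Rough edit points, as one might place them by eye.
rough_start, rough_end = 37, 560
loop_start = snap_to_crossing(tone, rough_start)
loop_end = snap_to_crossing(tone, rough_end)

rough_jump = seam_jump(tone, rough_start, rough_end)
snapped_jump = seam_jump(tone, loop_start, loop_end)
```

A small `snapped_jump` relative to `rough_jump` suggests the loop will wrap without an audible click. Real recordings are rarely this periodic, so listening to the duplicated file remains the final check.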


Upon rendering these edits within Logic, I began the Unreal phase of the work. In line with my somewhat minimalist approach to the sound generation, I chose to create a map in a similar manner. Specifically, my thought was that the rendered files could form a progression of evolving sounds as the player journeyed through the environment. In light of this notion, the map would need to cater to a specific character movement. Thus, a hallway would be ideal. It must be noted that minimal, geometric "splashing" lighting was created to give depth, while the end-wall was completely splashed with light. This latter aspect might suggest an "end" to the sonic journey.


After creating the map component, I began the creation of sound cues for the environment. This process contained a strong sense of trial-and-error within the layering of sounds. Furthermore, upon establishing multiple sound cues, I began referencing each one to the next to find how one cue would progress to the other. This approach led to multiple revisions of each sound cue until I was satisfied with the cue palette. Thereafter, each sound cue was laid within the map and adjusted for sound coverage.


After removing the viewable player components and adjusting the player speed to control the pace of sound exploration within the environment, I captured video and sound footage of the work. The following video contains this demonstration:


This project proved to be quite enjoyable, as I have yearned for more experience with modular synthesizers. Additionally, having worked through the prior week's issues within the Unreal environment made the process of this work easier to navigate. I must note that the balancing of audio samples within the cues could be further tweaked. Additionally, the implementation of reverb for some of the less-tonal samples could lead to a "smoother" sound cue experience.

Week 4: Sketch 4 - Unreal Engine Audio Spatialization - Ambient Sound Object. October 7, 2019


This week's lecture and assignment pertained to the utilization of Epic's Unreal Engine for sound spatialization. I must admit, I was quite excited to venture into this area, as the utilization of this environment is an entryway to future areas of employment. I'm currently under the impression that this type of programming environment can prove to be highly useful in both AR and VR works.

As one might conclude from my prior statements, I did not have any experience in this area of development. Logically, this project proved to be a challenging experience in the implementation and handling of audio.


We students were required to create or manipulate an environment for the utilization of Ambient Sound Objects. Thereafter, these Ambient Sound Objects would be placed within the virtual space, creating a "sound stage." As I have intended to explore the creation of music experiences within gaming engines, I chose to create a work reflective of this notion.

I would imagine that Unreal's workspace has been finessed throughout the years, as navigating within this environment was relatively easy. Object creation, object manipulation, and asset organization were readily available. It must be noted that we were encouraged to utilize Unreal's premade level-creation Blueprints, as the focus of this assignment pertained to the implementation of sound. In light of this notion, I chose to utilize the First Person Blueprint, as navigating through a map as the playable character would highlight spatialized audio.

Unreal Graphic User Interface

Upon creating a new instance of the First Person Blueprint, becoming familiar with the programming environment, and manipulating the map with a minimal aesthetic, I began planning the creation of my project. Within the familiarization process, I noticed that lighting objects could be programmed in various manners. Particularly, the Spotlight Object had strong similarities to the Ambient Sound Object, where the throw, intensity, and falloff mirrored aspects of a sound source. Upon internalizing this comparison, I chose to utilize Ambient Sound and Spotlight Objects in conjunction with one another. Specifically, each spotlight's coverage would be reflective of a sound object's coverage. Furthermore, each spotlight would vary in color, and each variance would represent a music sequence.
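The shared idea of falloff can be pictured with a small Python sketch. This is a generic linear model written purely for illustration, not Unreal's implementation (the engine offers several attenuation shapes and curves); the function name, parameter names, and distances below are assumptions.

```python
def linear_falloff(distance, inner_radius, falloff_distance):
    """Return a level from 0 to 1 for a listener (or surface) at `distance`.

    The level is full inside `inner_radius` and fades linearly to zero
    over the next `falloff_distance` units.
    """
    if distance <= inner_radius:
        return 1.0
    if distance >= inner_radius + falloff_distance:
        return 0.0
    return 1.0 - (distance - inner_radius) / falloff_distance

# Hypothetical values: full level within 200 units, silent beyond 600 units.
levels = [linear_falloff(d, inner_radius=200.0, falloff_distance=400.0)
          for d in (0.0, 200.0, 400.0, 600.0, 800.0)]
```

Using the same two numbers for a spotlight's reach and a sound's reach is one way to make the lit area and the audible area coincide.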

Unreal GUI with a spotlight's associated sound in the viewer

In light of this idea, I created a new Logic (audio workstation) session to begin the music creation process. Upon some deliberation regarding the creation of these sequences, I chose a minimalist approach, as the map had been handled in a similar manner. For audio generation, I chose to utilize Native Instruments' friendly synthesizer Massive. Thereafter, I wrote a 4-note chord progression that was later split into single-note sequences. With this tactic, the user could explore the environment and experience anything from single-note sequences to full chord sequences in a spatialized manner.
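The splitting of chords into single-note sequences can be sketched in a few lines of Python. The chords below are hypothetical MIDI note numbers chosen only for illustration (the actual progression is not listed in this post), and `chord_at` is an assumed helper name.

```python
# Hypothetical progression: each tuple is one 4-note chord (MIDI note numbers).
progression = [
    (57, 60, 64, 67),
    (53, 57, 60, 64),
    (60, 64, 67, 71),
    (55, 59, 62, 65),
]

# Transpose chords into voices: voice 0 holds the lowest note of every chord,
# voice 1 the next note up, and so on. Each voice is one single-note sequence.
voices = [list(voice) for voice in zip(*progression)]

def chord_at(step, active_voices):
    """Notes heard at `step` when only `active_voices` are within earshot."""
    return sorted(voices[v][step] for v in active_voices)

one_voice = chord_at(0, {2})             # standing near a single spotlight
full_chord = chord_at(0, {0, 1, 2, 3})   # standing where all four overlap
```

Each voice would be rendered as its own audio file, so the chord only becomes complete where the coverage of several sound objects overlaps.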

Massive, a synthesizer plugin. Logic session for sound creation.

After rendering the individual note sequences, I included these files within the Unreal session's project asset folder. Thereafter, I began adjusting both the lighting parameters of a single spotlight and the sonic characteristics of the associated sound file. Upon finalizing these parameters, I began creating a grid of Spotlight and Ambient Sound objects. It must be noted that I added a drone across the center of the map for testing purposes. I chose to have it remain in the map for sound variation purposes.

Ambient Sound Object Parameters. Spotlight Object Parameters. Spotlight with Sound grid.

Thereafter, I set the engine to "play" this map, where I would be able to experience the final result. While navigating the environment, the spotlights/sounds within my close vicinity were sequenced in the expected manner, where the firing and looping of each sequence were correctly timed. However, upon navigating to spotlights farther away, each note sequence was delivered in an offset manner. Though this mistiming produced a somewhat disorienting, yet interesting experience, it was not my intended output. In light of this situation, I began searching the internet for answers.

After hours of attempting to find a solution to this issue, including trial-and-error programming, I came to the conclusion that the number of sound occurrences had a limit. This issue appears to have a bit of logic, as sound processing and handling for the user should only pertain to sounds that would occur from the player's perspective. Any sound occurrences outside of this perspective should not be processed within the engine, as doing so could be wasteful of resources. In light of this conclusion, I reduced the number of spotlights/sounds within the environment to a number that could be handled in an appropriate manner.
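The kind of limit I suspected can be pictured with a simplified Python sketch. To be clear, this is not Unreal's actual logic, which I did not inspect; it only assumes that an engine with a fixed voice budget keeps the sources nearest to the listener. The function name, grid spacing, and `max_voices` value are all hypothetical.

```python
import math

def audible_sources(listener, sources, max_voices):
    """Keep only the `max_voices` sources nearest to the listener."""
    ranked = sorted(sources, key=lambda s: math.dist(listener, s["position"]))
    return [s["name"] for s in ranked[:max_voices]]

# Hypothetical 4 x 4 grid of spotlight/sound pairs, 500 units apart.
grid = [{"name": f"spot_{x}_{y}", "position": (x * 500.0, y * 500.0)}
        for x in range(4) for y in range(4)]

# A listener standing in one corner with a budget of three voices.
near_corner = audible_sources((0.0, 0.0), grid, max_voices=3)
culled_count = len(grid) - len(near_corner)
```

Under this assumption, a source that is dropped and later picked up again would have no guarantee of resuming in step with its neighbors, which would match the offset sequences I heard.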

Reduced Spotlight and Sound grid

The following video displays the finalized project. It must be noted that I had difficulty capturing the audio output of the Engine. Due to this situation, one application captured the audio (Audio Hijack) and another captured the video (QuickTime). Furthermore, utilizing the mentioned applications caused performance issues in the demonstration.


Though the PROCESS section contained the highlights of the work, smaller issues needed to be handled throughout the project. Some of these problems included the following: wall and ceiling manipulation, gun removal from the character, and modification of the character movement. However, as mentioned in the Process area, the biggest issue occurred with the note sequence handling. Numerous attempts at remedying this situation involved the utilization of multiple instances of the same audio asset, project setting manipulation, and Ambient Sound Object manipulation. Though these attempts did not remedy the situation, they proved to be lessons in manipulating aspects of the Unreal Engine.

With that said, I'm looking forward to further explorations within this environment.

Week 3: Sketch 3 - 360 Video and Audio Spatialization. September 30, 2019


Upon reviewing the students' sketches from the prior week, we were given a lecture regarding the utilization of spatialized audio for 360-degree video. Furthermore, this information specifically pertained to the utilization of ambisonic recording equipment alongside a 360-degree camera. Our week's assignment required the utilization of the mentioned equipment to create a chosen scene.

After pondering various scenarios where this setup would be useful, I chose to capture a moment at ITP's facilities, where a student (myself) would experience distress when exposed to loud music emanating from a nearby room. Furthermore, the loud music would stop after opening the room's door, leaving the student in a confused state. The scene concludes with the student's final door opening, where Eduard Khil's "I am Glad, 'cause I'm Finally Returning Back Home" is loudly playing within the hallway.


Regarding the ambisonics setups, two units were completely self-contained, where the recorder included a built-in microphone. While these units could be handled in a relatively easy manner, I chose to utilize the third setup, where a dedicated ambisonic microphone (Sennheiser Ambeo VR 3D) would be connected to a multichannel field recorder (Zoom F8N). This particular setup was able to produce a higher-quality recording, as the sensitivity of the microphone was superior to that of the others. Furthermore, given my prior experience in field recording, I had no reservations about this choice.

Upon arriving at the ITP floor in an early morning hour, I began setting up the equipment for the work. Though this physical process was quickly handled, placing the setup in a proper location proved to be somewhat challenging. This notion pertains to the placement of the video recorder in conjunction with the microphone, as both items needed to be next to each other to properly capture the aural and visual components of the scene. As this was a necessity, minimizing the view of the audio setup was difficult, as the camera was able to capture the entire environment (360 degrees).

Sennheiser Ambeo VR 3D microphone, Zoom F8N field recorder, and Ricoh Theta V 360-degree camera. Project assets folder

After attempting to handle this issue, I began shooting the scene. At this moment, I utilized my phone to time-out my movements, as the assignment required a scene-length of 1 to 2 minutes. After the second take, I felt that one of the two recordings would work for the project.

Regarding my initial step in the post-production phase, I downloaded the captured videos onto my laptop. Upon choosing the better recording of the two videos, I processed my choice through Ricoh Theta's application, where the initial two-sphere video recording was changed into a rectangular (equirectangular) 360-degree video. Afterwards, I edited the video through the utilization of OS X's QuickTime Player.

Original unprocessed video

Within the audio editing phase of the work, I initially planned to utilize audio samples for physical sounds that might occur within the environment. However, the unplanned occurrences of two passing individuals in the recording were nicely captured by the audio recording equipment. In light of this situation, I decided to forgo the sound effects and let the microphone "speak for itself." The captured footsteps and jangling keys were lovely. While this component of the design process was deemed unnecessary, loud music needed to be edited and processed to create the illusion of music emanating from a nearby room.

For song selection, I chose to utilize Darude's "Sandstorm." Processing this music proved to be somewhat challenging, as the song not only required a reflective quality of sound (room characteristics) captured within the environment, but also needed to be properly positioned in a spatialized manner. To handle the muffling and room characteristic for the music, two equalizers and a single reverb were heavily manipulated. This process took numerous passes within the Reaper workstation to grant a desired effect.
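The muffling portion of this processing can be illustrated with a short Python sketch. It is not the Reaper chain described above (which used two equalizers and a reverb); it is a generic one-pole low-pass filter, shown only to convey why removing high frequencies makes music sound as if it is behind a wall. The cutoff frequency, sample rate, and function name are assumptions.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Smooth a signal with a one-pole low-pass filter (about 6 dB/octave)."""
    coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    state = 0.0
    output = []
    for x in samples:
        # Move the output a fraction of the way toward each new input sample.
        state += coeff * (x - state)
        output.append(state)
    return output

SAMPLE_RATE = 48000   # assumed sample rate in Hz
CUTOFF = 500.0        # assumed cutoff in Hz for a "behind a door" sound

# A constant input (the lowest possible frequency) passes through unchanged.
steady = one_pole_lowpass([1.0] * 2000, CUTOFF, SAMPLE_RATE)

# An input that flips sign every sample (the highest possible frequency)
# is reduced to a small fraction of its original level.
buzz = one_pole_lowpass([(-1.0) ** n for n in range(2000)], CUTOFF, SAMPLE_RATE)
```

A single gentle filter like this only approximates a wall; the reverb is what supplies the room characteristics.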

Reaper Project Screenshot

Upon finalizing the audio processing, I began the encoding phase for the 360 video. Fortunately, the downloaded Facebook tools included an easy-to-use application to encode the video. However, after initially processing the video (for upload) and reviewing the item on Facebook, I was rather disappointed in the music levels and positioning. Thereafter, multiple re-edits and encoding instances were handled until I was satisfied with the work.

Reaper rendering settings for 360 video. Encoder settings for Audio 360 application

The following video displays the final product. It must be noted that the uploaded video will not be properly viewable in one's browser, as it was specifically designed for viewing within Facebook. Furthermore, the spatialized audio will not coordinate with the video due to this issue. However, it should grant an overview of the audio processing.


As I had no prior experience with 360 video processing and post-production spatialization, this project proved to be quite challenging. While having a visual reference for viewers may assist with the placement of audio within an environment (as visual cues can benefit localization), it appears that designing audio to match field recordings requires a great deal of accuracy. I am fortunate not only that I chose a scene with minimal action, where I was able to focus on processing a few audio cues, but also that the recording equipment was able to capture a high degree of audio fidelity.

All-in-all, the biggest "take away" from this work is having a great understanding of both the quality of utilized equipment and nature of sound that will occur within an environment. I would imagine that this preparation will dictate how one should handle this manner of field recording.

Week 2: Sketch 2 - 3D Spatialization. September 23, 2019


For the second audio sketch, students were asked to create a 60-second audio narrative utilizing Facebook's Spatial Workstation's Reaper session. Within this environment, we are able to spatialize audio for a 360-degree environment, utilizing a set of tools specifically developed for this process.

Regarding my project, I chose to create a scene that highlights movement in an environment, where an insect is floating around the subject's head. Furthermore, planes fly past the subject and perform a bombing run close to the individual's vicinity.


In line with the prior project, I initially began this work with an audio sample list that contained the necessary audio files for the project. Furthermore, this list was organized in accordance with the scene's events. It must be noted that both and BBC's sound effects library were utilized for sample gathering.

Upon gathering multiple samples of each item, I began placing the files within Reaper's timeline in accordance with the given actions. Thereafter, the majority of the samples were treated with equalization, followed by spatializing each sound through the utilization of the 360 Spatialiser plugin. Regarding the movement of sound, I utilized Reaper's latch automation to create a dynamic motion of audio within the environment.
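What a spatializer does with a position can be sketched in Python using first-order ambisonic encoding. This is an illustration under stated assumptions, not the Spatialiser plugin's method: it uses the AmbiX convention (channel order W, Y, Z, X with SN3D normalization, azimuth measured counterclockwise from the front), and the plugin's internal format is not described here and may differ. The function name and the orbit path are hypothetical.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order AmbiX channels (W, Y, Z, X)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)  # left/right component
    z = sample * math.sin(el)                 # up/down component
    x = sample * math.cos(az) * math.cos(el)  # front/back component
    return (w, y, z, x)

front = encode_first_order(1.0, 0.0)    # straight ahead
left = encode_first_order(1.0, 90.0)    # directly to the left

# Hypothetical automation: a sound circling the head in 45-degree steps,
# similar in spirit to drawing an azimuth automation lane.
orbit = [encode_first_order(1.0, step * 45.0) for step in range(8)]
```

Automating the azimuth over time, as latch automation does for a plugin parameter, simply changes the angle fed into this kind of encoding from one moment to the next.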

Project session showing automation. Facebook's spatializer plugin.

The following file displays the final product:


Though the finalized work was representative of my initial vision, I encountered a few issues within the process. My chosen narrative created issues in locating the required audio samples for the project. Specifically, there was a limited number of audio selections pertaining to the insect noise, airplane flybys, and bombing sounds. In light of this issue, the chosen samples required a great amount of processing to create an audio experience where the samples shared similar sonic characteristics from one to the next. Furthermore, equalizing the bombing samples with deeper low-end cuts could have smoothed the final mix.

Regarding levels, my utilization of the Spatialiser plug-in created gain structuring issues, where the summing area was receiving a non-ideal amount of energy. This was compensated for through the utilization of a final brickwall limiter's gain input function. Within future projects, I intend to pay closer attention to this issue.
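The gain structuring issue can be shown with a small Python sketch. It is a simplified illustration rather than a model of the plugin or the limiter: it sums a few hypothetical tracks, measures the peak in dBFS, and computes the trim that would bring the peak under an assumed ceiling. The sample values, the -1 dBFS ceiling, and the function names are all assumptions.

```python
import math

def mix(tracks):
    """Sum several equal-length tracks sample by sample."""
    return [sum(frame) for frame in zip(*tracks)]

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return float("-inf") if peak == 0.0 else 20.0 * math.log10(peak)

def trim_db_for_ceiling(samples, ceiling_dbfs=-1.0):
    """Gain change in dB (negative means attenuation) that puts the peak at the ceiling."""
    return ceiling_dbfs - peak_dbfs(samples)

def apply_gain_db(samples, gain_db):
    """Scale samples by a gain expressed in dB."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]

# Three hypothetical tracks that are each below full scale on their own.
tracks = [[0.5, -0.7, 0.2], [0.4, -0.6, 0.1], [0.3, -0.5, 0.3]]
bus = mix(tracks)

bus_peak = peak_dbfs(bus)            # above 0 dBFS once the tracks are summed
trim = trim_db_for_ceiling(bus)      # how far the bus needs to come down
trimmed = apply_gain_db(bus, trim)
```

Each track stays below full scale individually, yet the sum exceeds it, which is why checking the level at the summing point matters before reaching for the limiter.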

Despite the mentioned problems, I must note that Facebook's Spatialiser plugin is rather impressive, as the depth of placing sounds within a 360 environment was rather convincing.

Week 1: Sketch 1. September 16, 2019


For our first audio sketch, our class was asked to individually create a 60 to 90 second narrative of our choosing within the DAW Reaper. As I haven't utilized this particular workstation in many years, I found that this assignment would prove to be a great re-introduction to the application.

After some self-deliberation of potential ideas, I chose to design a realistic narrative. Within this particular story, the character is attempting to fry frozen french fries. However, the character's attempt is thwarted by a kitchen fire caused by oil splashing onto the stove top.


To begin the audio sample gathering process, I envisioned the entire product from top-to-bottom, as I needed to create a list of sounds for the work. The following list contains the actions of the character and environment:

It must be noted that the scene was designed in two parts, as the oil would need time to heat to an appropriate level for frying. After creating this list of actions, I began gathering sounds from Furthermore, for any sounds that were either unavailable within the site or deemed a "better-record-it-myself" sample, I would handle the sound design with my Zoom H4 field recorder. The following samples were found (and renamed): sample page Project assets folder

Regarding this list, the stars that followed each sound sample indicated the number of samples located for that item, though not all samples were necessarily utilized for the project. Furthermore, as I was required to utilize self-recorded samples, I decided to track footsteps and room-tone. Neither was present in the list, as I failed to account for these necessary sounds for the actions within the narrative.

Upon gathering and recording the samples for the project, I began placing them on Reaper's timeline. Throughout the process, the common means of treatment pertained to cutting the low-end of the samples, as this bottom-end of the spectrum was heavily present within all recordings. Furthermore, the mid-range was reshaped for the samples to assist with creating "space" within the mix and sonic-similarity from sample to sample.

After placing each item in the timeline, further processing was implemented. Reverb was applied to various samples in an attempt to place each noise/action within the kitchen. Compression was applied to smooth out the dynamics of the samples. Delay was specifically applied to the fire to grant an exaggerated effect. Furthermore, the kitchen fire "ignition" was time-stretched to create a dramatic moment. Lastly, pitch shifting was implemented on the burner noise in an attempt to have the sample sit lower in the mix for spatial purposes. After balancing the levels of these samples, panning was applied in both static and automated manners.
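Static and automated panning can be illustrated with a short Python sketch of a constant-power pan law. This is a generic textbook law shown as an assumption; Reaper's pan law is configurable, and the setting used in this session is not recorded here. The function name and sweep values are hypothetical.

```python
import math

def constant_power_pan(sample, pan):
    """Split a mono sample into (left, right).

    `pan` runs from -1.0 (hard left) through 0.0 (center) to 1.0 (hard right).
    The sum of the squared channel gains stays at 1, so loudness stays
    roughly even as the sound moves.
    """
    angle = (pan + 1.0) * math.pi / 4.0   # maps pan to 0..90 degrees
    return (sample * math.cos(angle), sample * math.sin(angle))

# Static pan: a sound fixed at the center.
center = constant_power_pan(1.0, 0.0)

# Automated pan: a sound sweeping from hard left to hard right in five steps.
sweep = [constant_power_pan(1.0, -1.0 + 0.5 * i) for i in range(5)]
powers = [left * left + right * right for left, right in sweep]
```

At the center position both channels carry about 0.707 of the signal rather than 0.5, which avoids the dip in loudness that a simple linear crossfade would produce.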

Reaper Project Screenshot

After the mentioned processing, a brick-wall limiter was applied to the stereo bus to shape the overall dynamics and bring the mix to a suitable listening level. It must be noted that two external plugins were also utilized in the mixdown: iZotope's RX and Voxengo's Elephant. The former was utilized to "clean up" a few of the audio samples, and the latter handled the brick-wall limiter functions. The following audio player contains the final audio mix.


Though I was somewhat satisfied with the results, there are a few areas for improvement. Primarily, the samples utilized within the project can be improved, as the selection found within derive from multiple sources; thus, the quality of each sample differs from one to the next. In an ideal situation, the recording of the samples would be solely handled within a controlled environment, where the manner of recording and type of equipment would be consistent. Furthermore, an ideal recording environment would be isolated from any unwanted external noises.

The other notable change within this work would pertain to the overall mix. Though a great deal of processing assisted with placement and spatialization within the sonic field, further processing could be implemented to "glue" the mix together and assist with further separation of sounds placed within the environment.