Mark Matamoros ITP Blog

ITP Courses

Fall 2019

Spring 2020

Fall 2020

Hello Computer

Final: Rap Ping Row Baht. December 12, 2020.


For my final project, my classmate Erkin and I collaborated on a music piece featuring a synthesized voice. The song is presented through a limited-feature browser application that functions via voice commands: the user interacts with the site, where they are instructed to mute and unmute the bass and drums of the song. While we were unable to completely fulfill this component prior to class, I was able to correct the issues after our presentation.
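The mute/unmute voice interaction could be sketched roughly as follows. This is a hypothetical illustration, not our actual implementation: the track names and the `parseCommand` helper are placeholders for whatever the recognized transcript would drive in the real page.

```javascript
// Hypothetical sketch: map a recognized transcript to a mute/unmute
// action on a named track (e.g. "bass" or "drums").
const TRACKS = ["bass", "drums"];

function parseCommand(transcript) {
  const text = transcript.toLowerCase();
  const track = TRACKS.find((t) => text.includes(t));
  if (!track) return null;
  // Check "unmute" first, since the word "unmute" contains "mute".
  if (text.includes("unmute")) return { action: "unmute", track };
  if (text.includes("mute")) return { action: "mute", track };
  return null;
}

// In the browser, the parsed command would then drive the audio elements:
// const cmd = parseCommand(event.results[0][0].transcript);
// if (cmd) document.getElementById(cmd.track).muted = (cmd.action === "mute");
```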




The music piece originally stems from a requested 40-second composition for a startup's commercial. Unfortunately, as this company was unable to raise an adequate amount of funding, the startup folded. Years after this event, a few peers and I discussed a music collaboration, where an extended, structured version of the original work would be utilized for an EP. Like many music projects, this particular one fell off the tracks.


When speculating on potential projects for my final, the idea of pairing a synthesized voice with an unused music piece became attractive. Thereafter, I began the process of creating a project workflow, where a modified version of my first Hello Computer assignment, with a re-routed system output, was used alongside the mentioned song's Logic (DAW) session.


Upon verifying the workflow, Erkin and I discussed the functionality of the browser application. Thereafter, Erkin began writing the lyrics for the composition. Upon finishing his work, I began synthesizing the lyrics and capturing the audio within the Logic session. Editing then commenced, followed by the resynthesis of words and phrases for recapture. The music was finalized after a hasty mix and master.

The following pictures contain a few virtual instruments that were utilized within the session:


The day after completing the music, Erkin and I attempted to complete the project with the creation of the web application. While we were able to implement the response system with Dialogflow, we were unable to completely implement the logic pertaining to the music interaction, where the user is able to interact with specific musical components of the work. As mentioned, this component was fixed after our class presentation.


As mentioned within the post-presentation critique, the lack of visuals had a negative effect on the project. In its current state, this application could be properly presented on a Google Home smart speaker. However, if this project remains in the browser, the inclusion of lyrical text will be a necessity. Additionally, the music has plenty of room for improvement in the mixing and mastering department. Furthermore, the initial site interaction could also benefit from additional audio balancing and a reconfigured automated response system, where the application gives the user a better understanding of the interaction.

Assignment 4: Dialogflow Fulfillment. November 25, 2020.


Continuing from last week's lecture, the class received further information pertaining to Dialogflow fulfillment with the utilization of Firebase and Node.js. Understandably, we students were tasked with an assignment where we utilize these Dialogflow features. Regarding this assignment, I chose to modify my previous Knock Knock Joke Dialogflow bot to function with the mentioned items.


As mentioned, modifications to last week's work pertained to the utilization of Firebase with Node.js. Specifically, each intent's response logic is now handled within the index.js file. Regarding this logic, each intent has an associative array of response phrases that are cycled through upon the calling of the respective function, with a per-intent index handling the mentioned progression.
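The cycling pattern described above might look something like the following sketch. The intent names and phrases here are placeholders, not the ones from my actual bot, and the surrounding Firebase/Dialogflow boilerplate is omitted.

```javascript
// Sketch of the response-cycling pattern (placeholder intents/phrases).
// Each intent maps to an array of responses; an index per intent advances
// on every call and wraps around when the array is exhausted.
const responses = {
  "knock.knock": ["Who's there?", "Who could that be?", "Yes? Who is it?"],
  "joke.punchline": ["Ha! Good one.", "That was terrible.", "Classic."],
};

const indices = {}; // tracks each intent's progression

function nextResponse(intent) {
  const phrases = responses[intent];
  if (!phrases) return "I don't understand.";
  const i = indices[intent] || 0;
  indices[intent] = (i + 1) % phrases.length; // advance and wrap
  return phrases[i];
}

// Inside the Firebase fulfillment, each intent handler would call
// nextResponse(...) and pass the result to the agent's reply method.
```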


Assignment 3: Dialogflow Introduction. November 18, 2020.


This prior week pertained to the utilization of Dialogflow, an environment for developing AI-driven conversational systems. Accordingly, we were tasked to create a bot for this week's work. Regarding the nature of this bot, I decided to create a system where users deliver "Knock Knock" jokes and receive a "Mark Matamoros" response.

These particular jokes were pulled from the Fatherly article "The 87 Funniest Knock-Knock Jokes for Kids."


Assignment 2: Speech Recognition in the Browser. November 13, 2020.


In accordance with this week's assignment, I decided to utilize speech recognition in conjunction with a Chuck Norris joke database. Within this webpage, one is able to submit a subject via voice, followed with a server response for joke delivery. As the nature of this project is rather silly, I attempted to design the page in a similar manner.


Regarding the scripting component, the professor's in-class demonstration was heavily referenced for the work. However, notable differences lie within the textual feedback and API usage. Pertaining to the latter, the API's joke text-search query was utilized for the work. While a category search would have granted the user a directly related joke, the number of category choices was extremely limited. Thus, utilizing short phrases or words supplied a better selection of responses. Consequently, this approach does not typically deliver an exact match regarding topic, as the input only sources jokes that contain the vocalized word or phrase.
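The text-search approach could be sketched as below. This assumes the public chucknorris.io API, which exposes a free-text search endpoint alongside its category endpoints; the element ID and the surrounding page are placeholders.

```javascript
// Build the text-search URL for the joke API (assuming chucknorris.io).
function buildSearchUrl(query) {
  return (
    "https://api.chucknorris.io/jokes/search?query=" +
    encodeURIComponent(query.trim())
  );
}

// Browser usage: pass the recognized transcript straight to the search.
// fetch(buildSearchUrl(transcript))
//   .then((res) => res.json())
//   .then((data) => {
//     // Pick one of the matching jokes, if any were returned.
//     const joke = data.result.length
//       ? data.result[Math.floor(Math.random() * data.result.length)].value
//       : "No jokes found.";
//     document.getElementById("joke").textContent = joke; // placeholder ID
//   });
```

Because the search only matches jokes containing the spoken word or phrase, the results relate to the topic loosely at best, which matches the behavior described above.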

Note: I was rather disappointed in the delivered jokes.

The following images display my source code:


Assignment 1: Speech Generation in the Browser. November 4, 2020.


My initial dive into this assignment was hindered by an external influence. Specifically, the presidential election provided a significant distraction, and conjuring an interesting project proved challenging. However, upon rewatching the lecture and exploring the examples provided within Mozilla's Web API pages, I came across an idea reflective of a childhood experience.

This experience pertains to being "yelled at" by my parents. The frustration of my bilingual mother and father began "speaking" from the computer after I altered components of the "SpeechSynthesisUtterance" class. Influenced by this surprising find, I began to design the script for pauses between words, phrase repetition, gradual voice alteration, pronunciation shifts, and a "finale" of the phrase "Dios Mio." Furthermore, a "cheesy" background color alteration was included for my own sophomoric humor.
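The gradual voice alteration could be sketched as follows. This is an illustrative reconstruction, not my actual script: the phrase, repetition count, and rate/pitch increments are made-up values standing in for the real ones.

```javascript
// Hypothetical sketch: build a queue of phrase/rate/pitch settings, each
// repetition slightly more agitated, ending with the "Dios Mio" finale.
function buildScoldingQueue(phrase, repetitions) {
  const queue = [];
  for (let i = 0; i < repetitions; i++) {
    queue.push({
      text: phrase,
      rate: 1.0 + i * 0.15, // speak a bit faster each time
      pitch: 1.0 + i * 0.2, // and at a higher pitch
    });
  }
  queue.push({ text: "Dios Mio", rate: 0.8, pitch: 0.6 }); // the finale
  return queue;
}

// In the browser, each entry becomes a SpeechSynthesisUtterance; calls to
// speechSynthesis.speak() queue up, so the phrases play back in order:
// for (const step of buildScoldingQueue("clean your room", 4)) {
//   const u = new SpeechSynthesisUtterance(step.text);
//   u.rate = step.rate;
//   u.pitch = step.pitch;
//   window.speechSynthesis.speak(u);
// }
```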


It must be noted that the foundational components of the script reflect the professor's in-class coding demonstration. However, as mentioned in the prior paragraphs, additional items were programmed. The following list contains the specifics:
