Virtual Jo Revisited
For my senior project, I had to make something that would show what I learned in my time as a CS student. Even though my professor (Dr. Zelle) suggested that we spend the three-week Christmas break thinking of a project, I had a hard time finding one. Fortunately, my professor gave some suggestions. Among them was a virtual assistant using the Google Home. Given that I had prior exposure to voice assistants through Amazon Alexa, I went with that. It’s called Virtual Jo in tribute to Dr. Joseph Breutzmann, who retired that year.
The Plan
Initially, my idea of a voice assistant was rather elaborate. It would allow a professor to control a computer using voice commands. That way, you could run terminal commands, edit files and execute scripts. After talking to my professor about it, I realized that it was needlessly complicated and it would be better to serve as an information hub. The assistant would then
- Provide information on meals served in the main cafeteria, the Mensa,
- Look up course information,
- Answer frequently asked questions and
- Help students study
With a general idea of what I wanted to do, I started by doing some research on conversational UIs. Thanks to InterLibrary Loan, I found two books
- Designing Bots: Creating Conversational Experiences by Amir Shevat and
- Designing Voice User Interfaces: Principles of Conversational Experiences by Cathy Pearl.
Honestly, these didn’t help all that much. After three weeks, I realized that I wasn’t doing much and I decided to actually start writing code.
For the stack, here’s what I used.
- Dialogflow: this was the recommended way of making Google Actions so I went with it.
- Google Cloud Functions: as part of the Google Cloud Platform, Cloud Functions is a serverless compute platform. I couldn’t be bothered to set up a server and I suspected that Cloud Functions would be much cheaper anyway since it only needed to run for seconds on each invocation. Also, since I was already with Google for Dialogflow, I decided to stick with a Google stack. It isn’t as polished as AWS Lambda, however.
- Node.js: as much as I wanted to use Python, I had to go with Node.js since it was the only thing that Cloud Functions supported at the time. It also had a great SDK for Dialogflow.
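To give a feel for how these pieces fit together, here is a minimal sketch of a webhook that answers a Dialogflow intent from an HTTP function. It is not the project's actual code: it assumes the Dialogflow v2 webhook request shape (queryResult.intent.displayName, queryResult.parameters, and a fulfillmentText reply), and the handler body and fallback message are made up for illustration.

```javascript
// Map of intent name -> handler. Each handler receives the intent's
// parameters and returns the text the assistant should speak.
const handlers = new Map([
  // Hypothetical handler: a real one would look up the cafeteria menu.
  ["get_meals", (params) => `Looking up meals for ${params.date || "today"}.`],
]);

// Platform-independent core: takes a parsed webhook request body
// (assumed to be in Dialogflow v2 format) and returns the response body.
function fulfill(body) {
  const query = body.queryResult || {};
  const intentName = query.intent ? query.intent.displayName : undefined;
  const handler = handlers.get(intentName);
  const text = handler
    ? handler(query.parameters || {})
    : "Sorry, I can't help with that yet.";
  return { fulfillmentText: text };
}

// Thin HTTP adapter in the (req, res) style that Cloud Functions uses.
function webhook(req, res) {
  res.json(fulfill(req.body || {}));
}

// In a Cloud Functions deployment this would be exported, for example:
// exports.webhook = webhook;
```

Keeping fulfill separate from the (req, res) adapter means the intent logic can be exercised without a server, and only the last few lines know anything about the hosting platform.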
Getting Meal Information
The first task I tried to accomplish was getting meal information from the cafeteria, the get_meals intent. Since they didn’t have an API of sorts, I had to study the web page’s HTML and parse it using cheerio.js. The scraper works for most days except those with a special event, since the cafeteria staff leave a note on the menu instead.
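The scraping step can be sketched roughly as follows. The project used cheerio.js; this version swaps in a regular expression so it runs with no dependencies, which is only adequate for simple, flat markup. The sample HTML, the function names and the fallback wording are all made up for illustration, and the assumption that menu items sit in list elements is mine.

```javascript
// Pull the text of every <li> element out of a menu page.
function extractMenuItems(html) {
  const items = [];
  const pattern = /<li\b[^>]*>([\s\S]*?)<\/li>/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    // Drop any nested tags, then collapse whitespace.
    const text = match[1].replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
    if (text) items.push(text);
  }
  return items;
}

// Turn the scraped items into a spoken response, with a fallback for days
// when the page carries a note instead of a menu.
function describeMeals(html) {
  const items = extractMenuItems(html);
  if (items.length === 0) {
    return "I couldn't find a menu for that day.";
  }
  return `The Mensa is serving ${items.join(", ")}.`;
}

// Made-up sample markup for illustration.
const sampleHtml = `
  <ul>
    <li>Andouille Sausage Jambalaya</li>
    <li><b>Garden Salad</b></li>
  </ul>`;
console.log(describeMeals(sampleHtml));
```

Returning a fallback sentence when no items are found is one way to handle the special-event days gracefully rather than reading out an empty menu.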
Once I made the get_meals intent, my professor suggested that I expand it to look up a food item and return the next time it would be available. I went over this before, but I essentially used grep. For this part, the nextServed intent, I
- Downloaded all the menus, since the menu for each day was accessible through a URL like mensa.wartburg.edu/diningmenu/DiningHall/daily.asp?1=2018-03-04 (not a real URL, by the way),
- Used grep to extract the list elements from all the files, kept the unique elements, sorted them and wrote them to a file,
- Went through to remove anomalies,
- Added alternative names, like Jambalaya for Andouille Sausage Jambalaya, and
- Uploaded the resulting file to Dialogflow to turn each food item into an entity.
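The steps above were done with grep and some manual cleanup; the same idea can be sketched in Node.js. Everything here is illustrative: the menus object stands in for the downloaded pages, the function names are hypothetical, and the CSV row layout (canonical value, then the value again, then its synonyms) is my assumption about what an entity upload expects, so check it against the Dialogflow documentation before relying on it.

```javascript
// Hypothetical input: menus already scraped into { "YYYY-MM-DD": [items] }.
const menus = {
  "2018-03-04": ["Andouille Sausage Jambalaya", "Garden Salad"],
  "2018-03-05": ["Cheese Pizza", "Garden Salad"],
  "2018-03-06": ["Andouille Sausage Jambalaya"],
};

// Hand-maintained alternative names, keyed by the canonical menu name.
const synonyms = new Map([["Andouille Sausage Jambalaya", ["Jambalaya"]]]);

// Unique, sorted list of every item across all menus.
function uniqueItems(menusByDate) {
  const all = new Set();
  for (const items of Object.values(menusByDate)) {
    for (const item of items) all.add(item);
  }
  return [...all].sort();
}

// One CSV row per entity: the canonical value followed by its synonyms.
function toEntityCsv(items, synonymMap) {
  return items
    .map((item) => [item, item, ...(synonymMap.get(item) || [])])
    .map((row) => row.map((cell) => `"${cell.replace(/"/g, '""')}"`).join(","))
    .join("\n");
}

// First date on or after `fromDate` whose menu contains `item`, or null.
// ISO dates compare correctly as plain strings.
function nextServed(menusByDate, item, fromDate) {
  const dates = Object.keys(menusByDate).filter((d) => d >= fromDate).sort();
  return dates.find((d) => menusByDate[d].includes(item)) || null;
}

console.log(toEntityCsv(uniqueItems(menus), synonyms));
console.log(nextServed(menus, "Andouille Sausage Jambalaya", "2018-03-05"));
```

Because the entity maps a spoken name like Jambalaya back to its canonical menu name, the lookup itself only ever needs to match the exact strings found on the menu pages.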
I thought this feature would be incredibly difficult to implement, but it worked very well.
Getting Course Information
As for the find_courses intent, it turned out to be way harder than I thought. I was initially relieved that I wouldn’t have to use a headless web browser to programmatically log in, navigate to the course search page and search for courses. I discovered that the search pages stood alone and were just embedded as <iframe>s. However, that relief went away once I started trying to scrape the webpage.
Looking back at it now, it is wise of them to make scraping as difficult as possible since that protection is needed for security. All the cookie management and anti-forgery tokens are there to prevent, well, request forgery. I got really frustrated because, even after handling the cookies and tokens, it still didn’t work. It turns out that I needed to send a Referer header as well to make it work.
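The three pieces that had to line up (session cookies, the anti-forgery token from the form, and the Referer header) can be sketched like this. This is not the actual scraper: the field name csrf_token, the query field q, the URL and the sample HTML are all hypothetical, and the regular expression assumes the name attribute comes before the value attribute in the hidden input.

```javascript
// Find the value of a hidden <input> with the given name in a form page.
function extractHiddenField(html, fieldName) {
  const escaped = fieldName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `<input[^>]*name=["']${escaped}["'][^>]*value=["']([^"']*)["']`,
    "i"
  );
  const match = pattern.exec(html);
  return match ? match[1] : null;
}

// Reduce Set-Cookie header values to a single Cookie header value,
// keeping only the name=value pair from each.
function toCookieHeader(setCookieValues) {
  return setCookieValues.map((value) => value.split(";")[0].trim()).join("; ");
}

// Build the options for the form POST: token in the body, cookies and
// Referer in the headers.
function buildSearchRequest({ formUrl, formHtml, setCookies, query }) {
  const token = extractHiddenField(formHtml, "csrf_token");
  if (token === null) throw new Error("anti-forgery token not found");
  const body = new URLSearchParams({ csrf_token: token, q: query });
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/x-www-form-urlencoded",
      Cookie: toCookieHeader(setCookies),
      Referer: formUrl,
    },
    body: body.toString(),
  };
}
```

The returned object can then be handed to whichever HTTP client is in use. Building it in a pure function makes it easy to inspect exactly what is being sent when the server keeps rejecting the request.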
By the way, authentication was just part of the problem. I needed to set up a way to submit the form, parse the results (which wasn’t hard, just time consuming) and, biggest of all, make a user interface through which users could access the course information. Ideally, the user interface would be able to look into related classes, return information on availability and so on. Sadly, I was running out of time so the most I could do was make a basic yet fragile feature that states the course and its description.
Other Tasks
As for the other tasks, such as answering FAQs and helping students study, I didn’t have enough time to even start those features.
Presentations
During the semester, we had to do a bit of talking. That included:
- a technical talk,
- a limited demo, and
- a general demo
The technical talk went rather well, although it got a bit too complicated. My professor said it was kinda like a Google Tech Talk in its complexity (and complicatedness). I had a video of it, but I deleted it (which I shouldn’t have).
The limited demo was showing what I did to my class. It was far shorter than it was supposed to be, but I showed off something functional. It seemed like all of us were having a hard time, however.
When the day of the general demo, RICE Day, came, I still had a lot of work to do. Still, I had to present something so I went with my Google Home Mini and people really liked it. While it wasn’t complete, people were amazed at how well it understood them. They also loved how you could search for a specific food item.
After RICE Day, I had to summarize and document everything for the next person who might finish it off. Unfortunately, the API I used was due for deprecation the moment I turned everything in.
Conclusion
I had high aspirations for this project, but I couldn’t pull it off because my scope was too broad and ill-defined and I didn’t approach this project in a systematic way. For that reason, I gave myself a B; I did a great job, but I didn’t finish. Thanks Dr. Zelle for helping me throughout this project and being a great advisor!
This was the first time I seriously used JavaScript in the backend. I tried Node.js while I was doing Free Code Camp, but I didn’t get it at the time. I still used JavaScript on the front end, but I stuck to ES5 JavaScript rather than the modern ES6 standard.
Google Cloud Functions was okay, but I do think my professor was onto something when he was skeptical of cloud solutions. I was a guinea pig for the CS department expanding to cloud services. The main issue was that he was scared of vendor lock-in. I tried to decouple my code so it didn’t need Cloud Functions, but it’s still coupled to the way I went about this project. If my professor had told me earlier, I would have used J.A.R.V.I.S A.I running on a Raspberry Pi instead.
Even though I didn’t finish, it was great to see people enjoy my project. I should have gone about this project in a more systematic way, but even as I was frustrated at various points in this project, the joy I got from figuring something out was worth it. I guess that’s why I wanna be a programmer; to solve problems and gain a rush from doing so.
I’ve been struggling to revisit this project since I can’t help but think that I need to do this project properly. Having written out this revisit, I now have a sense of closure and can shelve this project to work on newer, more exciting things.
Addendum: I forgot to mention that, as I was making this project, two people emailed me for help with Dialogflow. I really wish I had the confidence to send an invoice for my help. Still, it’s kinda nice to have people reach out to you based on the work you put out there.