Zoom: the unicorn with an actual horn

Martins Untals
7 min read · Apr 18, 2019

Looking at the upcoming crop of unicorn IPOs, I can't help but add my take as well. With the usual twist: some product ideas.

Lyft, Uber, Pinterest, Zoom, Airbnb, Slack: tons of tech IPOs are coming up this year. I have used most of them, and most of them have had thousands of articles written about them. Everybody has an opinion on the Uber IPO (including me: I think it is The Uber Bubble), but I have not heard much about Zoom. So I wanted to say a couple of words about this company as it stands now, as well as see where it could go in the future.

In essence, Zoom is a maker of video conferencing software. From one angle, that sounds boring. From all other angles, they are the first to have made video calls that actually work. There is no lag, no poor image quality, no jitter, and no stuttering. Of course, it still gets hit by connectivity issues sometimes, but somehow it feels several steps ahead of anything else I have used. Skype is always of poor quality, Teams is unpredictable, and Hangouts was trouble to connect to (when I last used it; I am not sure it even exists anymore). And those are made by giants of the industry, companies with R&D budgets of millions and millions. But somehow it is Zoom that has figured out some sort of edge that makes it feel like: yes, finally someone has done it.

Yes, there are also ‘enterprise’ solutions, like those made by Cisco and other network equipment makers. But they are all too expensive upfront and unusable at any sort of scale, doomed to stay only in boardrooms, gathering dust.

So the competitive landscape is tough and the type of product is relatively old, but Zoom is a new player that has made it work and has actually managed to make a profit in the process. This company does not rely on some future magic to turn multiyear loss-driven hyper-growth into profit at some unknown point in the future. The profit is already here and the competition is lagging behind.

Of course, if Zoom's IPO is successful, competition, especially Microsoft, is going to really push to catch up. They can't afford to lose anything from their entrenched positions in enterprise. Any new software company that would start to play a big role inside enterprises, providing generic, non-customized software, could be a potential bridgehead for new trouble.

Though, in a typical Microsoft strategy, they will just try to integrate (badly) their products with one another and then cross-sell (excellently) them to CIOs all across the world. And if the quality is really good, Zoom's days would be numbered.

So, what could be the product strategy for Zoom for it to stay ahead of the pack? Let's look from different angles and see what we can come up with.

Smarter conference analysis

There is almost no smartness in any conference call software currently. Nobody knows what exactly went on during the meeting: the videoconference platform itself is as smart as a rotary telephone when it comes to the actual content that goes through it.

  • Voice recognition algorithms could be employed to make automatic transcripts of conference calls. Seeing how Facebook and YouTube have been automating transcription in recent years, I am sure that the technology is not out of reach.
  • Even if actual transcription would be too hard to achieve for now, especially given the multitude of languages and accents used by business users, it would be interesting to at least try to tell different speakers apart, and then create a map of the conference call recording with markers showing who spoke when. It would be enough for the secretary of the meeting to attach a name to each speaker's first marker after the meeting, and Zoom would then fill in the rest of the occurrences. On the next call it would reuse the data it had already gathered.
  • Once markers are attached, more advanced analytics can be provided as well: how much time each speaker has used, how many times they have been interrupted, and so on.
  • If anybody takes notes during the meeting, they could be timestamped and automatically attached to the recording of the meeting.
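To make the analytics idea a bit more concrete, here is a minimal sketch of what could be computed once the speaker markers exist. Everything in it is my assumption, not anything Zoom provides: the `Segment` record, the function names, and in particular the definition of an interruption as "somebody else starts talking before you have finished".

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    """One hypothetical speaker marker on the recording."""
    speaker: str   # name attached to the marker, e.g. by the meeting secretary
    start: float   # seconds from the start of the recording
    end: float


def talk_time(segments):
    """Total seconds spoken per speaker."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg.speaker] += seg.end - seg.start
    return dict(totals)


def interruptions(segments):
    """Count how often each speaker was interrupted.

    Assumption: an interruption is a segment by a different speaker that
    starts strictly before the current speaker's segment has ended.
    """
    counts = defaultdict(int)
    ordered = sorted(segments, key=lambda s: s.start)
    for i, current in enumerate(ordered):
        for later in ordered[i + 1:]:
            if later.start >= current.end:
                break  # sorted by start, so nothing further can overlap
            if later.speaker != current.speaker:
                counts[current.speaker] += 1
    return dict(counts)


markers = [
    Segment("John", 0.0, 12.0),
    Segment("Jeff", 10.5, 20.0),   # starts while John is still talking
    Segment("John", 20.0, 25.0),
]
print(talk_time(markers))
print(interruptions(markers))
```

The hard part, of course, is producing the markers in the first place. Once they exist, the statistics themselves are a few lines of bookkeeping.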

Conference product extensions

A good way to stay ahead of the competition is to keep your product better. And once they catch up on voice and video quality, the only obvious next step is to build add-on products. If any of them become popular enough, then the competition has another hill to climb before they can win away your users.

  • With a click of a button (or a voice command), Zoom could initiate a voice contract. Each party would identify themselves, then the agreement would be stated, and then all parties would confirm the contract. Zoom would guide the whole process, confirming identities (just with voice, or maybe by sending an OTP SMS to their phones), replaying the agreement before confirmation, and at the end sending recordings to everybody's email addresses.
  • Integrations with Alexa/Siri/etc: there is a vast amount of audio processing software already built for AI assistants. Zoom could tap into it by relaying parts of the conversation to them if directed by members of the conference call.
  • Automatic chairing of the meeting — going through agenda points, giving specific allotted time to different speakers, asking for decisions, etc.

Hardware(ish) innovations

In a typical call where there are more than two people in the meeting room, and then several people at some table at the other end of the line, there is always a problem with somebody ‘speaking too quietly’. Of course, they are actually speaking fine, the mic is just too far away. And it is never possible to figure out how well you are actually being heard, how sensitive the mic is, and so on. I think there is a lot of space for innovation in this area.

The simplest option would be just to show a good visual indicator of how much sound the mic is receiving at any given time. Make it very visual and very visible, and people will automatically try to get closer and speak louder if needed.
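As a rough illustration of such an indicator, the sketch below turns a block of audio samples into a level in dBFS and then into a text bar. It is only a sketch under my own assumptions: samples are floats between -1.0 and 1.0, and the -60 dB floor and 20-character width are arbitrary choices. Capturing audio from a real microphone is left out entirely.

```python
import math


def level_dbfs(samples):
    """RMS level of a block of samples in dBFS (0 dBFS = full scale).

    Assumption: samples are floats normalised to the range -1.0..1.0.
    """
    if not samples:
        return -math.inf
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else -math.inf


def level_bar(db, floor_db=-60.0, width=20):
    """Render a level as a text bar; floor_db and width are arbitrary choices."""
    if db <= floor_db:
        filled = 0
    else:
        # Map floor_db..0 dBFS linearly onto 0..width characters.
        filled = min(width, round((db - floor_db) / -floor_db * width))
    return "#" * filled + "." * (width - filled)


quiet_block = [0.01] * 480   # somebody far away from the mic
loud_block = [0.5] * 480     # somebody right next to it
print(level_bar(level_dbfs(quiet_block)))
print(level_bar(level_dbfs(loud_block)))
```

A real meter would probably add some smoothing so the bar does not flicker, but even this much would tell the person at the far end of the table that they are barely registering.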

Then we could go into more arcane, cooler territory. My idea is as follows: there is one microphone, most likely attached to a laptop, that has trouble picking up pretty much anything beyond the closest 2–3 people. But if the meeting room holds 15 people, then it also holds at least 15 extra mics, because everybody in the room has a phone, and every phone has a mic. If we could somehow tap into this unused hardware, everybody would be heard. People would just open an app and it would connect to the ongoing conference call.

Phone. With apps.

Of course, having two different sound sources in the same room usually ends up in a high-pitched shriek instead of extra clarity. So, to use extra phones as extra mics, significant extra signal processing would be needed. I am sure it would take lots of engineering, but I am also quite sure that the science on how to do it is out there. People have been working with audio for decades and decades. Now we would just have a lot of it, in real time, processed by the amazing CPUs that are in each phone and/or by cloud machines that can now even stream video game graphics in real time. Distinguishing John from Jeff would be a relative piece of cake.

And imagine if we could also do a little triangulation of where everybody sits in the meeting room. By comparing the relative strengths of several sources of sound, we could probably plot where people are sitting. Of course, the first two points would need some location anchor, but that could be the person who operates the main laptop that initiates the call. If they also used their phone mic, it would provide a second location point in addition to the laptop, and every further person who joins would make the location mapping more precise.
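Here is a toy version of the relative-strength idea, with heavy assumptions that I want to state loudly: the mic positions are already known, the room is an empty rectangle, and loudness falls off as 1/distance with no echo or noise. Real rooms are much messier, and real systems tend to rely on arrival-time differences rather than loudness alone. The function name `locate` and all the numbers are made up for illustration.

```python
import math


def locate(mics, levels, room=(6.0, 4.0), step=0.1):
    """Estimate the speaker position (x, y) in metres by grid search.

    mics   : list of (x, y) microphone positions, assumed known
    levels : positive linear amplitude each mic measured for the same speech
    Assumption: amplitude falls off as 1/distance, so level * distance
    should come out the same for every mic at the true position.
    """
    best, best_err = None, math.inf
    nx = round(room[0] / step) + 1
    ny = round(room[1] / step) + 1
    for ix in range(nx):
        for iy in range(ny):
            x, y = ix * step, iy * step
            logs = []
            for (mx, my), level in zip(mics, levels):
                d = max(math.hypot(x - mx, y - my), 1e-3)  # avoid log(0)
                logs.append(math.log(level * d))
            mean = sum(logs) / len(logs)
            # Spread of level * distance across mics; zero means a perfect fit.
            err = sum((v - mean) ** 2 for v in logs)
            if err < best_err:
                best, best_err = (x, y), err
    return best


# A laptop and three phones at made-up positions in a 6 m x 4 m room.
mics = [(0.0, 0.0), (6.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
speaker = (2.0, 1.0)
levels = [1.0 / math.hypot(speaker[0] - mx, speaker[1] - my) for mx, my in mics]
print(locate(mics, levels))
```

Because only the ratios between the levels matter, the sketch does not need to know how loudly the person is actually speaking. That is also why more phones help: each extra mic adds one more ratio to cross-check against.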

Conference table. All triangulated.

The direction of sound is sometimes used by conference cameras to automatically focus on the person speaking. Zoom could achieve the same with this multiple-phone-mic triangulation, and then the laptop camera would know where to focus. Of course, laptop cameras are not the most mobile devices in history. But if some of the participants were using not only their phone mic but also its camera, then the video source shown to the other end of the call could change automatically as well.

This is a bit of a stretch, of course. It is not like everybody is going to sit holding their phone cameras pointed at others; it is not a rock concert after all. But it might come in handy, especially if the solution scales from a 15-person meeting room to a 500-person town hall meeting. Then anybody in the audience would get a chance to speak, while the person standing next to them would provide both the mic and the video stream of the speaker.

So, to sum it up — Zoom is a software product that generates both revenue and profit. It does not rely on hordes of underpaid gig workers. It does not aim for the future singularity to bring a return on investment. It has the potential to expand its product with all sorts of technological gimmicks that can be made possible using technologies from the far side of the hype cycle, not some sort of crypto-vegan-AI future tech that will be mature only by the time Elon reaches Mars.

Post Scriptum. Do not treat this as investment advice. I know nothing about that sort of thing (setting aside my investment-as-a-philosophy article from a couple of days ago). This is more like an I-told-you-so preparatory piñata, and also a chance for me to let off some steam of idea generation.

Martins Untals

Author of “The Invisible Complexity”, IT executive, and consultant, living and working in UAE, Dubai