Zeus, the ruler of Olympus, was not known for fidelity to his wife Hera. One of Zeus's many consorts was the river nymph Io. The ever-suspicious Hera one day sought out Zeus in an attempt to catch him with the nymph, but before she could do so, Zeus transformed Io into a splendid white cow. Pretending to be entranced by its beauty, Hera coyly asked Zeus for the animal as a gift - a request that he could not refuse. Once in possession of the transformed Io, Hera set one of her favorite servants to guard the cow - the giant Argus, said to have 100 eyes, with at least one open at any given time. For this he was known as Argus the All Seeing.
Argus was so dedicated to his task that it eventually cost him his life at the hands of Hermes, sent by Zeus to retrieve the cow. For his loyal service and watchfulness, Hera honored Argus by placing his eyes on the tail of her sacred bird - the Peacock.
Not in mine eyes alone is Paradise. - Dante Alighieri, "Paradiso"
Last week's editorial provided a general overview of the field of Machine/Computer Vision. We looked at some of the technical aspects of the technology and its current range of applications - both those already deployed and those that are still being developed and wrung out. The opportunities for creating a modern-day Argus seem almost endless.
There are some companies that are the Davy Crocketts and Daniel Boones of this sector, leading the charge into unexplored territories. In today's blog post, we'll look at some of these leaders and innovators to get a flavor for where things are heading.
(Open Disclosure: none of the firms below made a request of me or paid me to talk about them, nor do I have any financial interest in any of these companies.)
Based in Israel, Mobileye pioneered computer vision applications for the automotive sector and calls its collective offerings "Driver Assistance Systems." Mobileye's solutions combine SoC hardware and software that use images captured both in the visual spectrum and with radar. The image data is interpreted to identify, track and anticipate position changes in vehicles, pedestrians, animals and detritus. Static items such as road boundaries, barriers and lanes can also be identified. Traffic signs and signals can be both detected and understood.
Offerings are 3P optimized and built to automotive standards for reliability and durability. They rely on a single camera for a simpler, more cost-effective implementation. The company describes its products as giving a driver extra 'eyes' on the road that warn of traffic conditions, speed limits and so forth - an automotive version of Argus, so to speak. There are also radar-visible light 'fusion' systems for tasks such as collision prediction and avoidance.
Systems are offered with their own display & control units and can alternatively be connected to a user's smartphone through Bluetooth. Product configurations exist that provide 360-degree coverage for the vehicle. There is even automated support for high beam control.
It took a while for Mobileye offerings to catch on, but they have now done so with a vengeance. The company has become a near-monopoly standard for automotive collision avoidance and lane integrity systems. Currently, 18 vehicle manufacturers employ Mobileye products in 160 car and truck models. Mobileye projects this to grow to 20 OEMs and 237 models over the next 12-18 months.
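To give a feel for how a single camera can support collision prediction, here is a minimal sketch of one textbook approach: estimating time to collision from how quickly an object grows in the image. This is an illustration only, not Mobileye's method; the function names, the pinhole-camera and constant-closing-speed assumptions, and the 2-second warning threshold are all hypothetical.

```python
def time_to_collision(width_prev_px, width_curr_px, dt_s):
    """Estimate seconds until contact from apparent growth of an object.

    Assumes a pinhole camera and a constant closing speed, so the object's
    width in the image is inversely proportional to its distance.
    """
    if width_prev_px <= 0 or width_curr_px <= 0 or dt_s <= 0:
        raise ValueError("widths and dt must be positive")
    scale = width_curr_px / width_prev_px  # > 1 means the object is approaching
    if scale <= 1.0:
        return float("inf")  # same size or shrinking: no collision predicted
    return dt_s / (scale - 1.0)


def should_warn(ttc_s, threshold_s=2.0):
    """Hypothetical policy: warn when predicted contact is under the threshold."""
    return ttc_s < threshold_s


if __name__ == "__main__":
    # A vehicle ahead grows from 100 to 125 pixels wide over half a second.
    ttc = time_to_collision(100, 125, 0.5)
    print(ttc, should_warn(ttc))
```

Note that no distance measurement is needed: the ratio of image widths is enough under the stated assumptions, which is part of why a one-camera design can be attractive.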
The runaway success of Mobileye has in fact become a problem for the automotive industry. The SDK, software stack and hardware are all closed and provided 'as is', leaving precious little room for differentiation. Naturally, the industry is seeking alternatives amongst a broad selection of systems and chip houses which are scrambling to get a piece of the action.
Mobileye is not wasting time resting on its laurels, however, and is well ahead in defining the future of computer/machine vision for automotive applications. The company is partnered with two as yet unidentified auto manufacturers to release a more complex system that will permit unmanned, autonomous driving on highways. The offering will combine multiple cameras & radars and is being developed to function properly at highway speeds while accounting for traffic congestion. Products beyond 2016 are being engineered to permit unmanned driving on smaller county roads and even within urban areas, though the company openly admits that the software hurdles are daunting.
As a semiconductor IP supplier based in Mountain View, California, CEVA has risen to a position of hegemony in the embedded DSP space. The company acutely perceives the opportunity that ubiquitous dispersed electronics represents and is taking steps to be a major player in the SoC underpinnings of the IoT in all its manifestations, including machine/computer vision.
To that end, CEVA has developed an interesting offering for this space - the MM3101. The processor is a VLIW/SIMD engine with 7 parallel blocks capable of single-cycle operations up to 256 bits wide. The architecture is Modified Harvard, with separate L1 program and data caches in addition to scratchpad/TCM support. Vector and scalar processing are handled by separate blocks in a 10-stage pipeline. Combined with its home-grown toolchain and a version of the OpenCV computer vision library optimized for the architecture, the MM3101 is targeted at offloading CPUs and GPUs for image processing and computer vision algorithms. With the MM3101 handling these sorts of streaming data applications, CEVA offers SoC designers the option of harmonizing their chip architecture by dedicating the CPU to decision heuristics, network access & administrative/control operations and letting the GPU focus on 3D modelling and analysis.
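To put the 256-bit figure in perspective, the sketch below works out how many pixels fit into one vector operation and how many such operations a single pass over a frame would take. It is back-of-the-envelope arithmetic only: the function names are hypothetical, the 1920x1080 frame is just an example, and it ignores memory bandwidth and the way work is spread across the 7 parallel blocks.

```python
VECTOR_WIDTH_BITS = 256  # from the text: single-cycle operations up to 256 bits


def lanes_per_op(element_bits, vector_bits=VECTOR_WIDTH_BITS):
    """Number of elements one vector operation can process at once."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element size must evenly divide the vector width")
    return vector_bits // element_bits


def vector_ops_for_frame(width, height, element_bits):
    """Vector operations needed to touch every pixel of a frame once."""
    pixels = width * height
    lanes = lanes_per_op(element_bits)
    return -(-pixels // lanes)  # ceiling division


if __name__ == "__main__":
    for bits in (8, 16, 32):
        print(bits, lanes_per_op(bits), vector_ops_for_frame(1920, 1080, bits))
```

With 8-bit pixels, 32 of them fit in one operation, so narrower data types pay off directly in throughput.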
Though certainly a well conceptualized offering, the MM3101 has a potential flaw common to image and video processing architectures with multiple deeply pipelined parallel DSP units - namely, how to keep those pipelines highly utilized. Such SIMD engines are notorious for compelling programmers to hand-optimize algorithms at the assembly level in order to achieve decent IPC (instructions per cycle) in each parallel pipe - a very aggravating and iterative process filled with coding dead ends, rewrites and conundrums.
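The utilization problem can be made concrete with a toy calculation. The sketch below takes the number of useful operations issued in each cycle and reports the resulting IPC and the fraction of issue slots actually filled. It assumes, purely for illustration, one issue slot per parallel block (7 in total); the function name and the sample schedules are hypothetical and are not measurements of any real processor.

```python
SLOTS_PER_CYCLE = 7  # assumption: one issue slot for each of the 7 parallel blocks


def slot_utilization(ops_per_cycle, slots=SLOTS_PER_CYCLE):
    """Return (IPC, fraction of issue slots filled) for a schedule.

    ops_per_cycle lists how many useful operations were issued each cycle.
    """
    if not ops_per_cycle:
        raise ValueError("schedule must contain at least one cycle")
    for n in ops_per_cycle:
        if not 0 <= n <= slots:
            raise ValueError("issued operations must be between 0 and slots")
    issued = sum(ops_per_cycle)
    cycles = len(ops_per_cycle)
    return issued / cycles, issued / (cycles * slots)


if __name__ == "__main__":
    naive = [2, 1, 3, 1]   # made-up schedule from straightforward code
    tuned = [6, 7, 5, 6]   # made-up schedule after hand optimization
    print(slot_utilization(naive))
    print(slot_utilization(tuned))
```

In the made-up naive schedule only a quarter of the slots do useful work, which is the gap that hand tuning tries to close.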
At the moment, it appears CEVA is depending on its internal applications folks and third party partners to develop optimized algorithms on behalf of their customers. It remains to be seen if CEVA can find a means to permit customers to do so independently and with relative ease so that users can hand craft implementations and create their own unique value-add.
There's almost nothing in the High Tech world that somebody in the Fraunhofer Institute is not researching from a novel perspective. It is something of a wonder (as well as a clear indication that the organization is not managed effectively) that Fraunhofer isn't generating a level of revenues equal to Samsung, Apple, Google, Intel and Microsoft combined.
Fraunhofer's latest work in machine/computer vision stems from a subtle insight into the current limitations of CAD. In order to manipulate, observe and interact with 3D models, designers today are forced to use clunky and limited peripherals such as keyboards, mice and joysticks.
A new machine vision system from Fraunhofer completely bypasses such inefficient mechanisms with a gesture recognition apparatus. An array of cameras tracks and interprets hand and finger positions dynamically, altering a 3D screen image to permit a user to virtually interact with the model - in essence, as if they were reaching into the design to activate switches & levers, press buttons and so on.
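Once the cameras have produced a 3D fingertip position, the "reaching into the design" step can be as simple as a hit test against the model's controls. The sketch below shows that idea with axis-aligned boxes. It is not a description of Fraunhofer's system; the Box class, the pressed_controls function and the sample coordinates are all hypothetical, and a real system would also need the tracking itself, filtering and gesture classification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned region of model space occupied by one control."""
    name: str
    min_corner: tuple
    max_corner: tuple

    def contains(self, point):
        # Inside only if every coordinate lies within the box's bounds.
        return all(
            lo <= p <= hi
            for lo, p, hi in zip(self.min_corner, point, self.max_corner)
        )


def pressed_controls(fingertip, controls):
    """Names of the controls whose region contains the tracked fingertip."""
    return [c.name for c in controls if c.contains(fingertip)]


if __name__ == "__main__":
    controls = [
        Box("power button", (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
        Box("lever", (2.0, 0.0, 0.0), (3.0, 1.0, 1.0)),
    ]
    print(pressed_controls((0.5, 0.5, 0.5), controls))
    print(pressed_controls((5.0, 5.0, 5.0), controls))
```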
The implications are enormous. Remote control of robotic/cybernetic systems could be taken to a whole new level of operability. Think of firefighters deploying cyborgs that could be remotely piloted to enter burning buildings, or miners controlling equipment deep underground while safely on the surface.
There are even some mundane applications that could benefit from such a gesture recognition appliance. Computer mice could become as archaic as rotary dial telephones. Video game console manufacturers would have to be stupid to not at least consider licensing this technology.
The Way Forward
Dreams are the touchstones of our character. - Henry Thoreau
The potential of machine/computer vision to alter the way we use all the devices and machines of our modern age is unquestionably breathtaking. But upon reflection, even these astounding innovations merely scratch the surface of what is possible.
What if, instead of remotely manipulating a cyborg with a more advanced version of Fraunhofer's gesture recognition system, you could simply tell it what to do? There are a few things a cyborg would need to have in addition to machine vision. For instance, the cyborg would have to have the capacity to hear and comprehend the spoken word - a voice capture and recognition system, along with a means of understanding these inputs, deciding upon & executing actions and learning from the results.
We'll have to reserve such discussions, though, for a future post. ;-)
Next week, the silly season gets back into full gear! The 17 high tech companies I cover will announce their Q4 2014 and year end financial results during the weeks of January 26 and February 2. We'll be talking about these numbers and other industry developments in the January 30 and February 6 editorials. :-)