Seven years in the making: the past and present of iPhone X facial recognition
In June 2010, Steve Jobs unveiled the classic iPhone 4, a generation still three years away from fingerprint recognition. Just three months later, Apple acquired Polar Rose, a Swedish company of only 15 people. What was Apple after? Polar Rose's unique face recognition technology. That's right: this was the predecessor of the Face ID that would ship on the iPhone seven years later.
It is hard to imagine that in 2010, before fingerprint recognition had even arrived, Apple was already laying the groundwork for 3D vision, and that after seven years of dormancy it would first surface in the form of facial recognition. So when people say Face ID is merely a stopgap for an iPhone that could not put fingerprint sensing under the screen in time, we have to admit we underestimated the Apple empire.
Polar Rose's core technology used artificial intelligence for image and video analysis, extracting three-dimensional information from two-dimensional images. Even so, Apple's 3D vision plans were still a step behind Microsoft's: as early as the E3 show in June 2009, Microsoft had officially unveiled Kinect, a motion-sensing game device that changed how people interact with games, letting us play with body movements alone (Nintendo's Wii still required a handheld controller).
Kinect somatosensory game (picture quoted from gamefy)
The key component here is the depth camera, which captures human body movements and hands them to software to identify, remember, analyze, and process. That technology was supplied by a company called PrimeSense.
PrimeSense, an Israeli company founded in 2005, developed a 3D sensor in 2006. Its founders all had strong research backgrounds. The game industry had stagnated at the time, and they kept thinking about how to change that, for instance by letting players pick up a sword in the game with their own hands rather than a remote controller. In the end, PrimeSense settled on the camera.
Kinect(Image quoted from cnblogs)
At that year's Game Developers Conference, PrimeSense demonstrated the potential of 3D sensors in gaming and won over Microsoft, a partnership that gave birth to the later Kinect.
PrimeSense, the pioneer of structured-light 3D recognition
PrimeSense's 3D recognition technology is a form of structured light called Light Coding (optical coding). For a camera to capture human movement, it needs to judge depth. Structured light does this by giving the light itself structure: a projector casts a grating or striped light source onto the measured object, where the pattern deforms across the surface, as with the bar grating in the picture below, which projects distorted lines onto the surface of the fish.
Principle of structured light technology
A camera at another position captures this distorted image, recording the lines as seen from its angle. With the projector and camera fixed in place, the degree of distortion in the captured lines is determined by the object's depth, so the distorted two-dimensional image can be used to reconstruct the three-dimensional shape of the object's surface. The principle at work here is chiefly optical triangulation.
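The triangulation arithmetic itself is simple. As a minimal sketch (the function and numbers below are illustrative, not PrimeSense's actual calibration): with the projector and camera separated by a fixed baseline, a projected feature shifts sideways in the camera image in inverse proportion to the depth of the surface it lands on.

```python
def depth_from_disparity(focal_px: float, baseline_m: float,
                         disparity_px: float) -> float:
    """Optical triangulation: depth z = f * b / d.

    focal_px      camera focal length, in pixels
    baseline_m    projector-camera separation, in meters
    disparity_px  sideways shift of the projected feature, in pixels
    """
    if disparity_px <= 0:
        raise ValueError("feature must shift relative to the reference")
    return focal_px * baseline_m / disparity_px

# e.g. a 600 px focal length, 7.5 cm baseline, and a 20 px shift
# place the surface at 600 * 0.075 / 20 = 2.25 m.
z = depth_from_disparity(600.0, 0.075, 20.0)
```

The nearer the surface, the larger the shift, which is why structured light excels at close range but loses precision once the disparity shrinks toward the pixel noise floor at a distance.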
Principle of structured light technology
The Light Coding used by PrimeSense works somewhat differently. It emits a light field of "laser speckle" into the space. This light is highly random, and its pattern changes with distance, so no two regions of the space share the same speckle pattern. Once the light is emitted, the entire space has effectively been marked. When an object is placed in this space, its position can be determined from the change in the speckle pattern on its surface.
PrimeSense records the speckle pattern on reference planes at intervals through the space, building a set of three-dimensional reference speckle images. When someone enters the space, the pattern is captured again and correlated against those references, from which a three-dimensional image of the whole scene can be computed. This is the principle behind the depth images of Microsoft's first-generation Kinect.
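The matching step can be sketched as a correlation search. The snippet below is an illustrative toy (the function names and the use of plain normalized cross-correlation are assumptions, not PrimeSense's published algorithm): each observed speckle window is compared against the pre-recorded reference planes, and the best-correlating plane supplies the depth estimate for that region.

```python
import numpy as np

def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized cross-correlation between two same-sized windows."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def depth_of_window(window, reference_windows, reference_depths):
    """Return the depth of the reference plane whose recorded speckle
    best matches the observed window."""
    scores = [ncc(window, ref) for ref in reference_windows]
    return reference_depths[int(np.argmax(scores))]

# Toy demo: three reference planes, each with distinct random speckle.
rng = np.random.default_rng(0)
refs = [rng.random((8, 8)) for _ in range(3)]
depths = [0.5, 1.0, 1.5]  # meters
observed = refs[1] + 0.01 * rng.random((8, 8))  # object near the 1.0 m plane
print(depth_of_window(observed, refs, depths))  # prints 1.0
```

Uncorrelated random speckle scores near zero, while the correct plane correlates strongly even through noise, which is exactly what makes a random pattern a usable spatial marker.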
Since 2009, technology companies have realized that 3D vision is a vast untapped treasure. Microsoft, Intel, Google, Samsung, and others invested in it through acquisitions and in-house development: Microsoft acquired the 3D TOF camera companies Canesta and 3DV Systems in 2009 and 2010, Intel launched its RealSense technology in 2013, and Google launched Project Tango the same year.
Up to this point Apple had barely appeared in the story, but once Microsoft parted ways with PrimeSense, Apple stepped in.
Microsoft goes its own way with TOF; Apple's pickup becomes Face ID
In 2013, a new generation of Kinect arrived. Rather than continue with PrimeSense, Microsoft chose to develop its 3D sensors in-house. The second-generation Kinect in fact adopts TOF, a 3D vision technology completely different from PrimeSense's structured light, built on the accumulated work of the earlier acquisitions Canesta and 3DV Systems, which held TOF camera patents. (It had previously been widely assumed that the first-generation Kinect used TOF.)
TOF 3D vision (picture quoted from China Electronic Network)
TOF is short for Time of Flight. Light pulses are continuously sent toward the target, and a sensor receives the light reflected back from the object; the distance to the object is obtained by measuring the round-trip flight time of the pulses. In practice, an LED typically emits infrared light whose intensity is modulated sinusoidally over time, an imaging sensor receives the infrared light reflected from the object's surface, and depth is computed from the phase difference between the emitted and received signals.
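The phase-measurement arithmetic can be sketched as follows (illustrative constants and function names, not any specific sensor's firmware): the light covers the distance twice, out and back, so a phase shift φ at modulation frequency f corresponds to a depth d = c·φ / (4π·f), and because phase wraps at 2π, the unambiguous range is capped at c / (2f).

```python
import math

C = 299_792_458.0  # speed of light, m/s

def tof_depth(phase_rad: float, mod_freq_hz: float) -> float:
    """Continuous-wave TOF depth from the phase shift between the
    emitted and received sinusoid: phase = 2*pi*f * (2d / c)."""
    return C * phase_rad / (4 * math.pi * mod_freq_hz)

def unambiguous_range(mod_freq_hz: float) -> float:
    """Phase wraps at 2*pi, so depth is unique only up to c / (2f)."""
    return C / (2 * mod_freq_hz)

# At 30 MHz modulation, a quarter-cycle (pi/2) phase shift maps to
# about 1.25 m, and depth readings stay unambiguous out to about 5 m.
d = tof_depth(math.pi / 2, 30e6)
r = unambiguous_range(30e6)
```

Raising the modulation frequency improves depth precision but shortens the unambiguous range, one reason TOF systems are tuned for longer working distances than structured light.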
TOF technology principle
Besides structured light and TOF there is a third approach, binocular stereo vision, but as the technology is still immature it will not be covered here.
Structured light and TOF each have their strengths in 3D vision. Structured light is a mature technology with low power consumption and high lateral resolution, but it is easily affected by ambient illumination, performs poorly in strong light, works only at short range, and is relatively costly. TOF offers good interference resistance and long recognition range, but lower lateral resolution, higher power consumption, and moderate material cost.
Comparison of three 3D vision technologies (picture quoted from Haitong Electronics Research)
In any case, Microsoft gave up on PrimeSense's structured light. Though disappointed, PrimeSense did not slow its research; it reinvented itself and developed Capri, then the world's smallest 3D sensor, offering three times the depth resolution and 50 times the resistance to ambient light (enough to work in sunlight) of its predecessor, in a body one tenth the size.
It was perhaps PrimeSense's push toward miniaturization, which made 3D sensors viable in notebooks, tablets, and even phones, that let Apple see its potential. Microsoft's abandonment of PrimeSense ultimately delivered it into Apple's hands.
And so, at last, in September 2013, Apple stepped into the open, spending $360 million to acquire PrimeSense and putting core 3D vision technology in the hands of this ambitious company.
The invisible pieces behind Face ID: a buying spree
After acquiring PrimeSense, Apple kept pushing the miniaturization of 3D sensors while ramping up investment and patent filings. In April 2015, Apple filed a patent application for 3D gesture control; in July 2015, a facial recognition patent titled "Low Threshold Face Recognition"; and in March 2017, a new 3D vision patent covering face recognition using depth information. A complete face recognition patent map was gradually taking shape.
Then there is a series of crazy acquisitions:
In April 2015, Apple acquired LinX Imaging.
LinX Imaging develops multi-aperture cameras for mobile devices, which reduce the camera's height so that it no longer protrudes. Its other breakthrough is using the multi-aperture design to precisely measure pixel differences across the separate images, producing a depth map: in effect, a three-dimensional scan of an object just by taking a picture.
LinX Imaging 3D scanning technology
In November 2015, Apple acquired Faceshift.
Faceshift is a motion capture company that developed technology for tracking facial expressions in real time and mapping them onto animated characters. It was used in the Star Wars films to make animated characters mirror the actors' expressions more faithfully, and in games it lets users drive avatars that update in real time with their own expressions.
Faceshift expression capture
As it later turned out, Apple used Faceshift not only for face recognition but also for Animoji.
In January 2016, Apple acquired Emotient.
Emotient is a company dedicated to judging people’s emotions through facial expression analysis. This technology will grab people’s faces and then use recognition technology to identify facial expressions.
Perceive facial emotions
In February 2017, Apple acquired RealFace.
RealFace specializes in facial recognition. The company developed a unique face recognition technology that integrates artificial intelligence, aiming to bring human perception back into the digital process.
RealFace face recognition
This string of acquisitions let Apple rapidly master the core technologies of 3D vision, especially face recognition, while also keeping competitors from catching up through acquisitions of their own. And so, at the fall 2017 launch event, the iPhone X with Face ID made its stunning debut.
iPhone X facial recognition
The real essence: a light glance
By Apple's perfectionist design standards, the iPhone X's notch should not exist, yet Jony Ive was willing to make way for this "small space", because he knew that what really builds Apple's next decade is not just the edge-to-edge front screen but, more importantly, that stunning "light glance".
"All along, we have had an idea, looking forward to creating an iPhone like this: one with a full screen that lets you immerse yourself completely, as if forgetting it exists. One so intelligent that your every touch, tap, and word, even a glance, gets a heartfelt response. With the arrival of iPhone X, that idea finally becomes reality. Now, say hello to the future."
The "small space" (notch)
Looking back at that "small space": it integrates the fruits of Apple's face recognition research. The dot projector casts more than 30,000 invisible light points onto your face and analyzes them to build an accurate, detailed depth map. The infrared camera reads the dot pattern, captures an infrared image, and sends the data to the secure enclave in the A11 Bionic chip to confirm a match, which is precisely PrimeSense's structured light technology at work. In addition, thanks to the flood illuminator's invisible infrared light, Apple can recognize your face even in the dark.
Apple chose PrimeSense's structured light over TOF, valuing its appropriate recognition distance and high resolution; its lower power consumption relative to TOF also makes structured light better suited to mobile platforms. Structured light does, however, have a natural weakness in strong light, and whether the iPhone X still performs well under bright sun remains to be seen.
It is worth noting that Face ID is only part of the energy released by Apple's seven years of accumulated 3D vision work. Beyond face unlocking, the iPhone X also gained a series of features that look like muscle-flexing but are genuinely fun, such as Animoji.
Through the intricate TrueDepth camera system, the iPhone X can analyze more than 50 different facial muscle movements to detect the face's 3D contours. From this comes Animoji: the user's face is captured as a 3D model. The special effects we often see in films belong to the same category, only now achievable on a phone.
Animoji expression capture
Paired with the A11 Bionic's deep learning, Face ID also learns. A dedicated neural engine uses machine learning to recognize changes in your appearance. Apple's recently published Face ID white paper reveals that if your face changes significantly, after shaving, for example, Face ID will confirm the change via passcode entry and automatically update its records, instead of making you re-enroll your face.
Face ID is just a warm-up; Apple's big 3D move is AR
The patents show that Apple has also stockpiled a series of 3D gesture control technologies. Now that motion sensing has gone lukewarm, we would be glad to see a "latecomer" like Apple upend it again, and that is where 3D vision will truly be put to use.
3D gesture control may well be realized on the Mac platform, and Face ID on mobile is just a first taste of 3D vision. Apple has said AR will be an indispensable part of the future, and the potential unleashed by adding 3D vision to the rear dual camera should not be underestimated.
AR game
Apple has already released the ARKit development platform, which has quickly grown into a huge AR ecosystem. Yet the iPhone X's AR experience is still built on conventional cameras, a far cry from a true 3D camera. So when Cook enthusiastically introduced the iPhone's AR features at the launch event, he was not only setting expectations for the next decade but also eyeing your wallet for the next decade.
That concludes this look at the iPhone X's facial recognition. For follow-up coverage of the iPhone's design, screen, camera, battery life, and performance, stay tuned to Zhongguancun Online.