A few years ago, a startup I was working for was acquired by a massive payments platform in China. I never truly appreciated the power of large AI training sets until one hazy Hangzhou afternoon when we were discussing the accuracy of various algorithms.

What really captured my attention was that even though the biometric our startup had invented (eye vein recognition) was theoretically more unique than almost all other biometrics, it was outperformed significantly by our parent company’s ordinary facial recognition. Eye veins are truly random, more so than fingerprints. I was baffled that, of all things, facial recognition was beating us, and it wasn’t even close.

Over some lip-numbing, soul-warming hotpot with our colleagues, we figured out why this was happening. It was their training set. It was a mind-blowingly massive training set, probably unique in the entire world: half a billion high-resolution, standardized photos of people’s faces, used solely for identification purposes, in one centralized repository that our parent company had curated. The implications of this training set ultimately led me to leave the startup not long after the acquisition. I never truly felt comfortable with how these biometrics could be used.

Silicon Valley and the world’s largest automakers are all vying for autonomous superiority and failing, even though all of their strategies are sound on paper. The problem is that, at scale, there simply isn’t enough diversity in their training data. You can’t drive a test vehicle loaded with sensors around Cupertino and expect it to perform well during an Alaskan winter. It should be no surprise that the few cities with active autonomous vehicle fleets are in warm climates with little precipitation. For autonomous vehicles to work, we need training sets that are orders of magnitude more diverse and in-depth. This is the same lesson we learned when our scrappy startup’s theoretically superior biometric was outperformed by a relatively simple facial recognition implementation: having a massive training set is key.

Tesla is probably the closest, but their vehicles are expensive and cater to a very narrow demographic (in the US, the typical Tesla owner can be described as an affluent white male working in a STEM career). In other words, Teslas aren’t often driving on dirt roads or in low-income neighborhoods. This isn’t just true for Tesla; it is the case for all players in the field right now. They are either doing very sparse, geographically limited data collection via test vehicles and contractors, or large-scale collection using sensors on their high-end vehicle trim packages. None of these vehicles are driving around in the Ozarks or the Iron Range in Minnesota.

In order to usher in our autonomous future, we need a massive training set. There’s no way around it. Insight Autonomy is setting out to short-circuit the status quo when it comes to AI training. We want you to participate actively and ethically. Moreover, we will compensate you for your participation.

We will never sell any personally identifiable information to a third party. Our goal is to accelerate progress in vehicle autonomy by providing AI training sets and models based on how people drive across diverse conditions, regions, socioeconomic statuses, vehicle types, and so on.

We want to disrupt the slow trickle-down of advanced vehicle safety features. We don’t want you to have to wait 10 or 15 years to be safer. We believe emphatically that, with your help, we can bring about our autonomous future sooner, and in a way that includes everyone, not just people who can drop $90k on a car.