SDS PODCAST
EPISODE 251:
TRANSFORMING
THE IDENTITY
AUTHENTICATION
SPACE
Kirill Eremenko: This is episode number 251 with CEO and Data
Scientist at TypingDNA, Raul Popa.
Kirill Eremenko: Welcome to the SuperDataScience Podcast. My name
is Kirill Eremenko, Data Science Coach and Lifestyle
Entrepreneur, and each week, we bring you inspiring
people and ideas to help you build your successful
career in data science. Thanks for being here today,
and now, let's make the complex, simple.
Kirill Eremenko: This episode is brought to you by our very own data
science conference, DataScienceGO 2019. There are
plenty of data science conferences out there.
DataScienceGO is not your ordinary data science
event. This is a conference dedicated to career
advancement. We have three days of immersive talks,
panels, and training sessions designed to teach,
inspire and guide you. There are three separate career
tracks involved, so whether you're a beginner, a
practitioner, or a manager, you can find a career track
for you and select the right talks to advance your
career.
Kirill Eremenko: We're expecting 40 speakers, that's four, zero, 40
speakers to join us for DataScienceGO 2019. Just to
give you a taste of what to expect, here are some of the
speakers that we had in the previous years, Creator of
Makeover Monday Andy Kriebel, AI Thought Leader
Ben Taylor, Data Science Influencer Randy Lao, Data
Science Mentor Kristen Kehrer, Founder of Visual
Cinnamon Nadieh Bremer, Technology Futurist Pablos
Holman, and many, many more.
Kirill Eremenko: This year, we will have over 800 attendees from
beginners to data scientists to managers and leaders,
so there will be plenty of networking opportunities with
our attendees and speakers, and you don't want to
miss out on that. That's the best way to grow your
data science network and grow your career. As a
bonus, there will be a track for executives. If you're an
executive listening to this, check this out. Last year at
DataScienceGO X, which is our special track for
executives, we had key business decision makers from
Ellie Mae, Levi Strauss, Dell, Red Bull, and more.
Kirill Eremenko: Whether you're a beginner, practitioner, manager, or
executive, DataScienceGO is for you. DataScienceGO
is happening on the 27th, 28th, 29th of September
2019 in San Diego. Don't miss out. You can get your
tickets at www.datasciencego.com. I would personally
love to see you there, network with you and help
inspire your career or progress your business into the
space of data science. Once again, the website is
www.datasciencego.com, and I'll see you there.
Kirill Eremenko: Welcome back to the SuperDataScience Podcast, ladies
and gentlemen, super excited to have you on the show
today because we've got a very exciting guest, Raul
Popa, who is the CEO and Data Scientist at
TypingDNA. What you're about to experience on this
podcast is very, very different to probably anything
you've heard before because we're talking about a
brand new industry that is completely crushing it and
disrupting everything we know about security, and
this is the world of typing biometrics.
Kirill Eremenko: The idea behind typing biometrics is that based on
how you type, whether it's on your laptop or computer
or on your mobile phone, it can be established that
you are you. You can be identified just as that can be
done with your fingerprint or with facial recognition,
same thing can be done through the patterns that you
use for typing. As you can imagine, that can
completely revolutionize how we identify people, the
whole world of two-factor identification. It's also a very
passive process, non-intrusive. You don't have to get
an SMS and type in a code. It just happens in the
background.
Kirill Eremenko: Raul Popa is the CEO and the Data Scientist at a
company called TypingDNA that is spearheading this
whole industry, one of the leading companies in this
space. Today, we'll get to hear from him all about this
world. You'll learn about typing biometrics, what it is,
how it works, how machine learning and data science
enable and propel this industry forward. You'll also
hear about different applications ranging from making
sure students don't cheat on exams all the way to two-
factor authentication for banks and other financial
institutions.
Kirill Eremenko: The fact that Raul is both the CEO and the data
scientist at the same time makes this conversation
that much more interesting because we get to dive into
both worlds. From the perspective of data science, we
talked about pattern recognition, anomaly detection,
one-shot learning, binary classification, data sampling,
generative algorithms and more. From the world of
business, we talked about what it's like to run a data
science startup and going from idea to research to
making a business happen. As you can imagine, we've
got lots of interesting things to cover.
Kirill Eremenko: Before we dive in, I want to give a shout out to the fan
of the week. This one is from Andy who said, "The
FiveMinuteFriday episodes always feature an insightful
look into a unique topic, meditation, the importance of
enjoying the moment, how to maximize efficiency and
continuous inspiration to supercharge our own
careers. Thank you, Kirill." Thank you, Andy, so much.
For the rest of you guys out there, if you haven't left a
review yet, make sure to head on over there on your
podcast app or just go to iTunes, and you can leave a
review on the SuperDataScience podcast. It would
mean a ton to me. I would be very, very excited to read
your review. All right, guys, I'm super excited about
this one. Let's dive straight into it. You will learn
plenty about running data science startups and how
the world of typing biometrics works. Without further
ado, I bring to you Raul Popa, the CEO and Data
Scientist at TypingDNA.
Kirill Eremenko: Welcome back to the SuperDataScience Podcast, ladies
and gentlemen, super excited to have you back here on
the show. For today's episode, we've got Raul Popa
joining us from New York. Raul, how are you going
today?
Raul Popa: Hi, I'm fine, everything is fine, welcome to your
audience and nice to meet you.
Kirill Eremenko: That's awesome. That's awesome. Thank you for
coming on the show. I like how we were chatting before
the podcast that because you're based between
Romania and New York, we're talking about the
weather in New York, and it's like 17 degrees, and
usually, when I talk to somebody from the US, it's in
Fahrenheit. That's very cool. How do you find
adjusting to the weather in the US, when everything is
in Fahrenheit and you are used to Celsius?
Raul Popa: I just set it on Celsius on my phone, and everything is
fine. I don't really know it's Fahrenheit. I know that I
have to set temperature in the apartment around 70,
but I don't really know what that means in Celsius.
Kirill Eremenko: Yeah. Yeah, it's so interesting. It's not a trivial
conversion from Fahrenheit
to Celsius. I personally get confused all the time, and I
just wish there was one system around the world.
Nevertheless, how are you enjoying New York? Have
you been there for a long time this time around?
Raul Popa: Yeah, only for two weeks. I used to spend more time.
Last year, I've been here about five months. The rest of
the time, I've been in, mostly, Europe and Romania.
Kirill Eremenko: Okay. Got you. You've done quite a few presentations
and pitches through the nature of your business.
You've even done a TEDx talk, but this is your first
podcast, so congratulations. I think our listeners are
going to be very excited to hear what you're about to
share. Are you excited about this?
Raul Popa: I'm really excited, yeah.
Kirill Eremenko: Awesome. Awesome.
Raul Popa: Maybe a bit nervous as well.
Kirill Eremenko: That's totally normal. That's very normal. All right, so
Raul, tell us a bit about yourself. You're running this
very innovative, different business that many people
haven't even heard of this type of technology before,
and by the looks of it, from what I've seen, what I've
read, you guys are really crushing it. To get our
listeners up to speed, please tell us what is TypingDNA
and how did you come up with this idea.
Raul Popa: Yeah, so I'm CEO and Data Scientist at TypingDNA,
and this is called typing biometrics, a behavioral
biometrics company, basically. We look at how
people type and build behavioral biometrics profiles
that we use for authentication and fraud prevention.
From an AI perspective, I'm more into pattern
recognition, anomaly detection, one-shot learning, and
binary classification. I know it sounds trivial, but
being able to give a yes or no answer in a fraud
problem is really tough, and the difference between 90
and 95% accuracy matters a lot.
Raul Popa: The techniques behind our technology are
exponentially more complex than what you would
typically think for solving a binary classification
problem or a one-class classification problem if you
want, so just to understand, from a machine learning
point of view, how it looks. I think what we're building
looks innovative from the outside. From the inside, it's
just pattern recognition. It's just applied to something
that machine learning was not applied before or not to
this extent.
Kirill Eremenko: Okay. Okay. Got you. That's very exciting. For us to
get a better understanding, like an intuitive
understanding of how this works, let's say you're
working ... You partner up with financial institutions.
You partner up with banks, other kind of companies
that ... Tell us a bit about that. What kind of
companies use TypingDNA services?
Raul Popa: Yeah, so we're at the beginning now. TypingDNA itself
is at the beginning, but basically anywhere where you
would need another factor for authentication or
another security layer other than a simple password or
anything like a push identification or one-time
passwords sent via text message, you will probably
want to use TypingDNA.
Kirill Eremenko: Okay.
Raul Popa: We started with proctoring companies or online
assessment companies, companies like ProctorU or
Mind Prov. Actually, they verify students when they
take exams, and before using our technology, they
were using real people. It's more expensive like that.
We helped them reduce people, and everybody is
happy. Students can take exams faster, and they
cannot cheat as much as they did before.
Kirill Eremenko: Okay. Got you. Very interesting, and so tell us a bit
about how this would work in the background. Let's
say a financial institution, a bank is using TypingDNA
to authenticate their users. I'm logging onto my online
account. I get to my computer. I type in the address of
the URL of the bank's online portal. I type in my name.
I type in my password. Then I click Log In, and this is
the point where I'd normally get an SMS to verify that
it's me. Where in that process does TypingDNA come
in? Where do your algorithms start to recognize me?
Raul Popa: You don't really have to get the SMS if you're using
TypingDNA. That's the point. Anywhere you type
something, like your email password, credit card,
anything that you previously typed, you can use ... An
application can use TypingDNA to record the timing
between the keys, how long you keep each key
pressed. This is the kind of information that we're
looking at. Based on that, we build a profile that I told
you about, and we can do authentication right then
and there and use that as a two-factor.
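The two measurements Raul mentions, the timing between keys and how long each key stays pressed, can be sketched in a few lines of Python. This is only an illustration: the KeyEvent type, the function name, and the timestamps are made up for the example and are not TypingDNA's API or data.

```python
from dataclasses import dataclass


@dataclass
class KeyEvent:
    key: str
    down_ms: float  # timestamp when the key was pressed (hypothetical data)
    up_ms: float    # timestamp when the key was released


def timing_features(events):
    """Return (dwell, flight) lists in milliseconds for a typed sequence."""
    # Dwell: how long each key is held down.
    dwell = [e.up_ms - e.down_ms for e in events]
    # Flight: time from one key press to the next key press.
    flight = [b.down_ms - a.down_ms for a, b in zip(events, events[1:])]
    return dwell, flight


events = [
    KeyEvent("r", 0.0, 95.0),
    KeyEvent("a", 140.0, 220.0),
    KeyEvent("u", 310.0, 400.0),
    KeyEvent("l", 455.0, 530.0),
]
dwell, flight = timing_features(events)
print(dwell)   # [95.0, 80.0, 90.0, 75.0]
print(flight)  # [140.0, 170.0, 145.0]
```

A vector made of these numbers is the kind of raw input a typing profile could be built from.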
Kirill Eremenko: Got you.
Raul Popa: Also, financial institutions can use our technology for
employee-facing authentication. Nowadays, for 2FA,
they typically use a hard token or a push identification
system, you'd have to install something on your mobile
phone and use that. I think that's really not something
very user-friendly, so we offer more like a frictionless
solution. We just look at how people type. If that fails,
sometimes, it fails, if that fails, we can always go and
use the one-time password over SMS or the push
identification or anything other.
Kirill Eremenko: Okay. Got you. How much typing is required to
authenticate a user? If I just type in my email, is that
enough, or do you need me to type in maybe 20 words
or 50 words? What is the minimum amount of words
required to authenticate somebody?
Raul Popa: I think this is where things get really interesting. For
data scientists, I think this is where things get really,
really interesting because to build a good model, you
will need 15 samples at least, 15 previous typings, so a
person typing 15 times. You asked for the length of the
text, we can work on text of eight characters or 20
characters. We found that a lot of use cases have
around 28. This is the average of use cases that we
found, around 28 characters, like email plus password
combination, or credit card name, or other things like
that, but also username, password is a bit less, but
still, 20, 20-something plus usually work really well.
Raul Popa: We can do authentication even with just one or two
samples. I know this is something that machine
learning was not supposed to be used for, but we
actually do that. This field of one-shot learning, we're
using very few samples to do prediction, to predict
whether somebody is the same person as the
owner of the account or not. That's really interesting.
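As a rough picture of what matching a new typing sample against previously enrolled samples can look like, here is a minimal Python sketch. It uses a simple scaled distance from the average of the enrolled samples, which is a textbook baseline and an assumption of this example, not the method TypingDNA uses. The function names, the threshold of 2.0, and the timing values are all hypothetical.

```python
from statistics import mean, stdev


def build_profile(samples):
    """samples: equal-length timing vectors (ms) from earlier typings."""
    columns = list(zip(*samples))
    means = [mean(c) for c in columns]
    # Floor the spread so a very consistent feature cannot divide by ~0.
    spreads = [max(stdev(c), 1.0) for c in columns]
    return means, spreads


def distance(profile, attempt):
    """Average absolute deviation from the profile, in units of spread."""
    means, spreads = profile
    return mean([abs(x - m) / s for x, m, s in zip(attempt, means, spreads)])


def verify(profile, attempt, threshold=2.0):
    # threshold is an illustrative value, not a recommended setting
    return distance(profile, attempt) <= threshold


enrolled = [
    [95, 80, 140, 170],
    [100, 85, 150, 160],
    [90, 75, 145, 165],
]
profile = build_profile(enrolled)
genuine = [97, 82, 143, 168]
impostor = [60, 120, 200, 110]
print(verify(profile, genuine))   # True
print(verify(profile, impostor))  # False
```

With more enrolled samples the means and spreads become more stable, which is one intuition for why a model built from 15 typings is harder to fool than one built from one or two.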
Kirill Eremenko: That's very cool. How unique are these typing
patterns? With fingerprints, they're pretty reliable.
With facial recognition, even more reliable. How about
typing patterns? What are the chances of two people
having the same typing pattern?
Raul Popa: This is a good question. I want to say it's as ... I think
the technology is at the beginning, and we will see
more improvements in this. Depending on how much
text you've got from the user, if the user typed a
sentence 10 times, then you can capture the
uniqueness of his typing. It will be really hard to
break. If you only want to do authentication after just
one or two samples, then the authentication, the
match will have some potential error, some potential
false positives or false negatives exist there.
Kirill Eremenko: Yeah. Okay. You would say it's almost as reliable as a
fingerprint when you have enough data to authenticate
the user, let's say, 10 samples we have or 15 samples.
Raul Popa: Yeah, I'd say that, but I'd say that fingerprints and
face recognition, for example, on the other side are so
public. I mean, you leave your fingerprints everywhere,
your faces everywhere, so somebody can snap a photo
of you at high resolution and use that to authenticate
pretty much anywhere other than face ID. With typing
biometrics, it's not the same. You can ask in your own
application for that user to type a specific name that
he will never type in other platforms, like random
combinations of words, his password, or anything like
that. To get how that person types that exact text in a
different environment is almost impossible. We're
looking at something that has a more reliable character,
like a typing pattern. From a reliability point of view, I
think that it can be even more reliable than fingerprint
or face.
Kirill Eremenko: Wow, I'm still sitting here in amazement because I've
never considered this to be an option for authenticating.
Well, in that case, if you ... You have somebody who
types in their password, but what about tools that
teach people how to type? There's the whole fast speed
typing learnings where you can use the QWERTY or
the Dvorak keyboard layout to type faster. Let's say
somebody is using the standard QWERTY keyboard
layout and they learned how to type faster, so they
learned those techniques for typing. Now, you have,
let's say you have a thousand people who learned
using the same typing program, and they're very
diligent. They got very good at it. Wouldn't they have
very similar typing patterns?
Raul Popa: I don't think they are similar because their size of
fingers and how they use the fingers and the muscles
differs. If people would take the same running classes,
would they run the same? Would their walking style be
the same? I don't think so. It could be very similar, but
still, there is a lot of character in watching somebody
typing or watching somebody walking or even
speaking. Even if you sing at the opera and other 10
singers, you would not maybe distinguish the opera
voice between those, but when you talk like normal
people, your voice will be singular. Right?
Kirill Eremenko: Mm-hmm (affirmative).
Raul Popa: This is only intuition. I'm just talking about what my
intuition says, but from our test, we adapt. For
example, it's like face recognition adapts when you
grow a beard, or-
Kirill Eremenko: Or you're wearing sunglasses.
Raul Popa: Yeah. Yeah. Yeah.
Kirill Eremenko: Okay. Got you.
Raul Popa: We adapt when people start typing faster or differently,
but if it's very different, then we have to fall back to a
different method for authentication, for example, but
not all use cases are about authentication. We can do
all sorts of other things, like making sure somebody is
not sharing accounts with other people, like all sorts of
things. We have about 20 different use cases that we
identify.
Kirill Eremenko: Got you. Okay. Well, that's very cool already, so I hope
our listeners are as excited as I am to dive into this. I
really love that you are both the CEO and the data
scientist, and you actually point that out in your
LinkedIn that you perform both roles in your company.
It's very cool. That means we can dive into the whole
notion of setting up a business about a really cool idea
and about some research, a data-driven business, and
on the other hand, the techniques and algorithms that
allow for all this to happen, for this technology to
work.
Kirill Eremenko: Probably, let's start on the data science side of things.
We already touched on a little bit on the pattern
recognition, anomaly detection, one-shot learning,
binary classification. What can you share? I know a lot
of this would be proprietary information and that you
can't share freely on this podcast. Nevertheless, what
can you share with our listeners that might be exciting
for them? What kind of machine-learning algorithms
are used in typing biometrics these days? What's new?
What's hot, and what can they look into if they ever
want to get into this field?
Raul Popa: It's a tough question. For TypingDNA, I cannot go into
details, unfortunately. I can say one of the most
important things, greatly underestimated, not just in
typing biometrics, but in any kind of machine-learning
applied technology is data sampling. We had a lot of
misfortunes at the beginning of building TypingDNA
because we didn't address that correctly. Whenever
you deal with fraud prevention, anomaly detection,
you'll always find extremely unbalanced sets of data,
also very noisy, so it's easy to throw away 90% of your
extra data, but it's not always an option.
Raul Popa: Generative algorithms also seem to work really well on,
mostly, all levels, I mean, regardless of what you're
building, I suggest people should try to generate data
as much as possible. Make sure you don't compromise
your testing and cross-validation sets, however, but
definitely generate data if you can. It can help you
improve your general accuracy no matter what you're
building.
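The warning about not compromising the test and cross-validation sets comes down to ordering: split first, then generate extra data from the training rows only. The Python sketch below shows that ordering on an unbalanced toy dataset. Adding random jitter to rare-class rows is used here as a very simple stand-in for a real generative algorithm, and every name, size, and number in it is an assumption made for illustration.

```python
import random


def split_rows(rows, test_fraction, rng):
    """Shuffle a copy of rows and return (train_rows, test_rows)."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * test_fraction)
    return shuffled[cut:], shuffled[:cut]


def build_sets(genuine, fraud, test_fraction=0.25, copies=3,
               jitter_ms=5.0, seed=0):
    rng = random.Random(seed)
    genuine_train, genuine_test = split_rows(genuine, test_fraction, rng)
    fraud_train, fraud_test = split_rows(fraud, test_fraction, rng)
    # Synthetic rows are generated from TRAINING rows only, after the
    # split, so nothing derived from a test row leaks into training.
    synthetic = [
        [value + rng.gauss(0.0, jitter_ms) for value in row]
        for row in fraud_train
        for _ in range(copies)
    ]
    train_set = [(row, 0) for row in genuine_train]
    train_set += [(row, 1) for row in fraud_train + synthetic]
    test_set = [(row, 0) for row in genuine_test]
    test_set += [(row, 1) for row in fraud_test]
    return train_set, test_set


# Hypothetical unbalanced data: 40 genuine rows, 4 fraud rows.
genuine = [[100.0 + i, 150.0 + i] for i in range(40)]
fraud = [[60.0 + i, 220.0 + i] for i in range(4)]
train_set, test_set = build_sets(genuine, fraud)
print(len(train_set), len(test_set))  # 42 11
```

The test set keeps only real rows, so accuracy measured on it still reflects real typing rather than the generator's output.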
Raul Popa: A big part of what we're doing is called one-shot
learning, being able to predict the class just by seeing
one single sample or very few ones. Techniques like
transfer learning might work well here, just there's not
enough information out there about what to do and
how to do that. I was researching a lot about how to do
one-shot learning because I really wanted to make the
technology work for just one previous sample or two
previous samples because, otherwise, it's not efficient.
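One way to picture how knowledge from other users can help when only a single previous sample exists: estimate how much people typically vary from one typing to the next using a background population, then use that borrowed variability to judge a new attempt against the one enrolled sample. The Python sketch below does exactly that and nothing more. It is a simplified illustration of the transfer idea, not a description of TypingDNA's one-shot method, and all names, numbers, and the threshold are hypothetical.

```python
from statistics import mean, stdev


def background_spread(users):
    """users: list of users, each a list of timing vectors (ms).

    Returns, per feature, the average within-user standard deviation.
    """
    n_features = len(users[0][0])
    return [
        mean([stdev([sample[j] for sample in user]) for user in users])
        for j in range(n_features)
    ]


def one_shot_match(enrolled, attempt, spread, threshold=2.0):
    """Compare one enrolled sample with one attempt, scaled by spread."""
    score = mean(
        [abs(a - e) / s for a, e, s in zip(attempt, enrolled, spread)]
    )
    return score <= threshold


# Background users the system has already seen (made-up values).
other_users = [
    [[100, 150], [110, 160], [90, 140]],
    [[80, 200], [84, 212], [76, 188]],
]
spread = background_spread(other_users)

enrolled = [95, 170]  # the single previous sample of a new user
print(one_shot_match(enrolled, [99, 165], spread))  # True
print(one_shot_match(enrolled, [60, 230], spread))  # False
```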
Raul Popa: Also, since we're talking about security, our
technology is used to prevent fraud for security purposes.
A good technique is to use multiple algorithms
[inaudible 00:23:28], stacked generalization or
blending, but unlike for Kaggle competitions, stacked
generalization works well for security algorithms
because it's harder to break multiple algorithms at the
same time. If you're a hacker, you want to break these
algorithms, really hard.
Raul Popa: You probably know about adversarial examples,
samples used to trick traffic sign recognition, probably use
[crosstalk 00:23:57], stop signs that you stick some
tape on it, and all of a sudden, self-driving cars will
recognize those as 45 miles limit, speed limit instead of
stop sign, and that's a huge thing. Right?
Kirill Eremenko: Yeah.
Raul Popa: Adversarial glasses used to trick face recognition
systems to believe you're somebody else. Using
completely different types of machine-learning
algorithm in your production will help reduce the
ability for a hacker to hack into your system with a,
sort of, master key if you want, but this is something
that I suggest people would do if they do anything
related to security or to fraud prevention or to verify
users in any way, anything related to authentication or
identity. Yeah, so unlike Kaggle, where people do
blending and stacking to get better accuracy, here
you're not after the best accuracy. You're after a better
chance of succeeding, or a smaller chance of succeeding
for hackers.
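The security argument for combining different kinds of models can be shown with a small Python sketch: an attempt is accepted only if several independent checks agree, so an input crafted to fool one of them still gets rejected. Note that this is a plain agreement rule, which is simpler than stacked generalization (where a second model learns how to combine the first ones). The three checks, their thresholds, and the sample values are all invented for the example.

```python
def mean_dwell_check(attempt):
    """Model A (hypothetical): average key hold time near 85 ms."""
    dwell = attempt["dwell"]
    return abs(sum(dwell) / len(dwell) - 85.0) <= 10.0


def flight_range_check(attempt):
    """Model B (hypothetical): every key-to-key time in a usual range."""
    return all(120.0 <= f <= 190.0 for f in attempt["flight"])


def rhythm_check(attempt):
    """Model C (hypothetical): longest vs shortest gap stays moderate."""
    flight = attempt["flight"]
    return max(flight) / min(flight) <= 1.5


def accept(attempt, verifiers, required=None):
    """Accept only if at least `required` verifiers agree (default: all)."""
    votes = [verifier(attempt) for verifier in verifiers]
    needed = len(votes) if required is None else required
    return sum(votes) >= needed


verifiers = [mean_dwell_check, flight_range_check, rhythm_check]
genuine = {"dwell": [95.0, 80.0, 90.0, 75.0],
           "flight": [140.0, 170.0, 145.0]}
# Crafted to satisfy model A exactly while ignoring the others.
crafted = {"dwell": [85.0, 85.0, 85.0, 85.0],
           "flight": [100.0, 300.0, 100.0]}

print(accept(genuine, verifiers))              # True
print(accept(crafted, verifiers))              # False
print(accept(crafted, verifiers, required=1))  # True
```

The last line shows the point of the design: with a single model as the gate, the crafted input gets through, while requiring agreement from different models blocks it.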
Kirill Eremenko: Got you. That's very interesting about the comparison
to data science competitions versus the real world and
different objectives. What would you say your
experience with research has been in data science? We
talked a little bit about, in competitions, you just want
many models to get the best accuracy. In the real
world, you want many models, and specifically in this
use case, to get the best security. What about
research? How was your process of researching this
technology using data science?
Raul Popa: I learned a lot. I did some of the main data science
online courses, tried to follow masters, Andrew Ng,
Geoff Hinton, Yann LeCun, Yoshua Bengio, Ian
Goodfellow, so forth. Basically, I read stuff that was
new to data science or to machine learning and trying
to understand what are all these things going, where
are all these things will ... Where it will connect
without a purpose in mind, like doing typing
biometrics or other pattern recognition problems. I just
wanted to learn more.
Raul Popa: A few years ago, there were not a lot of resources, not
a lot of frameworks and libraries, and you had to
basically code everything yourself, and understanding
the math, use the great apps. It was really important.
For example, I never did Kaggle competitions. I really
think people doing that are really crazy. I know a few
guys who really scored on top of that. I know building
hundreds and hundreds of models and stacking them
and getting rid of half of your work or 90% of your
work just to make a small minor improvement takes a
lot of time, a lot of ambition. I could have never done
that. Yeah, but one of the things that I really like is
you have the Winner's Interview on Kaggle blog, those I
really like to read, and I still read them, really cool.
Kirill Eremenko: Okay. Got you. Got you. From your research, first of
all, you mentioned before, back in the day, there
weren't so many frameworks available, and you had to
go into the math and that was very helpful. Now that
we have things like TensorFlow, PyTorch, and other
tools that make it easier to create, let's say, for
instance, in this case, deep-learning algorithms, AI,
would you say that learning the math is essential, or
people can get started faster without having to learn
the math?
Raul Popa: You don't need to know the math anymore. I mean,
you can if you really want or if you want to develop the
field. If you want to advance the field, yes, you
definitely have to understand the details to create that.
Other than that, just to use machine learning for your
standard problems like typical computer vision,
recognize objects, classify things, definitely, you don't
need to know the math.
Kirill Eremenko: Got you. Okay. Then going from research to building a
company, so tell us a bit about that process. How did
the research you were doing turn into the idea of
actually building a company, and what are some of
the challenges that you faced with building a startup
out of research?
Raul Popa: Yeah, I spoke at a few events about this, and they keep
inviting me to talk about this because it seems like a
lot of data scientists want to make the move to start a
business or do a startup, and it's not really easy
because they're very, very good at what they do. We all
know data scientists are paid really well, really hard to
break from that lifestyle and make a company where
you will work forever, 24/7. You will never see the light
at the end of the tunnel. That's really hard.
Kirill Eremenko: [Inaudible 00:29:49] Yeah.
Raul Popa: Yeah. Yeah.
Kirill Eremenko: Eat rice and beans, live on a friend's couch, very
difficult.
Raul Popa: Exactly, so it's really, really hard. I wouldn't advise a
lot of people to do this unless they're risk-takers.
There is a case where you research something and you
find that thing that nobody advanced it enough like I
found typing biometrics. There are so many fields
where the research didn't go deep enough, and you
just know that you have a ... You can say it's a
breakthrough, but maybe it's just something that you
saw in a different domain. You just apply that and
realize, "Hey, I can use this to advance this field." At
that point, you can actually use that to create a
company or to start a product and actually solve the
problem. I think that's like a calling. If you have that,
why not start a company? Right?
Kirill Eremenko: Okay. Do you think somebody can start a company
while they have a normal nine-to-five job and see how
it goes, test things out?
Raul Popa: I don't know a lot of people who managed to do that. I
think it's really hard sometimes, and if you have a safety
net to go back to like a normal job, you would just do
that. You would have to let go of that job and live on
whatever you have saved or raise some money, find co-
founders and start a company. It takes a lot of
courage, but the good thing is that you can always go
back to work for Goldman Sachs or whatever company
pays you.
Kirill Eremenko: Yeah. Yeah, so you kind of have to burn the bridge to
force yourself. Let's just say if you want to take the
island, burn the boats, right, something like that, so
there's no way back.
Raul Popa: Yeah, definitely. Yeah. Yeah. Some of the challenges as
data scientists starting their company were around
creating production-level software, so basically, you
start with the research, you have models. You have
stuff like that. You have to turn that ... We turned that
into an API that is completely scalable, can do millions
of transactions in a day. To do that is not complicated,
but it's not easy science. It's a different type of science
than data science. It takes a bit of ambition to learn it.
Raul Popa: Also, for what I did, I didn't start with a lot of data that
I got from university or something like that. Actually,
so gathering data, lots of data was really, really hard.
If you're a startup or you're building a startup, it's
almost impossible to do it with small data. For
example, I started with 200 friends. I sent them a link
and asked them to type some text and do a very
quick survey. After that, I went to my Mensa group,
the high IQ society?
Kirill Eremenko: Men-
Raul Popa: That high IQ society called Mensa.
Kirill Eremenko: No, I haven't heard of them.
Raul Popa: Yeah. I am a member of that.
Kirill Eremenko: Okay.
Raul Popa: I asked them to help. I told them that I can build a
classifier based on people's typing patterns that will
try to differentiate between regular people and high IQ
ones. They loved it. I got about 400 people to take the
test and complete the survey with things like age,
gender, personality profile, stuff like that.
Kirill Eremenko: Yup.
Raul Popa: After that, I used the data to build a fun test. It wasn't
enough data to build authentication systems but like
400 people. I built a fun test titled Find Out What Your
Typing Says About You. I put it out there. In two days
or so, it went viral. People started sharing it with their
friends and forums and personality forums, and it got
on Reddit at some point. One morning, I woke up with
about 20,000 samples in my database. The server was
down. I realized that "Hey, I have more data than
needed to start my research," but to get there, I had to
trick people to help me.
Kirill Eremenko: Well, trick, I wouldn't say trick. You just encouraged
them in different ways. You gave them back something
that they-
Raul Popa: Yeah, actually, I created some algorithms trying to
classify things like gender or age or IQ based on how
you type.
Kirill Eremenko: Yeah.
Raul Popa: There are some similar characteristics between people
that shared the same attributes like age, for example,
and you can see them typing in a different way. We got
60, 70% accuracy, and so not a lot, but for a fun test,
was really fun. People were really intrigued. A lot of
people like that, they didn't get the right MBTI profile
name or they got a different gender. They were
questioning their gender now. I mean, it's just a fun
thing.
Kirill Eremenko: Nice. Yeah. Got you. You linked it up to the
Myers-Briggs personality test, right?
Raul Popa: Yeah.
Kirill Eremenko: Yeah, that's smart. That's pretty cool. Basically, what
you're saying is that when somebody is moving from
being a data scientist and having an idea, maybe even
doing some preliminary research on that idea, seeing
that they can break into a field and revolutionize
something, and moving from there to actually building
a company, there's a lot of challenges along the way
from productization to gathering data, and probably
lots more other challenges. Maybe we'll talk about a
couple more.
Kirill Eremenko: You need to be prepared for them, and you need to
also think outside the box this whole [inaudible
00:36:01]. In this case, with the data situation, like the
fun test. I think that's a genius idea and will get you
data because data is value, right? Data is valuable.
You can go scrape the web for data. You can go buy
data. You can go do a fun test like that to get the data,
but you need to consider all of these things before you
dive into starting a data science business, right?
Raul Popa: Yeah, totally. Yeah. I think being able to think
creatively about how you gather your first data or to
think in steps, first, you do this, then you do that, but
then you have real data to do your research. I think
it's really important.
Kirill Eremenko: Got you.
Raul Popa: You can partner with some other companies or
universities. Sometimes, you can do that. Other times,
really, really hard.
Kirill Eremenko: Okay. What are some of the other challenges that you
faced when starting your data science or typing
biometrics business?
Raul Popa: Well, there are business problems, stuff like you need
more people or you need people to help you with
marketing and research, like research on the
marketing side, market research, sales. You have to
meet investors, extremely hard. You need investors.
They want to see someone who knows everything,
who's able to sell, who's able to do research, who's
able to manage people and everything like that. I'm not
saying I'm not that person. I'm saying nobody is that
person. It's really hard to be a perfect dude to do a
machine learning algorithm to a business, basically.
Kirill Eremenko: Got you. Okay. Okay.
Raul Popa: You have to read a lot of things about that. The internet
has plenty of links.
Kirill Eremenko: Yeah. Yeah. I can totally attest to that, that you ...
Running a startup is not just about an idea. You need
to also have a business mind or start learning to have
a business mind, and that's completely different, or
maybe you can partner up with somebody right? You
can stay the data scientist, and somebody else can be
your sales director or your business development
director, something like that, chief operations officer,
somebody who's going to help you grow the business.
That's also another thing to consider.
Raul Popa: That's what I actually did. I asked two friends of mine
that were helping me from the side to join a project,
Christian and Adrian. One of the things that we did is
apply to accelerators. We got to Techstars in New York
City last year, and that really helped us with investors,
and helping cover the gaps that we had at that point,
really, really valuable for us. Also, we're from Romania.
We started in Romania, Eastern Europe, really hard to
build something from that side of the world and get
global exposure, so we had to move to the US to do that.
Kirill Eremenko: Yeah. Yeah. Got you. Now, you're half-half, US,
Romania, right?
Raul Popa: Yeah. Yeah. Yeah. I'm half-half. Right.
Kirill Eremenko: The team as well, like you got some people in the US,
some people in Romania.
Raul Popa: Yeah, but our R&D is still in Romania for different
reasons. I really think that Romania has a lot of talent.
Kirill Eremenko: Yeah, definitely, Romania has a lot of talent. I wanted
to ask you though, how do you find combining your
role as a data scientist and as a CEO because,
previously, from our discussion, I think everybody got
the gist that you are actually very involved through the
algorithm and you do quite a lot of research. You're
very up to date with the technological part of things.
That requires a lot of time from how I can imagine it.
At the same time, running a business requires a lot of
time as well, meeting clients, doing marketing, making
business decisions, scaling, growing, things like that.
How do you find combining those two, and plus, as
well, I know you're a father, a husband. How do you
combine all those things? When do you find the time?
Raul Popa: I don't know. I don't have a prepared answer for that. I
think that the key is that I really like the data science
part. It's like a hobby for me. The CEO part or the
founder part makes sense because I really think typing
biometrics can make a difference. I think in the future
people will type more than ever. Today, we're
communicating through voice or typing. I think if you
look at young people, like my 11-year-old daughter that you
mentioned, every time I talk to her, even if we're in the
same room, she would WhatsApp me, right?
Kirill Eremenko: Yeah. In the same room, she would WhatsApp you.
Raul Popa: Yeah, so it's like asynchronous type of communication
in which you type whenever you want. The other
person replies whenever they want. I think this is the
future of communication. If we look at young people,
we know that. It's no-brainer. I had to do this because
I realized that we will type more than we used to, and
with devices and with other people. I think having a
layer of security based on typing is really a key to a
better world in a way.
Raul Popa: I have sort of motivation to be a CEO and founder, and
I have love for data science, so it's like a hobby or
something that I really like to do. I have to find time
for everything, of course, family as well. For sure, of
course, you can work full days on whatever you need
to, but eventually, you have to find time for family and
friends. I think that's really important.
Kirill Eremenko: Okay. Got you. Just to clarify because I only realized
this now when you're talking about WhatsApp and
typing on your phone, so TypingDNA and, in general,
typing biometrics is not only designed for keyboards
and computers. It can also work on your phone. Am I
getting that right?
Raul Popa: Definitely.
Kirill Eremenko: Wow.
Raul Popa: Yeah, it's a different thing. It's a different thing. It's not
the same thing, but we're not focused only on desktops.
We also have algorithms for mobile phones. On mobile
phones, we think we're even better at some points.
Mobile phones, you move them a lot. You have
pressure and you have a lot of things that you
can look at when people type. Yeah, we're quite good in
mobile phones, and we have a few really important
projects on that.
Kirill Eremenko: You can also measure not just the typing speed and
how, in some phones, how much force people applied
to press the buttons, but also how the people are
holding the phone, orientation in space, as you said,
how they're moving their phone as they're typing,
those types of things.
Raul Popa: Yeah. Yeah. Yeah. Yeah, so we can do those as well. As
I said, people are using, whether it's mobile phones or
desktops, it differs a lot by use case. I think in
enterprise, if you look at what people use to perform
their tasks, it is probably always going to be their
computer. You'll rarely see people using mobile phones
or tablets in their office for office business. It's like for
personal rather than business. On the other side, you
have personal communication and entertainment that,
typically, you use mobile phones for that. We have to
understand both of these.
Raul Popa: When we're talking about banks or financial apps,
people are using both mobile and desktops. A lot of
focus on mobile lately, and so people are checking their
bank accounts over mobile phones. They're typing in
some pins or some other sensitive information that we
can look at how they type this and use that for
authentication and flagging suspicious users and so
forth.
Kirill Eremenko: Got you. One of the fears that entrepreneurs or
entrepreneurs-to-be have, so data scientists that may
have even come up with an idea, really, genius idea
that can revolutionize the whole industry, one of the
fears that stops them from starting a business is the
fear that as soon as it starts, and if it is proving to be
working, a large company can come in with lots of
R&D, lots of funds, budget, presence, huge user base.
Kirill Eremenko: We're talking the likes of Google, Facebook, LinkedIn,
Microsoft can come in and just copy the idea, not
necessarily like create a solution for the same problem.
You've identified the problem, but somebody else can
come in and do the same thing. How did you feel about
that when starting your business? Did that hold you
back or not at all? How has that played out? Do you
have any major competitors at the moment?
Raul Popa: What you're saying is a real thing. It happens every
time, right? Think about this, this is not a bad
problem to have. It's actually a very good problem to
have. Google coming in the game, just saying, just
dropping a name, right, means validation for our
technology, means that this is mainstream now,
means that anyone who started working on the
technology like three or four or five years ago has a leg
up, right? It means that other competitors of Google
will want to buy a company, or investors will say, "Hey,
we want to fund this company because this thing is
mainstream, and they have a leg up."
Raul Popa: I think, definitely, a good problem to have. Clients will
want to use the technology that we're building rather
than other alternatives that are not typing biometrics.
Of course, some of that competition will go to Google in
that case, probably most of it, but when a big player
like that comes into a new space, I think they create a
large ocean that didn't exist before that you can
benefit as well and everyone in the space will benefit at
some point [crosstalk 00:47:18].
Kirill Eremenko: Got you. Got you. Basically, you're saying it's better to
have, I don't know, 1% of a huge pie than 100% of a
tiny pie.
Raul Popa: Yeah, I know it sounds less. 1% always sounds less
than 100%. I think what you said is true.
Kirill Eremenko: Yeah. Imagine if you have 100,000 users and you
have 100%, or you have 1% of 7 billion users. Better the 7
billion.
Raul Popa: Yeah. Yeah.
Kirill Eremenko: Got you. Got you. Okay, very, very cool. How is this
industry right now? We didn't talk about this. How
long have you been working on TypingDNA and has
this industry or this technology matured from your
perspective? What's the competitive space right now?
Raul Popa: Yeah, I've been working since 2014 on this technology as a
side project. Actually, I researched for two years. I
actually started working in 2016, but I already
connected most of the dots, and I knew that I can do
it, sort of. I had the conviction that I can pull this off.
Regarding the competitive space, a lot of people tried
to create algorithms regarding typing patterns. There
are patents that are almost 30 years old in the space.
I mean, now, they are in the public space, right?
Kirill Eremenko: Yeah.
Raul Popa: People try all sorts of things, but without machine
learning, statistical models are not really accurate.
You can use them, but before I started, the most
valuable technologies that were built around this thing
needed you to type for a day or two for the technology
to be able to recognize you with a 70-something
percent, 80% accuracy. We can do zero false positives
with three samples of you typing in your email and
password, for example, I think, for your credit card.
That's like rare, right? We don't want to replace
passwords. We just want to have a second layer of
security that can go with anything that you type.
Raul Popa: Also, we have an algorithm that works when somebody
types anything. You can type in the chat window and
just chat with somebody else, and we'll look at what
you're typing statistically and we can say ... I mean, we
create a statistical profile of your typing. We don't
record exactly what you're typing, only how you type
letter A, letter B, stuff like that, right, the averages,
standard deviations, that kind of thing, and we build a
profile, and whenever you're typing again, we can do
matching and that's the machine learning part.
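The profile-and-match idea described here can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not TypingDNA's actual algorithm: the function names, the choice of key hold time in milliseconds as the only feature, the z-score distance, and the sample data are all hypothetical. The interview says the real matching step uses machine learning; a plain distance is swapped in here only to keep the example short.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_profile(events):
    """Build a per-key timing profile from (key, hold_ms) pairs.

    Only averages and standard deviations are kept, so the typed text
    itself cannot be read back from the profile.
    """
    by_key = defaultdict(list)
    for key, hold_ms in events:
        by_key[key].append(hold_ms)
    # Keep only keys seen at least twice so a deviation can be computed.
    return {k: (mean(v), pstdev(v)) for k, v in by_key.items() if len(v) >= 2}


def match_score(profile, events, min_std=5.0):
    """Average absolute z-score of new hold times against the profile.

    Lower means more similar. Returns None when no typed key is in the
    profile. min_std is an assumed floor that avoids division by zero.
    """
    zs = []
    for key, hold_ms in events:
        if key in profile:
            mu, sigma = profile[key]
            zs.append(abs(hold_ms - mu) / max(sigma, min_std))
    return sum(zs) / len(zs) if zs else None


# Hypothetical enrollment data: how long each key was held down.
enrolled = [("a", 90), ("a", 100), ("a", 110), ("b", 70), ("b", 80)]
profile = build_profile(enrolled)

same = match_score(profile, [("a", 98), ("b", 76)])     # similar rhythm
other = match_score(profile, [("a", 160), ("b", 140)])  # very different rhythm
print(same < other)
```

Because the profile is keyed by letter rather than by position in a sentence, the same comparison works on any text that shares letters with the enrollment samples, which is the property described for the free-text case.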
Raul Popa: We can do matching and we can say whether it is the
same person or not on any text when you type
completely different things. The downside of that is
that you have to type about a tweet's length, but imagine
you go to a country where you don't live. It's the first
time you get to Vietnam and somebody steals your
computer, your mobile phone, everything. You try to
get online.
Kirill Eremenko: As it happens, yeah.
Raul Popa: Yeah, and you try to get online to connect to your
people, to your friends, ask for some money to be able
to go back to your home, whatever hotel, wherever you
are, you will need a new computer or you will try to log
in to Gmail, let's say, from a new computer, a new
location without your phone, without your YubiKey,
whatever you have. Your password, you don't
remember it because it was in your 1Password,
whatever, LastPass, whatever you're using for
password management.
Raul Popa: All of a sudden, you're in front of a computer unable to
log into your account or to talk with your people, your
friends and ask them for help. That's really a
situation where, once you are in it, you will just have no
escape. One thing that Google can do, for example, or
anyone, any big application like that, Facebook,
anyone like that, they could ask you to type a sample,
a sentence, whatever.
Raul Popa: They can use that along with, let's say, a question that
they can ask, stuff like, "When did you log in last
time? From where?" They can combine pseudo-
information with typing biometrics and other things
like that and be able to reset your account or enter
your account like that. That's really-
Kirill Eremenko: Yeah. Yeah. Yeah. Yeah, fantastic, definitely a great ...
It's actually bringing a lot of convenience, a lot of, even
in this case, certainty to the world that if you go
traveling and you lose all your things, you can still get
in touch and log into your accounts because I think
that's, for me anyway, that's always a concern, when
I'm in third-world countries and so, exactly what do
you do in that situation?
Kirill Eremenko: Moreover, as we discussed before, typing biometrics
brings a whole new user experience where instead of
waiting for that SMS or doing those CAPTCHAs, where
they're showing you images of a bus, and I find those
so time-consuming when you have a grid of nine
images or an image broken down into nine parts, and
you have to point out where there's a bus, or where
there's a car to authenticate yourself or to show that
you're not a robot. I think that could be also removed
with typing biometrics, and definitely a whole new user
experience that can be introduced. That's a very cool
example.
Kirill Eremenko: I wanted to ask you another question. What is the
future, what future do you see for TypingDNA? I think
it's a very cool thing to think about and always wonder
about. Do you see an IPO some time in the future? I
know it's probably very early stages right now. Do you
see the company growing and, I don't know, building
teams around the world, getting clients in lots of
different countries? What is your vision for your
business right now?
Raul Popa: Well, I'm not thinking of an IPO at this point. We're
really small. We're a startup. We raised our seed
round two months ago. I believe that the tech can
become mainstream and can save a lot of situations,
can be used to protect all sorts of accounts and
businesses and money and assets. We have a lot of
crypto exchanges, crypto wallets, crypto projects that
want to use our technology for making people safer
when they do transactions like that.
Raul Popa: We are happy that we can play a role in this entire
scheme, and we can make authentication and
communication easier, so you don't have to go through
friction in order to authenticate or to send some
important messages or to deal with private data. On
how big the company can get, I think it really
depends on whether the technology gets adopted or
not on the wider scale. We believe there are clear use
cases on which the company can grow to multi-billion-
dollar size, but it all depends on how the audience and
how the market will receive that. We do have good
science though.
Kirill Eremenko: Got you. Got you. Also, well, hopefully, it all goes really
well, and this will be the first interview that you did
just before your company becoming a billion-dollar
company. Before we finish up, I wanted to ask you,
from everything that you've gone through because I
think your journey is very exciting from data scientist
to research to founder and CEO to growing the
business and disrupting a whole new space.
Kirill Eremenko: What would you say has been your one biggest
learning, something that you can share with our
listeners out there who are data scientists who might
be considering starting a business, who might be
considering staying in their current positions or
progressing with their careers in data science?
Nonetheless, what is one learning, something that
you've found very useful for yourself in your life that
you can share with them?
Raul Popa: I think data scientists are typically very intelligent
people, and I think, typically, intelligent people have
this problem of overthinking. By far, this is the
biggest problem that, collectively, data scientists have,
at least all those I know. Also, on the other side, entrepreneurs
are rarely overthinkers. Usually, they are more
opportunistic, kind of optimists, that think that things
will somehow solve by themselves in the future.
Raul Popa: This is why they start companies, they do a lot of
things like that, so making the switch between those
two is really, really hard. The key is to, first, you have
to be less anxious, to overthink less. I think, to do
that, you have to be more relaxed. I have a mentor,
Kevin O'Brien from GreatHorn. Actually, this is his
advice. Whenever you're entering the ring to box with
an opponent, you can be that stiff, kind of rigid
opponent that you would fear at first, right, that you
see that there's a lot of fear on his face as well, or you
can be that relaxed guy who just enters the ring by
knowing things will work out.
Raul Popa: By being laid back, you will be able to spot the other
person's mistakes because people make mistakes all the
time. If you relax, you can do that. You can use those
mistakes to get an advantage. I think that's really
important. Things that you can learn when you want
to turn into an entrepreneur from a data scientist,
relax. That's really important. Become curious about
things that are not really as important as you think
they should be, do more things for fun, get out of your
comfort zone, stuff like that. I think those are really
important things.
Kirill Eremenko: How do you relax?
Raul Popa: Yeah, as I told you, I am a curious person, so I like to
learn things about all sorts of things. Sometimes,
when I'm really stressed about something, I find a
topic on which I want to learn more. I always want to
learn more about something. Trying to do that, you
will focus on something new. Entering that learning
mode, really, really valuable. Two, three, four hours,
learning about something other than your
normal life, you will just find yourself in a comfortable
position where you don't see the risk anymore, you
don't see the pressure anymore, and you can think
clearly, really important.
Kirill Eremenko: Okay. Got you. Got you. Raul, that's fantastic advice.
I'm going to take that on board as well. I sometimes
overthink a lot of things as well. Yeah, good, good
advice. On that note, we've slowly approached the end
of this amazing podcast. Raul, thank you so much.
Before I let you go, can you tell us a bit about where
our listeners can find you and connect with you, maybe
for a career? Maybe some business owners want to get in
touch about trying out TypingDNA, or maybe some
data scientists who are looking for jobs might be
interested in this space and in the job
opportunities that you might have. What are some of
the best places to find you?
Raul Popa: Typingdna.com, we have a lot of demos there, a lot of
information. Our recorders for typing patterns are
actually open sourced, so if you have another project
in typing biometrics, you want to use these for, I don't
know, detecting how people type, you know, use that,
you can contact me on [email protected] or R-A-U-
[email protected]. That's pretty much it.
Kirill Eremenko: Awesome. Is LinkedIn okay as well?
Raul Popa: Yeah, LinkedIn is fine, Raul Popa at LinkedIn, sure.
Kirill Eremenko: Got you. Okay. Awesome, and guys, make sure to get
in touch. One final question before we finish up today.
What's a book that you can recommend to our
listeners to help them in their careers and life
journeys?
Raul Popa: I don't recommend data science books, typically. I
recommend online courses, but one book I recommend
for everyone doing data science is Black Swan from
Nassim Taleb. Also, the later books from this author, like
Antifragile and Skin in the Game touch the essence of
the rare asymmetries that we find in the real world
and how these may lead to positive outcome. I think
these are extremely satisfying books for me as a data
scientist.
Kirill Eremenko: Got you. Awesome. That's Black Swan, Antifragile,
Skin in the Game.
Raul Popa: Yeah.
Kirill Eremenko: Great, great advice, and on that note, thanks so
much, Raul, for coming on the show, very, very
exciting podcast. I'm sure our listeners enjoyed it as
well, and best of luck with TypingDNA. I hope you
change the world.
Raul Popa: Thank you very much. Thank you for having me.
Kirill Eremenko: There you have it. That was Raul Popa, CEO and Data
Scientist at TypingDNA, what a discussion. I hope you
enjoyed this conversation as much as I did. We went
into, first of all, the world of typing biometrics. How
crazy is that? It blows my mind. Just by typing in my
login and password a couple of times, I can then be
identified in the future as myself, and that is crazy, as
you can imagine that, or from what we discussed in
the podcast, there are plenty of applications that can
make our lives easier and safer in many, many ways.
Kirill Eremenko: Also, it's very cool to learn both the data science
aspect of things and the different approaches,
techniques, algorithms that are used in the space as
well as setting up a startup around a data science
idea. If you have a data science idea, maybe now, you
have some better ideas of what is coming up for you,
what to expect. My personal favorite part of this
podcast, something that stood out to me the most was
the creative ways of collecting data, as Raul
mentioned, when they needed those patterns, how
they created that fun tool for people to use to learn a
bit about themselves, their personality type, and
things like that based on their typing pattern. That
allowed them to collect data. I think that's a very out-
of-the-box type of thinking, type of idea.
Kirill Eremenko: On that note, make sure to connect with Raul and you
can find the URL to his LinkedIn as well as all other
materials that we mentioned on this episode at
www.superdatascience.com/251. That is
superdatascience.com/251. Make sure to hit up Raul
on LinkedIn. If you know anybody who is looking to
create a startup in data science, who has a really cool
idea that can be empowered with machine learning or
data science, or any other data-driven technologies,
then make sure to send them this episode, and maybe
they can cut through their learning curve and get
some really cool ideas from here. On that note, thanks
so much for being here today. I look forward to seeing
you back here next time. Until then, happy analyzing.