SDS PODCAST
EPISODE 321:
THE LIFE OF ONE
ADVANCED DATA
SCIENTIST
Kirill Eremenko: This is episode number 321 with advanced data
scientist, Morgan Mendis.
Kirill Eremenko: Welcome to the SuperDataScience podcast. My name
is Kirill Eremenko, Data Science Coach and Lifestyle
Entrepreneur. And each week we bring you inspiring
people and ideas to help you build your successful
career in data science. Thanks for being here today
and now let's make the complex simple.
Kirill Eremenko: This episode is brought to you by DataScienceGO
2020, our very own data science conference. We've
already done three events in the past three years and
we're moving into our fourth year in 2020. And to give
you a feel for what to expect, here are some stats from
DSGO 2019. We had 620 attendees fly in from 25
different countries. 38 speakers gave talks, 150 plus
business decision makers attended the sessions as
well and get this, 2400 cups of coffee were drunk
during the networking sessions.
Kirill Eremenko: So DataScienceGO is not just a place where you will
get all the top data science skills that you need for
your career. That's definitely a huge component of the
conference, but also it's a great place where the
community comes together to network. At
DataScienceGO, you will meet data scientists and
professionals from companies like Accenture, AIG,
Wells Fargo, MasterCard, Facebook, Google, IBM,
Microsoft, Salesforce, Teradata, Amazon, eBay,
Shopify, and many, many more.
Kirill Eremenko: So this is a great opportunity to meet and network
with your colleagues, to meet and start catching up
with your mentor or maybe to even meet the manager
at the next company that you'll be working for. At
DataScienceGO 2020, we've been almost doubling
every single year. So we're expecting about a thousand
attendees at this next event.
Kirill Eremenko: DataScienceGO is happening on the weekend of the
6th, 7th and 8th of November, 2020 and you can
already secure your tickets today at
datasciencego.com. And one more thing is that we
actually have different tracks. So we found that this is
a very important component for attendees and we have
tracks tailored to your experience. So if you're a
beginner, there's a beginner track which will help you
get the skills to break into data science. If you're an
intermediate practitioner, there's an intermediate track
for you to progress to advanced. And if you're already
advanced, there's an exclusive advanced track just for
you.
Kirill Eremenko: So whatever your level, you can find the right track,
the right talks, the right workshops, the right sessions
and case studies and panels at DataScienceGO. So on
that note, this is the best conference for you to attend
to skyrocket your data science career. So make sure to
secure your ticket at datasciencego.com today. And I
can't wait to meet you in person in California in
November, 2020.
Kirill Eremenko: Welcome back to the SuperDataScience podcast ladies
and gentlemen, super pumped to have you back here
on the show. Our today's guest Morgan Mendis is one
of the most advanced data scientists I have ever had
the privilege of meeting in person. Morgan and I met at
DataScienceGO 2019 a couple of months ago. And
since then his life has taken on so many interesting
twists and turns. You will be so excited to hear what's
been going on in his life. In this podcast you will learn
the story of how we met and why we got to chatting at
DataScienceGO in the first place. You'll also hear
about a VP of data science, a vice president of data
science role that Morgan is helping fill in Washington
DC. So if you're an advanced data scientist somewhere
in that area or if you're looking to relocate to that area, this is
going to be super exciting for you. Roles like that don't
just lie around on the ground. They're quite hard to
come by and this is an opportunity of a lifetime.
Kirill Eremenko: So listen up, it's going to be really, really exciting. It's at
the very start of this episode, you'll hear about that
role. And even if you're not looking to get into a VP of
data science position, maybe it's a bit too early for
you, maybe it's something you're aspiring towards
in the future, it will be very interesting to hear what
kind of requirements there are for a role like that and
what is the goal of a role of a vice president of data
science. You will also hear about why Morgan decided
to turn down a very exciting opportunity in his career
and in order to follow his dreams, pursue his dreams
and passions and move to Haiti and what he's doing
there, the very noble and admirable cause that he's
helping with his data science skills called Ayiti
Analytics. You'll hear all about that and how you can
get involved if you're also excited about helping others
learn data science. So a very, very wonderful thing that
Morgan and the team at Ayiti Analytics are doing and I
was very inspired to share this story.
Kirill Eremenko: And of course we also went through Morgan's
background and all of the great takeaways that he's
learned along his way to becoming an advanced data
scientist. So you'll learn about Excel and how for some
applications he still uses Excel and why it's important
to know which tool to use where, how you can
automate Excel with R and you'll get some very
valuable tips there, especially on how you can save
time to apply to do more exciting things in data
science. You'll learn about how Morgan mastered Python
and why and when he uses R, when he uses Python,
when he uses both. How he combined his data science
skills with his econometric skills and what that led to.
You'll also learn a lot about the ETL process in data
science, how to maintain models, why it's important.
And also Morgan went into quite a lot of depth on the
Airflow tool, a very cool tool for extract, transform, load
procedures which you can already use in your career.
Kirill Eremenko: So if you've never heard of Airflow before, this is a
great opportunity for you to get up to speed with this
tool and see if it's right for you. Those are just some
examples of what you will hear on this podcast. I'm
very, very pumped about the conversation that we just
had and so let's not put it off. Without any further ado,
I bring to you advanced data scientist Morgan Mendis.
Kirill Eremenko: Welcome back to the SuperDataScience podcast ladies
and gentlemen, super excited to have you back here on
the show because today we have a very interesting
guest joining us. Calling in from Haiti, Morgan Mendis.
How are we going, Morgan? How's everything going for
you there?
Morgan Mendis: Good. How's everything with you Kirill?
Kirill Eremenko: Amazing. Everything is good and so cool that you're
calling in from Haiti like how do you pronounce it
again? You just told me I already forgot. How do you
pronounce the name of the island?
Morgan Mendis: Oh, so it's Haiti in English, but people might know it
also in Creole as Ayiti.
Kirill Eremenko: Mm-hmm (affirmative). Ayiti.
Morgan Mendis: So it means land of many mountains.
Kirill Eremenko: The land of many what?
Morgan Mendis: Mountains.
Kirill Eremenko: Mountains. Okay. Well I've never been to Haiti. I would
love to go one day. And you told me your origin... Your
mother is from Haiti, right?
Morgan Mendis: Yes, my mother is Haitian and my father is from
Malaysia.
Kirill Eremenko: Okay, fantastic. Very cool. And it's very interesting.
You've only gone back there like, would you say 12
days ago?
Morgan Mendis: Yeah, so I just moved here about like 10 days ago to
start a new job down here.
Kirill Eremenko: That's crazy. Like so much has been going on in your
life. So let's rewind back a bit. So we met at
DataScienceGO 2019 in San Diego. That was what,
one and a half months ago. Right? And since then so
much has happened. So first things first, how did you
get to DataScienceGO? What were you doing there?
Because I thought you were actually, you live in
California.
Morgan Mendis: No. So what originally happened was about in 2018 I
saw on LinkedIn a bunch of people in my network in
data science had posted about DataScienceGO being a
really awesome conference and that they were actually
able to connect with other data science practitioners
rather than with like industry and more corporate
sponsors. So I remember during that week I went
ahead and got the early bird special and was like,
"Yeah, I want to go to this conference in California."
And coincidentally through my work previously, I also
got accepted to be a presenter at another conference in
the San Francisco area. So I went out there and then I
told my employers at the time, I was like, "Hey, I'm
already going to be out here in California. Would you
guys mind supporting me going down to San Diego
while I'm out there to go check out this data science
conference I've already paid for?" And they said, "Yeah,
sure, go down there, check it out. And also if you can
make a plug for our new opportunity for head of data
science at the team." So that's how I ended up at
DataScienceGO.
Kirill Eremenko: Oh yeah. Yeah, okay. And then that's when we talked
and you mentioned that you're hiring for a VP of data
science. Did you manage to find anybody at
DataScienceGO?
Morgan Mendis: No. That was actually the reason I remember I came
up to you at the last day of the conference because I
was like, "Man, I've met a lot of interesting people, a lot
of students." And a lot of them had questions for me
and, I was really excited to give back, but I was also a
little frustrated because I was looking for somebody at
a higher level who could help support the team,
especially because I was interested in taking this
opportunity in Haiti.
Kirill Eremenko: Yeah.
Morgan Mendis: So that's when you invited me on the show. So I could
hopefully let it be known that they're doing really cool
and exciting things at Inspire and to meet somebody
that would be awesome.
Kirill Eremenko: Yeah, absolutely. And I wanted to say that I really
appreciated your feedback there and we actually very
actively took that on board and considered it. And I was
like, "Okay, why couldn't Morgan find a VP of data
science at the event? How can we fix whatever that is
indicating?" And we've actually worked a lot on the
event and at the next events at DataScienceGO
starting from like the next immediate one, we're
actually going to have a separate track for very
advanced practitioners like yourself. Like we already
had some advanced talks this time and I think you
mentioned you enjoyed like the Salesforce talk by the
head of data science at Salesforce, is that right?
Morgan Mendis: Yes, yes. She was talking about model comparison and
kind of the infrastructure and it was really great
hearing her perspective and the future of Salesforce
and data science there.
Kirill Eremenko: Yeah, so we already have a few talks like that, but next
year we're actually going to have a separate dedicated
track for advanced practitioners only. It will be very
exclusive limited seating to get as many talented
advanced practitioners like yourself in the conference
so that you can network with each other and
also that you can give back to the rest of the data
science community. So we definitely took the feedback on
board. Thank you so much for providing that and
coming up to me and indeed for our listeners out
there, one of the reasons why I'm excited to have
Morgan on the podcast today is because this is so rare
where a company is hiring for a VP of data science.
Kirill Eremenko: So if you're an advanced practitioner or you've been in
data science for three, four, five, six years or so or
more and you really believe you can lead not just
several data scientists into their professional career
growth, but actually lead an organization in the space
of data science. Morgan's got a great opportunity for
you. This is actually... So as I understand, you were in
this position yourself and then you decided to leave to
go to Haiti to pursue your passion and dream in this
other company and now this position has freed up and
you're looking to help your previous company fill this
role. Is that right?
Morgan Mendis: Not, not exactly. So the opportunity arose early, right
about the same time I went to DataScienceGO, my
previous manager had left and they were interested in
me stepping up and taking on more responsibility. But
at the time I was like, I'm really passionate about
pursuing my dreams and that means I have to go to
Haiti. So I wanted to support them as much as
possible in finding this new person to fill in the role.
And I thought that it was really important that the
person have data science experience from the get go.
Not potentially somebody who's making a
horizontal move from software engineering or like a
director of business analytics, but more focused
specifically on data science that has a vision, a
strategy for the tools, but also how can we incorporate
design and human elements into, like, healthcare
data science.
Kirill Eremenko: Well, on that note, why don't you tell us a bit about
your company that you've just left and where this
opportunity exists and if anybody's interested, how
they can get in touch about this role?
Morgan Mendis: Sure. So Inspire is one of the largest online networks
for patients and caregivers to share information about
medical conditions. So Inspire is connecting people so
they can share information to help them really
understand the condition, but also find emotional
support through others. So through the position of
data science, we're trying to find different ways that we
can surface relevant content and information to people
so that they can find others who are like themselves
and who are going through these really complex
medical conditions and potentially aren't able to get
that information directly from the doctors. So a huge
aspect of the work I was doing was trying to synthesize
information and all of this health data that they might
be getting in terms of medical jargon and translate
them to information that they can use to better plan
and understand their condition to really honestly
choose to live their lives in the best way that they can
based on their own values. So not the values that
potentially academia is prescribing them, but based on
giving them access to information so that they can
make their own decisions.
Kirill Eremenko: Okay. Wow, that's very noble. Would you be able to
provide an example of how you're using the data to
help patients who are fighting a given specific condition,
something like just to put it into like a tangible output
somebody would get?
Morgan Mendis: Okay, sure. So we have a product at Inspire called...
So we're developing, it's called Health Profiles.
Kirill Eremenko: Mm-hmm (affirmative).
Morgan Mendis: Which actually allows you to, you answer a few
questions about yourself and then you're able to see
relatively in the community for example, what drugs
are you taking for your lung cancer. You will answer
these questions and then you could see other people in
the community who are taking similar drugs and
based on the privacy settings, because health
information is, security and privacy are really
important to us at Inspire. I'm saying Inspire as we,
but they're really important. So you want to protect
the patient and give them full control over their
information. But we still want them to be able to find
other people who they might be able to relate with and
connect with on a personal level.
Morgan Mendis: So we want to use the data to help them connect. So
they're able to answer these questions and they're able
to find other people who've answered similarly and
then reach out to those people if they've said, "Hey,
you can share my information with others in the
community." So that they can then ask them follow up
questions about, "Hey, I've been getting this side effect
from taking this drug. Have you experienced this as
well?" Right. Or they might say, "Hey, I've had a really
tough time adhering to my medication treatment
schedule, what are some tips that you do in order to
stay on top of your medication schedule?"
Kirill Eremenko: Okay. Okay. Got you. That's a very, I would imagine
helpful service to people out there who are going
through these tough challenges. So hats off to you and
to Inspire for doing this. This is a very cool
undertaking. And tell us a bit about the role. So this
VP of data science, how big would the team be?
Morgan Mendis: So the team is expected to grow. When I left there was
about five members on the team and from the
indication of the CEO, they really expect that data
science is going to become the bread and butter of the
company. So I wouldn't be surprised if the data
science team were to grow to being something like 15
plus. Again, I can't speak too much about it because
I'm not currently at the organization. However, they
see the strategic importance of data science and they
really want to find new ways of leveraging the data
that they currently have and potentially looking to use
existing open data to augment the data that they have
as well. So for example, there's a Blue Button that CMS,
or the Centers for Medicare & Medicaid Services, as
well as Veterans Affairs have opened up via FHIR
protocols.
Morgan Mendis: So this is also the same protocol that Apple HealthKit
is using. So currently if you're, let's say you're a
veteran, you might be able to go in and get all of your
medical information from Veterans Affairs, right? But
this data's coming out to you as a text blob. How do
you visualize? How do you use that information?
That's a challenge currently for data scientists in the
healthcare space, is that, we need to take this
information and we need to be able to analyze and
present it back to people in a way that then they can
make their own decisions about their health and how
they pursue medical services. So it's a huge political
debate in the U.S. right now about how our medical
system and how we pay for medical services. But one
thing that's not debated by anybody is that patients
should be in control of their medical information and
they should be in control of their medical care.
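[Illustration, not from the episode. The protocol Morgan refers to is FHIR, the HL7 standard in which records are exchanged as JSON "Bundle" resources. The sketch below shows one possible first step in turning such a text blob into something a person can read: counting the kinds of records in it and listing medication names. The sample bundle is invented for illustration and far smaller than a real export; the function name summarize_bundle is hypothetical, and real feeds vary in which fields they populate.]

```python
import json
from collections import Counter

# Invented, minimal FHIR-style Bundle. Real exports carry many more fields.
SAMPLE_BUNDLE = """
{
  "resourceType": "Bundle",
  "entry": [
    {"resource": {"resourceType": "Condition",
                  "code": {"text": "Example condition"}}},
    {"resource": {"resourceType": "MedicationRequest",
                  "medicationCodeableConcept": {"text": "Example drug A"}}},
    {"resource": {"resourceType": "MedicationRequest",
                  "medicationCodeableConcept": {"text": "Example drug B"}}}
  ]
}
"""


def summarize_bundle(raw_text):
    """Return (counts per resource type, list of medication names).

    Assumes the blob is a JSON Bundle whose entries each wrap one
    resource; entries without a resource are skipped.
    """
    bundle = json.loads(raw_text)
    resources = [
        entry["resource"]
        for entry in bundle.get("entry", [])
        if "resource" in entry
    ]
    counts = Counter(r.get("resourceType", "Unknown") for r in resources)

    # Collect human-readable medication names where the record has one.
    medications = [
        r["medicationCodeableConcept"]["text"]
        for r in resources
        if r.get("resourceType") == "MedicationRequest"
        and "text" in r.get("medicationCodeableConcept", {})
    ]
    return counts, medications


counts, medications = summarize_bundle(SAMPLE_BUNDLE)
for resource_type, n in sorted(counts.items()):
    print(f"{resource_type}: {n}")
print("Medications:", ", ".join(medications))
```

[A summary like this is only a starting point; the presentation layer Morgan describes, where patients can act on the information, would sit on top of it.]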
Kirill Eremenko: Mm-hmm (affirmative). Yeah, totally. Totally got you.
And what kind of a person is the company looking for
this position? For VP of data science. What kind of
experience?
Morgan Mendis: So they're definitely looking for somebody who's got
experience getting their hands dirty, but they're also
looking for somebody who has experience strategically
envisioning how to lead a team and how to take a
product from prototype all the way to production and
deployment. So they definitely want somebody who
has the engineering experience and the analytical
experience a little bit, kind of that business acumen,
but is also willing to get a little bit deeper into the
weeds of the technical depth of evaluating models,
evaluating technologies. So of course, right? There's
the one thing about the idea of it being a unicorn, but I
think a key thing that we also want to push in this role
would be the idea of understanding the importance of
the patient being at the center of it.
Morgan Mendis: So definitely have to have a little bit of engineering and
the product mindset, but I think that a really big
important thing to Inspire is the culture. So making
sure that somebody is able to understand the
importance of patient centricity and autonomy. We
don't want somebody who's pushing like, "Hey, this
data science, it's going to solve everything. They just
need to give us their data and then we'll manage
everything." No, it's got to be a give and take
relationship. So if they give us data, we need to be able
to give them back something so that they have an
incentive in order to share their information. Because
Inspire is, its mission and goal is actually to promote
medical science. So it's trying to work together. It has
relationships with the FDA and other research bodies
trying to understand how can they better understand
these conditions from the patient perspective. And so
at Inspire, they call that the patient voice. So having
someone in a role who understands the technologies,
but also understands the human element, I think
that's really important for Inspire.
Kirill Eremenko: Mm-hmm (affirmative). Mm-hmm (affirmative). And
where is Inspire located? Where would this role be
based?
Morgan Mendis: So this would be based in the D.C. area. Inspire's
located in Arlington, Virginia.
Kirill Eremenko: Okay.
Morgan Mendis: Right. It's going to be right next to the new Amazon
headquarters.
Kirill Eremenko: Nice. D.C., you mean Washington, D.C., right?
Morgan Mendis: Washington, D.C. Sorry.
Kirill Eremenko: Okay. Got you. Awesome. Well, and finally, if
somebody is interested, how do they get in touch? I'm
assuming you would make the referral. How do they
get in touch with you?
Morgan Mendis: So they can reach me obviously on LinkedIn. So
Morgan Andrew Mendis, M-E-N-D-I-S. That's my last
name. But yeah, please feel free to send me a message
on LinkedIn. I'm happy to share more information.
Kirill Eremenko: Got you.
Morgan Mendis: But if you also want to look up the Inspire website, it's
www.inspire.com.
Kirill Eremenko: Nice. Very cool. And we will of course show those
things in the show notes. So if anybody out there is
interested in a VP of data science position, which we
don't talk that often about on the podcast, there's your
opportunity. And also I think this was very cool to
even, if you're not interested, I think it shows an
example of what people are looking for in a VP of data
science role and what those roles are like. So yes,
thanks a lot Morgan for sharing that. Hopefully you'll
get some really cool applicants for this role. And it
sounds like the company is doing some very good
things for the community.
Kirill Eremenko: And so tell us a bit about your dream now,
congratulations first of all, you're pursuing your
dreams. You're in Haiti, you've completely changed the
course of your career now and your life, sounds really
exciting. Tell us like, how did it all transpire?
Morgan Mendis: I was just a little kid, back in Maryland and I was
dreaming of, how can I eventually grow up to give back
to the world? And I was telling you earlier that it all
stemmed from the idea that I want there to be more
economic opportunity in the world, economic
development. And it kind of stems from, in order to
have economic opportunity, you need to have some
kind of education. You need to have some kind of skill.
But before you could pursue education, you need to be
able to pursue, have good health. So that's why I really
got into healthcare.
Morgan Mendis: My goal was to get into global health. However, the
positions are extremely competitive. And that's why I
was brought on at an insurance company and I was
working part-time during my final semester of college.
So I started learning R and that's kind of how I got into
data science. And that kind of led me down the road to
eventually finding myself in a position where here in
Haiti they needed somebody who was able to do data
analysis and data visualization and really string
together all of their information systems. And that's
how I ended up here in Haiti.
Kirill Eremenko: And you're now the principal data scientist at, what's
the company called?
Morgan Mendis: So yeah, I'm actually doing two things down here in
Haiti. One is that I'm working with a nonprofit
organization called the Caris Foundation. And there I
took the title as health informatics consultant.
However, I'm also working to start the first data
science lab here in Haiti called Ayiti Analytics. And
there we're trying to train the first generation of data
scientists in Haiti. So we're really excited to hopefully
push forward the opportunities of Haitians to pursue
analytics and to use data science to improve the state
of the nation.
Kirill Eremenko: Wow. That's another very noble cause. And yeah, what
kind of challenges, you only started this like a few
weeks ago. What kind of challenges are you expecting
along the way?
Morgan Mendis: So I was mentioning to you earlier that there's a lot of
current political instability in Haiti. So it's tough to get
people to come into the office every day. Internet
access might be intermittent here. Sometimes people
lose power. So it's really difficult to have some kind of
cadence in terms of scheduling. So we understand that
for people sometimes to get to a location where we're
having the onsite training, that could be a challenge,
getting around the city, there's often roadblocks. So
it's common here for you to be in many WhatsApp
groups in order to get information on which streets are
available. So that's one challenge, is transportation
and for people coming in.
Morgan Mendis: Another challenge obviously is that for people to take
the time off, to spend, to learn data science, to invest
that time, they need to potentially not be working or
they might need to be working twice as hard as other
professionals in other countries where they might be
able to have a safe location where they can have
internet access, to learn the code. A lot of data science
learning requires online content and if you can't get
people physically in your proximity to help mentor
you, it's difficult. You need to be able to reach them via
the internet and that's a challenge here. So we are
experiencing that challenge, another challenge is that
there's not a lot of resources in Haitian Creole and in
French for data science. So I was very excited that you
were telling me that SuperDataScience has some of their
courses in French because that's definitely something
we're going to be interested in.
Kirill Eremenko: For sure. And as I mentioned, I would be more than
happy to support a mission like that and provide free
SDS accounts to your community to make sure that
they're learning and like we can do as much. And I
always love when people are doing things like this and
this was like what was surprising me when you were
speaking just now that you effectively turned down a
vice president of data science opportunity in an up and
growing company which is making a massive impact in
the world. It's something that people would just love to
have, a lot of people are aspiring to have and build
their career towards the VP of data science position.
You effectively turned that down to move to a country
where you're facing lots and lots of challenges and at
the same time like by your voice, I'm sure our listeners
will agree, I can tell you are happy, you're like excited,
you're living your dream.
Kirill Eremenko: How does that add up? Like you turned down a
massive progression in your career in order to follow
your dreams in a completely different environment,
which seems like much less secure, much less safe,
and yet you're very happy. Tell us, how do you feel
about all this?
Morgan Mendis: So one thing I have to say is that Haiti is one of the
most beautiful countries in the world. And I know
Kirill that you were talking earlier about that we both
are avid travelers. I'm just blown away being here. I
love the culture, I love the food, I love the music. It's
such a beautiful place. And I think that when you talk
about the opportunities of like people in their career
progressions, some people, they want to be in a
management position, they want to be in a position
where they're being able to make big, high level
strategic decisions. I have always been on more of the
community organizing side of things where I want to
be down with the people. I love learning languages
because I like talking to people and I like learning
about their problems. And generally what I've noticed
is that providing simple solutions that aren't
complicated, that can change somebody's life to me, is
much more rewarding than building these really
complicated tools and models that kind of sit behind,
sit in a server somewhere that nobody ever sees and
potentially doesn't change their life.
Morgan Mendis: It might make somebody, a few more percentage points
of ROI on their investment vehicle, or it might save a
couple dollars for a supply chain. However, for me it's
about changing somebody's life. It's about talking to
people, it's about hearing about how can I make it so
they can live a more rewarding life. Because, data
science has given me the opportunity to live a
rewarding life. Our education as a society, our
development as a civilization has all been towards,
pushing the whole race forward. It's not about us
individually or us in tribes, it's about us coming
together. So I'm really excited to be here and yeah, I
think that's about different interests, different
passions. You have to choose what do you find
valuable in life?
Kirill Eremenko: Mm-hmm (affirmative). Mm-hmm (affirmative). Yeah,
no, that's absolutely true. Your fulfillment, happiness,
they don't really come from accomplishment or going
up the career ladder. And sometimes it's necessary,
sometimes you might find exciting but sometimes you
just want something else in life and that's totally
normal. It's important to be able to let go of things and
move on to, as you said, you're following your dreams.
It's like no better place to be. So very excited for you.
Very pumped up. Can't wait to hear some of the great
things you'll do. Maybe like in a year, a year and a half
we can do another podcast where you'll tell us about
all those things you've created in Haiti and how many
people you've gotten up to speed on data science. I
think that'll be very cool.
Morgan Mendis: Yeah. Well I have to push you is that, I would argue for
me success would be that you are inviting somebody
from Ayiti Analytics who we've trained and who we've
got off the ground to be on the show to talk about what
they've done in maybe the last nine months.
Kirill Eremenko: Nice.
Morgan Mendis: That would be to me success. That would make me
happy.
Kirill Eremenko: That was awesome. All right. Okay. Let's blend that in.
Sounds like a good idea. Okay. So let's take this
opportunity that we're on the podcast and what I
would love to do is you are by far one of the most
advanced data scientists I've encountered in my,
speaking with data scientists, meeting people,
traveling around the world. And I want to... That's one
of the other reasons why I invited you on the show. I
really want to share your experience of advanced data
science with our audience and be like what kind of
takeaways they can get. So the things that you're
teaching in Ayiti and the people that you get up to
speed, I think they're going to be very lucky to have
such a great mentor as you leading them. So I would
love to see what kind of insights you can also share
with our audience here. How does that sound? Do you
mind going through a couple of your case studies or
use cases of data science that you've done in the past?
Morgan Mendis: Sure, I'll have to preface it with the fact that I don't
consider myself even within the top 10% of data
scientists. So I appreciate the compliment, but I think
that what I do is that I take models and I take
whatever tools best fits the situation and hey,
sometimes that's Excel. I make that argument all the
time is that you don't need advanced tools. Sometimes
you just have to use the tools to the full capability or
full extent. But yeah, I'm happy to go through some of
the case studies that we've discussed earlier.
Kirill Eremenko: Mm-hmm (affirmative). Sounds good. All right, well
let's get started. So are you going to be fine if we go
through your experiences like post-graduation, like
one by one like ChenMed, Aledade and so on.
Morgan Mendis: Sure.
Kirill Eremenko: Is that better?
Morgan Mendis: Yeah, yeah. No, that's fine.
Kirill Eremenko: Awesome. So in that case, let's get started, and
perhaps let's just go through your experiences after
graduation one by one. You mentioned that, in your
email that the first role that you had post-college
graduation was at ChenMed. So like what did you do
there and what kind of tools did you use?
Morgan Mendis: Sure. So I first, when I got out of college, I started as a
business analyst at ChenMed. So they are a medical
provider. They run several different facilities across
South Florida and they've opened up across the
Southeast. But my position really was using Excel to
do a lot of business analytics. So that was writing
reports, copying charts, and putting them into
PowerPoint and all of this. I was a little frustrated. I
was like, we could be automating a lot of this.
And so what I started doing was that I started
automating a lot of the work that I was doing in R, and
I would actually show up really early in the morning
before everybody else, seven o'clock. So I could start
automating the reports and my manager had no idea
that I was secretly automating the reports. But the
reason why I would stay late is because I was then
using the data to then explore different tools and
different libraries in R.
Morgan Mendis: So one of the things was, we had a challenge: one of the
patient offices was closing and we wanted to reallocate
people to different offices within the geographic region.
Very simple. I took the data, converted it into longitude
and latitude, and placed it on a map with ggmap. And then
I was able to calculate the distances to the local offices
and say, "Hey, we can just put all of these patients in
this office and it won't reach capacity. We can put
the other patients in this office and it won't reach
capacity." And based on proximity, they'll still be able
to do it within their regular commute. And it won't be
that much of a transition for the patient populations.
So very, very simple things like that. Taking existing
tools, right? But my key thing was, "Hey, I already
know these tasks exist out there. Let me just try and
automate them and that way I can get more access to
the data to explore what else we can do."
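The reallocation Morgan describes can be sketched in a few lines. This is only an illustration, not his R code: the coordinates, office names, capacities, and the greedy nearest-office rule are all assumptions, and the function names `haversine_km` and `assign_patients` are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def assign_patients(patients, offices):
    """Send each patient to the nearest office that still has a free slot.

    patients: {patient_id: (lat, lon)}
    offices:  {office_name: {"loc": (lat, lon), "free_slots": int}}
    Returns {patient_id: office_name, or None if every office is full}.
    """
    remaining = {name: o["free_slots"] for name, o in offices.items()}
    assignment = {}
    for pid, (plat, plon) in patients.items():
        # Rank offices from closest to farthest for this patient.
        ranked = sorted(
            offices,
            key=lambda n: haversine_km(plat, plon, *offices[n]["loc"]),
        )
        chosen = next((n for n in ranked if remaining[n] > 0), None)
        if chosen is not None:
            remaining[chosen] -= 1
        assignment[pid] = chosen
    return assignment

# Made-up coordinates and capacities, for illustration only.
offices = {
    "north": {"loc": (25.95, -80.20), "free_slots": 1},
    "south": {"loc": (25.70, -80.30), "free_slots": 2},
}
patients = {
    "p1": (25.94, -80.21),
    "p2": (25.93, -80.19),
    "p3": (25.71, -80.31),
}
result = assign_patients(patients, offices)
```

A greedy rule like this is the simplest choice; it does not guarantee the smallest total travel distance, which would need a proper assignment solver.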
Kirill Eremenko: Mm-hmm (affirmative). What's an example of the
automation that you were performing in Excel with R?
Morgan Mendis: So there was a lot of opening up different
spreadsheets, setting up linkages, lots of different,
what we would consider typically to be unions or
concatenations of the datasets. And so that was a
simple thing: I would potentially have to wait for a
new data set or a new Excel file to come in
before I could do my full report or run my
aggregations, like the pivot tables. And to me it was
like, "Pivot tables are cool, but there's nothing special
about them. We're familiar with them as GROUP BY
in SQL." So I started just writing up scripts. I
was like, okay, once the data comes in, boom, run the
script and the data's going to be pumped out to a
new Excel spreadsheet. I just have to then move a
table into a PowerPoint and focus more of my time on
analyzing the data for the key insights that we'd be
providing back to the office professionals rather than
spending my time trying to make sure that my indexes
and my VLOOKUPs match up.
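As a rough sketch of that kind of script, the union of extracts followed by a pivot-table style total might look like this. The column names and figures are invented, the extracts are inline strings rather than real Excel files, and it uses only the Python standard library rather than the R code Morgan wrote.

```python
import csv
import io
from collections import defaultdict

# Hypothetical monthly extracts; in practice these would be files on disk.
JAN = "office,visits\nnorth,10\nsouth,7\n"
FEB = "office,visits\nnorth,12\nsouth,5\neast,3\n"

def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))

def union(*tables):
    """Stack tables on top of each other (a UNION ALL / concatenation)."""
    rows = []
    for table in tables:
        rows.extend(table)
    return rows

def total_by(rows, key, value):
    """Sum `value` per distinct `key`, like a pivot table or GROUP BY."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += int(row[value])
    return dict(totals)

all_rows = union(read_rows(JAN), read_rows(FEB))
visits_by_office = total_by(all_rows, "office", "visits")
```

Once the steps are functions, a new month is one more argument to `union` rather than another round of copy, paste, and VLOOKUP.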
Kirill Eremenko: Mm-hmm (affirmative). Mm-hmm (affirmative). Okay,
got you.
Morgan Mendis: So it's about taking the time so that you can focus on
analysis and understanding the data rather than
focusing on, "Hey, am I looking at the right file? Hey, is
there any kind of data validation errors going on?"
That's another example: if you need to validate
the data, or if you had issues with
the formatting in Excel, then with R or with
Python you can automate most of that too, and do unit
testing to validate the data, so that you
don't need to spend your valuable time as
an analyst checking these small little boxes. You can
spend more of your time understanding the data,
understanding what was the process of which the data
got to you by and potentially how can you make it
more valuable when you send it off to the next person.
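A minimal sketch of automated validation with unit tests, assuming rows arrive as dictionaries of strings (as they would from a CSV reader). The column names and rules here are hypothetical, chosen only to show the pattern.

```python
import unittest
from datetime import date

REQUIRED = ("patient_id", "visit_date", "cost")  # assumed column names

def validate_row(row):
    """Return a list of problems found in one row (empty list means clean)."""
    problems = []
    for col in REQUIRED:
        if not str(row.get(col, "")).strip():
            problems.append(f"missing {col}")
    if row.get("visit_date"):
        try:
            date.fromisoformat(row["visit_date"])
        except ValueError:
            problems.append("visit_date is not an ISO date")
    if row.get("cost"):
        try:
            if float(row["cost"]) < 0:
                problems.append("cost is negative")
        except ValueError:
            problems.append("cost is not a number")
    return problems

class ValidateRowTests(unittest.TestCase):
    def test_clean_row_has_no_problems(self):
        row = {"patient_id": "a1", "visit_date": "2020-03-01", "cost": "12.5"}
        self.assertEqual(validate_row(row), [])

    def test_bad_row_reports_each_problem(self):
        row = {"patient_id": "", "visit_date": "03/01/2020", "cost": "-4"}
        self.assertEqual(
            validate_row(row),
            ["missing patient_id", "visit_date is not an ISO date",
             "cost is negative"],
        )

# Run the tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ValidateRowTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

The same `validate_row` function can then be run over every incoming file, so the checks happen before the analyst opens the data.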
Kirill Eremenko: Mm-hmm (affirmative). I love it. So you're freeing up
your time from checking small things or concatenating
data, doing these recurring tasks in order to have
more time for exploration. That's a really, really cool
thing. And it sounds a lot like robotic process
automation, this type of automation. The
scripts that you wrote, did you need to run them
yourself or were they running in the background, like
every night or something like that?
Morgan Mendis: So again, this is part of me actually, I had to write the
scripts and there was no automation. I couldn't set up
a cron job at the time. So they, at this organization at
ChenMed, they were pretty tight about what tools you
could use and what access you could get to the
internal systems. So there was no way that I could
just, okay, I'm going to go hack into their system and
set up this cron job or set up this automation process.
I wrote the scripts, so that's why I would come in early
and I didn't mind coming in early because then I said,
"Hey, I get to spend the rest of the day exploring and
innovating with the data. So I don't mind coming in
early to just run the script. I can do it while I'm having
my cafecito." This was while I was in Miami.
Kirill Eremenko: Nice.
Morgan Mendis: So I love the Cuban coffee.
Kirill Eremenko: Got you. Okay, cool. Very cool. I think it's a very
valuable insight or career advice for people. Like if you
find yourself doing recurring things where stuff can be
automated. And I love your dedication. Like already
you can tell you're loving what you do. You come in
early, you stay late, you're having fun along the way.
Fantastic. What was next, where'd you go after
ChenMed?
Morgan Mendis: So after that, because I wanted to get more access to
actually deploying more advanced data science and
actually using more tools, I took a job at Aledade as a
data analyst. And so the first day when they hired me,
they were like, yeah, you can use R. That's great. And
once I started the engineers looked at me and said,
"No, this is a Python shop. You got to learn Python."
Kirill Eremenko: So they didn't know what they used at the interview?
Morgan Mendis: No. So originally the rest of the analytics team at the
time was using SAS, and I had studied econometrics with
Stata in my undergrad, so I was kind of familiar with
the idea of SAS, but I was like, "Hey, it's not open
source. It's not my go-to tool." But when they
suggested I learn Python, I was like, I've been looking
for an excuse to really ramp myself up on that. And so
luckily I had a really good mentor at the company, a
gentleman by the name of Jim Fulton, who really
provided a lot of guidance to me in learning Python
and learning some very good standards for software
development. And I don't consider myself a software
developer by any means, but he definitely helped to
guide me along and help me learn about a lot of the
tools, even though he wasn't working in data science
about how you could use Python for data science. He'd
been using Python for so long, he's like, "No, of course,
this makes sense. It's a great tool for data analytics."
Morgan Mendis: So, at that point I started exploring Python and
PostgreSQL and really got excited about it because I
was like, "Oh man, this opens up a whole new set of
tools and opportunities for me to either automate or
explore new modeling techniques and connecting
different tools." So again, for me it's all about finding
the right set of tools to improve the process and then,
as you improve the process, you're going to get a lot of
gains across the organization. So I was really enjoying
it at that point, but a lot of my most enjoyable
experiences there were actually more about leveraging my
training in economics to do econometrics.
Morgan Mendis: So at Aledade they are working for what are called
accountable care organizations. And, in a
similar vein to ChenMed, they were trying to push
patient centricity in a new movement that was called
value-based care. And so we started exploring how
we can analyze the data to make the system more
effective at providing optimal care to patients while
also reducing the cost of care. Right. So that's a huge
issue: healthcare is very expensive. So is there
potential, not necessarily to reduce the medical
services but instead to say which kinds of medical
services are going to have the greatest gains for
patients.
Kirill Eremenko: Okay, very interesting. Okay. And so you were able to
actually leverage econometrics and combine that with
data science. What are your takeaways from that? Not
often do those two get combined by data scientists.
Morgan Mendis: Yeah. So I think the really interesting aspect, and
this is where I really started understanding
the enjoyment of doing data science, was that we are
using statistics to do these evaluations and to better
understand, what we call medical interventions or
different treatments for different patient populations.
And the really interesting thing is that, I was able to do
like some complex analysis and make some strategic
decisions which changed our program delivery in
terms of how we suggested our health system work.
And I remember, years after doing this analysis, I
was reading Health Affairs, which is a really
popular journal for health data and analysis. And a
group of researchers had published some findings that
I had already uncovered earlier at this company. The
difference though was that after I did my analysis, I
had found some discrepancies in what our original
hypothesis was.
Morgan Mendis: And so I went down and did deep dives into the data.
Then I went up and I called actual people who were
working in the sites to come up with a patient
narrative, so as to better understand the data, and
then we were able to use that narrative to explain the
model and explain the results to other people. Because
a lot of people don't want to see, your regression
outputs, right?
Kirill Eremenko: Mm-hmm (affirmative).
Morgan Mendis: That's not the thing that's going to change their mind.
What's going to change their mind is if you give them a
story, give them a story that they can remember so
they can better understand, in the future be like, "Oh,
this kind of sounds like what Morgan was telling me
about the story." And so I thought it was really
interesting that I remember reading through the article
and at the end they identified this area but they didn't
provide any narrative or any explanation. And that was
the benefit to me: having this opportunity,
within a data science capacity, to actually be
able to use these advanced methods and get to the same
conclusions as academics, but being able to work with
actual practitioners to come up with a narrative that's
going to change the system. That's the exciting thing.
Kirill Eremenko: Is that why you used Tableau in this role to explain
these things in a more visual way?
Morgan Mendis: Yes, yes and no. So Tableau was one of the tools that
we used to, obviously, help tell the narratives. We also
used Tableau as a business intelligence tool, and it just
allowed us to rapidly take all of the massive amounts
of data we had and quickly turn it out. So when I
think about Tableau and I think about its value, I
think about how quickly you can transform data into
insights. But it's not going to
replace the opportunity to spend time with people,
talking to people. And I think that that's the one thing
I want to emphasize here: Tableau is great for
being able to present information, but it's not going to
replace the storyteller. It's just an aid or tool to help
you as a storyteller.
Kirill Eremenko: Mm-hmm (affirmative). Okay. Absolutely.
Morgan Mendis: You've got to make sure you have a story to tell, and
there's a quote that I'm reminded of: "Stories
happen to those who know how to tell them."
Kirill Eremenko: Wow. Nice. Very beautiful. Very beautiful quote. Okay,
so you managed to combine these two fields,
econometrics and data science. What else did you
experience in this role? Because there are some other
tools you used, not just Python but also R. You used
both Python and R in this role. Is that right?
Morgan Mendis: So I really preferred actually the statistical output from
R, so there are some regression models at the time
that I couldn't find in Python and I just felt a little bit
more comfortable with the robustness of all the output
that R was giving me. And as your listeners may well
know, R was designed really by statisticians. It
was born out of S. So there's a lot of model
development historically that's been done in R, and
there's a lot of really interesting or
innovative modeling that was done in R
before it got into Python or before data science blew up
to what it is today. So at the time, I really liked using
Python because it helped me connect my different
solutions, but I like to use R when I was actually doing
the evaluations. But if I was building a new data
product for example, or connecting some SQL to build
a web application, then I was going to go to Python
and I might then just have Python execute an R script
if needed.
Kirill Eremenko: Okay. Yeah. Makes sense. I've heard of that done
before. But since then, have you moved completely to
Python or are you still using a combination of the two?
Morgan Mendis: So I definitely go to Python. Like, at my new
position, I haven't installed R or RStudio yet. But for
example, I was working on the side doing a research
project with a former colleague and he asked me, he
was like, "Would you mind doing this in R so that I can
follow along with your code?" And I said, "Of course."
That's a 100% valid reason to use a language: so that
you can collaborate with others. If your team isn't
using Python, don't force Python on them. Use the tool
that's going to best enable you guys to work together.
Because to me, collaboration is way more important
than your personal speed in a language.
Kirill Eremenko: Mm-hmm (affirmative). Yeah, no, totally. Totally agree.
So as I understand you're mostly using Python now,
but do you still use both from time to time?
Morgan Mendis: Yeah.
Kirill Eremenko: Oh, okay. Got you. Anything else? Did you use any
advanced, I don't know, maybe like deep learning in
that role?
Morgan Mendis: Oh no, no. I have not actually had the opportunity to
explore deep learning in production. I've only been able
to play around with it in my like personal side
projects, but nothing in production yet.
Kirill Eremenko: Okay, got you. And so when you would deploy models
into production, would you then later maintain them,
and make sure that, like check up on them that
they're working well?
Morgan Mendis: Yeah, I think that's a key thing, of course. It
depends on the interval of your data. So for
example, we might be reevaluating models on a
monthly basis because, for example, at Aledade they
were getting data from the government on a monthly
basis. And because of the lag in terms of getting
complete records and claims data, you might only be
getting a portion of a given month. So you're going to
have to wait almost three or four months before you
have a full picture of all of the events that transpired
in a given month. So you have to keep updating your
models regularly to incorporate the new data or
potentially corrections to the data that might be
coming. So, especially if you're dependent on data
coming from a third party, right? They might change,
they might change their methodology for how the data
is being sent over to you. Right?
Kirill Eremenko: Mm-hmm (affirmative)
Morgan Mendis: Or some other kinds of procedures. So you need to be
able to quickly, you need to be regularly, I'm sorry, not
quickly. You need to be able to regularly go back and
reevaluate.
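One way to picture the lag Morgan describes is to split service months into those old enough to be treated as complete and those still filling in. This is a sketch under an assumed three-month run-out; the threshold, the dates, and the function names are illustrative and not from the episode.

```python
from datetime import date

def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (days are ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def split_by_maturity(service_months, as_of, runout_months=3):
    """Separate months with (assumed) complete claims from months still maturing.

    runout_months is an assumption: how long until a month's claims
    are considered complete.
    """
    mature, maturing = [], []
    for month in service_months:
        if months_between(month, as_of) >= runout_months:
            mature.append(month)
        else:
            maturing.append(month)
    return mature, maturing

# Service months July to December 2019, evaluated in mid-January 2020.
months = [date(2019, m, 1) for m in range(7, 13)]
mature, maturing = split_by_maturity(months, as_of=date(2020, 1, 15))
```

Each monthly refresh moves the boundary forward, which is one reason a model fitted last month may need to be re-evaluated on revised data.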
Kirill Eremenko: Mm-hmm (affirmative). Okay. Got you. Awesome. Well
thanks for sharing this role at Aledade. It sounds like
a very important step in your career where you've got
to learn Python and also apply econometrics in
combination with data science. Yeah. What was the
next step after that?
Morgan Mendis: So I joined Inspire, the position we were just talking
about, where they're hiring for the VP of data science,
and I was actually the first data science hire at the
company.
Kirill Eremenko: Oh, okay.
Morgan Mendis: So, yeah. So it was kind of a green field
opportunity and I actually got to learn a lot about all
the different aspects of data science, in terms of
building up first key analytics and then moving to
challenges such as buy versus build. So, I think I ran
the gamut at Inspire in the different things I did. So I had
some fun building out time series regressions to
forecast member growth; obviously that's fun
modeling. The other thing was working with AWS and
setting up different systems to build a data lake. So
just to start there is that because it was a green field
opportunity and I was the first data science hire, I had
a different perception of the way systems need to work
in order to produce high quality analytics. And so the
company had existing reports that they needed. But
for me I wanted to get into the exciting work of "Hey,
let's start building exciting data science products."
However, I need to have the data in a format and in an
environment that's accessible and is going to allow for
development.
Morgan Mendis: So the first thing I had to do was start designing a
data warehouse and building out an ETL process. So I
want to say that probably half of my job was actually
data engineering and then maybe a quarter was
actually doing exciting data modeling in data science.
And then another quarter was actually doing much
more of like the analytics management of saying these
are the tools that we need to put in the place, these
are the kinds of resources we need to do and this is
how we need to prioritize all of these different
objectives and opportunities.
Kirill Eremenko: Mm-hmm (affirmative). Okay. Interesting that you
separate them like that. So data engineering would
involve putting the right datasets together and making
sure that all data is flowing properly. Is that about
right or are there other elements to that role?
Morgan Mendis: No, so I would say that it really encapsulates much
more than just building the right datasets. It's about
identifying where the data's coming from and then
mapping out the best processes to putting them into
an environment that's accessible for data scientists or
data analysts. So I say it's more than just
creating data sets because, I think, a good
data engineer is critical to any team. Again, I
wouldn't consider myself a good data engineer. It was
more of, you have to get this work done in order to do
the fun stuff. So it's actually much more related
to software engineering, I think, in the sense that you
need to be able to put these systems together in a
sustainable way so that the data analysts, the data
scientists don't need to worry so much about cleaning
and validating the data. They can spend more of their
time analyzing it and talking with stakeholders to
build products.
Morgan Mendis: So in that role, I first took the data and explored
AWS Data Pipeline; that didn't work. I thought about
just writing a bunch of my own scripts and setting them
up with cron jobs. And I was like, that's not
sustainable. So eventually I settled on Airflow. And
this is an Apache project; I think it was originally
developed by Airbnb, and it's pretty amazing. It just
allows you to set up a lot of jobs that happen in
parallel so you can move data from one system to
another, process it multiple times, and you're actually
able to create directed acyclic graphs. So you can see
in a network how your data's flowing and potentially
where there are bottlenecks or where there are
errors that are going to break down your ETL process.
Kirill Eremenko: Okay. Wow. And so you settled down on Airflow and
that solution is still running now?
Morgan Mendis: Yeah.
Kirill Eremenko: Wow.
Morgan Mendis: And yeah, I'm hoping to set up a lot of other jobs in
Airflow in the future. I think it's a great tool. I hope
that AWS catches up with their data pipeline, but I
always keep my fingers crossed. I think AWS, they're
always going to produce something amazing. But right
now I think Airflow is doing a great job. And especially
if you're looking for something to quickly get started
with to build your own processing. Highly suggested.
Kirill Eremenko: Okay. So tell me a bit more, how does it work? So you
have lots of data sources, you have an ETL process.
How does Airflow facilitate that?
Morgan Mendis: Okay. Yeah. So Airflow uses what are called operators.
So you might be writing scripts and writing them as
functions, as callables, in order to move data,
transform data, or produce reports, right? So you
can use Airflow to send messages, you can use Airflow
to send emails, like automated reports to end
users. But the key thing is that you're writing
functional scripts in Python, or you could actually use
other languages too. It basically allows you to
use whatever language is best suited for you. So you
could write bash scripts if you wanted to. And it uses
these operators to then call these
functions to do these transformations.
Morgan Mendis: So Airflow also has, I believe, integrations with Redshift
and Postgres and many other popular data tools,
so that you can then say, "Hey, I've got this data in my
MySQL database, and I want to federate it across, let's say put
it into an Elasticsearch index, right? And I'm going to set
this index up." So I would say, "All right, Airflow, copy
this data down, put it into CSVs from MySQL, then store
those flat files in S3; then from S3 I'm going to load
them into Elasticsearch and/or separately into
Redshift."
Kirill Eremenko: Mm-hmm (affirmative).
Morgan Mendis: Right? And each one of those tasks, moving the data
from MySQL to CSV, from the CSV to S3, from S3 to
whatever data system or other storage system you
want for analytical querying, each one of those tasks
could be a Python function and you would then have
different nodes set up. And so then you can create
dependencies, right? So that one task will only execute
after successful completion of a previous task, right?
So as you imagine, there can be splits, right? Or there
could be different tasks happening in parallel and all
of that can then be represented via a directed acyclic
graph. So if you're a big fan of, network theory and
optimization as I am, you are really excited because
then you have a bunch of graphs and nodes and you
can watch them all move and execute and you just get
a little giddy about watching it all.
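The dependency idea can be illustrated without Airflow itself. The sketch below is not the Airflow API: it uses Python's standard-library `graphlib` to order tasks in a directed acyclic graph and runs a task only when everything upstream succeeded. The task names mirror the MySQL to CSV to S3 example above, and the task bodies are placeholders that only record that they ran.

```python
from graphlib import TopologicalSorter

def run_pipeline(tasks, depends_on):
    """Run tasks in dependency order; skip a task if any upstream task did not succeed.

    tasks:      {name: callable returning True on success}
    depends_on: {name: set of upstream task names}
    Returns {name: "success" | "failed" | "skipped"}.
    """
    graph = {name: depends_on.get(name, set()) for name in tasks}
    status = {}
    for name in TopologicalSorter(graph).static_order():
        upstream = depends_on.get(name, ())
        if not all(status.get(up) == "success" for up in upstream):
            status[name] = "skipped"
            continue
        try:
            status[name] = "success" if tasks[name]() else "failed"
        except Exception:
            status[name] = "failed"
    return status

log = []  # records the order in which placeholder tasks actually ran

def step(name, ok=True):
    """Build a placeholder task that logs its name and reports `ok`."""
    def task():
        log.append(name)
        return ok
    return task

depends_on = {
    "csv_to_s3": {"mysql_to_csv"},
    "s3_to_elasticsearch": {"csv_to_s3"},
    "s3_to_redshift": {"csv_to_s3"},
}
names = ("mysql_to_csv", "csv_to_s3", "s3_to_elasticsearch", "s3_to_redshift")
tasks = {name: step(name) for name in names}
status = run_pipeline(tasks, depends_on)
```

The two load steps share one upstream task and do not depend on each other, which is the kind of split a scheduler can run in parallel; this sketch runs them one after the other for simplicity.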
Kirill Eremenko: That's so cool. It sounds like a very advanced version
of SSIS that Microsoft provides.
Morgan Mendis: Yeah, I can't say I've worked with SSIS but, yeah, it's
open source. So I think you can probably-
Kirill Eremenko: It's just like a really cool advanced ETL tool.
Morgan Mendis: Yeah.
Kirill Eremenko: Mm-hmm (affirmative).
Morgan Mendis: Yeah, exactly. So like I said before, Amazon
has their data pipeline process and it's very similar in
that you can write scripts and you can even leverage it
with Amazon Lambda functions in order to execute
different things. In the same way, with Lambda
you're hitting different functions and you have inputs
and outputs. It's similar; I just thought that it was a lot
easier to work with Airflow. It's all in one system,
rather than going to AWS where you have to work in
their ecosystem and configure everything via
JSON. The nice thing is that Airflow will spin
up a Flask instance, so you have a web app that
you can interact with and you can turn different
operation jobs on and off. And I was actually really
impressed: one of my friends showed me that his
company wrote their own custom operators in Airflow,
with which they were then able to use what they call
Jupyter-centric development.
Morgan Mendis: And this is a really interesting idea: to make
data scientists able to quickly iterate and prototype
and put things into production, they would just make
it so that if you write a Jupyter notebook that
executes and builds a model or processes the data
whichever way the data analysts and the data
scientists want, all they need is to have a
notebook that's clean enough that it can run from start
to finish successfully and boom, you just take that
notebook, you send it to your data engineer, they put
it into Airflow and you have readable code. Right?
Kirill Eremenko: Wow.
Morgan Mendis: Right. It's a beautiful thing: you have readable
code living and working in production. You don't need
to take your code from the Jupyter notebook, copy it to a
.py script, wrap it in another function. No, you just
have a Jupyter notebook. It's all there.
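To make the "run the notebook from start to finish" idea concrete, here is a deliberately tiny sketch. It is not the custom operator Morgan's friend built, and a real deployment would more likely use a dedicated notebook-execution tool. It relies only on the fact that a notebook file is JSON with a list of cells, and it assumes plain Python code cells with no IPython magics; `run_notebook` and the demo notebook are made up for illustration.

```python
import json

def run_notebook(notebook_json, namespace=None):
    """Execute the code cells of a notebook, top to bottom, in one shared namespace.

    notebook_json: the text of an .ipynb file (nbformat 4 layout assumed).
    Returns the namespace so results can be inspected afterwards.
    """
    nb = json.loads(notebook_json)
    ns = {} if namespace is None else namespace
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are documentation, not code
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        exec(source, ns)  # any exception stops the run, as a failed task should
    return ns

# A made-up two-cell notebook standing in for a real model-building notebook.
demo = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Toy model"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["values = [2, 4, 6]\n", "mean = sum(values) / len(values)"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": "report = f'mean={mean}'"},
    ],
})
ns = run_notebook(demo)
```

Because the cells share one namespace and run in order, a notebook that only works when cells are run out of order will fail here, which is exactly the "clean enough to run start to finish" requirement.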
Kirill Eremenko: Very cool. Very cool. That's the way it should be, right?
Like why should you have to go through all these
hoops and finally create potential for additional errors
where you can just, you already have the code, just
run it.
Morgan Mendis: Yeah. And again, like I said, because it's readable
you have all the formatting and the benefits. Like, for
example, you might have it so that your Jupyter
notebook is going to produce some kind of
visualization of your metrics or evaluation of your
models. I'm not sure if you guys have talked previously
about using data visualization to evaluate models in
development, but if you have that in your script, then
you could quickly go into Airflow or go into your
system, look up the notebook and say, "Oh hey, that
model that we've been checking up on, we can see
how well it's performing with the new data via Airflow
by just checking the notebook," rather than having to go
through all these other hoops to evaluate the
model.
Kirill Eremenko: Mm-hmm (affirmative). And by evaluating the model,
you just mean like the lift curve and things like that?
Morgan Mendis: Yeah, you could have all of those visualizations right
there.
Kirill Eremenko: Yeah, so that they're just updated with your data.
You don't have to rerun them specifically.
Morgan Mendis: Yeah.
Kirill Eremenko: Very cool. Very useful tool. Thanks a lot for
the debrief on how Airflow works. It makes me
wonder: Excel, R, ggmap, SQL, Python,
Tableau, Scikit-learn, AWS, Airflow, Plotly, Plus. All
those you've already mentioned, except for Plotly
probably, on this podcast
once or twice or many more times. Is there a limit to the
number of tools an advanced data scientist should
know? Or do you just keep picking up new ones
all the time?
Morgan Mendis: No, I think, yeah, you've got to constantly be learning.
Like, I didn't even touch on using Spark or anything
like that. Right now I'm very interested in
learning Scala just because I'm like, "Oh, this is going
to be a great language." Then maybe move on to Kotlin
or something else. But I think that as you learn new
tools, you think of problems in a different way and I
think that's the key thing is that, for example, I love,
as I mentioned earlier, I love learning languages
because I like talking to people. But I also know that
what's really interesting in learning languages is that
you start thinking in a different way. You start
thinking culturally or you start thinking as you
approach people in a different way than you might
have thought before.
Morgan Mendis: There's a popular theory, I don't know actually how
popular it is, but it's called Whorf Theory, from
linguistics, which says that you are limited in what
you can think by the language that you know.
Right? So for example, we might not be able to think of
a certain solution because we don't have the words to
envision it or describe it. So again, I think I was
actually listening to one of your previous podcasts about
the future of AI, and I thought this was good: you
guys were mentioning, can you explain in ant
language what a monkey is to an ant? Right? And I
think that that's a key thing is that as you learn new
tools and languages, you're going to start thinking
about problems in a different way. Right?
Morgan Mendis: And I don't think it's about, like I said, I still go back
to Excel to do certain tasks, not because I'm like, "Oh,
Excel is the most advanced tool." But instead I tell
people, I was like, "Instead of learning necessarily all of
these advanced tools, it's like have you mastered
Excel? Because Excel might be the key tool you need
to be successful or for your organization to find value
in the work you're doing."
Kirill Eremenko: Absolutely. Absolutely. And what languages do you
speak out of curiosity?
Morgan Mendis: So English, Spanish and Portuguese. And I'm working
now on my French and Haitian Creole.
Kirill Eremenko: Nice. Very nice. And so what I wanted to like, to your
point, this is a great example that sometimes you
cannot think of a solution, just because we are limited
by the language we speak. So I was born in Russia and
I speak Russian. And like recently I've been interested
in Eastern European languages. Now I'm learning
Spanish as well and I've found this very peculiar
phenomenon that, for example, in Russian we don't
have a separate word for hands and arms, we just
have one word, ruka. And that means both hand and arm.
So instead of "shaking your
hands" as in English, it would translate to "shake your
arms." We don't have a separate word for feet versus
legs. We just have one word, noga. Like "put your
shoes on your feet" in Russian would sound in English like
"put your shoes on your legs." And things like that.
Kirill Eremenko: So, and that is the same in the Czech language, the Polish
language and, as far as I know, in some other Eastern
European languages: there's just no separate word to
distinguish between arm versus hand, foot versus leg.
And I'm not sure, you might
comment on this, is that the same in Spanish or
not? But it just shows that there are certain
limitations, as you say. And I can see how
you're making this point that in data science as well,
the more tools you know, Python, R, Excel, Tableau,
whatever it is, AWS, Scala, Spark, the more
opportunities you have to think of different solutions.
Morgan Mendis: Yeah. I think the key thing is that it's also culturally
relevant. Right? I don't know, in Russian, obviously
I think you do shake hands, but
how popular is it culturally to think about these
things? I was talking to one of my colleagues yesterday
about the idea that in some languages they don't have,
like, left and right. They only think in
cardinal directions.
Kirill Eremenko: Interesting.
Morgan Mendis: Right? So yeah, when they talk about something they
use the idea of east to west, right? To describe
progression, or as time moves. Right? Or in a story. And
it's really interesting to think about that. And I
think it also goes into what's most relevant to
a given culture, to a given group of people. Right? Like
we know that in some cultures, some Native American
cultures in Alaska, they have over
10 words for snow, right? To describe all the different
types of snow. Right? And as English speakers,
we're just like, "Hey, there's snow." You get snow.
Right?
Kirill Eremenko: Yeah.
Morgan Mendis: I actually am wondering, I'm like, I wonder what the
Creole word is for snow right now.
Kirill Eremenko: Yeah.
Morgan Mendis: Because I don't know how often they get to see it.
Kirill Eremenko: Yeah. That's right. Wow. Wow. Very cool. And in
addition to like the words or use cases in the case of
data science, so for different languages, the
programming languages in this case, in addition to
that variety, that enriches your problem solving
abilities. It also enhances the neural pathways in your
brain. If you keep speaking in Spanish and then in
English you're going to use different neural pathways
or slightly different neural pathways. The greater the
variance between the languages, the greater you're
going to have to engage your brain. Sometimes like
learning a language, sometimes I'm sitting there, my
head actually hurts because I feel that something is
changing. I have to overcome these long neural
pathways and new ones have to be formed for me to
think faster in your language. Same thing with
programming languages. Same thing with all these
tools that we use. The more of them we use, the more
neural pathways I believe your brain is going to develop
in order for you to use them faster, and that's going to
help you, aid you in coming up with solutions faster as
well.
Morgan Mendis: Yeah, no. Honestly, that's I think some of the most
exciting thing if you're a forever learner is that you
know when you're struggling that, "Oh wait, things are
changing in your head." right? You've got to maybe
throw out this, this previous limitation because there
was no connection point through that neural pathway,
right? There was no connection point, but now
because you've added in this language or you've added
in this concept, it allows you to rethink things and it's
like, "Oh wait, now I can send information down this
way. I can send my elec. Right? I can send those
neural electrodes down that way so I can think of it
this way."
Kirill Eremenko: Yeah.
Morgan Mendis: Oh, I think that that's really, really exciting. And I
mean that's why I think learning languages is a crucial
thing to help people conceive of new problems. And as
you're talking about like in Spanish, one thing that I'm
always kind of reminded about when I'm speaking with
a native Spanish speaker is sometimes my expressions
of like, "Oh me encanta eso." They're like, wait, why is
it that you don't have such a strong affection towards
such an item or affinity towards something. And I
don't realize it because in English we say love to
anything. "Oh I love this coffee." But you wouldn't say
that in Spanish or in French because they're like, that
word has a lot of significance. It has a lot of meaning
behind it and it makes you start thinking a little bit
more about the word choices.
Morgan Mendis: And it's actually sometimes interesting to hear people,
multilingual people, speak because I think that they
have a different perception of words potentially. And
they're more sensitive at times than people who are only
speaking one language because they're like, "This is
the word that we know and it's common
understanding." But people are like, "No, it's not that
common." It's all about context and place. And it's the
same way with data science tools. For example, me
using Python might be advanced to somebody who's
only using Excel. Right? But me using Python might
be trivial to somebody who's only working in Scala or
who knows, only working in Julia down the road or...
Kirill Eremenko: Yeah. Yeah, I totally get what you mean. Yeah, it's all
very relative and the more you explore, the more
variety, the more things you can see about your past
experiences, your future experiences, how others are
working on tools. I think it's a very exciting, exciting
thing to constantly be learning. And it's great to hear
from you, like, because obviously at the start of your
career, it's very exciting to be learning. But in your
situation, I know you say you don't think of yourself as
a highly advanced data scientist. I really think of you
as a very advanced data scientist. I think from what
we discussed on this podcast, our listeners will agree.
It's very inspiring to hear that coming from you
that even once you've accomplished so much and that
you're now pursuing your dreams and passion projects
and things like that, you're still very excited about
learning. So that never stops. And I truly admire all of
that. So thank you a lot for that inspiration in just the
way that you approach data science yourself.
Morgan Mendis: I appreciate the compliments, but I hope that I can live
up to them.
Kirill Eremenko: For sure. And so unfortunately we're running out of
time. And just before we wrap up, I wanted to do one
more shout out to this amazing undertaking that
you're doing in Haiti, Ayiti Analytics and how you're
changing or bringing up this new generation of data
scientists who are going to be changing a lot of things
locally and in the world and bringing good to the
world. How can people support this cause? Is there
anything that our listeners can do in order to
participate or just spread the word about what you're
doing in Haiti?
Morgan Mendis: Yeah, definitely take a look at ayitianalytics.org, which
I'm sure Kirill is going to have posted on this podcast
description, but we're definitely looking for
collaborators in the U.S., in Europe, in Africa
especially as well. We're really trying to advance data
science in the country and we want to find other
practitioners, other advanced data scientists who are
willing to give back in terms of their time to help
others grow, especially in giving opportunities for
people who are potentially going to interact with
exciting data to help develop their own country in a
way. And so we're definitely looking for collaborators,
mentors, especially if you speak Haitian Creole and
French. We're also looking for people to help us in
terms of translating some of the content that is online
into the local language.
Morgan Mendis: But yeah, if you just want to get involved and talk with
some of the students and other practitioners in our
group, that is also very welcome. So we hope that
people are interested and they want to collaborate and
they want to give back. And I think that that's a key
thing is that we're an organization that wants to use
data science for the benefit of everybody and we are
really looking for collaborators to come in and try to
help us with that.
Kirill Eremenko: Fantastic. Thank you so much for sharing. And I will
share the website link in the show notes, but just to
mention it here. It's Ayiti Analytics and Ayiti is spelled
A-Y-I-T-I. As you mentioned it's a play on words and
how Haiti is pronounced, right?
Morgan Mendis: Yeah. The phonetic pronunciation.
Kirill Eremenko: Of Haiti. Okay. Awesome. So ayitianalytics.org if you
want to check out. Morgan, once again, thank you so
much. How can our listeners get in touch with you
and you said LinkedIn, right? Is the best way to get in
touch with you?
Morgan Mendis: Yeah, and if they want to also send me an email
directly, they can also hit me up at
Kirill Eremenko: Mm-hmm (affirmative). Awesome, awesome. So
definitely people get in touch. Morgan, you're doing
some fantastic work. Before I let you go, one more final
question for you. What's a book that you can
recommend to our listeners to help inspire their
careers?
Morgan Mendis: I'm going to hit you back with two books actually.
Kirill Eremenko: Nice.
Morgan Mendis: One book that I think transformed my life in data
science early on was a book called The Manga Guide to
Databases.
Kirill Eremenko: Mm-hmm (affirmative).
Morgan Mendis: So go out, buy this book, give it to, you can give it to
somebody who's in middle school. They can learn
databases, they can learn SQL. They can understand
how it works from this book. This is how I learned
SQL. I learned SQL on the fly at an internship via this
book, 100% believe in it. It's really, really easy to learn
and read. It's almost for any level of reader. The other
book is, I really want to push some human centered
design, so it's The Design of Everyday Things. It's
another great book to give you an understanding into
how you can use design principles in whatever you're
hoping to do. Whatever it is, you can always benefit
from interacting with your clients or interacting with
your stakeholders and really understanding how you
can afford them greater value. So The Design of
Everyday Things.
Kirill Eremenko: Fantastic. Thank you so much. So The Manga Guide to
Databases and The Design of Everyday Things.
Morgan, once again, thank you so much for coming on
the show and sharing all your insights, inspiration
with us and also just showing us what it's like to live
the life of an advanced data scientist. I think we can
all aspire towards that and it's really cool to see how
you follow your dreams and hopefully those of us who
can help and assist will get in touch and assist you in
pursuing your passions and making a difference in
other people's lives in data science. Thank you so
much for being here today.
Morgan Mendis: Thanks for having me Kirill.
Kirill Eremenko: So there you have it ladies and gentlemen, that was
Morgan Mendis and thank you so much for being part
of our conversation here today. I know we went a bit
over but I hope you enjoyed it as much as I did and got
as many useful takeaways from the conversation. For
me, the biggest takeaway in terms of contributing to
society was of course what Morgan is doing with Ayiti
Analytics. So if you can help in any way, please get in
touch. I've already spoken to Morgan. I've told him, as
I mentioned on the podcast, we're going to supply as
many SuperDataScience courses as needed, absolutely
free of charge to this cause in order to help get
Morgan's team and the people that they're coaching up
to speed with the concepts of data science, machine
learning, artificial intelligence.
Kirill Eremenko: We just counted this out after the podcast. We have
about four courses that we already have in French and
we are going to be supplying these to Ayiti Analytics in
order for them, as they need these courses, in order for
them to help bring people up to speed with data
science and make an impact in the world. So get
involved if you can. The website is ayitianalytics.org and
we'll definitely share that in the show notes. And in
terms of technical aspects, my third takeaway was
Airflow and the whole concept of this advanced ETL
tool. I was very excited to hear about that. In fact, we
spoke with Morgan after the podcast and I invited
Morgan to come and present at DataScienceGO 2020
and he agreed.
Kirill Eremenko: So Morgan is going to be presenting at DSGO 2020.
We already have the topic; it's most likely going to be
Airflow and visual diagnostics for machine learning. It's
going to be an advanced workshop, a workshop for the
advanced track of our advanced practitioners. So if
you can make it to DataScienceGO, you will get to see
Morgan there and perhaps even attend his workshop.
The dates are 6th, 7th, 8th of November. So as usual,
you can already get your tickets now at
datasciencego.com.
Kirill Eremenko: So that is our episode for today. As usual, you can get
all the materials that we mentioned, any links to
Morgan's LinkedIn, to his email, to the projects that
he's working on. Also to the position that we talked
about, the VP of data science, you can get all of those
things at the show notes at
www.superdatascience.com/321. That's
superdatascience.com/321. Check it out and get what
you are most interested in, what you are most curious
about.
Kirill Eremenko: And of course hit Morgan up, connect with him on
LinkedIn. It's great to have advanced data scientists in
your network, people who you can come to with
questions, ask about their careers or just follow their
careers and see where it takes them. It's always
inspiring to see what an advanced data scientist, how
they choose to structure their career going forward.
Kirill Eremenko: On that note, once again, thank you so much for being
here today. If you'd like to meet Morgan, check out
datasciencego.com and I look forward to seeing you
back here next time. Until then, happy analyzing.