Post on 12-Apr-2017
Igor Perisic, Ph.D. VP Engineering, LinkedIn
RecSys, Vienna 2015
Recommendations Within a Social Network: One step at a time
Member value propositions
CONNECT
with your professional world
STAY INFORMED
through professional news and knowledge
GET HIRED
and build your career
As a Data Engineer, what is your job?
Train the fanciest model possible?
Design elegant, scalable models?
Scale ML algorithms to billions of features?
Argue with Hadoop Dev and Ops about cluster usage?
Engage in righteous debates on the right model family?
Delight
Product Metrics
Relevance Metrics
Objective Functions
Infer
Relate
Step 2: What is the question?
≠
Step 3: Where is the data coming from?
O(n^2) point-to-point data integration complexity
LinkedIn (circa 2010)
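To make the O(n^2) claim concrete: if each of n systems may need its own feed to every other system, the number of directed point-to-point pipelines is n(n-1). The sketch below only counts them. The function name and the sample sizes are illustrative assumptions, not figures from the talk.

```python
def point_to_point_pipelines(n_systems: int) -> int:
    """Directed pipelines needed if every system feeds every other system directly."""
    return n_systems * (n_systems - 1)


# Illustrative sizes only (assumed, not taken from the slides).
for n in (5, 10, 20):
    print(f"{n} systems -> {point_to_point_pipelines(n)} pipelines")
```

Doubling the number of systems roughly quadruples the number of pipelines to build and maintain.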
Step 4: What is the Data that I am getting? The nightmare of Tracking
• Pain points
  • Payload on clients, intractable ETL dependencies, consistency across a multitude of use cases (Producers and Consumers)
Finding JobViews in PageViewEvent:
  trackingInfo#'job_id' = 123
  trackingInfo#'jobId' = 123
  trackingInfo#'jobID' = 123
  trackingInfo#'11' = 'jobId=123'
  trackingInfo#'13' = 'job_id=123'
• Asymmetry of concern between Producers and Consumers
• Hundreds of producers and consumers; 1000+ individual developers
• Brittle
• Compatibility (forward)
  • Tracking data persists for X years; all versions must be read for y/y analysis
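The JobViews example shows why consumers suffer: one logical field arrives under five encodings. A minimal sketch of the clean-up a consumer ends up writing is below. It assumes trackingInfo can be read as a Python dict of string keys; the helper name `extract_job_id` is hypothetical, and only the key spellings and slot numbers come from the slide.

```python
import re
from typing import Optional

# Key spellings and numeric slots copied from the slide's examples.
DIRECT_KEYS = ("job_id", "jobId", "jobID")
PACKED_KEYS = ("11", "13")
# Packed values look like "jobId=123" or "job_id=123".
PACKED_PATTERN = re.compile(r"^(?:job_id|jobId|jobID)=(\d+)$")


def extract_job_id(tracking_info: dict) -> Optional[int]:
    """Return the job id from any of the known encodings, or None if absent."""
    for key in DIRECT_KEYS:
        if key in tracking_info:
            return int(tracking_info[key])
    for key in PACKED_KEYS:
        value = tracking_info.get(key)
        if isinstance(value, str):
            match = PACKED_PATTERN.match(value)
            if match:
                return int(match.group(1))
    return None


# Hypothetical events, one per encoding on the slide, plus one non-job page view.
events = [
    {"job_id": 123},
    {"jobId": 123},
    {"jobID": 123},
    {"11": "jobId=123"},
    {"13": "job_id=123"},
    {"page": "home"},
]
job_ids = [extract_job_id(event) for event in events]
print(job_ids)
```

Every new producer variant means another branch here, in every consumer, which is the brittleness the slide describes.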
Step 5: Training/Evaluation/Testing/Deploying Scale, repeatability
• Prepare your data
  • Join across multiple types
  • Feature Experimentation & Engineering
  • Snapshots
• Training
  • Offline replays
  • Consistency
  • Speed of iteration
  • Time consuming for a Modeler
• Deploy
  • Config
  • Online evaluation paradigm
  • Availability of features
Step 6: Experimentation A/B Testing
• Latency in Model performance
  • 3 Phases, RAM: Ramp up, Aggressive development, Maintenance.
• Offline performance is not a guarantee of online performance
• A/B Testing. A/B Testing. A/B Testing
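One common way to read an A/B test on a rate metric is a two-proportion z-test. The sketch below is a generic textbook version using only the standard library, not the experimentation method described in the talk; the function name and the counts are assumptions for illustration.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail of the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: control A versus a new model B.
z, p_value = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=560, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A model that wins offline can still fail a test like this online, which is why the slide repeats the point three times.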
Step 7: Engineering best practices
• Top Complaints from Relevance Engineers/DS/ML/…
  • Discovery: where is the data?
  • Wrangling: can I make sense of the data?
  • Verifying: is the data correct?
  • Scaling: how can I scale my computation?
  • Workflow: how can I operate my processing?
  • Publishing: how can I get my results into production?
  • Process: I want to try things fast, you are slowing me down!
• And an unfortunate tendency to shy away from
  • Documenting
  • Updating and Maintaining
  • Automating
Scientific reviews, source safe, leverage, …
• nanos gigantum humeris insidentes (dwarfs standing on the shoulders of giants)
• You are not alone. A lot of individuals will work within your models
• Any great feature will be reinvented a gazillion times (Connection strength)
• You will turn off your laptop. Production servers are 24/7 with x 9’s
Some Definitions
Engaged
Employees who work with passion and feel a profound connection to their company. They drive innovation and move the organization forward.
Not Engaged
Employees are essentially “checked out”.
Actively Disengaged
Employees aren’t just unhappy at work; they’re busy acting out their unhappiness. Every day, these workers undermine what their engaged coworkers accomplish.
Source: Gallup; State of the global workplace, 2013
Some Numbers
Engaged
13% of the global workforce is Engaged at their work.
Actively Disengaged
There are twice as many Actively Disengaged members in the global workforce as Engaged individuals!
Actively Disengaged / Engaged = 2
[Chart: share of Actively Disengaged, Not Engaged, and Engaged employees, on a 0% to 100% scale, for Israel, Japan, France, Austria, Germany, Australia, US, and Costa Rica]
Can’t fill positions
Research shows that reduced labor market fluidity has a negative effect on employment, real wages, and productivity. Conversely, increased labor market fluidity has a huge positive impact on employment, especially for young workers and the less educated workforce.
• Alignment
  • Vision: I care about what we are trying to do
  • Culture: We do it right
• Empowerment
  • I can make a difference
• Growth
  • Personal
  • Career
• Confirmed Hires
• Applications per Impression
  • Not Views per Impression
  • Qualified applications per impression?
• Reducing Time to Hire
• Minimize Outrage!
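The candidate metrics above differ only in what counts as a success per impression. A minimal sketch of the three rates on a toy log follows. The record fields, the `qualified` label, and all counts are hypothetical; how "qualified" would actually be determined is not stated in the slides.

```python
from dataclasses import dataclass


@dataclass
class JobImpression:
    viewed: bool     # member clicked through to the job
    applied: bool    # member applied
    qualified: bool  # hypothetical label: the application was judged qualified


def per_impression_rates(impressions):
    """Views, applications, and qualified applications, each divided by impressions."""
    total = len(impressions)
    if total == 0:
        return {"views": 0.0, "applications": 0.0, "qualified_applications": 0.0}
    views = sum(i.viewed for i in impressions)
    applications = sum(i.applied for i in impressions)
    qualified = sum(i.applied and i.qualified for i in impressions)
    return {
        "views": views / total,
        "applications": applications / total,
        "qualified_applications": qualified / total,
    }


# Toy log of 10 impressions: 4 views, 2 applications, 1 qualified application.
toy_log = (
    [JobImpression(True, True, True)]
    + [JobImpression(True, True, False)]
    + [JobImpression(True, False, False)] * 2
    + [JobImpression(False, False, False)] * 6
)
rates = per_impression_rates(toy_log)
print(rates)
```

A model tuned for views per impression can look good while the rate that matters to job posters, qualified applications, stays low.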
Delight
Product Metrics
Relevance Metrics
Objective Functions
Rows: Current (employment); columns: Previous (employment)

Current \ Previous   Oracle   MSFT   Yahoo!   Google     FB   Total
Oracle                    -   1535      211      186     16    1948
MSFT                   2156      -      708      737     88    3686
Yahoo!                  227    369        -      256     48     900
Google                 1000   3638     1119        -    436    6193
FB                      208   1223      514     1237      -    3182
Total                  3591   6765     2552     2416    588

Row total / column total:
Yahoo!: 35%
Oracle: 54%
MSFT: 54%
Google: 256%
FB: 541%
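The percentages on this slide are consistent with dividing each company's row total by its column total. Reading rows as current employer and columns as previous employer (from the axis labels), that is hires from the other four companies divided by departures to them. The sketch below reproduces the figures from the totals printed on the slide; the variable and function names are assumptions.

```python
# Marginal totals copied from the slide's transition table.
hired_from_others = {  # row totals: currently at the company, previously elsewhere
    "Oracle": 1948, "MSFT": 3686, "Yahoo!": 900, "Google": 6193, "FB": 3182,
}
lost_to_others = {  # column totals: previously at the company, currently elsewhere
    "Oracle": 3591, "MSFT": 6765, "Yahoo!": 2552, "Google": 2416, "FB": 588,
}


def inflow_outflow_ratio(company: str) -> float:
    """Hires from the other companies divided by departures to them."""
    return hired_from_others[company] / lost_to_others[company]


ratios = {c: round(100 * inflow_outflow_ratio(c)) for c in hired_from_others}
for company, pct in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{company}: {pct}%")
```

A ratio above 100% means the company gained more people from this group than it lost to it.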
Want to find/switch to a new job?
Interested in this position?
• Is the location convenient?
• Are compensation/perks attractive?
• Is the company interesting?
• Is the project interesting?
• Is it a good opportunity to move up in my career?
• Is there any stellar employee in this company whom I want to work with?
• Do I have many friends/connections working in this company?
• …
Interest Model
Profile, experience match job requirement?
• job title ↔ member title
• job seniority ↔ member seniority
• job skills ↔ member skills
• job description ↔ member profile
• …
profile-job matching
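As one illustration of a single profile-job matching signal, the sketch below scores the skills pair with Jaccard overlap. This is an assumed stand-in for illustration, not the matching model used in the talk; the function names and the example skills are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def skill_match(job_skills, member_skills) -> float:
    """Case-insensitive Jaccard overlap between a job's skills and a member's skills."""
    def normalize(skills):
        return {s.strip().lower() for s in skills}

    return jaccard(normalize(job_skills), normalize(member_skills))


# Hypothetical job and member.
score = skill_match(
    ["Python", "Hadoop", "Machine Learning"],
    ["python", "machine learning", "SQL", "Statistics"],
)
print(f"skill match = {score:.2f}")
```

The same shape (normalize both sides, then compare) applies to the title, seniority, and description pairs, each with its own similarity function.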
a Job position
A good fit?
Does the candidate have a background similar to current employees in this company/position (to avoid either under-qualification or over-qualification)?
• Educational history
• School ranking
• Previous employers’ prestige
• Career growth in previous companies
• Length of employment
• Seniority
• Patents
• Publications
• Awards
• Certificates
• Endorsements
Organizational Fit Model
A job poster’s view
A job candidate’s view