Date posted: 08-Jan-2017
Category: Software
Uploaded by: elena-kolevska
REDIS FOR YOUR BOSS
ELENA KOLEVSKA
Who am I?
A nomad earthling. Lead developer @ www.speedtocontact.com
BLOG.ELENAKOLEVSKA.COM
@ELENA_KOLEVSKA
(Really) SHORT INTRO
No, seriously!
REDIS IS AN OPEN SOURCE (BSD licensed), IN-MEMORY DATA STRUCTURE STORE, USED AS A DATABASE, CACHE AND MESSAGE BROKER
The 'Definition' on redis.io
BASIC FEATURES:
▸ Different data structures
▸ Keys with a limited time-to-live
▸ Transactions
▸ Pipelining
▸ Lua scripting
▸ Pub/Sub
▸ Built-in replication
▸ Different levels of on-disk persistence
SPEED
AVAILABLE CLIENTS IN:
ActionScript, bash, C, C#, C++, Clojure, Common lisp, Crystal, D, Dart, Elixir, emacs lisp, Erlang, Fancy, gawk, GNU Prolog, Go, Haskell, Haxe, Io, Java, Javascript, Julia, Lua, Matlab, mruby, Nim, Node.js, Objective-C, OCaml, Pascal, Perl, PHP, Pure Data, Python, R, Racket, Rebol, Ruby, Rust, Scala, Scheme, Smalltalk, Swift, Tcl, VB, VCL
AVAILABLE DATA STRUCTURES
▸ Strings (Binary safe, can be anything from "hello world" to a jpeg file)
▸ Lists (Collections of string elements sorted according to the order of insertion)
▸ Sets (Collections of unique, unsorted string elements)
▸ Sorted sets (Like Sets, but every element has a score that determines its order)
▸ Hashes (Maps of fields associated to values. Think non-nested json objects)
▸ Bitmaps (Manipulate Strings on a bit level)
▸ HyperLogLogs (Probabilistic data structure used to estimate the cardinality of a set)
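HyperLogLogs are not used in the example that follows, so here is a minimal sketch of how one could fit the same tool: estimating how many distinct users have tweeted. The key name 'unique_authors' and both function names are hypothetical and not part of the talk; $client is assumed to be a Predis\Client (or any object exposing the same pfadd() and pfcount() methods).

```php
<?php
// Hypothetical helpers: 'unique_authors', trackAuthor() and countAuthors()
// are illustrative names, not taken from the talk.

/**
 * Record that a user has tweeted.
 * $client is assumed to be a Predis\Client.
 */
function trackAuthor($client, string $userId): void
{
    // PFADD adds the element to the HyperLogLog stored at the key.
    $client->pfadd('unique_authors', [$userId]);
}

/**
 * Estimate how many distinct users have tweeted so far.
 */
function countAuthors($client): int
{
    // PFCOUNT returns the approximate number of distinct elements added.
    return (int) $client->pfcount('unique_authors');
}
```

Adding the same user id twice does not change the estimate, and the structure stays small however many users are added, at the cost of a small counting error.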
Imagine...
TWITTER ANALYSIS TOOL
▸ Track a selected group of hashtags (#gameofthrones, #got, #gotseason7)
▸ Count mentions of certain keywords ('winter is coming', 'tyrion', 'jon snow', 'stark', 'targaryen', 'cersei', 'asha greyjoy', 'Khaleesi', 'sansa', 'arya')
METRICS:
▸ A feed of all tweets containing one of the hashtags
▸ Total number of tweets with one or more of the selected hashtags
▸ A leaderboard of keyword frequency
▸ A feed of tweets per keyword
[1] CONNECTING TO REDIS
Install the Predis package using Composer:
composer require predis/predis
...
// Initialize the client
$parameters = [
    'scheme' => 'tcp',
    'host' => '127.0.0.1',
    'port' => 6379
];
$client = new Predis\Client($parameters, ['prefix' => 'twitter_stats:']);
[2] SET TRACKED DATA
Use sets to store all the hashtags we'll be looking at and all the keywords as well
$client->sadd('hashtags', 'gameofthrones', 'got', 'gotseason7');
// hashtags | 'gameofthrones'
//          | 'got'
//          | 'gotseason7'
$client->sadd('keywords', 'winter is coming', 'winterfell', 'jon snow', 'stark', 'targaryen', 'cersei', 'asha greyjoy', 'dorne', 'Khaleesi', 'hodor', 'sansa', 'arya', 'white walkers', 'the night king');
[3] GET THE DATA
Use the Twitter Streaming API to receive notifications for tweets containing any of the hashtags we're following:
$hashtags = $client->smembers('hashtags');
// array (size=3)
// 0 => string 'got' (length=3)
// 1 => string 'gameofthrones' (length=13)
// 2 => string 'gotseason7' (length=10)
Save every new tweet from the stream as a separate String:
$keyname = 'tweet_id:' . $tweet_id;
$tweet_contents = "Winter is coming Khaleesi! #gameofthrones";
$client->set($keyname, $tweet_contents);
// 'tweet_id:45645656' | 'Winter is coming Khaleesi! #gameofthrones'
And then push it to a queue to be processed asynchronously:
// Use the list data structure as a queue
$client->lpush('message_queue', $keyname);
// 'message_queue' | 'tweet_id:45645656'
//                 | 'tweet_id:44645234'
//                 | 'tweet_id:43645232'
[4] WORKER TO PROCESS THE QUEUED JOBS
A separate worker grabs the oldest job off the tail of the queue and processes it:
// rpop returns the key name of the queued tweet, or null if the queue is empty
$keyname = $client->rpop('message_queue');
// 'message_queue' | 'tweet_id:45645656'
//                 | 'tweet_id:44645234'
//                 | 'tweet_id:43645232'
Reliable queue: RPOPLPUSH, BRPOPLPUSH
Blocking queue: BLPOP, BRPOP
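A plain RPOP loses the job if the worker crashes while processing it. The reliable-queue pattern named above can be sketched as follows, assuming $client is a Predis\Client. The 'processing_queue' key, the function name and the $handler callable are hypothetical, introduced only for illustration.

```php
<?php
/**
 * Take one job off the queue without losing it if processing fails.
 * 'processing_queue', processNextJob() and $handler are illustrative
 * names, not part of the talk.
 *
 * Returns false when the queue is empty, true after a job was handled.
 */
function processNextJob($client, callable $handler): bool
{
    // Atomically move the oldest job from the tail of 'message_queue'
    // to the head of 'processing_queue'.
    $job = $client->rpoplpush('message_queue', 'processing_queue');
    if ($job === null) {
        return false; // nothing to do
    }

    // If the handler throws, the job stays in 'processing_queue'
    // and can be retried or inspected later.
    $handler($job);

    // Processing succeeded: remove one occurrence of the job
    // from the backup list.
    $client->lrem('processing_queue', 1, $job);
    return true;
}
```

BRPOPLPUSH works the same way but blocks until a job arrives (or a timeout expires), so the worker does not have to poll an empty queue.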
[5] PROCESS THE TWEET CONTENT
$tweet_contents = $client->get($keyname);
$keywords = $client->smembers('keywords');
// Lowercase a copy once for matching, so the feeds keep the original tweet text
$tweet_lower = strtolower($tweet_contents);
foreach ($keywords as $keyword) {
    $keyword = strtolower($keyword);
    if (strpos($tweet_lower, $keyword) !== false) {
        // Increase the counter for this specific keyword
        $client->zincrby('mention_counter', 1, $keyword);
        // mention_counter | 'tyrion' => 9.00
        //                 | 'the wall' => 5.00
        //                 | 'arya' => 4.00
        // Add the tweet to the keyword's feed
        $keyword_feed_keyname = 'keyword_feeds:' . $keyword;
        $client->lpush($keyword_feed_keyname, $tweet_contents);
        // Keep only the newest entries (indexes 0 to 50)
        $client->ltrim($keyword_feed_keyname, 0, 50);
    }
}
// Increase the general tweet count
$client->incr('total_count');
// Add the tweet to the main feed and keep only the newest entries (indexes 0 to 100)
$client->lpush('main_feed', $tweet_contents);
$client->ltrim('main_feed', 0, 100);
[6] SHOW THE STATS
$total_count = $client->get('total_count');
// 'total_count' | 259
$scores = $client->zrevrangebyscore('mention_counter', '+inf', '-inf', ['withscores' => 1]);
// mention_counter | 'tyrion' => 9.00
//                 | 'the wall' => 5.00
//                 | 'arya' => 4.00
// Feed by keyword
foreach ($scores as $keyname => $score) {
    $keyword_feeds[$keyname] = $client->lrange('keyword_feeds:' . $keyname, 0, -1);
}
// Feed of all tweets containing one of the specified hashtags
$main_feed = $client->lrange('main_feed', 0, -1);
[7] USEFUL EXTRAS
[7] USEFUL EXTRAS: API RATE LIMITER
$ip = $_SERVER['REMOTE_ADDR'];
$timestamp = time(); // unix timestamp
$key = 'api_rate_limits:' . $timestamp . ':' . $ip;
// $key = 'api_rate_limits:1473613000:192.168.10.1'
// Read the counter for this IP in the current second
$api_requests = $client->get($key);
if (!is_null($api_requests) && $api_requests >= 3) {
    throw new Exception('Too many requests per second');
} else {
    // Increment the counter and set its expiry in one transaction,
    // so stale per-second keys are cleaned up automatically
    $client->multi();
    $client->incr($key);
    $client->expire($key, 10);
    $client->exec();
}
[7] USEFUL EXTRAS: PIPELINING
$keyname = 'tweet_id:' . $tweet_id;
$keywords = $client->smembers('keywords');
// Queue the commands locally and send them to Redis in a single round trip
$pipe = $client->pipeline();
$pipe->set($keyname, $tweet_contents);
foreach ($keywords as $keyword) {
    [...]
}
$pipe->incr('total_count');
$pipe->lpush('main_feed', $tweet_contents);
$pipe->ltrim('main_feed', 0, 20);
$replies = $pipe->execute();
[7] USEFUL EXTRAS: LUA SCRIPTING
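The slide for this topic carries no code, so what follows is only a sketch of what Lua scripting could look like here, not the example from the talk. It reworks the rate limiter as a single script, which Redis runs atomically: the counter is incremented and, on first use, given an expiry in one step. The constant name, the function name, the one-key-per-IP layout and the fixed window of $window seconds are all assumptions; $client is assumed to be a Predis\Client.

```php
<?php
// Lua script (illustrative): increment the counter at KEYS[1] and,
// if this is the first hit, make it expire after ARGV[1] seconds.
const RATE_LIMIT_SCRIPT = <<<'LUA'
local current = redis.call('INCR', KEYS[1])
if current == 1 then
    redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return current
LUA;

/**
 * Hypothetical helper: returns true once the IP has made more than
 * $limit requests inside the current window of $window seconds.
 */
function isRateLimited($client, string $ip, int $limit = 3, int $window = 10): bool
{
    $key = 'api_rate_limits:' . $ip;

    // EVAL takes the script, the number of keys, then the keys
    // followed by any extra arguments.
    $count = $client->eval(RATE_LIMIT_SCRIPT, 1, $key, $window);

    return $count > $limit;
}
```

Because the whole script executes as one unit on the server, no other client can slip a command in between the increment and the expiry, and the check costs a single round trip.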
Thank you!
Questions?
@ELENA_KOLEVSKA
HTTPS://JOIND.IN/TALK/C68B2