Date post: 05-Mar-2020

Improving Online Social Network collection and processing mechanisms

D. Tsagkarakis, A. Vavakos, V. Stavrou, M. Kandias

Athens University of Economics and Business

Problems

Time-consuming data management due to conventional relational databases.

Delays in the data mining mechanisms due to lack of parallel processing.

Need to upgrade existing mechanisms in order to make use of the latest API versions.

Dimitris Tsagkarakis, Alexandros Vavakos, Vasilis Stavrou, Miltiadis Kandias {d.tsagkarakis, alexandros.vavakos, stavrouv, kandias}@aueb.gr

Information Security and Critical Infrastructure Protection Laboratory

Dept. of Informatics, Athens University of Economics & Business (AUEB)

Introduction

Rapid explosion of Online Social Networks.

Users transfer their offline behavior to the online world.

Extraction of information from social networks contributes to the profiling of users.

Open Source INTelligence (OSINT) to mitigate the insider threat.

Use of a distributed cluster of machines to store and manage large amounts of data.

Need for parallelized data collection due to the constantly increasing amounts of data that social networks process.

Ability to connect to a social network using accounts from different networks.

Ability to simultaneously collect a user’s data from all the social networks in which they use the same account.

Proactive critical infrastructure protection capability.

Ability to enhance organizational monitoring systems to mitigate the insider threat.

Hadoop Ecosystem

Figure 3: Hadoop ecosystem
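
The poster names the Hadoop ecosystem but does not show how a job is structured. As a rough illustration only, the sketch below mimics the map, shuffle, and reduce phases of the MapReduce model in plain Python, counting hashtags across a few posts. It does not use Hadoop itself; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample posts are assumptions made for this example, and a thread pool stands in for the cluster nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(record):
    """Emit one (hashtag, 1) pair for every hashtag found in a post."""
    return [(word.lower(), 1) for word in record.split() if word.startswith("#")]


def shuffle(mapped):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(records, workers=4):
    # The map calls are independent, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, records))
    groups = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in groups.items())


# Hypothetical sample data, not taken from the poster.
posts = ["Hello #OSINT world", "#osint and #privacy", "no tags here"]
counts = run_job(posts)
print(counts)
```

On a real cluster the same three phases would be distributed across machines, with the framework handling data placement and failures.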

OLTP vs. OLAP

Inserts and Updates
OLTP System: Short and fast inserts and updates initiated by end users
OLAP System: Periodic long-running batch jobs refresh the data

Queries
OLTP System: Relatively standardized and simple queries that return relatively few records
OLAP System: Often complex queries involving aggregations

Processing Speed
OLTP System: Typically very fast
OLAP System: Depends on the amount of data involved

Space Requirements
OLTP System: Relatively small
OLAP System: Relatively large

Database Design
OLTP System: Highly normalized with many tables
OLAP System: Typically de-normalized with fewer tables; use of star and snowflake schemas
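
To make the contrast concrete, the following sketch runs both workload styles against an in-memory SQLite database: a short single-row insert (OLTP style) and an aggregation that scans the whole table (OLAP style). The `comments` table, its columns, and the sample rows are hypothetical and chosen only for illustration; the poster does not describe its actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, user_id TEXT, day TEXT, text TEXT)"
)


def record_comment(user_id, day, text):
    """OLTP-style operation: one short insert, committed immediately."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO comments (user_id, day, text) VALUES (?, ?, ?)",
            (user_id, day, text),
        )


def comments_per_user_per_day():
    """OLAP-style query: an aggregation that scans the whole table."""
    return conn.execute(
        "SELECT user_id, day, COUNT(*) FROM comments "
        "GROUP BY user_id, day ORDER BY user_id, day"
    ).fetchall()


# Hypothetical sample rows.
record_comment("alice", "2014-05-01", "first")
record_comment("alice", "2014-05-01", "second")
record_comment("bob", "2014-05-02", "hello")

summary = comments_per_user_per_day()
print(summary)
```

The insert touches one row and returns at once, while the cost of the aggregation grows with the size of the table, which matches the "Processing Speed" row above.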

Final Twitter Crawler

Figure 2: Social media connectivity

Figure 5: Twitter Crawler configuration window

Twitter

User Privacy:

Ability of third parties to identify a user from a comment or image.

Option to display the geographical location from which a comment or image was posted.

Use of users’ personal information to target them with specific advertisements.

Improvements:

Parallelization using multithreading.

Design of a Graphical User Interface.

Crawler update to gather users sequentially from a file.

Crawler update to modify the tool’s configuration from within the application.

Crawler update to store incidents in a log file for later use (analysis or debugging).
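
The improvements listed for the Twitter crawler (a user list read from a file, multithreaded collection, and a log file for incidents) could be combined roughly as follows. This is a minimal sketch, not the authors' implementation: `fetch_user` is a hypothetical stub that stands in for a real API request, and the file names, logger name, and the rule that usernames starting with `!` fail are assumptions made for the example.

```python
import logging
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def read_users(path):
    """Return the non-empty lines of the user list file, in file order."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def fetch_user(username):
    """Hypothetical stub standing in for a real API request."""
    if username.startswith("!"):
        raise ValueError(f"invalid username: {username}")
    return {"user": username, "items": len(username)}


def make_logger(log_path):
    """Create a logger that appends incidents to a file."""
    logger = logging.getLogger("crawler_sketch")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def crawl(usernames, log, workers=4):
    """Fetch all users in parallel; log successes and failures."""
    results, failures = {}, []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fetch_user, name): name for name in usernames}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
                log.info("collected %s", name)
            except Exception as exc:
                failures.append(name)
                log.error("failed %s: %s", name, exc)
    return results, failures


# Usage with a throwaway directory and a hypothetical user list.
workdir = Path(tempfile.mkdtemp())
users_file = workdir / "users.txt"
users_file.write_text("alice\nbob\n\n!broken\n", encoding="utf-8")
logger = make_logger(workdir / "crawler.log")
results, failures = crawl(read_users(users_file), logger, workers=3)
log_text = (workdir / "crawler.log").read_text(encoding="utf-8")
```

Results are gathered in the main thread as each future completes, so the shared dictionary needs no locking, and a failed user is logged without stopping the rest of the run.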

YouTube

User Privacy:

Ability to display a user’s activity to third parties.

Ability to display a video’s information (view count, likes, etc.).

Connection with Google accounts.

Shared accounts with Facebook and Twitter.

Improvements:

Updates and improvements on YouTube’s API responses.

Parallelization using multithreading.

Changes to the data stored in the data warehouse.
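
One way an API response might be turned into a flat record for the data warehouse is sketched below. The poster does not state which API version or which fields the crawler uses, so everything here is an assumption: the nested keys (`snippet.title`, `snippet.channelId`, `statistics.viewCount`, `statistics.likeCount`) follow the general shape of a YouTube Data API v3 video resource, in which counts arrive as strings, and the function name and output column names are hypothetical.

```python
def to_warehouse_row(item):
    """Flatten one nested video item into a flat dict of typed columns."""
    stats = item.get("statistics", {})
    snippet = item.get("snippet", {})
    return {
        "video_id": item["id"],
        "title": snippet.get("title", ""),
        "channel_id": snippet.get("channelId", ""),
        # Counts are assumed to arrive as strings and may be missing.
        "view_count": int(stats.get("viewCount", 0)),
        "like_count": int(stats.get("likeCount", 0)),
    }


# Hypothetical sample item; real responses contain many more fields.
sample = {
    "id": "abc123",
    "snippet": {"title": "Demo", "channelId": "chan1"},
    "statistics": {"viewCount": "42"},
}
row = to_warehouse_row(sample)
print(row)
```

Defaulting missing counts to zero keeps the row shape stable; storing them as NULL instead would be a reasonable alternative if "hidden" must be distinguished from "zero".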

Figure 1: OLTP vs OLAP Systems

Conclusions

Figure 4: Twitter Crawler root window
