Facebook’s ad platform generates large volumes of unstructured data every day. To analyze it, Facebook adopted Hadoop, a powerful data processing framework, and developed its own platform, Scuba, which lets its Hadoop developers dive into massive data sets and run ad hoc analyses in real time. But is it really worth it?
The social networking giant has long maintained that it shares information only with your permission and anonymizes any data it sells to advertisers. Privacy concerns persist nonetheless. Users have long complained that the privacy settings are confusing and complicated, which makes it easy to share information with third parties by accident. That is a significant issue, because sensitive user information must remain private.
Facebook’s approach to big data is far from novel. It relies on Hadoop, a general-purpose computing framework, and on Flume agents, which collect log data and move it into the cluster for processing. The platform lets developers write map-reduce programs in any language. These data analysis capabilities are a key part of Facebook’s competitive advantage and are set to keep growing. That success stems from a highly collaborative approach in which developers can build applications and change the system as needed.
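To make the map-reduce idea concrete, here is a minimal sketch in Python of the two functions such a program consists of, run locally in a single process rather than on a Hadoop cluster. The tab-separated log format (user ID, then ad ID) and the names mapper, reducer and run_local are assumptions made for illustration; they do not come from Facebook’s code.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit (ad_id, 1) for every well-formed log line.

    Assumed (hypothetical) line format: "<user_id>\t<ad_id>".
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[1]:
            yield fields[1], 1


def reducer(pairs):
    """Sum the counts for each key.

    The input must already be sorted by key, which is what the
    shuffle-and-sort phase guarantees on a real cluster.
    """
    for key, group in groupby(pairs, key=itemgetter(0)):
        yield key, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> sort -> reduce in one process."""
    mapped = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(mapped))


if __name__ == "__main__":
    sample = ["u1\tad_7", "u2\tad_3", "u3\tad_7"]
    print(run_local(sample))
```

On a cluster, the framework would run many copies of the map step in parallel, sort the emitted pairs by key, and feed each key’s group to a reduce step. The sorted() call above stands in for that middle phase.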
Facebook hasn’t fully deployed Prism, its project for running Hadoop across multiple data centers, and the company declined to say how long the rollout will take. Executives hope that Prism will become a widely adopted open source project and the next step in Facebook’s data analytics strategy. However, the company isn’t ready to take on the challenge alone.
