Cloud computing

Answers should be based on well-cited articles/videos. Name the references used in your answer.

1. Big Data Products

Google is a master at creating data products. Below are a few examples from Google. Describe each product and explain how large-scale data is used effectively in it. (100 words each)

a. Google’s PageRank
b. Google’s Spell Checker
c. Google Flu Trends
d. Google Trends
e. Like Google, Facebook and LinkedIn also use large-scale data effectively. How?

2. Big Data Tools

a. Briefly explain why a traditional relational database management system (RDBMS) is not an effective way to store big data. (100 words)
b. What is a NoSQL database? (100 words)
c. Name and briefly describe at least 5 NoSQL databases. (200 words)
d. What is MapReduce and how does it work? (100 words)
e. Briefly describe at least 5 notable MapReduce products. (250 words)
f. Amazon’s S3 service lets you store large chunks of data online. List 5 features of Amazon’s S3 service. (200 words)
g. Extracting concise, valuable information from a sea of data can be challenging, so statistical analysis tools are needed to deal with Big Data. Name and describe at least 3 statistical analysis tools. (150 words)

3. Big Data Application

Name 3 industries that should use Big Data. Justify your claim in 250 words for each industry, using proper references.

4. Storage Design

Design a storage solution for a new application. (words as required)

Scenario

An organization is deploying a new business application in its environment. The new application requires 1 TB of storage space for business and application data. During peak workload, the application is expected to generate 4900 IOPS (I/O operations per second) with a typical I/O block size of 4 KB. The disk drive option available from the vendor is a 15,000 rpm drive with 100 GB capacity. Other specifications of the drive are: average seek time = 5 milliseconds and data transfer rate = 40 MB/sec.
You are required to calculate the number of disk drives needed to meet both the capacity and the performance requirements of the application.

Hint: To calculate the IOPS from the average seek time, data transfer rate, disk rpm and data block size, refer to slide 28 of the week 6 lecture slides. Once you have the IOPS, refer to slide 29 of week 6 to calculate the required number of disks.

5. Storage Evolution

Watch the following videos on Fibre Channel over Ethernet and answer the questions that follow:

−
−

a. What is FCoE and why do we need it? (150 words)
b. In your opinion, how is FCoE more cost-effective than a traditional connection? Give a brief explanation. (100 words)
c. What is a Virtual SAN? (150 words)
d. What are IP SAN protocols, and what is Fibre Channel over IP (FCIP)? (200 words)
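
For the storage design calculation in question 4, the following is a minimal sketch of one way to set up the arithmetic. It is not taken from the week 6 slides, which are not reproduced here. It assumes the commonly used disk service time model (seek time + half a rotation + block transfer time), decimal units (1 TB = 1000 GB, 1 MB = 1000 KB), and an optional utilization cap per disk. The function names, the parameter names and the 70% figure in the usage note are illustrative assumptions, so check them against the lecture formulas before relying on the result.

```python
import math


def disk_iops(seek_ms, rpm, block_kb, transfer_mb_s):
    """Estimate the IOPS of one disk from its service time.

    Assumed model (not confirmed by the lecture slides):
    service time = seek time + rotational latency + transfer time.
    """
    # Average rotational latency is the time for half a revolution.
    rotational_latency_ms = 0.5 * 60_000 / rpm
    # Time to transfer one block, assuming 1 MB = 1000 KB.
    transfer_ms = (block_kb / 1000) / transfer_mb_s * 1000
    service_time_ms = seek_ms + rotational_latency_ms + transfer_ms
    return 1000 / service_time_ms


def disks_required(capacity_gb, peak_iops, disk_gb, per_disk_iops, utilization=1.0):
    """Return the number of disks that satisfies both capacity and performance.

    utilization is an assumed cap on how busy each disk may run
    (1.0 means the disk is driven at its full theoretical IOPS).
    """
    for_capacity = math.ceil(capacity_gb / disk_gb)
    for_performance = math.ceil(peak_iops / (per_disk_iops * utilization))
    return max(for_capacity, for_performance)


if __name__ == "__main__":
    # Values from the scenario: 5 ms seek, 15,000 rpm, 4 KB blocks, 40 MB/s.
    iops = disk_iops(seek_ms=5, rpm=15_000, block_kb=4, transfer_mb_s=40)
    print(f"Per-disk IOPS: {iops:.1f}")
    print("Disks at 100% utilization:", disks_required(1000, 4900, 100, iops))
    print("Disks at 70% utilization:", disks_required(1000, 4900, 100, iops, 0.7))
```

Under these assumptions the performance requirement, not the capacity requirement, determines the disk count, and the answer changes noticeably depending on the utilization cap that is applied.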

