BA03 Introduction to Big Data Assignment 1

WARNING - Clicking on the "SUBMIT ASSIGNMENT" button will submit the assignment. Be sure that you have reviewed your answers before clicking it.

Subject Code: BA03

Subject Name: INTRODUCTION TO BIG DATA

Component Name: ASSIGNMENT 1

Question 1:- To analyze her data, Jennifer needs a system that collects, aggregates, and moves large amounts of log data from many different sources to a centralized data store. Which of the following tools would you suggest Jennifer use?

a) MapReduce

b) ZooKeeper

c) Oozie

d) Flume

Question 2:- How does Hadoop use computing resources?

a) It only distributes data to computing resources.

b) It distributes software to computing resources.

c) It distributes data and computing tasks to computing resources.

d) It creates shared memory for computing resources.

Question 3:- With advancements in technology, companies are using different ways of marketing their products and services. New sensors are being used with new marketing campaigns, and this results in new types of data and information. Which element of Big Data is being discussed here?

a) Volume

b) Velocity

c) Variety

d) Both volume and velocity

Question 4:- Which of the following can be tracked using RFID tags?

a) Raw materials

b) Scrap materials

c) Finished goods inventory

d) Insurance fraud

Question 5:- You are the Marketing Head of an organization. You plan to increase your market outreach by converting prospective customers into actual customers. Which of the following analysis approaches would you consider the best to adopt?

a) Data interpretation

b) Behavioral analytics

c) Data visualization

d) Data collection

Question 6:- What are the two disadvantages of the public cloud compared to in-house analysis?

a) Latency and risk to data security

b) Latency and software incompatibility

c) Higher cost and risk to data security

Question 7:- How is metadata defined?

a) Data about data

b) Pattern framework

c) Link analysis

d) Text mining

Question 8:- Sam is seeking a career as a data analyst. Which of these is a key responsibility of a data analyst?

a) Determine what data means and recommend ways to search the data

b) Specialize in collecting data from different sources, organizing it in a suitable format, and performing analysis

c) Design, create, manage, and interpret large datasets to achieve business goals

d) Develop code and images to automate data reports

Question 9:- In the MapReduce framework, map and reduce functions can be run in any order. Do you agree, and why?

a) Yes, because in functional programming, the order of execution is not important.

b) Yes, because the functions use KVPs as input and output; order is not important.

c) No, because the output of the map function is the input for the reduce function.

d) No, because the output of the reduce function is the input for the map function.
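For reference, here is a minimal Python sketch of a hypothetical word-count job (an example of our own choosing, not taken from the assignment) that illustrates the dependency behind Question 9: the reduce step consumes the grouped key-value pairs emitted by the map step, so the two phases cannot be swapped.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical word-count job: map emits (word, 1) key-value pairs,
# the shuffle step groups them by key, and only then can reduce run.
records = ["big data", "big insight"]

# Map phase: each input record becomes a list of key-value pairs.
mapped = [(word, 1) for line in records for word in line.split()]

# Shuffle/sort phase: group the map output by key.
mapped.sort(key=itemgetter(0))
grouped = {k: [v for _, v in g] for k, g in groupby(mapped, key=itemgetter(0))}

# Reduce phase: aggregate the values for each key. Its input is the map
# output, which is why reduce cannot be scheduled before map.
counts = {k: sum(vs) for k, vs in grouped.items()}
print(counts)  # {'big': 2, 'data': 1, 'insight': 1}
```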

Question 10:- Which of the following components of Hadoop provides SQL-like access to structured data and sophisticated Big Data analysis with MapReduce?

a) Hive

b) HDFS

c) HBase

d) MapReduce

Question 11:- Predictive models based on both historical and real-time data can help which of these businesses identify suspected cases of fraud at an early stage?

a) Marketing companies

b) Medical claims companies

c) Construction companies

d) CRM-based manufacturing companies

Question 12:- Why are Big Data applications susceptible to latency?

a) The volume of Big Data is too large to be analyzed rapidly.

b) Big Data may reside in a different location from the application.

c) Big Data cannot use in-memory computing.

d) Big Data applications are still in early stages of development.

Question 13:- ABC is a retail organization that conducts its business through e-commerce. The organization offers a customized online shopping experience to its customers with an attractive and responsive Web user interface. Now the company wants to collect data about customers’ activities on the Internet. What would be the best source for such data?

a) Transactional database

b) Social media

c) Weblogs of customers

d) All of the above

Question 14:- Which of the following is an RFID reader action?

a) Text mining

b) Credit notes management

c) Insurance fraud detection

d) Inventory management

Question 15:- What could be the biggest challenge for the production or operations unit of an organization?

a) Determining the data to be used for making business decisions

b) Determining the best Big Data technology to be used

c) Securing Big Data initiatives from unauthorized access

d) Determining the best way to present Big Data findings to enable decision-making

Question 16:- Which of the following options was one of the factors driving the creation of MapReduce?

a) Increasing processing power of new hardware

b) Business need for complex analysis of structured data

c) Increasing number of Web users

d) Spread of distributed computing

Question 17:- In designing the MapReduce framework, which of the following needs did the engineers consider?

a) It should be cheap and distributed free of cost.

b) Processing should expand and contract automatically.

c) Processing should be stopped in the case of network failure.

d) Developers should be able to create new languages.

Question 18:- Which of the following describes the map function?

a) It processes data to create a list of key-value pairs.

b) It indexes the data to list all the words occurring in it.

c) It converts a relational database to key-value pairs.

d) It tracks data across multiple tables and clusters in Hadoop.
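For reference, a minimal Python sketch of what a map function can look like, using a hypothetical word-count mapper: it turns each input record into a list of key-value pairs.

```python
def map_words(record: str) -> list[tuple[str, int]]:
    """Hypothetical mapper: emit a (word, 1) key-value pair for every word in the record."""
    return [(word, 1) for word in record.split()]

print(map_words("big data is big"))
# [('big', 1), ('data', 1), ('is', 1), ('big', 1)]
```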

Question 19:- Which of the following describes the reduce function?

a) It analyzes the map function results to show the most frequently occurring values.

b) It combines the map function results to return a list of the best matches for the query.

c) It adds the results of the map function to convert the KVP lists to columnar databases.

d) It processes map function results and creates a new KVP list to answer the query.
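And the matching sketch of a hypothetical reduce function, assuming the grouped output of the word-count mapper above: it processes the map results and emits a new key-value pair per key to answer the query (here, word frequencies).

```python
def reduce_counts(key: str, values: list[int]) -> tuple[str, int]:
    """Hypothetical reducer: collapse all values seen for a key into a single key-value pair."""
    return (key, sum(values))

# Grouped map output, as the reducer would receive it after the shuffle phase.
grouped = {"big": [1, 1], "data": [1], "is": [1]}
print([reduce_counts(k, v) for k, v in grouped.items()])
# [('big', 2), ('data', 1), ('is', 1)]
```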

Question 20:- How does MapReduce achieve co-location?

a) The scheduler sends the code to the machine where the relevant data resides.

b) The process scheduler distributes data of the same type to machines in the same cluster.

c) The master JobTracker sends map and reduce functions to the same machines or nodes in a cluster.

d) The slave TaskTrackers copy related data and code to adjacent clusters in case of processing failure.
