1) An HDFS file can ...
- ... be replicated on several nodes
- ... compressed
- ... combine multiple files
- ... contain multiple blocks of different sizes
2) How does HDFS ensure the integrity of the stored data?
- by comparing the replicated data blocks with each other
- through error logs
- using checksums
- by comparing the replicated blocks to the master copy
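(Answer note: HDFS verifies data with per-chunk checksums stored alongside each block and recomputed on read. The sketch below illustrates the idea in Python using `zlib.crc32` as a stand-in checksum; the chunk size and function names are illustrative, not HDFS's actual implementation.)

```python
import zlib

CHUNK = 512  # HDFS checksums data in fixed-size chunks (512 bytes by default)

def write_with_checksums(data: bytes):
    """Return (data, checksums): one CRC per chunk, kept alongside the data."""
    return data, [zlib.crc32(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]

def verify(data: bytes, checksums) -> bool:
    """Recompute CRCs on read and compare; any mismatch signals corruption."""
    return checksums == [zlib.crc32(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]
```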
3) HBase is ...
- ... column oriented
- ... key-value oriented
- ... versioned
- ... unversioned
- ... uses ZooKeeper for synchronization
- ... uses ZooKeeper for electing a master
4) An HBase table ...
- ... needs a schema
- ... doesn't need a schema
- ... is served by only one server
- ... is distributed by region
5) What does a major_compact do on an HBase table?
- It compresses the table files.
- It combines multiple existing store files into one per column family.
- It merges regions to limit the number of regions.
- It splits regions that are too big.
6) What is the relationship between Jobs and Tasks in Hadoop?
- One job contains only one task
- One task contains only one job
- One Job can contain multiple tasks
- One task can contain multiple jobs
7) The number of Map tasks to be launched in a given job mostly depends on...
- the number of nodes in the cluster
- the mapred.map.tasks property
- the number of reduce tasks
- the size of input splits
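(Answer note: the map-task count is driven by the number of input splits, which by default follow the HDFS block size. A minimal sketch of that relationship, assuming the common 128 MB block size; the function name is hypothetical.)

```python
import math

def num_map_tasks(file_size: int, block_size: int = 128 * 1024 * 1024) -> int:
    """One map task per input split; splits default to the HDFS block size."""
    return max(1, math.ceil(file_size / block_size))
```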
8) If no custom partitioner is defined in Hadoop then how is data partitioned before it is sent to the reducer?
- One by one on each available reduce slot
- By hash
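(Answer note: without a custom partitioner, Hadoop's default HashPartitioner routes each key to reducer `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`. A Python sketch of the same scheme, not the Java original:)

```python
def hash_partition(key: str, num_reducers: int) -> int:
    """Default-style partitioning: mask to a non-negative hash,
    then take it modulo the reducer count, as HashPartitioner does."""
    return (hash(key) & 0x7FFFFFFF) % num_reducers
```

The mask guarantees a non-negative value before the modulo, so every key lands on a valid reducer and identical keys always reach the same one.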
9) In Hadoop, can you set ...
- the number of map tasks
- the number of reduce tasks
- both the map and reduce task counts
- None, it's automatic
10) What is the minimum number of Reduce tasks for a Job?
- As many as there are nodes in the cluster
11) When a task fails, Hadoop ...
- ... tries it again
- ... tries it again until a failure threshold stops the job
- ... stops the job
- ... continues without this particular task
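(Answer note: Hadoop re-runs a failed task up to an attempt threshold, commonly 4 attempts by default, before failing the job. A generic retry-loop sketch of that behavior; the function and default are illustrative, not Hadoop's code.)

```python
def run_with_retries(task, max_attempts=4):
    """Re-run a failing task; once the attempt threshold is exceeded,
    the failure propagates and the whole job stops."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # threshold reached: let the job fail
```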
12) How can you debug a MapReduce job?
- By adding counters.
- By analyzing logs.
- By running in local mode in an IDE.
- You can't debug a job.
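(Answer note: counters let a job report how often an interesting condition occurred, e.g. malformed input records, without digging through logs. A toy mapper sketch of the technique in Python; the counter name and record format are made up for illustration.)

```python
from collections import Counter

counters = Counter()  # stands in for Hadoop's job-level counters

def map_fn(line: str):
    """Toy mapper: count malformed records instead of silently dropping them."""
    parts = line.split(",")
    if len(parts) != 2:
        counters["BAD_RECORDS"] += 1  # visible in the job summary for debugging
        return []
    return [(parts[0], 1)]
```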