PySpark MCQ Solution - Part 4

1. Which of the following specialized data structures allows you to interact with PySpark?

a.RDD (correct)
b.Distributed RDD
c.PySpark Cluster
d.PySpark Core


2. Which of the following libraries allows PySpark to communicate with the Spark Scala-based API?

a.PySpark4J Library
b.PyJ Library
c.Py4J Library (correct)
d.PyS4J Library


3. You must submit PySpark code to a cluster using the command line. Which of the following commands will you use to perform this task?

a.spark submit
b.spark-submit (correct)
c.spark-cluster
d.spark cluster
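
To illustrate the answer, here is a minimal sketch that assembles a `spark-submit` command line from Python. The helper name `build_spark_submit_command`, the script name `job.py`, the argument `input.txt`, and the `local[2]` master are assumptions made for illustration; `--master` is a standard `spark-submit` option.

```python
import shlex


def build_spark_submit_command(script, master="local[2]", app_args=()):
    """Return the argument list for submitting a PySpark script (hypothetical helper)."""
    # The executable is the hyphenated `spark-submit`, not `spark submit`.
    command = ["spark-submit", "--master", master, script]
    # Anything after the script path is passed through to the application itself.
    command.extend(app_args)
    return command


command = build_spark_submit_command("job.py", app_args=["input.txt"])
# Print a shell-ready form of the command (shlex.join needs Python 3.8+).
print(shlex.join(command))
```

Passing `command` to `subprocess.run` would perform the actual submission on a machine where Spark is installed.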


4. Which of the following statements is correct if you start Python Spark Shells without options?

a.It can kill the shell instance. (correct)
b.It can kill SparkContext in the shell.
c.It can kill the Spark master on local.
d.It may kill the shell instance and the Spark master on local.


5. Which of the following parameters can be supplied when training an FP-Growth model using the fpm module in PySpark 2.4.4?

a.data
b.minSupport
c.numPartitions
d.All of these (correct)

6. You want to use a partition as an array while working with a DStream in PySpark 2.4.4. Which of these functions can be used to perform this task?

a.combine()
b.glom() (correct)
c.persist()
d.rowSet()
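
As a rough illustration of what `glom()` does, the following pure-Python sketch models a partitioned dataset as a list of iterators and gathers each partition into a single list. It does not use PySpark itself; the function name `glom_partitions` and the sample data are hypothetical. In PySpark, `rdd.glom()` and `dstream.glom()` are the corresponding calls.

```python
def glom_partitions(partitions):
    """Model of glom(): turn every partition into one list of its elements."""
    return [list(partition) for partition in partitions]


# Three hypothetical partitions, each an iterator over its records.
partitions = [iter([1, 2]), iter([3, 4, 5]), iter([])]
glommed = glom_partitions(partitions)

# One list per partition; an empty partition becomes an empty list.
print(glommed)
```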



7. You want to process data by using SQL and HiveQL. Which of the following can you use for this?

a.PySpark SQL (correct)
b.Dataframe
c.PySpark Core
d.RDD


8. You want to convert an internal SQL object into a native Python object. Which of these methods will you use to do so?

a.fromInternal(obj) (correct)
b.needConversion()
c.toInternal(obj)
d.None of these
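
The data types in `pyspark.sql.types` expose the methods named in the options above. The sketch below is a pure-Python model of that trio for dates, assuming the internal form is a count of days since 1970-01-01. The class name `SketchDateType` is hypothetical and is not part of PySpark.

```python
import datetime


class SketchDateType:
    """Hypothetical stand-in for a SQL date type with an integer internal form."""

    EPOCH_ORDINAL = datetime.date(1970, 1, 1).toordinal()

    def needConversion(self):
        # The internal and Python forms differ, so a conversion step is required.
        return True

    def toInternal(self, d):
        # Native Python date -> internal integer (days since the epoch).
        if d is not None:
            return d.toordinal() - self.EPOCH_ORDINAL

    def fromInternal(self, v):
        # Internal integer -> native Python date.
        if v is not None:
            return datetime.date.fromordinal(v + self.EPOCH_ORDINAL)


date_type = SketchDateType()
internal = date_type.toInternal(datetime.date(1970, 1, 11))
native = date_type.fromInternal(internal)
print(internal, native)
```

The round trip shows the direction of each method: `toInternal` goes from Python to the SQL representation, and `fromInternal` goes back.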


9. Which of the following can be used to create a Column instance by selecting a column out of a DataFrame?

a. df["colName"]
b. df.colName + 1 and 1 / df.colName
c. Both df.colName and df["colName"] (correct)
d. df.colName



10. You have specified the hint parameter in PySpark SQL to optimize planning decisions. What does the hint parameter influence in this scenario?

  1. The selection of join strategies.
  2. The repartitioning of the data.
  3. The selection of join strategies and repartitioning of the data.

a.1
b.2
c.3 (correct)
d.None of these
