PySpark MCQ Solution - Part 3

1. Which of these is the correct way to write a date literal in PySpark SQL?

  1. SELECT DATE '1997' AS col;
  2. SELECT DATE '1997-01-20' AS col;
  3. SELECT DATE '1997-01' AS col;

a. 1
b. 2
c. 3
d. All of these (correct)
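
All three forms are accepted because Spark SQL fills in a missing month or day with 01. A minimal sketch of that rule follows; the names `DATE_LITERAL_QUERIES`, `expected_date`, and `run_date_literals` are hypothetical helpers introduced here, and `expected_date` is a plain-Python stand-in for the padding rule, not Spark's own parser.

```python
from datetime import date

# The three literals from the question, written as Spark SQL statements.
DATE_LITERAL_QUERIES = [
    "SELECT DATE '1997' AS col",
    "SELECT DATE '1997-01' AS col",
    "SELECT DATE '1997-01-20' AS col",
]


def expected_date(literal):
    """Plain-Python stand-in: pad a missing month or day with 1."""
    parts = [int(p) for p in literal.split("-")]
    parts += [1] * (3 - len(parts))
    return date(*parts)


def run_date_literals(spark):
    """Run each query on an existing SparkSession and return the `col` values."""
    return [spark.sql(q).first()["col"] for q in DATE_LITERAL_QUERIES]
```

With an active session, `run_date_literals(spark)` should return the same dates that `expected_date` gives for `'1997'`, `'1997-01'`, and `'1997-01-20'`.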

2. You have started a streaming job and you see a Streaming tab in the Spark UI of the attached cluster. Which of the following is true in this context?

a. A streaming job is running in this cluster (correct)
b. No streaming job is running in this cluster
c. You can view the driver logs in this cluster
d. You can view the driver logs and streaming jobs in this cluster

3. In PySpark, you are working on the pyspark.mllib.classification module.

If you have implemented the following class, then which parameter will you use to determine when to terminate the iterations?

Class
class StreamingLogisticRegressionWithSGD(stepSize=0.1, numIterations=50, miniBatchFraction=1.0, regParam=0.0, convergenceTol=0.001)

a. stepSize
b. numIterations
c. convergenceTol (correct)
d. miniBatchFraction
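
To see why convergenceTol is the stopping condition while numIterations is only an upper bound, here is a simplified plain-Python gradient descent loop. It is a sketch, not Spark's implementation: the function name `sgd_until_converged` and the relative-change stopping rule are assumptions made for illustration, and only the default values are taken from the class signature above.

```python
import math


def sgd_until_converged(gradient, weights, step_size=0.1,
                        num_iterations=50, convergence_tol=0.001):
    """Return (weights, iterations_run) for a basic gradient descent loop."""
    for i in range(1, num_iterations + 1):
        grad = gradient(weights)
        new_weights = [w - step_size * g for w, g in zip(weights, grad)]
        # Size of this update and size of the new weight vector.
        diff = math.sqrt(sum((n - w) ** 2 for n, w in zip(new_weights, weights)))
        norm = math.sqrt(sum(n * n for n in new_weights))
        weights = new_weights
        # Assumed rule: stop early once the update is small relative to the weights.
        if diff < convergence_tol * max(norm, 1.0):
            return weights, i
    # The tolerance was never met, so the iteration cap ends the loop.
    return weights, num_iterations


# Minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
final_weights, iterations_run = sgd_until_converged(
    lambda w: [2.0 * (w[0] - 3.0)], [0.0]
)
```

With the default tolerance the loop stops before reaching 50 iterations; with `convergence_tol=0.0` it always runs the full `num_iterations`.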

4. Which of the following PySpark shell actions can you use to see the contents of an RDD?

a. RDD.collect()
b. RDDread.collect() (correct)
c. RDDread.ContentCollect()
d. RDDread.Content()
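
As a rough illustration, assuming `RDDread` is simply the name given to an RDD created earlier in the shell, the sketch below builds a small RDD and collects it. The helper name `collect_contents` and the sample data are hypothetical; `sc` is the SparkContext that the PySpark shell provides.

```python
def collect_contents(sc, items):
    """Build an RDD from `items` and return its contents as a Python list."""
    RDDread = sc.parallelize(items)
    # collect() is an action: it brings every element back to the driver.
    return RDDread.collect()
```

In the PySpark shell, `collect_contents(sc, ["a", "b", "c"])` returns the elements as a list. Because `collect()` loads the whole RDD into driver memory, it is best kept for small datasets.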

5. During the installation of PySpark, which User and System environment variables do you set?

a. User Variable: SPARK_HOME; System Variable: PATH (correct)
b. User Variable: HOME; System Variable: PATH
c. User Variable: HOME; System Variable: SPARK_PATH
d. User Variable: SPARK_HOME; System Variable: SPARK_PATH
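
A quick way to check such a setup from Python is sketched below. It assumes the usual arrangement in which the `bin` directory under SPARK_HOME is what gets added to PATH; that detail and the helper name `spark_env_ok` are assumptions, not something the question states.

```python
import os


def spark_env_ok(env=None):
    """Return True if SPARK_HOME is set and its bin directory is on PATH."""
    env = os.environ if env is None else env
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        return False

    def normalise(path):
        # Make comparisons robust to case and trailing separators.
        return os.path.normcase(os.path.normpath(path))

    expected_bin = normalise(os.path.join(spark_home, "bin"))
    path_entries = env.get("PATH", "").split(os.pathsep)
    return any(normalise(p) == expected_bin for p in path_entries if p)
```

Calling `spark_env_ok()` with no arguments inspects the current process environment; passing a dictionary lets you check a proposed configuration without changing anything.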

6. Which of the following are features of PySpark?

  1. It is a hundred times faster than traditional large-scale data processing frameworks.
  2. The complex programming layer provides powerful caching.
  3. It provides real-time computation and low latency because of in-memory computation.

a. 1
b. 1 and 2
c. 1 and 3 (correct)
d. 2 and 3
