tau_spark-submit — Launches PySpark applications with TAU instrumentation
TAU can profile PySpark applications using Spark 2.2 or later and Python 2.7 or later with the numpy package installed. TAU must be configured with the -pythoninc and -pythonlib options specifying an appropriate Python installation.
The SPARK_HOME environment variable must be set to the location of your Spark installation. Replace spark-submit in your normal Spark application invocation with tau_spark-submit. Options for tau_spark-submit can be set using the TAU_SPARK_PYTHON_ARGS environment variable.
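The invocation change described above can be sketched as a small Python helper, which may be convenient when jobs are launched from a script. This is a minimal sketch under stated assumptions, not part of TAU: the helper names to_tau_command and tau_environment, the path /opt/spark, and the application name my_app.py are hypothetical. Only the SPARK_HOME and TAU_SPARK_PYTHON_ARGS variable names and the tau_spark-submit command come from the text.

```python
import os
import shlex


def to_tau_command(spark_cmd):
    """Return a copy of spark_cmd with spark-submit swapped for tau_spark-submit.

    Hypothetical helper: every argument after the launcher is passed through
    unchanged, mirroring "replace spark-submit with tau_spark-submit".
    """
    if not spark_cmd:
        raise ValueError("empty command")
    if os.path.basename(spark_cmd[0]) != "spark-submit":
        raise ValueError("expected a spark-submit invocation")
    # Assumes tau_spark-submit can be found on PATH.
    return ["tau_spark-submit"] + list(spark_cmd[1:])


def tau_environment(spark_home, tau_args=None, base_env=None):
    """Build the environment for the launch (hypothetical helper).

    SPARK_HOME is always set; TAU_SPARK_PYTHON_ARGS is set only when
    tau_args is given. The accepted option strings are not listed here.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["SPARK_HOME"] = spark_home
    if tau_args:
        env["TAU_SPARK_PYTHON_ARGS"] = tau_args
    return env


if __name__ == "__main__":
    # my_app.py and /opt/spark are placeholders for illustration only.
    cmd = to_tau_command(shlex.split("spark-submit --master local[2] my_app.py"))
    env = tau_environment("/opt/spark")
    print(" ".join(shlex.quote(arg) for arg in cmd))
    # To actually launch the job:
    #   import subprocess
    #   subprocess.run(cmd, env=env, check=True)
```

Building the argument list and the environment separately keeps the original spark-submit arguments untouched, so the same job definition can be run with or without TAU instrumentation.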
A PySpark application profiled using tau_spark-submit will generate one profile file per task executed.