[en] Apache Spark Code Tool
[en] The Apache Spark Code tool is a code editor that creates an Apache Spark context and executes Apache Spark commands directly from Designer. This tool uses the R programming language.
[en] For additional information, see Apache Spark Direct, Apache Spark on Databricks, and Apache Spark on Microsoft Azure HDInsight.
[en] Connect to Apache Spark
[en] Option 1
[en] Connect to your Apache Spark cluster directly.
[en] Drag a Connect In-DB Tool or Data Stream In Tool onto the canvas.
[en] Select the Connection Name dropdown arrow and select Manage connection.
[en] Option 2
[en] Alternatively, connect directly with the Apache Spark Code tool.
[en] Drag the Apache Spark Code tool onto the canvas.
[en] Under Data Connection, select the Connection Name dropdown arrow and select Manage connection.
[en] Both methods bring up the Manage In-DB Connections window. In Manage In-DB Connections, select a Data Source.
[en] Code Editor
[en] With an Apache Spark Direct connection established, the Code Editor activates. Use Insert Code to generate template functions in the code editor.
[en] Scala
[en] Import Library creates an import statement.
[en] import package
[en] Read Data creates a readAlteryxData function to return the incoming data as an Apache SparkSQL DataFrame.
[en] val dataFrame = readAlteryxData(1)
[en] Write Data creates a writeAlteryxData function to output an Apache SparkSQL DataFrame.
[en] writeAlteryxData(dataFrame, 1)
[en] Log Message creates a logAlteryxMessage function to write a string to the log as a message.
[en] logAlteryxMessage("Example message")
[en] Log Warning creates a logAlteryxWarning function to write a string to the log as a warning.
[en] logAlteryxWarning("Example warning")
[en] Log Error creates a logAlteryxError function to write a string to the log as an error.
[en] logAlteryxError("Example error")
[en] Python
[en] Import Library creates an import statement.
[en] from module import library
[en] Read Data creates a readAlteryxData function to return the incoming data as an Apache SparkSQL DataFrame.
[en] dataFrame = readAlteryxData(1)
[en] Write Data creates a writeAlteryxData function to output an Apache SparkSQL DataFrame.
[en] writeAlteryxData(dataFrame, 1)
[en] Log Message creates a logAlteryxMessage function to write a string to the log as a message.
[en] logAlteryxMessage("Example message")
[en] Log Warning creates a logAlteryxWarning function to write a string to the log as a warning.
[en] logAlteryxWarning("Example warning")
[en] Log Error creates a logAlteryxError function to write a string to the log as an error.
[en] logAlteryxError("Example error")
[en] R
[en] Import Library creates an import statement.
[en] library(jsonlite)
[en] Read Data creates a readAlteryxData function to return the incoming data as an Apache SparkSQL DataFrame.
[en] dataFrame <- readAlteryxData(1)
[en] Write Data creates a writeAlteryxData function to output an Apache SparkSQL DataFrame.
[en] writeAlteryxData(dataFrame, 1)
[en] Log Message creates a logAlteryxMessage function to write a string to the log as a message.
[en] logAlteryxMessage("Example message")
[en] Log Warning creates a logAlteryxWarning function to write a string to the log as a warning.
[en] logAlteryxWarning("Example warning")
[en] Log Error creates a logAlteryxError function to write a string to the log as an error.
[en] logAlteryxError("Example error")
[en] Import Code
[en] Use Import Code to pull in code created externally.
[en] From File opens a File Explorer to browse to your file.
[en] From Jupyter Notebook opens a File Explorer to browse to your file.
[en] From URL provides a field to type or paste a file location.
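[en] How the tool processes a Jupyter Notebook file is not described here. As a rough illustration of what such a file contains, the sketch below pulls the code cells out of notebook JSON (nbformat 4 layout) and joins them into one script. The helper name `code_from_notebook` and the in-memory example notebook are hypothetical and are not part of the tool.

```python
import json


def code_from_notebook(notebook_text):
    """Return the source of all code cells, separated by blank lines."""
    notebook = json.loads(notebook_text)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format allows "source" to be a string or a list of strings.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks)


# A tiny in-memory notebook used only for demonstration.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Notes"]},
        {
            "cell_type": "code",
            "metadata": {},
            "outputs": [],
            "execution_count": None,
            "source": [
                "dataFrame = readAlteryxData(1)\n",
                "writeAlteryxData(dataFrame, 1)",
            ],
        },
    ],
})

imported_code = code_from_notebook(example)
print(imported_code)
```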
[en] Click the gear icon to change cosmetic aspects of the code editor.
[en] Use the Text Size buttons to increase or decrease the size of the text in the editor.
[en] Use Color Theme to toggle between a dark and light color scheme.
[en] Select Wrap Long Lines to keep long lines visible within the code editor window instead of requiring a horizontal scroll.
[en] Select Show Line Numbers to see line numbers for the editor.
[en] Output Metainfo
[en] Select the output channel metainfo you want to manage. Manually change the Apache Spark Data Type of existing data.
[en] Select the plus icon to add a data row.
[en] Enter the Field Name.
[en] Select the Apache Spark Data Type.
[en] Enter the Size in bits.
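[en] As a planning aid for the rows described above, the sketch below models one metainfo row as a field name, a Spark data type, and a size in bits. The bit widths listed are those of Spark SQL's fixed-width numeric types. Whether the tool expects exactly these values in the Size field is an assumption, and the `MetainfoRow` class, the `make_row` helper, and the field names are hypothetical.

```python
from dataclasses import dataclass

# Bit widths of Spark SQL's fixed-width numeric types.
SPARK_TYPE_BITS = {
    "ByteType": 8,
    "ShortType": 16,
    "IntegerType": 32,
    "LongType": 64,
    "FloatType": 32,
    "DoubleType": 64,
}


@dataclass
class MetainfoRow:
    """One row of output metainfo: name, Spark type, and size in bits."""
    field_name: str
    spark_type: str
    size_bits: int


def make_row(field_name, spark_type):
    """Build a row for a fixed-width numeric type, looking up its size."""
    if spark_type not in SPARK_TYPE_BITS:
        raise ValueError("No fixed bit width known for %s" % spark_type)
    return MetainfoRow(field_name, spark_type, SPARK_TYPE_BITS[spark_type])


# Hypothetical fields, used only to show the shape of the rows.
rows = [
    make_row("customer_id", "IntegerType"),
    make_row("order_total", "DoubleType"),
]
for row in rows:
    print(row.field_name, row.spark_type, row.size_bits)
```

[en] Variable-length types such as strings have no fixed width, so this lookup deliberately rejects them rather than guessing a size.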