Precision Match Tool
The Precision Match tool performs an operation similar to a fuzzy match on string data, standardizing different variations of the same phrase to a single value. Use the Precision Match tool when your data contains multiple spellings of the same phrase (for example, color and colour, or US and United States).
Note
The GenAI Tools are currently in Public Preview. Learn how to join the Public Preview and get started with AI-powered workflows!
Tool Components
The Precision Match tool has 5 anchors (3 input and 2 output):
M input anchor: Use the M input anchor to connect the model connection settings from the LLM Override tool.
D input anchor: Use the D input anchor to connect the string data you want to standardize.
R input anchor (Optional): Use the R input anchor to connect to a reference dataset that contains standardized phrases you’d like the LLM to use.
D output anchor: Use the D output anchor to pass your matched input data downstream.
M output anchor: Use the M output anchor to pass the mapping table output from the LLM downstream.
Configure the Tool
Add a Precision Match tool to the canvas.
Connect the D input anchor to the categorical string data you want to use in your workflow. Note that the Precision Match tool is only intended for categorical data (for example, names or places).
(Optional) Connect the R input anchor to a reference dataset that contains a list of standardized phrases. Use this anchor if you have a preference for the standardized phrases you want. Otherwise, the LLM makes its own decision based on its built-in prompting.
Connect the M input anchor to an LLM Override tool.
Select the column that contains the data you want to standardize from the Choose Field dropdown.
From the How would you like the results to be output? section, you can select either…
Replace the selected column: Replace the column you selected with the standardized phrases.
Append as a new column: Create a new column in the dataset with the standardized phrases. (Optional) Enter a name for the new column.
(Optional) If your input data hasn’t changed and you want to use a cached mapping table, select the Use Cached Mapping Table? checkbox. Use this option to save on LLM requests when working on other parts of your workflow.
(Optional) If your workflow has a dynamic input and you want to avoid a potentially high number of LLM requests, you can set a row count threshold that causes your workflow to stop with an error. Enter the threshold in the Error if number of categories exceeds value parameter.
Run the workflow.
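The two optional settings in the steps above (the cached mapping table and the error threshold) can be pictured with a short Python sketch. This is not the tool's implementation. The names check_category_threshold, get_mapping, and fake_llm are hypothetical, the cache is a simple in-memory dictionary, and the sketch assumes the threshold is compared against the number of distinct values in the selected column.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory cache, keyed by the set of distinct input values.
_mapping_cache: Dict[frozenset, Dict[str, str]] = {}


def check_category_threshold(values: List[str], threshold: int) -> int:
    """Stop with an error if there are more distinct values than the threshold.

    Assumption: the threshold is compared against distinct values (categories).
    """
    n_categories = len(set(values))
    if n_categories > threshold:
        raise ValueError(
            f"{n_categories} categories exceed the threshold of {threshold}"
        )
    return n_categories


def get_mapping(
    values: List[str],
    build_mapping: Callable[[List[str]], Dict[str, str]],
    use_cache: bool = True,
) -> Dict[str, str]:
    """Reuse a cached mapping when the distinct input values are unchanged."""
    key = frozenset(values)
    if use_cache and key in _mapping_cache:
        return _mapping_cache[key]
    mapping = build_mapping(sorted(key))  # one (simulated) LLM request
    _mapping_cache[key] = mapping
    return mapping


# Stand-in for the LLM call; it only trims and title-cases each value.
stats = {"llm_calls": 0}


def fake_llm(categories: List[str]) -> Dict[str, str]:
    stats["llm_calls"] += 1
    return {c: c.strip().title() for c in categories}


colors = ["red", "Red", " red ", "blue"]
check_category_threshold(colors, threshold=10)  # passes: 4 distinct values
first = get_mapping(colors, fake_llm)   # builds the mapping
second = get_mapping(colors, fake_llm)  # served from the cache
```

In this sketch the second call to get_mapping does not invoke fake_llm again, which mirrors the purpose of the Use Cached Mapping Table? checkbox: fewer LLM requests while the input data stays the same.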
Output
The Precision Match tool has 2 output anchors, each of which passes standardized results downstream in a different way:
The D output anchor passes your matched input data downstream. Depending on which option you select from the How would you like the results to be output? section, the D output anchor either…
Updates the selected string column with standardized phrases when you select Replace the selected column.
Appends standardized phrases to your data when you select Append as a new column.
The M output anchor passes the mapping table that the LLM used to standardize your data. The mapping table includes a column for the original string value and a column for the standardized string value.
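To make the two outputs concrete, the following Python sketch applies a small mapping table to a few rows, once in replace mode and once in append mode. It is illustrative only: the column names (original, standardized, country, country_standardized), the sample values, and the apply_mapping helper are assumptions, not the tool's actual schema or code.

```python
from typing import Dict, List, Optional

Row = Dict[str, str]


def apply_mapping(
    rows: List[Row],
    column: str,
    mapping: Dict[str, str],
    new_column: Optional[str] = None,
) -> List[Row]:
    """Return copies of the rows with `column` standardized.

    With new_column=None the selected column is overwritten (replace mode);
    otherwise the standardized value is written to new_column (append mode).
    Values missing from the mapping are passed through unchanged.
    """
    target = new_column or column
    result = []
    for row in rows:
        updated = dict(row)  # copy so the input rows stay untouched
        original = row[column]
        updated[target] = mapping.get(original, original)
        result.append(updated)
    return result


# Shape of the M output in this sketch: one row per original value.
mapping_table = [
    {"original": "US", "standardized": "United States"},
    {"original": "U.S.A.", "standardized": "United States"},
    {"original": "United States", "standardized": "United States"},
]
mapping = {m["original"]: m["standardized"] for m in mapping_table}

rows = [
    {"id": "1", "country": "US"},
    {"id": "2", "country": "U.S.A."},
]

# D output with "Replace the selected column".
replaced = apply_mapping(rows, "country", mapping)

# D output with "Append as a new column".
appended = apply_mapping(
    rows, "country", mapping, new_column="country_standardized"
)
```

Replace mode keeps the original column count and overwrites the values, while append mode preserves the original values alongside the standardized ones, which is useful when you want to audit the mapping.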