top of page

What's the Difference Between the Append, Join and Union Tools?

When I teach Alteryx to new users, this is a question that comes up a lot. I modified the below picture from the internet to help explain these tools.



With the Union Tool you stack data sets. In the picture above, the data from Table 1 is read first and then the data from Table 2 is added underneath the data from Table 1.


With the Join Tool, especially the J output anchor, you combine data from both tables. You will have new columns containing the data from the second table. It joins the tables on the similar column they have in common which is the Fruit column. In the example below it combined the data from Table 1 and Table 2 to have Color and Shape in a new data set for the corresponding Fruit.




If you took these same tables and used the Union Tool, you would have the results below;



Notice Table 2 data is directly after Table 1 data. There are Null values for the Shape column for Table 1, since Table 1 did not have this column in it. The same is true for the Color column for Table 2.


What about the Append Tool? This tool simply adds whatever data you connect your Source data (Table 2 in pic below) set to your Target data (Table 1). If you have one column it will add the field for every single row of your data set in a new column as shown below;



If you were to join the original two tables together using the Append Tool you would get every possible combination of columns and rows for both data sets. This is called a Cartesian Product. It joins the data from all tables without requiring any joining references. See the below picture for details.



There are certain circumstances where you may want to use the Append Tool in this way.


What if you were a car company and you needed to combine every car with every color interior and exterior. You would use this tool to create that data.


Hopefully these examples above help with the distinction between these tools.

706 views0 comments

コメント


bottom of page