SSIS Tutorial: Difference between the control flow and the data flow

Introduction

In SSIS, the Control Flow and Data Flow are two essential parts of an ETL package, each serving a unique purpose in managing and processing data. Understanding their differences is key to designing efficient and well-organized integration workflows.

In this tutorial, we’ll explore the differences between Control Flow and Data Flow using ZappySys SSIS PowerPack components, with a practical example featuring the REST API Task in the Control Flow and the JSON Source in the Data Flow.

Prerequisites

Control Flow

The Control Flow is the top-level workflow of an SSIS package. It determines which tasks will run, in what order, and how they interact based on success, failure, or completion. We have an article about how to control execution flow here

Think of the Control Flow as the scheduling and orchestration layer of SSIS. It can include tasks such as executing SQL commands, calling APIs, looping through files, or managing conditional logic.

Example Using ZappySys REST API Task

In this example, we will use the REST API Task to send a request to an external web service and store the result in a file.

  1. Add a REST API Task to the Control Flow.

  2. Configure the HTTP connection and specify the API endpoint, for example:

    https://services.odata.org/V3/Northwind/Northwind.svc/Customers?$format=json
    
  3. Set the HTTP method to GET and define the output file in the Response Settings tab.

  4. Run the component. Once executed, the task will download data from the API and save it to the specified file.

The REST API Task operates entirely within the Control Flow. It executes, waits for completion, and passes control to the next task in sequence. It does not process individual data rows directly.

Data Flow

The Data Flow is where the actual data processing happens. It allows you to extract data from a source, transform it, and load it into a destination. Unlike the Control Flow, which manages tasks, the Data Flow works at the row level, streaming and transforming records as they move through the pipeline.

Example Using ZappySys JSON Source and destination components

Now that we’ve downloaded the data file using the REST API Task, let’s process it in the Data Flow using the JSON Source.

  1. Add a Data Flow Task to the Control Flow and connect it after the REST API Task.
  2. Inside the Data Flow, drag and drop a JSON Source component.
  3. In the JSON Source, set the file path to the JSON file saved by the REST API Task.
  4. Click Preview Data to confirm that the data loads correctly.
  5. Connect the JSON Source to a destination component. You can also use a Multicast to send data to multiple destinations such as Upsert Destination, XML File Destination, MongoDB Destination, or Excel Destination.
  6. Configure each destination component as needed.
  7. Run the package to process and save the data in your selected destination.

When executed, the JSON Source reads the data row by row and sends it to the configured destinations. This is where the transformation and flow logic take place.

Between source and destination components, you can also use transformations such as JSON generator transforms or XML generator transforms. For example, if you need to send JSON data to an API but your input is in XML, you can first read the XML and then convert it to JSON. For a detailed guide, see this tutorial.

Best Practices

  • Use the Control Flow for high-level orchestration tasks such as calling APIs, managing loops, or executing SQL statements.
  • Use the Data Flow for row-by-row data operations such as reading, transforming, and writing data.
  • Combine both flows for complete ETL solutions. For example, the REST API Task can download data, and the JSON Source can process it in the Data Flow.

Conclusion

In SSIS, the Control Flow and Data Flow work together to build robust ETL solutions. The Control Flow manages the package workflow, orchestrating tasks such as API calls via the REST API Task. At the same time, the Data Flow handles detailed data movement and transformation using components such as the JSON Source.

By combining both flows, you can design packages that efficiently manage high-level operations and perform detailed data processing. This structure not only simplifies maintenance but also improves performance and scalability across your ETL workflows.

Explore our SSIS PowerPack for more information and download and install it to start building smarter, faster, and more scalable solutions.

References

Contact us

If you encounter any issues or have specific questions, reach out to our support team via live chat or support ticket using our email support@zappysys.com.