How to read large PostgreSQL table in SSIS

In this post we will cover how to read PotgreSQL table in SSIS. We will use SSIS PostgreSQL Source and SSIS Upsert Destination provided by ZappySys.

Requirements

Generate sample data with many rows in PostgreSQL

You can try the following script if you do not have a sample table with many rows. Below example used DBeaver tool to connect and execute SQL script for PostgrqSQL connection but you can use any tool you like.
You can adjust repeat('ABCDE ', 10) to generate rows with many characters (e.g., repeat('ABCDE ', 1000) will generate each row with approximately 6000 characters).

--Step#1: Create table
--drop table sample_data 
CREATE TABLE sample_data (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(150),
    age INTEGER,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    status VARCHAR(20),
    notes VARCHAR(7000)  -- Large string column
);

--truncate table sample_data;

--Step#2: Insert 100000 sample rows in PostgreSQL Table
DO $$
BEGIN
    FOR i IN 1..100000 LOOP
        INSERT INTO sample_data (name, email, age, status, notes)
        VALUES (
            'User_' || i,
            'user_' || i || '@example.com',
            20 + (i % 50),
            CASE WHEN i % 2 = 0 THEN 'active' ELSE 'inactive' END,
            'This is a sample note for user ' || i || '. ' || repeat('ABCDE ', 1000)
        );
    END LOOP;
END $$;

--Step#4: Verify count
select count(*) from sample_data

How to read PotgreSQL Table in SSIS

Lets look at how to read data from PostgreSQL (Table mode or Query mode)

  1. Double click on the Data Flow task to see Data Flow designer surface.
  2. Here, In Visual Studio, drag and drop the PostgreSQL Source in the design surface.
  3. Double-click to configure, click New connection, and setup that
  4. On Source UI, change to Query mode and type SQL as shown below.
SELECT id, "name", email, age, created_at, status, notes
FROM public.sample_data;
  1. Preview data and click OK to save.
  2. Connect the source with the target like Upsert Destination
  3. Run the package

Using Variable / Function Placeholders for dynamic Query

You can also insert variables or use direct placeholders to make your query dynamic. For more information on available functions check this link.

Example#1: Using SSIS variable for dynamic SQL

SELECT id, "name", email, age, created_at, status, notes
FROM public.sample_data
WHERE Email like '{{User::EmailPrefix}}%';

Example#2: Using SSIS variable with DateTime format
Below example is for Variable with DateTime type. You can use Date format specifiers like below. (y, M, d, H, h, m, s, f, t, tt etc allowed).

SELECT id, "name", email, age, created_at, status, notes
FROM public.sample_data
WHERE created_at > '{{System::ContainerStartTime,yyyy-MM-dd}}';

Example#3: Using Placeholder function for DateTime

SELECT id, "name", email, age, created_at, status, notes
FROM public.sample_data
WHERE created_at > '<<today-7d,FUN_TO_DATETIME>>';
--WHERE created_at > '<<monthstart,FUN_TO_DATETIME>>';
--WHERE created_at > '<<monthend+24h,FUN_TO_DATETIME>>';
--WHERE created_at > '<<monthend+2d,FUN_TO_DATETIME>>';
--You can use now, today, yesterday, monthstart, monthend, yearstart, yearend';
-- Also +/- y,d,m,h,min,sec also allowed in expression (E.g. now-7h) 

Conclusion

In this post we saw how easy it is to read PostgreSQL data with options like Variable Placeholders (Dynamic SQL). You can Download SSIS PowerPack from here and try these steps yourself.

Reference Links