
Designing SQL Server Agent scheduled jobs for ETL processes
Extracting, Transforming, and Loading (ETL) are key processes for any data management. ETL activities include an import and export of data from various data sources into an SQL Server table format. The data volume will vary from one row to billions of rows. To handle such import/export active ETL operations, a tool is essential. SQL Server Business Intelligence tools provide a strong foundation by the way of services (SSA) & tools (BIDS).
SQL Server provides many processes for the ETL management, such as BCP (Bulk Copy), BI (Bulk Insert), DTS (Data Transformation Services) and SSIS (SQL Server Integration Services). Since SQL Server 2005 DTS has been transformed into SSIS, which is a step ahead in terms of functionality providing more intuitive design and management for ETL operations, henceforth DTS functionality has been discontinued. SQL Server 2008 and SQL Server 2008 R2 enhances the native .NET execution environment to continue the ETL process with significant pipeline scalability enhancements such as persistent lookups, which is a significant improvement for data warehousing environment. The SQL Server Agent service manages many different objects on the server, each one responsible for its own function in the larger automation infrastructure. Mastering these objects is critical to implement an effective automation plan. For backward compatibility, SQL Server 2008 R2 supports the DTS packages. Using the command-line arguments, we can initiate the Import and Export wizard.
In this recipe, we will look at the SQL Server 2008 R2 tools and agent service capabilities to design an effective scheduled activity to handle critical ETL processes. SQL Server 2008 R2 consists of multiple windows services, for scheduling and automation activities. SQL Server agent service is useful, as it contains various artifacts such as jobs, schedules, operators, and alerts.
Getting ready
For ETL management within these job steps, the SQL Server Integration Service (SSIS) package is the key. Since these SSIS packages contain the corresponding administrative step, the entire package can be defined as a single-step to execute it within the job. Not only can we create the job steps from the SQL Server agent, we can also create the ETL processes using SQL Server Management Studio using SQL Server's import and export wizard. For desktop-based environments, the ETL activities can be accomplished using the SQL Server Express with the Advanced Services edition that consists of a basic installation of SQL Server Management Studio.
Although, SQL Server 2008 R2 extends the support to run DTS packages and Integration Services packages on the same server, the setup does not install the run-time support for DTS packages. You have to install this run-time support during the setup on the Feature Selection page by selecting Integration Services to install ActiveX script task and DTS package migration wizard and Client Tools Backward Compatibility. However, these components are not fully functional without the manual installation of additional, optional components that are not available during Setup.
Further, on a 64-bit environment, there is no 64-bit run-time support for DTS packages and Integration Services packages that run DTS packages, they can run only in 32-bit mode. To run packages in 32-bit mode outside BI Development Studio on a 64-bit computer, during Setup, select Business Intelligence Development Studio or Management Tools | Complete.
How to do it...
There are two ways to begin designing SQL Server Agent scheduled jobs for ETL processes.
- For the SQLExpress environment, use SQL Server Import and Export wizard and for the server environment, use SQL Server agent.
- There are two ways to open up the SQL Server Import and Export wizard, either using GUI tools or command line operations. Let us do this by using the command line operation:
Open a command prompt window with Administrative privileges and type
dtswizard.exe
. - The file can be located within the
%\program files\Microsoft SQL Server\100\DTS\binn
folder, as shown in the following screenshot: - From your local machine, using the Start menu point to All Programs.
- As per the following screenshot, point to Microsoft SQL Server 2008 R2 and click Import and Export Data.
- Similarly, using SQL Server 2008 R2 client tools, you can choose SQL Server Management Studio (SSMS) or Business Intelligence Development Studio (BIDS).
- Right-click the
SSIS Packages
folder and select the SSIS Import and Export Wizard. - Using BIDS tool, go to Project menu, choose the SSIS Import and Export Wizard.
- For the ease of this recipe, we will skip to the screen where you need to save the package (see the next screenshot) as the Import and Export wizard, which has the self-explanatory screens to proceed.
- We need to provide all the necessary information to create an import or export task and once we get to the save and Run Package screen, you need to choose Run immediately, if it is a one-off requirement to perform ETL activities.
- In case the referred ETL process is a continuous task to schedule for administrative processes, we need to choose the Save SSIS package option with two more options to choose from.
- We can choose either to save a package in SQL Server (package) or File System (as a file) with a protection level of 'Encrypt data' with a password/key, or rely on windows operating system roles for access control.
Note
The package store includes
msdb
database and folders in the filesystem, for the better administration of the ETL process using a backup strategy it is essential to save a copy of the package to SQL Server, which is saved to thesysssispackages
table.
For this recipe, we chose to create a simple ETL process to export Department table data from AdventureWorks2008R2
database to a Flat-file destination (text file) with a protection level of 'encrypt all data with password' option. The package has been saved to SQL Server with the name Export Department Data (as shown in the next screenshot) and not chosen to'Run immediately'.

- Now, we are at the stage where we need to schedule the saved package using SQL Server Management Studio. Under the
Management
folder expand SQL Server Agent, the option which will present theJobs
folder. - Right-click on
Jobs
and choose New job to create a job. - Name the job and choose the Steps options to create a new job step.
- Now choose the step type as SQL Server Integration Services Package.
- Choose the package source, which is SQL Server in our case.
- Enter the SQL Server 2008 R2 instance name to choose the relevant package Export Department Data, as shown in the following screenshot.
- Now, let us look into how to automate the package execution for administrative purposes without choosing any other tool or process. This SSIS job step presents various execution options, such as Configurations.
- SSIS service relies on a configuration, which stands as settings for all the packages on that SQL Server instance. By default, the configuration file is stored as an XML file, which can be located under the folder
%ProgramFiles%\Microsoft SQL Server\100\DTS\Binn
with the nameMsDtsSrvr.ini.xml
. - You need to modify the configuration file, if your packages are stored in a named instance, or a remote instance of the Database Engine, or in multiple instances of the Database Engine.
- If the configuration file is moved to a different location other than the earlier specified default location, you have to modify the registry to specify the new location. The registry key can be located in:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\100\SSIS\ServiceConfigFile
.
How it works...
After you create the job, you must add at least one step and set the type of the step to the SQL Server Integration Services Package. A job can include multiple steps, each running a different package. An SQL Server Agent job step can run Integration Services packages that are saved to the msdb
database, or to the filesystem.
Running an Integration Services package from a job step is similar to running a package using the dtexec and DTExecUI utilities. If you run a package from the SSIS Designer, the package always runs immediately. While a package is running, the SSIS Designer displays the progress of the package execution on the Progress tab. You can view the start and finish time of the package and its tasks and containers, in addition to information about any tasks or containers in the package that failed. After the package has finished running, the run-time information remains available on the Execution Results tab. The account that runs an Integration Services package as an SQL Server Agent job step requires all the permissions as an account that runs the package directly.
There's more
Using the PowerShell
platform, we can run scripts which are designed for best administration needs, such as Backup and Restore, Agent Job Creation, Server Configuration, and Database Object Scripting tasks.