Featured Database Articles
Posted April 23, 2019
Importing Data into Azure SQL Database
By Marcin Policht
Data imports represent some of the more commonly performed database management tasks. When importing data into Azure SQL Database, you can leverage a number of traditional SQL Server data import techniques. In this article, you will get an overview of these techniques and learn about the cloud-specific aspects of importing.
The most straightforward methods that you can use in order to import data into tables hosted by an instance of Azure SQL Database include:
- Transact-SQL statements: You have the option of invoking import directly from the target Azure SQL Database instance by running either of the following:
BULK INSERT: loads raw data into a target table from a designated flat file. You can, for example, import content of a blob residing in an Azure Storage account (constituting an external data source). Note that in order to provide security context in such scenario (assuming that the blob is not accessible anonymously), you need to create a new or use an existing database master key (for encrypting secrets necessary to authorize access to the storage account) and a database scoped credential (for authenticating Azure SQL Database to the external data source). When importing data from Azure Storage, such credential would include the target storage account key or a Shared Access Signature token. The credential is part of the external data source definition (which, in this case, would include also the URI representing the location of the Azure Storage blob container).
OPENROWSET(BULK...): offers more advanced capabilities (comparing with
BULK INSERT), which allow for parsing content of a data source (such as a blob residing in an Azure Storage account) and executing T-SQL statements on returned rows before initiating the load (when implementing the
BULK INSERT-based approach, you could use for this purpose a temporary table). Just as with
BULK INSERT, you have to define an external data source, including credentials necessary for authorization purposes.
- The bcp utility: This command line utility facilitates importing large volumes of data into Azure SQL Database. The bcp utility is part of the package of Microsoft Command Line Utilities for SQL Server, available from Microsoft Download Center and compatible with every current version of 32-bit and 64-bit Windows operating system. It is also included in Microsoft SQL Server 2017 tools. The latest versions of the bcp utility support Azure AD authentication (including its Multi-Factor Authentication functionality), in addition to SQL Server authentication.
When performing data import, you either need to ensure that data residing in a source file matches the structure of the target table or you have the option of defining this structure in an auxiliary format file that you reference during import. You should also keep in mind that the bcp utility does not support UTF-8 format (data must be in formatted as ASCII or UTF-16).
- Azure Data Factory:This cloud-based, managed data integration service facilitates data movement and transformation. In order to take advantage of its capabilities, you implement pipelines that represent data-drive workflows, consisting primarily of linked services and activities. Linked services represent data stores (containing datasets that are used as inputs or outputs of activities) and external compute resources (handling data transformation). Azure Data Factory also relies on Integration Runtime that constitutes its own compute infrastructure, responsible for data movement and dispatch of activities to other compute services. In addition, Integration Runtime makes possible to execute SQL Server Integration Services (SSIS) packages. When dealing with publicly accessible data stores, you can use the managed, Azure-resident Integration Runtime. In order to handle data residing within boundaries of a private network (on-premises or in the cloud), you need to implement self-hosted Integration Runtime.
One of the simplest scenarios that illustrates the process of importing data into Azure SQL Database by using Azure Data Factory leverages Copy Activity, which executes exclusively in Integration Runtime. To account for possible discrepancies between the data source and its destination, you need to configure schema and data type mapping. Copy Activity also allows for incremental copies, reading and writing partitioned data, as well as interactive and programmatic monitoring.
Azure Data Factory offers a high degree of flexibility, with a wide range of supported data stores, including:
- Azure data services: Azure Blob, Table, and File storage, Azure Cosmos DB with SQL and MongoDB APIs, Azure Data Lake Storage Gen1 and Gen2, Azure Database for MySQL, MariaDB, and PostgresSQL, Azure SQL Database, Azure SQL Data Warehouse, and Azure Search Index
- Relational databases: Amazon Redshift, DB2, Google BigQuery, Greenplum, HBase, Hive, Informix, MySQL, MariaDB, PostresSQL, Oracle, SAP HANA, Spark, Sybase, Teradata, SQL Server, Microsoft Access
- Non-relational databases: Cassandra and MongoDB
- Flat files: Amazon S3, file system, FTP, Google Cloud Storage, hDFS, SFTP
- Services and applications: Dynamics 365, Dynamics CRM, Office 365, QuickBooks, Salesforce, SAP ECC, ServiceNow.
- SQL Server Management Studio: This primary SQL Server management tool available from Microsoft Downloads simplifies imports into Azure SQL Database by offering wizard-driven interface:
- Import Flat File Wizard: (included in SQL Server Management Studio starting with v17.3) copies data from a flat file in a delimited format. It leverages an intelligent framework called PROSE (an acronym representing the term Program Synthesis using Examples), which analyzes the input file in order to determine with high degree of probability the intended data format.
- SQL Server Import and Export Wizard: supports a number of different data sources, including .NET Framework Data Provider for ODBC, .NET Framework Data provider for Oracle, .NET Framework Data Provider for SQLServer, Flat File Source, Microsoft Access, Microsoft Excel, Microsoft OLE DB Provider for Analysis Services 14.0, Microsoft OLE DB Provider for Oracle, Microsoft OLE DB Provider for Search, OLE DB Provider for SQL Server Integration Services, SQL Server Native Client 11.0, and Microsoft OLE DB Provider for SQL Server (which also allows you to configure Azure SQL Database as the destination of the import process). The wizard relies on SQL Server Integration Services to perform the data copy (it automatically generates a SSIS package, which you can optionally store for future use).
This concludes our overview of different methods that can be used to import data into Azure SQL Database. In the upcoming articles published on this site, I will cover in more details the process of importing data into Azure SQL Database by using Azure Data Factory.
See All Articles by Marcin Policht
MS SQL Archives