This article describes how to use the SnowSQL client to bulk load data into Snowflake. Any delimited plain-text file, such as Comma-delimited CSV files, can be used to bulk load data using SQL. Furthermore, semi-structured data from JSON, AVRO, Parquet, or ORC files can be bulk loaded. However, the emphasis of this article is loading from CSV files.
First we will look at how you can stage your data. Staging can be done as follows:
- First, you upload your data files to a location where Snowflake can access your files. This is referred to as staging your files.
- Then you load your data from these staged files into your tables.
Then we will learn how to use the SnowSQL SQL client to load CSV files from a local machine into a table named Contacts in the demo Database demo_db. CSV files are easier to import into Database systems like Snowflake because they can represent relational data in a plain-text file.
We will use a named internal stage to store the files before loading. Now the following steps are required for Loading Data to Snowflake:
Step 1: Use the demo_db Database
USE DATABASE demo_db;
Step 2: Create the Contacts Table
Use the following SQL command to create the Contacts table:
CREATE OR REPLACE TABLE contacts ( id NUMBER (38, 0) first_name STRING, last_name STRING, company STRING, email STRING, workphone STRING, cellphone STRING, streetaddress STRING, city STRING, postalcode NUMBER (38, 0));
Step 3: Populate the Table with Records
The Contacts table should contain records like this;
1, Chris, Harris, BBC Top Gear, email@example.com, 606-237-0055, 502-564-8100, PO Box 3320 3 Queensbridge, Northampton, NN4 7BF 2, Julie, Clark, American Aerobatics Inc, firstname.lastname@example.org, 530-677-0634, 530-676-3434, 3114 Boeing Rd, Cameron Park, CA 95682 3, Doug, Danger, MotorCycle Stuntman LA, email@example.com, 413-239-7198, 508-832-9494, PO Box 131 Brimfield, Massachusetts, 01010 4, John, Edward, Get Psyched, firstname.lastname@example.org, 631-547-6043, 800-860-7581, PO Box 383 Huntington, New York, 11743 5, Bob, Hope, Bob Hope Comedy, email@example.com, 818-841-2020, 310-990-7444, 3808 W Riverside Dr-100, Burbank, CA 91505
Step 4: Create an Internal Stage
Next, you will create an internal stage called CSV files.
CREATE STAGE csvfiles;
Step 5: Execute a PUT Command to Stage the Records in CSV Files
PUT file:///tmp/load/contacts0*.csv @csvfiles;
Step 6: Confirm that the CSV Files have been Staged
To see if the files are staged you can use the LIST command.
Step 7: Specify a Virtual Data Warehouse to Use
Now we are ready to load the files from the staged files into the CONTACTS table. First, you will specify a virtual warehouse to use.
USE WAREHOUSE dataload;
Step 8: Load the Staged Files Into a Snowflake Table
COPY INTO contacts; FROM @csvfiles PATTERN = ‘.*contacts0[1-4].csv.gz’ ON_ERROR = ‘skip_file’;
- INTO specifies where the table data will be loaded.
- PATTERN specifies the data files to load. In this case, we are loading files from data files with names that include the numbers 1-4.
- ON_ERROR tells the command what to do when it encounters errors in the files.
Snowflake also provides powerful options for error handling as the data is loading. You can check out the Snowflake documentation to learn more about these options.
If the load was successful, you can now query your table using SQL: