
Import CSV file into SQL Server

I am looking for help importing a .csv file into SQL Server using BULK INSERT, and I have a few basic questions.

Issues:

1) The CSV file data may contain a , (comma) inside a field (e.g. Description), so how can I make the import handle such data?
2) If the client creates the CSV from Excel, then the data that contain a comma are enclosed within "" (double quotes) [as in the example below], so how can the import handle this?
3) How do we track whether some rows have bad data that the import skips? (Does the import skip rows that are not importable?)

Here is the sample CSV with header:

Name,Class,Subject,ExamDate,Mark,Description
Prabhat,4,Math,2/10/2013,25,Test data for prabhat.
Murari,5,Science,2/11/2013,24,"Test data for his's test, where we can test 2nd ROW, Test."
sanjay,4,Science,,25,Test Only.

And SQL statement to import:

BULK INSERT SchoolsTemp
FROM 'C:\CSVData\Schools.csv'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',  --CSV field delimiter
    ROWTERMINATOR = '\n',   --Use to shift the control to next row
    TABLOCK
)
Maybe "SSMS: How to import (Copy/Paste) data from excel" can help (if you don't want to use BULK INSERT or don't have permissions for it).
This is beside the point, but your sample CSV file should load into MS Access without trouble.

sawa

Based on SQL Server CSV Import:

1) The CSV file data may have , (comma) in between (Ex: description), so how can I make import handling these data?

Solution

If you're using , (comma) as a delimiter, then there is no way to differentiate between a comma as a field terminator and a comma in your data. I would use a different FIELDTERMINATOR such as ||. With that terminator, commas and single slashes in the data are imported correctly.
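A minimal sketch of how a quoted CSV could be rewritten with || as the terminator before running BULK INSERT. It is written in Python and assumes the client files are ordinary quoted CSV and that || never occurs in the data; the function name csv_to_delimited is hypothetical, and the sample row comes from the question.

```python
import csv
import io

DELIMITER = "||"  # assumed terminator; pick any string that never occurs in the data


def csv_to_delimited(csv_text, delimiter=DELIMITER):
    """Re-emit quoted CSV text using a multi-character field terminator."""
    out_lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        for field in row:
            # Refuse input that would become ambiguous after conversion.
            if delimiter in field or "\n" in field:
                raise ValueError(f"cannot convert field safely: {field!r}")
        out_lines.append(delimiter.join(row))
    return "\n".join(out_lines) + "\n"


sample = (
    "Name,Class,Subject,ExamDate,Mark,Description\n"
    "Murari,5,Science,2/11/2013,24,"
    "\"Test data for his's test, where we can test 2nd ROW, Test.\"\n"
)
converted = csv_to_delimited(sample)
print(converted)
```

The converted file could then be loaded with FIELDTERMINATOR = '||'. The quotes around the Description value are gone, and its commas no longer collide with the terminator.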

2) If the client creates the CSV from Excel, then the data that contain a comma are enclosed within " ... " (double quotes) [as in the example below], so how can the import handle this?

Solution

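If you're using plain BULK INSERT (without the FORMAT = 'CSV' option, which exists only in SQL Server 2017 and later), there is no way to handle double quotes; the data will be inserted into the rows with the double quotes included. After inserting the data into the table, you could replace those double quotes with ''.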

-- Table and column names are taken from the question's sample; adjust to your schema.
UPDATE SchoolsTemp
SET Description = REPLACE(Description, '"', '')

3) How do we track whether some rows have bad data that the import skips? (Does the import skip rows that are not importable?)

Solution

Rows that aren't loaded into the table because of invalid data or format can be handled using the ERRORFILE property: specify the error file name, and the rows that have errors will be written to that file. The code should look like this:

BULK INSERT SchoolsTemp
    FROM 'C:\CSVData\Schools.csv'
    WITH
    (
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',  --CSV field delimiter
    ROWTERMINATOR = '\n',   --Use to shift the control to next row
    ERRORFILE = 'C:\CSVDATA\SchoolsErrorRows.csv',
    TABLOCK
    )
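If you would rather find bad rows before the file ever reaches the server, the same idea can be sketched client-side. The following Python sketch is only an illustration, not part of BULK INSERT: it assumes the column layout of the question's sample, assumes ExamDate is month/day/year and may be empty, and uses a hypothetical function name split_rows. The last two sample rows are invented to show rejections.

```python
import csv
import io
from datetime import datetime


def split_rows(csv_text):
    """Return (good_rows, bad_rows); bad_rows holds (line_number, row, reason)."""
    good, bad = [], []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    for row in reader:
        line_no = reader.line_num
        if len(row) != len(header):
            bad.append((line_no, row, f"expected {len(header)} fields, got {len(row)}"))
            continue
        rec = dict(zip(header, row))
        try:
            int(rec["Class"])
            int(rec["Mark"])
            if rec["ExamDate"]:  # assumption: an empty ExamDate is allowed (nullable column)
                datetime.strptime(rec["ExamDate"], "%m/%d/%Y")
        except ValueError as exc:
            bad.append((line_no, row, str(exc)))
            continue
        good.append(row)
    return good, bad


sample = (
    "Name,Class,Subject,ExamDate,Mark,Description\n"
    "Prabhat,4,Math,2/10/2013,25,Test data for prabhat.\n"
    "sanjay,4,Science,,25,Test Only.\n"
    "Broken,four,Math,2/12/2013,25,Non-numeric class.\n"  # invented bad row
    "Short,4,Math\n"                                      # invented short row
)
good, bad = split_rows(sample)
for line_no, row, reason in bad:
    print(line_no, reason)
```

The bad list plays the role of the error file: it can be written out and sent back to whoever produced the CSV.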

Thanks for the help. Regarding Solution #1: can we create a ||-separated value file from Excel? Around 20% of the source files are created using Excel by the client.
@Prabhat How are you loading Excel files into SQL Server?
These are not Excel files that I am loading. The client is using Excel to create .CSV files (for 20% of the source data that our application imports). And I was asking: if we create CSV files using Excel, how can we have || as the column value separator?
The file has to be ON THE SERVER. Not on your local machine.
@Jess the file specified can be a UNC path (e.g., \\machinename\public) as long as permissions are configured correctly: dba.stackexchange.com/questions/44524/…
TylerH

From How to import a CSV file into a database using SQL Server Management Studio, from 2013-11-05:

First create a table in your database into which you will be importing the CSV file. After the table is created:

1) Log into your database using SQL Server Management Studio.
2) Right click on your database and select Tasks -> Import Data...
3) Click the Next > button.
4) For the Data Source, select Flat File Source. Then use the Browse button to select the CSV file. Spend some time configuring how you want the data to be imported before clicking on the Next > button.
5) For the Destination, select the correct database provider (e.g. for SQL Server 2012, you can use SQL Server Native Client 11.0). Enter the Server name; Check Use SQL Server Authentication, enter the User name, Password, and Database before clicking on the Next > button.
6) On the Select Source Tables and Views window, you can Edit Mappings before clicking on the Next > button.
7) Check the Run immediately check box and click on the Next > button.
8) Click on the Finish button to run the package.


It would be nice if you gave attribution to the page where you copy/pasted this answer from...
It is not necessary to pre-create the table, it can be created during the import process
I love that you just cut & paste from a web page with the oh-so-useful line "Spend some time configuring how you want the data to be imported". That was everything I'm looking for: I don't seem to be able to configure it at all!
Oh, and "Check the Use SQL Server Authentication radio button" is wrong, as you may very well want to use Windows Authentication. It's whichever works for you.
Thanks. I found a step-by-step procedure with images for implementing the above procedure, worth a look: qawithexperts.com/article/sql/…
Oleg

2) If the client creates the CSV from Excel, then the data that contain a comma are enclosed within " ... " (double quotes) [as in the example below], so how can the import handle this?

You should use FORMAT = 'CSV', FIELDQUOTE = '"' options:

BULK INSERT SchoolsTemp
FROM 'C:\CSVData\Schools.csv'
WITH
(
    FORMAT = 'CSV', 
    FIELDQUOTE = '"',
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',  --CSV field delimiter
    ROWTERMINATOR = '\n',   --Use to shift the control to next row
    TABLOCK
)

Note that the FORMAT specifier is only available since SQL Server 2017.
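To see why quote-aware parsing matters for the question's sample, here is a small Python illustration. It uses Python's csv module as a stand-in for the quote handling; this is an assumption for demonstration only and is not how SQL Server implements FORMAT = 'CSV'.

```python
import csv
import io

# The second data row from the question's sample file.
line = (
    "Murari,5,Science,2/11/2013,24,"
    "\"Test data for his's test, where we can test 2nd ROW, Test.\""
)

naive = line.split(",")                       # what a plain comma terminator sees
quoted = next(csv.reader(io.StringIO(line)))  # quote-aware parsing

print(len(naive), len(quoted))  # 8 6
print(quoted[5])
```

A plain comma split yields eight pieces for a six-column table, which is why the row fails or lands in the wrong columns without quote handling.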
Sachin Kainth

The best, quickest and easiest way to resolve the comma in data issue is to use Excel to save a comma separated file after having set Windows' list separator setting to something other than a comma (such as a pipe). This will then generate a pipe (or whatever) separated file for you that you can then import. This is described here.


Burgi

Why not use the SQL Server import wizard? The steps would be as follows:

https://i.stack.imgur.com/aVKs3.png

Right-click the database and, under Tasks, choose Import Data. Once the wizard is open, select the type of data source to import. In this case it would be

Flat file source

Select the CSV file. You can configure the data types of the columns in the CSV, but it is best to take them from the CSV.

Click Next and, for the destination, select the last option, which is

SQL client

Select the authentication type that applies to you. Once this is done, a very important option comes up.

You can map the id of the table to the CSV (it is recommended that the CSV columns have the same names as the fields in the table). Under Edit Mappings you can preview how each table column maps to a column of the spreadsheet. If you want the wizard to insert the id by default (it usually does not start from 1), leave this option unchecked:

Enable id insert

If instead the CSV has a column with the id, check Enable id insert. The next step is to finish the wizard; you can review the changes there.

The following window may show alerts or warnings. Ideally these can be ignored; you only need to pay attention if errors are reported.

This link has images.


Brad Larson

First you need to import the CSV file into a DataTable.

Then you can bulk-insert the rows using SqlBulkCopy:

using System;
using System.Data;
using System.Data.SqlClient;

namespace SqlBulkInsertExample
{
    class Program
    {
        static void Main(string[] args)
        {
            DataTable prodSalesData = new DataTable("ProductSalesData");

            // Create Column 1: SaleDate
            DataColumn dateColumn = new DataColumn();
            dateColumn.DataType = typeof(DateTime);
            dateColumn.ColumnName = "SaleDate";

            // Create Column 2: ProductName
            DataColumn productNameColumn = new DataColumn();
            productNameColumn.ColumnName = "ProductName";

            // Create Column 3: TotalSales
            DataColumn totalSalesColumn = new DataColumn();
            totalSalesColumn.DataType = typeof(int);
            totalSalesColumn.ColumnName = "TotalSales";

            // Add the columns to the ProductSalesData DataTable
            prodSalesData.Columns.Add(dateColumn);
            prodSalesData.Columns.Add(productNameColumn);
            prodSalesData.Columns.Add(totalSalesColumn);

            // Let's populate the datatable with our stats.
            // You can add as many rows as you want here!

            // Create a new row
            DataRow dailyProductSalesRow = prodSalesData.NewRow();
            dailyProductSalesRow["SaleDate"] = DateTime.Now.Date;
            dailyProductSalesRow["ProductName"] = "Nike";
            dailyProductSalesRow["TotalSales"] = 10;

            // Add the row to the ProductSalesData DataTable
            prodSalesData.Rows.Add(dailyProductSalesRow);

            // Copy the DataTable to SQL Server using SqlBulkCopy
            using (SqlConnection dbConnection = new SqlConnection("Data Source=ProductHost;Initial Catalog=dbProduct;Integrated Security=SSPI;Connection Timeout=60;Min Pool Size=2;Max Pool Size=20;"))
            {
                dbConnection.Open();
                using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
                {
                    s.DestinationTableName = prodSalesData.TableName;

                    // Map each DataTable column to the destination column of the same name
                    foreach (DataColumn column in prodSalesData.Columns)
                        s.ColumnMappings.Add(column.ColumnName, column.ColumnName);

                    s.WriteToServer(prodSalesData);
                }
            }
        }
    }
}
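The C# above fills the DataTable by hand; the step of reading a CSV into typed rows is not shown. A rough sketch of that step follows, written in Python rather than C#, with the standard-library sqlite3 module standing in for SQL Server and executemany standing in for SqlBulkCopy. The CSV contents and the helper name read_rows are invented for illustration.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Invented sample data matching the SaleDate/ProductName/TotalSales columns above.
CSV_TEXT = (
    "SaleDate,ProductName,TotalSales\n"
    "2013-02-10,Nike,10\n"
    '2013-02-11,"Shoes, size 9",4\n'
)


def read_rows(csv_text):
    """Parse the CSV and convert each field to its column's type."""
    for rec in csv.DictReader(io.StringIO(csv_text)):
        sale_date = datetime.strptime(rec["SaleDate"], "%Y-%m-%d").date()
        yield (sale_date.isoformat(), rec["ProductName"], int(rec["TotalSales"]))


conn = sqlite3.connect(":memory:")  # stand-in for the SQL Server connection
conn.execute(
    "CREATE TABLE ProductSalesData (SaleDate TEXT, ProductName TEXT, TotalSales INTEGER)"
)
# One batched call instead of one round trip per row.
conn.executemany("INSERT INTO ProductSalesData VALUES (?, ?, ?)", read_rows(CSV_TEXT))
conn.commit()

loaded = conn.execute(
    "SELECT SaleDate, ProductName, TotalSales FROM ProductSalesData ORDER BY SaleDate"
).fetchall()
print(loaded)
```

Because the rows are parsed by a CSV reader before they are sent, a quoted value containing a comma arrives as a single field.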

A possibly more user-friendly wrapper around the BulkCopy classes: busybulkcopy.codeplex.com
Neil

Here's how I would solve it:

1) Just save your CSV file as an XLS sheet in Excel. (By doing so, you wouldn't have to worry about delimiters. Excel's spreadsheet format will be read as a table and imported directly into a SQL table.)
2) Import the file using SSIS.
3) Write a custom script in the import manager to omit/modify the data you're looking for. (Or run a master script to scrutinize the data you're looking to remove.)

Good Luck.


Downvote: Importing XLS files with SSIS is terrible. SSIS will try to guess at the datatypes of the Excel data, but can guess wrong and there's nothing you can do about it. Much better to use CSV.
Well, I'd suggest csv too, but if you had read the OP's scenario, he had some special scenarios especially with delimiters which are not an issue with xls sheets. Usually special case scenarios like these do not require an extensive solution, but a fix that preserves the data. While uploading the file, SSIS lets you choose the data mapping between source and destination tables which again, eases the effort involved. Which is why this method was suggested as a quick hack.
SSIS can already handle CSV text delimiters. If you're using SSIS anyway, going to the trouble of saving your CSV as an XLS first just strikes me as adding potential breakage for no reason.
Also, I routinely have CSV files too large for Excel.
Arsen Khachaturyan

I know this is not the exact solution to the question above, but for me, it was a nightmare when I was trying to copy data from a database located on a separate server to my local one.

I was trying to do that by first exporting the data from the server to CSV/txt and then importing it into my local table.

Both approaches, writing a query to import the CSV or using the SSMS Import Data wizard, always produced errors (the errors were very general, saying that there was a parsing problem). And although I wasn't doing anything special, just exporting to CSV and then trying to import the CSV into the local DB, the errors were always there.

I tried looking at the mapping section and the data preview, but there was always a big mess. I know the main problem was coming from one of the table columns, which contained JSON that the SQL parser was treating incorrectly.

So eventually I came up with a different solution, and I want to share it in case someone else has a similar problem.

What I did was use the Export Wizard on the external server.

Here are the steps to repeat the same process:
1) Right click on the database and select Tasks -> Export Data...

2) When the wizard opens, choose Next and, for "Data Source:", choose "SQL Server Native Client".

https://i.stack.imgur.com/mKn5w.png

In the case of an external server, you will most probably have to choose "Use SQL Server Authentication" for the "Authentication Mode:".

3) After hitting Next, you have to select the destination.
For that, select "SQL Server Native Client" again.
This time you can provide your local DB (or some other external DB).

https://i.stack.imgur.com/6axAM.png

4) After hitting the Next button, you have two options: either copy the entire table from one DB to another, or write a query to specify the exact data to be copied. In my case, I didn't need the entire table (it was too large), just some part of it, so I chose "Write a query to specify the data to transfer".

https://i.stack.imgur.com/RLKez.png

I would suggest writing and testing the query in a separate query editor before moving to the wizard.

5) And finally, you need to specify the destination table into which the data will be copied.

https://i.stack.imgur.com/aAmOc.png

I suggest leaving it as [dbo].[Query] or some custom table name in case you get errors exporting the data, or if you are not sure about the data and want to analyze it further before moving it to the exact table you want.

And now go straight to the end of the wizard by hitting the Next/Finish buttons.


William Herrmann

All of the answers here work great if your data is "clean" (no data constraint violations, etc.) and you have access to putting the file on the server. Some of the answers provided here stop at the first error (PK violation, data-loss error, etc.) and give you one error at a time if using SSMS's built in Import Task. If you want to gather all errors at once (in case you want to tell the person that gave you the .csv file to clean up their data), I recommend the following as an answer. This answer also gives you complete flexibility as you are "writing" the SQL yourself.

Note: I'm going to assume you are running a Windows OS and have access to Excel and SSMS. If not, I'm sure you can tweak this answer to fit your needs.

1) Using Excel, open your .csv file.
2) In an empty column, write a formula that builds individual INSERT statements, like =CONCATENATE("INSERT INTO dbo.MyTable (FirstName, LastName) VALUES ('", A1, "', '", B1,"')", CHAR(10), "GO"), where A1 is a cell that has the first name data and B1 has the last name data, for example. CHAR(10) adds a newline character to the final result, and GO will allow us to run this INSERT and continue to the next even if there are any errors.
3) Highlight the cell with your =CONCATENATE() formula.
4) Shift + End to highlight the same column in the rest of your rows.
5) In the ribbon > Home > Editing > Fill > click Down. This applies the formula all the way down the sheet so you don't have to copy-paste, drag, etc. down potentially thousands of rows by hand.
6) Ctrl + C to copy the formulated SQL INSERT statements.
7) Paste into SSMS.
8) You will notice Excel, probably unexpectedly, added double quotes around each of your INSERT and GO commands. This is a "feature" (?) of copying multi-line values out of Excel. You can simply find and replace "INSERT and GO" with INSERT and GO respectively to clean that up.
9) Finally you are ready to run your import process.
10) After the process completes, check the Messages window for any errors. You can select all the content (Ctrl + A), copy it into Excel, and use a column filter to remove any successful messages, leaving you with any and all the errors.

This process will definitely take longer than other answers here, but if your data is "dirty" and full of SQL violations, you can at least gather all the errors at one time and send them to the person that gave you the data, if that is your scenario.
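The same one-INSERT-per-row script can also be generated outside Excel. Below is a minimal Python sketch of that idea; the helper names sql_literal and insert_script are hypothetical, and the sample names are invented. Unlike the plain CONCATENATE formula, it doubles embedded single quotes, so a value such as O'Brien does not break the statement. The table and column names are inserted as-is, so they must come from a trusted source.

```python
import csv
import io


def sql_literal(value):
    """Quote a string for T-SQL, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def insert_script(csv_text, table, columns):
    """Build one INSERT per CSV row, each followed by GO on its own line."""
    statements = []
    for row in csv.reader(io.StringIO(csv_text)):
        values = ", ".join(sql_literal(v) for v in row)
        statements.append(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({values})\nGO"
        )
    return "\n".join(statements) + "\n"


# Invented sample rows; dbo.MyTable and its columns follow the formula above.
sample = "John,O'Brien\nJane,Doe\n"
script = insert_script(sample, "dbo.MyTable", ["FirstName", "LastName"])
print(script)
```

Each statement sits in its own batch because of the GO separator, so a failing row reports its error in the Messages window without stopping the rows after it.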


Steve Yo

Import the file into Excel by first opening Excel, then going to DATA, import from TXT file, and choosing the csv extension. To preserve 0-prefixed values, set that column to TEXT, because Excel will drop the leading 0 otherwise (DO NOT double-click to open with Excel if you have numeric data in a field starting with a 0 [zero]). Then just save out as a Tab Delimited Text file. When you are importing into Excel, you get an option to set each column as GENERAL, TEXT, etc. Choose TEXT so that quotes in the middle of a string in a field like YourCompany,LLC are preserved also...

BULK INSERT dbo.YourTableName
FROM 'C:\Users\Steve\Downloads\yourfiletoIMPORT.txt'
WITH (
    FIRSTROW = 2,            -- if skipping a header row
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR   = '\n'
)

I wish I could use the FORMAT and FIELDQUOTE functionality, but that does not appear to be supported in my version of SSMS.


Michael Tomar

As it was stated above, you need to add FORMAT and FIELDQUOTE options to bulk insert .CSV data into SQL Server. For your case SQL statement will look like this:

BULK INSERT SchoolsTemp
FROM 'C:\CSVData\Schools.csv'
WITH
(
    FORMAT = 'CSV', 
    FIELDQUOTE = '"',
    FIRSTROW = 2,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    TABLOCK
)

Though BULK INSERT in SSMS is great for a one-time import job, depending on your use case you may need some other options inside SSMS or third-party tools. Here is a detailed guide describing various options to import CSV files to SQL Server, including ways to automate (that is, schedule) the process and to specify FTP or file storage as the CSV location.


Chameleon

I know there is an accepted answer, but I still want to share my scenario; maybe it will help someone solve their problem.

TOOLS

ASP.NET

EF CODE-FIRST APPROACH

SSMS

EXCEL

SCENARIO

I was loading a dataset in CSV format that was later to be shown in the View. I tried to use bulk load, but I was unable to load it because BULK LOAD was using

FIELDTERMINATOR = ','

and the Excel cells also contained commas. However, I also couldn't use the flat file source directly, because I was using the Code-First approach and doing that only created the model in the SSMS DB, not in the model whose properties I had to use later.

SOLUTION

1) I used the flat-file source and made a DB table from the CSV file (right-click the DB in SSMS -> Import Flat File -> select the CSV path and do all the settings as directed).
2) Made a model class in Visual Studio (you MUST KEEP all the data types and names the same as those of the CSV file loaded in SQL).
3) Used Add-Migration in the NuGet package console.
4) Updated the DB.


BdR

Maybe not exactly what you're asking, but another option is to use the CSV Lint plug-in for Notepad++

The plug-in can validate the csv data beforehand, meaning check for bad data like missing quotes, incorrect decimal separator, datetime formatting errors etc. And instead of BULK INSERT it can convert the csv file to an SQL insert script.

https://i.stack.imgur.com/HIPEf.png

The SQL script will contain INSERT statements for each csv line in batches of 1000 records, and also adjust any datetime and decimal values. The plug-in automatically detects datatypes in the csv, and it will include a CREATE TABLE part with the correct data types for each column.

https://i.stack.imgur.com/vmyre.png