I want to insert data into my table, but insert only data that doesn't already exist in my database.
Here is my code:
ALTER PROCEDURE [dbo].[EmailsRecebidosInsert]
(@_DE nvarchar(50),
@_ASSUNTO nvarchar(50),
@_DATA nvarchar(30) )
AS
BEGIN
INSERT INTO EmailsRecebidos (De, Assunto, Data)
VALUES (@_DE, @_ASSUNTO, @_DATA)
WHERE NOT EXISTS ( SELECT * FROM EmailsRecebidos
WHERE De = @_DE
AND Assunto = @_ASSUNTO
AND Data = @_DATA);
END
And the error is:
Msg 156, Level 15, State 1, Procedure EmailsRecebidosInsert, Line 11 Incorrect syntax near the keyword 'WHERE'.
insert
statement is always a single transaction. It's not as if the SQL Server evaluates the subquery first and then at some later point, and without holding a lock, goes on to do the insert.
UPDLOCK
and HOLDLOCK
query hints, the lock on EmailsRecebidos
will be released as soon as the check is done, momentarily before the write to the same table. In this split second, another thread can still read the table and assume records don't exist and encounter the race condition. By using the explicit transactions and the locking hints, and can stop the lock on the table being released after the select statement is finished. The lock will be held until the transaction is committed.
instead of below Code
BEGIN
INSERT INTO EmailsRecebidos (De, Assunto, Data)
VALUES (@_DE, @_ASSUNTO, @_DATA)
WHERE NOT EXISTS ( SELECT * FROM EmailsRecebidos
WHERE De = @_DE
AND Assunto = @_ASSUNTO
AND Data = @_DATA);
END
replace with
BEGIN
IF NOT EXISTS (SELECT * FROM EmailsRecebidos
WHERE De = @_DE
AND Assunto = @_ASSUNTO
AND Data = @_DATA)
BEGIN
INSERT INTO EmailsRecebidos (De, Assunto, Data)
VALUES (@_DE, @_ASSUNTO, @_DATA)
END
END
Updated : (thanks to @Marc Durdin for pointing)
Note that under high load, this will still sometimes fail, because a second connection can pass the IF NOT EXISTS test before the first connection executes the INSERT, i.e. a race condition. See stackoverflow.com/a/3791506/1836776 for a good answer on why even wrapping in a transaction doesn't solve this.
For those looking for the fastest way, I recently came across these benchmarks where apparently using "INSERT SELECT... EXCEPT SELECT..." turned out to be the fastest for 50 million records or more.
Here's some sample code from the article (the 3rd block of code was the fastest):
INSERT INTO #table1 (Id, guidd, TimeAdded, ExtraData)
SELECT Id, guidd, TimeAdded, ExtraData
FROM #table2
WHERE NOT EXISTS (Select Id, guidd From #table1 WHERE #table1.id = #table2.id)
-----------------------------------
MERGE #table1 as [Target]
USING (select Id, guidd, TimeAdded, ExtraData from #table2) as [Source]
(id, guidd, TimeAdded, ExtraData)
on [Target].id =[Source].id
WHEN NOT MATCHED THEN
INSERT (id, guidd, TimeAdded, ExtraData)
VALUES ([Source].id, [Source].guidd, [Source].TimeAdded, [Source].ExtraData);
------------------------------
INSERT INTO #table1 (id, guidd, TimeAdded, ExtraData)
SELECT id, guidd, TimeAdded, ExtraData from #table2
EXCEPT
SELECT id, guidd, TimeAdded, ExtraData from #table1
------------------------------
INSERT INTO #table1 (id, guidd, TimeAdded, ExtraData)
SELECT #table2.id, #table2.guidd, #table2.TimeAdded, #table2.ExtraData
FROM #table2
LEFT JOIN #table1 on #table1.id = #table2.id
WHERE #table1.id is null
I would use a merge:
create PROCEDURE [dbo].[EmailsRecebidosInsert]
(@_DE nvarchar(50),
@_ASSUNTO nvarchar(50),
@_DATA nvarchar(30) )
AS
BEGIN
with data as (select @_DE as de, @_ASSUNTO as assunto, @_DATA as data)
merge EmailsRecebidos t
using data s
on s.de = t.de
and s.assunte = t.assunto
and s.data = t.data
when not matched by target
then insert (de, assunto, data) values (s.de, s.assunto, s.data);
END
Try below code
ALTER PROCEDURE [dbo].[EmailsRecebidosInsert]
(@_DE nvarchar(50),
@_ASSUNTO nvarchar(50),
@_DATA nvarchar(30) )
AS
BEGIN
INSERT INTO EmailsRecebidos (De, Assunto, Data)
select @_DE, @_ASSUNTO, @_DATA
EXCEPT
SELECT De, Assunto, Data from EmailsRecebidos
END
Different SQL, same principle. Only insert if the clause in where not exists fails
INSERT INTO FX_USDJPY
(PriceDate,
PriceOpen,
PriceLow,
PriceHigh,
PriceClose,
TradingVolume,
TimeFrame)
SELECT '2014-12-26 22:00',
120.369000000000,
118.864000000000,
120.742000000000,
120.494000000000,
86513,
'W'
WHERE NOT EXISTS
(SELECT 1
FROM FX_USDJPY
WHERE PriceDate = '2014-12-26 22:00'
AND TimeFrame = 'W')
I did the same thing with SQL Server 2012 and it worked
Insert into #table1 With (ROWLOCK) (Id, studentId, name)
SELECT '18769', '2', 'Alex'
WHERE not exists (select * from #table1 where Id = '18769' and studentId = '2')
The INSERT
command doesn't have a WHERE
clause - you'll have to write it like this:
ALTER PROCEDURE [dbo].[EmailsRecebidosInsert]
(@_DE nvarchar(50),
@_ASSUNTO nvarchar(50),
@_DATA nvarchar(30) )
AS
BEGIN
IF NOT EXISTS (SELECT * FROM EmailsRecebidos
WHERE De = @_DE
AND Assunto = @_ASSUNTO
AND Data = @_DATA)
BEGIN
INSERT INTO EmailsRecebidos (De, Assunto, Data)
VALUES (@_DE, @_ASSUNTO, @_DATA)
END
END
Depending on your version (2012?) of SQL Server aside from the IF EXISTS you can also use MERGE like so:
ALTER PROCEDURE [dbo].[EmailsRecebidosInsert]
( @_DE nvarchar(50)
, @_ASSUNTO nvarchar(50)
, @_DATA nvarchar(30))
AS BEGIN
MERGE [dbo].[EmailsRecebidos] [Target]
USING (VALUES (@_DE, @_ASSUNTO, @_DATA)) [Source]([De], [Assunto], [Data])
ON [Target].[De] = [Source].[De] AND [Target].[Assunto] = [Source].[Assunto] AND [Target].[Data] = [Source].[Data]
WHEN NOT MATCHED THEN
INSERT ([De], [Assunto], [Data])
VALUES ([Source].[De], [Source].[Assunto], [Source].[Data]);
END
If your clustered index consists of only those fields then the simple, fast and reliable option is to use IGNORE_DUP_KEY
If you create the Clustered index with IGNORE_DUP_KEY ON
Then you can just use:
INSERT INTO EmailsRecebidos (De, Assunto, Data) VALUES (@_DE, @_ASSUNTO, @_DATA)
This should be safe in all cases!
If you want to check whether a key exists or not, you can use:
INSERT INTO tableName (...) VALUES (...)
ON DUPLICATE KEY
UPDATE ...
Using this, if there is already an entry for the particular key, then it will UPDATE, else, it will INSERT.
Just change your code to use SELECT instead of VALUES
INSERT INTO EmailsRecebidos (De, Assunto, Data)
SELECT @_DE, @_ASSUNTO, @_DATA
WHERE NOT EXISTS (SELECT * FROM EmailsRecebidos
WHERE De = @_DE
AND Assunto = @_ASSUNTO
AND Data = @_DATA);
You could use the GO
command. That will restart the execution of SQL statements after an error. In my case I have a few 1000 INSERT statements, where a handful of those records already exist in the database, I just don't know which ones. I found that after processing a few 100, execution just stops with an error message that it can't INSERT
as the record already exists. Quite annoying, but putting a GO
solved this. It may not be the fastest solution, but speed was not my problem.
GO
INSERT INTO mytable (C1,C2,C3) VALUES(1,2,3)
GO
INSERT INTO mytable (C1,C2,C3) VALUES(4,5,6)
etc ...
GO
is a batch separator? It doesn't assist with preventing duplicate records.
As explained in below code: Execute below queries and verify yourself.
CREATE TABLE `table_name` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(255) NOT NULL,
`address` varchar(255) NOT NULL,
`tele` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
Insert a record:
INSERT INTO table_name (name, address, tele)
SELECT * FROM (SELECT 'Nazir', 'Kolkata', '033') AS tmp
WHERE NOT EXISTS (
SELECT name FROM table_name WHERE name = 'Nazir'
) LIMIT 1;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
SELECT * FROM `table_name`;
+----+--------+-----------+------+
| id | name | address | tele |
+----+--------+-----------+------+
| 1 | Nazir | Kolkata | 033 |
+----+--------+-----------+------+
Now, try to insert the same record again:
INSERT INTO table_name (name, address, tele)
SELECT * FROM (SELECT 'Nazir', 'Kolkata', '033') AS tmp
WHERE NOT EXISTS (
SELECT name FROM table_name WHERE name = 'Nazir'
) LIMIT 1;
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
+----+--------+-----------+------+
| id | name | address | tele |
+----+--------+-----------+------+
| 1 | Nazir | Kolkata | 033 |
+----+--------+-----------+------+
Insert a different record:
INSERT INTO table_name (name, address, tele)
SELECT * FROM (SELECT 'Santosh', 'Kestopur', '044') AS tmp
WHERE NOT EXISTS (
SELECT name FROM table_name WHERE name = 'Santosh'
) LIMIT 1;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
SELECT * FROM `table_name`;
+----+--------+-----------+------+
| id | name | address | tele |
+----+--------+-----------+------+
| 1 | Nazir | Kolkata | 033 |
| 2 | Santosh| Kestopur | 044 |
+----+--------+-----------+------+
Success story sharing
select *
in this case makes no difference whatsoever because it's being used in anEXISTS
clause. SQL Server will always optimize it and has been doing it for ages. Since I'm very old I usually write these queries asEXISTS (SELECT 1 FROM...)
but it is not needed anymore.