
Generating CSV file for Excel, how to have a newline inside a value

I need to generate a file for Excel, and some of the values in this file contain multiple lines.

There is also non-English text in there, so the file has to be Unicode.

The file I'm generating now looks like this (in UTF-8, with non-English text mixed in and many more lines):

Header1,Header2,Header3
Value1,Value2,"Value3 Line1
Value3 Line2"

Note the multi-line value is enclosed in double quotes, with a normal everyday newline in it.

According to what I found on the web this is supposed to work, but it doesn't, at least not with Excel 2007 and UTF-8 files: Excel treats the 3rd line as the second row of data, not as the second line of the first data row.

This has to run on my customer's machines and I have no control over their version of Excel, so I need a solution that will work with Excel 2000 and later.

Thanks

EDIT: I "solved" my problem by having two CSV options, one for Excel (Unicode, tab separated, no newlines in fields) and one for the rest of the world (UTF8, standard CSV).

Not what I was looking for but at least it works (so far)
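
For readers who want to reproduce the Excel-specific variant described in the EDIT, here is a minimal Python sketch. It assumes "Unicode" means UTF-16 with a BOM, and that line breaks inside fields may be flattened to a space; the function name `to_excel_tsv` and the `newline_replacement` parameter are made up for illustration.

```python
import csv
import io


def to_excel_tsv(rows, newline_replacement=" "):
    """Return UTF-16 bytes: tab-separated rows with line breaks inside fields flattened."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, delimiter="\t", lineterminator="\r\n")
    for row in rows:
        flat = [
            str(value)
            .replace("\r\n", "\n")
            .replace("\r", "\n")
            .replace("\n", newline_replacement)
            for value in row
        ]
        writer.writerow(flat)
    # Python's "utf-16" codec prepends a byte order mark to the output.
    return buf.getvalue().encode("utf-16")


if __name__ == "__main__":
    data = to_excel_tsv([["Header1", "Header2"], ["Value1", "Line1\nLine2"]])
    with open("for_excel.txt", "wb") as fh:  # hypothetical output file name
        fh.write(data)
```

The obvious cost is the one the EDIT already mentions: the multi-line structure of the values is lost.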

FYI: This all works perfectly in LibreOffice and importing a CSV is much easier in the first place.
The accepted answer about the extra spaces is incredibly confusing now that you've edited your question and removed the spaces...

Marcel Gosselin

You should have space characters at the start of fields ONLY where the space characters are part of the data. Excel will not strip off leading spaces. You will get unwanted spaces in your headings and data fields. Worse, the " that should be "protecting" that line-break in the third column will be ignored because it is not at the start of the field.

If you have non-ASCII characters (encoded in UTF-8) in the file, you should have a UTF-8 BOM (3 bytes, hex EF BB BF) at the start of the file. Otherwise Excel will interpret the data according to your locale's default encoding (e.g. cp1252) instead of utf-8, and your non-ASCII characters will be trashed.
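
As a rough illustration of the BOM advice, the following Python sketch encodes CSV text with the `utf-8-sig` codec, which writes the three bytes EF BB BF before the data. The helper name `csv_bytes_with_bom` is hypothetical, and whether Excel then handles the embedded line breaks still depends on how the file is opened, as described below.

```python
import csv
import io


def csv_bytes_with_bom(rows):
    """Serialise rows as CSV and return UTF-8 bytes that start with a BOM."""
    buf = io.StringIO(newline="")
    csv.writer(buf, lineterminator="\r\n").writerows(rows)
    # "utf-8-sig" emits the 3-byte BOM (EF BB BF) ahead of the encoded text.
    return buf.getvalue().encode("utf-8-sig")


if __name__ == "__main__":
    payload = csv_bytes_with_bom([["Header1", "Header2"], ["Value1", "café"]])
    print(payload[:3].hex())  # first three bytes of the output
```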

Following comments apply to Excel 2003, 2007 and 2013; not tested on Excel 2000

If you open the file by double-clicking on its name in Windows Explorer, everything works OK.

If you open it from within Excel, the results vary:

You have only ASCII characters in the file (and no BOM): works.

You have non-ASCII characters (encoded in UTF-8) in the file, with a UTF-8 BOM at the start: it recognises that your data is encoded in UTF-8 but it ignores the csv extension and drops you into the Text Import not-a-Wizard, unfortunately with the result that you get the line-break problem.

Options include:

Train the users not to open the files from within Excel :-(

Consider writing an XLS file directly ... there are packages/libraries available for doing that in Python/Perl/PHP/.NET/etc


Thanks, I fixed the leading spaces issue in the question, I typed the CSV example manually and didn't copy-paste from a real file, the real file doesn't include those spaces, good catch.
@Nir: Now let's talk about your real problem. So that means you had a UTF-8 BOM, and opened the file from within Excel and got the Text Import Wizard not recognising that your Value3 newline should be "protected" -- correct? Or perhaps you didn't have a UTF-8 BOM and you had to tell the TIW that your data was UTF-8 encoded and it still bungled the newline?
What if I want to use | as a field separator, new line as a record separator, use " to protect the content of text fields, and text fields might contain |, ", and new line. Is this possible?
FYI: I've got Excel 2007 and a CSV exported from the Redmine system. After adding a UTF-8 BOM (EFBBBF) at its beginning, Excel opened the file perfectly. New lines embedded in the "issue description" column are processed correctly, the row structure is not damaged, and all national characters are read properly (they were trashed when reading without the UTF-8 BOM). Excel did not even display the Text Import Wizard. Currently, that CSV has an EFBBBF header, uses 0A as the row separator, and 0D0A as the new line inside strings in text cells.
If you're trying to get Excel for OS X to read your CSV correctly, as well as Excel for Windows, here's a great resource: stackoverflow.com/questions/4348802/…
Community

After lots of tweaking, here's a configuration that works generating files on Linux, reading on Windows+Excel, though the embedded newline format is not according to the standard:

Newlines within a field need to be \n (and obviously quoted in double quotes)

End of record: \r\n

Make sure that you don't start a field with equals, otherwise it gets treated as a formula and truncated

In Perl, I used Text::CSV to do this as follows:

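use Text::CSV;

open my $FO, ">:encoding(utf8)", $filename or die "Cannot create $filename: $!";
# binary => 1 permits embedded newlines in fields; eol sets the end-of-record sequence
my $csv = Text::CSV->new({ binary => 1, eol => "\r\n" });

# for each row...:
$csv->print($FO, \@row);

close $FO or die "Cannot close $filename: $!";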
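
For comparison, the same configuration (LF inside quoted fields, CRLF at the end of each record) can be sketched in Python with the standard `csv` module. The names `normalise` and `rows_to_csv` are hypothetical; the leading-equals caveat above is not handled here.

```python
import csv
import io


def normalise(value):
    """Collapse CRLF and bare CR inside a field to a single LF."""
    return str(value).replace("\r\n", "\n").replace("\r", "\n")


def rows_to_csv(rows):
    """Return CSV text with LF inside fields and CRLF between records."""
    buf = io.StringIO(newline="")  # no newline translation by the buffer
    writer = csv.writer(buf, lineterminator="\r\n")
    for row in rows:
        writer.writerow([normalise(value) for value in row])
    return buf.getvalue()


if __name__ == "__main__":
    text = rows_to_csv([
        ["Header1", "Header2", "Header3"],
        ["Value1", "Value2", "Value3 Line1\r\nValue3 Line2"],
    ])
    print(repr(text))
```

The writer quotes the third field automatically because it contains a line break.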

Yes that \r\n did it. I can confirm this works with Windows+Excel, OSX+Numbers and Google Docs.
Using \n (also tried \n) in a field enclosed with " , and using \r\n to divide rows, still didn't fix this problem for me in Excel 2010. I have tried ANSI and UTF8-with-BOM. No success.
But this is because I use | as field separator. If I use ; as field separator, the problem still exists when importing CSV data, but the problem disappears when opening the CSV by double clicking it in the File Explorer.
Ian's answer isn't working for me in Excel 2003/2010 on Windows 7. I tried using a hex editor to edit my UTF-8 BOM file and removed 0D (\r) from the '0D0A' bits (\r\n) for newlines within fields. But it doesn't work.
This answer worked for me (with zero modifications!) using Excel 2010 and Windows 7; also using perl v5.14.2 that ships with cygwin. My embedded newlines were all \n. Thanks
dtldarek

Recently I had a similar problem. I solved it by importing an HTML file; a baseline example looks like this:

<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns="http://www.w3.org/TR/REC-html40">
  <head>
    <style>
      <!--
      br {mso-data-placement:same-cell;}
      -->
    </style>
  </head>
  <body>
    <table>
      <tr>
        <td>first line<br/>second line</td>
        <td style="white-space:normal">first line<br/>second line</td>
      </tr>
    </table>
  </body>
</html>

I know, it is not a CSV, and might work differently for various versions of Excel, but I think it is worth a try.

I hope this helps ;-)
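
If the table has to be generated rather than typed by hand, a Python sketch along these lines could produce it. It reuses the namespaces and the `br` style rule from the example above; the helper names `cell` and `table_html` are hypothetical, and Excel's handling of the result has not been verified here.

```python
import html

# Opening tag and style rule copied from the hand-written example above.
HEAD = (
    '<html xmlns:v="urn:schemas-microsoft-com:vml"\n'
    'xmlns:o="urn:schemas-microsoft-com:office:office"\n'
    'xmlns:x="urn:schemas-microsoft-com:office:excel"\n'
    'xmlns="http://www.w3.org/TR/REC-html40">\n'
    "<head><style><!--\nbr {mso-data-placement:same-cell;}\n--></style></head>\n"
)


def cell(value):
    """Escape a value for HTML and turn its line breaks into <br/> tags."""
    escaped = html.escape(str(value))
    escaped = escaped.replace("\r\n", "\n").replace("\r", "\n")
    return "<td>" + escaped.replace("\n", "<br/>") + "</td>"


def table_html(rows):
    """Build the whole document: one <tr> per row, one <td> per value."""
    body = "\n".join(
        "<tr>" + "".join(cell(value) for value in row) + "</tr>" for row in rows
    )
    return HEAD + "<body><table>\n" + body + "\n</table></body>\n</html>\n"


if __name__ == "__main__":
    print(table_html([["first line\nsecond line", "a < b"]]))
```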


@GusDeCooL The wording of OP's first sentence "I need to generate a file for Excel, some of the values in this file contain multiple lines." suggests that perhaps it does not need to be a CSV file. Besides, the Q&A format applies to other readers as well, and it might be a viable choice for some of them (even if the OP had to use CSV). I find your downvote unreasonable (still, thank you for explaining why).
This was the best option for me, in fact; thank you for suggesting it!
np8

In Excel 365 while importing the file:

https://i.stack.imgur.com/3zsTX.png

-> Select File > Transform Data:

https://i.stack.imgur.com/ihfL2.png

In the Power Query Editor, right hand side at "Query Settings", under APPLIED STEPS, on "Source" row, click the "Settings icon"

https://i.stack.imgur.com/fGp0m.png

-> In the line break dropdown select Ignore line breaks inside quotes.

https://i.stack.imgur.com/kWA1q.png

Then press OK -> File -> Close & Load


I'm happy I scrolled down for this! I also added screenshots from the steps and added terms from the English Excel version.
Nice answer. It's weird that Excel doesn't use this by default.
OneSkyWalker

It is worth noting that when a .CSV file has fields wrapped in double quotes which contain line breaks, Excel will not import the .CSV file properly if the .CSV file is written in UTF-8 format. Excel treats the line break as if it were CR/LF and begins a new line. The spreadsheet is garbled. That seems to be true even if semi-colons are used as field delimiters (instead of commas).

The problem can be resolved by using Windows Notepad to edit the .CSV file, using File > Save As... to save the file, and before saving the file, changing the file encoding from UTF-8 to ANSI. Once the file is saved in ANSI format, then I find that Microsoft Excel 2013 running on Windows 7 Professional will import the file properly.
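
The Notepad step can also be scripted. The Python sketch below assumes that "ANSI" means Windows code page 1252 (on other locales the system code page differs) and uses a hypothetical helper name, `utf8_to_ansi`. Note that characters outside the code page cannot be represented and are replaced with "?", so this only suits data that fits the target code page.

```python
def utf8_to_ansi(data, codepage="cp1252"):
    """Re-encode UTF-8 bytes (with or without a BOM) into a Windows code page."""
    text = data.decode("utf-8-sig")  # "utf-8-sig" also strips a leading BOM
    # Characters the code page cannot represent become "?".
    return text.encode(codepage, errors="replace")


if __name__ == "__main__":
    source = 'Header1,Header2\r\ncafé,"Line1\nLine2"\r\n'.encode("utf-8")
    print(utf8_to_ansi(source))
```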


Esben

Newline inside a value seems to work if you use semicolon as separator, instead of comma or tab, and use quotes.

This works for me in both Excel 2010 and Excel 2000. However, surprisingly, it works only when you open the file as a new spreadsheet, not when you import it into an existing spreadsheet using the data import feature.
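
A semicolon-separated file of this kind might be produced as in the Python sketch below (the function name `semicolon_csv` is hypothetical). Because the `csv` module quotes any field that contains the delimiter, a semicolon inside the data stays within its field, at least as far as the file format is concerned.

```python
import csv
import io


def semicolon_csv(rows):
    """Return CSV text that uses ';' as the separator and CRLF between records."""
    buf = io.StringIO(newline="")
    csv.writer(buf, delimiter=";", lineterminator="\r\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(repr(semicolon_csv([["Value1", "a;b", "Line1\nLine2"]])))
```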


Yeah, but then I couldn't find an option to make the line end with a semicolon in Excel.
What if some of the actual text data contain semi colon? This would not work.
devuxer

On a PC, ASCII character #10 is what you want to place a newline within a value.

Once you get it into Excel, however, you need to make sure word wrap is turned on for the multi-line cells or the newline will appear as a square box.


Dror Bereznitsky

This will not work if you try to import the file into EXCEL.

Associate the file extension csv with EXCEL.EXE so you will be able to invoke EXCEL by double-clicking the csv file.

Here I place some text followed by the NewLine Char followed by some more text AND enclosing the whole string with double quotes.

Do not use a CR since EXCEL will place part of the string in the next cell.

""text" + NL + "text""

When you invoke EXCEL, you will see this. You may have to auto size the height to see it all. Where the line breaks will depend on the width of the cell.

2

DATE

Here's the code in Basic

CHR$(34) + "2" + CHR$(10) + "DATE" + CHR$(34)

Lisa Simpson

I found this and it has worked for me

$delimiter = ',';
$enc1 = '"';
$enc2 = '""';

Then where you need to have stuff enclosed

$myfile = '/path/to/myfile.csv';
// open for writing; 'w' erases any previous contents
$fp = fopen($myfile, 'w');
fwrite($fp, $enc1 . 'Column Heading 1' . $enc1 . $delimiter);
fwrite($fp, $enc1 . 'Column Heading 2' . $enc1 . $delimiter);

.....

fwrite($fp, $enc1 . 'Last Column Heading' . $enc1 . $delimiter . PHP_EOL);

Then when you need to write out something that itself contains ", such as HTML, double each embedded quote and enclose the field as usual:

fwrite($fp, $enc1 . str_replace($enc1, $enc2, $myhtmlstring) . $enc1 . $delimiter);

Each row ends with PHP_EOL. Close the file with fclose($fp) when you are done.

The end of the script prints out a link so that the user can download the file.

echo 'Click <a href="myfile.csv">here</a> to download file';

Stephen

UTF files that contain a BOM will cause Excel to treat new lines literally even if that field is surrounded by quotes. (Tested Excel 2008 Mac)

The solution is to make any new lines a carriage return (CHR 13) rather than a line feed.


Excel 2016 seems to treat my CSV file correctly even if it has a UTF8 BOM. However what made all the difference is using ';' as a field separator (which is what Excel does for all locales that have ',' as a decimal separator).
Durgpal Singh

Test this; it fully works for me. Put the following lines in an xxxx.csv file:

hola_x,="este es mi text1"&CHAR(10)&"I sigo escribiendo",hola_a

hola_y,="este es mi text2"&CHAR(10)&"I sigo escribiendo",hola_b

hola_z,="este es mi text3"&CHAR(10)&"I sigo escribiendo",hola_c

Open with excel.

In some cases it will open directly; otherwise you will need to use the Text to Columns conversion. Expand the column width and hit the Wrap Text button, or format the cells and activate Wrap Text.

Thanks for the other suggestions, but they did not work for me. I am in a pure Windows environment and did not want to play with Unicode or other funny things.

This way you are putting a formula from the CSV into Excel. There may be many uses for this method. (Note the = before the quotes.)
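
To generate such formula fields from arbitrary multi-line text, something like the following Python sketch could be used. The helper name `excel_multiline_formula` is hypothetical. As in the sample lines above, the field is written without CSV quoting, so the sketch assumes the text contains no commas; double quotes in the text are doubled, as Excel formula strings require.

```python
def excel_multiline_formula(text):
    """Build ="line1"&CHAR(10)&"line2"... from text that contains line breaks."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    # Inside an Excel string literal a double quote is written as two quotes.
    quoted = ['"' + line.replace('"', '""') + '"' for line in lines]
    return "=" + "&CHAR(10)&".join(quoted)


if __name__ == "__main__":
    field = excel_multiline_formula("este es mi text1\nI sigo escribiendo")
    # Assumes no commas in any field, since nothing is CSV-quoted here.
    print(",".join(["hola_x", field, "hola_a"]))
```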

PS: In your suggestions, please include some samples of the data, not only the code.


Duncan Wallace

Putting "\r" at the end of each row actually had the effect of line breaks in Excel, but in the .csv it vanished and left an ugly mess where each row was squashed against the next with no space and no line breaks.


Tuntable

For File Open only, the syntax is

 ,"one\n
 two",...

The critical thing is that there is no space after the first ",". Normally spaces are fine, and trimmed if the string is not quoted. But otherwise nasty. Took me a while to figure that out.

It does not seem to matter if the line is ended with \n or \r\n.

Make sure you expand the formula bar so you can actually see the text in the cell (got me after a long day...)

Now of course, File Open will not support UTF-8 Properly (unless one uses tricks).

Excel > Data > Get External Data > From Text

Can be set into UTF-8 mode (it is way down the File origin list). However, in that case the new lines do not seem to work and I know no way to fix that.

(One might think that after 30 years MS would get this stuff right.)


Sebastian

The way we do it (we use VB.Net) is to enclose the text containing new lines in Chr(34), which is the character representing the double quote, and to replace all CR-LF sequences with LF.


Tam Tran

Normally a new line is "\r\n". In my CSV, I replaced "\r" with an empty string. Here is the code in JavaScript:

cellValue = cellValue.replace(/\r/g, "")

When I open the CSV in MS Excel, it worked well. If a value has multiple lines, it will stay within 1 single cell in the Excel sheet.


Miroslav Glamuzina

You can do the following: "\"Value3 Line1 Value3 Line2\"". It works for me when generating a CSV file in Java.


Matt

Here is an interesting approach using JavaScript ...

  // Function.prototype.partial is not built in, so the splitter is defined directly here.
  String.prototype.csv = function () { return this.split(/,\s*/); };

  var results = ("Mugan, Jin, Fuu").csv();

  console.log(results[0]=="Mugan" &&
         results[1]=="Jin" &&
         results[2]=="Fuu",
         "The text values were split properly");

Shashi

Printing an HTML newline <br/> into the content and opening it in Excel will work fine in any version of Excel.


Meghana Chamarthy

You could use the keyboard shortcut ALT+Enter.

1. Select the cell you wish to edit.
2. Enter edit mode either by double-clicking it or pressing F2.
3. Press Alt+Enter. This will create a new line in the cell.


How do you integrate that into CSV file generation?
