ChatGPT解决这个技术问题 Extra ChatGPT

How can I output a UTF-8 CSV in PHP that Excel will read properly?

I've got this very simple thing that just outputs some stuff in CSV format, but it's got to be UTF-8. I open this file in TextEdit or TextMate or Dreamweaver and it displays UTF-8 characters properly, but if I open it in Excel it's doing this silly íÄ kind of thing instead. Here's what I've got at the head of my document:

header("content-type:application/csv;charset=UTF-8");
header("Content-Disposition:attachment;filename=\"CHS.csv\"");

This all seems to have the desired effect except Excel (Mac, 2008) doesn't want to import it properly. There's no options in Excel for me to "open as UTF-8" or anything, so … I'm getting a little annoyed.

I can't seem to find any clear solutions to this anywhere, despite a lot of people having the same problem. The thing I see the most is to include the BOM, but I can't exactly figure out how to do that. As you can see above I'm just echoing this data, I'm not writing any file. I can do that if I need to, I'm just not because there doesn't seem like a need for it at this point. Any help?

Update: I tried echoing the BOM as echo pack("CCC", 0xef, 0xbb, 0xbf); which I just pulled from a site that was trying to detect the BOM. But Excel just appends those three characters to the very first cell when it imports, and still messes up the special characters.

Excel doesn't provide an option to adjust the incoming file's character set? Are you 100% sure about that? I don't have a copy handy so I can't try out but I imagine there must be a drop down box somewhere.
This is Excel on a Mac - it seems more limited than Excel on the PC. There are no dropdowns at all in the Open dialog, beyond which file types to be able to open. I've looked everywhere. If it's there, it's obscure. I'd say 98% sure.
Microsoft office or openoffice ?
Microsoft is better in this regard, there is no way (that I've found) to make OpenOffice detect the charset, not even the BOM. Darn it.
BOM has no effect on Microsoft Excel 2008 for Mac, either.

t
the

I have the same (or similar) problem.

In my case, if I add a BOM to the output, it works:

header('Content-Encoding: UTF-8');
header('Content-type: text/csv; charset=UTF-8');
header('Content-Disposition: attachment; filename=Customers_Export.csv');
echo "\xEF\xBB\xBF"; // UTF-8 BOM

I believe this is a pretty ugly hack, but it worked for me, at least for Excel 2007 Windows. Not sure it'll work on Mac.


Ugly, but super helpful. Thank you.
I'm pretty sure this doesn't work well at all on OS X, unfortunately. I think it might actually display the BOM when you import (warning: hazy memories.)
This is correct. At least, I've tried prepending with a BOM, like in the example above, but it indeed just showed some funky characters at the beginning and kept on displaying the special chars wrong (tested in Excel 2011). This seemed to work best for me in all settings: stackoverflow.com/a/1648671/1005334 (using PHP).
I keep getting alerts on this question because apparently a lot of other people stumble across it. I wish I could mark this as correct, because it's been upvoted a lot. Unfortunately, I finally gave this solution a shot (abandoned that project long ago) and it's not working for me in Excel 2008 for Mac. Glad it's working for you. I've upvoted the solution. But it doesn't solve the Excel for Mac 2008 problem. If anyone has luck on the Mac, definitely let me (and apparently everyone) know.
I don't think Content-Encoding header should be used to tell the charset. It's rather about compression: developer.mozilla.org/en-US/docs/Web/HTTP/Headers/…
T
Tim Groeneveld

To quote a Microsoft support engineer,

Excel for Mac does not currently support UTF-8

Update, 2017: This is true of all versions of Microsoft Excel for Mac before Office 2016. Newer versions (from Office 365) do now support UTF-8.

In order to output UTF-8 content that Excel both on Windows and OS X will be able to successfully read, you will need to do two things:

Make sure that you convert your UTF-8 CSV text to UTF-16LE mb_convert_encoding($csv, 'UTF-16LE', 'UTF-8'); Make sure that you add the UTF-16LE byte order mark to the start of the file chr(255) . chr(254)

The next problem that appears only with Excel on OS X (but not Windows) will be when viewing a CSV file with comma separated values, Excel will render rows only with one row and all of the text along with the commas in the first row.

The way to avoid this is to use tabs as your separated value.

I used this function from the PHP comments (using tabs "\t" instead of commas) and it worked perfectly on OS X and Windows Excel.

Note that to fix an issue with an empty column as the end of a row, that I did have to change the line of code that says:

    $field_cnt = count($fields);

to

    $field_cnt = count($fields)-1;

As some of the other comments on this page say, other spreadsheet apps like OpenOffice Calc, Apple's own Numbers and Google Doc's Spreadsheet have no issues with UTF-8 files with commas.

See the table in this question for what works and doesn't work for Unicode CSV files in Excel

As a side note, I might add that if you are using Composer, you should have a look at adding League\Csv to your requires. League\Csv has a really nice API for building CSV files.

To use League\Csv with this method of creating CSV files, check out this example


Call me stupid, but I can't make this work. I converted my data using the mb_convert_encoding function and output the BOM preceding the file content, but I just get a single row of Chinese characters when I open the file in Excel. I'm still using commas rather than tabs for now, but presumably that should not cause what I'm seeing.
This solution also solved the UTF8 issue for me on Mac and Windows. I also found that adding a line with sep=; or sep=, to the CSV file tells Excel how the CSV is seperated and columns will be created correctly again.
I found that things came out garbled if I started the string as just the BOM, then on another line added the "sep=,\n", then on another line added the data. Things were fine if I did it all in one line ($csv = chr(255) . chr(254) /* BOM */ . "sep=,\n" . mb_convert_encoding($csv, 'UTF-16LE', 'UTF-8');). Everything looks great now on Mac. I don't have Windows to test with.
Only thing that fully worked for me. The inconsistency of Microsoft never ceases to amaze me. Saving a normal csv file uses "," as delimiter but if theres a BOM it saves using "\t" for some reason. The C in .csv stands for "COMMA" dammit. It really is not hard to do.
In my Excel Mac 2011, putting chr(255).chr(254) in the beginning of the file turns all characters into chinese (or korean, can't really tell the difference). The only solution that worked for me was the one using iconv
J
Jazzer

Here is how I did it (that's to prompt browser to download the csv file):

header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename=file.csv');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
echo "\xEF\xBB\xBF"; // UTF-8 BOM
echo $csv_file_content;
exit();

The only thing it fixed UTF8 encoding problem in CSV preview when you hit space bar on Mac.. but not in Excel Mac 2008... don't know why


UTF-8 BOM line of the code save my day with exporting to xls
I was looking for a way to output directly to the browser as utf8, when I came across this post, Here's my solution with credits to this post for the BOM and the binary transfer header. $outstream = fopen( "php://output", 'w' ); fputs( $outstream, "\xEF\xBB\xBF" ); foreach ( $export as $fields ) { fputcsv( $outstream, $fields ); } fclose( $outstream );
Typo - remove the double )).
C
Codemole

In my case following works very nice to make CSV file with UTF-8 chars displayed correctly in Excel.

$out = fopen('php://output', 'w');
fprintf($out, chr(0xEF).chr(0xBB).chr(0xBF));
fputcsv($out, $some_csv_strings);

The 0xEF 0xBB 0xBF BOM header will let Excel know the correct encoding.


big up for you bro <3
fprintf($out, chr(0xEF).chr(0xBB).chr(0xBF)); <-- this is exactly what I needed. It works like a charm.
thanks dude, this one just worked fine while other solutions didn't.
t
trojan

I just dealt with the same problem, and came up with two solutions.

Use the PHPExcel class as suggested by bpeterson76. Using this class generates the most widely compatible file, I was able to generate a file from UTF-8 encoded data that opened fine in Excel 2008 Mac, Excel 2007 Windows, and Google Docs. The biggest problem with using PHPExcel is that it's slow and uses a lot of memory, which isn't an issue for reasonably sized files, but if your Excel/CSV file has hundreds or thousands of rows, this library becomes unusable. Here is a PHP method that will take some TSV data and output an Excel file to the browser, note that it uses the Excel5 Writer, which means the file should be compatible with older versions of Excel, but I no longer have access to any, so I cannot test them. function excel_export($tsv_data, $filename) { $export_data = preg_split("/\n/", $tsv_data); foreach($export_data as &$row) { $row = preg_split("/\t/", $row); } include("includes/PHPExcel.php"); include('includes/PHPExcel/Writer/Excel5.php'); $objPHPExcel = new PHPExcel(); $objPHPExcel->setActiveSheetIndex(0); $sheet = $objPHPExcel->getActiveSheet(); $row = '1'; $col = "A"; foreach($export_data as $row_cells) { if(!is_array($row_cells)) { continue; } foreach($row_cells as $cell) { $sheet->setCellValue($col.$row, $cell); $col++; } $row += 1; $col = "A"; } $objWriter = new PHPExcel_Writer_Excel5($objPHPExcel); header('Content-Type: application/vnd.ms-excel'); header('Content-Disposition: attachment;filename="'.$filename.'.xls"'); header('Cache-Control: max-age=0'); $objWriter->save('php://output'); exit; } Because of the efficiency issues with PHPExcel, I also had to figure out how to generate a UTF-8 & Excel compatible CSV or TSV file. The best I could come up with was a file that was compatible with Excel 2008 Mac, and Excel 2007 PC, but not Google Docs, which is good enough for my application. I found the solution here, specifically, this answer, but you should also read the accepted answer as it explains the problem. Here is the PHP code I used, note that I am using tsv data (tabs as delimiters instead of commas): header ( 'HTTP/1.1 200 OK' ); header ( 'Date: ' . date ( 'D M j G:i:s T Y' ) ); header ( 'Last-Modified: ' . date ( 'D M j G:i:s T Y' ) ); header ( 'Content-Type: application/vnd.ms-excel') ; header ( 'Content-Disposition: attachment;filename=export.csv' ); print chr(255) . chr(254) . mb_convert_encoding($tsv_data, 'UTF-16LE', 'UTF-8'); exit;


i am facing same issue on Windows-7 Excel 2007 & tried all suggestion but fail :(
The UTF-16LE solution is the one that worked. Note that you have to format the tsv correctly, and it has to be tsc, so tab delimited, not comma delimited. See also Marc's post here: stackoverflow.com/a/1648671/1697370
R
Reza Mamun

I was having the same issue and it was solved like below:

    header('Content-Encoding: UTF-8');
    header('Content-Type: text/csv; charset=utf-8' );
    header(sprintf( 'Content-Disposition: attachment; filename=my-csv-%s.csv', date( 'dmY-His' ) ) );
    header('Content-Transfer-Encoding: binary');
    header('Expires: 0');
    header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
    header('Pragma: public');

    $df = fopen( 'php://output', 'w' );

    //This line is important:
    fputs( $df, "\xEF\xBB\xBF" ); // UTF-8 BOM !!!!!

    foreach ( $rows as $row ) {
        fputcsv( $df, $row );
    }
    fclose($df);
    exit();

j
j0k

Excel doesn't support UTF-8. You have to encode your UTF-8 text into UCS-2LE.

mb_convert_encoding($output, 'UCS-2LE', 'UTF-8');

Better use UTF-16LE as suggested above, that covers every character in existence.
I have noticed this mb_convert_encoding is working fine and showing diacritics correctly in my exported XLS file. I have mysql default configurations and apache + php set to work with utf-8
B
Ben Saufley

To follow up on this:

It appears that the problem is simply with Excel on the Mac. It's not how I'm generating the files, because even generating CSVs from Excel is breaking them. I save as CSV, and reimport, and all the characters are messed up.

So … there doesn't appear to be a correct answer to this. Thanks for all the suggestions.

I would say that from all I've read, @Daniel Magliola's suggestion about the BOM would probably be the best answer for some other computer. But it still doesn't solve my problem.


Since UTF8 support is inconsistent across Mac vs Windows, the best bet is to use a different encoding altogether. I've found cp1252 generally plays well with Microsoft stuff. en.wikipedia.org/wiki/Windows-1252
Ben, there is a correct answer to this question :) See mine!
C
Community

This works fine in excel for both Windows and also Mac OS.

Fix issues in excel that are not displaying characters containing diacritics, cyrillic letters, greek letter and currency symbols.

function writeCSV($filename, $headings, $data) {   

    //Use tab as field separator
    $newTab  = "\t";
    $newLine  = "\n";

    $fputcsv  =  count($headings) ? '"'. implode('"'.$newTab.'"', $headings).'"'.$newLine : '';

    // Loop over the * to export
    if (! empty($data)) {
      foreach($data as $item) {
        $fputcsv .= '"'. implode('"'.$newTab.'"', $item).'"'.$newLine;
      }
    }

    //Convert CSV to UTF-16
    $encoded_csv = mb_convert_encoding($fputcsv, 'UTF-16LE', 'UTF-8');

    // Output CSV-specific headers
    header('Set-Cookie: fileDownload=true; path=/'); //This cookie is needed in order to trigger the success window.
    header("Pragma: public");
    header("Expires: 0");
    header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
    header("Cache-Control: private",false);
    header("Content-Type: application/octet-stream");
    header("Content-Disposition: attachment; filename=\"$filename.csv\";" );
    header("Content-Transfer-Encoding: binary");
    header('Content-Length: '. strlen($encoded_csv));
    echo chr(255) . chr(254) . $encoded_csv; //php array convert to csv/excel

    exit;
}

this solution allowed me to make export working for Google Sheets also (some others failed)
This solution is almost perfect, I just found one problem when one cell includes a double quote " , it is skipped..
C
Cybot

The CSV File must include a Byte Order Mark.

Or as suggested and workaround just echo it with the HTTP body


R
Rejaul

Add:

fprintf($file, chr(0xEF).chr(0xBB).chr(0xBF));

Or:

fprintf($file, "\xEF\xBB\xBF");

Before writing any content to CSV file.

Example:

<?php
$file = fopen( "file.csv", "w");
fprintf( $file, "\xEF\xBB\xBF");
fputcsv( $file, ["english", 122, "বাংলা"]);
fclose($file);

R
Rvanlaak

Converting already utf-8 encoded text by using mb_convert_encoding is not needed. Just add three characters in front of the original content:

$newContent = chr(239) . chr(187) . chr(191) . $originalContent

For me this resolved the problem of special characters in csv files.


Unbelievable; I tried atleast 10 other solutions and the only solution for me seems to be this one, when combined with UTF-8 BOM via fputs.
What are these characters? Must be some magic spell. Only solution that worked for me after trying every single other answer here.
These characters combined are called the "Byte Object Mark", which tells applications that check for it the rest of the content should be respected as UTF-8.
t
the

Since UTF8 encoding doesn't play well with Excel. You can convert the data to another encoding type using iconv().

e.g.

iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $value),

This was the only solution that worked for me with Excel Mac 2011
S
Sagar72427
**This is 100% works fine in excel for both Windows7,8,10 and also All Mac OS.**
//Fix issues in excel that are not displaying characters containing diacritics, cyrillic letters, Greek letter and currency symbols.

function generateCSVFile($filename, $headings, $data) {

    //Use tab as field separator
    $newTab  = "\t";
    $newLine  = "\n";

    $fputcsv  =  count($headings) ? '"'. implode('"'.$newTab.'"', $headings).'"'.$newLine : '';

    // Loop over the * to export
    if (! empty($data)) {
      foreach($data as $item) {
        $fputcsv .= '"'. implode('"'.$newTab.'"', $item).'"'.$newLine;
      }
    }

    //Convert CSV to UTF-16
    $encoded_csv = mb_convert_encoding($fputcsv, 'UTF-16LE', 'UTF-8');

    // Output CSV-specific headers
    header('Set-Cookie: fileDownload=true; path=/'); //This cookie is needed in order to trigger the success window.
    header("Pragma: public");
    header("Expires: 0");
    header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
    header("Cache-Control: private",false);
    header("Content-Type: application/octet-stream");
    header("Content-Disposition: attachment; filename=\"$filename.csv\";" );
    header("Content-Transfer-Encoding: binary");
    header('Content-Length: '. strlen($encoded_csv));
    echo chr(255) . chr(254) . $encoded_csv; //php array convert to csv/excel
    exit;
}

Generally, answers are much more helpful if they include an explanation of what the code is intended to do, and why that solves the problem without introducing others.
f
foued611

you can convert your CSV String with iconv. for example:

$csvString = "Möckmühl;in Möckmühl ist die Hölle los\n";
file_put_contents('path/newTest.csv',iconv("UTF-8", "ISO-8859-1//TRANSLIT",$csvString) );

Great sample texts ;)
H
Ho Ha

As I investigated and I found that UTF-8 is not working well on MAC and Windows so I tried with Windows-1252 , it supports well on both of them but you must select type of encoding on ubuntu. Here is my code$valueToWrite = mb_convert_encoding($value, 'Windows-1252');

$response->headers->set('Content-Type', $mime . '; charset=Windows-1252');
    $response->headers->set('Pragma', 'public');
    $response->headers->set('Content-Endcoding','Windows-1252');
    $response->headers->set('Cache-Control', 'maxage=1');
    $response->headers->set('Content-Disposition', $dispositionHeader);
    echo "\xEF\xBB\xBF"; // UTF-8 BOM

r
roNn23

You have to use the encoding "Windows-1252".

header('Content-Encoding: Windows-1252');
header('Content-type: text/csv; charset=Windows-1252');
header("Content-Disposition: attachment; filename={$filename}");

Maybe you have to convert your strings:

private function convertToWindowsCharset($string) {
  $encoding = mb_detect_encoding($string);

  return iconv($encoding, "Windows-1252", $string);
}

C
Community

You may append the 3 bytes to the file before exporting, it works for me . Before doing that system only work in Windows and HP -UX but failed in Linux.

FileOutputStream fStream = new FileOutputStream( f );
final byte[] bom = new byte[] { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };
OutputStreamWriter writer = new OutputStreamWriter( fStream, "UTF8" );
fStream.write( bom );

Have a UTF-8 BOM (3 bytes, hex EF BB BF) at the start of the file. Otherwise Excel will interpret the data according to your locale's default encoding (e.g. cp1252) instead of utf-8

Generating CSV file for Excel, how to have a newline inside a value


S
Sebastien Horin

I'm on Mac, in my case I just had to specify the separator with "sep=;\n" and encode the file in UTF-16LE like this:

$data = "sep=;\n" .mb_convert_encoding($data, 'UTF-16LE', 'UTF-8');

B
Barani r

For me none of the solution above worked. Below is what i did to resolve the issue: modify the value using this function in the PHP code:

$value = utf8_encode($value);

This output values properly in an excel sheet.


A
Alain

I had this same problem when I had an Excel VBA routine that imported data. Since CSV is a plain text format, I was working around this by programatically opening the data in a simple file editor like wordpad, and re-saving it as unicode text, or copying it to the clipboard from there and pasting it to Excel. If excel doesn't automatically parse the CSV into cells, this is easily remedied using the built in "Text to Columns" feature.


I've tried that! But it's already encoded as unicode text … saving it as unicode text doesn't change anything. How would I save it as plain text without it improperly interpreting all of the special characters?
I think by "saving it as Unicode text", Alain might mean "UTF-16LE". Windows often (sadly) uses "Unicode" to incorrectly refer to UTF-16LE or UTF-16LE with a leading BOM. Notepad's File -> Save dialog uses "Unicode" in this manner.
A
Alain

Does the problem still occur when you save it as a .txt file and them open that in excel with comma as a delimiter?

The problem might not be the encoding at all, it might just be that the file isn't a perfect CSV according to excel standards.


As far as I know, the CSV formatting is fine. I can make a "perfect" CSV in TextEdit, no problem, or even rename a txt as a csv, so it's not missing anything secret - CSVs are just text files. Plus, the formatting is all perfect in Excel, it's just the special characters that break. But just in case, I tried your suggestion, and it's sadly producing the same problems.
From the "WTF were they thinking" dept.: be aware that according to Microsoft the correct field separator is locale-dependent. A 'comma-separated' file might so require fields to be actually semicolon-separated in some locales - and there is nothing you can do except suggesting OpenOffice
that's pretty crazy - unfortunately I'm not having trouble properly delimiting fields. I'm just having trouble with these special characters.
B
Brian

I just tried these headers and got Excel 2013 on a Windows 7 PC to import the CSV file with special characters correctly. The Byte Order Mark (BOM) was the final key that made it work.

header('Content-Encoding: UTF-8');
    header('Content-type: text/csv; charset=UTF-8');
    header("Content-disposition: attachment; filename=filename.csv");
    header("Pragma: public");
    header("Expires: 0");
    echo "\xEF\xBB\xBF"; // UTF-8 BOM

Yes, that is correct, it will work correctly in Excel for Windows, but as explained in my answer, it will not work on Excel for OS X.
P
Poul

EASY solution for Mac Excel 2008: I struggled with this soo many times, but here was my easy fix: Open the .csv file in Textwrangler which should open your UTF-8 chars correctly. Now in the bottom status bar change the file format from "Unicode (UTF-8)" to "Western (ISO Latin 1)" and save the file. Now go to your Mac Excel 2008 and select File > Import > Select csv > Find your file > in File origin select "Windows (ANSI)" and voila the UTF-8 chars are showing correctly. At least it does for me...


B
BoRnbeBaD

I use this and it works

header('Content-Description: File Transfer');
header('Content-Type: text/csv; charset=UTF-16LE');
header('Content-Disposition: attachment; filename=file.csv');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
// output headers so that the file is downloaded rather than displayed
// create a file pointer connected to the output stream
$output = fopen('php://output', 'w');
fputs( $output, "\xEF\xBB\xBF" );
// output the column headings
fputcsv($output, array('Thông tin khách hàng đăng ký'));
// fetch the data
$setutf8 = "SET NAMES utf8";
$q = $conn->query($setutf8);
$setutf8c = "SET character_set_results = 'utf8', character_set_client =
'utf8', character_set_connection = 'utf8', character_set_database = 'utf8',
character_set_server = 'utf8'";
$qc = $conn->query($setutf8c);
$setutf9 = "SET CHARACTER SET utf8";
$q1 = $conn->query($setutf9);
$setutf7 = "SET COLLATION_CONNECTION = 'utf8_general_ci'";
$q2 = $conn->query($setutf7);
$sql = "SELECT id, name, email FROM myguests";
$rows = $conn->query($sql);
$arr1= array();
if ($rows->num_rows > 0) {
// output data of each row
while($row = $rows->fetch_assoc()) {
    $rcontent = " Name: " . $row["name"]. " - Email: " . $row["email"];  
    $arr1[]["title"] =  $rcontent;
}
} else {
     echo "0 results";
}
$conn->close();
// loop over the rows, outputting them
foreach($arr1 as $result1):
   fputcsv($output, $result1);
endforeach;