The contents of file.txt are:
5 3
6 4
7 1
10 5
11 6
12 3
12 4
Where 5 3
is a coordinate pair. How do I process this data line by line in C++?
I am able to get the first line, but how do I get the next line of the file?
ifstream myfile;
myfile.open ("file.txt");
First, make an ifstream
:
#include <fstream>
std::ifstream infile("thefile.txt");
The two standard methods are:
Assume that every line consists of two numbers and read token by token: int a, b;
while (infile >> a >> b)
{
// process pair (a,b)
}
Line-based parsing, using string streams: #include
You shouldn't mix (1) and (2), since the token-based parsing doesn't gobble up newlines, so you may end up with spurious empty lines if you use getline()
after token-based extraction got you to the end of a line already.
Use ifstream
to read data from a file:
std::ifstream input( "filename.ext" );
If you really need to read line by line, then do this:
for( std::string line; getline( input, line ); )
{
...for each line in input...
}
But you probably just need to extract coordinate pairs:
int x, y;
input >> x >> y;
Update:
In your code you use ofstream myfile;
, however the o
in ofstream
stands for output
. If you want to read from the file (input) use ifstream
. If you want to both read and write use fstream
.
Reading a file line by line in C++ can be done in some different ways.
[Fast] Loop with std::getline()
The simplest approach is to open an std::ifstream and loop using std::getline() calls. The code is clean and easy to understand.
#include <fstream>
std::ifstream file(FILENAME);
if (file.is_open()) {
std::string line;
while (std::getline(file, line)) {
// using printf() in all tests for consistency
printf("%s", line.c_str());
}
file.close();
}
[Fast] Use Boost's file_description_source
Another possibility is to use the Boost library, but the code gets a bit more verbose. The performance is quite similar to the code above (Loop with std::getline()).
#include <boost/iostreams/device/file_descriptor.hpp>
#include <boost/iostreams/stream.hpp>
#include <fcntl.h>
namespace io = boost::iostreams;
void readLineByLineBoost() {
int fdr = open(FILENAME, O_RDONLY);
if (fdr >= 0) {
io::file_descriptor_source fdDevice(fdr, io::file_descriptor_flags::close_handle);
io::stream <io::file_descriptor_source> in(fdDevice);
if (fdDevice.is_open()) {
std::string line;
while (std::getline(in, line)) {
// using printf() in all tests for consistency
printf("%s", line.c_str());
}
fdDevice.close();
}
}
}
[Fastest] Use C code
If performance is critical for your software, you may consider using the C language. This code can be 4-5 times faster than the C++ versions above, see benchmark below
FILE* fp = fopen(FILENAME, "r");
if (fp == NULL)
exit(EXIT_FAILURE);
char* line = NULL;
size_t len = 0;
while ((getline(&line, &len, fp)) != -1) {
// using printf() in all tests for consistency
printf("%s", line);
}
fclose(fp);
if (line)
free(line);
Benchmark -- Which one is faster?
I have done some performance benchmarks with the code above and the results are interesting. I have tested the code with ASCII files that contain 100,000 lines, 1,000,000 lines and 10,000,000 lines of text. Each line of text contains 10 words in average. The program is compiled with -O3
optimization and its output is forwarded to /dev/null
in order to remove the logging time variable from the measurement. Last, but not least, each piece of code logs each line with the printf()
function for consistency.
The results show the time (in ms) that each piece of code took to read the files.
The performance difference between the two C++ approaches is minimal and shouldn't make any difference in practice. The performance of the C code is what makes the benchmark impressive and can be a game changer in terms of speed.
10K lines 100K lines 1000K lines
Loop with std::getline() 105ms 894ms 9773ms
Boost code 106ms 968ms 9561ms
C code 23ms 243ms 2397ms
https://i.stack.imgur.com/fKKqc.png
std::cout
vs printf
.
printf()
function in all cases for consistency. I have also tried using std::cout
in all cases and this made absolutely no difference. As I have just described in the text, the output of the program goes to /dev/null
so the time to print the lines is not measured.
getline
in C is a gnu extension (now added to POSIX). It's not a standard C function.
Since your coordinates belong together as pairs, why not write a struct for them?
struct CoordinatePair
{
int x;
int y;
};
Then you can write an overloaded extraction operator for istreams:
std::istream& operator>>(std::istream& is, CoordinatePair& coordinates)
{
is >> coordinates.x >> coordinates.y;
return is;
}
And then you can read a file of coordinates straight into a vector like this:
#include <fstream>
#include <iterator>
#include <vector>
int main()
{
char filename[] = "coordinates.txt";
std::vector<CoordinatePair> v;
std::ifstream ifs(filename);
if (ifs) {
std::copy(std::istream_iterator<CoordinatePair>(ifs),
std::istream_iterator<CoordinatePair>(),
std::back_inserter(v));
}
else {
std::cerr << "Couldn't open " << filename << " for reading\n";
}
// Now you can work with the contents of v
}
int
tokens from the stream in operator>>
? How can one make it work with a backtracking parser (i.e. when operator>>
fails, roll back the stream to previous position end return false or something like that)?
int
tokens, then the is
stream will evaluate to false
and the reading loop will terminate at that point. You can detect this within operator>>
by checking the return value of the individual reads. If you want to roll back the stream, you would call is.clear()
.
operator>>
it is more correct to say is >> std::ws >> coordinates.x >> std::ws >> coordinates.y >> std::ws;
since otherwise you are assuming that your input stream is in the whitespace-skipping mode.
Expanding on the accepted answer, if the input is:
1,NYC
2,ABQ
...
you will still be able to apply the same logic, like this:
#include <fstream>
std::ifstream infile("thefile.txt");
if (infile.is_open()) {
int number;
std::string str;
char c;
while (infile >> number >> c >> str && c == ',')
std::cout << number << " " << str << "\n";
}
infile.close();
This answer is for visual studio 2017 and if you want to read from text file which location is relative to your compiled console application.
first put your textfile (test.txt in this case) into your solution folder. After compiling keep text file in same folder with applicationName.exe
C:\Users\"username"\source\repos\"solutionName"\"solutionName"
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
ifstream inFile;
// open the file stream
inFile.open(".\\test.txt");
// check if opening a file failed
if (inFile.fail()) {
cerr << "Error opeing a file" << endl;
inFile.close();
exit(1);
}
string line;
while (getline(inFile, line))
{
cout << line << endl;
}
// close the file stream
inFile.close();
}
Although there is no need to close the file manually but it is good idea to do so if the scope of the file variable is bigger:
ifstream infile(szFilePath);
for (string line = ""; getline(infile, line); )
{
//do something with the line
}
if(infile.is_open())
infile.close();
This is a general solution to loading data into a C++ program, and uses the readline function. This could be modified for CSV files, but the delimiter is a space here.
int n = 5, p = 2;
int X[n][p];
ifstream myfile;
myfile.open("data.txt");
string line;
string temp = "";
int a = 0; // row index
while (getline(myfile, line)) { //while there is a line
int b = 0; // column index
for (int i = 0; i < line.size(); i++) { // for each character in rowstring
if (!isblank(line[i])) { // if it is not blank, do this
string d(1, line[i]); // convert character to string
temp.append(d); // append the two strings
} else {
X[a][b] = stod(temp); // convert string to double
temp = ""; // reset the capture
b++; // increment b cause we have a new number
}
}
X[a][b] = stod(temp);
temp = "";
a++; // onto next row
}
Success story sharing
int a, b; char c; while ((infile >> a >> c >> b) && (c == ','))
while(getline(f, line)) { }
construct and regarding error handling please have a look at this (my) article: gehrcke.de/2011/06/… (I think I do not need to have bad conscience posting this here, it even slightly pre-dates this answer).