
How to read a text file into a string variable and strip newlines?

Code:

with open("data.txt", "r") as f:
    data = f.readlines()

Input file:

ABC
DEF

However, data contains trailing \ns:

data == ['ABC\n', 'DEF']

How do I get:

data == 'ABCDEF'
The title and the question are inconsistent. Do you really want to get rid of the \n as well?
Do you really want to remove newlines from the file/string contents, or are you just confused by the escape sequences in your print output and actually want to keep the newlines, but not have them displayed as "\n"?
Do you really want to read the entire text into one string variable? By "strip newlines", do you really mean replacing them with an empty string? This would mean that the last word of a line and the first word of the next line are joined and not separated. I don't know your use case, but this seems to be a strange requirement. I might have another answer if you explain what you intend to do with the data you read in.

OneCricketeer

You could use:

with open('data.txt', 'r') as file:
    data = file.read().replace('\n', '')

Or, if the file content is guaranteed to be a single line:

with open('data.txt', 'r') as file:
    data = file.read().rstrip()

Is there a downside in just writing open("data.txt").read().replace('\n','') instead?
Yes, your version does not explicitly close the file, that will then be delayed until the garbage collector runs or the program terminates. The 'with' statement usually encapsulates some setup/teardown open/close actions.
Thanks for the clarification. So, it seems that my version might be ok for small scripts - but OTOH it should preferably be avoided altogether to not make it a habit.
@tuomassalo it is a huge PITA in the test/debug process, as it won't clean up the open file handles if you have to terminate prematurely or it runs into an exception.
No, rstrip('\n') will only remove the newline from the last line, replace('\n','') removes it everywhere (essentially making the whole file one line)
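A minimal sketch of the difference between the two calls, assuming the two-line sample file from the question. The temporary directory and the names sample_path, joined and trimmed are made up for this illustration:

```python
import os
import tempfile

# Write the sample file from the question to a temporary location.
# The path is hypothetical and only exists for this sketch.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "data.txt")
with open(sample_path, "w") as f:
    f.write("ABC\nDEF\n")

with open(sample_path) as f:
    text = f.read()

# replace() removes every newline, so all lines are joined together.
joined = text.replace("\n", "")

# rstrip("\n") only removes the newline(s) at the very end of the string.
trimmed = text.rstrip("\n")

print(repr(joined))   # 'ABCDEF'
print(repr(trimmed))  # 'ABC\nDEF'
```

So rstrip is only equivalent to replace when the file really holds a single line.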
Jonathan Sudiaman

In Python 3.5 or later, using pathlib you can copy text file contents into a variable and close the file in one line:

from pathlib import Path
txt = Path('data.txt').read_text()

and then you can use str.replace to remove the newlines:

txt = txt.replace('\n', '')

This is so far the most elegant solution. I prefer to have a oneliner solution like R's read_file
When I use pathlib, I get this error UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 148: character maps to <undefined>. And as far as I search, it seems I can't set encoding to pathlib, so I have to read the content traditionally with the os module.
@AliAkhtari Per the docs, the signature is read_text(encoding=None, errors=None). You should be able to pass the encoding there?
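As a hedged sketch of what that comment suggests: read_text accepts an encoding argument, so the decoding error can usually be avoided by naming the file's actual encoding. The sample text, the temporary path and the choice of UTF-8 are assumptions for illustration only:

```python
import tempfile
from pathlib import Path

# Create a small UTF-8 file in a temporary directory (hypothetical path).
sample_path = Path(tempfile.mkdtemp()) / "data.txt"
sample_path.write_text("café\nmenu\n", encoding="utf-8")

# Pass the encoding explicitly instead of relying on the platform default.
txt = sample_path.read_text(encoding="utf-8")
txt = txt.replace("\n", "")
```

If the real encoding is unknown, the errors argument (for example errors="replace") is another option, at the cost of losing the undecodable bytes.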
mit

You can read from a file in one line:

str = open('very_Important.txt', 'r').read()

Please note that this does not close the file explicitly.

CPython will close the file when it exits as part of the garbage collection.

But other python implementations won't. To write portable code, it is better to use with or close the file explicitly. Short is not always better. See https://stackoverflow.com/a/7396043/362951
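A small sketch of the two portable alternatives mentioned above, assuming a throwaway file in a temporary directory (the path and variable names are made up for this example):

```python
import os
import tempfile

# Hypothetical sample file, created only for this sketch.
sample_path = os.path.join(tempfile.mkdtemp(), "very_Important.txt")
with open(sample_path, "w") as f:
    f.write("ABC\nDEF\n")

# Option 1: the with statement closes the file when the block ends,
# even if an exception is raised inside it.
with open(sample_path) as f:
    contents = f.read()
closed_after_with = f.closed  # True

# Option 2: close explicitly, using try/finally to cover exceptions.
f2 = open(sample_path)
try:
    contents_again = f2.read()
finally:
    f2.close()
```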


This is anti-idiomatic and not recommended. open should be used within a with ... as statement.
@J.C can you explain the problem ? Is this just a question of custom or does the with ... as statement bring something ?
@Titou the issue is that open.read() doesn't close the file so we either need with ... as or str.close() as demonstrated in Pedro's answer. More on the importance of closing files here
@JBallin. This idiom clearly removes a source of error. Thanks !
this is also bad because you've just shadowed str() from builtins
Pedro Lobito

To join all lines into a string and remove new lines, I normally use :

with open('t.txt') as f:
  s = " ".join([l.rstrip("\n") for l in f]) 

It is giving a UnicodeDecodeError in my code. See stackoverflow.com/q/18649512/9339242
you may need to specify the character encoding.
will remove trailing white space as well so perhaps better to s = " ".join([l.replace("\n", "") for l in f])
@gelonida rstrip also supports specifying the character(s) you want to remove, thanks for the tip.
thanks as well. I forgot that rstrip() also has an argument. So your code is probably faster and operational except for some really weird cases where one line would contain a mix of trailing " " and "\n" like "funny line \n \n "
MagerValp
with open("data.txt") as myfile:
    data="".join(line.rstrip() for line in myfile)

join() will join a list of strings, and rstrip() with no arguments will trim whitespace, including newlines, from the end of strings.
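To illustrate the point about rstrip() with no arguments, here is a short sketch comparing it with rstrip("\n"). The sample lines are invented for the example:

```python
# Sample lines with trailing spaces and a tab before the newline.
lines = ["first line   \n", "second\t\n", "third"]

# rstrip() with no arguments removes all trailing whitespace.
stripped_all = "".join(line.rstrip() for line in lines)

# rstrip("\n") removes only trailing newline characters.
stripped_newline_only = "".join(line.rstrip("\n") for line in lines)

print(repr(stripped_all))           # 'first linesecondthird'
print(repr(stripped_newline_only))  # 'first line   second\tthird'
```

Which one is preferable depends on whether trailing spaces and tabs are meaningful in the file.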


Loochie

This can be done using the read() method :

text_as_string = open('Your_Text_File.txt', 'r').read()

Or as the default mode itself is 'r' (read) so simply use,

text_as_string = open('Your_Text_File.txt').read()

Note that this keeps the file open indefinitely.
gelonida

I'm surprised nobody mentioned splitlines() yet.

with open ("data.txt", "r") as myfile:
    data = myfile.read().splitlines()

Variable data is now a list that looks like this when printed:

['LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN', 'GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE']

Note there are no newlines (\n).

At that point, it sounds like you want to print back the lines to console, which you can achieve with a for loop:

for line in data:
    print(line)

gelonida

I have fiddled around with this for a while and prefer to use read in combination with rstrip. Without rstrip("\n"), the string keeps the file's trailing newline, which in most cases is not very useful.

with open("myfile.txt") as f:
    file_content = f.read().rstrip("\n")
    print(file_content)

Chris Eberle

It's hard to tell exactly what you're after, but something like this should get you started:

with open ("data.txt", "r") as myfile:
    data = ' '.join([line.replace('\n', '') for line in myfile.readlines()])

reduce(lambda x,y : x+y.rstrip('\n'), ['a\n', "b\n", 'c'], "") is a lot cooler :D
@Duncan what would you suggest?
data = ' '.join(line.replace('\n', '') for line in myfile) or MagerValp's version.
My Car

Try this:

with open("my_text_file.txt", "r") as file:
    data = file.read().replace("\n", "")

Michael Smith

You can compress this into one or two lines of code!

content = open('filepath','r').read().replace('\n',' ')
print(content)

if your file reads:

hello how are you?
who are you?
blank blank

python output

hello how are you? who are you? blank blank

I like this solution as the last word of a line will be separated by a space from the first word of the next line. However, I would suggest using the with statement. So something like: with open("filepath", "r") as fin: content = fin.read().replace("\n", " ") But of course it's not sure whether this is needed by the original poster.
Edward D'Souza

This is a one line, copy-pasteable solution that also closes the file object:

_ = open('data.txt', 'r'); data = _.read(); _.close()

OrionMD

You can also strip each line and concatenate into a final string.

myfile = open("data.txt","r")
data = ""
lines = myfile.readlines()
for line in lines:
    data = data + line.strip();

This would also work out just fine.


data = data + line.strip(); can be reduced to data += line.strip();
Very inefficient for huge files (a lot of memory allocations and memory copies will take place). Better to create a list of stripped lines and then use " ".join()
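A sketch of the list-then-join approach suggested in that comment. io.StringIO stands in for an open file so the example is self-contained, and the sample content is an assumption:

```python
import io

# io.StringIO behaves like an open text file for the purposes of this sketch.
fake_file = io.StringIO("ABC\nDEF\nGHI\n")

# Collect the stripped lines in a list instead of growing a string.
stripped_lines = []
for line in fake_file:
    stripped_lines.append(line.strip())

# Join once at the end: with or without a separator, as needed.
data = "".join(stripped_lines)
data_spaced = " ".join(stripped_lines)

print(data)         # ABCDEFGHI
print(data_spaced)  # ABC DEF GHI
```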
gerardw

python3: Google "list comprehension" if the square bracket syntax is new to you.

with open('data.txt') as f:
    lines = [line.strip('\n') for line in list(f)]

Very pythonic and worked for me quite well, although I haven't tested on large files yet. Thank you!
I'm going to be retracting my upvote because strip also strips whitespace, which may not be the desired behavior. However, I still think a modified version of this would be good.
lines = list(map(str.strip, f))?
Machinexa

Oneliner:

List: "".join([line.rstrip('\n') for line in open('file.txt')])

Generator: "".join((line.rstrip('\n') for line in open('file.txt')))

A list is faster than a generator but heavier on memory. A generator is slower than a list but lighter on memory, like iterating over the lines. In the case of "".join(), I think both should work well. Remove the "".join() call to get the list or the generator itself.

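Note: explicitly closing the file is probably not needed here in CPython, which closes it once the file object is garbage-collected; other implementations may not.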


Ali

Have you tried this?

x = "yourfilename.txt"
y = open(x, 'r').read()

print(y)

This is wrong. You want y = open(x, 'r').read() if you're going to do it that way.
Sma Ma

To remove line breaks using Python you can use replace function of a string.

This example removes all 3 types of line breaks:

my_string = open('lala.json').read()
print(my_string)

my_string = my_string.replace("\r","").replace("\n","")
print(my_string)

Example file is:

{
  "lala": "lulu",
  "foo": "bar"
}

You can try it using this repl.it example:

https://repl.it/repls/AnnualJointHardware

https://i.stack.imgur.com/cmmzY.png
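One detail worth noting, shown here as a hedged sketch: when a file is opened in text mode with the default newline=None, Python already translates \r\n and \r into \n while reading. The extra replace("\r", "") therefore mainly matters when the file is opened with newline="" or in binary mode. The temporary path and the sample content are assumptions for illustration:

```python
import os
import tempfile

# Write a file with Windows line endings, byte for byte (hypothetical path).
sample_path = os.path.join(tempfile.mkdtemp(), "lala.json")
with open(sample_path, "wb") as f:
    f.write(b'{\r\n  "lala": "lulu"\r\n}\r\n')

# Default text mode: line endings are translated to "\n" on read.
with open(sample_path) as f:
    translated = f.read()

# newline="" disables the translation, so "\r\n" is returned as-is.
with open(sample_path, newline="") as f:
    raw = f.read()

# Removing both characters handles every case.
flattened = raw.replace("\r", "").replace("\n", "")
print(repr(flattened))  # '{  "lala": "lulu"}'
```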


gelonida
f = open('data.txt','r')
string = ""
while True:
    line = f.readline()
    if not line:
        break
    string += line

f.close()


print(string)

Loops which have a string += line should be avoided. Some versions of Python may manage to avoid O(n^2) behaviour here but any of the other answers that have been given are better than this. Also you didn't remove the newlines that were requested so your code is just a very slow way of doing string = f.read()
Thanks for correcting me. But one small thing: I did not have to remove the newline, because when I tested, it didn't print '\n'. @Duncan
Very inefficient for huge files: on every iteration, memory has to be allocated and data has to be copied. Also, the newline is neither removed nor replaced with a " ". Use the following command to see that the newlines are still there: print(repr(string))
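A short sketch of why print alone can be misleading here and what repr shows instead. The sample string is an assumption standing in for the file contents:

```python
# Stand-in for the contents read from the file.
string = "ABC\nDEF\n"

# print renders each newline as a line break, so no "\n" is visible.
print(string)

# repr shows the escape sequences that are actually stored in the string.
print(repr(string))  # 'ABC\nDEF\n'

# A direct check that the newlines are still there.
has_newline = "\n" in string
```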
John Galbraith

I don't feel that anyone addressed the [ ] part of your question. readlines() returns a list with one element per line, so even after you replace the \n with '' you still have a list rather than a single string. If you have a variable x and print it out just by

x

or print(x)

or str(x)

You will see the entire list with the brackets. If you call each element of the (array of sorts)

x[0] then it omits the brackets. If you use the str() function you will see just the data and not the '' either. str(x[0])


Anonymous

Maybe you could try this? I use this in my programs.

with open('data.txt', 'r') as f:  # the with block closes the file
    data = f.readlines()
for i in range(len(data)):
    data[i] = data[i].strip() + ' '
data = ''.join(data).strip()

Alex

Regular expression works too:

import re
with open("depression.txt") as f:
     l = re.split(' ', re.sub('\n',' ', f.read()))[:-1]

print (l)

['I', 'feel', 'empty', 'and', 'dead', 'inside']


Roksolanka Fedkovych
with open('data.txt', 'r') as file:
    data = [line.strip('\n') for line in file.readlines()]
    data = ''.join(data)

PyGuy

This works: Change your file to:

LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE

Then:

with open("file.txt") as file:
    line = file.read()
words = line.split()

This creates a list named words that equals:

['LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN', 'GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE']

That got rid of the "\n". To answer the part about the brackets getting in your way, just do this:

for word in words: # Assuming words is the list above
    print(word) # Prints each word in file on a different line

Or:

print(words[0] + ",", words[1]) # Note that the "+" symbol indicates no spaces
# The comma between the arguments indicates a space

This returns:

LLKKKKKKKKMMMMMMMMNNNNNNNNNNNNN, GGGGGGGGGHHHHHHHHHHHHHHHHHHHHEEEEEEEE

Changing the file might work in a one-off situation, but if you have hundreds of files this just isn't a workable solution.
Lakshaya Maheshwari
with open(player_name, 'r') as myfile:
    data = myfile.readline()
    words = data.split(" ")  # named words to avoid shadowing the built-in list
    word = words[0]

This code reads the first line, then uses split to store the space-separated words of that line in a list.

Then you can easily access any word, or even store it in a string.

You can also do the same thing with using a for loop.


akD
with open("myfile.txt", "r") as file:
    lines = file.readlines()
text = ''                                    # avoid shadowing the built-in str

for i in range(len(lines)):
    text += lines[i].rstrip('\n') + ' '

print(text)

yota
line_lst = Path("to/the/file.txt").read_text().splitlines()

This is the best way to get all the lines of a file; the '\n' are already stripped by splitlines() (which smartly recognizes win/mac/unix line endings).

But if you nonetheless want to strip each line:

line_lst = [line.strip() for line in Path("to/the/file.txt").read_text().splitlines()]

strip() was just a useful example, but you can process each line as you please.

In the end, you just want the concatenated text?

txt = ''.join(Path("to/the/file.txt").read_text().splitlines())
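A small sketch of the line-ending behaviour described above, using an in-memory string with mixed endings instead of a real file. The sample text is invented for the example:

```python
# One string mixing Unix (\n), Windows (\r\n) and old Mac (\r) line endings.
mixed = "unix\nwindows\r\nold mac\rlast"

# splitlines() splits on all of them and drops the line endings.
lines = mixed.splitlines()
print(lines)  # ['unix', 'windows', 'old mac', 'last']

# Joining the pieces gives the text without any line breaks.
joined = "".join(lines)
```

Note that str.splitlines() also splits on a few less common boundaries, such as form feed, so it is not strictly identical to splitting on "\n".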

Palak Jain

Try the following:

with open('data.txt', 'r') as myfile:
    data = myfile.read()

    sentences = data.split('\n')
    for sentence in sentences:
        print(sentence)

Caution: this does not remove the \n from data. It only prints the text line by line.