Replace and overwrite instead of appending

I have the following code:

import re
#open the xml file for reading:
file = open('path/test.xml','r+')
#convert to string:
data = file.read()
file.write(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>",r"<xyz>ABC</xyz>\1<xyz>\2</xyz>",data))
file.close()

where I'd like to replace the old content that's in the file with the new content. However, when I execute my code, the file "test.xml" is appended to, i.e. I have the old content followed by the new "replaced" content. What can I do to delete the old content and keep only the new?

When you say "replace the old content that's in the file with the new content", you mean that you need to read in and transform the current contents (data = file.read()). You don't mean "blindly overwrite it without needing to read it first".

Boris Verkhovskiy

You need to seek to the beginning of the file before writing and then call file.truncate() if you want to do an in-place replace:

import re

myfile = "path/test.xml"

with open(myfile, "r+") as f:
    data = f.read()
    f.seek(0)
    f.write(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>", r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", data))
    f.truncate()

The other way is to read the file then open it again with open(myfile, 'w'):

with open(myfile, "r") as f:
    data = f.read()

with open(myfile, "w") as f:
    f.write(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>", r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", data))

Neither truncate nor open(..., 'w') will change the inode number of the file (I tested twice, once with Ubuntu 12.04 NFS and once with ext4).
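The inode claim can be checked with a short script. This is a minimal sketch, not the test the answerer actually ran: the helper name inode_after_rewrite and the temporary file are assumptions made for illustration, and the outcome may depend on the filesystem in use.

```python
import os
import tempfile


def inode_after_rewrite(path, mode):
    """Rewrite `path` in place and return (inode_before, inode_after)."""
    before = os.stat(path).st_ino
    if mode == "truncate":
        # seek + write + truncate on the already existing file
        with open(path, "r+") as f:
            f.seek(0)
            f.write("new")
            f.truncate()
    else:
        # reopen the same path with "w", which empties it on open
        with open(path, "w") as f:
            f.write("new")
    after = os.stat(path).st_ino
    return before, after


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "test.xml")  # hypothetical demo file
    with open(demo_path, "w") as f:
        f.write("old content")
    results = {m: inode_after_rewrite(demo_path, m) for m in ("truncate", "w")}
    for m, (before, after) in results.items():
        print(m, "inode unchanged:", before == after)
```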

By the way, this is not really related to Python. The interpreter calls the corresponding low level API. The method truncate() works the same in the C programming language: See http://man7.org/linux/man-pages/man2/truncate.2.html


"Neither truncate nor open(..., 'w') will change the inode number of the file": why is that important?
@rok Whether the inode changes or not is not relevant in most cases. It only matters in edge cases where you use hard links, but I advise avoiding hard links.
Is there a drawback to using the "f.seek() ..." approach over the "with open( ...)" approach?
Moshe Rabaev
file='path/test.xml' 
with open(file, 'w') as filetowrite:
    filetowrite.write('new content')

Open the file in 'w' mode and you will be able to replace its current text and save the file with the new contents.


This is a good way to clear a file and write something new to it, but the question was about reading the file, modifying the contents and overwriting the original with the new contents.
@Boris, what is the problem with reading the file first and then using the code in this answer?
@Rayhunter : it's inefficient
it's simple and efficient, does the job in a perfect way.
Community

Using truncate(), the solution could be

import re
#open the xml file for reading:
with open('path/test.xml','r+') as f:
    #convert to string:
    data = f.read()
    f.seek(0)
    f.write(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>",r"<xyz>ABC</xyz>\1<xyz>\2</xyz>",data))
    f.truncate()

seek and truncate!!! I couldn't figure out why seek alone was not working.
@conner.xyz Maybe I am wrong, but seek only changes the cursor position, and write writes into the file starting from the cursor position. write doesn't care whether any content remains in the file after what it has written. truncate does the job here of removing the rest of the content after the cursor position.
@Almabud, I just tested with open(...) as f: f.truncate() f.write(...) (no seek(0)) and it does indeed seem to replace the file contents.
@conner.xyz Recently I tried your solution as it is simpler, but it wasn't working as expected: `file = <byte_image>; pyexiv_img = pyexiv2.ImageData(file.read()); pyexiv_img.clear_exif(); file.truncate(); file.write(pyexiv_img.get_bytes())`. This is not working. I need to add file.seek(0), and then it works fine.
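The behaviours described in these comments can be compared side by side. The following is a rough sketch under stated assumptions: run_case, the ten-character starting content and the "abc" replacement are made up for illustration. truncate() with no argument cuts the file at the current position, so its effect depends on where the cursor is when it is called.

```python
import os
import tempfile


def run_case(path, steps):
    """Reset the file to '0123456789', apply the steps in order, return the result."""
    with open(path, "w") as f:
        f.write("0123456789")
    with open(path, "r+") as f:
        for step in steps:
            if step == "read":
                f.read()        # cursor ends up at the end of the file
            elif step == "seek":
                f.seek(0)       # cursor back to the start
            elif step == "truncate":
                f.truncate()    # cut the file at the current cursor position
            elif step == "write":
                f.write("abc")  # write starting at the cursor
    with open(path) as f:
        return f.read()


cases = {
    "seek without truncate": ["read", "seek", "write"],
    "truncate without seek, after read": ["read", "truncate", "write"],
    "truncate without seek, no read": ["truncate", "write"],
    "seek, write, truncate": ["read", "seek", "write", "truncate"],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical demo file
    outcomes = {name: run_case(path, steps) for name, steps in cases.items()}

for name, text in outcomes.items():
    print(f"{name}: {text!r}")
```

In this sketch, seek alone leaves the tail of the old content behind, and truncate() after a full read has nothing to cut because the cursor is already at the end, so the subsequent write appends. Only the cases where truncate() is called with the cursor at the intended end of the data leave just the new content.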
7beggars_nnnnm

See "How to Replace String in File": it works in a simple way and is an answer that uses replace:

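# read from data.txt and write the corrected lines to out.txt;
# both files are closed automatically when the with block ends
with open("data.txt", "rt") as fin, open("out.txt", "wt") as fout:
    for line in fin:
        fout.write(line.replace('pyton', 'python'))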
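The snippet above writes the result to a second file rather than overwriting the original. One possible way to end up with the result under the original name is to write to a temporary file and then swap it in with os.replace. This is only a sketch: the function name replace_in_file and the demo file are assumptions for illustration, not part of the linked answer.

```python
import os
import tempfile


def replace_in_file(path, old, new):
    """Rewrite `path` line by line via a temporary file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so os.replace stays on one filesystem.
    with open(path, "rt") as fin, tempfile.NamedTemporaryFile(
        "wt", dir=directory, delete=False
    ) as fout:
        for line in fin:
            fout.write(line.replace(old, new))
        tmp_name = fout.name
    # Both files are closed here; the temp file now takes over the original name.
    os.replace(tmp_name, path)


with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "data.txt")  # hypothetical demo file
    with open(demo, "wt") as f:
        f.write("pyton rocks\nI like pyton\n")
    replace_in_file(demo, "pyton", "python")
    with open(demo, "rt") as f:
        result = f.read()

print(result)
```

Unlike seek/truncate or open(..., 'w'), this approach puts a different file under the original name, so the inode number generally changes and hard links to the old file keep the old content. The temporary file is also created with restrictive permissions, so the original file's mode is not carried over.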

Nadia Salgado
import os  # needed for os.path.exists and os.remove

if os.path.exists('TwitterDB.csv'):
    os.remove('TwitterDB.csv')  # this deletes the file
else:
    print("The file does not exist")  # checking first prevents an error from os.remove

I had a similar problem, and instead of overwriting my existing file using the different 'modes', I just deleted the file before using it again, so that it would be as if I was appending to a new file on each run of my code.


rok

Using the Python 3 pathlib library:

import re
from pathlib import Path
import shutil

shutil.copy2("/tmp/test.xml", "/tmp/test.xml.bak") # create backup
filepath = Path("/tmp/test.xml")
content = filepath.read_text()
filepath.write_text(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>",r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", content))

A similar method using a different approach to backups:

import re
from pathlib import Path

filepath = Path("/tmp/test.xml")
backup = filepath.with_suffix('.bak')  # /tmp/test.bak
filepath.rename(backup)  # different approach to backups: the original becomes the backup
content = backup.read_text()  # the original path no longer exists, so read from the backup
filepath.write_text(re.sub(r"<string>ABC</string>(\s+)<string>(.*)</string>",r"<xyz>ABC</xyz>\1<xyz>\2</xyz>", content))