Writing to a file -- Python





davidv
Hi, I'm slightly modifying a csv file but I don't know how to write back onto that csv file the modifications I've made.

This is my Python script that does the small changes.


Code:
for line in open('movie.txt', 'rU'):
    line = line[:-1].split(',')
    x = [line[i] for i in xrange(len(line)) if str(i) not in '178']
    # Removing 00:00:00
    x[len(x)-1] = x[len(x)-1].split()[0]


And these are the before and after datasets.

Code:
1,1,Zombieland,Action,Comedy,2009,7.0/10,1,1,20/10/2009 0:00:00
2,1,Mission Impossible I,Action,Thriller,1996,7.0/10,1,1,7/1/2011 0:00:00
3,1,Mission Impossible II,Action,Thriller,2000,6.5/10,1,1,7/1/2011 0:00:00
4,1,Superbad,Comedy,,2007,7.5/10,1,1,30/3/2010 0:00:00
5,1,Mission Impossible III,Action,Thriller,2006,7.5/10,1,1,7/1/2011 0:00:00
6,1,The Curious Case of Benjamin Button,Drama,Fantasy,2008,7.5/10,1,1,23/3/2010 0:00:00
7,1,Mystic River,Crime,Drama,2003,,0,0,21/3/2010 0:00:00
8,1,In Bruges,Crime,Drama,2008,,0,0,21/3/2010 0:00:00
9,1,Gran Torino,Drama,,2008,7.0/10,1,1,21/3/2010 0:00:00
10,1,American Beauty,Drama,,1999,7.5/10,1,1,21/3/2010 0:00:00


Code:
>>>
['1', 'Zombieland', 'Action', 'Comedy', '2009', '7.0/10', '20/10/2009']
['2', 'Mission Impossible I', 'Action', 'Thriller', '1996', '7.0/10', '7/1/2011']
['3', 'Mission Impossible II', 'Action', 'Thriller', '2000', '6.5/10', '7/1/2011']
['4', 'Superbad', 'Comedy', '', '2007', '7.5/10', '30/3/2010']
['5', 'Mission Impossible III', 'Action', 'Thriller', '2006', '7.5/10', '7/1/2011']
['6', 'The Curious Case of Benjamin Button', 'Drama', 'Fantasy', '2008', '7.5/10', '23/3/2010']
['7', 'Mystic River', 'Crime', 'Drama', '2003', '', '21/3/2010']
['8', 'In Bruges', 'Crime', 'Drama', '2008', '', '21/3/2010']
['9', 'Gran Torino', 'Drama', '', '2008', '7.0/10', '21/3/2010']
['10', 'American Beauty', 'Drama', '', '1999', '7.5/10', '21/3/2010']


Is there a module or some kind of inbuilt tool that Python has which allows me to write onto files? Thanks.
davidv
Code:
f = open('chmod_movies.txt', 'w')
try:
    t = 1
    for line in open('movie.txt', 'rU'):
        line = line[:-1].split(',')
        x = [line[i] for i in xrange(len(line)) if str(i) not in '178']
        # Removing 00:00:00
        x[len(x)-1] = x[len(x)-1].split()[0]
        f.write(','.join(x)+'\n')
        t += 1
    f.close()
    print 'Completed: Total number of lines added: %d.' % (t-1)
except IOError:
    print 'Incomplete: The file is missing.'


I spent a bit of time looking online to find a way to write to a file in Python. In addition to the above, I create a file object in write mode (the second argument to the open() function). I then pick the fields I want from each line (using a list comprehension), join them with commas and add a newline character at the end. Finally, I close the file; a try/except block handles any IO exceptions.
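
A minimal sketch of the same per-line steps as a small function, written for Python 3 (an assumption; the script above is Python 2). The names trim_row and DROP are made up for illustration. It uses a set of indices rather than str(i) not in '178', since the string test would also match an index such as 17 or 78 on wider rows, and rstrip('\n') rather than line[:-1], which cuts a real character if the last line has no newline.

```python
# Hypothetical helper; the column positions come from the sample data above.
DROP = {1, 7, 8}  # indices of the fields to leave out


def trim_row(line):
    """Return the kept fields of one comma-separated line as a list."""
    fields = line.rstrip('\n').split(',')
    kept = [field for i, field in enumerate(fields) if i not in DROP]
    # Keep only the date part of the last field, dropping the time of day.
    kept[-1] = kept[-1].split()[0]
    return kept


row = trim_row('1,1,Zombieland,Action,Comedy,2009,7.0/10,1,1,20/10/2009 0:00:00\n')
print(','.join(row))  # prints 1,Zombieland,Action,Comedy,2009,7.0/10,20/10/2009
```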

I didn't get a response from anyone, so I'm going to assume nobody knew. However, if someone does know a better way of doing this, please let me know :)

Code:
1,Zombieland,Action,Comedy,2009,7.0/10,20/10/2009
2,Mission Impossible I,Action,Thriller,1996,7.0/10,7/1/2011
3,Mission Impossible II,Action,Thriller,2000,6.5/10,7/1/2011
4,Superbad,Comedy,,2007,7.5/10,30/3/2010
5,Mission Impossible III,Action,Thriller,2006,7.5/10,7/1/2011
6,The Curious Case of Benjamin Button,Drama,Fantasy,2008,7.5/10,23/3/2010
7,Mystic River,Crime,Drama,2003,,21/3/2010
8,In Bruges,Crime,Drama,2008,,21/3/2010
9,Gran Torino,Drama,,2008,7.0/10,21/3/2010
10,American Beauty,Drama,,1999,7.5/10,21/3/2010
jcreus
I would do it another way:
Code:

# rows: a list holding every x built in the loop above (rows.append(x))
with open("modified_movies.txt", "w") as f:
    f.write("\n".join(",".join(row) for row in rows))


The "with" statement is fairly new: it needs Python 2.6 or later (or Python 2.5 with "from __future__ import with_statement"), but it closes the file automatically, even if an error occurs inside the block. Otherwise it can be done as
Code:

# rows: a list holding every x built in the loop above (rows.append(x))
f = open("modified_movies.txt", "w")
f.write("\n".join(",".join(row) for row in rows))
f.close()
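
A small sketch of how the two join calls nest when there are several rows, assuming Python 3. The variable names rows and text are only for illustration; the two rows are copied from the expected output earlier in the thread. The last line shows the pitfall to avoid: joining a single string with "\n" walks over its characters.

```python
# Two already-trimmed rows, copied from the expected output in the thread.
rows = [
    ['1', 'Zombieland', 'Action', 'Comedy', '2009', '7.0/10', '20/10/2009'],
    ['4', 'Superbad', 'Comedy', '', '2007', '7.5/10', '30/3/2010'],
]

# The inner join builds one line per row; the outer join stacks the lines.
text = '\n'.join(','.join(row) for row in rows)
print(text)

# Joining one string instead of a list of lines splits it into characters.
print(repr('\n'.join(','.join(['a', 'b']))))  # prints 'a\n,\nb'
```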
cgkanchi
OK, here are three different ways to do it. I'll go through each of them and tell you what its strengths and weaknesses are:

Code:

with open('py_csv.csv', 'rU') as inFile:
    results = [line.split(',')[:1]+line.split(',')[2:-1] for line in inFile]
with open('py_csv_method_1.csv', 'wb') as outFile:
    for line in results:
        outFile.write(','.join(line))
        outFile.write('\n')


This is probably the easiest and most straightforward of the three methods. You read the file in line by line, split it on the comma character and skip the second and last fields (that is what the line.split(',')[:1] + line.split(',')[2:-1] bit does). Then, you take the resultant list and write it to an output file.
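
To make the slicing concrete, here is a short sketch (Python 3 assumed) applied to one row from the sample data in the question. The names fields and kept are just for illustration. Note that the trailing newline lives in the last field, so it disappears along with that field.

```python
line = '7,1,Mystic River,Crime,Drama,2003,,0,0,21/3/2010 0:00:00\n'

fields = line.split(',')
# fields[:1] is the first field; fields[2:-1] is everything from the third
# field up to, but not including, the last one.
kept = fields[:1] + fields[2:-1]

print(kept)  # prints ['7', 'Mystic River', 'Crime', 'Drama', '2003', '', '0', '0']
```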

This method works pretty well, but if you have a really big file, or say you're doing this for a website with thousands of visitors on a server with limited RAM, reading the whole file into memory is not a good idea. So, we try something else:

Code:

with open('py_csv.csv', 'rU') as inFile:
    with open('py_csv_method_2.csv', 'wb') as outFile:
        for line in inFile:
            outFile.write(','.join(line.split(',')[:1] + line.split(',')[2:-1]))
            outFile.write('\n')


This is basically identical to the first method, except that in this case, you're not storing the lines you read in a list. Instead, you're processing them right then and there, and writing them out to the output file. This method is good for when your script would otherwise take up more memory than is acceptable. For small files, or situations where RAM isn't limiting, any speed difference is usually small, because Python buffers file reads and writes for you. As a rule of thumb, disk I/O is far slower than getting data in and out of RAM, so the fewer physical I/O operations you do, the faster your program is (this is a slight simplification).
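
A sketch of the same streaming idea as a function that accepts any file-like objects, assuming Python 3. The name copy_trimmed is hypothetical, and io.StringIO objects stand in for the real files so the example runs on its own. The two input rows are taken from the question's data.

```python
import io


def copy_trimmed(src, dst):
    """Stream lines from src to dst, dropping the second and last fields."""
    count = 0
    for line in src:  # only the current line is held by this loop
        fields = line.split(',')
        dst.write(','.join(fields[:1] + fields[2:-1]) + '\n')
        count += 1
    return count


# io.StringIO objects stand in for the input and output files.
src = io.StringIO(
    '1,1,Zombieland,Action,Comedy,2009,7.0/10,1,1,20/10/2009 0:00:00\n'
    '8,1,In Bruges,Crime,Drama,2008,,0,0,21/3/2010 0:00:00\n'
)
dst = io.StringIO()
n = copy_trimmed(src, dst)
print(n)
print(dst.getvalue(), end='')
```

With real files, the same function could be called inside two with open(...) blocks, passing the open file objects as src and dst.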

However, having said all this, Python actually provides you with a module to handle all this for you, just so you don't have to think about it. The module is called csv, which I've used in the program below:

Code:

import csv

with open('py_csv.csv', 'rU') as inFile:
    with open('py_csv_method_3.csv', 'wb') as outFile:
        myReader = csv.reader(inFile, delimiter=',')
        myWriter = csv.writer(outFile, delimiter=',')
        for line in myReader:
            myWriter.writerow(line[:1]+line[2:-1])


Here, we open the files as before, but instead of trying to read them and write them ourselves, we pass the task on to two purpose-built objects from the csv module. csv.reader and csv.writer return reader and writer objects designed to read and write files that are delimited by a specific character, in this case, a comma. For more information on the csv module, see http://docs.python.org/library/csv.html.
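
One case where the csv module does more than split(',') is a field that itself contains a comma. A short sketch, assuming Python 3; the sample row is made up for illustration and is not part of the thread's data.

```python
import csv
import io

# Made-up row (not from the thread's data): the quoted title contains a comma.
sample = '11,1,"Lock, Stock and Two Smoking Barrels",Crime,Comedy\n'

naive = sample.rstrip('\n').split(',')
parsed = next(csv.reader(io.StringIO(sample)))

print(len(naive))   # prints 6, the title was cut in two
print(len(parsed))  # prints 5
print(parsed[2])    # prints Lock, Stock and Two Smoking Barrels
```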

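All three methods produce the same rows here (one small difference: csv.writer ends each row with '\r\n' by default, while the first two methods write '\n'). You should still prefer the csv module where you can, since it does all the hard work for you, including quoted fields that contain commas, which a plain split(',') gets wrong. It is part of the standard library and widely used, so it is also well tested.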
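
The three methods drop the second and last fields, while the original question drops the fields at positions 1, 7 and 8 and keeps the date without the time. A sketch of how the csv approach could be adapted to that, assuming Python 3; trim_csv and DROP are hypothetical names, and io.StringIO stands in for the files.

```python
import csv
import io

DROP = {1, 7, 8}  # column positions taken from the question's sample data


def trim_csv(src, dst):
    """Copy rows from src to dst without the DROP columns or the time of day."""
    # lineterminator='\n' matches the plain-text methods; csv.writer would
    # otherwise end each row with '\r\n'.
    writer = csv.writer(dst, lineterminator='\n')
    for row in csv.reader(src):
        kept = [field for i, field in enumerate(row) if i not in DROP]
        kept[-1] = kept[-1].split()[0]  # '21/3/2010 0:00:00' -> '21/3/2010'
        writer.writerow(kept)


src = io.StringIO('8,1,In Bruges,Crime,Drama,2008,,0,0,21/3/2010 0:00:00\n')
dst = io.StringIO()
trim_csv(src, dst)
print(dst.getvalue(), end='')  # prints 8,In Bruges,Crime,Drama,2008,,21/3/2010
```

In Python 3, real files passed to the csv module should be opened with newline='' so the module controls the line endings itself.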

To write these, I wrote up a long Python file with a few more comments than I've put in here, so I'll just paste the whole code below:

Code:

#if using python 2.5, uncomment the following line
#(a __future__ import must come before any other import):
#from __future__ import with_statement
import csv
#for Python 2.4 and below, replace each with statement as follows:
#inFile = open('py_csv.csv', 'rU')
#the statement(s) inside the with block
#inFile.close()

def method1():
    '''Use this if you don't want to use the csv module, and you have a short file'''
    with open('py_csv.csv', 'rU') as inFile:
        results = [line.split(',')[:1]+line.split(',')[2:-1] for line in inFile]
    with open('py_csv_method_1.csv', 'wb') as outFile:
        for line in results:
            outFile.write(','.join(line))
            outFile.write('\n')

def method2():
    '''Use this if you don't want to use the csv module, and you have a long file'''
    with open('py_csv.csv', 'rU') as inFile:
        with open('py_csv_method_2.csv', 'wb') as outFile:
            for line in inFile:
                outFile.write(','.join(line.split(',')[:1] + line.split(',')[2:-1]))
                outFile.write('\n')

def method3():
    '''This is the preferred way to do it if you're using Python 2.3 or above, since it takes
    care of a lot of the internal details for you.'''
    with open('py_csv.csv', 'rU') as inFile:
        with open('py_csv_method_3.csv', 'wb') as outFile:
            myReader = csv.reader(inFile, delimiter=',')
            myWriter = csv.writer(outFile, delimiter=',')
            for line in myReader:
                myWriter.writerow(line[:1]+line[2:-1])

if __name__ == '__main__':
    method1()
    method2()
    method3()


Hope that helped.

Cheers,
cgkanchi