Elaborating my answer in a comment to the question, this is what I got so far.

You need to install Python (I installed python2.7), and lxml and PIL. The easiest way I've found to install the latter two in Windows is to download the prebuilt installers lxml-2.3.4.win32-py2.7.exe and PIL-1.1.7.win32-py2.7.exe (note that you have to choose the appropriate files for your Python version). Running those installers sets up the appropriate libraries and bindings.

The docx library itself I didn't try to install properly. I only uncompressed it in a folder, opened a cmd shell, navigated to that folder and ran the provided examples (example-extracttext.py and example-makedocument.py), which worked.

Then I adapted the code of example-extracttext to our needs, and wrote the following script, which I named run.py:

    #!/usr/bin/env python2.7
    """This file opens a docx (Office 2007) file and dumps the text."""
    [...]
    # Fetch all the text out of the document we just created
    [...]
    newparatextlist.append(paratext.encode("utf-8"))
    # Print our document's text with two newlines under each paragraph
    newfile.write('\n\n'.join(newparatextlist))

To test it, I wrote the following Word document (note that I used Word styles to mark the section titles, used a table to insert the code of a tikz picture, and even inserted an image showing the result for that figure, obviously not in the first pass, but later). Note also that I used a Word bulleted list to help mark the itemized list. All these Word styles will be dropped when converting to plain text, but they allow us to make the display clearer.
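
To make the text-extraction step more concrete, here is a minimal sketch of the same idea: open a .docx file, collect the text of each paragraph, and join the paragraphs with two newlines. It is not the run.py script above and does not use the docx library from the examples. It swaps in Python 3 with only the standard library (zipfile and xml.etree.ElementTree), and it assumes the main document part is stored at word/document.xml, which is the usual location. The names docx_to_text and make_sample_docx are hypothetical helpers introduced only for this illustration, and the sample content is made up.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx archive
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(source):
    """Return the plain text of a .docx file, with a blank line between paragraphs.

    `source` may be a file path or a binary file-like object.
    Assumption: the main document part lives at word/document.xml.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Concatenate every text run (w:t) found inside this paragraph (w:p)
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text:
            paragraphs.append(text)
    # Styles, tables and images are dropped; only the text survives
    return "\n\n".join(paragraphs)


def make_sample_docx():
    """Build a tiny in-memory archive with a document.xml part (made-up content)."""
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
        "<w:body>"
        "<w:p><w:r><w:t>Section title</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>First </w:t></w:r><w:r><w:t>item</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(docx_to_text(make_sample_docx()))
```

As in the script above, all styling is lost: a styled section title, a bulleted item and a table cell all come out as plain paragraphs. Paragraphs nested inside other paragraphs (for example in text boxes) may be emitted twice by this simple traversal, so treat it as a starting point rather than a complete converter.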