My old man has a bunch of .doc stuff saved. He has large, complicated files that none of the FOSS conversion tools support. I’ve tried LibreOffice, AbiWord, and every command line tool and converter I can find. These are entire book-sized files.
I have a W10 machine with Word. Is extracting the .exe and running it with Wine feasible without making an epic mess or a massive project of this?
I thought we stopped doing the “m$” thing around 2010.
Word barely supports old Word files. Very few tools can reproduce .doc files other than Office itself, and even Office versions aren’t all compatible.
My approach would be to install some kind of Office on a machine and just script the hell out of opening files and saving them as docx or whatever open format Word supports these days. Word exposes a COM interface you can script against, so most programming languages and JScript or VBS can automate this process.
If you can figure out how to scan files in a loop, this snippet may get you started:
Set word = CreateObject("Word.Application")
word.Visible = True
word.Documents.Open("C:\Documents and Settings\User\Hello.doc")
Set doc = word.ActiveDocument
doc.SaveAs "C:\Documents and Settings\User\export.docx", 16
word.Quit()
To do this with reasonable speed, keep one instance of Word around and close each document when you’re done with it, rather than quitting Word every time you iterate through the list.
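If you’d rather drive this from Python than VBS, here’s a rough sketch of the keep-one-instance loop. It assumes the pywin32 package is installed on the Windows box, which is my assumption and not something the snippet above needs. `convert_all` and its parameters are made-up names, and 16 is the same format value the VBS snippet passes to SaveAs.

```python
from pathlib import Path

WD_FORMAT_DOCUMENT_DEFAULT = 16  # same numeric value the VBS snippet passes to SaveAs
WD_DO_NOT_SAVE_CHANGES = 0       # close the source document without touching it


def convert_all(pairs, word=None, file_format=WD_FORMAT_DOCUMENT_DEFAULT):
    """Convert (source, target) pairs with one shared Word instance.

    Returns a list of (source, error) tuples for files that failed.
    """
    owns_word = word is None
    if owns_word:
        # Assumption: pywin32 is installed and Word is present on this machine.
        import win32com.client
        word = win32com.client.Dispatch("Word.Application")
        word.Visible = False

    failed = []
    try:
        for src, dst in pairs:
            doc = None
            try:
                # Word wants absolute paths, so resolve both ends first.
                doc = word.Documents.Open(str(Path(src).resolve()))
                doc.SaveAs(str(Path(dst).resolve()), file_format)
            except Exception as exc:
                # One bad file should not stop the rest of the batch.
                failed.append((str(src), repr(exc)))
            finally:
                if doc is not None:
                    doc.Close(WD_DO_NOT_SAVE_CHANGES)
    finally:
        # Quit only once, after the whole list, and only if we started Word.
        if owns_word:
            word.Quit()
    return failed
```

Something like `convert_all([(r"C:\in\Hello.doc", r"C:\out\Hello.docx")])` would be the single-file test run. Check the returned list before trusting the batch; book-sized files are exactly where I’d expect the odd failure.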
I started using M$ around 2010 personally
I’m pretty sure that CodeWeavers CrossOver still works for Microsoft 365. At least, I used it with Office 365 last year without major issues.
Why do you spell it as “m$ office”?
Greedy fucks.
Generally, no. M$ Office has some pretty invasive DRM, so your best bet for running it on Linux is a Windows virtual machine.
You can try Pandoc and see if that works, Google Docs, Office365, finding an abandonware version of Word and running on Wine…lots of options to work with.
It might be easier to start narrowing down where you need to look if you get the header info from one of these files.
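A minimal sketch of what I mean by header info: read the first few bytes and compare them against a handful of well-known signatures. The function names are made up, and the signature list is deliberately short (OLE2 compound file, ZIP container, RTF). Anything else comes back as "unknown" along with the raw hex, which you can then look up.

```python
# Well-known leading byte signatures; this list is intentionally not exhaustive.
SIGNATURES = [
    (bytes.fromhex("D0CF11E0A1B11AE1"), "OLE2 compound file (Word 97-2003 .doc, among others)"),
    (b"PK\x03\x04", "ZIP container (.docx, .odt and similar)"),
    (b"{\\rtf", "Rich Text Format"),
]


def identify(head):
    """Return a label for the first signature that matches the leading bytes."""
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return "unknown"


def sniff(path, length=16):
    """Read the first `length` bytes of a file; return (label, hex dump)."""
    with open(path, "rb") as fh:
        head = fh.read(length)
    # bytes.hex with a separator needs Python 3.8 or newer.
    return identify(head), head.hex(" ")
```

Run `sniff` on one of the problem files and post the result. A file named .doc that turns out to be RTF or something unrecognized changes which tools are worth trying.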
deleted by creator
Okay. First off, I downvoted you for obvious reasons.
Second, if you’re not sure how to extract the header of a file, just Google that. You may be ill prepared and asking for help here.
deleted by creator
You don’t understand how file formats work I guess. You can’t just ‘head’ an encoded file and expect a terminal to output what you want. Do some research.
It’s not open source but probably has the best compatibility. You can give it a shot.
https://www.freeoffice.com/en/
Needs an account after one week though.
Looks interesting. Any info on whether Excel Macros work for it?
It doesn’t use Visual Basic for its macros, so I wouldn’t expect much compatibility. To be fair, Excel macros are usually a problem outside of MS Office.
To be honest, there’s a few good comments linking to scripts and methods here to batch convert them on a windows pc/vm. That’s the best way to go.
To add on to their comments: if you’re just interested in preserving them, then printing them to PDF, specifically PDF/A, would be my approach once you get them opened.
Why not just use the windows machine?
VMWare and archive dot org are your friend
Assuming the latest version of OpenOffice doesn’t work for these files…
My next course of action would be using the Win 10 machine with Word, or a VM with Win10 or 11 and the latest version of Word. Use MAS (massgrave) to trick M$ into considering it licensed if you need to.
Use a PowerShell script to interact with Word through the COM object interface and automate opening Word, opening the file, saving it as a different filetype, and closing. Here’s a snippet of PowerShell from Reddit for going in the opposite direction (odt to docx) for a single file. I wouldn’t try to do this through Linux, just suck it up and use Windows so you don’t have an extra layer of mess to deal with.
Going off M$ documentation of the save types enum, I would replace “wdFormatDocumentDefault” in that snippet with wdFormatOpenDocumentText or wdFormatStrictOpenXMLDocument, then test it with a single file to see which gives the output you need.
Getting all the files of the starting type from a folder can be done using Get-ChildItem. Store those in a variable and use a foreach loop over the initial file list.
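The file-list step isn’t PowerShell-specific. As a hedged sketch in Python: walk a folder for the starting file type and pair each file with an output path and a save-format code. The function and dictionary names are invented for illustration. The numeric values 23 and 24 are what I understand the WdSaveFormat enum assigns to those two names, so double-check them against the M$ documentation before a real run.

```python
from pathlib import Path

# Assumed WdSaveFormat values and the extension each output should carry.
TARGETS = {
    "wdFormatOpenDocumentText": (23, ".odt"),
    "wdFormatStrictOpenXMLDocument": (24, ".docx"),
}


def plan_conversions(src_dir, dst_dir, target="wdFormatOpenDocumentText", source_ext=".doc"):
    """List (source, destination, format code) jobs for every matching file."""
    code, new_ext = TARGETS[target]
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    jobs = []
    # rglob("*") plays the role of a recursive Get-ChildItem; sorted for stable order.
    for src in sorted(src_dir.rglob("*")):
        # Compare extensions case-insensitively so HELLO.DOC is picked up too.
        if src.is_file() and src.suffix.lower() == source_ext:
            dst = dst_dir / src.relative_to(src_dir).with_suffix(new_ext)
            jobs.append((src, dst, code))
    return jobs
```

Run it against a folder holding a single file first, try both targets, and compare the output before looping over the whole collection.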
Try your local library.
In my experience, OnlyOffice has the best compatibility with M$ Office. You should try it if you haven’t
It’s worth a try, though in my experience it can struggle with very large files.
I will agree with the people suggesting “VM and a pirated copy”
Just get like office 2010 and windows 7 off of the web, run it in a VM, convert the files, dump it all.
Doesn’t Office 2010 work in Wine?
I wouldn’t know, but since OP is having compatibility issues, I’d try to get as close to native as I could. Eliminate the room-for-error. Hence the VM with actual Windows.
They can just delete the lot after they’ve converted their files to an opener format. :P
Yeah, this is the last version of Office that doesn’t nag you, and you can find keys or buy generated ones off eBay if you feel guilty or worried about malicious cracks.
Instead of pirating anything, you can use:
https://github.com/massgravel/Microsoft-Activation-Scripts for activation
https://massgrave.dev/office_msi_links for download of office
(these count as piracy, but yes, they work well and are reliable)
OnlyOffice.
Not to be confused with OpenOffice.
(LibreOffice forked from OO back then.)
I bought a cheap Win11 + Office 2021 combo on the net and use a VM. It’s not the easiest way but it works…
😔