
I'm looking for a way to convert an Excel file to HTML while preserving its formatting.

I know this is doable on Windows thanks to the underlying win32 libraries (e.g. via xlwings, as in "Python - Excel to HTML (keeping format)").

But I'm looking for a solution on Linux. I've also come across Aspose Cells, but it requires a paid license; without one it adds a lot of extra junk to the output that needs to be scrubbed out.

Lastly, I tried the Python library xlsx2html, but it does a very poor job of preserving formatting.

Are there any suggestions for a Linux-based solution? I'd also be interested in tools written in other languages that can easily be wrapped in Python.

Thanks in advance!

Update: Here is an example of a random Excel sheet that I converted via Excel itself and would like to reproduce. It has some colors, border variations, merged cells, and font sizes, to check whether they all work. [Image: screenshot of the example sheet]

  • Did you try soffice --headless --convert-to html data.xlsx? Commented Feb 8, 2023 at 22:46
  • Can you provide an example file or image so I can better understand what you need? My old code might convert it the way you need, but I need to make sure that's what you want. A simple Excel file and your expected result would be very helpful. Commented Feb 8, 2023 at 22:54
  • @NimaAkbarzadeh I updated the question with a photo example of a dummy sheet. This is something xlsx2html would fail to produce accurately. Commented Feb 8, 2023 at 23:02
  • @Corralien soffice was promising but also fails to reproduce some formatting. It's a shame, because --convert-to pdf works perfectly, but --convert-to html not so much. Commented Feb 8, 2023 at 23:26
  • I'm not sure it helps but you can also try pandoc. Commented Feb 8, 2023 at 23:32

2 Answers


You can use LibreOffice to convert an Excel file to an HTML file from the command line:

# --convert-to implies --headless so it's not mandatory to specify --headless
soffice --headless --convert-to html data.xlsx

You can refer to the documentation to learn more about the other CLI parameters.
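Since the question asks for something that can be wrapped in Python, here is a minimal sketch of calling the same command through subprocess. The function names build_convert_command and xlsx_to_html are hypothetical, and the sketch assumes that soffice is on the PATH and that LibreOffice names the output file after the input file's stem. It also passes --outdir, a LibreOffice option not used in the command above, to control where the HTML file is written.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(xlsx_path, out_dir, soffice="soffice"):
    """Return the argument list for a headless LibreOffice HTML conversion."""
    return [
        soffice,
        "--headless",
        "--convert-to", "html",
        "--outdir", str(out_dir),
        str(xlsx_path),
    ]


def xlsx_to_html(xlsx_path, out_dir=".", soffice="soffice", timeout=120):
    """Convert a workbook to HTML and return the path where the output is expected.

    Hypothetical helper: the returned path is an assumption based on
    LibreOffice reusing the input file's stem for the converted file.
    """
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} was not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if soffice exits with an error.
    subprocess.run(
        build_convert_command(xlsx_path, out_dir, soffice),
        check=True,
        timeout=timeout,
    )
    return out_dir / (Path(xlsx_path).stem + ".html")
```

Calling xlsx_to_html("data.xlsx", "out") should then leave the result in out/data.html; it is worth checking that the file exists before reading it, since a failed conversion may not always produce a non-zero exit code.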


1 Comment

I'll also add that some formatting changes in Excel may be needed to keep the output from breaking. The two main things I had to do were:
- Sometimes add " " to empty columns to keep them from being collapsed in the HTML output.
- Avoid having a border on both the top of a lower cell and the bottom of the cell above it, because this converts into an extra-thick line in HTML.
Otherwise the formatting is pretty good.

I think you should look for Excel-to-HTML converters in the JS world rather than in Python (I'm not saying it isn't possible in Python, but it's more common in JS); I promise you will get better results. In my opinion, finding a JS-based solution and writing a Python wrapper around it would be more helpful, because the JS community has struggled more than other communities with importing and working with Excel files. Another idea is to change your approach: look at how you can embed an Excel file in an HTML page (for example in an iframe) with JS and then export it. But again, I highly recommend checking JS libraries or GitHub repositories; some of them care about formatting.

2 Comments

Yeah, I did explore that a bit, but the few libs I found struggled with formatting (e.g. npmjs.com/package/excel-to-html-table). If you have any specific suggestions, I'd be interested in trying them.
Sure. I also suggest opening an issue on github.com/Apkawa/xlsx2html and explaining your case. Maybe they will fix it or give you a better alternative.
