알쓸전컴(알아두면 쓸모있는 전자 컴퓨터)
Documents4j 문서 포맷 변환기(excel->html) 본문
Documents4j 문서 포맷 변환기(excel->html)
먼저 해당 문서 를 보기전에 아래 사이트를 실습을 해보거나 분석을 꼭 해보시기 바랍니다.
그래야 해당 문서가 이해가 수월 할것으로 예상 됩니다.
http://idlecomputer.tistory.com/248
먼저 기본 로직은 아래와 같습니다.
아래 로직중
1.temp1 은 Documents4j 라이브러리가 만드는 Input 파라 메터 입니다.
2.temp2 은 Documents4j 라이브러리가 만드는 Output 파라 메터 입니다.
3.압축은 7z을 사용하기 때문에 7z이 설치가 되어 있어야 합니다. (VBScript 를 사용할때 해당 경로 적어 줍니다 )
먼저 HTML 을 위해서 오픈 소스 수정이 필요 합니다.
먼저 Maven 프로젝트 소스를 Import 해보면 https://github.com/documents4j/documents4j/releases (소스 다운로드 Site)
먼저 HTML 포멧을 추가 시켜 줘야 합니다.
에서 HTML 부분을 추가해 줍니다
package com.documents4j.api; import java.io.Serializable; /** * Represents an immutable document MIME type. */ public class DocumentType implements Serializable, Comparable{ public static final DocumentType MS_WORD = new DocumentType(Value.APPLICATION, Value.WORD_ANY); public static final DocumentType RTF = new DocumentType(Value.APPLICATION, Value.RTF); public static final DocumentType DOCX = new DocumentType(Value.APPLICATION, Value.DOCX); public static final DocumentType DOC = new DocumentType(Value.APPLICATION, Value.DOC); public static final DocumentType MS_EXCEL = new DocumentType(Value.APPLICATION, Value.EXCEL_ANY); public static final DocumentType XLSX = new DocumentType(Value.APPLICATION, Value.XLSX); public static final DocumentType XLS = new DocumentType(Value.APPLICATION, Value.XLS); public static final DocumentType ODS = new DocumentType(Value.APPLICATION, Value.ODS); public static final DocumentType CSV = new DocumentType(Value.TEXT, Value.CSV); public static final DocumentType XML = new DocumentType(Value.APPLICATION, Value.XML); public static final DocumentType MHTML = new DocumentType(Value.APPLICATION, Value.MHTML); public static final DocumentType PDF = new DocumentType(Value.APPLICATION, Value.PDF); public static final DocumentType HTML = new DocumentType(Value.APPLICATION, Value.HTML); public static final DocumentType PDFA = new DocumentType(Value.APPLICATION, Value.PDFA); public static final DocumentType TEXT = new DocumentType(Value.TEXT, Value.PLAIN); private final String type; private final String subtype; /** * Creates a new document type. * * @param type The MIME type's type name. * @param subtype The MIME type's subtype name. */ public DocumentType(String type, String subtype) { if (type == null || subtype == null) { throw new NullPointerException("Type elements must not be null"); } this.type = type; this.subtype = subtype; } /** * Creates a new document type. * * @param fullType The MIME type's type name and subtype name, separated by a {@code /}. */ public DocumentType(String fullType) { int separator = fullType.indexOf('/'); if (separator == -1 || fullType.length() == separator + 1) { throw new IllegalArgumentException("Not a legal */* document type: " + fullType); } else { type = fullType.substring(0, separator); subtype = fullType.substring(separator + 1); } } public String getType() { return type; } public String getSubtype() { return subtype; } @Override public boolean equals(Object other) { if (this == other) return true; if (other == null || getClass() != other.getClass()) return false; DocumentType documentType = (DocumentType) other; return subtype.equals(documentType.subtype) && type.equals(documentType.type); } @Override public int hashCode() { return 31 * type.hashCode() + subtype.hashCode(); } @Override public int compareTo(DocumentType other) { return toString().compareTo(other.toString()); } @Override public String toString() { return type + "/" + subtype; } /** * A holder type for type and subtype names of known {@link com.documents4j.api.DocumentType}s. */ public static class Value { public static final String APPLICATION = "application"; public static final String TEXT = "text"; public static final String DOC = "msword"; public static final String DOCX = "vnd.openxmlformats-officedocument.wordprocessingml.document"; public static final String WORD_ANY = "vnd.com.documents4j.any-msword"; public static final String XLS = "vnd.ms-excel"; public static final String XLSX = "vnd.openxmlformats-officedocument.spreadsheetml.sheet"; public static final String EXCEL_ANY = "vnd.com.documents4j.any-msexcel"; public static final String ODS = "vnd.oasis.opendocument.spreadsheet"; public static final String PDF = "pdf"; public static final String HTML = "htm"; public static final String PDFA = "vnd.com.documents4j.pdf-a"; public static final String RTF = "rtf"; public static final String XML = "xml"; public static final String MHTML = "x-mimearchive"; public static final String CSV = "csv"; public static final String PLAIN = "plain"; private Value() { throw new UnsupportedOperationException(); } } }
그리고 아래와 같이 소스를 추가해 줍니다.
package com.documents4j.conversion.msoffice; import com.documents4j.api.DocumentType; /** * An enumeration of MS Excel file * format encodings. */ enum MicrosoftExcelFormat implements MicrosoftOfficeFormat { PDF("999", "pdf", DocumentType.PDF), XLSX("51", "xlsx", DocumentType.XLSX), XLS("43", "xls", DocumentType.XLS), HTML("44", "htm", DocumentType.HTML), ODS("60", "ods", DocumentType.ODS), CSV("6", "csv", DocumentType.CSV), XML("46", "xml", DocumentType.XML), TEXT("42", "txt", DocumentType.TEXT); private final String value; private final DocumentType documentType; private final String fileExtension; private MicrosoftExcelFormat(String value, String fileExtension, DocumentType documentType) { this.value = value; this.fileExtension = fileExtension; this.documentType = documentType; } public static MicrosoftExcelFormat of(DocumentType documentType) { for (MicrosoftExcelFormat enumeration : MicrosoftExcelFormat.values()) { if (enumeration.documentType.equals(documentType)) { return enumeration; } } throw new IllegalArgumentException("Unknown document type: " + documentType); } @Override public String getValue() { return value; } @Override public String getFileExtension() { return fileExtension; } }
위에서 HTML 44 로 설정한 이유는
VBA 에서 html 저장의 상수가 44 이기 때문입니다.
그리고 나서
package com.documents4j.conversion.msoffice; import com.documents4j.api.DocumentType; import com.documents4j.conversion.ViableConversion; import org.slf4j.Logger; import org.slf4j.LoggerFactory; import java.io.File; import java.util.concurrent.Semaphore; import java.util.concurrent.TimeUnit; import static com.documents4j.api.DocumentType.Value.*; /** * A converter back-end for MS Excel. */ @ViableConversion( from = {APPLICATION + "/" + XLS, APPLICATION + "/" + XLSX, APPLICATION + "/" + EXCEL_ANY, APPLICATION + "/" + ODS}, to = {APPLICATION + "/" + PDF, APPLICATION + "/" + XLS, APPLICATION + "/" + XLSX, APPLICATION + "/" + HTML, APPLICATION + "/" + ODS, TEXT + "/" + CSV, TEXT + "/" + PLAIN, APPLICATION + "/" + XML}) public class MicrosoftExcelBridge extends AbstractMicrosoftOfficeBridge { private static final Logger LOGGER = LoggerFactory.getLogger(MicrosoftExcelBridge.class); private static final Object EXCEL_LOCK = new Object(); /** * Other than MS Word, MS Excel does not behave well under stress. Thus, MS Excel must not be asked to convert * more than one document at a time. */ private static final Semaphore CONVERSION_LOCK = new Semaphore(1, true); public MicrosoftExcelBridge(File baseFolder, long processTimeout, TimeUnit processTimeoutUnit) { super(baseFolder, processTimeout, processTimeoutUnit, MicrosoftExcelScript.CONVERSION); startUp(); } private void startUp() { synchronized (EXCEL_LOCK) { tryStart(MicrosoftExcelScript.STARTUP); LOGGER.info("From-Microsoft-Excel-Converter was started successfully"); } } @Override public void shutDown() { synchronized (EXCEL_LOCK) { tryStop(MicrosoftExcelScript.SHUTDOWN); LOGGER.info("From-Microsoft-Excel-Converter was shut down successfully"); } } @Override protected MicrosoftOfficeTargetNameCorrector targetNameCorrector(File target, String fileExtension) { return new MicrosoftExcelTargetNameCorrectorAndLockManager(target, fileExtension, CONVERSION_LOCK, LOGGER); } @Override protected MicrosoftOfficeFormat formatOf(DocumentType documentType) { return MicrosoftExcelFormat.of(documentType); } @Override protected MicrosoftOfficeScript getAssertionScript() { return MicrosoftExcelScript.ASSERTION; } @Override protected Logger getLogger() { return LOGGER; } }
위와 같이 HTML 코드를 추가해 줍니다.
해당 파일에서 실제 VBA 코드를 VBScript 를 작성해 줍니다.
' See http://msdn.microsoft.com/en-us/library/bb243311%28v=office.12%29.aspx Const WdExportFormatPDF = 17 Const MagicFormatPDF = 999 Const HtmFormat = 44 Dim arguments Set arguments = WScript.Arguments Function Zip(Outfile) 'This script is provided under the Creative Commons license located 'at http://creativecommons.org/licenses/by-nc/2.5/ . It may not 'be used for commercial purposes with out the expressed written consent 'of NateRice.com Dim outputfilename Dim outputfoldername Dim outputzipfile Set oFSO = WScript.CreateObject("Scripting.FileSystemObject") Set oShell = WScript.CreateObject("Wscript.Shell") '--------Find Working Directory-------- aScriptFilename = Split(Wscript.ScriptFullName, "\") sScriptFilename = aScriptFileName(Ubound(aScriptFilename)) sWorkingDirectory = Replace(Wscript.ScriptFullName, sScriptFilename, "") '-------------------------------------- '-------Ensure we can find 7z.exe------ If oFSO.FileExists(sWorkingDirectory & "\" & "7z.exe") Then s7zLocation = "" ElseIf oFSO.FileExists("C:\Program Files\7-Zip\7z.exe") Then s7zLocation = "C:\Program Files\7-Zip\" Else Zip = "Error: Couldn't find 7z.exe" Exit Function End If '-------------------------------------- outputfilename = """" & Outfile &".htm" & """" outputfoldername = """" & Outfile & ".files" & """" outputzipfile = """" &Outfile &".zip" & """" oShell.Run """" & s7zLocation & "7z.exe"" a " & outputzipfile &" " _ &outputfoldername&" "& outputfilename, 0, True If oFSO.FileExists(Outfile &".htm") Then oFSO.DeleteFile (Outfile &".htm") End IF If oFSO.FolderExists(Outfile & ".files") Then oFSO.DeleteFolder(Outfile & ".files") END IF oFSO.MoveFile Outfile &".zip",Outfile &".htm" If oFSO.FileExists(Outfile) Then Zip = 1 Else Zip = "Error: Archive Creation Failed." End If End Function ' Transforms a file using MS Excel into the given format. Function ConvertFile( inputFile, outputFile, formatEnumeration ) Dim fileSystemObject Dim excelApplication Dim excelDocument ' Get the running instance of MS Excel. If Excel is not running, exit the conversion. On Error Resume Next Set excelApplication = GetObject(, "Excel.Application") If Err <> 0 Then WScript.Quit -6 End If On Error GoTo 0 ' Find the source file on the file system. Set fileSystemObject = CreateObject("Scripting.FileSystemObject") inputFile = fileSystemObject.GetAbsolutePathName(inputFile) ' Convert the source file only if it exists. If fileSystemObject.FileExists(inputFile) Then ' Attempt to open the source document. On Error Resume Next Set excelDocument = excelApplication.Workbooks.Open(inputFile) If Err <> 0 Then WScript.Quit -2 End If On Error GoTo 0 ' Convert: See http://msdn2.microsoft.com/en-us/library/bb221597.aspx ' Encoding is https://docs.microsoft.com/en-us/dotnet/api/microsoft.office.core.msoencoding?view=office-pia, ' ScreenSize is https://docs.microsoft.com/en-us/dotnet/api/microsoft.office.core.msoscreensize?view=office-pia On Error Resume Next If formatEnumeration = MagicFormatPDF Then For Each ws In excelDocument.Worksheets excelApplication.Application.PrintCommunication = False ws.PageSetup.FitToPagesWide = 1 ws.PageSetup.FitToPagesTall = 0 excelApplication.Application.PrintCommunication = True Next excelDocument.ExportAsFixedFormat xlTypePDF, outputFile, xlQualityStandard, True, True excelDocument.Close False ElseIf formatEnumeration = HtmFormat Then excelDocument.WebOptions.RelyOnCSS = True excelDocument.WebOptions.OrganizeInFolder = True excelDocument.WebOptions.UseLongFileNames = True excelDocument.WebOptions.DownloadComponents = False excelDocument.WebOptions.RelyOnVML = False excelDocument.WebOptions.AllowPNG = True excelDocument.WebOptions.ScreenSize = 4 excelDocument.WebOptions.PixelsPerInch = 96 excelDocument.WebOptions.Encoding = 949 excelApplication.Application.DefaultWebOptions.SaveHiddenData = True excelApplication.Application.DefaultWebOptions.LoadPictures = True excelApplication.Application.DefaultWebOptions.UpdateLinksOnSave = True excelApplication.Application.DefaultWebOptions.CheckIfOfficeIsHTMLEditor = True excelApplication.Application.DefaultWebOptions.AlwaysSaveInDefaultEncoding = False excelApplication.Application.DefaultWebOptions.SaveNewWebPagesAsWebArchives = True excelDocument.SaveAs outputFile, formatEnumeration excelDocument.Close False Call Zip(outputFile) Else excelDocument.SaveAs outputFile, formatEnumeration excelDocument.Close False End If ' Close the source document. If Err <> 0 Then WScript.Quit -3 End If On Error GoTo 0 ' Signal that the conversion was successful. WScript.Quit 2 Else ' Files does not exist, could not convert WScript.Quit -4 End If End Function ' Execute the script. Call ConvertFile( WScript.Arguments.Unnamed.Item(0), WScript.Arguments.Unnamed.Item(1), CInt(WScript.Arguments.Unnamed.Item(2)) )
그리고 나서 Maven Install 을 해주시면
기본 Maven 의 라이브러리 설치 임시 폴더는 c:\user\username\.m2 폴더에 설치가 되고 해당 라이브러리 적용 되게 됩니다.
그리고 나서 stand alone 서버를 mvn package -P extras 로 해주시면 해당 소스가 적용 됩니다.
그리고 나서 테스트 해보면
일단 서버 프로그램 실행 해준뒤에
클라인언트에서
public class Main { public static final String SAMPLE_XLSX_FILE_PATH = "./sample-xlsx-file.xlsx"; public static void main(String[] args) throws InterruptedException, ExecutionException, IOException { // TODO Auto-generated method stub File wordFile = new File("TESTDCO.xlsx"), target = new File("test1.zip"); RemoteConverter.Builder builder = RemoteConverter.builder() .baseFolder(new File("test")) .requestTimeout(3600, TimeUnit.SECONDS) .baseUri("http://127.0.0.1:9998"); IConverter converter = builder.build(); Boolean conversion = converter .convert(wordFile).as(DocumentType.MS_EXCEL) .to(target).as(DocumentType.HTML) .execute(); System.out.println("test"); } }
소스로 테스트 해주면
위와 같이 압축 파일을 전송 받게 되며
이렇게 2개의 파일이 압축 되어 있으므로 해당 파일을 웹서버에 올리면
excel이 html 형식으로 보기 좋게 컨버팅 되어 사용 할수 있습니다.
'기타' 카테고리의 다른 글
[Hadoop] 분산 컴퓨팅 무료 강의 소개 (0) | 2019.07.25 |
---|---|
Meshroom (Image 를 3D CaD 로 만들어 주는 무료 프로그램) (0) | 2019.04.17 |
documents4j 문서 포멧 변환기 example(excel->pdf) (0) | 2018.10.08 |
실시간 위치 추적 시스템 솔류션 업체 (0) | 2018.08.31 |
기초 선형 대수학 자료 (1) | 2017.11.05 |