PDF2Office® Professional v4.0 Tutorial
© Recosoft Corporation 2003-2007
PDF2Office® : What is PDF2Office®?
PDF2Office® is a utility for converting PDF documents into editable Word, PowerPoint, RTF, HTML and other files.
PDF2Office® also allows extracting data such as images/text from a specific range of pages.
PDF2Office® : Usage & Solutions 1
Exchange PDF data more easily
Since most software can output data to the PDF format, it's easy to exchange data contained in PDF files, as PDF2Office® converts it to popular Office formats.
PDF2Office® : Usage & Solutions 2
Recover and Reuse data
Recover and reuse data stored in PDF documents without purchasing expensive PDF tools and other software, reducing cost and saving time.
PDF2Office® : Usage & Solutions 3
Create Slideshows Instantly
Convert PDF documents to PowerPoint files instead of having to assemble various bits of data together to create them from scratch.
PDF2Office® Professional v4.0 : Features - Tour
PDF2Office® Professional v4.0 : Features 1
Convert PDF documents to common files
Convert PDF documents into editable Word, PowerPoint, RTF, Unicode, Text and HTML files.
Open PDF documents directly within Microsoft Office applications
PDF2Office® integrates seamlessly with Microsoft® Word/PowerPoint 2000-2007 & FrontPage/SharePoint Designer, allowing you to open PDF documents within the respective application.
PDF2Office® Professional v4.0 : Features 2
Windows Integration
Right-Click Open
Right-click a PDF document on the Desktop to open it directly in Microsoft® Office applications.
PDF2Office® Professional v4.0 : Features 3
Retain layout mode
Reconstructs a PDF document page by page, maintaining an almost exact replica of the original layout.
Batch Convert
Convert multiple files with one click.
PDF2Office® Professional v4.0 : Features 4
Inspect the details of a PDF file
PDF2Office® allows you to inspect the details of a PDF file, such as the meta-data, restrictions and fonts used, providing valuable information.
Convert Images to popular types
PDF2Office® allows you to extract images contained in PDF files and convert them to JPEG, TIFF, PNG and other types.
PDF2Office® Professional v4.0 : Open a PDF File in Office software - 1
To directly open a PDF document within Microsoft Office applications:
For Word/PowerPoint 2007
Click the “PDF2Office” ribbon, then click the “Open PDF file” command in the PDF2Office ribbon.
For Word/PowerPoint 2000-2003 & FrontPage/SharePoint Designer
Click the “Open PDF file” icon in the PDF2Office toolbar.
PDF2Office® Professional v4.0 : Open a PDF File in Office software - 2
The Open PDF File dialog box appears
Select the PDF file and click Open
PDF2Office® Professional v4.0 : Open a PDF File in Office software - 3
The PDF2Office® - Conversion Options dialog box appears
Conversion Options dialog box for Word
Verify the settings and Click the OK button
PDF2Office® Professional v4.0 : Open a PDF File in Office software - 4
The PDF2Office Progress Information window appears indicating the status of the conversion.
PDF2Office® Professional v4.0 : Open a PDF File in Office software - 5
The converted PDF document appears in a new window within the Microsoft Office application
PDF2Office® Professional v4.0 : Windows Integration
A right-click on a PDF document on the Desktop allows you to open a PDF file in an Office application
PDF2Office® Professional v4.0 : Interface
Toolbar
Conversion Pane
Control Center
Conversion Log Pane
PDF2Office® Professional v4.0 : Interface - Conversion
The Conversion pane is used for converting PDF documents
You can simply drag and drop items out of the Conversion pane to convert them.
PDF2Office® Professional v4.0 : Interface - Conversion Log Pane
The Conversion Log Pane lists the details of conversions
PDF2Office® Professional v4.0 : Control Center 1
When the Control Center is collapsed, it shows Conversion related menus
Clicking on the Conversion kind menu allows specifying different conversion kinds
PDF2Office® Professional v4.0 : Control Center 2
When the Control Center is expanded, it shows the Conversion Settings and Document Inspector Panels
PDF2Office® Professional v4.0 : Control Center 3
The Conversion Settings panel is used to specify detailed Conversion options when converting a PDF file.
Conversion Options
PDF2Office® Professional v4.0 : Control Center 4
The Document Inspector allows viewing the meta-data, restrictions and fonts used in a PDF file
Clicking on the Document Inspector panel reveals the Document Inspector.
PDF2Office® Professional v4.0 : Interface - Drag and Drop Conversion 1
Converting files is as easy as 1 - 2 - 3.
1. Drag and Drop Files into the Conversion Pane
PDF2Office® Professional v4.0 : Interface - Drag and Drop Conversion 2
2. Set the Conversion kind
and the final format to Convert to
PDF2Office® Professional v4.0 : Interface - Drag and Drop Conversion 3
3. Drag the selected items out and drop them to the Desktop
PDF2Office® Professional v4.0 : Conversion Kind - 1
Four Conversion kinds are available.
Convert to Word Processing file
Use this to convert PDF documents to editable Word, RTF and similar documents.
PDF2Office® Professional v4.0 : Conversion Kind - 2
Convert only images to
Use this to extract images contained in the pages of PDF documents.
PDF2Office® Professional v4.0 : Conversion Kind - 3
Convert to Presentation File
Use this to convert PDF files to PowerPoint presentations.
PDF2Office® Professional v4.0 : Conversion Kind - 4
Convert to Web Page format
Use this to convert PDF files to HTML files.
PDF2Office® Professional v4.0 : Conversion Options - 1
PDF2Office® Professional v4.0 offers options for controlling the precision of the conversion process.
• Converting files to a word processing format: You can control whether page breaks, headers/footers, etc. are processed.
• HTML Conversions: You can specify the construct of the HTML document.
• Presentation File type conversions: You can control the image type, compression and resolution.
PDF2Office® Professional v4.0 : Conversion Options - 2
• Page Range: You can specify the range of pages to convert.
• Image type conversions: You can control the compression quality and resolution, and specify grouping of overlapping objects.
PDF2Office® Professional v4.0 : Conversion Options - 2
The conversion options are displayed in the Conversion Settings panel
To control the conversion process, click on the disclosure triangle in the Control Center
PDF2Office® Professional v4.0 : Conversion Options - 3
When converting to the Word Processing file type, you can further specify the processing type to perform -
Free Flowing
Retain layout
Extract only text
Extract only images
PDF2Office® Professional v4.0 : Conversion Options - 4
Retain layout
Select this option if the document you are converting is a form or you want to retain an extremely accurate layout of the original PDF document.
Free Flowing
Select this option to convert a PDF document to a word processing file format by recreating the original construction and layout of the document.
Extract only text
Select this option when you want to extract only the text in a PDF document.
PDF2Office® Professional v4.0 : Conversion Options - 5
Extract only images
Select this option to extract only the images contained in a PDF document.
PDF2Office® Professional v4.0 : Conversion Options - 6
Each processing type allows specifying processing options.
The list below shows some of the options available when converting to word processing files -
Annotations
Make Tables
Graphics
Document Properties
Apply Page Breaks
Document Information
Image Type/Compression/Resolution Options
PDF2Office® Professional v4.0 : Conversion Options - 7
Annotations
Processes annotations stored in PDF documents and applies them to the related text.
Document Properties
Attempts to form headers, footers, sections, columns, document margins, endnotes and footnotes.
Apply Page breaks
Controls whether PDF2Office® should calculate and apply page breaks.
PDF2Office® Professional v4.0 : Conversion Options - 8
Graphics
Processes all graphics and images, and attempts to regroup independent graphic elements.
Make Tables
Attempts to recreate tables where possible.
Document Information
Specifies whether document information such as the document's author should be processed and transferred to the output file.
PDF2Office® Professional v4.0 : Conversion Options - 9
Image Type/Compression/Resolution Options
Allows you to specify the image type, the compression settings and the resolution of the images when converting images.
PDF2Office® Professional v4.0 : Conversion Options - 10
When Converting only images to other image formats you can specify whether overlapping images should be grouped or not and the resolution and compression levels.
PDF2Office® Professional v4.0 : Conversion Options - 12
When Converting to presentation file format you can specify the processing type.
PDF2Office® Professional v4.0 : Conversion Options - 13
Furthermore, you can specify processing options such as image-related settings.
PDF2Office® Professional v4.0 : Conversion Options - 14
For HTML (Web page) conversions, you can include annotations and PDF document information as meta-information.
PDF2Office® Professional v4.0 : Conversion Options - 15
You can also tailor the HTML output by including Navigation links, Table of Contents and other structural information.
PDF2Office® Professional v4.0 : Conversion Options - 16
You can control the range of pages to convert by specifying the pages in the Page Range area.
PDF2Office® Professional v4.0 : Advanced Conversion Options - 1
Starting with PDF2Office Professional v4.0, Advanced Conversion Options can be specified when using the
Convert to Word Processing file
Convert to Presentation file
Convert to Web Page format
conversion kinds.
PDF2Office® Professional v4.0 : Advanced Conversion Options - 2
Hyphenation Processing
The hyphenation options enable precise control over hyphens that appear at the end of a line when a word is hyphenated.
• Remove Hyphen: Setting this option removes the hyphen mark detected in between hyphenated words at the end of a line.
• Retain Hyphens: Setting this option retains the hyphen mark detected in between hyphenated words at the end of a line.
• Don’t Process Hyphens: Setting this option will not process hyphenated words and may at times leave "white space" on either side of the hyphen mark of hyphenated words at the end of a line.
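The effect of the three hyphenation options can be illustrated with a minimal Python sketch. This is not PDF2Office code: the function name `join_lines` and the mode names "remove", "retain" and "none" are hypothetical, chosen only to mirror the options described above.

```python
def join_lines(lines, mode="remove"):
    """Join wrapped lines, handling a hyphen at the end of a line.

    mode is a hypothetical name for the three options:
      "remove" - drop the hyphen and join the two word halves
      "retain" - keep the hyphen and join the two word halves
      "none"   - do not process hyphens; lines are joined with a space
    """
    out = ""
    for line in lines:
        line = line.strip()
        if not out:
            out = line
        elif out.endswith("-") and mode == "remove":
            out = out[:-1] + line
        elif out.endswith("-") and mode == "retain":
            out = out + line
        else:
            # Unprocessed hyphens leave white space after the hyphen mark.
            out = out + " " + line
    return out


wrapped = ["The conver-", "sion is done"]
print(join_lines(wrapped, "remove"))  # The conversion is done
print(join_lines(wrapped, "retain"))  # The conver-sion is done
print(join_lines(wrapped, "none"))    # The conver- sion is done
```

Remove Hyphen generally suits words that were split only because of line wrapping, while Retain Hyphens suits compound words that happened to break at their hyphen.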
PDF2Office® Professional v4.0 : Advanced Conversion Options - 3
Mathematical Formula Processing
The “Form mathematical formulas where possible” option allows PDF2Office to identify and group data that may be interpreted as mathematical formulas.
Although every attempt has been made to identify and group mathematical formulas, at times the processing may not operate as expected.
PDF2Office® Professional v4.0 : Advanced Conversion Options - 4
Font Matching, Substitution & Scaling
Starting with the PDF2Office Professional v4.x series, a new PostScript font matching mechanism has been introduced.
The fonts in the PDF document are matched against the fonts present in your system.
If a match cannot be found, the font can be substituted with a default font and its size scaled to closely match the layout characteristics of the original PDF document.
PDF2Office® Professional v4.0 : Advanced Conversion Options - 5
• Substitute fonts that don't map with preset fonts: This option substitutes preset fonts for the fonts in a PDF document that could not be matched against the fonts installed on your computer.
• Scale font size to match original layout: This option controls whether the size of the substituted fonts should be scaled so that it closely matches the layout of the original document.
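The matching, substitution and scaling steps can be pictured with the following Python sketch. It is an illustration only: the tutorial does not describe how PDF2Office computes the scaled size, so the width-ratio rule, the `widths` table and the function name `resolve_font` are all assumptions.

```python
def resolve_font(pdf_font, size, installed, default_font, widths, scale=True):
    """Return (font_name, font_size) to use for a run of text.

    installed    - set of font names available on the system
    default_font - substitute used when no match is found
    widths       - hypothetical table: average glyph width per font at 1 pt
    scale        - mirrors "Scale font size to match original layout"
    """
    if pdf_font in installed:
        return pdf_font, size          # exact match, nothing to substitute
    if not scale or pdf_font not in widths or default_font not in widths:
        return default_font, size      # substitute without scaling
    # Assumed rule: scale so the substituted text fills about the same width.
    factor = widths[pdf_font] / widths[default_font]
    return default_font, round(size * factor, 2)


widths = {"Frutiger": 0.5, "Arial": 0.625}   # made-up illustrative values
print(resolve_font("Frutiger", 10, {"Arial"}, "Arial", widths))         # ('Arial', 8.0)
print(resolve_font("Frutiger", 10, {"Arial"}, "Arial", widths, False))  # ('Arial', 10)
```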
PDF2Office® Professional v4.0 : Advanced Conversion Options - 6
The Substitution fonts are predefined in the Fonts Substitution panel of the Preferences dialog.
PDF2Office® Professional v4.0 : Advanced Conversion Options - 7
The Substitution Fonts can be specified per file when converting a PDF file using PDF2Office via the “Font Substitution panel” located in the Conversion options area. You can specify a different font for every unmapped font.
PDF2Office® Professional v4.0 : Advanced Conversion Options - 8
You can also specify the Substitution Fonts to use when opening a PDF file directly in the Office Applications.
PDF2Office® Professional v4.0 : Retain layout - 1
The Retain layout processing type is ideal when you want to replicate the PDF document page for page.
When Retain layout processing is used, PDF2Office® tries to match the location of the data as precisely as possible.
PDF2Office® Professional v4.0 : Retain layout - 2
The Retain layout option groups the data and stores everything into the Drawing/Text box layer of the target file type.
All data remains ungrouped allowing for easy editing and furthermore text box formation has been dramatically reduced.
PDF2Office® Professional v4.0 : Conversion Examples 1
This section shows examples of the usage of the various options along with conversion results.
PDF2Office® Professional v4.0 : Conversion Example - 2a
Convert to Word Processing File
Scanned/Faxed Document Example
Background: If a PDF file was created using a scanner or a fax machine, the data is normally an image. PDF2Office is not OCR (Optical Character Recognition) software, so it will treat the data as an image.
To convert documents that have been scanned or faxed, you must first run the PDF file through an OCR process and then save the PDF file.
After the OCR process, the text is ordinarily behind the image.
PDF2Office® Professional v4.0 : Conversion Example - 2b
Convert to Word Processing File
Scanned/Faxed Document Example (contd)
Simply uncheck “graphics” from the “Processing Options” when converting the file and the text will appear fully formatted.
Figure: original PDF file and the result converted to Word format.
PDF2Office® Professional v4.0 : Conversion Example - 3a
Convert to Word Processing File
Page Layout/Graphics Software created PDF files
Background: Page layout and graphics software is very different from word processing software. It is very difficult to replicate page layout documents in a word processor.
To convert PDF documents that were created by such software, you need to tell PDF2Office that you want to retain the layout.
PDF2Office® Professional v4.0 : Conversion Example - 3b
Convert to Word Processing File
Page Layout/Graphics Software created PDF files (contd)
Simply use the “Retain Layout/Process Form” processing type when converting the file. The layout is reproduced almost exactly.
Figure: original PDF file and the result converted to Word format.
PDF2Office® Professional v4.0 : Conversion Example - 4a
Convert only images to
Extracting graphics that overlap and are separate layers
Background: Sometimes a PDF file contains separate overlapped (layered) images. When these images are converted to other image types, the overlapped images are written out as separate images.
In such instances you can combine the layered images, treat them as a single image, and specify the resolution to use.
PDF2Office® Professional v4.0 : Conversion Example - 4b
Convert only images to
Extracting graphics that overlap and are separate layers (contd)
Specify “Group overlapping images” when converting the file and the resolution to use.
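The idea of grouping overlapping images can be sketched in Python as follows. This is only an illustration of the concept, not PDF2Office's actual algorithm: images are represented by hypothetical bounding boxes `(left, top, right, bottom)`, and boxes that overlap directly or through a chain of overlaps are placed in the same group.

```python
def overlaps(a, b):
    """True if boxes a and b, given as (left, top, right, bottom), intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def group_overlapping(boxes):
    """Return groups of box indices; boxes linked by overlap share a group."""
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlaps(boxes[i], boxes[j]):
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def bounding_box(boxes, group):
    """Smallest box enclosing every member of a group."""
    lefts, tops, rights, bottoms = zip(*(boxes[i] for i in group))
    return (min(lefts), min(tops), max(rights), max(bottoms))


boxes = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
print(group_overlapping(boxes))       # [[0, 1], [2]]
print(bounding_box(boxes, [0, 1]))    # (0, 0, 15, 15)
```

Each group would then be written out as one image at the resolution you specify, instead of one image per layer.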
PDF2Office® Professional v4.0 : Conversion Example - 4c
Convert only images to
Extracting graphics that overlap and are separate layers (contd)
Original layered images in PDF file with different resolutions
Converted to JPEG as one image with resolutions matched
PDF2Office® Professional v4.0 : Conversion Example - Summary
PDF2Office® Professional v4.0 provides “FULL” control over the conversion process. The options exist so that you can fine tune the conversion results to achieve the desired output.
The examples provided are a mere glimpse of what can be achieved using the different options.
PDF2Office® Professional v4.0 : Using the Document Inspector - 1
The Document Inspector is a very powerful tool as it allows inspecting the details of a PDF document.
PDF2Office® Professional v4.0 : Using the Document Inspector - 2
It shows the file’s meta-data and, within the meta-data, the original software that the file was created in.
PDF2Office® Professional v4.0 : Using the Document Inspector - 3
Knowing the original software the file was created in allows identifying which process to use when converting a file to the word processing format.
• Use Retain layout when the file was created in page layout software, to maintain the layout.
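The decision above can be written as a small Python sketch. The list of application names and the "Free Flowing" fallback are assumptions made for illustration; the tutorial only states that Retain layout should be used for files created in page layout software.

```python
# Hypothetical, non-exhaustive list of page layout / graphics applications.
PAGE_LAYOUT_CREATORS = ("indesign", "quarkxpress", "pagemaker", "illustrator")


def suggest_processing_type(creator):
    """Suggest a processing type from the creator string in the meta-data."""
    name = (creator or "").lower()
    if any(app in name for app in PAGE_LAYOUT_CREATORS):
        return "Retain layout"
    # Assumption: anything else is treated as an ordinary text document.
    return "Free Flowing"


print(suggest_processing_type("Adobe InDesign CS2 (4.0)"))  # Retain layout
print(suggest_processing_type("Microsoft Word"))            # Free Flowing
```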
PDF2Office® Professional v4.0 : Using the Document Inspector - 4
The Document Inspector also shows the restrictions applied to the file, explaining whether the file can be converted.
PDF2Office® Professional v4.0 : Using the Document Inspector - 5
Finally, the Document Inspector shows the fonts used in the file, including the fonts that cannot be matched, and allows you to decide which font to substitute for each of them.
PDF2Office® Professional v4.0 : Batch Convert - 1
Batch Conversion converts all files in a specific folder.
1. To use this feature, first preset the type through the Preferences settings.
PDF2Office® Professional v4.0 : Batch Convert - 2
2. Then set the Destination folder for the files to be placed after they are converted
3. You can also set the directory where the original PDF files reside as the destination through the Preferences settings.
PDF2Office® Professional v4.0 : Batch Convert - 3
4. Finally, using the Batch Convert command, choose the folder that contains the files to convert.
PDF2Office® Professional v4.0 : Batch Convert - 4
The conversion starts and the converted files are placed in the destination folder
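The batch workflow above amounts to pairing every PDF file in the chosen folder with an output file in the destination folder. The Python sketch below only lists those pairs; it does not call PDF2Office, and the function name `plan_batch` and the ".doc" default extension are hypothetical.

```python
from pathlib import Path


def plan_batch(source_dir, dest_dir=None, extension=".doc"):
    """Pair every PDF in source_dir with the output path it would get.

    When dest_dir is omitted, the folder holding the original PDF files
    is used as the destination.
    """
    source = Path(source_dir)
    dest = Path(dest_dir) if dest_dir else source
    return [
        (pdf, dest / (pdf.stem + extension))
        for pdf in sorted(source.iterdir())
        if pdf.is_file() and pdf.suffix.lower() == ".pdf"
    ]


for pdf, output in plan_batch("."):
    print(pdf.name, "->", output.name)
```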
PDF2Office® Professional v4.0 : Trouble Shooting 1
At times the conversion results may not be what you expect. There can be many reasons; from this section onwards we describe how to troubleshoot some common problems.
PDF2Office® Professional v4.0 : Trouble Shooting 2
• Does the converted file have columns or page margins that don’t seem proper?
Ans: Turn off the Document Properties option and try reconverting the file. Because the Document Properties option tries to work out margins, columns, sections and so on, turning it off sometimes produces a file that is properly formatted.
You can also try using the Retain layout mode.
PDF2Office® Professional v4.0 : Trouble Shooting 3
• Does the converted file have sections that are proper and then an area where the output is not properly formatted?
Ans: Try converting the PDF document by breaking up the conversion by specifying a range of pages to convert. Then specify another range of pages to convert - until you have finished converting all pages.
Finally assemble the files together in the word processor.
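Working out the page ranges for such a piecewise conversion can be sketched in Python as below. The helper `page_ranges` is hypothetical; each resulting range would be entered in the Page Range area for a separate conversion run.

```python
def page_ranges(total_pages, chunk_size):
    """Split pages 1..total_pages into consecutive (first, last) ranges."""
    if total_pages < 1 or chunk_size < 1:
        raise ValueError("total_pages and chunk_size must be positive")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


print(page_ranges(25, 10))  # [(1, 10), (11, 20), (21, 25)]
```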
PDF2Office® Professional v4.0 : Trouble Shooting 4
• I have a document that has both landscape and portrait pages. The converted output has improper formatting. What should I do?
Ans: Try converting the PDF document by breaking up the conversion by specifying a range of pages to convert. Then specify another range of pages to convert - until you have finished converting all pages.
Finally assemble the files together in the word processor.
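For documents that mix orientations, one reasonable way to choose the ranges is to start a new range wherever the orientation changes. The Python sketch below assumes you already know each page's width and height; the function name `orientation_ranges` and the input format are hypothetical.

```python
from itertools import groupby


def orientation_ranges(page_sizes):
    """Group consecutive pages that share an orientation.

    page_sizes - list of (width, height), one entry per page, in order
    Returns [(first_page, last_page, orientation), ...] with 1-based pages.
    """
    labels = ["landscape" if w > h else "portrait" for w, h in page_sizes]
    ranges, page = [], 1
    for label, run in groupby(labels):
        count = len(list(run))
        ranges.append((page, page + count - 1, label))
        page += count
    return ranges


sizes = [(612, 792), (612, 792), (792, 612), (612, 792)]
print(orientation_ranges(sizes))
# [(1, 2, 'portrait'), (3, 3, 'landscape'), (4, 4, 'portrait')]
```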
PDF2Office® Professional v4.0 : Trouble Shooting 5
• Are tables improperly formatted?
Ans: There can be many reasons why tables may not format properly. One possible reason is that the table has borders that are hidden or cells that are not connected. In such cases PDF2Office® can’t determine the table precisely and may capture the data as a graphic. The best option is to deselect both the Graphics and Make Tables processing options and then reconstruct the table in the output document by using the text data that was collected.
PDF2Office® Professional v4.0 : Trouble Shooting 6
• Are graphics not being processed properly?
Ans: PDF2Office® Professional v4.0 doesn’t honor certain clipping paths, transparencies and some graphics transformations and manipulations. In such cases, turning off the “Graphics” option stops graphics processing, and no graphics are output into the converted file.
PDF2Office® Professional v4.0 : Trouble Shooting 7
• Do you get an Error -3 or Error -11000?
Ans: An Error -3 or Error -11000 means that the PDF document contains some kind of data that cannot be processed, and the file may not get converted.
PDF2Office® Professional v4.0 : Trouble Shooting 8
• Are you trying to convert PDF documents created by Page Layout software?
Ans: Since a word processing software is not a Page Layout program it is very difficult to reconstruct such documents due to inherent limitations in a word processor. The Retain layout option attempts to reconstruct the document page by page maintaining an almost exact replica of the original layout.
PDF2Office® Professional v4.0 : Trouble Shooting 9
• It takes a long time to convert 800+ page files.
Ans: We recommend converting 800+ page documents on a machine equipped with at least a 2 GHz processor and at least 512 MB of RAM to speed things up.
PDF2Office® Professional v4.0 : Trouble Shooting 10
• Does the document still not convert properly?
Ans: Please contact Recosoft Corporation customer support about this. We are continuously improving the PDF conversions.
PDF2Office® Professional v4.0: Processing Scope - 1
• PDF2Office® takes a PDF document and performs the following processing -
1. Forms paragraphs and applies indentations (justification is set to left or center)
2. Applies text styles and retains font information (or font mapping is performed)
3. Constructs page properties such as margins and page breaks where appropriate
4. Calculates columns and section breaks
5. Matches headers and footers where possible
PDF2Office® Professional v4.0: Processing Scope - 2
• PDF2Office® Professional v4.0 takes a PDF document and performs the following processing (contd.) -
6. Forms endnotes/footnotes
7. Identifies and creates Tables
8. Regroups intersecting and overlapping graphics
9. Processes all images (except JBIG2/JPEG2000 format) and re-groups intersecting sliced images
PDF2Office® Professional v4.0: Specifications
• Microsoft Windows 2000 SP4 with GDI+, XP, 2003, Vista
• Machine with at least a 500 MHz Pentium III processor
• Microsoft Word/PowerPoint 2000-2007 to open PDF documents directly within Microsoft Word/PowerPoint
• Microsoft FrontPage 2003/SharePoint Designer to open PDF documents directly in FrontPage/SharePoint Designer
PDF2Office® Professional v4.0: Export File Formats
• Word 97/98-2003/2004 Windows/Macintosh
• PowerPoint 97/98-2003/2004 Windows/Macintosh
• RTF
• Unicode UTF-8/16
• HTML
• JPEG/PNG/TIFF/PICT/BMP/Photoshop etc…
PDF2Office® Professional v4.0: Contact Information
• Recosoft Corporation
Ustubo Hommachi 2-9-11, Nishi-ku, Osaka, Japan
Tel: +81-6-6443-0015 Fax: +81-6-6443-1458
• For General Inquiries : [email protected]
• http://www.recosoft.com
PDF2Office® Professional v4.0 : Notices
PDF2Office® is a registered trademark of Recosoft Corporation.
All other trademarks are the property of their respective owners.