Office 2007 uses a new document format called Office Open XML, or OOXML for short. Word documents in this format typically have the extension .docx to differentiate them from older Word documents, which are .doc. If you open one of these files in Notepad you will notice that it is almost completely unreadable, and you will wonder how this format could possibly be better.
The same file saved in both the Office 2003 .doc format and the Office 2007 .docx format will usually be noticeably smaller as a .docx. Why is this? It's because docx files are really just zip files! Simply take any .docx file, rename it to .zip and extract it. You will then have several directories and XML files which together make up the docx file. In these XML files you can see the different styles used (in word/styles.xml) as well as your text (in word/document.xml).
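If you would rather not rename anything, the same peek inside can be done from a script. Here is a minimal sketch in Python using only the standard library. The function names list_parts and read_paragraphs are made up for this example, and it assumes the usual layout where the body text lives in word/document.xml.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the WordprocessingML elements found in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_parts(docx):
    """Return the names of all files stored inside the docx archive."""
    with zipfile.ZipFile(docx) as archive:
        return archive.namelist()


def read_paragraphs(docx):
    """Return the plain text of each paragraph in word/document.xml."""
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    # <w:p> is a paragraph, <w:t> holds a run of text inside it
    for p in root.iter(f"{{{W_NS}}}p"):
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Calling list_parts("report.docx") (report.docx being whatever file you have on hand) should show entries such as word/document.xml and word/styles.xml, and read_paragraphs gives you the text without any of the formatting.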
Apart from the XML file holding your text, the extracted data will also contain a media folder (word/media) if the document included any images or other media. This folder contains all the images and media that were added to the Word document, making it extremely easy to copy all the images used in a docx to somewhere else. What's even more amazing than getting a folder of all the images just by unzipping is that the images will normally be their original size; all that changes is that they are given a new name such as image1.jpg. Try it for yourself. Create a new Word doc and drag in several images. Then resize the images down and save the file as a docx. Then unzip the docx and look in the media folder. All the images will be in that folder, at their real size instead of the scaled-down size shown in the Word doc.
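Pulling the images out can be scripted too. The following is a rough sketch, assuming the media sits under word/media/ as described above. The name extract_media and the output directory are just placeholders for this example.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_media(docx, out_dir):
    """Copy every file under word/media/ into out_dir and return the saved names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx) as archive:
        for name in archive.namelist():
            # Skip directory entries and anything outside the media folder
            if name.startswith("word/media/") and not name.endswith("/"):
                # Keep only the file name, e.g. image1.jpg
                target = out / PurePosixPath(name).name
                target.write_bytes(archive.read(name))
                saved.append(target.name)
    return saved
```

For example, extract_media("holiday.docx", "holiday_images") would leave image1.jpg, image2.png and so on in a holiday_images folder, with the bytes copied exactly as they are stored in the archive.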