Office documents metadata

Catalogs files and extracts their metadata to XML and HTML.
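
As a rough illustration of the idea, the sketch below walks a folder, reads the core properties of OOXML files (.docx, .xlsx, .pptx), and emits an XML catalog. It relies on the fact that these files are ZIP packages with a docProps/core.xml part. The function names, the selected fields, and the catalog/file element names are assumptions made for this example, not this tool's actual interface or output schema. Older binary formats (.doc, .xls) and HTML output are not covered.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespaces used by docProps/core.xml in OOXML packages.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Illustrative selection of fields (an assumption, not the tool's real list).
FIELDS = {
    "title": "dc:title",
    "creator": "dc:creator",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}

OOXML_SUFFIXES = {".docx", ".xlsx", ".pptx"}


def read_core_properties(path):
    """Return a dict of core properties, or {} if none can be read."""
    try:
        with zipfile.ZipFile(path) as zf:
            data = zf.read("docProps/core.xml")
        root = ET.fromstring(data)
    except (zipfile.BadZipFile, KeyError, OSError, ET.ParseError):
        # Not a ZIP package, no core.xml part, unreadable, or malformed XML.
        return {}
    props = {}
    for name, tag in FIELDS.items():
        node = root.find(tag, NS)
        if node is not None and node.text:
            props[name] = node.text
    return props


def build_catalog(folder):
    """Walk folder recursively and return an XML <catalog> element."""
    catalog = ET.Element("catalog")
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in OOXML_SUFFIXES:
            entry = ET.SubElement(catalog, "file", path=str(path))
            for name, value in read_core_properties(path).items():
                ET.SubElement(entry, name).text = value
    return catalog


def catalog_to_xml(folder):
    """Serialize the catalog for folder as an XML string."""
    return ET.tostring(build_catalog(folder), encoding="unicode")
```

Calling catalog_to_xml("some/folder") returns the catalog as a string; files whose metadata cannot be read still appear in the catalog, just without property elements.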