Thanks to a recent update, "Use EPO publication server for obtaining PDFs of EP publications (#12)" in ip-tools/ip-navigator@525f5f4 (thanks again, @aghster!), PDF documents can now be fetched directly from the European publication server.
After looking at them in detail, we can say these documents are produced to high standards. Besides containing actual ASCII-accessible text (not just scanned images attached to each other), they also have XMP metadata in XML/RDF format embedded in them.
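As a rough illustration of how such an embedded packet could be located, here is a minimal sketch. It assumes the PDF stores its metadata stream unfiltered and unencrypted, so the packet is visible as plain text in the file; otherwise a real PDF library would be needed. The name `find_xmp_packet` and the sample bytes are made up for this example and are not part of PatZilla or any EPO tooling.

```python
import re

# Matches one XMP packet: the "<?xpacket begin=...?>" header through the
# "<?xpacket end=...?>" trailer. DOTALL lets "." span line breaks.
XPACKET_RE = re.compile(
    rb"<\?xpacket begin=.*?\?>.*?<\?xpacket end=.*?\?>", re.DOTALL
)


def find_xmp_packet(pdf_bytes):
    """Return the first raw XMP packet found in the bytes, or None."""
    match = XPACKET_RE.search(pdf_bytes)
    return match.group(0) if match else None


# Synthetic stand-in for a PDF file, for demonstration only.
sample = (
    b"%PDF-1.4\n1 0 obj\n<< /Type /Metadata /Subtype /XML >>\nstream\n"
    b"<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>\n"
    b"<x:xmpmeta xmlns:x='adobe:ns:meta/'></x:xmpmeta>\n"
    b"<?xpacket end='w'?>\nendstream\nendobj\n"
)

packet = find_xmp_packet(sample)
```

For a downloaded document, the same function could be applied to the bytes read from the file, for example `find_xmp_packet(open(path, "rb").read())`.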
This stems from the EPO initiative to encode and publish data as linked open data; see also "EPO - Linked open EP data". We had already aimed at unlocking this for PatZilla, but have not had the chance to try it yet. Still, it is great to see this in the wild already.
Adobe’s Extensible Metadata Platform (XMP) is a file labeling technology that lets you embed metadata into files themselves during the content creation process. With an XMP enabled application, your workgroup can capture meaningful information about a project (such as titles and descriptions, searchable keywords, and up-to-date author and copyright information) in a format that is easily understood by your team as well as by software applications, hardware devices, and even file formats. Best of all, as team members modify files and assets, they can edit and update the metadata in real time during the workflow.
With XMP, desktop applications and back-end publishing systems gain a common method for capturing, sharing, and leveraging this valuable metadata. Adobe has taken the “heavy lifting” out of metadata integration, offering content creators an easy way to embed meaningful information about their projects and providing industry partners with standards-based building blocks to develop optimized workflow solutions.
By providing a standard way of tagging files with metadata across products from Adobe and other vendors, XMP is a powerful solution enabler. As an open source technology, it is freely available to developers, which means that the user community benefits from the innovations contributed by developers worldwide. The XMP SDKs are available in the downloads section. Furthermore, XMP is extensible — it can accommodate existing metadata schemas, so systems don’t need to be rebuilt from scratch. A growing number of third-party applications now support XMP.
XMP has also been an ISO standard (ISO 16684-1) since early 2012.
As a serialization format, a subset of the W3C RDF/XML syntax is most commonly used. It is a syntax for expressing a Resource Description Framework (RDF) graph in XML. The same XMP packet can be serialized in RDF/XML in various equivalent ways.
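To make the point about equivalent serializations concrete, the following sketch parses two hand-written RDF/XML fragments that express the same simple property, once as a child element and once as an attribute of `rdf:Description`. The fragments and the helper `simple_properties` are illustrative assumptions, not taken from an actual EPO document, and the helper only handles simple text-valued properties, not arrays or structures.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XMP = "http://ns.adobe.com/xap/1.0/"

# Form 1: the property is a child element of rdf:Description.
AS_ELEMENT = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:xmp="{XMP}">
  <rdf:Description rdf:about="">
    <xmp:CreatorTool>ExampleTool 1.0</xmp:CreatorTool>
  </rdf:Description>
</rdf:RDF>
"""

# Form 2: the same property as an attribute on rdf:Description.
AS_ATTRIBUTE = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:xmp="{XMP}">
  <rdf:Description rdf:about="" xmp:CreatorTool="ExampleTool 1.0"/>
</rdf:RDF>
"""


def simple_properties(rdf_xml):
    """Collect simple (text-valued) properties from every rdf:Description."""
    props = {}
    root = ET.fromstring(rdf_xml)
    for desc in root.iter(f"{{{RDF}}}Description"):
        # Attribute form; skip RDF's own attributes such as rdf:about.
        for name, value in desc.attrib.items():
            if not name.startswith(f"{{{RDF}}}"):
                props[name] = value
        # Element form; only leaf elements carrying plain text.
        for child in desc:
            if len(child) == 0 and child.text and child.text.strip():
                props[child.tag] = child.text.strip()
    return props


# Both serializations yield the same property dictionary.
same = simple_properties(AS_ELEMENT) == simple_properties(AS_ATTRIBUTE)
```

A reader of XMP data therefore has to accept both forms. Comparing packets as raw strings is not a reliable way to tell whether their metadata differs.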