What is Metadata?
Office documents, pictures, videos and other files contain a lot of information in their metadata that may deanonymize their author. Before uploading them to the Internet, you should remove this metadata.
For more information about metadata, please read the MAT website. Metadata is also covered on the Warning page; see "Whonix doesn't clear the metadata of your documents".
Metadata cannot be used for de-anonymization if you follow these guidelines:
- Always think twice before uploading anything.
- Only upload files that you either:
  - created inside your Whonix-Workstation
  - downloaded using your Whonix-Workstation
  - carefully scrubbed yourself
- For example, if you want to upload photos or videos, unless you know what you are doing, get a separate camera that you use only for anonymous activity.
- Keep in mind that even if de-anonymization is not possible, identity correlation to the same pseudonym might still happen. For example, suppose you created a video using video creation software and uploaded it to a popular video portal under the pseudonym A. Then you created another video using the same software on the same machine and uploaded it under the pseudonym B. An adversary checking the metadata could correlate pseudonym A with pseudonym B.
- Although it is not metadata, anyone sharing photos anonymously should also learn about Fingerprintable Camera Anomalies.
- Files produced by editing software (such as Microsoft Word, LibreOffice, and so forth) could leak information about incremental edits and updates.
  - Re-saving a final copy of the document may be enough for mitigation, but this is unverified.
- This is by no means an exhaustive list of file format leak problems. Understand that file format specifications are not designed with adversarial situations in mind.
- JPEG images are stored in PDFs as-is in their complete form and can leak EXIF data.
- Generally speaking, the only reliable way to scrub any type of document without unintended leaks is to use ImageMagick to convert it to images and then import those images into a new PDF before distribution (the exact import step is unconfirmed). Reportedly, this is the same technique used by advanced adversaries.
- Note that you cannot sanitize untrusted files you download this way or any other way. Malicious data can be crafted to remain intact even after being processed by a format encoder. The best way to interact with such files is inside the Workstation VM.
- Then apply MAT on the resulting files for good measure.
- Look into MAT (Metadata Anonymisation Toolkit), which is preinstalled on Whonix.
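The "convert to images, then import into a new PDF" guideline above could be sketched roughly as follows. This is only an illustration under several assumptions: the input is a PDF, ImageMagick's `convert` command is installed and permitted to read PDFs (some distributions disable this in ImageMagick's security policy, and ImageMagick 7 names the command `magick`), and the helper names `rasterize_command`, `assemble_command` and `rebuild_as_images` are hypothetical. The result contains only pixels, so text is no longer selectable, and the output should still be checked with MAT afterwards.

```python
import subprocess
import tempfile
from pathlib import Path


def rasterize_command(src: Path, page_pattern: Path, dpi: int = 150) -> list[str]:
    """Build the ImageMagick call that renders every page of src to an image.

    -density sets the rendering resolution and must come before the input;
    -strip asks ImageMagick to drop profiles and comments from the output.
    """
    return ["convert", "-density", str(dpi), str(src), "-strip", str(page_pattern)]


def assemble_command(pages: list[Path], dest: Path) -> list[str]:
    """Build the ImageMagick call that packs the page images into a new PDF."""
    return ["convert", *[str(p) for p in pages], "-strip", str(dest)]


def rebuild_as_images(src: Path, dest: Path, dpi: int = 150) -> None:
    """Rasterize src page by page and write a fresh, image-only PDF to dest."""
    with tempfile.TemporaryDirectory() as tmp:
        # %04d is replaced by ImageMagick with the page (scene) number.
        pattern = Path(tmp) / "page-%04d.png"
        subprocess.run(rasterize_command(src, pattern, dpi), check=True)

        # Zero-padded names keep the pages in order when sorted as text.
        pages = sorted(Path(tmp).glob("page-*.png"))
        if not pages:
            raise RuntimeError("ImageMagick produced no page images")
        subprocess.run(assemble_command(pages, dest), check=True)
```

A call such as `rebuild_as_images(Path("draft.pdf"), Path("draft-flat.pdf"))` would apply this to a file you created yourself; as the guidelines note, it does not make untrusted downloads safe.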
MAT - Metadata Anonymisation Toolkit
Check the MAT homepage for supported file formats.
Why is MAT not the ultimate solution?
MAT only removes metadata from your files. It does not anonymise their content, nor does it handle watermarking, steganography, or any overly customized metadata field or system. Also, please note that although MAT does its best to scrub as much metadata as possible, it is not very effective at scrubbing media embedded inside complex formats. For example, images embedded inside a PDF may not be cleaned!
(Also keep in mind that MAT is not actively maintained by its author for health reasons.)
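The caveat about images embedded in PDFs can be checked very roughly with a byte scan. The sketch below assumes that an embedded JPEG is stored uncompressed inside the PDF, so that its EXIF block (a JPEG APP1 segment whose payload starts with `Exif` followed by two zero bytes) is visible in the raw file. The function name `find_exif_segments` is hypothetical. Finding nothing proves little, because streams that are additionally compressed or encrypted would be missed.

```python
def find_exif_segments(data: bytes) -> list[int]:
    """Return the byte offsets of JPEG APP1 segments that carry EXIF data.

    An APP1 segment starts with the marker FF E1, followed by a two-byte
    length, followed by the payload. EXIF payloads begin with b"Exif\\x00\\x00".
    """
    offsets = []
    start = 0
    while True:
        i = data.find(b"\xff\xe1", start)
        if i == -1:
            break
        # Skip the 2 marker bytes and the 2 length bytes, then check the header.
        if data[i + 4:i + 10] == b"Exif\x00\x00":
            offsets.append(i)
        start = i + 2
    return offsets


def file_has_embedded_exif(path: str) -> bool:
    """Heuristic check: does the file contain at least one raw EXIF segment?"""
    with open(path, "rb") as handle:
        return bool(find_exif_segments(handle.read()))
```

Running `file_has_embedded_exif` on a cleaned file is one way to notice that an embedded photo survived the cleaning step.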
Add the files you want to clean to the list. The "dirty" state indicates that the file contains removable metadata. After cleaning, the cleaned files will be created in the same directory as the original files with the extension ".cleaned".
- ExifTool - a Perl application for editing metadata in a wide variety of files
- exiv2 - a C++ application to manage image metadata
- jhead - a JPEG header manipulation tool
- pdfparanoia - a tool to remove watermarks from academic papers
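To show what kind of data header tools operate on, here is a minimal pure-Python sketch that drops some metadata segments from a JPEG file. It is not the implementation of any tool listed above, and the name `strip_jpeg_metadata` is hypothetical. It assumes a simple, well-formed file and removes only APP1 (EXIF and XMP), APP13 (Photoshop and IPTC) and COM (comment) segments; other segments, and anything hidden in the pixel data itself, are left untouched, so a dedicated tool remains the better choice for real use.

```python
import struct

# Marker bytes of the segments this sketch removes.
DROPPED_MARKERS = {0xE1, 0xED, 0xFE}  # APP1, APP13, COM
SOS = 0xDA  # start of scan: the compressed image data follows this marker


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG byte string without APP1, APP13 and COM segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")

    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]

        if marker == SOS:
            # Everything from here on is image data; copy it unchanged.
            out += data[pos:]
            return bytes(out)

        # The two-byte big-endian length counts itself plus the payload.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if length < 2 or end > len(data):
            raise ValueError(f"malformed segment at offset {pos}")

        if marker not in DROPPED_MARKERS:
            out += data[pos:end]
        pos = end

    raise ValueError("no SOS marker found")
```

Typical use would be to read a photo with `open(path, "rb")`, pass the bytes through `strip_jpeg_metadata`, and write the result to a new file, keeping the original untouched until the output has been verified.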
Thanks for the MAT public domain screenshot to awxcnx.de.
Gratitude is expressed to JonDos for permission to use material from their website. The Metadata page contains content from the JonDonym documentation Anonymizing Documents and Pictures page.
Unless otherwise noted, the content of this page is copyrighted and licensed under the same Libre Software license as Whonix itself.