Surfing Posting Blogging
Surf, Blog and Post anonymously on the Internet. Essential knowledge about Anonymous File Sharing, Keystroke / Mouse Fingerprinting and Stylometry risks. Tips for avoiding detection.
Tor Browser is installed in Whonix by default to browse the Internet anonymously. Tor Browser is optimized for safe browsing via pre-configured security and anonymity settings that are quite restrictive. It is recommended to read the entire Tor Browser chapter for tips on basic usage before undertaking any high-risk activities.
Whonix-Workstation contains all the necessary tools to post or run a blog anonymously. It is recommended to review the following chapters / sections, as well as follow all the recommendations on this page:
- Tips on Remaining Anonymous
- Data Collection Techniques
- Unsafe Tor Browser Habits
- Hardware Threat Minimization
- Multiple Whonix-Workstation
Anonymous File Sharing
It is possible for adversaries to link audio recordings to the specific hardware (microphone) that is used. This has implications for shooting anonymous videos. It is also trivial to fingerprint the embedded audio acoustics associated with the particular speaker device; for example, consider ringtones and video playback in public spaces.  For these reasons it is recommended to follow the operational security measures in the Photographs section when sharing audio files.
This recommendation equally applies to any data that is recorded by each and every other sensor component, such as accelerometers.  The best way to defend against this threat is to deny all access to the hardware in question, while also avoiding the sharing of unencrypted data recorded by sensors. Similarly, it is inadvisable to share audio with third parties who have limited technical ability or who are potentially malicious.
Digital watermarks are a subset of the science of steganography and can be applied to any type of digital media, including audio, pictures, video, texts or 3D models.  In basic terms, covert markers are embedded into the "noise" of data which are imperceptible to humans: 
Digital watermarking is defined as inserted bits into a digital image, audio or video file that identify the copyright information; the digital watermarking is intended to be totally invisible unlike the printed ones, bits are scattered in different areas of the digital file in such a way that they cannot be identified and reproduced, otherwise the whole goal of watermarking is compromised.
A digital watermark is said to be robust if it remains intact even if modifications are made to the files.   In addition to protecting copyright, another watermarking goal is to trace back information leaks to the specific source. A good countermeasure to this threat is to run documents through an optical character recognition (OCR) reader and share the output instead.
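As a rough illustration of how a marker can hide in the "noise" of data, the sketch below embeds an identifier in the least significant bits of sample values. The helper names (`embed_lsb`, `extract_lsb`, `requantize`), the sample values and the LSB scheme itself are assumptions chosen for simplicity; this is not how any particular watermarking product works.

```python
def embed_lsb(samples, bits):
    """Hide one bit in the least significant bit of each leading sample."""
    if len(bits) > len(samples):
        raise ValueError("not enough samples to carry the payload")
    marked = list(samples)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_lsb(samples, n_bits):
    """Read back the least significant bit of the first n_bits samples."""
    return [s & 1 for s in samples[:n_bits]]


def requantize(samples, step=4):
    """Crude lossy re-encoding: snap each sample down to a multiple of step."""
    return [(s // step) * step for s in samples]


# Hypothetical 8-bit audio or pixel samples and a hypothetical recipient ID.
original = [200, 13, 77, 154, 91, 36, 248, 5]
recipient_id = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_lsb(original, recipient_id)
max_change = max(abs(a - b) for a, b in zip(original, marked))
recovered = extract_lsb(marked, len(recipient_id))
after_reencoding = extract_lsb(requantize(marked), len(recipient_id))
```

Each sample changes by at most one unit, which is why such a mark is imperceptible. This naive mark is wiped out by simple re-encoding, whereas a robust watermark is designed to survive it. That is the reason for re-creating the content (for example via OCR) instead of merely re-saving the file.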
According to a talk by Sarah Harrison from WikiLeaks,  source tracing can also happen through much simpler techniques such as inspecting the access lists for the materials that have been leaked. For example, if only three people have access to a set of documents then the hunt is narrowed down considerably.
Redacting identifying information in electronic documents by means of image transformation (blurring or pixelization) has proven inadequate for concealing the intended text; the words can be reconstructed by machine learning algorithms. Solid bars are sufficient but they must be large enough to fully cover the original text. Otherwise, clues are left about the length of underlying word(s) which makes it easier to infer the censored text based on the sentence remainder.  Only digital redaction bars are recommended as manual Sharpie ones can be insufficient, leading to leaks when documents are scanned.
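The length leak from a tightly fitted redaction bar can be sketched as follows. The function name, the monospaced 10-pixel character width and the candidate list are all hypothetical values picked for the example.

```python
def candidates_for_bar(bar_width_px, char_width_px, wordlist):
    """Return the words whose rendered width matches a tightly fitted bar."""
    n_chars = round(bar_width_px / char_width_px)
    return [w for w in wordlist if len(w) == n_chars]


# Hypothetical values: a monospaced font at 10 px per character and a
# hypothetical list of names that could plausibly fill the redacted slot.
names = ["Al", "Bob", "Eve", "Carol", "Mallory", "Trent"]
tight_bar = candidates_for_bar(70, 10, names)
short_bar = candidates_for_bar(30, 10, names)
```

Here a 70-pixel bar leaves a single candidate. A bar padded to a generous fixed width, regardless of the text underneath, gives no such hint.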
Every camera's sensor has a unique noise signature because of subtle hardware differences. The sensor noise is detectable in the pixels of every image and video shot with the camera and could be fingerprinted. In the same way ballistics forensics can trace a bullet to the barrel it came from, the same can be accomplished with adversarial digital forensics for all images and videos.   Note this effect is different from file Metadata that is easily sanitized with the Metadata Anonymization Toolkit v2 (MAT2). 
A camera fingerprint arises for the following reason: 
Photo-Response NonUniformity (PRNU) is an intrinsic property of all digital imaging sensors due to slight variations among individual pixels in their ability to convert photons to electrons. Consequently every sensor casts a weak noise-like pattern onto every image it takes and this pattern plays the role of a sensor fingerprint.
The reason for this phenomenon is that all devices have manufacturing imperfections that lead to small variations in camera sensors, causing some pixels to project colors a little brighter or darker than normal. When extracted by filters, this leads to a unique pattern.  Simply put, the type of sensor being used, along with shot and pattern noise, leads to a specific fingerprint.
The threat to privacy is obvious: if the camera reference pattern can be determined and the noise of an image is calculated, a correlation between the two can be formed. For example, recent research suggests that only one image is necessary to uniquely identify a smartphone based on the particular PRNU of the built-in camera's image sensor.  Major data mining corporations are starting to use this technique to associate identities of camera owners with everything or everyone else they shoot.  It follows that governments have had the same capabilities for some time now and can apply them to their vast troves of data.
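The correlation step described above can be sketched with a toy model. Everything here is an assumption made for illustration: the sensors are simulated, only evenly lit (flat-field) shots are used so that the image mean can stand in for a denoising filter, and the function names are hypothetical. Real forensic tools use proper denoising filters and statistics such as PCE.

```python
import math
import random


def pearson(a, b):
    """Normalized correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def make_sensor(rng, n_pixels, strength=0.02):
    """Hypothetical per-pixel gain deviations standing in for PRNU."""
    return [rng.gauss(0.0, strength) for _ in range(n_pixels)]


def shoot_flat_field(rng, sensor, brightness):
    """Simulate an evenly lit shot: scene * (1 + PRNU) + random noise."""
    return [brightness * (1.0 + k) + rng.gauss(0.0, 1.0) for k in sensor]


def noise_residual(image):
    """Toy 'denoising': for a flat scene the mean is the scene estimate."""
    mean = sum(image) / len(image)
    return [(p - mean) / mean for p in image]


def reference_pattern(images):
    """Average the residuals of several shots to estimate the fingerprint."""
    residuals = [noise_residual(img) for img in images]
    return [sum(col) / len(col) for col in zip(*residuals)]


rng = random.Random(7)
camera_a = make_sensor(rng, 4096)
camera_b = make_sensor(rng, 4096)

# Reference pattern built from shots known to come from camera A.
known_shots = [shoot_flat_field(rng, camera_a, 150.0) for _ in range(8)]
fingerprint_a = reference_pattern(known_shots)

# Compare one new shot from each camera against camera A's fingerprint.
shot_a = noise_residual(shoot_flat_field(rng, camera_a, 120.0))
shot_b = noise_residual(shoot_flat_field(rng, camera_b, 120.0))
same_camera = pearson(shot_a, fingerprint_a)
other_camera = pearson(shot_b, fingerprint_a)
```

In this model the shot from the same sensor correlates strongly with the reference pattern, while the shot from the other sensor correlates close to zero. The gap between the two values is what makes attribution possible.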
There are methods to destroy, forge or remove PRNU, but these should only be used with caution. The reason is that related research shows spoofing sensor fingerprints in image files is non-trivial and that such spoofing is easily defeated.  
Other unique camera identifiers include specific JPEG compression implementation, distinct pattern of defective pixels (hot/dead), focal length and lens distortion, camera calibration and radial distortion correction, distribution of pixels in a RAW image and statistical tests such as Peak to Correlation Energy (PCE) ratio. 
Operational Security Advice
This section assumes the user wants to preserve their anonymity, even when publicly sharing media on networks that are monitored by the most sophisticated adversaries on the Internet. Always conduct a realistic threat assessment before proceeding. These steps do not apply for communications that never leave anonymous encrypted channels between trusted and technically competent parties.
Table: Operational Security Advice
|Current Devices||It is almost a certainty that photos and videos have been shared from your current devices through non-anonymous channels. Do not use any of these devices to shoot media that will be shared anonymously.|
Most will probably want to avoid phones altogether and use tablets instead, but for most situations phones are a reasonable choice:
Keystroke biometric algorithms have advanced to the point where it is viable to fingerprint individuals based on soft biometric traits. This is a privacy risk because masking spatial information -- such as the IP address via Tor -- is insufficient for anonymity. 
Unique fingerprints can be derived from various dynamics: 
- Typing speed.
- Exactly when each key is located and pressed (seek time), how long it is held down before release (hold time), and when the next key is pressed (flight time).
- How long the breaks/pauses are in typing.
- How many errors are made and the most common errors produced.
- How errors are corrected during the drafting of material.
- The type of local keyboard that is being used.
- The likelihood of being right or left-handed.
- Rapidity of letter sequencing indicating the user's likely native language.
A unique neural algorithm generates a primary pattern for future comparison. It is thought that most individuals produce keystrokes that are as unique as handwriting or signatures. This technique is imperfect; typing styles can vary during the day and between different days depending on a person's emotional state and energy level. 
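A minimal sketch of how timing features such as hold time and flight time could be derived from raw key events is shown below. The event format, the function name and the sample values are hypothetical, and flight time is taken here as the gap between releasing one key and pressing the next, which is only one of several definitions in use.

```python
from statistics import mean


def timing_features(events):
    """Compute hold and flight times from (key, press_ms, release_ms) tuples.

    Events are assumed to be ordered by press time.
    """
    holds = [release - press for _, press, release in events]
    # Flight time here: release of one key to the press of the next key.
    flights = [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]
    return {
        "mean_hold_ms": mean(holds),
        "mean_flight_ms": mean(flights),
        "holds_ms": holds,
        "flights_ms": flights,
    }


# Hypothetical recording of someone typing "tor".
sample = [("t", 0, 95), ("o", 180, 260), ("r", 310, 420)]
profile = timing_features(sample)
```

Collected over many keystrokes, vectors like these form the pattern that a classifier compares against later typing sessions.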
There are several related anonymity threats that need to be considered. For instance:
- Linguistic style must be disguised to combat stylometric analysis.
- Mouse tracking techniques must also be countered.
Keystroke Anonymization Tool (kloak)
kloak is designed to stymie adversary attempts to identify and/or impersonate users' biometric traits. The GitHub site succinctly describes kloak's purpose and the tradeoff between usability and the level of privacy. Notably, shorter time delays between keystrokes and release events reduce overall anonymity: 
kloak is a privacy tool that makes keystroke biometrics less effective. This is accomplished by obfuscating the time intervals between key press and release events, which are typically used for identification. This project is experimental.
kloak works by introducing a random delay to each key press and release event. This requires temporarily buffering the event before it reaches the application (e.g., a text editor).
The maximum delay is specified with the -d option. This is the maximum delay (in milliseconds) that can occur between the physical key events and writing key events to the user-level input device. The default is 100 ms, which was shown to achieve about a 20-30% reduction in identification accuracy and doesn't create too much lag between the user and the application (see the paper below). As the maximum delay increases, the ability to obfuscate typing behavior also increases and the responsiveness of the application decreases. This reflects a tradeoff between usability and privacy.
While kloak makes it hard for adversaries to identify individuals or to replicate their typing behavior -- for example to overcome two-factor authentication based on keystroke biometrics -- it is not perfect:
- Small delays are not effective; higher values that can be tolerated are preferable.
- It does not address stylometric threats.
- Repeated (held-down) key presses that repeat at a unique rate can lead to identification.
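The random-delay idea described above can be sketched in a few lines. This is an illustrative model only and not kloak's actual code: the function name and the sample timestamps are hypothetical, and the only property borrowed from the description is that each event is buffered for a random time no longer than the configured maximum.

```python
import random


def obfuscate(event_times_ms, max_delay_ms=100, rng=None):
    """Delay each event by a random amount of at most max_delay_ms.

    The lower bound on each delay keeps events in their original order.
    """
    rng = rng or random.SystemRandom()
    released = []
    prev_release = 0
    for t in event_times_ms:
        # Never release an event before the previously released one.
        lower = min(max(prev_release - t, 0), max_delay_ms)
        delay = rng.randint(lower, max_delay_ms)
        prev_release = t + delay
        released.append(prev_release)
    return released


# Hypothetical key event timestamps (ms) with a very regular rhythm.
typed = [0, 120, 240, 360, 480]
seen_by_application = obfuscate(typed, max_delay_ms=100)

# Closely spaced events still come out in order.
burst = [0, 5, 10, 15]
burst_seen = obfuscate(burst, max_delay_ms=100)
```

The intervals the application sees no longer match the physical rhythm. A larger maximum delay spreads them further, at the cost of more input lag.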
Testing and Interpretation
NOTE: The test website documented below (keytrac.net) seems to be permanently down. This wiki page still needs to be updated to use one or more of the following tests. Help welcome!
- TODO: research https://github.com/topics/keystroke-dynamics
It is recommended to test that kloak is actually working by trying an online keystroke biometrics demo. Three different scenarios are available, but "Train normal" (without kloak running) is not recommended for anonymity reasons:
- Train normal, test normal
- Train normal, test kloak
- Train kloak, test kloak
The KeyTrac demo allows the entering of a username and password on the enrollment page and then testing it on an authentication page. Below is a sample result and interpretation of entering a username and password with/without kloak running, with both training methods.
Table: Sample kloak Test Results
|kloak Configuration||Results and Interpretation|
|Train normal, test normal||
|Train normal, test kloak||
|Train kloak, test kloak||
From the first test set it is evident that without kloak, users can be identified with a high degree of certainty. The second test set demonstrates that kloak definitely obfuscates typing behavior, making it difficult to authenticate or identify a particular user. Finally, the third set evidences that users who run kloak may look "similar" to one another. That is, it might be possible to distinguish kloak users from non-kloak users; if this is true, then the anonymity set will increase as more users start running kloak.
- Specific mouse tracking software can reveal:
Covert Impairments in Human Computer Interaction
Recent research on deceptive input modifications - where a site deliberately misrepresents mouse movements or key presses to elicit a corrective user reaction - reveals that this reaction is apparently fingerprintable. This tactic is used in CAPTCHAs and site logins. Possible mitigations involve detecting third party meddling with inputs (since it is an active process) and applying anti-fingerprinting protections on the fly.
Whonix does not obfuscate an individual's writing style. Consequently, unless precautions are taken (see below), users are at risk from stylometric analysis based on their linguistic style. Research suggests only a few thousand words (or less) may be enough to positively identify an author and there are a host of software tools available to conduct this analysis.
This technique is used by advanced adversaries to attribute authorship to anonymous documents, online texts (web pages, blogs etc.), electronic messages (emails, tweets, posts etc.) and more. The field is dominated by A.I. techniques like neural networks and statistical pattern recognition, and is critical to privacy and security. Current anonymity and circumvention systems are focused on location-based privacy, but ignore leakage of identification via the content of data which has a high accuracy in authorship recognition (90%+ probability). 
- stylistic flourishes
- spelling preferences and misspellings
- language preferences
- word frequency
- number of unique words
- regional linguistic preferences in slang, idioms and so on
- sentence/phrasing patterns
- word co-location (pairs)
- use of formal/informal language
- function words
- vocabulary usage and lexical density
- character count with white space
- average sentence length
- average syllables per word
- synonym choice
- expressive elements like colors, layout, fonts, graphics, emoticons and so on
- analysis of grammatical structure and syntax
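A few of the features listed above are simple to compute, which helps explain why so many analysis tools exist. The following is a minimal sketch; the function name, the short function-word list and the sample text are hypothetical, and real stylometry tools use far larger feature sets.

```python
import re
from collections import Counter

# A tiny, hypothetical sample of English function words.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "it", "but", "which")


def style_features(text):
    """Extract a handful of simple stylometric features from a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "unique_words": len(counts),
        "lexical_diversity": len(counts) / len(words),
        "function_word_rates": {w: counts[w] / len(words) for w in FUNCTION_WORDS},
    }


sample = "The cat sat. The dog ran to the cat and it sat."
features = style_features(sample)
```

Function-word rates are particularly telling because writers rarely choose them consciously, so they tend to stay stable across topics.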
Fortunately, research suggests that purposefully obfuscating linguistic style or imitating the style of other known authors is largely successful in defeating stylometric analysis methods, leaving them no better than randomly guessing the correct author of a document. However, using automated methods like machine translation services does not appear to be a viable method of circumvention. 
Tips for Anonymous Posting, Blogging and Uploading
Before undertaking any anonymous activities, be sure to understand and exercise a healthy dose of Operational Security (OpSec). Even the best anonymity software available today cannot prevent catastrophic mistakes by individuals.
Table: Blogging Tips
|Activity Partitioning||Separate all online activities and only use a dedicated email address for the blog.|
|Blog Administration||Usually the blog is administrated via a web interface only. Use Tor Browser for all blog activities.|
|Blog Posting||Every type of blog software offers the option to select a point in time when new postings are published. It is safer to delay the publishing of new posts to a time when you are not online anymore, rather than publishing immediately. |
|Email Address Registration||
For anonymous blogs hosted on third-party services, register it with a new and anonymous e-mail address (see E-Mail) that has never been used before and which has been exclusively paired with Tor for logins and other related activity: 
Table: Browser Input Tips
|Accidental Searches||Text can be accidentally pasted into the search or URL bar, which triggers an unintended search across the public internet.|
Hardware Threat Mitigation
Table: Hardware Threat Tips
|Disable Dangerous Peripherals||
|Remove External Devices||Remove all phones, tablets and so on from the room to avoid them issuing watermarked sounds as well as listening to keystroke sounds and watermarked sounds.   Similarly, do not make / take calls in the same room where anonymous browsing is underway, or run sensitive applications (like Tor Browser for Android) or have documents open on the phone before calls.|
|Side-channel Attacks||This class of attacks depends on eavesdropping on signals that are passively leaked by a trusted process, which a surveilling entity can use to reconstruct sensitive data on the computer. These are more dangerous than the covert-channels discussed below.
|Covert-channel Attacks||In contrast to side-channel attacks, covert-channels depend on a compromised process operating on the machine to be able to exfiltrate data to the outside without being noticed by the machine operator. While attacks of this nature that cross security and virtualization boundaries on the same machine are known, this section covers air-gapped machines which are the hardest targets to penetrate from the attacker's perspective. While using PCs with SSDs is a solution to many of these attacks, SSDs bring another set of problems regarding the impossibility of secure data erasure.|
|Wi-Fi Signal Emitters||Another keystroke snooping technique involves a Wi-Fi signal emitter (router) and malicious receiver (laptop) that detects changes in the signal that correspond to movements of the target's hands on their keyboard. According to researchers, a user’s movement over the keyboard generates a unique pattern in the time-series of Channel State Information (CSI) values. A Wi-Fi signal based keystroke recognition system called WiKey can recognize typed keys based on CSI values at the Wi-Fi signal receiver end using Commercial Off-The-Shelf Wi-Fi devices. In real-world testing, “WiKey achieves an average keystroke recognition accuracy of 77.43% for typed sentences when 30 training samples per key were used. WiKey achieves an average keystroke recognition accuracy of 93.47% in continuously typed sentences with 80 training samples per key.” Limitations include sensitivity to variations in the environment, as the technique works well only under relatively stable conditions. Human motion in surrounding areas, changes in orientation and distance of transceivers, typing speeds, and keyboard layout and size also influence the accuracy.|
Table: User Habit Tips
|Cookies||Remember to purge the browser's cookie and history cache periodically. When running Tor Browser, it is recommended to simply close Tor Browser after online activities are finished, then restart it.|
|Pseudonym Isolation||For advanced separation of discrete activities, use Multiple Whonix-Workstation.|
|Publishing Time||Over time, pseudonymous activity can be profiled to provide an accurate estimate of the timezone, reducing the user's anonymity set. It is better to restrict posting activity to a fixed time that fits the daily activity pattern of people across many places.|
|Tor Browser Censorship||In most cases, Tor blocks by destination servers can be easily bypassed with simple proxies.|
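The timezone profiling risk mentioned under Publishing Time can be illustrated with a short sketch. The function names, the assumed 23:00 local start of the sleep period, the 8-hour window and the sample posting hours are all hypothetical. The point is only that a daily gap in activity narrows down the likely UTC offset.

```python
from collections import Counter


def quietest_window(post_hours_utc, window=8):
    """Return the start hour (UTC) of the window with the fewest posts."""
    counts = Counter(h % 24 for h in post_hours_utc)

    def posts_in(start):
        return sum(counts[(start + i) % 24] for i in range(window))

    # Ties are resolved in favor of the earliest start hour.
    return min(range(24), key=posts_in)


def guess_utc_offset(post_hours_utc, assumed_sleep_start=23, window=8):
    """Guess a UTC offset by assuming the quiet window is local night."""
    start = quietest_window(post_hours_utc, window)
    offset = (assumed_sleep_start - start) % 24
    return offset - 24 if offset > 12 else offset


# Hypothetical posting hours (UTC) collected from a pseudonym's public posts.
observed = [13, 14, 14, 15, 17, 18, 19, 20, 21, 22, 23, 0, 1, 2, 3, 2, 16, 20]
quiet_start = quietest_window(observed)
estimate = guess_utc_offset(observed)
```

With these sample hours the quiet window starts at 04:00 UTC, which under the stated assumption points to roughly UTC-5. Scheduled publishing at a fixed time that suits many regions flattens this signal.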
- Do You Hear What I Hear? Fingerprinting Smart Devices Through Embedded Acoustic Components
- Mobile Device Identification via Sensor Fingerprinting.
- For detailed information on this topic, see: Steganography and Digital Watermarking.
- Notably the watermark does not change the size of the carrier signal.
- On the (In)effectiveness of Mosaicing and Blurring as Tools for Document Redaction
- Fingerprintable Camera Anomalies
- While MAT2 does clean a wide range of files, the list of supported file formats is not exhaustive. Also, the author of MAT notes embedded media inside of complex formats might not be cleaned.
- The error rate is less than 0.5%
- Sensor Noise Camera Identification: Countering Counter-Forensics
- Anonymizing the PRNU noise pattern of pictures remains a promising area of research.
- Defeating Image Obfuscation with Deep Learning
- /dev/input/event0 in Qubes VMs will not work since it is not a keyboard device.
- This deanonymization technique is likely to succeed, since it is already used to lock persons out of secure accounts (pending identity verification) when their monitored behavior significantly deviates from behavior that has been learned.
- 8-16ms should be enough for this purpose.
- This will trick lesser adversaries, who cannot force the blog service provider to reveal exactly when and for how long a blog administrator logged in. This will not fool the blog service provider nor an adversary capable of recording all internet traffic.
- Do not use personal or identifying data as part of the account creation.
- This does not clear EU false positive requirements however, so they recommend it is combined with keystroke dynamics for extra confirmation, see: User re-authentication via mouse movements, On Using Mouse Movements as a Biometric and http://www.cs.wm.edu/~hnw/paper/ccs11.pdf
- For instance, stylometry works with less data (final text only) and in concert with keystroke fingerprinting is completely effective. An adversary can compare statistics about a user's typing over clearnet, then compare it to texts composed over Tor in real-time.
- For example, launch KWrite: Start menu button → Text Editor (KWrite). Once KWrite is open, click on Automatic spell checking. Misspelled words will be underlined in red.
- User Behavior
- This deanonymization technique works by playing a unique sound inaudible to human ears which is picked up by the microphones of untrusted devices. Watermarked audible sounds are equally dangerous, which means that hardware incapable of ultrasound is an ineffective protection.
- SPEAKE(a)R: Turn Speakers to Microphones for Fun and Profit
- Acoustic Denial of Service Attacks on Hard Disk Drives
- Hidden Voice Commands, Cocaine Noodles: Exploiting the Gap between Human and Machine Speech Recognition
- Inaudible Voice Commands, DolphinAttack: Inaudible Voice Commands
- Rocking Drones with Intentional Sound Noise on Gyroscopic Sensors, WALNUT: Waging Doubt on Integrity of MEMS Accelerometers with Acoustic Injection Attacks
- Gyrophone: Recognizing Speech from Gyroscope Signals
- Accelerometer-based smartphone eavesdropping, Spearphone: Motion Sensor-based Privacy Attack on Smartphones, Learning-based Practical Smartphone Eavesdropping with Built-in Accelerometer
- Stealing Keys from PCs using a Radio: Cheap Electromagnetic Attacks on Windowed Exponentiation: Extraction of secret decryption keys from laptop computers, by non-intrusively measuring electromagnetic emanations for a few seconds from a distance of 50 cm. The attack can be executed using cheap and readily-available equipment: a consumer-grade radio receiver or a Software Defined Radio USB dongle.
- Another attack involves measuring acoustic emanations: RSA Key Extraction via Low-Bandwidth Acoustic Cryptanalysis.
- A poor man's implementation of TEMPEST attacks (recovering cryptographic keys by measuring electromagnetic emissions) using $3000 worth of equipment was proven possible from an adjacent room across a 15cm wall. For the past 50 years, these attacks were only possible for adversaries with nation-state resources. See: ECDH Key-Extraction via Low-Bandwidth Electromagnetic Attacks on PCs
- https://web.archive.org/web/20170227052456/https://www.cio.com.au/article/602415/researchers-steal-data-from-pc-by-controllng-noise-from-fans/, Fansmitter: Acoustic Data Exfiltration from (Speakerless) Air-Gapped Computers
- https://www.computerworld.com/article/3106862/sounds-from-your-hard-disk-drive-can-be-used-to-steal-a-pcs-data.html, DiskFiltration: Data Exfiltration from Speakerless Air-Gapped Computers via Covert Hard Drive Noise, https://arxiv.org/ftp/arxiv/papers/1608/1608.03431.pdf
- https://www.computerworld.com/article/3173370/a-hard-drives-led-light-can-be-used-to-covertly-leak-data.html, Leaking (a lot of) Data from Air-Gapped Computers via the (small) Hard Drive LED
- https://www.wired.com/story/air-gap-researcher-mordechai-guri/, ODINI: Escaping Sensitive Data from Faraday-Caged, Air-Gapped Computers via Magnetic Fields
- https://www.schneier.com/blog/archives/2017/04/jumping_airgaps.html, Oops!...I think I scanned a malware
- Keystroke Recognition Using WiFi Signals
- In the paper: An attack variant using USRP (cellphone radio ranges) has performed poorly because of background energy interference due to microwave ovens, refrigerators, and televisions.
- CAPTCHAS also directly enhance the strike capabilities of military drones, see: https://joeyh.name/blog/entry/prove_you_are_not_an_Evil_corporate_person/
- ThermoSecure: Investigating the effectiveness of AI-driven thermal attacks on commonly used computer keyboards, https://www.schneier.com/blog/archives/2022/10/recovering-passwords-by-measuring-residual-heat.html
- This is a variation of an older attack perfected during the Cold War where recorded typewriter sounds allowed discovery of what was typed. See: https://freedom-to-tinker.com/2005/09/09/acoustic-snooping-typed-information/ and https://www.schneier.com/blog/archives/2016/10/eavesdropping_o_6.html
- Such as unique camera IDs and often GPS coordinates in the case of photographs.
- Such a bias means the program does what it is designed to do: produce pronounceable passwords rather than pure line noise. Even with the secure option -s, it has been noted that it produces passwords with a bias towards numbers and uppercase letters to make password checkers happy. The CVE to fix this was rejected and the behavior was not corrected by the authors. This is undesirable for creating truly random output, see: pwgen: Multiple vulnerabilities in passwords generation.