If successful, HIATUS could have far-reaching impacts, from countering foreign influence activities, to identifying counterintelligence risks and protecting authors whose work may endanger them.
Relax, it's only about 'protecting' authors... from being identified by this software that we are sponsoring.
This is a technology that I legit fear. We need a "Grammarly for dissidents."
I suspect this technology already exists (because I suspect it's a lot easier to do than we think) but, like cryptography in the early days, the only people who know anything about it are government employees working on some classified project, for the simple reason that until recently governments have taken the most interest in this sort of thing.
But the existence of this project signals that private industry needs to take an interest in the field, and government will bootstrap an academia and industry around doing this sort of work.
That is what worries me.
It does already exist. Textual analysis like this is one of the easiest things for AI to do. The issue is they have to have a set of your existing writing to generate a probability of a match.
It'd be easy to get a large sampling of writing if you ran the tool on, say, a large email server. University and corporate email systems would have massive bodies of writing directly linked to a person's name, for a very large percentage of the population.
Gmail.
It's how they found Ted Kaczynski. They published a letter in the papers and the feds did their parallel construction thing to have his brother "recognize" the writing style.
I could believe the brother recognized Uncle Ted's writing. Hell, I've recognized the writing of people I know IRL. I was reading an article once and thought to myself "this sounds a lot like X", and when I looked to see who wrote it, sure enough, he was the author.
That's one of the reasons I think it's probably fairly easy to do. People use particular phrases that others don't or use certain words to describe a thing more often than available synonyms. A simple system would probably just involve taking a large sampling of an individual's writing, performing a simple frequency analysis of words/phrases, and identifying outliers compared against a general corpus of writing.
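For what it's worth, here is a rough sketch of the kind of simple frequency analysis described above. It is only an illustration under my own assumptions, not how HIATUS or any real attribution tool works: the function names, the add-one smoothing, the log-ratio score, and the toy sample texts are all made up for the example, and a real system would need far more text and better features than single words.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(author_text, reference_text, top_n=5, min_count=2):
    """Rank words the author uses unusually often versus a reference corpus.

    The score is an add-one smoothed log ratio of relative frequencies
    (an assumption of this sketch, chosen for simplicity).
    """
    author = Counter(tokenize(author_text))
    reference = Counter(tokenize(reference_text))
    vocab_size = len(set(author) | set(reference))
    author_total = sum(author.values()) + vocab_size
    reference_total = sum(reference.values()) + vocab_size

    scores = {}
    for word, count in author.items():
        if count < min_count:
            continue  # ignore words seen too rarely to mean anything
        p_author = (count + 1) / author_total
        p_reference = (reference[word] + 1) / reference_total
        scores[word] = math.log(p_author / p_reference)
    # Highest score first: these are the author's "outlier" words.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Toy data, invented for illustration only.
author_sample = (
    "I reckon the plan is sound. I reckon we leave at dawn. "
    "Frankly, I reckon nobody will notice."
)
reference_sample = (
    "I think the plan is sound. I think we leave at dawn. "
    "Nobody will notice, I think."
)

for word, score in distinctive_words(author_sample, reference_sample):
    print(f"{word}: {score:.2f}")
```

With the toy samples, "reckon" comes out on top because the author uses it repeatedly and the reference text never does, while "I" scores near zero because both texts use it about equally. That is the "particular phrases that others don't use" idea in miniature.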
Yeah, it basically means we have it already and now it's official.
A doxxing machine. God help us when the feminists get their hands on it.
If the US spooks get it, then they get it.
Is there a tech company that doesn't suck government dick?
No, because they'd be destroyed like Qwest.
What is Qwest and what happened to them? Were they some mobile operator?
They were a telecom like Comcast or AT&T that, according to the then-CEO, refused to cooperate with the NSA's surveillance stuff that Snowden leaked, and the company was basically destroyed by the government in retaliation.
No, because all R&D is funded by government.
Oh, are we supposed to pretend they don't have one of those already?
Even if you write with affectation, you will still have deep patterns that are probably detectable (unless you use a tool to obfuscate them).
So, time for an open source rephrasing tool that cranks out generic translations of one's writing. (Oops, I mean "your writing"). Even writing in flawless English would be identifiable. I'm fixing to include some dialect in my anonymous postings.
I think people have tried this by using language translation tools: translate something through multiple languages and back to English. Don't know how well it works though.
Yo! There is already taols ta does dat. beef ahs, man, day're all onlahne, so da ahntel possahbly stahll gets access ta ya orahgahnal text. On da odar ha', I 'as serahous doesubts dat modern OS's doesn't already a' autamatahhaht on da hahpy move da surveahllace dahrectly ahnta ya own system. 'S coo', bro. In odar words, mostly, eeasyghlytahnk ya type, vahew a' lahsten ta ahs recorded, categorahzed a' ahndexed ahn real tahme. What it is, Mama! Don't make me shank ya! Wahndoesws uses a lot of processahng power a' memory. Slap mah fro! Peep this shit! It encrypts da data aht ahs collectahng so ya ca never know what aht ahs collectahng.