This is a technology that I legit fear. We need a "Grammarly for dissidents."
I suspect this technology already exists (because I suspect it's a lot easier to do than we think) but, like cryptography in the early days, the only people who know anything about it are government employees working on some classified project, for the simple reason that until recently governments have taken the most interest in this sort of thing.
But the existence of this project signals that private industry needs to take an interest in the field, and that government will bootstrap an academic field and an industry around this sort of work.
That is what worries me.
It does already exist. Textual analysis like this is one of the easiest things for AI to do. The catch is that they need a sample of your existing writing to generate a probability of a match.
It'd be easy to get a large sampling of writing if you ran the tool on, say, a large email server. University and corporate email systems hold massive bodies of writing directly linked to a person's name, for a very large percentage of the population.
Gmail.
It's how they found Ted Kaczynski. His manifesto was published in the papers and the feds did their parallel construction thing to have his brother "recognize" the writing style.
I could believe the brother recognized Uncle Ted's writing. Hell, I've recognized the writing of people I know IRL. I was reading an article once and thought to myself "this sounds a lot like X", and when I looked to see who wrote it, sure enough, he was the author.
That's one of the reasons I think it's probably fairly easy to do. People use particular phrases that others don't, or favor certain words over the available synonyms. A simple system would probably just involve taking a large sampling of an individual's writing, performing a simple frequency analysis of words/phrases, and identifying the ones that are outliers compared against a general corpus of writing.
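For what it's worth, here's a rough sketch of that frequency idea in Python. It's a toy under my own assumptions, not how any real system works: the names (tokenize, distinctive_words, top_n, min_count) are made up for illustration, it only looks at single words rather than phrases, and the "outlier" score is just a smoothed log-ratio of how often the author uses a word versus how often the general corpus does.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(author_text, reference_text, top_n=5, min_count=2):
    """Rank words the author uses more often than the reference corpus does.

    Score = log(author relative frequency / reference relative frequency),
    with add-one smoothing so words absent from the reference corpus
    do not cause a division by zero. (Hypothetical helper, illustration only.)
    """
    author = Counter(tokenize(author_text))
    reference = Counter(tokenize(reference_text))
    vocab = set(author) | set(reference)

    # Add-one smoothing: every vocabulary word gets one extra pseudo-count.
    author_total = sum(author.values()) + len(vocab)
    reference_total = sum(reference.values()) + len(vocab)

    scores = {}
    for word, count in author.items():
        if count < min_count:
            continue  # skip words seen too rarely to say anything
        p_author = (count + 1) / author_total
        p_reference = (reference[word] + 1) / reference_total
        scores[word] = math.log(p_author / p_reference)

    # Highest score first: the words most over-used relative to the reference.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Toy data, invented for this example.
sample_author = "I legit fear this. It is legit easy. Legit the worst thing."
sample_reference = "I fear this thing. It is easy. It is the worst thing."

for word, score in distinctive_words(sample_author, sample_reference):
    print(word, round(score, 2))
```

To actually match an unknown text to a person you'd compare profiles like this across many candidate authors, and you'd need far more writing than a few sentences for the numbers to mean anything.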
Yeah, it basically means we have it already and now it's official.