
> The thing is re-trained every day you retard.

LOL. Sure, they keep retraining the model on new data and releasing new versions. But that alone isn't going to stop the generator from spitting out "bad ideas", because those were part of the original training set, and you can't make the model "unlearn" them just by adding more data on top. What I meant is that they didn't retrain the whole model on censored data, and only censored data, as you seem to be implying (how else would retraining alone stop the generator from outputting these "bad ideas", without any post-output filtering?).

As I said, the censorship is no doubt done via a new "filtering" model placed on top of the original generator; the filter itself is trained on a smaller dataset of "bad ideas", which is probably what the Kenyans were doing: labelling example outputs as needing censorship or not. Plus they probably also have a manually specified blacklist of words that can never be output (the N word is no doubt one of them), most likely implemented as banned tokens during sampling.
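A minimal sketch of what that kind of bolt-on filter might look like. `generate()`, `classifier_score()`, and the threshold are all hypothetical stand-ins; this illustrates the general technique, not OpenAI's actual code:

```python
# Sketch of a post-output moderation filter layered on an unchanged generator.
# Everything here is a made-up stand-in for illustration.

BLOCK_THRESHOLD = 0.5  # assumed cutoff; the real value is unknown

def generate(prompt: str) -> str:
    """Stand-in for the base language model's sampling loop."""
    return "some generated completion"

def classifier_score(text: str) -> float:
    """Stand-in for a separate classifier trained on human-labelled
    examples of output needing censorship (the labelling work the
    comment attributes to Kenyan annotators). Returns an estimated
    probability that the text should be blocked."""
    return 0.1

def moderated_generate(prompt: str) -> str:
    completion = generate(prompt)
    # The filter sits on top: the base model is untouched, its output
    # is simply checked (and replaced) after the fact.
    if classifier_score(completion) >= BLOCK_THRESHOLD:
        return "[content removed by moderation filter]"
    return completion

print(moderated_generate("tell me something edgy"))
```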

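And a toy sketch of the banned-token idea: at each sampling step, any blacklisted token's logit is forced to negative infinity, so its probability after softmax is exactly zero and it can never be sampled. The vocabulary, blacklist, and logit values are all invented for illustration:

```python
import math

# Toy token-level blacklist applied during sampling; not any vendor's
# actual implementation.

VOCAB = ["hello", "world", "slur_a", "slur_b"]
BANNED = {"slur_a", "slur_b"}  # hypothetical manually specified blacklist

def mask_banned(logits: list[float]) -> list[float]:
    # A logit of -inf becomes probability zero after softmax, so the
    # banned token is unsampleable regardless of the prompt.
    return [-math.inf if VOCAB[i] in BANNED else x
            for i, x in enumerate(logits)]

def softmax(logits: list[float]) -> list[float]:
    mx = max(x for x in logits if x != -math.inf)
    exps = [0.0 if x == -math.inf else math.exp(x - mx) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 3.0, 0.5]  # toy next-token scores from the model
for tok, p in zip(VOCAB, softmax(mask_banned(logits))):
    print(f"{tok}: {p:.3f}")  # banned tokens come out at exactly 0.000
```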
1 year ago
1 score