
On a bit of a whim I've spent some time on and off during the past couple of weeks looking through the New Answers to Old Questions feed, and I have noticed a recurring pattern of plagiarism. A few times each day, an answer is posted that consists entirely of multiple lines of code copied from some other answer to the same question without attribution. Can some sort of automated filter be implemented that blocks such trivially plagiarized answers?

Adding some details, the pattern I have noticed is:

One or two plagiarized answers per day doesn't sound like a lot, but that's just what I noticed by checking the NATO feed for a few hours on several days in the past couple of weeks. Assuming my sample was random, we're talking hundreds per year. It's enough that I can manually raise "Plagiarism" flags faster than mods seem to be handling them. And while reviewers in the "Late Answers" queue are, according to this answer to Is a plagiarism audit justified?, supposed to check for internal plagiarism:

the Late Answers queue is a bit different from others insofar as checking for internal plagiarism (i.e. copies of other answers to the same question) should (IMHO) be an inherent part of the review process.

it doesn't seem to be working.

Could we implement some automated way to block such answers or screen them out? And if so, what should it be? I'm not sure that "better onboarding" would be the solution here, since such trivially and obviously copied answers border on being abusive. If we warn the poster about their plagiarism while they are creating their post, then (assuming they are not bots) they will probably just tweak the post a little, so maybe a plagiarism flag should be automatically raised instead?
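To make the idea concrete, here is a minimal sketch of one possible check. It is not how Stack Overflow works today and not a definitive design. It assumes the filter can see the plain text of the new answer and of every existing answer to the same question, and it treats the new answer as a trivial copy when all of its non-blank lines appear, in order and contiguously, inside one existing answer. The names `normalize_lines`, `find_copied_source`, and `min_lines` are hypothetical, chosen only for illustration.

```python
import re
from typing import Dict, List, Optional


def normalize_lines(text: str) -> List[str]:
    """Collapse whitespace and drop blank lines so re-indenting does not hide a copy."""
    collapsed = (re.sub(r"\s+", " ", line).strip() for line in text.splitlines())
    return [line for line in collapsed if line]


def find_copied_source(
    new_answer: str,
    existing_answers: Dict[str, str],
    min_lines: int = 2,  # assumption: single-line answers are too short to judge
) -> Optional[str]:
    """Return the id of the first existing answer that contains every line of
    the new answer as one contiguous run, or None if no such answer exists."""
    new_lines = normalize_lines(new_answer)
    if len(new_lines) < min_lines:
        return None
    n = len(new_lines)
    for answer_id, body in existing_answers.items():
        old_lines = normalize_lines(body)
        # Slide a window of n lines over the existing answer.
        for start in range(len(old_lines) - n + 1):
            if old_lines[start:start + n] == new_lines:
                return answer_id
    return None


if __name__ == "__main__":
    # Hypothetical data: answer ids mapped to answer bodies.
    existing = {
        "a1": "Use a loop:\n\n    for x in items:\n        print(x)\n\nThat prints each item.",
        "a2": "Try `map` instead.",
    }
    print(find_copied_source("for x in items:\n    print(x)", existing))
```

A match could raise a plagiarism flag automatically rather than show the poster a warning, which fits the concern above that a warned poster would just tweak the post a little. An exact line match like this would miss lightly edited copies, so it only targets the trivial verbatim case.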

Update January 2025.

Note that this is still going on, see e.g.:

My observation is that I see at least one plagiarized answer per day in the NATO queue. The discussion under the question How to handle code-only answers that are entirely copied, verbatim, from the question itself? suggests that most of these plagiarized answers are created by someone who picks "Copy snippet to answer" from some other answer, then somehow submits their answer, so maybe we could try to prevent that from happening somehow?

Update May 2025.

Still happening, e.g.:

Seems to be less frequent though. Instead I'm seeing more nonsense answers.

Update June 2025.

Update November 2025.

Still happening, e.g.

One new twist is that sometimes the new "Copy" button is used, which auto-inserts an attribution.

  • ...How many different users do you see doing this? Commented May 15, 2024 at 22:55
  • @KarlKnechtel - Each copied answer is from a different account. All except one are 1-rep accounts. After double-checking I realized one had a 21-rep. That one only copied two lines so it might have been a mistake by that user. I'll remove the 21-rep answer from my list. Commented May 15, 2024 at 22:58
  • 3
    Weird. Maybe someone is trying to establish a farm of "legitimate" accounts for future spamming or something like that. Commented May 15, 2024 at 23:31
  • Or for the Review Queues ('First Answers' / 'Late Answers'), we could use a 'Delete'/Not allow/Close reason as "Late Answer, only copy of some older Answer in the same Thread, does not add any new Info", or something like that... Commented May 16, 2024 at 0:09
  • 4
    @KarlKnechtel - there are around 1.46 billion English speakers in the world. Could just be that every day, one or two out of a billion of 'em decides to start padding their fake resume with a real Stack Overflow account containing fake answers. Commented May 16, 2024 at 0:27
  • The (main?) purpose of the 'Late Answers' Review Queue should be to prevent that kind of Answers...! Commented May 16, 2024 at 1:23
  • 19
    Ideally, I'd much rather such a feature find plagiarism from any other answer on the site, as (perhaps unsurprisingly) Stack Overflow is a very frequent source for plagiarized material posted to Stack Overflow. It is usually not from the same question, though (although that is, as you demonstrate, surprisingly common). Commented May 16, 2024 at 1:55
  • @RyanM, hum..., my experience is also just like OP's, Surf Answers like I call them (nearly) always come from the same Thread. What you describe is more the Answerer did a little bit of search/research, on such a basic (recent) Question that the Question actually is a Duplicate. (And should be closed as such...) OP is rather talking about "older" Questions with already 10's or 100's of UV's and [5-25+] Answers already..., as I understand... Commented May 16, 2024 at 3:26
  • "Can some sort of automated filter be implemented that blocks such trivially plagiarized answers?" - another "Can" question :) Will it be implemented is the follow up question. The one that will perpetually disappoint. Nice effort, but I think it is more productive if people reviewing can keep more of an eye out for it. Commented May 16, 2024 at 9:05
  • 5
    @Gimby - the problem with that is that the "Late answers" review queue does not show the other answers, meaning the reviewer has to open the question in another tab or window and thread through them all (and for older questions there might be multiple pages to read through). If we want people to check for duplicates in that queue, the Stack Exchange software needs to show them the possible duplicates on the review page itself. Commented May 16, 2024 at 15:22
  • Re "Consisting entirely of multiple lines of code with no explanatory text.": AKA code dumps. Not for this kind of plagiarism, but spot check of code dumps often reveal they are totally bogus answers. Suggesting they are (often) the result of blind copying from somewhere, e.g., blind copying from a search engine result (without any understanding of it). Commented May 17, 2024 at 5:30
  • 2
    We’re still getting these. I’ve had multiple in the last few days and I’m not really looking for them. I don’t know if it is people or bots or if it is a UX issue or bad intentions, but some automated filtering would be welcome. Commented Mar 22, 2025 at 20:32
  • 2
    Here's the latest one: stackoverflow.com/a/79532282/19068 Commented Mar 26, 2025 at 21:04
  • 5
    @ScottHannen - one theory is that these are "spam seeds" from automated or semi-automated fake accounts. Later the account will reactivate and replace the copy/paste with spam, thereby evading any hypothetical "new post" and "new account" filters. Commented May 31, 2025 at 18:11
  • 3
    Let’s just delete plagiarized content… Commented Jun 19, 2025 at 22:52
