How often do people actually copy and paste from Stack Overflow? Now we know.
[Ed. note: While we take some time to rest up over the holidays and prepare for next year, we are re-publishing our top ten posts for the year. Please enjoy our favorite work this year and we’ll see you in 2022.]
They say there’s a kernel of truth behind every joke. In the case of our recent April Fools gag, it might be more like an entire cob, perhaps a bushel of truth. We wanted to embrace a classic Stack Overflow meme and tweak one of our core principles. Our company was inspired by the founders’ frustration with websites that kept answers to coding questions behind paywalls. What would the world look like if we suddenly decided to monetize the act of copying code from Stack Overflow?
Ok, joke’s over; hope everyone had a good laugh and no one got too freaked out. But wait, there’s more. Once we set up a system to react every time someone typed Command+C, we realized there was also an opportunity to learn about how people use our site. We were able to catalog every copy command made on Stack Overflow over the course of two weeks, and here’s what we found.
You are not alone
One out of every four users who visits a Stack Overflow question copies something within five minutes of hitting the page. That adds up to 40,623,987 copies across 7,305,042 posts and comments between March 26th and April 9th. People copy from answers about ten times as often as they do from questions and about 35 times as often as they do from comments. People copy from code blocks more than ten times as often as they do from the surrounding text, and surprisingly, we see more copies being made on questions without accepted answers than we do on questions that have one.
So, if you’ve ever felt bad about copying code from our site instead of writing it from scratch, forgive yourself! Why recreate the wheel when someone else has done the hard work? We call this knowledge reuse – you’re reusing what others have already learned, created, and proven. Knowledge reuse isn’t a bad thing – it helps you learn, get working code faster, and reduces your frustration. Our whole site runs on knowledge reuse – it’s the altruistic mentorship that makes Stack Overflow such a powerful community.
You can stand on the shoulders of giants and use their prior lessons learned to build new things of value. You should still follow some basic best practices to prevent bugs or safety issues from sneaking into your code when copying, so make sure you educate yourself before grabbing and pasting. And of course, be aware that some code requires a certain license to use. Beyond that, we encourage everyone to share in the benefits of what the community has created.
That’s the high-level TL;DR, but for folks who want a deep dive into all the things we learned while studying the copy data, please read on for some marvelous insights and charts from David Gibson, a data analyst on our product marketing team. If you want to hear about how we built the software modal and physical keyboard behind our April Fools joke, check out the podcast below.
As someone who has been unapologetically copying from Stack Overflow for years, I was not surprised to see the millions of copy events rolling in. What did surprise me was the number of questions we could finally answer. How many people really are copying from Stack Overflow? Are people just copying code? Are people more likely to copy the accepted answer?
To add some direction to the analysis, the team and I came up with a list of questions that we wanted to answer. What started as a joke has snowballed into a worthwhile exploration, producing new insights and sparking many internal conversations about how we can continue to innovate our public platform and bring more value to Stack Overflow for Teams.
Using our homegrown web tracking tool, we created custom events to capture when a user copied from the site. With these events we were able to capture many different attributes: tags; whether the source was a question, answer, or comment; code block or plain text; copier reputation and post score; region; and whether the post was accepted or not. We pretty much captured everything except the actual text being copied.
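To make that list of attributes concrete, here is a minimal sketch of what a single copy event record might look like. The class name, field names, and types are hypothetical, chosen only to mirror the attributes listed above; the post does not describe the real schema of the tracking tool. Note that there is deliberately no field for the copied text itself.

```python
from dataclasses import dataclass
from typing import Literal, Optional, Tuple


@dataclass(frozen=True)
class CopyEvent:
    """One copy action. Every field name here is an illustrative assumption."""

    tags: Tuple[str, ...]
    post_type: Literal["question", "answer", "comment"]
    block_type: Literal["code", "text"]
    copier_reputation: int      # 0 for anonymous users
    post_score: int
    region: str
    accepted: Optional[bool]    # None where "accepted" does not apply
    # Intentionally absent: the text that was copied.


# Invented example values, not real data.
event = CopyEvent(
    tags=("python", "pandas"),
    post_type="answer",
    block_type="code",
    copier_reputation=0,
    post_score=42,
    region="Europe",
    accepted=True,
)
print(event.post_type, event.block_type)
```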
We collected data for two full weeks, from March 26th 2021 to April 9th 2021. The following analysis is based on the behavior during that time.
Ben already mentioned some of the high-level stats that quickly proved what people had long joked about: everyone is copying from Stack Overflow. We also quickly realized that the overall copy behavior closely followed what we already knew about our site traffic. Most copies occurred during the work week and during working hours. Our largest geographies make up the majority of copies: Asia 33%, Europe 30%, and North America 26%. Finally, 86% of all copies came from anonymous users, aka users with 0 rep.
Things started to become more interesting when we asked more detailed questions about who was copying and what they were copying.
Are higher rep users copying more?
To start, we wanted to see if our higher reputation users are copying more.
We can see that the majority of copies are coming from users with 0 reputation. These are our anonymous users because you immediately get 1 rep by creating an account. It is possible that some of these copies are from users with an account but are not logged in. Unfortunately, there is not a way for us to test this theory.
Since the majority of the users on our platform have a lower rep, let’s remove the groupings to see if we can normalize our data. By looking at Count of Copies Per User instead of Total Copies, we can see the average number of copies a user makes by their reputation.
When looking at this visualization, it appears that as Reputation increases, the Count of Copies Per User decreases. So the higher a user’s reputation, the less often they are copying. This relationship is present but is not very strong, so I am not confident in saying either higher or lower reputation users copy more. Developers who are learning often have a lower reputation and are looking for things that can accelerate their learning and get them started quickly. As developers build their expertise, they also build their reputation, and they focus on more precise challenges, things that may not be possible to copy from Stack Overflow.
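The normalization described above, moving from Total Copies to Count of Copies Per User, can be sketched in a few lines. This is only an illustration on invented sample data; the function name and the `(user_id, reputation)` event shape are assumptions, not the analysis code behind the charts.

```python
from collections import defaultdict

# Invented sample: one (user_id, reputation) pair per copy event.
copy_events = [
    ("u1", 1), ("u1", 1), ("u1", 1),
    ("u2", 1),
    ("u3", 500), ("u3", 500),
    ("u4", 500),
    ("u5", 20000),
]


def copies_per_user(events):
    """Return {reputation: total copies / distinct users} per reputation value."""
    total_copies = defaultdict(int)
    users = defaultdict(set)
    for user_id, reputation in events:
        total_copies[reputation] += 1
        users[reputation].add(user_id)
    return {rep: total_copies[rep] / len(users[rep]) for rep in total_copies}


result = copies_per_user(copy_events)
for rep in sorted(result):
    print(rep, result[rep])
```

Dividing by the number of distinct users keeps a large, low-reputation population from dominating the picture purely through its size.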
Are accepted posts copied more?
When we think of an accepted answer, we may think it is the best one, and infer it is copied much more than non-accepted answers. Looking at the data, however, we find 52.4% of copies come from answers that are not accepted. But on average, accepted answers get seven copies per unique post while non-accepted answers get five copies per unique post. So more copies come from non-accepted answers, but there is higher knowledge reuse from accepted answers. At Stack Overflow, we define knowledge reuse as reusing what others have already learned, created, and proved.
|Answer accepted||Total copies||Unique posts||Percent||Copies per post|
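Both statements can hold at once because non-accepted answers are simply more numerous. A back-of-the-envelope check, using only the rounded figures quoted in the text (52.4% of copies, about seven and about five copies per post) and assuming those percentages refer to copies from answers only, so the result is a rough estimate:

```python
# Rounded figures quoted in the text; treat the output as approximate.
share_not_accepted = 0.524              # share of copies from non-accepted answers
share_accepted = 1 - share_not_accepted
copies_per_post_accepted = 7
copies_per_post_not_accepted = 5

# Unique posts are proportional to (share of copies) / (copies per post).
posts_not_accepted = share_not_accepted / copies_per_post_not_accepted
posts_accepted = share_accepted / copies_per_post_accepted

ratio = posts_not_accepted / posts_accepted
print(f"Non-accepted answers copied from, per accepted answer: {ratio:.2f}")
```

Under these assumptions, roughly one and a half non-accepted answers were copied from for every accepted one, which is enough to give them the larger share of total copies despite fewer copies per post.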
It is worth noting that a question may not even have an accepted answer. Take this answer: it has 4,984 up-votes and was copied 7,943 unique times during our study, but is not accepted. In fact, none of the answers to that question have been accepted. It could be because the question poster has not been seen since 2010, but also many of the other answers are valid.
Are higher scored posts copied more?
So if accepted answers are not copied more, then answers with a higher score must be copied more, right? Let’s find out!
For answers, copies appear to be fairly evenly split across our defined score groupings from 1 to 1000. As for questions, the majority of copies are from posts with 1-5 points. I suspect that is because users are copying the question to reproduce it and eventually post an answer.
Similar to when looking at user reputation, the majority of posts on the site have a lower score. To normalize this, let’s look at the copies per post.
We can plainly see that as a post increases in Post Score so does the Copies Per Post. This makes sense because as a post increases in score it is more likely that the knowledge is being reused by our community.
Do people copy downvoted answers?
But what about those blue dots with a negative score? Why would anyone copy down-voted answers? Well, we never want to judge a book by its cover.
Take a look at this answer. It was our most copied down-voted answer with a score of -2 and a total of 288 copies. Looking closer, it appears to be a more concise version of the accepted answer above it that has a score of 29 and had a total of 493 copies. Although our negative score post did not have more copies, it is the perfect example of a “too long didn’t read” post.
What are the most copied tags?
Now for the question I was most excited to answer: what tags are being copied the most? Unfortunately, due to the scale of the data and available resources, I was unable to parse out nested tags. For example, the html tag will not include posts within the |html|css| tag grouping.
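To illustrate the limitation, here is a small sketch of the difference between counting by full tag grouping (as in this analysis) and crediting each individual tag. The counts are invented and the function name is hypothetical; this shows what parsing nested tags would involve, not something that was run on the real data.

```python
from collections import Counter

# Invented counts keyed by the full tag grouping of each post.
copies_by_grouping = {
    "|html|": 100,
    "|html|css|": 40,
    "|python|pandas|": 70,
}


def copies_by_single_tag(by_grouping):
    """Credit each grouping's copies to every individual tag it contains."""
    totals = Counter()
    for grouping, copies in by_grouping.items():
        for tag in grouping.strip("|").split("|"):
            totals[tag] += copies
    return totals


by_tag = copies_by_single_tag(copies_by_grouping)
print(copies_by_grouping["|html|"])  # grouping-level count, as reported here
print(by_tag["html"])                # count if nested tags were parsed out
```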
Top ten tags copied
Not to my surprise, the tags receiving the most copies are some of the most popular and active tags on Stack Overflow. The one thing that jumped out to me is that python appears in four of the top tag groupings. Three of them are data analytics specific tag groups: |python|pandas|, |python|pandas|dataframe| and |python|matplotlib|. As a data nerd myself I love to see more people learning these tools.
|Tags||Total Copies||Unique Posts||Copies Per Post|
Top ten tags with most copies per post
In addition to looking at the tags with the most copies, I wanted to see what tags have the highest copies per post. Filtering for tags with at least ten unique posts, we can plainly see as tags become more specific, they receive more Copies Per Post.
|Tags||Total Copies||Unique Posts||Copies Per Post|
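The filter-then-rank step described above can be sketched as follows. The rows are invented, and the function and constant names are assumptions made for illustration; only the threshold of ten unique posts comes from the text.

```python
# Invented rows: (tag grouping, total copies, unique posts).
rows = [
    ("|python|", 5000, 1000),
    ("|python|pandas|dataframe|", 900, 60),
    ("|git|", 300, 9),          # dropped: fewer than ten unique posts
    ("|html|css|", 400, 100),
]

MIN_UNIQUE_POSTS = 10  # threshold stated in the text


def top_by_copies_per_post(rows, min_posts=MIN_UNIQUE_POSTS, limit=10):
    """Keep groupings with enough posts, then sort by copies per post."""
    kept = [
        (tags, copies / posts)
        for tags, copies, posts in rows
        if posts >= min_posts
    ]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:limit]


ranking = top_by_copies_per_post(rows)
for tags, per_post in ranking:
    print(f"{tags}: {per_post:.1f}")
```

The minimum-posts filter matters because a grouping with only a handful of posts can show a very high ratio from one or two popular answers.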
What are the most copied posts?
Now to answer the question I am sure many of you are interested in. What post received the most copies?
Answer with code block
With a post score of 3,497 and 11,829 copies, I am happy to announce that How to iterate over rows in a DataFrame in Pandas received the most copies. Answered in 2013, this question continues to help thousands of people each week.
Answer plain text
As for the most copied answer with plain text, we have TypeError: this.getOptions is not a function [closed] with a post score of 218 and 1,570 total copies. Although we were unable to confirm this, I suspect that the `firstname.lastname@example.org` is being copied.
Question code block
As for the most copied question with a code block, with a post score of 2,147 and 3,665 copies, we have How to create an HTML button that acts like a link?
Question plain text
Finally, the most copied question with plain text, with a post score of 322 and 261 copies, is Updates were rejected because the tip of your current branch is behind its remote counterpart. This one is a little tricky because there are a handful of git commands not in code blocks that could easily be the copied part of the question. But as we are not capturing the actual copied text, we cannot confirm this.
It’s important to remember that answers are not everything on Stack Overflow. Sometimes all you need is one useful comment. Here are the most copied comments!
|Comment Score||Total Copies||URL|
The first comment is our most copied comment across the site, and the second comment is our “unsung hero” as it only has a post score of five but was our sixth most copied comment.
UPDATE: There has been a lot of interest in purchasing a real life version of our prank. The good news is we anticipated this might happen and we’ve been working on something along these lines. Stay tuned for more!
I think you missed an important one (for technical reasons you can’t): people copying their browser address bar. That’s a measure for attribution, and it would be especially interesting in relation to code copies.
This can be partially replaced by counting the Share + Copy for the post, but I bet most folks would just copy the URL instead of using the share functionality.
most copies are for convenient local reference and rework. While most answers aren’t directly applicable, having a local copy saves having to rediscover the exact words to find the article again.
I always copy the url. I wasn’t even aware of the share functionality.
What about creating badges for this?
Badge for copying posts or for having your posts copied?
Maybe both? 😉
Certainly the second, but the first one would also be very cool to see!
Having your posts copied
Definitely the one for having a post which is being copied (that shows the interest by the other users).
But maybe also a more specific badge: when one of your answers has ten copies / one hundred copies, you get a badge, something like “Code source” would be a great name for that.
Easy to misuse. We don’t know if they are actually copying or just copying for a badge!
If there was a thumbs-up button here I would have pressed that.
But for a tutorial copying something yourself would be a good idea, I think.
Why were you surprised that unanswered questions are more often copied from than answered ones? I would have expected that. The longer a question remains unanswered, the more potential answerers will copy code in the question, intending to attempt to replicate the problem/solve it. A question that gets a quick answer is less likely to be looked at by “answer only” users: if someone has already answered it, why bother opening it?
I actually did not look at unanswered questions. I did see that questions with a score of 1-5 have the most copies. I agree that a lower-scored question or a question with no answers will get more copies as it is trying to be replicated.
I’m sure, Jonathan was referring to “…surprisingly, we see more copies being made on questions without accepted answers than we do on questions which are accepted”.
I just hope most people who copy answers are making sure they understand WHY something works (e.g. reading the answerer’s description & following documentation links) instead of just throwing the answer in whatever project they’re working on. That’s why some math teachers ban calculators and math auto-solvers. Learning is about finding the paths to the answer, not arriving at the answer.
Any chance you would release the raw data?
I would love to analyze it myself. Also how often do stack overflow developers copy and paste from stack overflow?
Not for everyone. I can think of two countertypes:
1.) Some people work backwards, because their minds are more visual. If they copy something first and see it in action, on real data, they can better walk through it and see how each step produces the result it did. For example, if I’m drawing a box with a complex algorithm, it may just look like recondite gibberish at first, but when each step produces something tangible in the debugger, the pieces start to fit together.
2.) Some people are trying to solve a bigger problem and, due to urgency or the likelihood of never seeing this building block ever again, don’t want to waste time on trivialities. For example, if my complex app has 100 steps, only 1 of which is a complicated sorting algorithm, I don’t really care how it works so much as verifying that it does work off the shelf, right out of the box. This means no disrespect to the people who do want to learn the reasoning behind the algorithm, but to certain people with larger, more complex projects, the priorities may be elsewhere, and you’re happy enough with a black box that does what it is supposed to do with certain inputs. Even if you take the time to learn how it works, you’re likely to forget because of the other 99 pieces, some of which may be more integral and exclusive to your app. So it doesn’t mean you don’t care to learn; it means your priorities and bottlenecks are elsewhere.
What about people who copied in the question to play with code to provide an answer? (something I have done many times). I imagine you may find people with more rep doing that as they would likely spend more time answering questions.
That data viz is crying out for non-linear scaling and some other plot types.
Low numbered scores are much much more common than high numbers.
Log scale on the x-axis, using abs(score)+1
Dot sizes scaled by count.
Nevertheless, very cool info, thanks for posting it.
You know, I only discovered the April Fool’s gag because I was copying some text from a question in order to quote it in a comment. My immediate reaction was that it was an awfully lame gag – I actually had to think about it to recognize that it was meant to catch people copying code. It’s something I *don’t do* to such a degree that it didn’t even occur to me what the point of the gag was without a moment of actively thinking about it.
For the rest of the day I took particular note of the fact that, while I *frequently* copy things on SO, it’s really never about copying code. It’s usually just copying snippets of OP’s question for feedback, even if sometimes that ends up being part of a code section (though frequently it was not). So you probably got a load of copies from people like me who were just doing it as part of normal site interaction and not because we were lifting code to add to our pasta.
Interesting. I copy unanswered questions once, about 80% of the time when I am trying to answer them, and I copy my own answer several times during later edits. I guess these will be counted as 3 copies on average. For my own questions, I mostly bookmark the link or copy the link page. Rarely do I copy an answer’s code, only for longer code block answers that I want to study. So my usage statistic will throw off some of the main use cases.
> TypeError: this.getOptions is not a function [closed]
> Closed. This question needs details or clarity.
As one of the top most copied answers, clearly this question does not need clarity.
It’s one of the most annoying things on SE: questions closed without regard to traffic or usefulness. If a mod thinks a question needs clarity, but it is 2 years old with 80k views and 8 responses, then it does not need clarity.
If a mod thinks a question needs clarity, but it already has an accepted answer, then it does not need clarity.
This is very concerning from a licensing standpoint. All this content you seem happy about people copying is CC BY-SA 4.0 licensed, meaning the copy-paster has to, among other things, “give appropriate credit, provide a link to the license, and indicate if changes were made.”
Apache Software Foundation projects are forbidden from using SO copy/pasted content exactly because of these licensing restrictions.
If you felt the license was important you’d interject appropriate license language into the clipboard when users copied, at least prompting them to try to do the right thing. Instead, you seem happy with millions of license violations.
You’re assuming the copied code is being used in projects that are being distributed.
which they aren’t necessarily, like my projects which mostly stay within my house.
But I routinely give the source of more complex code pieces anyways. (although I mostly only copy fragments, not actual functioning code)
If I just looked up the syntax of something I might not put a link as I then would have to link thousands of sources which describe that code. But I also use SO to help me build my website (I am building it completely manually, without any content management system) on which there’s a code source link and sometimes a short description to where the code came from.
In Croatian, the proportions behind that saying are totally different. “There’s a hint of a joke in every joke.” 😀
The copy events by post score graph is so satisfying. It’s like some little blue dots shooting a fire hose of orange.
The example quoted of frequent copies of an unaccepted answer is hardly surprising!
**Unregistered users rarely (i.e. NEVER) accept Answers** – they just post Questions that keep being resurrected from the dead.
I think it would be interesting to publish the Top X of the most copied code. It would be interesting to highlight them so that more people can review the code and ensure it’s bug free. If it’s often copied, it’s important to be sure the code is safe.
I don’t know much about statistics but for me it seems a bit obvious that slightly more non-accepted answers are copied just because there are more of them than accepted answers, or am I missing something here?
In addition: I expect there’s a clear relation between the age of an answer and its score. We could be seeing here old high-score answers that have had their share of copying in their time, but still attract traffic. And new answers that are still hot and in their first wave. A multifactor analysis is called for!
While command-C is one copy technique, did this capture users who select, then right click and choose copy? Or those who have an X11-style of desktop environment where merely selecting some text is enough to copy that into a clipboard?
But that’s not always easy to pull off (depending on the program you’re copying from that way).
Copying via selecting and then middle-clicking it somewhere else isn’t the same as Ctrl+C→Ctrl+V. When you use the conventional copying method the text is actually copied into the clipboard and can be pasted anywhere else.
Completely different is the select→middle-click technique. It doesn’t save the text in Memory, it just makes a pointer in memory to that text which is accessed and then read by the OS if a middle-click is made inside a text field, without the text actually going to memory before. The last one should appear to the site the same as if a screen-reader reads the text.
Oh, I forgot to say why it’s not always easy to pull off…
That’s because some programs (excluding all modern web browsers I know) don’t actually select the text in a way the OS can interpret or they deselect it whenever they lose the focus.
Interesting! One question remains: What percentage of the code on the Stackoverflow site comes from copies of answers from Stackoverflow?
Could you share how you captured copy events, or maybe I missed that in the post. I will usually copy while I’m on the page of a question in order to get the title of a question so that I can link it in a comment. Just copying the link from the address bar or share button doesn’t auto-title whenever you post it in a comment. Did you attach the event to the question body or something more broad?
Were you able to find how they did?
Thanks. I’ll be turning JS off on SO now. I think tracking my behavior like this is creepy.
I realize probably the majority of big sites on the web are even creepier than this, in terms of what they track, and most of them aren’t going to be so honest about it. I also appreciate that you’ve made this information public.
But this just points out the problem with web apps these days, as compared to the good old days of static HTML… user privacy is dead.
> user privacy is dead
Not on every site, I know for example that my site doesn’t track the users at all. I only have as much JS as needed to make the site function as expected. Because I don’t use any CMS I can surely say that I don’t have any trackers lurking on my website (although I use modern HTML5 to build it).
But if you use an adblocker, tracking should not be as big of a problem, as they (at least AdBlocker Ultimate) also block many trackers. And Firefox does so without any adblocker installed (although I think most others do so, too).
> Why recreate the wheel when someone else has done the hard work? We call this knowledge reuse – you’re reusing what others have already learned, created, and proven. Knowledge reuse isn’t a bad thing – it helps you learn, get working code faster, and reduces your frustration. Our whole site runs on knowledge reuse – it’s the altruistic mentorship that makes Stack Overflow such a powerful community.
In my opinion this could be reworded a bit, when I think of “knowledge reuse” I don’t think “here’s some code I can take and put verbatim in my important program”, I’ve literally never done that in my entire career. I don’t know if “avoid reinventing the wheel” is a good selling point for SO.
I’m all for sharing knowledge and learning from the community but I’d never paste in code (occasionally I copy/paste from my own answers, usually comments / descriptive text for the sake of consistency). Generally SE is a hint, not a solution. It’s like the old saying about giving somebody a fish vs teaching them to fish.
My process is something like this:
1) Search for information
2) Find a post / example code
3) Copy/paste that code into your editor and examine it / test it (including edge cases), see if it meets your needs, and read the explanation carefully
4) Recurse: If it does what you need it to, for each thing you don’t understand in the code, repeat steps 1-4
5) If you find an oversight in the code, or if you have any improvements to suggest to the answer, leave a comment
6) If you find a serious problem in the answer and the author fails to edit or reply, come back later and downvote
By this time you know enough to write the code in your own words, hopefully with more error handling (which is sadly almost always absent in SO posts and can sink a program easily), in the style your group uses, and in a way that fits well in your program’s architecture and avoids repeated code.
Wow, not a fan of even more intrusion via telemetry and monitoring, or the fact that you’re apparently proud of it.
And just like that, Stackoverflow lost me as a member. They now know what programmers are copying what code fragments. This could easily give them insight into, e.g., what project that I [for my company] am working on, what features we’re looking to implement, what problems we’re having, what direction our product is headed, and so many more other things, that it worries the hell out of me. It’s a gross breach of privacy that reflects an equally gross failure of judgment, one that, as far as I can tell, was done without our permission. What a shame.
Any inputs on how they are able to know what is copied? I tried looking at the network’s tab, there is no info sent back to server when copy is performed
Your article says: “It is possible that some of these copies are from users with an account but are not logged in. Unfortunately, there is not a way for us to test this theory.” Do not most of your users keep using the same IP address?
I think they would consider this a privacy problem if they recorded the IP addresses.
Although this is interesting, I have my doubts about how accurate this data is. I did a Ctrl+C on April 1st because I saw a picture of the April fools gag and I wanted to see it firsthand on the site. I wasn’t actually copying anything. I’m sure many other users did this too, since this joke went nearly viral. That’s a lot of bad data points.
(A couple people also already mentioned copying text to use in a reply, or copying code to solve the asker’s question.)
Similar statistics on GitHub…?…?!!!!!…!!!!!
How many copy actions were part of edit actions?
Hi, nice one. Very good findings!
You probably do not account for a portion of the developers out there who use Linux. I sometimes just highlight with the mouse the text I want to copy, then use the middle click button to paste into my development environment. And I am sure I am not the only one.
You surely aren’t the only one. I am using that technique, although I don’t think it’s possible for a site to distinguish between a screen-reader reading it and a Linux installation copying text by reading it straight to the target application.
My most downvoted questions are the ones I personally consider to be my best contributions. No surprise these are the ones that I copy first, to make sure the material survives deletion.
It’s kind of weird to refer to it as Command+C rather than Ctrl+C when the overwhelming majority of users must be on Windows or Linux on a Windows keyboard, no?
It shows the Stack Exchange leadership’s Mac snobbery.
I frequently copy code to see if it still works, especially if the post is more than a few years old. I frequently get messages that many items used in the code have been deprecated. However, I would like to add that I do not copy and paste into my code. I usually copy and paste into a text editor window where I will heavily edit it. The edited version gets copied/pasted into my code. I also frequently copy and paste portions of the code into a Google request so that I can find related articles. Finally, the demonstration code found in Stack Overflow frequently has items that are highly inappropriate, such as a number of malloc statements with no free statements.
In several places in this post, and on the podcast, it seems like you’re making the assumption that if someone copied code, then they must have thereafter pasted it into a production application. Sometimes people copy code to a repository of their own for future reference on how to go about something. Sometimes they paste it into a code editor, perhaps even the code they’re working on, and compare its efficacy to their existing code. Or they might refactor it in place — adding null checks, dependency injection, making it more reusable, etc. In this way they’ve really pasted a strategy, not just someone else’s code. There are many reasons why someone might copy that don’t come down to being lazy and don’t really pose a risk.
> Sometimes people copy code to a repository of their own for future reference on how to go about something.
I have a whole collection of those files with code snippets in them that are completely useless on their own. But I copy them to a central place so I can later read through them, understand them better and use that understanding to build e.g. a countdown for my website (that’s a real example, I used that for the countdown on my redirection page. I modified one that showed minutes:seconds to only show seconds. And I combined it with a ton of other code, partially invented by myself just based on concepts).
I’d be interested to see how many copies in a given question come from the answers which are not a) the accepted answer for that question, and/or b) the highest-scored answer for that question. It’d be an indication of how accurate the wisdom of crowds actually is in this case. E.g. the graph that shows the scores for copied answers shows very few answers with 5000+ score, simply because there *are* very few answers with that score.
I think that graph would be more accurate if it showed percented numbers, like “30% of the answers with a score of 5000+ have been copied, 34% of the answers with a score of 3000-4999 have been copied, …” (those 30 and 34 percent are just made up)
I often wonder, when copying code, do people actually know what they are copying? That is, do they understand the code and what it’s doing, or do they just copy it and hope it works? The understanding of what it does (the copied code) is the real benefit for me, as it helps me learn to code, and not just “get the job done,” which I see a lot of.. 🙁
I don’t use code that works but is too complicated for me to understand it 😉
That’s the reason why I don’t have too complex code to reset the CSS-animated progress bar on my own redirection page.
I’m sorry, it never occurred to me that I should be ashamed of copying solutions or parts of solutions from Stack Overflow. I mean, it’s there, right? It’s someone’s best contribution. It often gets fixed or enhanced based on subsequent comments. How is this not the most ingenious thing in the world? It’s like having a smart guy sitting next to me at work or school.
It’s a bit like developing code on GitHub, where many eyes study the code to resolve issues together. It’s community work!
I have a 500+ page document of code and explanations garnered mostly from SO. It is all organized so I can find a clever algorithm when I need it. I also copy posters code that I intend to answer. I have to look at it in VS and get rid of all the blank lines.
Copy count on each question and answer would be a strong signal.
But seeing that in small font across every SO page would be too much truth for sensitive trolls and orcs.
The badge idea is a good one, though. More balanced, though less useful.
I think a copy counter (maybe in the left of the post, below the up/downvote button) is a good idea.
But it’s also true that some people might not like that…
I hope StackOverflow uses this information to add a “Copy” button next to code blocks. (Not that I would ever use it of course… 🙂 )
It is often more difficult than it should be to select a code block and copy the contents (scroll bars, etc.). This could be made so much easier with a tiny copy button.
…which you can find on many other pages but not SO. And I also miss it sometimes…
How many of those copies were people just hitting Ctrl-C to activate the April Fool because they’d heard about it from someone else?
Probably many. Although I completely missed that (I wasn’t registered back then, but I have been using the site for years now. I only registered because I wanted to ask a question).
Just a minor nit-pick, but this post assumes the reader knows what the April Fools joke was. I couldn’t find a link that told me what the April Fools joke actually was, just an implication that it had something to do with copying and pasting.
Am I alone in finding it questionable that you should be monitoring my behaviour in this way? I don’t expect my keyboard activity in the privacy of my own browser window to be a matter of public knowledge. Next, you’ll be switching the camera on and watching me. If you’re going to ask for permission to install cookies, I think you should ask for permission to monitor keystrokes and mouse movements.
Hi Michael, I completely agree with you, and I’m sure we are not alone on this. Still, I think it is at least good that SO plainly admits to doing this, while other companies, especially big ones, do it under the hood. We are being watched more and more all the time. Just think what AI, ML, and DL can do with that information that we humans cannot; that really frightens me! So yes, this is a BIG and GROWING issue, and something needs to be done about it ASAP.
Very interesting topic!
I’d think that someone who just wants to copy is going to log out first, then copy.
Personally I admit having done this too, but I always review the code to understand what it actually does!
Simply because a lot of code isn’t going to work just like that. Most of the time it needs some tweaks, but the wonderful thing is that you do not have to read a bible on the subject to get what you need.
That approach belongs to the past. There’s too much we need to know these days, so we need a fast way, and StackOverflow is very useful when you get stuck. So please keep up the good work – we need you just as much as you need us!! 🙂
Have a great day all!
Very interesting topic that spawned from an April Fools’ joke 🙂
Please add an upvote button for your blogs. My instinct was telling me to upvote this blog but I can’t find it.
And a thumbs-up button for comments below the blogs would also be great!
First thing I do when I check-out the answer to a problem is to test it in an isolated environment… This equates to a copy and paste to an isolated (html, php etc.) file… You would be surprised how much code doesn’t work due to unannounced or wrong versions of classes (especially GoogleAPI)… That is why I have a certain subset of code sharing websites such as stackoverflow who verify and rank answers… YEH – WELL DONE STACKOVERFLOW
I’d like to know the language breakdown. Probably a lot of bash being copied that shouldn’t be.
Why do you think that?
Is it possible that these stats are skewed by bots grabbing content?
I find it difficult to find out which comments #comment47852929 and #comment71717506 are, as the web browser does not jump to them. (Google Chrome 90, macOS Sierra 10.12.6)
Firefox 98 (Linux Mint 20.3) jumped to the second one, and it blinked yellow once when the page finished loading. The first comment may have been deleted in the meantime?
If you missed what it looked like, watch this video:
Thanks for the video!
What if the user copies from the right-click menu? Uses Ctrl-X? Has a custom copy key in the browser via extension settings? Uses screenshot + image-to-text?
I think the page just ignores such actions. But I thought about the right-click menu, too.
today we live in the pre active code editor epoch where people drive their editor … at some point the editor will actively do the driving and software engineers will act in an advisory role feeding hints, suggestions and essentially thumbs up or down back into the active editor … that will happen during the transition phase before the machine just wins … until then yes copying and organizing top tips is a critical skill of any productive developer
it would be better if stack overflow put a reference to the post in with the copied code and a legal disclaimer that the comment with the post reference must be included for use “so people can troubleshoot your shit job”
Amazing! One question remains: What % of the code on the Stackoverflow site comes from copies of answers?
By the way, thanks write for us
For 2022 I hope these copy stats will become a regular part of SO 😉
I don’t think you’ve done this, but I’d like to see a breakdown of copies versus the post score *relative to the highest scoring post*.
Mainly to see how often somebody copies from the less highly voted answers, and which reputation brackets do that most often.
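One rough way to compute such a breakdown is sketched below. It is only an illustration: the data layout, the function names, and the numbers are all invented, and the reputation-bracket part is left out because it would need per-user data that is not described in the post.

```python
def relative_score_copies(question_answers):
    """Pair each answer's copy count with its score relative to the top answer.

    `question_answers` maps a question id to a list of (score, copy_count)
    pairs. Questions whose best score is zero or negative are skipped,
    because a ratio against such a score is not meaningful.
    """
    rows = []
    for answers in question_answers.values():
        top = max(score for score, _ in answers)
        if top <= 0:
            continue
        for score, copies in answers:
            rows.append((score / top, copies))
    return rows


def share_from_lower_scored(rows):
    """Percentage of all copies taken from answers scoring below the top answer."""
    total = sum(copies for _, copies in rows)
    lower = sum(copies for ratio, copies in rows if ratio < 1.0)
    return 100.0 * lower / total if total else 0.0


# Invented sample data: question id -> [(answer score, copies observed)].
sample = {
    "q1": [(100, 30), (50, 10)],
    "q2": [(10, 5), (10, 5)],
    "q3": [(0, 4)],
}
rows = relative_score_copies(sample)
lower_share = share_from_lower_scored(rows)
print(f"{lower_share:.0f}% of copies came from answers below the top score")
```

Using the ratio to the top score, rather than the raw score, makes a 50-point answer under a 100-point answer comparable to a 5-point answer under a 10-point one.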
Well this shows how internet giants can easily crawl into user action. This is just a copy of text and you get rep. value and other attributes. Imagination has no limits.
Ok, this was a very funny april fools joke. But:
Am I the only one who’s a little freaked out to discover hir keystrokes have been surveilled once the joke was over? Of all the sites on the internet, I trust SO’s motive among the most, but this seems very uncool in principle.
The above idea to have an HTML “copy” button to copy each code block, plus a counter to show that many people found a code block useful, would be not only acceptable, but useful. Since it involves direct interaction with page elements, any reasonable developer would assume it’s trackable. But my own keystrokes… that’s creepy.
Now I wonder how many other sites are silently tracking my keystrokes, and for what purpose, and what I can do to make sure they’re not.
Also YouTube looks at your keyboard. Just type “awesome” while in no text input field on the YouTube video player site and see what happens. (look at the red elements in the player itself, like the “HD” marker in the quality settings)
Don’t copy-paste commands from webpages — you can get hacked. https://www.bleepingcomputer.com/news/security/dont-copy-paste-commands-from-webpages-you-can-get-hacked/
Preview clipboard with https://clipboardplaintextpowertool.blogspot.com
Finally I found a way to protect my own little fictional story from being copied, although in an unusual way, thank you!
Such an interesting post. Yes, people often copy the code from stack overflow but the best practice is to share the URL of that code.
In my own copying experience, the first answer that comes – i.e. the accepted one in the first question googled – is more of a direction than a solution. More googling ensues, and I do take the time to read the answers below the accepted one.
Come to think of it, I use stackoverflow a lot, yet I directly copy just a tiny portion of answers – 5% at best. I am also under the impression (no real statistical analysis here) that copying answers from long-established fields (i.e. unix/bash) is much more prevalent, as they tend to contain “the truth”. Newer fields require more than one pass, definitely.
I agree with Mike Qt – I think this invades my privacy. I didn’t consciously give StackOverflow permission to monitor my behaviour in this way. It’s almost like switching my desktop camera on without asking me first.