Please remove it if unallowed
I see a lot of people in here who get mad at AI-generated code and I am wondering why. I wrote a couple of bash scripts with the help of ChatGPT and, if anything, I think it’s great.
Now, I obviously didn’t tell it to write the entire code by itself. That would be a horrible idea. Instead, I would ask it questions along the way and test its output before putting it in my scripts.
I am fairly competent in writing programs. I know how and when to use arrays, loops, functions, conditionals, etc. I just don’t know anything about Bash’s syntax. Now, I could have used any other language I know, but chose Bash because it made the most sense: Bash is shipped with most Linux distros out of the box and one does not have to install another interpreter/compiler for another language. I don’t like Bash because of its, dare I say, weird syntax, but it made the most sense for my purpose so I chose it. Also, I had not written anything of this complexity in Bash before, just a bunch of commands on separate lines so that I don’t have to type them one after another. But this one required many rather advanced features. I was not motivated to learn Bash; I just wanted to put my idea into action.
I did start with an internet search, but the guides I found were lacking. I could not easily find how to pass values into a function and return values from it, how to remove a trailing slash from a directory path, how to loop over an array, how to catch errors that occurred in the previous command, how to separate letters and numbers in a string, etc.
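For anyone searching for the same things, here is a rough sketch of those tasks in Bash. The function and variable names (strip_trailing_slash, split_letters_digits, dirs) are made up for illustration and are not taken from the actual scripts, and this is only one of several ways to do each of them.

```shell
#!/usr/bin/env bash

# Values go in as positional parameters ($1, $2, ...); a string result
# comes back by printing it and capturing it with $(...).
strip_trailing_slash() {
    local path="$1"
    # ${path%/} drops one trailing slash if there is one.
    # Caveat: a bare "/" becomes an empty string.
    printf '%s\n' "${path%/}"
}

# Separate the letters and the digits of a string such as "abc123".
split_letters_digits() {
    local input="$1"
    local letters="${input//[^[:alpha:]]/}"   # delete everything but letters
    local digits="${input//[^[:digit:]]/}"    # delete everything but digits
    printf '%s %s\n' "$letters" "$digits"
}

# Loop over an array, calling a function on each element.
dirs=("/tmp/a/" "/tmp/b")
for d in "${dirs[@]}"; do
    echo "cleaned: $(strip_trailing_slash "$d")"
done

# Catch an error from a command: test it directly in the if ...
if ! cd /no/such/dir/for/this/demo 2>/dev/null; then
    echo "cd failed, staying put"
fi

# ... or look at $? right after the command ran.
false
status=$?
if [ "$status" -ne 0 ]; then
    echo "previous command exited with status $status"
fi
```

Printing the result and capturing it with $(...) is the usual workaround for the fact that return in Bash only carries an exit status (0 to 255), not a string.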
That is where ChatGPT helped greatly. I would ask ChatGPT to write these pieces of code whenever I ran into them, then test its code with various inputs to see if it worked as expected. If not, I would ask again, describing which case failed, and it would revise the code before I put it in my scripts.
Thanks to ChatGPT, someone with zero knowledge of Bash can quickly and easily write fairly advanced Bash. I don’t think I could have written this as quickly the old-fashioned way. I would eventually have written it, but it would have taken far too long. Thanks to ChatGPT I can just write all this quickly and forget about it. If I ever want to learn Bash and am motivated, I will certainly take the time to learn it properly.
What do you think? What negative experiences have you had with AI chatbots that made you hate them?
Many lazy programmers may just copy paste without thinking too much about the quality of generated code. The other group of people who oppose it are those who think it will kill programming jobs.
There is an enormous difference between:
rm -rf / path/file
vs.
rm -rf /path/file
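One way to make that kind of slip less costly is a small guard in front of rm. The sketch below assumes two hypothetical helpers, is_dangerous_target and safe_rm. Neither is a standard command, and the list of refused targets is only an example. GNU rm already refuses a bare / by default, but a wrapper like this also covers an empty argument and a few other obvious mistakes.

```shell
#!/usr/bin/env bash

# is_dangerous_target: hypothetical check, succeeds (returns 0) when the
# argument is something we never want to hand to rm -rf.
is_dangerous_target() {
    case "$1" in
        ""|"/"|"/*"|"."|"..") return 0 ;;
        *) return 1 ;;
    esac
}

# safe_rm: hypothetical wrapper that inspects every argument first and
# removes nothing at all if any one of them looks dangerous.
safe_rm() {
    local target
    for target in "$@"; do
        if is_dangerous_target "$target"; then
            echo "refusing to remove '$target'" >&2
            return 1
        fi
    done
    # "--" stops option parsing, so a file named "-x" is treated as a file.
    rm -rf -- "$@"
}
```

With this in place, safe_rm / path/file stops at the stray / and leaves path/file alone, while safe_rm /path/file behaves like the plain command.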
Many lazy programmers may just copy paste without thinking too much about the quality of generated code
Tbf, they’ve been doing that LONG before AI came along
Sure, but if you’re copying from stack overflow or reddit and ignore the dozens of comments telling you why the code you’re copying is wrong for your use case, that’s on you.
An LLM on the other hand will confidently tell you that its garbage is perfect and will do exactly what you asked for, and leave you to figure out why it doesn’t by yourself, without any context.
An inexperienced programmer who’s willing to learn won’t fall for the first case and will actually learn from the comments and alternative answers, but will be completely lost if the hallucinating LLM is all they’ve got.
I use it as a time-saving device. The hardest part is spotting when it’s not actually saving you time, but costing you time in back-and-forth over some little bug. I’m often better off fixing it myself when it gets stuck.
I find it’s just like having another developer to bounce ideas off. I don’t want it to produce 10k lines of code at a time, I want it to be digestible so I can tell if it’s correct.
I’ve found it to be extremely helpful in coding. Instead of trying to read huge documentation pages, I can just have a chatbot read it and tell me the answer. My coworker has been wanting to learn PowerShell. Using a chatbot, his understanding of the language has greatly improved. A chatbot can not only give you the answer, but it can break down how it reached that conclusion. It can be a very useful learning tool.
It’s great for regurgitating pre-written text. For generating new or usable code it’s largely useless. It doesn’t have an actual understanding of what it says. It can recombine information and elements it’s seen before, but not generate anything truly unique.
That isn’t what the comment you replied to was talking about so that’s why you’re getting downvoted even though some of what you said is right.
The first sentence addressed what they talked about. It’s great as an assistant to cut through documentation to get at what you need. In fact, here’s a recent video from Perry Fractic doing just that with microtext for the C64.
Anything else, like having it generate the code itself, is more of a liability than an asset, since it doesn’t really understand what it’s doing.
Perhaps I should have separated the two thoughts initially? Either way I’ve said my piece.
Top level comment is talking about using it for learning. Saying that AI is just regurgitating text doesn’t address that fact at all. In fact it sounds like you were putting down the commenter for using it for learning.
The bulk of your comment was about how poorly it writes code which isn’t what that comment was talking about. At all. So yes, I agree, you should have separated your two thoughts and probably focused the second thought on a different thread within this post. Perhaps at the top level to say it to the OP.
I’ve been using it for CLI syntax and code for a while now. It’s not always right, but even when it isn’t, it definitely helps in getting you almost all the way there. I will continue to use it 😁
It’s really useful to quickly find the parameters to convert something in a specific way using ffmpeg.
Hell yeah it is. So much faster than reading the man pages and stuff
When was it wrong? I am curious how wrong it was and which AI assistant you asked.
ChatGPT, all versions. I don’t know. I use it a lot and I just know it’s been wrong. PowerShell comes to mind. And Juniper SRX syntax. And Alcatel.
Yes, me too. You can often ask it to explain something to a layman and it provides a pretty easy-to-follow explanation.
Is the explanation accurate?
It could be, in a monkeys with typewriters sort of way… 🤷‍♂️
I agree AI is a godsend for non coders and amateur programmers who need a quick and dirty script. As a professional, the quality of code is oftentimes 💩 and I can write it myself in less time than it takes to describe it to an AI.
AI is a godsend for non coders and amateur programmers who need a quick and dirty script.
Why?
I mean, it is such a cruel thing to say.
50% of these poor non coders and amateur programmers would end up with a non-functioning script. I find it so unfair!
You have not even tried to decide who deserves and gets the working solution and who gets the garbage script. You are soo evil…
I think you’ve hit the nail on the head. I am not a coder, but using ChatGPT I was able to take someone else’s simple program and modify it for my own needs within just a few hours of work. It’s definitely not perfect and you still need to put in some work to get your program to run exactly the way you want it to, but using ChatGPT is a good place to start for beginners, as long as they understand that it’s not a magic tool.
i love it when the AI declares and sets important sounding variables it then never uses 🙄
I think the process of explaining what you want to an AI can often be helpful. Especially given the number of times I’ve explained things to junior developers and they’ve said they understood completely, but then when I see what they wrote they clearly didn’t.
Explaining to an AI is a pretty good test of how well the stories and comments are written.
Because most people on Lemmy have never actually had to write code professionally.
When it comes to writing code, there is a huge difference between code that works and code that works *well*. Let’s say you’re tasked with writing a function that takes an array of RGB values and converts them to grayscale. ChatGPT is probably going to give you two nested loops that iterate over the X and Y values, applying a grayscale transformation to each pixel. This will get the job done, but it’s slow, inefficient, and generally not well-suited for production code. An experienced programmer is going to take into account possible edge cases (what if a color is out of the 0-255 bounds), apply SIMD functions and parallel algorithms, factor in memory management (do we need a new array or can we write back to the input array), etc.
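To make the edge-case point concrete, here is a tiny sketch in Bash of a per-pixel conversion that rejects out-of-range channels instead of silently producing nonsense. The name to_gray is hypothetical, the integer weights 299/587/114 are the common BT.601 luma weights scaled by 1000, and this says nothing about the performance side (SIMD, parallelism, in-place writes), which a shell script is the wrong tool for anyway.

```shell
#!/usr/bin/env bash

# to_gray: hypothetical helper, prints the gray level of one RGB pixel.
# Each channel must be a plain integer in 0..255, otherwise it fails.
to_gray() {
    local r="$1" g="$2" b="$3" c
    # Up to three digits, no sign, no leading zeros (Bash arithmetic
    # would read a leading zero as octal).
    local re='^(0|[1-9][0-9]{0,2})$'
    for c in "$r" "$g" "$b"; do
        if ! [[ $c =~ $re ]] || (( c > 255 )); then
            echo "channel out of range: '$c'" >&2
            return 1
        fi
    done
    # Integer arithmetic only; the result is truncated, not rounded.
    echo $(( (299 * r + 587 * g + 114 * b) / 1000 ))
}
```

Calling to_gray 255 0 0 prints 76, while to_gray 256 0 0 prints an error to stderr and returns a non-zero status, which is the kind of case a quick generated loop tends to skip.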
ChatGPT is great for experienced programmers to get new ideas; I use it as a modern version of “rubber ducky” debugging. The problem is that corporations think that LLMs can replace experienced programmers, and that’s just not true. Sure, ChatGPT can produce code that “works,” but it will fail at edge cases and will generally be inefficient and slow.
Exactly. LLMs may replace interns and junior devs, but they won’t replace senior devs. And if we replace all of the interns and junior devs, who is going to become the next generation of senior devs?
As a senior dev, a lot of my time is spent reviewing others’ code, doing pair-programming, etc. Maybe in 5-10 years, I could replace a lot of what they do with an LLM, but then where would my replacement come from? That’s not a great long-term direction, and it’s part of how we ended up with COBOL devs making tons of money because financial institutions are too scared to port it to something more marketable.
When I use LLMs, it’s like you said, to get hints as to what options I have. I know it’s sampling from a bunch of existing codebases, so having the LLM go figure out what’s similar can help. But if I ask the LLM to actually generate code, it’s almost always complete garbage unless it’s really basic structure or something (e.g. generate a basic web server using <framework>), but even in those cases, I’d probably just copy/paste from the relevant project’s examples in the docs.
That said, if I had to use an LLM to generate code for me, I’d draw the line at tests. I think unit tests should be hand-written so we at least know the behavior is correct given certain inputs. I see people talking about automating unit tests, and I think that’s extremely dangerous and akin to “snapshot” tests, which I find almost entirely useless, outside of ensuring schemas for externally-facing APIs are consistent.
Now, I obviously didnt tell it to write the entire code by itself. […]
I am fairly competent in writing programs.
Go ahead using it. You are safe.
A lot of people spent many, many nights wasting away learning some niche arcane knowledge and are now freaking out that a kid out of college can do what they can with a cool new machine. Maybe not fully what they do, but 70% there, and that makes them so hateful. They’ll pull out all these articles and studies, but they’re just afraid to face the reality that their time and life were wasted and how unfair life can be.
Who hurt you?
I have been there, wasted learning stupid things I will never need to know.
Coders are gonna get especially screwed by AI, compared to other industries that were disrupted by leaps in technology.
Look at auto assembly. Look at how many humans used to be involved in that process. Now a lot of the assembly is performed by robotics.
The real sad part is that there’s tons of investment (in terms of time and in terms of money) to become a skilled programmer. Any idiot can read a guide on Python and throw together some functional scripts, but programming isn’t just writing lines of code. That code comes from tons of experience, experiments, and trial and error.
At least auto workers had unions though. Coders don’t have that luxury. As a profession it really had its big boom at a time when people had long since been trained to be skeptical of them.
I don’t think it’s the same at all. Building code isn’t the same as building physical vehicle parts. All it’ll mean is that any company that uses strictly AI will be beaten by a company using AI plus developers, because the developers will just add AI as another tool in their toolbox to develop code.
It gives a false sense of security to beginner programmers and doesn’t offer a more tailored solution that a more practiced programmer might create. This can lead to a reduction in code quality and can introduce bugs and security holes over time. If you don’t know the syntax of a language how do you know it didn’t offer you something dangerous? I have Copilot at work and the only things I actually accept its suggestions for now are writing log statements and populating argument lists. While those both still require review they are generally faster than me typing them out. Most of the rest of what it gives me is undesired: it’s either too verbose, too hard to read, or just does something else entirely.
If the AI was trained on code that people permitted to be freely shared then go ahead. Taking code and ignoring the software license is largely considered a dick-move, even by people who use AI.
Some people choose a copyleft software license to ensure users have software freedom, and this AI (a math process) circumvents that. [A copyleft license makes it so that you can use the code if you agree to use the same license for the rest of the program - therefore users get the same rights you did]
I hate big tech too, but I’m not really sure how the GPL or MIT licenses (for example) would apply. LLMs don’t really memorize stuff like a database would and there are certain (academic/research) domains that would almost certainly fall under fair use. LLMs aren’t really capable of storing the entire training set, though I admit there are almost certainly edge cases where stuff is taken verbatim.
I’m not advocating for OpenAI by any means, but I’m genuinely skeptical that most copyleft licenses have any stake in this. There’s no static linking or source code distribution happening. Many basic algorithms don’t fall under copyright, and, in practice, Stack Overflow code is copy/pasted all the time without that being released under any special license.
If your code is on GitHub, it really doesn’t matter what license you provide in the repository – you’ve already agreed to allowing any user to “fork” it for any reason whatsoever.
Whether it’s a complicated neural network or a database doesn’t matter. It outputs portions of the code used as input, by design.
If you can take GPL code and “not” distribute it via complicated maths then that circumvents it. That won’t do, friendo.
For example, if I ask it to produce Python code for addition, which GPL’d library is it drawing from?
I think it’s clear that the fair use doctrine no longer applies when OpenAI turns it into a commercial code assistant, but then it gets a bit trickier when used for research or education purposes, right?
I’m not trying to be obtuse-- I’m an AI researcher who is highly skeptical of AI. I just think the imperfect compression that neural networks use to “store” data is a bit less clear than copy/pasting code wholesale.
Would you agree that somebody reading source code and then reimplementing it (assuming no reverse engineering or proprietary source code) would not violate the GPL?
If so, then the argument that these models infringe on rights holders seems to hinge on the verbatim argument that their exact work was used without attribution/license requirements. This surely happens sometimes, but is not, in general, a thing these models are capable of since they’re using lossy compression to “learn” the model parameters. As an additional point, it would be straightforward to then comply with DMCA requests using any number of published “forced forgetting” methods.
Then, that raises a further question.
If I as an academic researcher wanted to make a model that writes code using GPL’d training data, would I be in compliance if I listed the training data and licensed my resulting model under the GPL?
I work for a university and hate big tech as much as anyone on Lemmy. I am just not entirely sure GPL makes sense here. GPL 3 was written because GPL 2 had loopholes that Microsoft exploited and I suspect their lawyers are pretty informed on the topic.
The corresponding training data is the best bet to see what code an input might be copied from. This can apply to humans too. To avoid lawsuits, reverse engineering projects use a clean room strategy: requiring contributors to have never seen the original code. This is to argue they can’t possibly be copying, even from memory (an imperfect compression too).
If it doesn’t include GPL code then that can’t violate the GPL. However, OpenAI argue they have to use copyrighted works to make specific AIs (if I recall correctly). Even if legal, that’s still a problem to me.
My understanding is AI generated media can’t be copyrighted as it wasn’t a person being creative - like the monkey selfie copyright dispute.
Yeah. I’m thinking more along the lines of research and open models than anything to do with OpenAI. Fair use, above all else, generally requires that the derivative work not threaten the economic viability of the original and that’s categorically untrue of ChatGPT/Copilot which are marketed and sold as products meant to replace human workers.
The clean room development analogy is definitely an analogy I can get behind, but raises further questions since LLMs are multi stage. Technically, only the tokenization stage will “see” the source code, which is a bit like a “clean room” from the perspective of subsequent stages. When does something stop being just a list of technical requirements and veer into infringement? I’m not sure that line is so clear.
I don’t think the generative copyright thing is so straightforward since the model requires a human agent to generate the input even if the output is deterministic. I know, for example, Microsoft’s Image Generator says that the images fall under Creative Commons, which is distinct from public domain given that some rights are withheld. Maybe that won’t hold up in court forever, but Microsoft’s lawyers seem to think it’s a bit more nuanced than “this output can’t be copyrighted”. If it’s not subject to copyright, then what product are they selling? Maybe the court agrees that LLMs and monkeys are the same, but I’m skeptical that that will happen considering how much money these tech companies have poured into it and how much the United States seems to bend over backwards to accommodate tech monopolies and their human rights violations.
Again, I think commercial entities using their market position to eliminate the need for artists and writers is clearly against the spirit of copyright and intellectual property, but I also think there are genuinely interesting questions when it comes to models that are themselves open source or non-commercial.
The human brain is compartmentalised: you can damage a part and lose the ability to recognize faces, or name tools. Presumably it can be seen as multi-stage too but would that be a defense? All we can do is look for evidence of copyright infringement in the output, or circumstantial evidence in the input.
I’m not sure the creativity of writing a prompt means you were creative for creating the output. Even if it appears your position is legal you can still lose in court. I think Microsoft is hedging their bets that there will be precedent to validate their claim of copyright.
There are a few Creative Commons licenses but most actually don’t prevent commercial use (the ShareAlike is like the copyleft in GPL for code). Even if the media output was public domain and others are free to copy/redistribute that doesn’t prevent an author selling public domain works (just harder). Code that is public domain isn’t easily copied as the software is usually shared without it as a binary file.
It doesn’t pass judgment. It just knows what “looks” correct. You need a trained person to discern that. It’s like describing symptoms to WebMD. If you had a junior doctor using WebMD, how comfortable would you be with their assessment?
One point that stands out to me is that when you ask it for code it will give you an isolated block of code to do what you want.
In most real world use cases though you are plugging code into larger code bases with design patterns and paradigms throughout that need to be followed.
An experienced dev can take an isolated code block that does X and refactor it into something that fits in with the current code base etc. We already do this daily with Stack Overflow.
An inexperienced dev will just take the code block and try to ram it into the existing code in the easiest way possible without thinking about whether the code could use existing dependencies, whether it’s testable, etc.
So anyway, I don’t see a problem with the tool; it’s just like using Stack Overflow. But as we have seen, businesses and inexperienced devs seem to think it’s more than this and can do their job for them.
but chose bash because it made the most sense, that bash is shipped with most linux distros out of the box and one does not have to install another interpreter/compiler for another language.
Last time I checked (because I was writing Bash scripts based on the same assumption), Python was actually present on more Linux systems out of the box than Bash.
If you’re a seasoned developer who’s using it to boilerplate / template something and you’re confident you can go in after it and fix anything wrong with it, it’s fine.
The problem is it’s used often by beginners or people who aren’t experienced in whatever language they’re writing, to the point that they won’t even understand what’s wrong with it.
If you’re trying to learn to code or code in a new language, would you try to learn from somebody who has only half a clue what he’s doing and will confidently tell you things that are objectively wrong? That’s much worse than just learning to do it properly yourself.
I’m a seasoned dev and I was at a launch event when an edge case failure reared its head.
In less than half an hour after pulling out my laptop to fix it myself, I’d used Cursor + Claude 3.5 Sonnet to:
- Automatically add logging statements to help identify where the issue was occurring
- Tell it the issue once identified and have it update the code with a fix
- Have it remove the logging statements, and push the update
I never typed a single line of code and never left the chat box.
My job is increasingly becoming Henry Ford drawing the ‘X’ and not sitting on the assembly line, and I’m all for it.
And this would only have been possible in just the last few months.
We’re already well past the scaffolding stage. That’s old news.
Developing has never been easier or more plain old fun, and it’s getting better literally by the week.
Edit: I agree about junior devs not blindly trusting them though. They don’t yet know where to draw the X.
Edit: I agree about junior devs not blindly trusting them though. They don’t yet know where to draw the X.
The problem (one of the problems) is that people do lean too heavily on the AI tools when they’re inexperienced and never learn for themselves “where to draw the X”.
If I’m hiring a dev for my team, I want them to be able to think for themselves, and not be completely reliant on some LLM or other crutch.
As a cybersecurity guy, it’s things like this study, which said:
Overall, we find that participants who had access to an AI assistant based on OpenAI’s codex-davinci-002 model wrote significantly less secure code than those without access. Additionally, participants with access to an AI assistant were more likely to believe they wrote secure code than those without access to the AI assistant.
FWIW, at this point, that study would be horribly outdated. It was done in 2022, which means it probably took place in early 2022 or 2021. The models used for coding have come a long way since then, so the study would essentially have to be redone on current models to see if that’s still the case.
People’s perceptions have probably not changed, but whether the code is actually insecure would need to be reassessed.
I think it’s more appalling because they should have assumed this was early tech and therefore less trustworthy. If anything, I’d expect more people to believe their code is secure today using AI than back in 2021/2022 because the tech is that much more mature.
I’m guessing an LLM will make a lot of noob mistakes, especially in languages like C(++) where a lot of care needs to be taken for memory safety. LLMs don’t understand code, they just look at a lot of samples of existing code, and a lot of code available on the internet is terrible from a security and performance perspective. If you’re writing it yourself, hopefully you’ve been through enough code reviews to catch the more common mistakes.
Sure, but to me that means the latest information is that AI assistants help produce insecure code. If someone wants to perform a study with more recent models to show that’s no longer the case, I’ll revisit my opinion. Until then, I’m assuming that the study holds true. We can’t do security based on “it’s probably fine now.”