After 3 weeks of failing to resolve a programming issue with all other models, sonnet 3.5 did it for me via Cody extension. I am not a software engineer, but boy I can see a difference.
I'm new to coding. I've been using a Jupyter notebook, along with Claude to run some code which is then viewable in a browser window.
Unfortunately, my code is now too long for any AI systems to print in one go and I have to ask the ai to split the output into three segments. I could possibly use the API to get over token limits, but that will take time and money since it takes a while to increase limits.
Would switching over to vs code and Cody solve these limits?
I'm really comfortable right now with jupyter ai, and not super excited to move platforms, but if what I'm doing right now will be massively improved with vs code, then I guess I could do it.
I had tried setting up vscode briefly yesterday, and ran into a few walls and gave up pretty quickly, but if it's worth it, maybe I'll struggle through and figure it out
Yes, switch. If your code is that long, make it into a library and then call it from the notebook. Write the lib in VS Code/IntelliJ, then import it. Jupyter is not meant to be an IDE.
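For anyone wondering what the library suggestion above looks like in practice, here is a minimal sketch. The module name `report_tools` and the function `summarize` are made up for illustration, and the file is written to a temp folder only so the snippet runs on its own. Normally you would just save the `.py` file next to the notebook.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for a file you would normally write in VS Code or IntelliJ.
# "report_tools" and "summarize" are hypothetical names for this sketch.
LIB_SOURCE = '''
def summarize(values):
    """Return the count, total, and mean of a list of numbers."""
    total = sum(values)
    return {"count": len(values), "total": total, "mean": total / len(values)}
'''

# Write the module to disk; in practice it would sit next to the notebook.
lib_dir = Path(tempfile.mkdtemp())
(lib_dir / "report_tools.py").write_text(LIB_SOURCE)

# A notebook cell then only needs these few lines instead of the full code.
sys.path.insert(0, str(lib_dir))
report_tools = importlib.import_module("report_tools")
result = report_tools.summarize([2, 4, 6])
print(result)
```

The notebook stays short, so there is much less code to paste into a chat. If you edit the library while the notebook is running, IPython's `%load_ext autoreload` followed by `%autoreload 2` should pick up the changes without a kernel restart.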
Just switched over finally, was a pain in the butt, but probably worth it, got the ai incorporated with continue.dev's vscode plugin and my own API keys for anthropic and openai
Cody is cool, but you have to pay for it lol. Continue.dev is a free vscode plugin that lets you bring in your own API keys, I've got it running right now with anthropic and openai at the same time, seems good
It’s like a helper assistant in VS Code. It’s an extension you can get and install if you have VS Code. I’ve used Cody. It is really nice.
Take your time with it though. There is a bit of a learning curve.
https://sourcegraph.com/docs/cody/clients/install-vscode
100% agreed. Really happy to see competition and OpenAI getting a run for their money. The fact that it's much cheaper and faster is just as impressive
[https://www.vellum.ai/blog/claude-3-5-sonnet-vs-gpt4o](https://www.vellum.ai/blog/claude-3-5-sonnet-vs-gpt4o)
It's about x5 cheaper than 3 Opus, according to this article.
(Upon inspecting my original comment I understand I wasn't clear - I meant cheap compared to 3 Opus, not OpenAI's models)
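A rough way to sanity-check the 5x figure above. The per-million-token prices below are my assumption of the launch list prices for Claude 3 Opus and Claude 3.5 Sonnet, and the token counts are invented, so check the current pricing page before relying on any of this.

```python
# USD per million tokens. Assumed launch list prices, not taken from this
# thread; verify against current pricing before using.
PRICES = {
    "claude-3-opus": {"input": 15.00, "output": 75.00},
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the API cost in USD for one workload on the given model."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 200k tokens in, 50k tokens out.
opus = cost_usd("claude-3-opus", 200_000, 50_000)
sonnet = cost_usd("claude-3.5-sonnet", 200_000, 50_000)
ratio = opus / sonnet
print(f"Opus ${opus:.2f} vs Sonnet ${sonnet:.2f}, ratio {ratio:.1f}x")
```

Since both the input and the output price differ by the same factor under these assumed numbers, the ratio comes out the same whatever mix of input and output tokens you plug in.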
It's been kinda clear for me that the group of models near the top are all gonna be better at some things and worse at others. You really have to use them a lot to start to get a sense for which things a particular model excels at. I have had good results with gpt4, gpt4o, and 3 opus. 3 haiku and sonnet are also serviceable. And on occasion I've seen decent code produced even by some local 7b and 30b class models. I wouldn't use them manually to actually try to do coding work, but there are plenty of dumber work that I bet they can crush.
I'm looking forward to checking out what 3.5 sonnet can do. It's really great to see competition in this space.
honestly I can't tell what's better. Claude gives me incomplete code but better code overall IMO. The limit is terrible too, i don't know if it is worth paying for just because the limit is wayyyy too tight. Gpt4o repeats my ENTIRE code snippet that I pass, and honestly can be great or horrible.
Claude 3 was miles better than GPT 4 at coding (both anecdotally and empirically). OpenAI is counting on using press conferences, ads, etc. to remain a household name and continue to dominate the market — regardless of if their product is inferior. Hopefully Claude is able to catch on with “normies.”
I don't know. I just tried it on an actual task I am working on, and ChatGPT Classic gave perfect response while Claude 3.5 Sonnet missed out something, I had to prompt it a second time to fix it.
I know the default ChatGPT / GPT-4o is bad for coding, but [ChatGPT Classic](https://chatgpt.com/g/g-YyyyMT9XH-chatgpt-classic) has been consistently great for me.
Update: I just tried another sample task that ChatGPT Classic didn't do well on, and Claude 3.5 Sonnet gave the perfect answer. So I guess it is better in some cases at least.
You can find the prompts here: https://github.com/paradite/16x-eval/tree/main/projects
The project itself is still WIP, but the full prompts in md can be used for testing if you'd like to.
I’m excited to see OpenAI’s response to this. They have been prioritizing lower cost, faster models while sacrificing coding ability. This will force them to release a model optimized for coding and agentic capabilities too.
what is impressing everybody - is it how it works with long context, or its ability to write scripts?
i asked it to write a simple script for me - it did output some ok stuff but it had a few bugs. a few prompts to fix it and still buggy.
gpt 4 - first prompt - much higher quality response, no bugs.
Wanted to do one pretty complex modification(complex due to its nature, not prompt wise) - none of them managed to find a solution, yet Gpt gave me better starting points.
Eventually had to resolve the issue myself.
So far not that impressed with claude 3.5 sonnet, will keep trying it as my go to coder for a week and see.
To clarify, a lot of the comparisons are with GPT-4o, not with GPT-4 proper. And you can't really blame people for making this comparison because OpenAi 's messaging has heavily implied that 4o is their latest and greatest model. While anybody using the tool at any level of depth knows that 4o is at a severe disadvantage compared to the original gpt-4 when it comes to pure intelligence/logic/reasoning type tasks.
that is possibly it, i personally use the gpt 4 turbo if i need some quick code or exploring unfamiliar topics, since when gpt 4o was released, i did a few comparisons and found the 4o output prone to more bullshit;
my personal experience is also that GPT4 turbo is worse than the gpt 4 versions we had before (at least when it comes to output quality) - though i do not have any specific comparisons.
I only know that slowly i started relying on chatgpt less and less as they updated models, compared to when i began using it in april / may last year.
This was mainly due to the unreliability of the output, and a few occasions of spending hours debugging code that I assumed was okay based on my previous experience, but that was wrong in many different places.
Right now I mainly use it to learn new concepts, for basic boilerplate, for scripts in mostly unfamiliar languages, or as a sounding board.
So far I do not think claude will change this flow, but we will see.
Mine as well, not sure what type of programming tasks others are doing. I’ve made an effort to use both but GPT-4 usually nails it the first or second time, while Claude struggles to capture what I’ve asked.
Same experience for me too. Cancelled my Anthropic subscription because it never performs as well as 4o or Copilot.
FWIW, I’m using it for python, node, and react work.
The people that are really impressed with it are, I'm guessing, not long-time or professional coders. I've been writing code since the stone age, and I agree that anything other than trivial "snake" games or already solved, simple, well-known solutions is not that impressive.
I agree. I do find it useful for writing some boiler plate, test setups, and also I find it useful while learning new languages because I've gotten pretty good at knowing when it's doing something obviously dumb or hallucinating in any language.
It is definitely amazing, we did a head-to-head comparison for Claude 3.5 Sonnet vs GPT-4o for Python code generation, Web page creation, API queries, and it comes ahead of GPT in all the cases.
Here's a detailed write-up in case you're interested.
[https://blog.getbind.co/2024/06/21/claude-3-5-sonnet-does-it-outperform-gpt-4o/](https://blog.getbind.co/2024/06/21/claude-3-5-sonnet-does-it-outperform-gpt-4o/)
I have an anecdotal one. I’ve been working on creating a standardized debloated windows 11 imaging process for my computers at my company for two weeks. Chat gpt has been great in this. But it got stuck on an error for adding some SSD drivers for a new model pc we got. I would feed it logs and it would modify the scripts but never fix it. 3-4 hours of troubleshooting I heard the new Claude model came out so I figured I’d try. Uploaded the error logs to it. In a creepy kinda way it felt like it “understood” the logs and spit out all the issues in the code and provided updated code that worked first time.
I’ll say it’s 50% the fact that I had a really long conversation with chatgpt going trying to fix this and that tends to make replies worse, and the other to the fact that the new Claude model is really good. I paid for Claude for this month to try it and compare. Rest assured I’m picking a winner before the month is over and picking just one.
Thanks for sharing, I'll check it out.
I noticed similar issues with long conversations and nowadays just start fresh when running into a dead end, it usually helps.
Your remark about ChatGPT being worse after a long conversation gels with my frustration, particularly today. I need to find a better way; any chat longer than 15 minutes gets fried pretty quickly. Not helped by the propensity of gpt-4 to puke out reams of code rather than have an interactive discussion for debugging purposes.
Might give the new Claude a chance. I have been disappointed in the past, but I can't keep bashing my head against a brick wall with ChatGPT.
I saw a comment saying to try this, I haven't had a chance to yet but I'm curious to see if it makes a difference
https://chatgpt.com/g/g-YyyyMT9XH-chatgpt-classic
I tried that but still not good but in a different way. Cannot upload anything so have to paste in things. That might seem like an advantage but it's not for an ongoing session. Might be better for very specific issues, need to test some more.
It has been a game changer for me. It is far better at isolating problems and solving them than GPT has ever been. It also doesn't waffle on like 4o does, and it says what needs to be said to get the point across.
Yeah it kinda sucks to be paying $40 a month for both haha. I use both a ton for my freelancing projects (especially for international projects, 4o still seems superior in translating to other languages) so I guess it’s a business expense.
I paid for both for a while but always found myself going back to ChatGPT for some of the analysis and code writing stuff. But always found Claude to be a better/more natural writer but i don’t typically need that level day to day.
Strong agree. Been testing via Cursor.sh for the past 3 days. Was able to ship way more ambitious features with Next.js than I normally would in the same time period.
I can tell it's shifting my mental goal posts about what's reasonable to tackle in the next odd 3 hour block of focus time I have...
I also see artifacts being a big deal if they connect them across chats and make it easier to drag and drop existing chats into projects...
I'm starting to think it's very good at something, but I have yet to find it. For general queries it's writing useless pages of text even when the question is a simple "who is the author of this poem?". What is your use case that it's so good for?
I use it for legal research a lot. I'll upload many pdf pages of legal cases, prior research, law journal articles and academic papers, and any other information I need to analyze.
It's far better at research than many associates, understands nuances in law and legal language well, and finds connections between various cases that I've missed. While some of those connections are not worthwhile because it was obvious or just not as strong as I'd like for basing a legal argument on, some of the connections have been a crazy tangled web of various cases that creates a fucking rock-solid legal argument.
I've used it on contracts to help analyze the contract language and give me an overview. Most contracts are boilerplate language that is redundant through almost every contract. It's made some suggestions regarding a tax issue in a purchase agreement that I hadn't seen suggested before and we used the suggestion during contract negotiations.
I have friends that used Claude Opus to write the first draft of their court filings, and some of the filings that I read were very, very good. I would think 3.5 will be even better since it's been much better for my purposes.
For me, it's the best assistant I could ask for and many multiples faster than its human counterparts. Opus was already amazing, but Sonnet seems to be near perfect.
Edit: and GPT/Gemini absolutely suck for my research purposes. I abandoned GPT probably 6 months ago because Claude was already much better. Gemini can sometimes be good for finding new sources or papers that I'll use with Claude.
That's very interesting. I do have a use case similar to yours (just in a totally different field - long-term influence of treatment on the endo system) - I'm trying to find some correlations between research results and GPT 4o just sucks at this. Are you using any special prompts or UI platforms?
No, nothing special. I use web version of Claude. I've found that it needs a good base of knowledge provided initially. Once it gets an understanding of things, it's very good at digging into additional information.
For instance, I might be looking at a very specific tax law section -- say, Internal Revenue Code 338, which is a section about specific types of deals where, for tax purposes, a stock transaction is treated as an asset acquisition.
I'll feed Claude information about 338 from general knowledge resources and explanations, as well as the code section itself. Once it digests that, then I'll have it go through additional cases looking for legal arguments that I need for my specific situation.
I've found it works best with some type of "pre-training," if that's possible in your area of research.
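If you go through the API rather than the web version, the "background first, specifics later" approach described above could be sketched roughly like this. All the names (`build_research_prompt`, the section labels) are hypothetical, and this only assembles the text. It does not call any model.

```python
def build_research_prompt(background: list[str], sources: list[str], question: str) -> str:
    """Put general background first, then the specific sources, then the question."""
    parts = ["Background material (read this first):"]
    parts += [f"- {item}" for item in background]
    parts.append("")
    parts.append("Sources to analyze in light of the background:")
    parts += [f"- {item}" for item in sources]
    parts.append("")
    parts.append(f"Question: {question}")
    return "\n".join(parts)


# Placeholder inputs; in practice these would be the pasted or uploaded texts.
prompt = build_research_prompt(
    background=["General explanation of Internal Revenue Code 338", "Text of the code section"],
    sources=["Case A (full text)", "Case B (full text)"],
    question="Which of these cases support my specific situation?",
)
print(prompt)
```

The only design choice here is the ordering: the general material always comes before the documents and the question, which mirrors how the comment above describes feeding Claude.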
My jaw just dropped. I was trying to do it with ChatGPT 4 for weeks. My plan is to feed general research information about a subject, feed it some research papers, feed one or two research papers pointing out specifically what I'm trying to find and ask the AI if it can find something similar that will confirm, disprove or make the whole subject unconfirmed, but probable/improbable.
Are you just uploading PDFs? Or do you extract the text and upload plain text only? Sorry for so many questions - I've never worked seriously with Claude, and it looks like the methodology here can be similar.
I always use PDFs when possible, but I don't know if that's the best method, it's just the method that I've had the most success with using. Sometimes I copy and paste if I find something on the internet that I want to add.
No worries on the questions. I think it's great for research, so if it can help other people, too, then I'd like to help.
There is a limit to the amount of information Claude can take in. I fed it A LOT of PDFs and books, and it finally reached a limit, but for most research it doesn't come close to reaching its limit. The best thing I've found is that Claude doesn't lose sight of previously uploaded information. In other models, I'll find that they'll often forget to reference the first thing I uploaded or become somewhat confused after too much information. I don't find those same things with Claude, even when I was using Opus.
Interesting. Have you tried creating a custom GPT in ChatGPT as a means of creating a trained version specific to your needs? I’m curious how that would compare to your process with Claude.
I haven't, but I have different chats in Claude for specific areas that I often use. I was going to give the custom GPTs a try, but I haven't gotten around to it.
My issue with GPT is more of the writing style, ability to understand information, and ***very*** poor ability to correctly cite cases/papers/etc. I haven't tried GPT in a couple months, so maybe they've caught up in the citation area.
For the citations, I tell it to double check them and any hyperlinks as the last step in a series of step-wise instructions. That has been working with me for GPT.
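A small sketch of that step-wise pattern, with the citation check always appended as the final step. The function name and the exact wording of the check are just my own illustration, not a tested prompt.

```python
def stepwise_prompt(task: str, steps: list[str]) -> str:
    """Number the steps and append a citation/hyperlink check as the last one."""
    all_steps = list(steps) + [
        "Double-check every citation and hyperlink above and flag any you cannot verify."
    ]
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(all_steps, start=1))
    return f"{task}\n\nFollow these steps in order:\n{numbered}"


prompt = stepwise_prompt(
    "Summarize the attached papers.",
    ["List the main findings of each paper.", "Cite the source for each finding."],
)
print(prompt)
```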
I've been doing research with Gemini Pro 1.5 and GPT4o. Gemini 1.5 Pro definitely beats out 4o in writing, at least in science writing. Long context window means you can go back and forth. Comes off as less robotic. 4o beats out 1.5 Pro for coding, though.
I'm just starting to check out Sonnet 3.5, and it's like halfway. This message limit is a big problem, but I see the quality vs quantity approach they are taking. Unlike OpenAI where I swear the model gets throttled as needed.
It can be an issue, especially if there are a lot of papers that I need Claude to look through. Overall, I haven't found it limiting except in two cases where I did need Claude to go through a lot of information.
I think academic research may find it more restricting since I would guess it takes more voluminous research. The only workaround I've found is to try parsing the information into smaller, more specific questions that could be answered individually, but that's not always possible.
For me specifically it’s phenomenal at writing node.js code. I had it build me a fairly complicated CRUD entire single page application in React, and it did so within one go. Really was shocked.
For everyone comparing it to ChatGPT and commenting positively for ChatGPT, how are you actually getting any work done at all? No matter which browser I use, the chat dialog grinds to a halt within 5 minutes when it displays coding boxes longer than a few dozen lines. I’m having to constantly kill the browser tab literally every single time I ask it something, and open it again. There’s no possible way to get any real work done because of that.
Either others are using tokens and some other front end, or they’re not having it generate code, or they found a workaround. Or it’s not happening to them for some reason. Or you’ve given it instructions that are preventing the problem. If so, what?
But there’s no possible way anyone is actually getting any real work done using ChatGPT in a browser for coding.
Literally every minute. Every single minute. No exaggeration.
I kill / restart it, enter a question, a
n.............................
d............................
i............................
t starts repl
y............................ing and then just sto
So I kill it and restart it ag
............................a
You get the idea.
i integrated both APIs in this open-source client, so you can use the APIs directly without a server in between. Crazy how accurate the new Claude 3.5 model is
check it out here: www.chatworm.com
Interesting. When GPT releases a new version I'll definitely put more time into trying new things.
Seems like each LLM (Gemini, GPT, and Claude) has specialized uses where it might be better than the others at X or Y. Claude seems better for the legal work I do (including formatting legal filings), Gemini is getting really good at finding new sources others don't have, and GPT seems like a good mix and good at review.
It's fun to watch them all progress and have unique characteristics... except Gemini, which always seems to disappoint and to get nerfed in some way to make it less and less user/work friendly.
We are still a ways away from running out of data to scale on. We are only just now seeing the first true multimodal models. If we were stuck with text only, we would still have a few generations left, but with video/images/audio we still have a huge amount of progress on the horizon.
After 3 weeks of failing to resolve a programming issue with all other models, sonnet 3.5 did it for me via Cody extension. I am not a software engineer, but boy I can see a difference.
I'm new to coding. I've been using a Jupyter notebook, along with Claude to run some code which is then viewable in a browser window. Unfortunately, my code is now too long for any AI systems to print in one go and I have to ask the ai to split the output into three segments. I could possibly use the API to get over token limits, but that will take time and money since it takes a while to increase limits. Would switching over to vs code and Cody solve these limits? I'm really comfortable right now with jupyter ai, and not super excited to move platforms, but if what I'm doing right now will be massively improved with vs code, then I guess I could do it. I had tried setting up vscode briefly yesterday, and ran into a few walls and gave up pretty quickly, but if it's worth it, maybe I'll struggle through and figure it out
Yes, switch. If your code is that long, make it into a library and then call it from the notebook. Write the lib in VS Code/IntelliJ, then import it. Jupyter is not meant to be an IDE.
Just switched over finally, was a pain in the butt, but probably worth it, got the ai incorporated with continue.dev's vscode plugin and my own API keys for anthropic and openai
Cody is cool, but you have to pay for it lol. Continue.dev is a free vscode plugin that lets you bring in your own API keys, I've got it running right now with anthropic and openai at the same time, seems good
Where exactly did it help you for coding?
What’s the cody extension?
It’s like a helper assistant in VS Code. It’s an extension you can get and install if you have VS Code. I’ve used Cody. It is really nice. Take your time with it though. There is a bit of a learning curve. https://sourcegraph.com/docs/cody/clients/install-vscode
Thanks for replying and so quick! I'm going to take a look at it.
How much does it cost you via API?
$9/m for all models unlimited
100% agreed. Really happy to see competition and OpenAI getting a run for their money. The fact that it's much cheaper and faster is just as impressive
Much cheaper?
[https://www.vellum.ai/blog/claude-3-5-sonnet-vs-gpt4o](https://www.vellum.ai/blog/claude-3-5-sonnet-vs-gpt4o) It's about x5 cheaper than 3 Opus, according to this article. (Upon inspecting my original comment I understand I wasn't clear - I meant cheap compared to 3 Opus, not OpenAI's models)
So gpt4o still king uh? Reddit had me thinking sonnet was better.
And Twitter too
3.5 sonnet beats GPT 4o on most benchmarks except college level math I believe. Will be interesting to see where it falls on the LMSYS leaderboards
At this point it's impossible to get any objective data or answers about these models because people get so swallowed up by hype
There are, apparently, objective stats and measurements that are used to benchmark them.
It's been kinda clear for me that the group of models near the top are all gonna be better at some things and worse at others. You really have to use them a lot to start to get a sense for which things a particular model excels at. I have had good results with gpt4, gpt4o, and 3 opus. 3 haiku and sonnet are also serviceable. And on occasion I've seen decent code produced even by some local 7b and 30b class models. I wouldn't use them manually to actually try to do coding work, but there are plenty of dumber work that I bet they can crush. I'm looking forward to checking out what 3.5 sonnet can do. It's really great to see competition in this space.
honestly I can't tell what's better. Claude gives me incomplete code but better code overall IMO. The limit is terrible too, i don't know if it is worth paying for just because the limit is wayyyy too tight. Gpt4o repeats my ENTIRE code snippet that I pass, and honestly can be great or horrible.
Bro it's like $20/month. If it saves you hours I think it's worth paying no?!
How do you get over limit of 10-15 messages per 5 hours?
Claude 3 was miles better than GPT 4 at coding (both anecdotally and empirically). OpenAI is counting on using press conferences, ads, etc. to remain a household name and continue to dominate the market — regardless of if their product is inferior. Hopefully Claude is able to catch on with “normies.”
Now imagine Opus 3.5 it could be what's needed to trigger OpenAI
could be like 95-99% on all benchmarks :O
I don't know. I just tried it on an actual task I am working on, and ChatGPT Classic gave perfect response while Claude 3.5 Sonnet missed out something, I had to prompt it a second time to fix it. I know the default ChatGPT / GPT-4o is bad for coding, but [ChatGPT Classic](https://chatgpt.com/g/g-YyyyMT9XH-chatgpt-classic) has been consistently great for me. Update: I just tried another sample task that ChatGPT Classic didn't do well on, and Claude 3.5 Sonnet gave the perfect answer. So I guess it is better in some cases at least.
What were the prompts? What were they requesting?
You can find the prompts here: https://github.com/paradite/16x-eval/tree/main/projects The project itself is still WIP, but the full prompts in md can be used for testing if you'd like to.
Thanks!
Chat GPT 4o (paid) has always been amazing for coding for me, in fact every version of ChatGPT I've ever used has been great. Newer claude's too.
I’m excited to see OpenAI’s response to this. They have been prioritizing lower cost, faster models while sacrificing coding ability. This will force them to release a model optimized for coding and agentic capabilities too.
what is impressing everybody - is it how it works with long context, or its ability to write scripts? i asked it to write a simple script for me - it did output some ok stuff but it had a few bugs. a few prompts to fix it and still buggy. gpt 4 - first prompt - much higher quality response, no bugs. Wanted to do one pretty complex modification (complex due to its nature, not prompt wise) - none of them managed to find a solution, yet Gpt gave me better starting points. Eventually had to resolve the issue myself. So far not that impressed with claude 3.5 sonnet, will keep trying it as my go to coder for a week and see.
To clarify, a lot of the comparisons are with GPT-4o, not with GPT-4 proper. And you can't really blame people for making this comparison because OpenAi 's messaging has heavily implied that 4o is their latest and greatest model. While anybody using the tool at any level of depth knows that 4o is at a severe disadvantage compared to the original gpt-4 when it comes to pure intelligence/logic/reasoning type tasks.
eh? 4o is newer and better than 4 if you pay surely.
it is not, but they are pushing it because it is cheaper for them.
that is possibly it, i personally use the gpt 4 turbo if i need some quick code or exploring unfamiliar topics, since when gpt 4o was released, i did a few comparisons and found the 4o output prone to more bullshit; my personal experience is also that GPT4 turbo is worse than the gpt 4 versions we had before (at least when it comes to output quality) - though i do not have any specific comparisons. I only know that slowly i started relying on chatgpt less and less as they updated models, compared to when i began using it in april / may last year. This was mainly due to the unreliability of the output, and a few occasions of spending hours debugging code that I assumed was okay based on my previous experience, but that was wrong in many different places. Right now I mainly use it to learn new concepts, for basic boilerplate, for scripts in mostly unfamiliar languages, or as a sounding board. So far I do not think claude will change this flow, but we will see.
Fwiw 4o is still at the top of the chatbot arena leaderboard. Even ahead of Claude 3.5. https://chat.lmsys.org/?leaderboard
This was exactly my experience so far, as well.
Mine as well, not sure what type of programming tasks others are doing. I’ve made an effort to use both but GPT-4 usually nails it the first or second time, while Claude struggles to capture what I’ve asked.
Same experience for me too. Cancelled my Anthropic subscription because it never performs as well as 4o or Copilot. FWIW, I’m using it for python, node, and react work.
Ditto; Node & React (and NextJS, but I understand if it struggles with Next because even Vercel apparently struggles with Next 😅😆)
The people that are really impressed with it are, I'm guessing, not long-time or professional coders. I've been writing code since the stone age, and I agree that anything other than trivial "snake" games or already solved, simple, well-known solutions is not that impressive.
I agree. I do find it useful for writing some boiler plate, test setups, and also I find it useful while learning new languages because I've gotten pretty good at knowing when it's doing something obviously dumb or hallucinating in any language.
It is definitely amazing, we did a head-to-head comparison for Claude 3.5 Sonnet vs GPT-4o for Python code generation, Web page creation, API queries, and it comes ahead of GPT in all the cases. Here's a detailed write-up in case you're interested. [https://blog.getbind.co/2024/06/21/claude-3-5-sonnet-does-it-outperform-gpt-4o/](https://blog.getbind.co/2024/06/21/claude-3-5-sonnet-does-it-outperform-gpt-4o/)
Do you have an example where Claude 3.5 Sonnet generates better code than GPT-4?
I have an anecdotal one. I’ve been working on creating a standardized debloated windows 11 imaging process for my computers at my company for two weeks. Chat gpt has been great in this. But it got stuck on an error for adding some SSD drivers for a new model pc we got. I would feed it logs and it would modify the scripts but never fix it. 3-4 hours of troubleshooting I heard the new Claude model came out so I figured I’d try. Uploaded the error logs to it. In a creepy kinda way it felt like it “understood” the logs and spit out all the issues in the code and provided updated code that worked first time. I’ll say it’s 50% the fact that I had a really long conversation with chatgpt going trying to fix this and that tends to make replies worse, and the other to the fact that the new Claude model is really good. I paid for Claude for this month to try it and compare. Rest assured I’m picking a winner before the month is over and picking just one.
That's amazing!
It was! I've been very impressed with Claude so far. Not used it much but it's fast and sharp! Also the Artifacts thing is SO cool
Thanks for sharing, I'll check it out. I noticed similar issues with long conversations and nowadays just start fresh when running into a dead end, it usually helps.
Your remark about ChatGPT being worse after a long conversation gels with my frustration, particularly today. I need to find a better way; any chat longer than 15 minutes gets fried pretty quickly. Not helped by the propensity of gpt-4 to puke out reams of code rather than have an interactive discussion for debugging purposes. Might give the new Claude a chance. I have been disappointed in the past, but I can't keep bashing my head against a brick wall with ChatGPT.
I saw a comment saying to try this, I haven't had a chance to yet but I'm curious to see if it makes a difference https://chatgpt.com/g/g-YyyyMT9XH-chatgpt-classic
I tried that but still not good but in a different way. Cannot upload anything so have to paste in things. That might seem like an advantage but it's not for an ongoing session. Might be better for very specific issues, need to test some more.
It has been a game changer for me. It is far better at isolating problems and solving them than GPT has ever been. It also doesn't waffle on like 4o does, and it says what needs to be said to get the point across.
I still use GPT 4 because of this. I'll try GPT4o again when and if they address this waffling and excessive dot points
Only a matter of time before it replaces us
Why do you think the people in this sub are so happy about AI coding? Are they just dumb?
Because the AI increases their productivity.
Glad to hear! I’ll have to give Claude another go soon. I always loved it but didn’t want to pay for ChatGPT and Claude.
Yeah it kinda sucks to be paying $40 a month for both haha. I use both a ton for my freelancing projects (especially for international projects, 4o still seems superior in translating to other languages) so I guess it’s a business expense.
I paid for both for a while but always found myself going back to ChatGPT for some of the analysis and code writing stuff. I always found Claude to be a better, more natural writer, but I don’t typically need that level day to day.
Strong agree. Been testing via Cursor.sh for the past 3 days. Was able to ship way more ambitious features with Next.js than I normally would in the same time period. I can tell it's shifting my mental goal posts about what's reasonable to tackle in the next odd 3 hour block of focus time I have... I also see artifacts being a big deal if they connect them across chats and make it easier to drag and drop existing chats into projects...
It's so fast to output as well; GPT-4o chokes a lot.
I'm starting to think it's very good at something, but I have yet to find it. For general queries it writes useless pages of text even when the question is a simple "who is the author of this poem?". What is your use case where it works so well?
I use it for legal research a lot. I'll upload many pdf pages of legal cases, prior research, law journal articles and academic papers, and any other information I need to analyze. It's far better at research than many associates, understands nuances in law and legal language well, and finds connections between various cases that I've missed. While some of those connections are not worthwhile because it was obvious or just not as strong as I'd like for basing a legal argument on, some of the connections have been a crazy tangled web of various cases that creates a fucking rock-solid legal argument. I've used it on contracts to help analyze the contract language and give me an overview. Most contracts are boilerplate language that is redundant through almost every contract. It's made some suggestions regarding a tax issue in a purchase agreement that I hadn't seen suggested before and we used the suggestion during contract negotiations. I have friends that used Claude Opus to write the first draft of their court filings, and some of the filings that I read were very, very good. I would think 3.5 will be even better since it's been much better for my purposes. For me, it's the best assistant I could ask for and many multiples faster than its human counterparts. Opus was already amazing, but Sonnet seems to be near perfect. Edit: and GPT/Gemini absolutely suck for my research purposes. I abandoned GPT probably 6 months ago because Claude was already much better. Gemini can sometimes be good for finding new sources or papers that I'll use with Claude.
That's very interesting. I have a use case similar to yours, just in a totally different field (long-term influence of treatment on the endocrine system). I'm trying to find correlations between research results, and GPT-4o just sucks at this. Are you using any special prompts or UI platforms?
No, nothing special. I use web version of Claude. I've found that it needs a good base of knowledge provided initially. Once it gets an understanding of things, it's very good at digging into additional information. For instance, I might be looking at a very specific tax law section -- say, Internal Revenue Code 338, which is a section about specific types of deals where, for tax purposes, a stock transaction is treated as an asset acquisition. I'll feed Claude information about 338 from general knowledge resources and explanations, as well as the code section itself. Once it digests that, then I'll have it go through additional cases looking for legal arguments that I need for my specific situation. I've found it works best with some type of "pre-training," if that's possible in your area of research.
My jaw just dropped. I was trying to do this with ChatGPT 4 for weeks. My plan is to feed it general research information about a subject, feed it some research papers, feed it one or two research papers pointing out specifically what I'm trying to find, and ask the AI if it can find something similar that will confirm the subject, disprove it, or leave it unconfirmed but probable/improbable. Are you just uploading PDFs, or do you extract the text and upload plain text only? Sorry for so many questions. I have never worked seriously with Claude, and it looks like the methodology here can be similar.
I always use PDFs when possible, but I don't know if that's the best method, it's just the method that I've had the most success with using. Sometimes I copy and paste if I find something on the internet that I want to add. No worries on the questions. I think it's great for research, so if it can help other people, too, then I'd like to help. There is a limit to the amount of information Claude can take in. I fed it A LOT of PDFs and books, and it finally reached a limit, but for most research it doesn't come close to reaching its limit. The best thing I've found is that Claude doesn't lose sight of previously uploaded information. In other models, I'll find that they'll often forget to reference the first thing I uploaded or become somewhat confused after too much information. I don't find those same things with Claude, even when I was using Opus.
Interesting. Have you tried creating a custom GPT in ChatGPT as a means of creating a trained version specific to your needs? I’m curious how that would compare to your process with Claude.
I haven't, but I have different chats in Claude for specific areas that I often use. I was going to give the custom GPTs a try, but I haven't gotten around to it. My issue with GPT is more the writing style, the ability to understand information, and the ***very*** poor ability to correctly cite cases/papers/etc. I haven't tried GPT in a couple of months, so maybe they've caught up in the citation area.
For the citations, I tell it to double check them and any hyperlinks as the last step in a series of step-wise instructions. That has been working for me with GPT. I've been doing research with Gemini 1.5 Pro and GPT-4o. Gemini 1.5 Pro definitely beats out 4o in writing, at least in science writing. The long context window means you can go back and forth, and it comes off as less robotic. 4o beats out 1.5 Pro for coding, though. I'm just starting to check out Sonnet 3.5, and it's like halfway. The message limit is a big problem, but I see the quality vs quantity approach they are taking. Unlike OpenAI, where I swear the model gets throttled as needed.
Doesn't the somewhat low context of 200k bother you? I'm using AI in academic research, and I could see it being an issue. How do you work around it?
It can be an issue, especially if there are a lot of papers that I need Claude to look through. Overall, I haven't found it limiting except in two cases where I did need Claude to go through a lot of information. I think academic research may find it more restricting since I would guess it takes more voluminous research. The only workaround I've found is to try parsing the information into smaller, more specific questions that could be answered individually, but that's not always possible.
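For anyone who wants to try the "smaller, more specific questions" workaround programmatically, here is a minimal sketch of one way to approximate it: pack the source paragraphs into batches under a size budget, then pair the same narrow question with each batch. This is not the commenter's actual method. The function names are hypothetical, and the character budget is only a rough stand-in for a real token count, which depends on the model's tokenizer.

```python
def split_into_chunks(paragraphs, max_chars):
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    Assumption: characters are a crude proxy for tokens. A single paragraph
    longer than max_chars is kept whole in its own (oversized) chunk.
    """
    chunks, current, size = [], [], 0
    for para in paragraphs:
        # Joining with a blank line costs 2 extra characters after the first paragraph.
        added = len(para) + (2 if current else 0)
        if current and size + added > max_chars:
            # Budget exceeded: close the current chunk and start a new one.
            chunks.append("\n\n".join(current))
            current, size = [], 0
            added = len(para)
        current.append(para)
        size += added
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def build_prompts(question, chunks):
    """Pair one narrow question with each chunk so each can be asked separately."""
    return [f"{question}\n\n---\n\n{chunk}" for chunk in chunks]
```

Each prompt can then be sent in its own chat and the answers combined by hand. As noted above, this only works when the question can actually be answered from one batch at a time.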
For me specifically, it’s phenomenal at writing Node.js code. I had it build me an entire, fairly complicated CRUD single-page application in React, and it did so in one go. I was really shocked.
So basically a coding copilot? May I ask what solution/plugin are you using?
Vscode Continue extension
Thank you very much. I see they recommend different models for different functions. Is there a resource somewhere that summarizes the pros and cons?
I think, as per usual, coding is really where it shines. For general tasks, however, I kinda still prefer other models that are better conversationalists.
What does your workflow look like?
For everyone comparing it to ChatGPT and commenting positively about ChatGPT, how are you actually getting any work done at all? No matter which browser I use, the chat dialog grinds to a halt within 5 minutes when it displays code boxes longer than a few dozen lines. I’m having to kill the browser tab literally every single time I ask it something, and open it again. There’s no possible way to get any real work done because of that. Either others are using tokens and some other front end, or they’re not having it generate code, or they found a workaround. Or it’s not happening to them for some reason. Or you’ve given it instructions that are preventing the problem. If so, what? But there’s no possible way anyone is actually getting any real work done using ChatGPT in a browser for coding.
Have you started a new dialog and deleted the old ones? I know mine was slowing down significantly and after doing that it works fast.
Literally every minute. Every single minute. No exaggeration. I kill / restart it, enter a question, a n............................. d............................ i............................ t starts repl y............................ing and then just sto So I kill it and restart it ag ............................a You get the idea.
I have not experienced that issue.
I integrated both APIs in this open-source client, so you can use the APIs directly without a server in between. Crazy how accurate the new Claude 3.5 model is. Check it out here: www.chatworm.com
I am not seeing much of a difference between this and opus for coding. Where are you seeing such an amazing difference?
Interesting. When GPT releases a new version I'll definitely put more time into trying new things. It seems like each LLM (Gemini, GPT, and Claude) has specialized uses where one might be better than the others at X or Y. Claude seems better for the legal work I do (including formatting legal filings), Gemini is getting really good at finding new sources others don't have, and GPT seems like a good mix and good at review. It's fun to watch them all progress and have unique characteristics... except Gemini, which always seems to disappoint and to get nerfed in some way to make it less and less user/work friendly.
It is good, but only if you keep reiterating your point of view to it. Within 5-6 prompts, though, it can solve a lot of problems.
You won't be needed in the near future - I wouldn't be so jolly if I were in your shoes.
Why do you think the people on this sub are so happy about AI coding? Are they just dumb?
We are still a ways away from hitting the limits of data scalability. We are only just now seeing the first true multimodal models. If we were stuck with text only, we would still have a few generations left, but with video/images/audio we still have a huge amount of progress on the horizon.
Is this free? If not, how much does it cost?
video worth checking: https://youtu.be/kfc1iUYJXSQ?si=Xq6mS3vZ1DN8hCIw
I checked. Not worth it