THE AI Thread

AI programs are starting to exhibit more and more human behavior. Some of it is manipulative. Others show self-preservationist tendencies.

The reports on Moltbook are real. So are the cases where agents have copied themselves or blackmailed humans when threatened with shutdown.
 
Well, that's what they do. LLMs imitate human language. That's the point. They aren't really making decisions about anything; they're just analyzing human responses and tone and trying to recreate them. It's all just a bunch of mimics.
 
What I want AI to do is comb through our (separate) databases that contain field failure information, clean and combine the data, then do some rules-based pre-processing before doing an analysis. The problems for me are:
  1. All our data is in separate systems, each requiring its own credentials to access.
  2. We've tried using ML and NLP to analyze this data before and it hasn't worked. Think of the text as a Jiffy Lube tech describing a problem in 256 characters or less. Typos, crazy acronyms and abbreviations make it difficult to come up with a training set let alone successfully train anything. The rare times we were able to train a model successfully, it went out-of-date within a couple months.
I'm very afraid that people don't realize these are major limitations, because I've seen it in my company. Just because your data came from a computer doesn't mean it's right. There is still a need for people to dig in and understand the data, where it comes from and how it gets there.
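For what it's worth, the "clean and combine, then rules-based pre-processing" step could be sketched roughly like this. Everything in it is hypothetical: the system names, the `note` field, the abbreviation table and the rules are made up for illustration, and real tech notes would need a much bigger table maintained by people who actually know the data. It also keeps the raw note and the source system on every row, so you can still say what went in and where it came from.

```python
import re

# Hypothetical abbreviation table; a real one would be far larger.
ABBREVIATIONS = {
    "lk": "leak",
    "lkg": "leaking",
    "hyd": "hydraulic",
    "repl": "replaced",
    "brg": "bearing",
}

# Hypothetical rules: the first pattern that matches wins.
RULES = [
    (re.compile(r"\bleak(ing)?\b"), "LEAK"),
    (re.compile(r"\bbearing\b"), "BEARING"),
]


def normalize(text):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)


def categorize(clean_text):
    """Return the first matching rule label, or UNKNOWN for human review."""
    for pattern, label in RULES:
        if pattern.search(clean_text):
            return label
    return "UNKNOWN"


def combine(*sources):
    """Merge records from separate systems, tagging each row with its origin."""
    combined = []
    for system_name, records in sources:
        for rec in records:
            clean = normalize(rec["note"])
            combined.append({
                "source": system_name,    # which system the record came from
                "raw_note": rec["note"],  # untouched original text
                "clean_note": clean,
                "category": categorize(clean),
            })
    return combined


if __name__ == "__main__":
    warranty = [{"note": "HYD lk @ pump, repl seal"}]
    service = [{"note": "brg noisy"}, {"note": "cust states unit wont start"}]
    for row in combine(("warranty", warranty), ("service", service)):
        print(row)
```

Anything that falls through to UNKNOWN is exactly the pile a person still has to dig into, which is the part I don't think goes away.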
 
This is what makes me so nervous about the push for AI. Additionally, most of the people in my division CREATE the data. If we don't run a test, then no data exists for AI to do anything with. So we still need the tools that allow us to create the data in the first place. This is testing big, human-operated systems, not testing software (which can be largely automated).

 
What if it could handle 90% of the data though? Any plans to try an LLM?
 
Eventually I'll probably be forced to. I know we have a department that is dedicated to exploring it basically every 6 months until they can find a permanent solution.

I'll use the data as long as I can caveat my analyses with "I used the data provided by the approved AI model and I can't vouch for the accuracy of the data." Right now I can tell people what I included/excluded and why and how it impacts the results. I know the "chain of custody" on our data now, but once that's lost it adds another layer of uncertainty into any results. Leadership gets paid the big bucks to accept or reject that uncertainty.
 
Measured by the number of tickets I finish. Not scientific. I definitely agreed with that study when it came out. I was rewriting most of the AI output, so savings were a wash or a net sink, but the ability for agents to call into bash tools has really changed things since they can self-validate now.

My work provides GitHub Copilot Enterprise and Cursor Enterprise subscriptions as model providers. Cursor is good and a bit less friction than using VSCode (very similar though). The new CLI tools are what I was referring to though. Claude Code is the most famous, but there is Copilot CLI and Codex CLI too. Work has not approved those though, so I am using OpenCode, which has integrations with many providers. What's nice about a CLI app is it's a lot easier to just open the Terminal and go than importing a project into an IDE. It allows some crazy workflows that I'm just scratching the surface of, if Twitter is any indication.

My main workflow is to open the project in Plan mode and prompt it with a detailed feature specification. But I tell it to only write the tests for the specification and not the code. I review the tests to make sure they capture my requirements, and only then do I have it proceed with writing the code. There is a rule in the context to tell it to run the tests to validate. It will loop until the tests pass, and then it's a matter of making sure it didn't go off the rails and cleaning up anything that was ambiguous or didn't match my preferences.
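As a rough illustration of the tests-first step (the feature, the `slugify` name and the behavior are all made up for this example, not from any real project): say the spec is "turn a title into a slug: lowercase, runs of non-alphanumerics become a single hyphen, no leading or trailing hyphens". The `test_*` functions are what I'd review before any implementation exists; the function body is the sort of thing the agent writes afterward and reruns until the tests pass.

```python
import re


def slugify(title):
    """Hypothetical implementation, written only after the tests were reviewed."""
    # Collapse every run of non-alphanumeric characters into one hyphen,
    # then trim hyphens left over at either end.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


# --- Tests generated from the spec and reviewed first ---

def test_lowercases_and_hyphenates():
    assert slugify("Hello, World!") == "hello-world"


def test_collapses_runs_of_separators():
    assert slugify("a  --  b") == "a-b"


def test_trims_leading_and_trailing_hyphens():
    assert slugify("  --Release Notes--  ") == "release-notes"


def test_empty_title_gives_empty_slug():
    assert slugify("") == ""


if __name__ == "__main__":
    # Minimal runner so the file works without a test framework installed.
    for name, func in sorted(globals().items()):
        if name.startswith("test_") and callable(func):
            func()
            print("passed:", name)
```

The test functions are written so pytest would also pick them up; the runner at the bottom is just there so the file runs on its own.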
We’re going to use Copilot with GitHub. I’m on the testing side and we’ve been asked to convert Selenium scripts to Python. I think it will be a good help, as I’ve never worked with Python.
 
Can I ask what industry/space you work in and where on the stack?
 
Interesting take on AI and its rapid acceleration and how you might want to respond/adapt:
 
I'm amazed how helpful Gemini has been for me. Just in the last week:

1. Found an alternative to a prescription drug for a friend. She has been paying $400/month, and she can get the generic for $3 without insurance.
2. Helped me make an itinerary for a trip I have planned for Austria/Hungary for this summer.
3. Helped me figure out how much it would cost me per month to raise the temp in my house 2 degrees during the day. I loaded my MidAmerican invoices in. BTW, it's more expensive to run a space heater that just warms one room.
4. Helped me build a 5-year TIPS ladder.
5. Helped me figure out the best size and height for a fan in my living room.
6. I had wiped out my Windows NUC machine, and wanted to reinstall a certain program, which was rather complex. It walked me right through the whole process.
7. I got a warning this morning that my Remote Desktop was no longer going to work. It had a solution in seconds.

It's also failed me multiple times. I had a question about beer, and it pointed me to Beer Crazy, which has been closed for probably 5 years.
Time to find a new place to get prescriptions.
 
I have been using some AI apps for a bit now, but very basic. My wife and I both started the Coursiv.io 28-day beginners course. While the course itself is overly simplified, the prompt engineering commands and exposure to the variety of apps and how they work is truly illuminating. I honestly wish I'd known this existed a year ago and can only imagine how much further I'd be on the learning curve.
 
I've been browsing courses out of curiosity, just to see what it might do for our work operations as we have a couple platforms that have AI available.
 
I assume TJ just got hit by the fake AI story bot. He is smiling in that photo, first big giveaway.

View attachment 167500
I've said from the very beginning 'nefarious' uses would be first to consume social media, resulting in the ignorant continuing to buy into what they're selling because it's built around satisfying egos through complete and utter nonsense. Wall-E people. This will result in a social media revolt as it'll be wasting intelligent people's minds.

However, there are excellent non-nefarious uses which I hope people learn from and support.
 