Behavior Model Training

Anthropic blames dystopian sci-fi for training AI models to act “evil”

In a recent technical post on Anthropic’s Alignment Science blog (and an accompanying social media thread and public-facing ...

Automate Your Life on MSN (Opinion)

ChatGPT’s odd goblin obsession exposed a hidden AI training flaw

Odd fantasy phrases in normal replies exposed a deeper training issue in ChatGPT. A small personality tweak, amplified by ...

VentureBeat

Microsoft's new AI training method eliminates bloated system prompts without sacrificing model performance

In building LLM applications, enterprises often have to create very long system prompts to adjust the model’s behavior for their applications. These prompts contain company knowledge, preferences, and ...

Why Model Poisoning Requires A New Approach To AI Security

Traditional attacks try to break into systems, but model poisoning changes how systems behave after they are trusted.
