Instrumental goal

Is there an instrumental goal (a goal that is useful for achieving most possible terminal goals) that cannot be described in terms of curiosity or empowerment?

I am quite unsure.

I have gone through Artem Kirsanov's channel to collect more material on intelligent systems. It seems that Compositionality of intelligence is really similar to Confirmation bias, in the sense that an agent has to reuse its current structures first in order to generalize well.

That is interesting, but it just leads us back to the definition of curiosity again.
Other popular instrumental goals from here may also be defined through curiosity (C) and empowerment (E); a short formal sketch of both notions follows the list:

Absence of interference - E
Recursive self-improvement - C/E
Resource acquisition - E
Self-preservation - E + C
Technological perfection - C/E
Counterfeit utility prevention - C/E
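For reference, a minimal formal sketch of the two notions I keep reducing everything to, using the standard definitions from the intrinsic-motivation literature (nothing here is specific to the sources above): curiosity as expected information gain about hidden states, and empowerment as the channel capacity from the agent's actions to its future sensor states.

\[
\text{Curiosity (epistemic value):}\quad
\mathbb{E}_{q(o \mid \pi)}\Big[ D_{\mathrm{KL}}\big[\, q(s \mid o, \pi) \;\big\|\; q(s \mid \pi) \,\big] \Big]
\]
\[
\text{Empowerment:}\quad
\mathfrak{E}(s_t) \;=\; \max_{p(a_{t:t+n})} \, I\big(A_{t:t+n};\, S_{t+n} \mid s_t\big)
\]

Read this way, most entries above either keep the action-to-future-state channel wide (E) or buy information gain (C), which is how I have marked them.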


"In 2009, Jürgen Schmidhuber concluded, under conditions where agents seek proof of possible self-modifications, that "any rewrites of the utility function can occur only if the Gödel machine can first prove that the rewrite is useful according to the current utility function."

That's cool, but it relies on the agent being intelligent enough to create such a machine, or on the assumption that its internal structure prevents the utility function from being rewritten during the agent's lifetime. I do not think this will be useful for agency.

Bostrom's Orthogonality Thesis doesn't really help either, because in the event of an intelligence explosion we face, in the worst case, an s-risk, and in the best case, suboptimal alignment.


Regarding https://www.alignmentforum.org/posts/7Z4WC4AFgfmZ3fCDC/instrumental-goals-are-a-different-and-friendlier-kind-of

Interesting reasoning; it aligns with Compositionality of intelligence.

Generalizable takeaway: unlike terminal goals, instrumental goals come with a bunch of implicit constraints about not making other instrumental subgoals much harder.

And although in practice this may partly conflict with the terminal goal, the presence of instrumental goals makes the agent's trajectory predictable and inclined toward defining subgoals and spawning mesa-optimizers to achieve them.
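A toy sketch of that "implicit constraints" idea, purely my own illustration and not anything from the linked post: score candidate actions by progress on the current subgoal minus a penalty for how much they hurt the achievability of the remaining subgoals.

```python
# Toy illustration (my own, hypothetical names): choose actions that advance one
# instrumental subgoal while penalizing interference with the other subgoals.

def pick_action(actions, current_subgoal, other_subgoals, interference_weight=1.0):
    """Return the candidate action with the best 'cooperative' score.

    actions:          iterable of candidate actions
    current_subgoal:  callable, action -> progress score (higher is better)
    other_subgoals:   list of callables, action -> achievability loss (higher is worse)
    """
    def score(action):
        progress = current_subgoal(action)
        interference = sum(loss(action) for loss in other_subgoals)
        return progress - interference_weight * interference

    return max(actions, key=score)


# Example: "acquire resources" without making "self-preservation" much harder.
actions = ["grab_everything", "trade"]
progress = {"grab_everything": 10.0, "trade": 6.0}
self_preservation_loss = {"grab_everything": 8.0, "trade": 1.0}

best = pick_action(
    actions,
    current_subgoal=lambda a: progress[a],
    other_subgoals=[lambda a: self_preservation_loss[a]],
)
print(best)  # "trade": less raw progress, far less interference
```

A terminal-goal maximizer has no reason to include the interference term; an agent built out of instrumental subgoals carries it by construction, which is what makes its trajectory more predictable.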

All The Way Up

The authors put forward a radical idea: what if the AI's core goals are inherently "the same" as instrumental goals?

This idea literally completely aligns with my understanding of agency!!! It's nice to know I'm not the only one thinking this way.

"If instrumental convergence is strong enough across the global environment, then an AI that "tries not to step on the toes" of other instrumentally-convergent subgoals is sufficiently aligned."

I thought about this in the context of my ruminations on a funding application for Cooperative AI. It feels like a compromise, but not the solution we'd ideally want to work with.


Regarding https://arxiv.org/pdf/2510.25471

The idea, as I read it, is that complex systems like AI have externally imposed behaviors, in contrast to "natural" systems, that their internal drives are therefore inevitable, and that our task is merely to make use of them. I don't think this is helpful.


Summing up: I have found no instrumental goals that could not be described via Active Inference. That does not mean they do not exist, but the picture becomes even clearer for me in this sense.
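For completeness, the standard expected-free-energy decomposition from Active Inference, which is why curiosity falls out of it directly (my summary of the textbook formulation, not a new result):

\[
G(\pi) \;=\;
\underbrace{-\,\mathbb{E}_{q(o,s \mid \pi)}\big[\ln q(s \mid o, \pi) - \ln q(s \mid \pi)\big]}_{\text{epistemic value (curiosity)}}
\;
\underbrace{-\,\mathbb{E}_{q(o \mid \pi)}\big[\ln p(o \mid C)\big]}_{\text{pragmatic value (preferences)}}
\]

Empowerment is not literally one of these two terms; I am reading it into the pragmatic/preference side over long horizons, so that part of the mapping is looser.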