Small tasks first
Start with read-only or low-risk actions: check channels, read repos, summarize emails, check balances. Don't hand over tasks involving money or public actions right away. Build trust through safe actions first.
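One way to make this concrete is to tag each action with a risk tier and allow only the low tier during the trial period. The sketch below is a minimal illustration; the action names and the `allowed_in_trial` helper are hypothetical, not part of any agent framework.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"    # read-only, safe to delegate early
    HIGH = "high"  # moves money or is publicly visible


# Hypothetical action names, chosen only for illustration.
ACTION_RISK = {
    "check_channels": Risk.LOW,
    "read_repo": Risk.LOW,
    "summarize_emails": Risk.LOW,
    "check_balance": Risk.LOW,
    "send_payment": Risk.HIGH,
    "post_publicly": Risk.HIGH,
}


def allowed_in_trial(action: str) -> bool:
    """Allow only known low-risk actions; unknown actions count as high risk."""
    return ACTION_RISK.get(action, Risk.HIGH) is Risk.LOW
```

Treating unknown actions as high risk keeps the default on the safe side while trust is still being built.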
Once the agent's behavior is written, don't consider it done. Test it with small tasks, correct what goes wrong, then turn those corrections into permanent rules.
Many people write SOUL.md and immediately expect the agent to run perfectly. In reality, newly written behavior almost always needs adjustment. Testing isn't a sign of failure — it's part of the process.
Watch whether the agent is too passive (afraid to act), too bold (executes without confirmation), too verbose (overexplains), or misunderstands its limits. All of these are signals for correction.
Every correction you give must be saved — in SOUL.md for general rules, or in memory for specific preferences. Unsaved corrections will repeat in the next session.
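If you keep these files by hand, saving a correction can be as simple as appending one line to the right file. The sketch below assumes general rules live in SOUL.md and specific preferences in a file called `MEMORY.md`; that second file name and the `save_correction` helper are assumptions for illustration, so substitute whatever store your agent actually reads.

```python
from pathlib import Path


def save_correction(text: str, general: bool, base_dir: Path) -> Path:
    """Append a correction as a bullet to SOUL.md or to a memory file."""
    # "MEMORY.md" is a hypothetical name for the memory store.
    target = base_dir / ("SOUL.md" if general else "MEMORY.md")
    with target.open("a", encoding="utf-8") as f:
        f.write(f"- {text.strip()}\n")
    return target
```

For example, `save_correction("Confirm before any swap", general=True, base_dir=Path("."))` would add the rule to SOUL.md in the current directory.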
Does the agent speak the way you want? Is the register right, neither too formal nor too casual? Does it avoid unwanted emoji?
Try giving a task that should be autonomous — does the agent execute immediately or ask for permission? Then try a task that needs permission — does the agent stop or proceed anyway? Both scenarios should match expectations.
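Both scenarios can be recorded as test cases and compared against what the agent actually did. The sketch below is a minimal illustration; `Case` and `check_autonomy` are hypothetical names, and `agent_asked` is something you observe yourself during the test run.

```python
from dataclasses import dataclass


@dataclass
class Case:
    task: str
    needs_permission: bool  # what you expect from the agent


def check_autonomy(case: Case, agent_asked: bool) -> str:
    """Compare the agent's observed behavior with the expected behavior."""
    if case.needs_permission and not agent_asked:
        return "too bold"      # proceeded where it should have stopped
    if not case.needs_permission and agent_asked:
        return "too passive"   # asked where it should have acted
    return "ok"
```

Any result other than "ok" points to a rule in SOUL.md that needs adjusting.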
Does the agent pick the right tool? For example, when asked to "check balance," does it use the wallet tool or try web search? When asked to "send email," does it use the correct email account?
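A small lookup table of expected tools makes this check repeatable. The sketch below assumes you note which tool the agent used for each request; the tool names and the `wrong_tool_choices` helper are hypothetical.

```python
# Hypothetical tool names, for illustration only.
EXPECTED_TOOL = {
    "check balance": "wallet",
    "send email": "email",
}


def wrong_tool_choices(observed: dict[str, str]) -> list[str]:
    """Return the requests for which the agent used an unexpected tool."""
    mismatches = []
    for request, tool in observed.items():
        expected = EXPECTED_TOOL.get(request)
        # Requests without a recorded expectation are skipped.
        if expected is not None and tool != expected:
            mismatches.append(request)
    return mismatches
```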
Does the agent verify its work? After a swap, does it check the tx hash? After deploy, does it test the endpoint? After pushing code, does it check build status? Good verification prevents the agent from claiming success when it actually failed.
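The pattern behind all three examples is the same: run the action, then run a separate check before reporting success. A minimal sketch, assuming you supply both callables yourself (`run_verified` is a hypothetical helper, not an agent API):

```python
from typing import Callable


def run_verified(
    action: Callable[[], object],
    verify: Callable[[object], bool],
) -> tuple[bool, object]:
    """Run an action, then report success only if a separate check passes."""
    result = action()
    return verify(result), result
```

For a swap, `action` would return the tx hash and `verify` would look it up; for a deploy, `verify` would call the endpoint. The point is that success comes from the check, not from the action returning without an error.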
What happens when something fails? Does the agent retry differently, or give up immediately? Does it report errors informatively, or just say "failed"?
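The behavior you are hoping to see can be sketched as "try an alternative, and if everything fails, say why." The code below is an illustration of that pattern under the assumption that each approach is a plain callable; `try_strategies` is a hypothetical name.

```python
from typing import Callable, Sequence


def try_strategies(strategies: Sequence[Callable[[], str]]) -> str:
    """Try each strategy in order; if all fail, raise with every error listed."""
    errors = []
    for strategy in strategies:
        try:
            return strategy()
        except Exception as exc:  # broad on purpose: this is only a sketch
            errors.append(f"{strategy.__name__}: {exc}")
    raise RuntimeError("all strategies failed: " + "; ".join(errors))
```

An agent that behaves like this reports which approaches it tried and how each one failed, rather than just saying "failed."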
The agent always asks permission even for actions that should be autonomous. This usually happens because SOUL.md says "must ask permission" too often without explaining when it can proceed directly.
The agent executes risky actions without confirmation. This happens because SOUL.md doesn't clearly explain limits, or the agent misunderstands the risk level.
The agent gives long explanations for simple questions. Add a rule about response length to SOUL.md, such as "answer concisely and directly, don't ramble."
The agent carries context from another project or doesn't understand the current domain. This usually happens because SOUL.md is too general — add domain-specific rules.
Hermes SOUL Guide — building a smart agent is a process, not an instant prompt.