I have started noticing a small pattern the productivity blogs missed.
Someone on a team uses AI to write their OKRs for the quarter. The OKRs come back tighter than anything they would have produced themselves. Specific. Measurable. Time-bound. The template, executed cleanly.
Then they put the document in a drawer.
Not literally. The OKRs sit in Notion. They come up at the next quarter review. The person says, "Yeah, I haven't really moved on that one." The wording is fine. The follow-through is not.
I had been treating this as a discipline problem.
This week a preregistered experiment from a team of behavioral researchers said it is not.
A head of product at a mid-stage B2B SaaS company opened her laptop during a 1:1 and pulled up her Q1 personal OKRs. She had used ChatGPT to draft them three months earlier from a long voice memo she had recorded on a flight. The wording was excellent. Specific metrics, time bounds, a clean cascade from her team's objectives. I asked which one she had moved the furthest on.
She paused. Then she said, "Honestly? I haven't really touched any of them."
She was not avoiding the work. She had shipped a major release that quarter, hired two PMs, killed a flagging product line. The work happened. The OKRs just weren't the work she did. They sat in a Notion page, polished and unused, while she ran the quarter off a separate scrawled list in a notebook she'd started the week she got back from vacation. The notebook list was uglier. It was also the one she was living by.
Vivienne Chi and her co-authors gave 470 people a personal reflection exercise. Half were asked to write their own goals from that reflection. The other half had a large language model write the goals from the same reflection.
Then the researchers measured two things. First: were the goals well-formed? Did they pass the SMART test — specific, measurable, achievable, relevant, time-bound? Second: would the person actually act on them?
The first answer was clean. The AI wrote better goals. Not slightly better. More than two standard deviations better on the SMART scale: the authors report a Cohen's d of 2.26. That is the number you get when one group is essentially playing a different sport.
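For anyone who has not worked with effect sizes: Cohen's d is the gap between two group means divided by their pooled standard deviation. Here is a minimal sketch of that arithmetic. The ratings are invented for illustration and `cohens_d` is my own helper name; none of it is the study's data or code.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# HYPOTHETICAL SMART ratings, made up for this example (not the study's data).
ai_scores = [4.5, 4.8, 4.2, 4.9, 4.6]
human_scores = [3.1, 3.6, 2.9, 3.8, 3.3]

d = cohens_d(ai_scores, human_scores)
print(f"Cohen's d = {d:.2f}")
```

Read the result as "how many pooled standard deviations separate the two averages." On that reading, the reported 2.26 says the average AI-written goal scored 2.26 standard deviations above the average self-written one.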
If you stopped the experiment there, you would conclude that AI is a generous gift to anyone trying to set goals. A lot of vendors are selling exactly that story.
The researchers did not stop there. They asked the participants how the goals felt. Did they own them? Were they committed to them? Did the goals seem important?
The pattern reversed. People who wrote their own goals — sloppier, vaguer, less measurable goals — reported much higher psychological ownership. Higher commitment. Higher perceived importance. Effect sizes of 1.13 to 1.38, all in favor of the human-authored goal.
Then the researchers came back two weeks later.
Of the people who had written their own goals, 72.8% had acted on at least two of them. In the AI-authored group, the number was 46.6%.
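To put that gap in concrete terms, here is a back-of-the-envelope sketch. The two percentages are the ones reported above. The group sizes are an assumption on my part (an even 235/235 split of the 470 participants; the follow-up sample sizes are not given here), so treat the z statistic as illustrative, not as the authors' test.

```python
import math

# Reported follow-through rates at two weeks.
p_self = 0.728  # wrote their own goals
p_ai = 0.466    # goals written by the model

# ASSUMPTION: an even split of the 470 participants, with no attrition.
n_self = 235
n_ai = 235

gap = p_self - p_ai    # absolute difference, as a fraction
ratio = p_self / p_ai  # relative difference


def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two independent proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_err


z = two_proportion_z(p_self, n_self, p_ai, n_ai)

print(f"gap: {gap * 100:.1f} percentage points")
print(f"ratio: {ratio:.2f}x")
print(f"z (under the assumed group sizes): {z:.2f}")
```

The gap and the ratio depend only on the reported percentages. The z statistic moves with the assumed group sizes, which is why I flag it.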
The group with the cleaner goals was the group less likely to act on them.
The authors ran a mediation analysis to ask which variable was actually doing the work — whether quality or ownership predicted follow-through. Quality did not. Ownership did.
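If "mediation analysis" is unfamiliar, the idea can be sketched with simulated data. This is a simplified product-of-coefficients version with one mediator and a continuous outcome, which is my assumption for illustration; the study's actual model, variables, and estimates are not reproduced here, and every number below is invented.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance; cov(xs, xs) is the sample variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def slope(x, y):
    """OLS slope for y ~ x (with intercept)."""
    return cov(x, y) / cov(x, x)


def two_predictor_slopes(x1, x2, y):
    """OLS slopes for y ~ x1 + x2 (with intercept), via the normal equations."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# SIMULATED data with invented effect sizes (not the study's data).
rng = random.Random(0)
n = 2000
ai_written = [i % 2 for i in range(n)]  # 0 = self-written, 1 = AI-written
ownership = [5.0 - 1.5 * x + rng.gauss(0, 1) for x in ai_written]
# Follow-through is driven by ownership only, by construction.
follow_through = [0.8 * m + rng.gauss(0, 1) for m in ownership]

a = slope(ai_written, ownership)  # condition -> ownership
direct, b = two_predictor_slopes(ai_written, ownership, follow_through)
indirect = a * b                  # effect carried through ownership
total = slope(ai_written, follow_through)

print(f"a = {a:.2f}, b = {b:.2f}")
print(f"indirect = {indirect:.2f}, direct = {direct:.2f}, total = {total:.2f}")
```

In this toy setup nearly all of the total effect of authorship on follow-through runs through ownership, and the direct path is close to zero. That is the shape of result the authors describe: the mediator, not the artifact, carries the effect.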
Read that again, because it inverts what most teams assume about AI's role in goal-setting. The implicit theory in every productivity-tool pitch is that a better artifact produces a better action. Tighter goal, more progress. The data says the opposite. A goal you did not write is a goal that does not move you. The artifact and the action come apart.
There is a sub-finding that is worse than the main one. The participants who showed the steepest ownership erosion were the people who scored low on trait self-efficacy. Those are plausibly the participants most likely to ask AI for help in the first place, and they lost the most ownership when they accepted it.
Translate that. The people who most need to follow through on their goals are the people who get the smallest benefit from AI-assisted goal-setting. They get the cleaner SMART scoring and a smaller share of the actual doing.
If you're running a team that has quietly let AI into the goal-setting process, pull it back out this quarter. Run the next OKR cycle with no model in the drafting room — only in the editing room, and only after the person has committed in their own words to what they are going to do. The wording will be rougher. The follow-through will be better. That is the trade the data is asking you to make, and it is the trade most leadership teams are currently making in the wrong direction.
This finding does not say AI is bad at goal-setting. It says AI is too good at it in the wrong way. The model produces an artifact more polished than the human is invested in. The polish is the bug. Polish someone did not earn slides off them in two weeks.
Three things change for anyone running a team that uses AI in performance settings.
Stop having AI write the OKRs. The OKR document is not the point. The OKR is a forcing function for the person to decide what they are going to commit to. The minute the wording arrives pre-written, you have lost the function. The deliverable looks better and dies sooner.
Stop using AI to draft development plans inside 1:1s. The reason a manager spends time helping a report shape their own plan is not to produce a tidy document. It is to make the report own the plan. Outsource the writing and you have already outsourced the ownership.
Stop assuming a higher-quality artifact is a better artifact. Most management work is not about the artifact. It is about who decided. SMART criteria are a useful summary of what a goal should look like once a person has decided to do it. They are a terrible substitute for the deciding.
The broader frame is the one to keep. The AI productivity literature has spent two years measuring output quality — did the model help you write a better email, a better plan, a better draft? It has spent much less time measuring whether the person whose name is on the deliverable will actually do anything in response to it. Output quality and downstream behavior are not the same variable. We have been treating them as if they were.
AI is fluent at wording and bad at meaning. Fluency without ownership is a hollow object — a clean sentence sitting in a document nobody is going to live by.
Set the goal yourself, even if it comes out messy.
The messy version is the one that moves.