If we improve the tone prompts using each user’s LinkedIn writing history, the cohort’s average tone match score will rise from 5.26 to 8 out of 10.
Tone baseline analyses run on 10 users on March 9 showed scores ranging from 2.1 to 6.9 out of 10, averaging 5.26/10. AI-generated content is measurably diverging from how these users actually write.
The current prompts lack sufficient personalization signal. Each user’s existing LinkedIn post history contains enough stylistic data to close the gap.
The test uses the same 10-user cohort. The improved tone prompt, trained on each user’s LinkedIn post history, runs live for 14 days. All 10 users are then re-scored using the same methodology as the March 9 baseline. A one-week check-in reviews directional movement before the final read.
Results are tracked in tone_delta_experiments, using the same scoring model and the same 10-user cohort as March 9.
| User | Baseline Score | Posts Analyzed | LinkedIn Posts (Total) |
|---|---|---|---|
| Kamil Rextin | 2.1/10 | 9 | 463 |
| Daryl Driedger | 3.9/10 | 11 | 69 |
| Alejandra Céspedes | 4.8/10 | 4 | 15 △ |
| Matt Dorman | 4.8/10 | 6 | 223 |
| Eric Voyer | 5.7/10 | 6 | 105 |
| Olanike A. Mensah | 5.9/10 | 10 | 0 △ |
| Meeky Hwang | 6.2/10 | 16 | 165 |
| Rameez Faheem | 6.5/10 | 6 | 12 △ |
| Bill Wilson | 6.8/10 | 5 | 710 |
| Renée Cormier | 6.9/10 | 10 | 422 |
| Cohort Average | 5.26/10 | — | — |
△ Flagged — thin or missing LinkedIn data. Scores for these users should be read with lower confidence.
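As a sanity check on the baseline table, the cohort average and the thin-data flags can be recomputed from the displayed rows. The sketch below is illustrative only. The row values are copied from the table, while the cutoff of 20 posts is an assumption: the table only shows that totals of 0, 12, and 15 are flagged and totals of 69 and above are not. Recomputed from the one-decimal scores as displayed, the mean comes out at 5.36 rather than the reported 5.26, so the reported figure may be worth confirming against the stored records.

```python
from statistics import mean

# Rows copied from the baseline table:
# (user, baseline score, posts analyzed, LinkedIn posts total)
BASELINE = [
    ("Kamil Rextin", 2.1, 9, 463),
    ("Daryl Driedger", 3.9, 11, 69),
    ("Alejandra Céspedes", 4.8, 4, 15),
    ("Matt Dorman", 4.8, 6, 223),
    ("Eric Voyer", 5.7, 6, 105),
    ("Olanike A. Mensah", 5.9, 10, 0),
    ("Meeky Hwang", 6.2, 16, 165),
    ("Rameez Faheem", 6.5, 6, 12),
    ("Bill Wilson", 6.8, 5, 710),
    ("Renée Cormier", 6.9, 10, 422),
]

# Assumption: hypothetical cutoff, not stated in the source.
THIN_DATA_THRESHOLD = 20


def cohort_average(rows):
    """Mean of the displayed baseline scores, rounded to two decimals."""
    return round(mean(score for _, score, _, _ in rows), 2)


def thin_data_users(rows, threshold=THIN_DATA_THRESHOLD):
    """Users whose total LinkedIn post count falls below the cutoff."""
    return [name for name, _, _, total in rows if total < threshold]


if __name__ == "__main__":
    print("Cohort average:", cohort_average(BASELINE))
    print("Thin or missing data:", thin_data_users(BASELINE))
```

With the assumed cutoff, the function returns the same three users that carry the △ flag in the table.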
| Outcome | Next step |
|---|---|
| Hit 8/10 | Roll improved prompts out to all content module users. |
| Miss 8/10 | Identify which users diverged furthest and determine whether the gap is a data problem (insufficient post history) or a model problem. |
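The decision rule above could be applied to the re-scored results roughly as follows. This is a minimal sketch under stated assumptions: the function name, the input shapes, and the 20-post cutoff used for a first guess at "data problem" versus "model problem" are all hypothetical, and the labels are only a starting point for the manual review the table describes.

```python
from statistics import mean

TARGET = 8.0  # target average tone match score from the hypothesis


def evaluate(rescored, post_counts, target=TARGET, thin_threshold=20):
    """Apply the hit/miss rule to re-scored results.

    rescored:    dict of user -> re-scored tone match score (0 to 10)
    post_counts: dict of user -> total LinkedIn posts in the system
    thin_threshold is an assumed cutoff, not a value from the source.
    """
    average = round(mean(rescored.values()), 2)
    if average >= target:
        return {"outcome": "hit", "average": average, "review": []}

    # On a miss, list users below target, furthest from target first.
    below = sorted((u for u, s in rescored.items() if s < target),
                   key=lambda u: rescored[u])
    review = []
    for user in below:
        thin = post_counts.get(user, 0) < thin_threshold
        label = "possible data problem" if thin else "possible model problem"
        review.append((user, round(target - rescored[user], 2), label))
    return {"outcome": "miss", "average": average, "review": review}


if __name__ == "__main__":
    # Placeholder values for illustration only, not experiment results.
    example_scores = {"user_a": 9.0, "user_b": 5.0, "user_c": 7.0}
    example_posts = {"user_a": 100, "user_b": 10, "user_c": 200}
    print(evaluate(example_scores, example_posts))
```

Sorting by score puts the furthest-diverged users at the top of the review list, which matches the order in which the miss branch asks for them to be examined.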
All 10 records in tone_delta_experiments still show generatedAt: 2026-03-09. No re-scoring has run. The agents collection shows zero configuration changes since March 9. The improved prompt exists as a stored field in the database but has not been applied to the live system. The experiment cannot start until the prompt is deployed.
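A check like the one described above can be expressed as a small script. The sketch below assumes each record is available as a dictionary with a generatedAt value that begins with an ISO date (YYYY-MM-DD); the user key and the in-memory record list are hypothetical stand-ins for however tone_delta_experiments is actually queried.

```python
from datetime import date

BASELINE_DATE = date(2026, 3, 9)  # generatedAt value reported for all 10 records


def rescoring_status(records, baseline=BASELINE_DATE):
    """Split records into those still at the baseline date and those re-scored later.

    Each record is assumed to look like {"user": ..., "generatedAt": "YYYY-MM-DD..."}.
    Only the first 10 characters of generatedAt are parsed, so a trailing
    time component is tolerated.
    """
    stale, rescored = [], []
    for record in records:
        generated = date.fromisoformat(record["generatedAt"][:10])
        (rescored if generated > baseline else stale).append(record["user"])
    return {"stale": stale, "rescored": rescored, "started": bool(rescored)}


if __name__ == "__main__":
    # Placeholder records for illustration only.
    sample = [
        {"user": "user_a", "generatedAt": "2026-03-09"},
        {"user": "user_b", "generatedAt": "2026-03-09"},
    ]
    print(rescoring_status(sample))
```

If every record lands in the stale list, no re-scoring has run and the experiment has not started.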
cowlickrocks@gmail.com (Daryl Driedger) shows zero content ideas and zero posts in the last 30 days. Every other cohort member has generated 5–21 ideas in that window. A user who is not generating content under the new prompt cannot be re-scored, reducing the effective cohort to 9.
Alejandra Céspedes has 15 LinkedIn posts in the system (4 analyzed) and Rameez Faheem has 12 (6 analyzed), far fewer than the 69 to 710 posts available for the unflagged users. Olanike A. Mensah is also flagged in the baseline table, with 0 LinkedIn posts in the system despite 10 posts analyzed. Re-scored results for these users will carry lower confidence than the rest of the cohort. This is not a blocker, but it should be noted on the final read.
Note → Flag these users explicitly when interpreting the March 30 results.