Five and a half years of diary, and only the shape is left.
五年半的日记,只剩下了形状。
I've kept a diary on and off since I was 15. This year I read the whole
thing through, and gave each entry a mood score from −1 (a really
hard day) to +1 (a really good one).
The charts below split at September 2023, when I started university; it was
the cleanest natural boundary I had between two phases of life. The
two halves are analyzed independently. Each half has a time series
with short and long moving averages and daily character counts, a small
set of headline numbers, a scatter plot against character count, and a
score distribution.
I've taken out everything else: names, quotes, the actual dates within
each month. What's left is only the shape.
十五岁开始断断续续写日记,今年回头读了一遍,
把每一篇当天的心情量化成了一个 −1(很糟糕的一天)到 +1(很开心的一天)之间的数字。
下面以 2023 年 9 月(我上大学的时间,也是我人生中最干净的两段分界)
为界拆成两段,独立分析。每段都包含一张时序图(带短/长两条均线和当日字数)、
几个简单统计、一张和字数的散点、一张分数分布。
所有人名、原话、月份内具体日期都被剥掉了,剩下的只是它的形状。
Each column is one calendar month; each row is a day of the month (1 at top, 31 at bottom). Brighter squares are better days, darker squares are harder days, and faint squares are days with no entry.
每列是一个自然月,每行是月份中的第几天(1 号在上,31 号在下)。越亮代表情绪分数越高,越暗代表越低,极浅的格子是没有记录的日子。
Each dot is one entry. The dark line is a 30-entry simple moving average; the faint red line is a 10-entry exponential moving average, which reacts faster. Bars at the bottom are how many characters I wrote that day.
每个点是一篇日记。深色线是 30 条窗口的简单移动平均(基线),浅红色线是 10 条窗口的指数移动平均(更敏感)。底部柱状是当天字数。
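A minimal sketch of how the two averages described in this caption could be computed over a list of entry scores. The function names, the partial window at the start of the series, and the smoothing factor `alpha = 2 / (span + 1)` are assumptions for illustration; the page does not say which conventions were actually used.

```python
def simple_moving_average(scores, window=30):
    """SMA over the last `window` entries (assumed: shorter window at the start)."""
    out = []
    running = 0.0
    for i, s in enumerate(scores):
        running += s
        if i >= window:
            # Drop the entry that just left the window.
            running -= scores[i - window]
        out.append(running / min(i + 1, window))
    return out


def exponential_moving_average(scores, span=10):
    """EMA with alpha = 2 / (span + 1) (an assumed, common convention)."""
    alpha = 2 / (span + 1)
    out = []
    ema = None
    for s in scores:
        # Seed with the first score, then blend each new score in.
        ema = s if ema is None else alpha * s + (1 - alpha) * ema
        out.append(ema)
    return out


if __name__ == "__main__":
    demo = [0.2, 0.5, -0.3, 0.1, 0.8]  # made-up scores, not real data
    print(simple_moving_average(demo, window=3))
    print(exponential_moving_average(demo, span=3))
```

Because the EMA weights recent entries more heavily, a shorter span follows a sudden change sooner than the 30-entry SMA, which is why the red line reacts faster.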
Each dot is one entry. Horizontal axis is characters written that day, vertical is mood score.
每个点是一篇日记。横轴字数,纵轴分数。
Same data, grouped by score into sixteen buckets, an eighth of a unit wide.
同样的数据,按分数分到 16 个 0.125 宽的区间里。
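The bucketing can be sketched as follows: the score range from −1 to +1 has width 2, so sixteen equal buckets are each 0.125 wide. The helper names and the choice to place a score of exactly +1 in the last bucket are assumptions, not details taken from the page.

```python
def bucket_index(score, n_buckets=16, lo=-1.0, hi=1.0):
    """Map a score in [lo, hi] to a bucket index in 0..n_buckets-1."""
    width = (hi - lo) / n_buckets  # 0.125 for the defaults
    idx = int((score - lo) // width)
    # Clamp so that score == hi lands in the last bucket (assumed convention).
    return min(max(idx, 0), n_buckets - 1)


def histogram(scores, n_buckets=16):
    """Count how many scores fall into each bucket."""
    counts = [0] * n_buckets
    for s in scores:
        counts[bucket_index(s, n_buckets)] += 1
    return counts


if __name__ == "__main__":
    demo = [-1.0, -0.2, 0.0, 0.05, 0.9, 1.0]  # made-up scores
    print(histogram(demo))
```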
Same chart, same conventions. Each dot is one entry; the dark line is a 30-entry SMA, the faint red one a 10-entry EMA; bars at the bottom are characters written that day.
同样的图,同样的画法。每个点是一篇日记,深色线是 30 条窗口的简单移动平均,浅红色是 10 条窗口的指数移动平均,底部柱状是当天字数。
Each dot is one entry. Horizontal axis is characters written that day, vertical is mood score.
每个点是一篇日记。横轴字数,纵轴分数。
Same data, grouped by score into sixteen buckets, an eighth of a unit wide.
同样的数据,按分数分到 16 个 0.125 宽的区间里。
After September 2023, the mean dropped by about 0.2, the share of days above neutral fell from 79% to 56%, and entries got noticeably longer. The data describes how a pen changed. It can't describe the person holding it.
A few small observations that don't require interpretation:
23 年 9 月之后,均值降了大约 0.2,正向天数比例从 79% 降到 56%,篇均字数显著增长。
这套数据描述的是一支笔的变化,描述不了拿笔的人。
几个不需要解读就成立的小观察:
Each entry was scored independently by three language-model agents, each reading only the raw text. The final number is a weighted blend of the three scores, which I then reviewed and corrected by hand, entry by entry. Before correction, overall agreement between the models was high (Pearson r ≈ 0.89, mean absolute difference ≈ 0.28).
每条由三个语言模型独立打分,都只看原文。最终分数是三者的加权混合,随后由作者本人逐条审核并做了全量的二次修正。修正前模型间的整体一致度 Pearson r ≈ 0.89,平均绝对差约 0.28。
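A rough sketch of the blend and the two agreement figures, under stated assumptions: the actual weights are not given here, so equal weights stand in as a placeholder, and "overall" agreement is taken to mean the average over the three rater pairs, which may not match how the numbers above were produced. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def blend(a, b, c, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted blend of three scores (placeholder weights, not the real ones)."""
    wa, wb, wc = weights
    return (wa * a + wb * b + wc * c) / (wa + wb + wc)


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    vx = sum((xi - mx) ** 2 for xi in x)
    vy = sum((yi - my) ** 2 for yi in y)
    return cov / sqrt(vx * vy)


def mean_abs_diff(x, y):
    """Average absolute gap between two raters' scores."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y)) / len(x)


def pairwise_agreement(raters):
    """Average r and mean |diff| over all rater pairs (assumed definition)."""
    pairs = list(combinations(raters, 2))
    r = sum(pearson_r(x, y) for x, y in pairs) / len(pairs)
    d = sum(mean_abs_diff(x, y) for x, y in pairs) / len(pairs)
    return r, d


if __name__ == "__main__":
    # Made-up scores for three raters over four entries.
    raters = [[0.1, 0.5, -0.4, 0.8], [0.2, 0.4, -0.6, 0.9], [0.0, 0.7, -0.3, 0.6]]
    print(pairwise_agreement(raters))
    print([blend(a, b, c) for a, b, c in zip(*raters)])
```

The two figures measure different things: Pearson r captures whether the raters rank days the same way, while the mean absolute difference captures how far apart their scores sit on the −1 to +1 scale, so a high r can coexist with a noticeable gap.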