现在我找到诀窍了,如何让codex写复杂逻辑,就是让他do math,先用形式化数学语言写文档,与人商量敲定各种条件之后,然后让它实现,可一次成型无虫代码。
否则的话,无穷无尽的debug循环,烧多少钱也没有用。
经过与codex几小时的讨论,最后成型的算法文档长这样:
Fast-Path Restart Math
This note captures the exact arithmetic needed to service leaf-prefix insert
and delete without rebuilding the whole page. The fast path only decodes the
local window around the edit, runs the prefix analyzer once, and then evaluates
one of four cases before applying a single in-place mutation.
Shared definitions
- interval = mdb_prefix_interval(db)
- threshold = mdb_prefix_threshold(db)
- RP = {r0 < r1 < … < r_{m-1}}current restart (anchor) indexes
- anchor_left(x)– largest restart ≤- x(binary search over- RP)
- run_len(x) = x - anchor_left(x)(- 0for a restart slot)
- encoded_len(key_len, shared) = sizeof(uint16_t) + key_len - shared
- node_size(enc_len, data_len, flags) = EVEN(NODESIZE + enc_len +
 (flags & F_BIGDATA ? sizeof(pgno_t) : data_len))
Only two page slots are touched during a fast-path edit: the slot being inserted
or removed (i) plus its successor (i for insert, i+1 for delete). All math
is computed before the page is mutated.
Minimal decode window
- Locate r = anchor_left(target)wheretargetis the slot being inspected
 (i-1for insert,ifor delete). We walkRPlinearly whiletarget
 remains within the first handful of slots (≈8); otherwise we fall back to
 the binary search over the restart list.
- Decode keys starting at slot runtil the predecessor (K_prev), the key
 of interest, and the successor (K_succ) are fully materialised.
Analyzer usage
- Insert (slot - i): run- mdb_leaf_prefix_analyze(db, run_prev, K_prev?, K_new)
 where- run_prev = i ? (i-1) - r : 0. The result yields:
 - is_restart_new
- shared_new(- 0if- is_restart_new)
 
- Delete (slot - i): run- mdb_leaf_prefix_analyze(db, run_prev, K_prev?, K_succ)
 with the same- run_prevdefinition. The result yields:
 - is_restart_succ_new
- shared_succ(- 0if- is_restart_succ_new)
 
The analyzer is only invoked once per edit (incoming key for insert,
successor for delete).
Insert math
Let i be the insertion index, K_prev the predecessor key (if any),
K_new the incoming full key, and K_succ the current key at slot i.
- New-key encoding - 代码: 全选 -    enc_new  = encoded_len(|K_new|, is_restart_new ? 0 : shared_new)
   data_new = F_BIGDATA ? sizeof(pgno_t) : data_len_new
   sz_new   = node_size(enc_new, data_new, flags_new)
 
- Successor decision (pure math) - 代码: 全选 -    run_succ_old = i - r
   shared_succ  = lcp(K_new, K_succ)   // 0 when there's no successor
   wants_anchor = is_restart_new ||
                  run_succ_old + 1 >= interval ||
                  shared_succ <= threshold
   is_restart_succ_new = wants_anchor ? 1 : 0
   enc_succ_new        = encoded_len(|K_succ|,
                                     is_restart_succ_new ? 0 : shared_succ)
   data_succ_old       = flags_succ & F_BIGDATA ? sizeof(pgno_t)
                                                : NODEDSZ_old(succ)
   sz_succ_old         = node_size(NODEKSZ_old(succ), data_succ_old, flags_succ)
   sz_succ_new         = node_size(enc_succ_new, data_succ_old, flags_succ)
 
- Space check - 代码: 全选 -    delta_payload = sz_new + sz_succ_new - sz_succ_old
   lower_after   = MP_LOWER
                 + sizeof(indx_t)                           // new pointer
                 + sizeof(indx_t) * (is_restart_new
                                     + is_restart_succ_new
                                     - was_restart_succ)
   upper_after   = MP_UPPER - delta_payload
   require upper_after ≥ lower_after
 
- Pointer adjustments (math only) - ptr_new = upper_after
- ptr_succ_new = ptr_new + sz_new
- Any pointer whose payload offset is below the old successor block shifts
 downward bydelta_payload
- Shift pointer indices ≥ iright by one; writeptr_newati
 
- Restart table math - Write the current restart array as - RP = (r_0, r_1, …, r_{m-1})with
 - r_j < r_{j+1}. Let
 - 代码: 全选 -    t      = #{ r_j | r_j < i }                    // anchors strictly left of i
   succ_o = 1 if any r_j = i, else 0              // old successor anchor at i
   tail   = m - t - succ_o                        // anchors strictly right of i
 
- The new restart indexes are: - For - 0 ≤ j < t:- r'_j = r_j.
 
- Initialize - write = t. If- is_restart_new = 1, store- r'_write = i
 and increment- write.
 
- If - is_restart_succ_new = 1, store- r'_write = i+1and increment- write.
 
- For - q = 0 … tail - 1:- r'_{write + q} = r_{t + succ_o + q} + 1.
 - When the old successor was a restart (- succ_o = 1) and the new decision keeps
 it anchored, the slot is re-emitted at- i+1; otherwise it is dropped.
 
 
Only after these values are known do we memmove the payload block and write the
two encoded nodes, followed by header, pointer table, and restart array updates.
Delete math
Let slot i be removed. K_prev is the predecessor (if any), K_succ is
slot i+1 (if any), and flags_i / flags_succ are the node flags.
- Deleted node payload - 代码: 全选 -    data_del = flags_i & F_BIGDATA ? sizeof(pgno_t) : NODEDSZ_old(i)
   sz_del   = node_size(NODEKSZ_old(i), data_del, flags_i)
   del_is_anchor = (i ∈ RP)
 
- Successor decision - If there is a successor: - Analyzer already provided - is_restart_succ_newand- shared_succ
 
- enc_succ_new = encoded_len(|K_succ|,
 is_restart_succ_new ? 0 : shared_succ)
 
- data_succ_old = flags_succ & F_BIGDATA ? sizeof(pgno_t) : NODEDSZ_old(succ)
 
- sz_succ_old = node_size(NODEKSZ_old(succ), data_succ_old, flags_succ)
 
- sz_succ_new = node_size(enc_succ_new, data_succ_old, flags_succ)
 - Without a successor the adjustments for that slot vanish. 
 
- Space check - 代码: 全选 -    delta_payload = sz_del + (sz_succ_new - sz_succ_old)   // 0 if no successor
   lower_after   = MP_LOWER
                 - sizeof(indx_t)                         // pointer removed
                 + sizeof(indx_t) * (is_restart_succ_new
                                     - was_restart_succ
                                     - (del_is_anchor ? 1 : 0))
   upper_after   = MP_UPPER + delta_payload
   require upper_after ≥ lower_after
 
- Pointer adjustments - Remove pointer entry i, shift later entries left
- Any pointer whose payload offset lay below the successor block shifts
 upward bydelta_payload
 
- Restart table math - Partition as before: - RP_< = { r ∈ RP | r < i }
 
- RP_> = { r ∈ RP | r > i }
 - Because - RPis sorted by slot index, describe it as
 - RP = (r_0, r_1, …, r_{m-1})with- r_j < r_{j+1}. Let
 - 代码: 全选 -    t      = #{ r_j | r_j < i }                      // anchors strictly left of i
   del    = 1 if any r_j = i, else 0                // deleted key was anchor
   succ_o = 1 if any r_j = i+1, else 0              // old successor anchor
   tail   = m - t - del                             // anchors strictly right of i
   skip   = succ_o                                  // entries to discard from tail
 
- After deletion all tail anchors shift left by one. The new restart array is
 still sorted and is given component-wise by:
 
- For - 0 ≤ j < t:- r'_j = r_j.
 
- If the successor must anchor (- is_restart_succ_new = 1):
 - 代码: 全选 -      r'_t = i
     for q = 0 … tail - skip - 1:
         r'_{t + 1 + q} = r_{t + del + skip + q} - 1
 
- If the successor must not anchor (- is_restart_succ_new = 0):
 - 代码: 全选 -      for q = 0 … tail - skip - 1:
         r'_{t + q} = r_{t + del + skip + q} - 1
 
- The - skipterm ensures that when the old successor was an anchor (- i+1),
 its shifted index- iis retained only if the new decision demands it.
 
 
With all numbers established, the mutation consists of a single memmove of the
payload block, writing the successor’s new encoding (if present), and updating
header fields, pointer table, and restart array.
Notes
- The restart array used here does not need to match the builder’s output
 byte-for-byte; it only has to be consistent with the encoded keys we store.
- We only invoke the prefix analyzer once per edit (incoming key on insert,
 successor on delete) and only decode the small segment from the controlling
 restart through the successor.
- All space and pointer arithmetic is resolved before touching the real page,
 so the page update is a single, deterministic write sequence.
Bailout conditions
Fallback to the full-page builder whenever any of these checks fail:
- Page not eligible: the leaf is not prefix-enabled (no MDB_PREFIX_COMPRESSION),
 is aLEAF2page, or is a sub-page. The fast path assumes the restart table
 is present.
- Decode/analyze failure: mdb_cursor_decode_leaf_keyor
 mdb_leaf_prefix_analyzereturns an error for any slot in the minimal window.
- Space exhaustion: the computed upper_afteris< lower_after, meaning
 the node payloads, pointer table, and restart array would overlap. This is the
 fast-path equivalent ofMDB_PAGE_FULL.
- Interval overrun beyond the successor: when inserting a non-anchor,
 if keeping the successor non-anchor would stretch the run past the next
 restart (i.e.anchor_right(i) - r >= intervalafter accounting for the new
 slot), we would need to re-encode more than one successor – bail out.
- Unsupported flag mix: any key in the window carries flags the fast path
 doesn’t handle (e.g. sub-databases that require special handling). Until that
 support exists, fall back.
- Reserve bookkeeping issue: if an insert with MDB_RESERVEor bigdata
 cannot produce the required payload pointer in-place (e.g. overflow page not
 available yet), defer to the builder.
These guardrails keep the optimized path constrained to “two-key” surgery. The
moment a wider transformation is required, the code should bail early and let
the general rebuild logic take over.
hci 写了: 2025年 10月 15日 17:43
用Codex太贵了,Vide coding一个星期就花了400美刀,还搞不定稍微比较复杂的问题,关键还慢得要死。
现在换成中国公司z.ai的开源模型GLM4.6,用在Claude Code里面,速度快太多了,爽。还便宜,一个季度随便用也就180美刀。
Codex完全没法比。
美帝AI完蛋了,泡泡绝对爆。