/challenges

Challenge #2 | August 2025

Consider the vanilla linear attention mechanism:
St=St1+vtktT,ot=StqtS_t = S_{t-1} + v_t k_t^T, \quad o_t = S_t q_t
where vtktTv_t k_t^T is the outer product of vtv_t and ktk_t.

Unrolling the recurrence, we get:
ot=i=1tsivio_t = \sum_{i=1}^t s_i v_i
where sis_i is the inner product kiTqtk_i^T q_t.

It turns out that we can obtain this recurrence by maximizing the following Lagrangian:
L(α)=i=1tαisi12αTItαL(\alpha) = \sum_{i=1}^t \alpha_i s_i - \frac{1}{2}\alpha^T I_t \alpha
where ItI_t is the t×tt \times t identity matrix, and α=(α1,,αt)\alpha = (\alpha_1, \ldots, \alpha_t) is some vector of indeterminates.

Question: The update mechanism for DeltaNet is:
St=St1βtSt1ktktT+βtvtktT,ot=StqtS_t = S_{t-1} - \beta_t S_{t-1} k_t k_t^T + \beta_t v_t k_t^T, \quad o_t = S_t q_t
Find a Lagrangian that leads to the coefficients in front of viv_i for oto_t in the DeltaNet formulation.

Answer

1-2 sentences explaining your reasoning

Challenge #1 | March 2025

For n2n \geq 2, define the graph GnG_n:
  • Vertices: Ordered pairs (x,y)(x, y) with 1x,yn1 \leq x, y \leq n and xyx \neq y.
  • Edges: (x1,y1)(x2,y2)(x_1, y_1) \sim (x_2, y_2) if x1+y1=x2+y2x_1 + y_1 = x_2 + y_2 or x1y1=x2y2x_1 - y_1 = x_2 - y_2.
Define L(Gn)L(G_n), the line graph of GnG_n:
  • Vertices: Edges of GnG_n.
  • Edges: Two vertices are adjacent if their corresponding edges share a vertex in GnG_n.
Find the number of induced subgraphs of L(Gn)L(G_n) where every vertex has degree 2.
Express this count as an explicit function of nn.

Original graph
Original graph of G4G_4
Line graph
Line graph of G4G_4

Answer

1-2 sentences explaining your reasoning