This is the formula of a cubic Bézier curve that we aim to prove:
$$P(t) = (1-t)^3P_0+3(1-t)^2tP_1+3(1-t)t^2P_2+t^3P_3$$
Linear interpolation
Linear interpolation (lerp) is defined as a linear parametrization of the line between two points.
Therefore, it can be described by two points $P_0, P_1$ and a parameter $t$.
The value of $t$ can be thought of as the fraction of the way from $P_0$ to $P_1$ that the point $P(t)$ has travelled: $t=0$ gives $P_0$ and $t=1$ gives $P_1$.
$$ P(t) = (1-t)P_0 + tP_1$$
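As a quick illustration, the lerp formula can be sketched in Python. The function name `lerp`, the tuple representation of points, and the sample coordinates are assumptions made for this example only.

```python
from typing import Tuple

Point = Tuple[float, ...]


def lerp(p0: Point, p1: Point, t: float) -> Point:
    """Return (1 - t) * p0 + t * p1, computed coordinate by coordinate."""
    return tuple((1 - t) * a + t * b for a, b in zip(p0, p1))


start, end = (0.0, 0.0), (4.0, 2.0)
print(lerp(start, end, 0.0))   # t = 0 gives the start point
print(lerp(start, end, 0.25))  # a quarter of the way from start to end
print(lerp(start, end, 1.0))   # t = 1 gives the end point
```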
De Casteljau's algorithm
De Casteljau's algorithm provides an elegant approach for constructing Bézier curves.
It does so recursively, as follows:
For every $t \in [0,1]$:
1. Start with $n$ points $P_0, P_1, \dots, P_{n-1}$.
2. Lerp along each line segment $\overline{P_0P_1}, \overline{P_1P_2}, \dots, \overline{P_{n-2}P_{n-1}}$ using the parameter $t$. This yields $n-1$ new points.
3. Go back to step 1 with these $n-1$ new points.
The recursion ends when there is exactly one point left. That point is $P(t)$.
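The steps above can be sketched in Python as a small recursive function. This is only a minimal sketch: the names `lerp` and `de_casteljau`, the tuple representation of points, and the sample control points are assumptions made for the example.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def lerp(p0: Point, p1: Point, t: float) -> Point:
    """Return (1 - t) * p0 + t * p1, computed coordinate by coordinate."""
    return tuple((1 - t) * a + t * b for a, b in zip(p0, p1))


def de_casteljau(points: Sequence[Point], t: float) -> Point:
    """Evaluate the Bézier curve with the given control points at parameter t."""
    if not points:
        raise ValueError("need at least one control point")
    if len(points) == 1:
        # Exactly one point left: this is P(t).
        return points[0]
    # Lerp along every segment between neighbouring points (n - 1 results),
    # then repeat the procedure on the new, shorter list.
    reduced = [lerp(a, b, t) for a, b in zip(points, points[1:])]
    return de_casteljau(reduced, t)


# Hypothetical control points of a cubic curve.
control = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
print(de_casteljau(control, 0.0))  # the first control point
print(de_casteljau(control, 0.5))  # a point in the middle of the curve
print(de_casteljau(control, 1.0))  # the last control point
```

Each call shortens the list by one point, so $n$ control points need $n-1$ rounds of lerps.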
Proof
Using De Casteljau's algorithm, we can deduce the formula of a cubic Bézier curve. We start with four points $P_0, P_1, P_2, P_3$.
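One way to carry out the steps is sketched here. The intermediate names $Q_i$ and $R_i$ are introduced only for this illustration. The first round of lerps turns the four points into three:
$$Q_0 = (1-t)P_0 + tP_1, \quad Q_1 = (1-t)P_1 + tP_2, \quad Q_2 = (1-t)P_2 + tP_3$$
The second round turns three points into two:
$$R_0 = (1-t)Q_0 + tQ_1 = (1-t)^2P_0 + 2(1-t)tP_1 + t^2P_2$$
$$R_1 = (1-t)Q_1 + tQ_2 = (1-t)^2P_1 + 2(1-t)tP_2 + t^2P_3$$
The last round leaves a single point:
$$P(t) = (1-t)R_0 + tR_1 = (1-t)^3P_0 + 3(1-t)^2tP_1 + 3(1-t)t^2P_2 + t^3P_3$$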
We aim to prove that for two cubic Bézier curves $P_0P_1P_2P_3$ and $P_3P_4P_5P_6$ to be $C^1$ continuous (i.e. differentiable) at the point $P_3$, they must satisfy
$$P_3=\frac{P_2+P_4}{2}$$
The formula we derived for a cubic Bézier curve is:
$$P(t) = (1-t)^3P_0+3(1-t)^2tP_1+3(1-t)t^2P_2+t^3P_3$$
Let’s label the points of the first Bézier curve $P_0, P_1, P_2, P_3$, and those of the second $P_3, P_4, P_5, P_6$. For the curves to be $C^1$ continuous, their velocities must be equal at $P_3$, where the end ($t=1$) of the first curve meets the start ($t=0$) of the second curve. Writing $P_1(t)$ and $P_2(t)$ for the first and second curve respectively (here the subscript names a curve, not a control point), we can write this condition as an equation:
$$P_1'(1)=P_2'(0)$$
We can plug in the formula for the velocity vector and evaluate it at $t=1$ for the first curve and at $t=0$ for the second.
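The evaluation can also be checked numerically. The sketch below assumes the velocity is the derivative of the cubic formula with respect to $t$, namely $P'(t) = 3(1-t)^2(P_1-P_0) + 6(1-t)t(P_2-P_1) + 3t^2(P_3-P_2)$ for control points $P_0, \dots, P_3$. The function name and all coordinates are made up for the example, and the second curve's $P_4$ is deliberately chosen as $2P_3 - P_2$.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def cubic_bezier_velocity(p: Sequence[Point], t: float) -> Point:
    """Derivative of the cubic Bézier formula with respect to t."""
    s = 1 - t
    # zip(*p) groups the four control points coordinate by coordinate.
    return tuple(
        3 * s * s * (b - a) + 6 * s * t * (c - b) + 3 * t * t * (d - c)
        for a, b, c, d in zip(*p)
    )


# Hypothetical control points: the curves share P3 = (4, 1),
# and P4 = 2 * P3 - P2 = (5, -1) mirrors P2 = (3, 3) about P3.
first = [(0.0, 0.0), (1.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
second = [(4.0, 1.0), (5.0, -1.0), (7.0, 0.0), (8.0, 2.0)]

end_velocity = cubic_bezier_velocity(first, 1.0)     # equals 3 * (P3 - P2)
start_velocity = cubic_bezier_velocity(second, 0.0)  # equals 3 * (P4 - P3)
print(end_velocity, start_velocity)
```

Moving $P_4$ elsewhere in `second` makes the two printed velocities differ, which is a quick way to see that the join is then no longer $C^1$.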