Deriving a formula for the amount of time you should study for multiple tests
Nice question.
First of all, I disagree with your 'conclusion' that the percentage of time allocated should be inversely proportional to the time until the test. My reason is pretty basic. Say you have two tests, one in 80 days and one in 81 days, worth the same amount. Then your 'conclusion' would mean that, on the first day, roughly the same amount of time should be spent studying for both tests, and after 79 days, twice as much time should be spent on the first test as on the second. However, the days $1,2,\dots,79$ are indistinguishable from each other: only the total time spent on each test matters, so you can just interchange what you do on days $1$ and $79$.
In any event, here is my model. Say there are $n$ tests, with test $j$ being a proportion/weight $w_j$ of your final grade. For ease, I will normalize so that you have one hour, or one unit of time, to study each day (you can easily change this to $8$). You're given days $d_1,\dots,d_n \in \mathbb{N}$, where test $j$ takes place on day $d_j$, so that days $0,\dots,d_j-1$ are available for studying for it. To have a more advanced model, I will also introduce learning parameters, $(\lambda_j)_{j=1}^n$, measuring how quickly you can learn the subject that test $j$ is testing on. The smaller $\lambda_j$, the more slowly you learn.
Finally, for a given $\lambda$, we'll have a function $f_\lambda : \mathbb{R}^{\ge 0} \to [0,1]$, where $f_\lambda(t)$ reflects how much you have learned after spending time $t$ studying, if your learning parameter (for the given material) is $\lambda$. The expected grade you get on the test after studying for time $t$ is $100f_\lambda(t)\%$, which I'll just call $f_\lambda(t)$ for ease.
What should $f_\lambda$ be? We assume $f_\lambda(0) = 0$; that is, you start off knowing nothing (it is easy to change this if you wish). The function $f_\lambda$ should of course be increasing and reflect the fact that learning is most rapid at the beginning and then slows down and asymptotically approaches a limit (of perfect knowledge). It makes sense to have $$\frac{\partial f_\lambda}{\partial t}(t) = \lambda (1-f_\lambda(t));$$ intuitively, starting at $t=0$, $f_\lambda$ starts at $0$, starts out increasing with rate $\lambda$, and then increases at a slower (and slower) rate as $f_\lambda$ increases. Solving the differential equation yields $$f_\lambda(t) = 1-e^{-\lambda t}.$$
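As a quick sanity check on the solution of the differential equation, here is a minimal Python sketch. The function names `learned_fraction` and `euler_solution` are hypothetical, and forward Euler is just one simple way to integrate the equation numerically; it is used here only to confirm that $1-e^{-\lambda t}$ satisfies $f' = \lambda(1-f)$ with $f(0)=0$.

```python
import math


def learned_fraction(lam: float, t: float) -> float:
    """Closed-form learning curve f_lambda(t) = 1 - exp(-lambda * t)."""
    return 1.0 - math.exp(-lam * t)


def euler_solution(lam: float, t_end: float, steps: int = 100_000) -> float:
    """Integrate f' = lam * (1 - f), f(0) = 0, up to t_end with forward Euler."""
    f = 0.0
    h = t_end / steps  # step size
    for _ in range(steps):
        f += h * lam * (1.0 - f)
    return f


if __name__ == "__main__":
    lam, t = 0.5, 3.0  # illustrative values, not from the original question
    print(learned_fraction(lam, t), euler_solution(lam, t))
```

The two printed values should agree to several decimal places, and the agreement improves as `steps` grows.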
With all these parameters (and given context), what is the appropriate utility function? It's pretty clear to me that it is the sum of the expected grades, weighted by the worth of the test: $$U = \sum_{j=1}^n \left(1-e^{-\lambda_j t_j}\right)w_j,$$ where we studied time $t_j$ for test $j$.
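To make the utility concrete, here is a small Python sketch that evaluates $U$ for given total study times. The name `utility` and the sample numbers are assumptions chosen purely for illustration.

```python
import math
from typing import Sequence


def utility(times: Sequence[float],
            lams: Sequence[float],
            weights: Sequence[float]) -> float:
    """U = sum_j w_j * (1 - exp(-lambda_j * t_j))."""
    return sum(w * (1.0 - math.exp(-lam * t))
               for t, lam, w in zip(times, lams, weights))


if __name__ == "__main__":
    # Two equally weighted tests with the same learning rate (made-up values).
    lams = [1 / 80, 1 / 80]
    weights = [0.5, 0.5]
    print(utility([40.0, 40.0], lams, weights))  # even split
    print(utility([80.0, 0.0], lams, weights))   # everything on test 1
```

With equal weights and equal learning rates, the even split gives the larger value, reflecting the diminishing returns built into $f_\lambda$.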
Therefore, we come to the following optimization problem. First, let us write $\vec{t}^{(d)} = (t^{(d)}_1,\dots,t^{(d)}_n)$ to denote the times spent studying for each test on day $d$.
$\textbf{Optimization Problem}$: Given $n \in \mathbb{N}$, learning parameters $\lambda_1,\dots,\lambda_n > 0$, weights $w_1,\dots,w_n \in [0,1]$, and days $d_1,\dots,d_n \in \mathbb{N}$, choose times $(\vec{t}^{(d)})_{d \ge 0}$ satisfying $t^{(d)}_j \ge 0$ for each $j$ and $\sum_{j=1}^n t^{(d)}_j = 1$ for each $d \ge 0$ in order to maximize $\sum_{j=1}^n \left(1-e^{-\lambda_j t_j}\right)w_j$, where $t_j := \sum_{d=0}^{d_j-1} t^{(d)}_j$ is the total time spent studying for test $j$ before it takes place.
One can solve this optimization problem for any inputs. We will do so for $n=2$. Let me first remark though that the main regime of interest is when, roughly, $\lambda_j \approx 1/d_j$ (the point is that $e^{-y}$ is close to $1$ when $y$ is very small and is close to $0$ when $y$ is large).
$\textbf{Solution for $2$ Tests}$ Say $d_2 = d_1+\Delta$, where $\Delta \ge 0$. Of course, on days $d_1,\dots,d_1+\Delta-1$ (if there are any), we will only study for test $j=2$. So write $t_2 = \Delta+\overline{t_2}$, where $\overline{t_2}$ is the time spent on test $2$ during days $0,\dots,d_1-1$. We wish to maximize $(1-e^{-\lambda_1 t_1})w_1+(1-e^{-\lambda_2 t_2})w_2$, which is of course equivalent to minimizing $e^{-\lambda_1 t_1}w_1+e^{-\lambda_2 t_2}w_2$ (since $w_1,w_2$ are fixed). We may write this as $$e^{-\lambda_1 t_1}w_1+e^{-\lambda_2\Delta}e^{-\lambda_2 \overline{t_2}}w_2.$$ We want $t_1+\overline{t_2} = d_1$ (we can then do $t^{(d)}_1 := t_1/d_1$ and $t^{(d)}_2 := \overline{t_2}/d_1$ for each $0 \le d \le d_1-1$). So we wish to minimize $$e^{-\lambda_1 (d_1-\overline{t_2})}w_1+e^{-\lambda_2\Delta}e^{-\lambda_2 \overline{t_2}}w_2.$$ Setting the derivative with respect to $\overline{t_2}$ to zero gives $$\lambda_1 w_1 e^{-\lambda_1 (d_1-\overline{t_2})} = \lambda_2 w_2 e^{-\lambda_2\Delta}e^{-\lambda_2 \overline{t_2}},$$ and taking logarithms yields $$\overline{t_2} = \frac{\ln\left(\frac{\lambda_2 w_2}{\lambda_1 w_1}\right)+(\lambda_1 d_1 - \lambda_2 \Delta)}{\lambda_1+\lambda_2}.$$ Since the objective is convex in $\overline{t_2}$, this critical point is the minimum provided it lies in $[0,d_1]$; if it does not, the minimum is at the nearer endpoint, $0$ or $d_1$. You can now obtain $t_1 = d_1-\overline{t_2}$ and thus how much you study on each day.
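The closed form for two tests can be sketched in Python as follows. The names `optimal_split` and `loss` are hypothetical, and the clamp of $\overline{t_2}$ to $[0,d_1]$ is an added safeguard (justified by convexity of the objective in $\overline{t_2}$) for inputs where the formula falls outside the feasible range. A coarse grid search is included only as a cross-check.

```python
import math


def loss(t2_bar: float, lam1: float, lam2: float,
         w1: float, w2: float, d1: int, delta: int) -> float:
    """Quantity to minimize: w1*exp(-lam1*t1) + w2*exp(-lam2*t2)."""
    t1 = d1 - t2_bar
    t2 = delta + t2_bar
    return w1 * math.exp(-lam1 * t1) + w2 * math.exp(-lam2 * t2)


def optimal_split(lam1: float, lam2: float, w1: float, w2: float,
                  d1: int, delta: int) -> tuple[float, float]:
    """Return total study times (t1, t2) for tests on days d1 and d1 + delta."""
    raw = (math.log((lam2 * w2) / (lam1 * w1))
           + lam1 * d1 - lam2 * delta) / (lam1 + lam2)
    # Assumption: clamp to the feasible range [0, d1] (valid by convexity).
    t2_bar = min(max(raw, 0.0), float(d1))
    return d1 - t2_bar, delta + t2_bar


if __name__ == "__main__":
    # The 80-day / 81-day example, with made-up learning rates 1/80.
    args = (1 / 80, 1 / 80, 0.5, 0.5, 80, 1)
    t1, t2 = optimal_split(*args)
    print(t1, t2)

    # Cross-check against a grid search over t2_bar in [0, d1].
    grid_best = min(loss(k * 80 / 8000, *args) for k in range(8001))
    print(loss(t2 - 1, *args), grid_best)
```

For the 80-day and 81-day example with equal weights and equal learning rates, this gives roughly $40.5$ hours in total for each test; per-day times follow by dividing $t_1$ and $\overline{t_2}$ by $d_1$, as described above.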