You ought to have the system detect slow progress or a low success rate during the first 5 minutes, decide "wait, this isn't working, try Plan B with much smaller chunks", and switch to drilling a small set of questions over and over until the recall rate is high. Slogging through a long sequence of fail, fail, fail, fail does not generate enthusiasm or a sense of progress.
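To make the Plan B idea concrete, here is a rough sketch of how it might look. Everything in it is my own guess, not your actual code: the function names (`pick_chunk_size`, `drill`), the `ask` callback, and all the thresholds and sizes are hypothetical placeholders you would tune.

```python
def pick_chunk_size(early_results, normal_size=20, small_size=4,
                    min_attempts=5, min_success_rate=0.5):
    """Decide how many questions to cycle through for the rest of the session.

    early_results: list of bools, one per answer given in the opening
    window (e.g. the first 5 minutes); True means the answer was correct.
    All default values are illustrative assumptions.
    """
    if len(early_results) < min_attempts:
        return normal_size  # not enough evidence yet, keep the normal plan
    success_rate = sum(early_results) / len(early_results)
    # Low early success: fall back to a much smaller chunk.
    return small_size if success_rate < min_success_rate else normal_size


def drill(chunk, ask, target_rate=0.8, max_rounds=10):
    """Repeat a small chunk until one full pass reaches target_rate.

    ask(question) is a hypothetical callback that presents the question
    and returns True if the user answered correctly.
    Returns the recall rate of the last pass.
    """
    if not chunk:
        return 1.0  # nothing to drill
    rate = 0.0
    for _ in range(max_rounds):
        correct = sum(1 for q in chunk if ask(q))
        rate = correct / len(chunk)
        if rate >= target_rate:
            break
    return rate
```

The point of `max_rounds` is just to keep a bad session from looping forever; whether to stop, shrink the chunk further, or end the session at that point is a design choice I am not assuming anything about.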
From your description, it sounds like when a user flubs a question and is shown the answer, you do not quickly re-test them on that same question later in the session to improve recall, but just move on to other questions instead.
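For what it's worth, quick in-session re-testing can be as simple as putting the missed question back into the queue a few positions ahead. A minimal sketch, assuming a plain in-memory queue; `run_session`, `ask`, and `retest_gap` are names I made up for illustration, not anything from your system:

```python
from collections import deque


def run_session(questions, ask, retest_gap=2, max_prompts=100):
    """Ask each question; re-queue a missed one a few positions later.

    ask(question) is a hypothetical callback that shows the question
    (and the answer on a miss) and returns True if the user was correct.
    Returns the order in which questions were actually asked.
    """
    queue = deque(questions)
    asked = []
    while queue and len(asked) < max_prompts:
        q = queue.popleft()
        asked.append(q)
        if not ask(q):
            # Put the missed question back retest_gap items ahead so it
            # comes up again soon, while the shown answer is still fresh.
            queue.insert(min(retest_gap, len(queue)), q)
    return asked
```

With `retest_gap=1` and a user who misses only "b" the first time, the questions a, b, c, d would be asked in the order a, b, c, b, d. The `max_prompts` cap is there only so a question that is never answered correctly cannot keep the session going indefinitely.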