We have two main tasks: parse the data from the Wayback Machine HTML dumps, then insert it into the Taaalk database. Parsing will be easier than inserting the data because it doesn’t depend on the rest of the Taaalk application. I think we should therefore start by focussing on inserting.
Do the HTML dumps have all the data we will need? Yes, mostly. They have message contents, usernames, user bios and message creation times. As far as I can tell, they don’t contain user emails.
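To make the shape of the parsed data concrete, here is a minimal Python sketch of the records we could build from the dumps. The names `ParsedUser`, `ParsedMessage` and `build_message` are hypothetical, not part of the Taaalk codebase, and the sketch assumes the timestamp has already been pulled out of the HTML as an ISO 8601 string, which may not match what the dumps actually contain.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ParsedUser:
    """A user as recovered from a dump (hypothetical record, not a Taaalk model)."""
    username: str
    bio: str
    # The dumps do not appear to contain emails, so this defaults to None.
    email: Optional[str] = None


@dataclass
class ParsedMessage:
    """One message as recovered from a dump (hypothetical record)."""
    author: ParsedUser
    content: str
    created_at: datetime


def build_message(username: str, bio: str, content: str, created_at: str) -> ParsedMessage:
    """Assemble a message record from raw strings extracted from the HTML.

    Assumption: created_at is an ISO 8601 string such as "2020-05-01T12:30:00".
    """
    return ParsedMessage(
        author=ParsedUser(username=username, bio=bio),
        content=content.strip(),  # drop surrounding whitespace left over from the markup
        created_at=datetime.fromisoformat(created_at),
    )
```

Keeping `email` as an optional field makes the missing data explicit, so whatever inserts these records has to decide what to do about it rather than silently assuming an address exists.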
We need to think about how to handle migrating users, especially since we don’t have email addresses for them. Are we happy making these pages read-only and having the Taaalks finalised? If