Keeping your MACID data safe

If you are using PHP, Ruby on Rails, or another popular web framework, your user data probably lives in a MySQL database. If you have outsourced your server hosting, you may have a database administrator who takes regular backups for you. That probably helps you sleep at night, assuming you can really trust your DBA to do the job.

As we learned in the previous lesson, if you are using Happstack with MACID, your data is right there on your filesystem, by default in the directory called _local.

~/happs-tutorial>ls _local/happs-tutorial_state/
current-0000000000 events-0000000000 events-0000000001 events-0000000002

If there is money on the line, you are going to want to be careful with this directory.

When migrating MACID data to a new schema, you are also going to want to be extra cautious.

For now, since you don't have any valuable data, the following procedure is probably enough to remind yourself to be careful while learning about Happstack in the tutorial sandbox.

  • Stop Happstack: press Ctrl-C if you are running the ./happs-tutorial app from a shell; if you are using runInGhci within ghci, press Ctrl-C and then exit ghci completely.
  • ~/happs-tutorial> mv _local _local.20081001-0917am.bak
  • Start the Happstack application again. All users, profiles, jobs, and sessions should be gone, and a new _local directory with nothing in it should have been created. A fresh start.
  • If you want your old data back, stop Happstack again, move the current _local directory somewhere safe (or just rm -rf it if you want to get rid of it), and rename the .bak directory back to _local.
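
The move in the second step can be scripted so that the backup name always carries the current time. What follows is only a sketch, assuming a POSIX shell and that the server has already been stopped. The helper name park_state is made up for this example; it prints the name of the backup directory it created, and you would call it as park_state _local from ~/happs-tutorial.

```shell
# Sketch: park the current MACID state under a timestamped name.
# park_state is a hypothetical helper, not part of Happstack.
# Run it only after the Happstack server has been stopped.
park_state() {
    state_dir="${1:-_local}"    # _local is the default state directory
    if [ ! -d "$state_dir" ]; then
        echo "no state directory at $state_dir" >&2
        return 1
    fi
    backup="$state_dir.$(date +%Y%m%d-%H%M%S).bak"
    if [ -e "$backup" ]; then
        echo "refusing to overwrite $backup" >&2
        return 1
    fi
    # Print the backup name so the caller knows what to rename back later.
    mv "$state_dir" "$backup" && echo "$backup"
}
```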

The above procedure raises some questions.

Q: Do you have to shut down the Happstack server every time you migrate data to a new schema?

A: No, but online migrations are a topic that will be covered in a future chapter.

Q: Is MACID safe? Could I wake up one day with corrupted data under _local and no way to recover from it?

A: Let's be realistic. Compared to, say, MySQL, Happstack hasn't been stress-tested much on critical high-volume web sites. On the other hand, stress testing is on the docket for the Happstack team, and I'll include the results in this tutorial once they are available.

That said, the Unix filesystem is pretty good at not losing your data, a point famously made by startup guru Paul Graham, who created Viaweb (now Yahoo! Store) with all the application state in flat files.

If you use Windows or Mac, you probably believe those filesystems are pretty reliable too.

Taking a closer look at what is under _local...

thartman@thartman-laptop:~/happs-tutorial/_local/happs-tutorial_state>ls -lth
total 12K
-rw-r--r-- 1 thartman thartman 0 Oct 1 13:55 events-0000000003
-rw-r--r-- 1 thartman thartman 0 Oct 1 11:55 events-0000000002
-rw-r--r-- 1 thartman thartman 792 Oct 1 11:04 events-0000000001
-rw-r--r-- 1 thartman thartman 491 Oct 1 11:00 events-0000000000
-rw-r--r-- 1 thartman thartman 25 Oct 1 10:59 current-0000000000
thartman@thartman-laptop:~/happs-tutorial/_local/happs-tutorial_state>

MACID serialization works by writing state change event data one file at a time. At server startup, Happstack "replays" all the information here in the order specified by the file names. This is similar to the transaction log used by many relational databases.

So, if I woke up one morning with my Happstack application in a corrupt, non-startable state and my inbox full of angry customer email, I would probably move files out of the serialization directory one at a time, most recently created first, and try restarting Happstack after each move.
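
That recovery loop could look something like the sketch below. It assumes a POSIX shell, a stopped server, and that the numbered file names follow creation order, as they do in the listing above. The helper name quarantine_newest_event and the quarantine directory are made up for this example. Files are moved rather than deleted, so nothing is lost if a guess turns out wrong.

```shell
# Sketch: move the newest events-* file out of the state directory.
# quarantine_newest_event is a hypothetical helper, not part of Happstack.
quarantine_newest_event() {
    state_dir="$1"
    quarantine_dir="$2"
    newest=""
    # The shell expands the glob in sorted order, and the names are
    # zero-padded, so the last match carries the highest sequence number.
    for f in "$state_dir"/events-*; do
        if [ -e "$f" ]; then
            newest="$f"
        fi
    done
    if [ -z "$newest" ]; then
        echo "no events files left in $state_dir" >&2
        return 1
    fi
    mkdir -p "$quarantine_dir" &&
        mv "$newest" "$quarantine_dir/" &&
        basename "$newest"
}
```

Each call prints the name of the file it moved. After a call such as quarantine_newest_event _local/happs-tutorial_state ~/macid-quarantine, try starting the application again, and repeat only if it still fails to start.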

Q: What if my hard drive dies and I can't get my data back?

A: As with any other data storage system, if there's valuable data, you need to be making backups. In the case of Happstack data stored under _local, I would probably rsync the _local directory to a remote server, or maybe to multiple remote servers for extra safety. For now I am not worried about securing data, but when that day comes I'm pretty confident I'll be OK.
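
A minimal sketch of such a backup, assuming rsync is installed and a POSIX shell. The helper name backup_state is made up for this example, and the remote host in the comment is a placeholder, not a real server.

```shell
# Sketch: mirror the MACID state directory to a backup destination.
# backup_state is a hypothetical helper, not part of Happstack.
# The destination may be a local path or a remote one, for example
# backup@example.com:happs-backups/_local (placeholder host and path).
backup_state() {
    src="${1:-_local}"
    dest="$2"
    if [ ! -d "$src" ] || [ -z "$dest" ]; then
        echo "usage: backup_state STATE_DIR DESTINATION" >&2
        return 1
    fi
    # -a recurses and preserves permissions and timestamps. The trailing
    # slash on the source copies the directory's contents into dest.
    rsync -a "$src/" "$dest/"
}
```

Calling it once per remote server gives the "multiple remote servers" variant. Nothing here uses --delete, so files that disappear locally stay in the backup.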

Let's now populate our web application with dummy data.