Sunday, July 25, 2010

Dealing With Dashcode, Part 3: Dashcode, WebDAV, and Trailing Slashes

(This is a continuation of a previous post.  You may want to read the first post in this series for an overview of the architecture.)

At this point, I had a WebDAV-accessible directory that I had configured Dashcode to deploy to.  The first deploy worked, but every deploy after that failed.  A dialog box would appear with the title, "An error occured while deploying the project to the server." and the text, "The folder “myapp” cannot be created on the server."  After that, the status at the top of the Run & Share page would show "Last deploy attempt failed: The folder “myapp” cannot be created on the server."

The problem is that Apple's WebDAVFS will send a WebDAV PROPFIND request for "serv/myapp" (which is a folder), get a 301 redirect to "serv/myapp/" (with a trailing slash this time), and then send a PROPFIND for "serv/myapp/", but using credentials which are only valid for "serv/myapp".  Apache doesn't like that, sends a 401 Unauthorized response, and WebDAVFS gives up and reports an error to Dashcode.  (I'm not sure whether this particular problem of retaining credentials across redirects is specific to WebDAVFS, or if it's actually in CFNetwork.)

The solution is to not have Apache send the 301 redirect.  Adding this to the Location section for my DAV URL does the trick:
<Location /myapp-devel/dav/>
# In addition to the DAV-related directives
# I talked about in Part 1, add:
DirectorySlash Off
</Location>
This is probably a good thing to do on all DAV directories, since redirection isn't exactly a great thing to do within a filesystem.

After this, I could deploy repeatedly from Dashcode with no problems.

This is as much as I needed to deploy my app.  Debugging it, though, was still an issue.  See the next post!

Dealing With Dashcode, Part 2: Apache, WebDAV, and umasks

(This is a continuation of a previous post.)

At this point, I had set up a directory that I could publish to using WebDAV and also edit as my regular user id.  Since WebDAV works as user www, my normal user id (piquan) wouldn't ordinarily have access to edit the files it creates.  So I needed to set up a directory using FreeBSD's ACLs to give write access to both me and www.
$ mkdir serv
$ setfacl -m u:piquan:rwx,u:www:rwx serv
$ setfacl -d -m u:piquan:rwx,u:www:rwx,u::rwx,g::rx,o::rx,mask::rwx serv
The problem is, when WebDAV created a directory, it created it with mode 755 (rwxr-xr-x).  That matters because when a program that's not aware of ACLs creates (or chmods) a file or directory, the ACL's mask bits get taken from the group access bits.  In other words, the permissions that the program gives for the group restrict the permissions of any u:… or g:… entries in the ACL.  The r-x mask that WebDAV created my directory with masked my u:piquan:rwx ACL entry to an effective u:piquan:r-x.  (Since www was the directory's owner, its u::rwx entry didn't get masked.)
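To make the masking concrete, here's a small Python sketch of the arithmetic: the effective permission of a named-user entry is the entry ANDed with the mask.  This only models the calculation; it doesn't read real ACLs, and the helper names (parse, fmt, effective) are made up for illustration.

```python
# Permission bits, in the same order as the "rwx" triplets in an ACL entry.
R, W, X = 4, 2, 1


def parse(perms):
    """Turn a three-character string such as 'r-x' into permission bits."""
    if len(perms) != 3:
        raise ValueError("expected three characters, like 'r-x'")
    bits = 0
    for ch, bit in zip(perms, (R, W, X)):
        if ch != "-":
            bits |= bit
    return bits


def fmt(bits):
    """Turn permission bits back into a string such as 'r-x'."""
    return "".join(ch if bits & bit else "-" for ch, bit in zip("rwx", (R, W, X)))


def effective(entry, mask):
    """Effective permissions of a named user/group entry under a mask."""
    return fmt(parse(entry) & parse(mask))


# u:piquan:rwx under the r-x mask that came from mode 755:
print(effective("rwx", "r-x"))
# The same entry once the mask is rwx:
print(effective("rwx", "rwx"))
```

The first call shows the situation I was in (write access masked away); the second is what I wanted.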

In other words, I could no longer add, rename, or delete files from the directories that WebDAV created.  I also couldn't edit the files, since they were created with mode 644.

After some investigation, I found that WebDAV creates files and directories based on the Apache process's umask.

On one hand, I could edit /usr/local/sbin/apachectl and /usr/local/etc/rc.d/apache22 to set the umask before launching Apache.  However, I didn't like that idea, since then any Apache upgrades would blow away my changes.  I wanted something a little more lasting.

Instead, I used – or rather abused – a mechanism in rc.d/apache22 that's designed for setting process limits.  It includes a bit that runs some shell code and evaluates the output.  I added to /etc/rc.conf:
apache22limits_enable=YES
apache22limits_args="echo umask 0002"
After that, I ran "sudo /usr/local/etc/rc.d/apache22 restart", sudo rm'd the files that WebDAV had created already, and published it again.  This time, WebDAV created files I could edit.  Mind you, after that, Dashcode couldn't delete the directory it had created to republish it, but that's for my next post.
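For the curious, here's the umask arithmetic as a Python sketch.  It assumes the old umask was 022 (which is what the 755 and 644 modes suggest) and that the server asks for 777 on directories and 666 on files, as most programs do; I haven't checked what mod_dav actually requests, so treat those two numbers as assumptions.

```python
def created_mode(requested, umask):
    """Mode a new file or directory ends up with: requested bits minus umask bits."""
    return requested & ~umask


# Assumed request modes: 0o777 for directories, 0o666 for files.
for umask in (0o022, 0o002):
    dir_mode = created_mode(0o777, umask)
    file_mode = created_mode(0o666, umask)
    print("umask %03o: directories %03o, files %03o" % (umask, dir_mode, file_mode))
```

With the group bits now rwx on directories and rw- on files, the ACL mask no longer strips write access from my u:piquan entry.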

Dealing With Dashcode, Part 1: Architecture and Basic Configuration

I've been writing a web app for the iPhone recently.  The main tool for this type of web app is Dashcode, which is part of Xcode.  There's a couple of problems with my setup: one of them I've solved, and the other I haven't.  In the hopes that somebody else on the Internets might find this information useful, here ya go.

The solved problem is how to publish this on my own server.  Most of the work was in my Apache configuration.  Much of it should be pretty obvious to anybody who's dealt with WebDAV, but one problem with authentication wasn't; that's at the end, but first I'll talk about my setup.

My app involves a combination of three major elements: static HTML documents, dynamically generated data, and the web app that's generated by Dashcode.  (The web app is served as a static page by Apache, of course – it's the client that does all the computation there – but it has its own considerations in the Apache configuration.)  My web site, therefore, has three sections.  (All the project names, domain names, etc. in this are replaced with more generic versions.  These are not real URLs.)
  • http://www.piquan.org/myapp/gen/   : Dynamic content, generated by Python scripts.  This is being read by Ajax, so there's some files in here that generate XML, some that generate JSON, and for my own convenience it's nice to have a few static files in here as well.
  • http://www.piquan.org/myapp/htdocs/   : Static content, just a bunch of HTML files.  The web app will sometimes send the user to these files (and out of the app entirely) by setting window.location.
  • http://www.piquan.org/myapp/serv/   : This holds the app generated by Dashcode.  I want Dashcode to be able to blow everything under here away, and replace it.
All of the files are currently in ~/src/myapp, and while I'm just working on the initial version, I want it to be served straight out of there.  While I'm doing development, all of this is under http://www.piquan.org/myapp-devel/ , and I'll move to /myapp/ once I've got a releasable version.

To get the static files to be served is pretty easy.  I just put this in my Apache config (within the VirtualHost section for www.piquan.org):
Alias /myapp-devel/ /home/piquan/src/myapp/
<Directory /home/piquan/src/myapp>
AllowOverride All
Order allow,deny
Allow from 192.168.42.0/24
</Directory>

(The "AllowOverride All" is to make it easier for me to experiment using .htaccess instead of needing to change my Apache config and restart the server. Of course, .htaccess is somewhat limited in some ways, as we'll discuss later.)

Ok, so what next? Well, I need to serve the dynamic content.  Remember that there's both static and dynamic files in there.  Ideally, it should be transparent whether the content is coming from a static file or being dynamically generated.

To generate the dynamic content, I chose to use Python with raw WSGI.  This is a very simple way to write simple dynamic content.  For complex projects, a web application framework like Django would be more appropriate, but here I'm looking at about 300 lines of code, so Django would be overkill.  Also, remember that here I'm only sending either simple XML or JSON.  Now, WSGI is actually an interface standard (like CGI), not an implementation; the Apache WSGI implementation is mod_wsgi.  I installed this (using FreeBSD's Ports mechanism).

The default for mod_wsgi is to serve data from within the Apache process.  On one hand, this is very efficient.  On the other hand, it's very inconvenient for development, because if you change your code, you have to restart Apache.  There's an easy solution: put the WSGI handlers in a separate group of processes that Apache automatically manages.  When you change your code, mod_wsgi will automatically kill and restart those.  If you don't understand that, then don't worry: the config file additions are quite simple.  I just added to the VirtualHost section:
WSGIDaemonProcess piquan.org display-name=%{GROUP}
WSGIProcessGroup piquan.org
This sets up a simple set of processes to manage all WSGI requests.  In this configuration, the same set manages everything across my entire domain.  (That's not because of "piquan.org" there; that's just an identifier for the process group.  It applies to my entire domain because it's within the VirtualHost section, not within a Directory section.)  Once it's time to release, I'll change the name from "piquan.org" to "myapp" or something, tune the WSGIDaemonProcess line to use the appropriate amount of resources (in terms of processes, timeouts, etc), and move the WSGIProcessGroup to within a Directory section so that it only applies to my program.

The mod_wsgi Quick Configuration Guide says to use WSGIScriptAlias to tell mod_wsgi what directory should be considered WSGI programs.  However, remember that I have a mix of dynamic and static files, so I took a different tack.

To deal with this, I put the static files in files named things like staticdata.js and staticdata.xml.  The dynamic files are named things like dynamicdata.js.wsgi and dynamicdata.xml.wsgi.  Then, a bit of .htaccess magic, along with the miracle of MultiViews (which I don't know WHY it's not more widespread) lets me tell Apache to send things to the right place.
Options All MultiViews
MultiviewsMatch Handlers Filters
AddType application/json .js
AddType text/xml .xml
AddHandler wsgi-script .wsgi

This is actually a bit of overkill. I specified a very broad Options line.  The only option I really needed for this was MultiViews.  [Edit: I also need ExecCGI.  Thanks, Graham!]  It's just more convenient during development to have things like Indexes and FollowSymLinks on.  Also, I set the MultiviewsMatch to include Filters, when really I only need it to deal with Handlers.  (I'll probably turn on the mod_deflate filter later, since this is an iPhone web app that will sometimes be sent over 3G and even EDGE networks, but that's not something that MultiviewsMatch needs to be involved in.)  I generally tend to put in pretty broad web capabilities during dev, and then tighten it when I deploy.

The two AddType directives are because I thought one of the libraries I was using was being a bit picky about the Content-Type it gets back.  (I could have named my JSON files with .json instead of .js, since Apache already associates application/json with the .json extension, but the .json extension irks me a bit.)  As it turns out, the library wasn't as picky as I thought (for instance, it would be happy with Apache's default of application/xml, which is arguably more appropriate for this purpose; note that both are valid MIME types for XML), but I left them in anyway. Note that the AddType directives only apply to the static data; they don't apply to WSGI scripts, since those send their own Content-Type header.
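To show what "send their own Content-Type header" means in practice, here's a minimal sketch of what a file like dynamicdata.js.wsgi could contain.  The payload is made up; the only fixed parts are the WSGI calling convention and the fact that mod_wsgi looks for a callable named application by default.

```python
import json


def application(environ, start_response):
    """Minimal WSGI callable that returns a small JSON document."""
    # Hypothetical payload; a real script would compute this.
    payload = {"status": "ok", "items": [1, 2, 3]}
    body = json.dumps(payload).encode("utf-8")
    # The script sets its own Content-Type, so AddType doesn't apply here.
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

You can poke at this without Apache by calling application() with a plain dict and a dummy start_response function.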

(By the way: when you're writing apps like this, you can handle REST-style URLs pretty easily.  For example, http://www.piquan.org/myapp/gen/person/piquan could be handled by a script named gen/person.wsgi, which can look at environ['PATH_INFO'] to read the "/piquan" bit.  Note that this is the environ passed to application, not os.environ.  If you really wanted to get fancy, you can probably add some MultiViews magic along with sections to separate this into person.GET.wsgi, person.PUT.wsgi, person.POST.wsgi, etc.  Hmmm... maybe I'll write a filter module to let that happen easily.)
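Here's roughly what that person.wsgi idea looks like as a sketch.  The JSON shape and the 404 handling are my own invention for the example; the only real ingredient is reading PATH_INFO from the environ that gets passed to application.

```python
import json


def application(environ, start_response):
    """Sketch of gen/person.wsgi: the trailing path names the person."""
    # For .../gen/person/piquan, PATH_INFO is "/piquan".
    name = environ.get("PATH_INFO", "").strip("/")
    if name:
        status, payload = "200 OK", {"person": name}
    else:
        # Hypothetical behavior when no name follows the script.
        status, payload = "404 Not Found", {"error": "no person given"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```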

Finally, we have the Dashcode-generated content.  This is created on my Mac, and I need to send it to my web server.  Hello, WebDAV!  WebDAV is pretty nice to have on your web server if you use a Mac as your desktop: it lets you keep an iCal calendar shared on a website (without paying for MobileMe!), gives you a pretty convenient and WAN-accessible file storage from the Finder, you can publish from iWeb, and so on.  (Ok, blatant advertising done.)

WARNING: Don't configure WebDAV until you have secured your web server!  WebDAV lets people write to your disk.  That's its point.  If you don't have a secure server, then you may find people you don't want writing to your disk.

In fact, don't trust my configs on this.  Read over the docs for mod_dav, and make sure you have a decent understanding of web server security.  At a bare minimum, read Apache's docs on Security Tips, and also the docs on Authentication, Authorization and Access Control.  You should also know why the latter is insecure.  (Hint: Don't use Basic auth; use Digest instead!)

Ok, now that I've made it clear that I don't want you to open your disk to the whole world, here's a bit of my configuration to allow Dashcode to publish using WebDAV.  I actually opened up a lot more than I needed to: instead of just the area where Dashcode puts its static files, I have it set up to allow WebDAV access to my entire program.  This lets me fiddle with stuff in the Finder if I need to.

Now, here's why that's a really bad idea from a security perspective.  WebDAV lets people write files.  WSGI lets people run files.  That means that with the two together, an attacker could write to, and then execute, a file on MY computer.  Bad news.  Don't do this unless you're satisfied with your security: in particular, at least put a reasonable Allow clause in place.

Here's the configuration I used.  Again, this is back in the Apache config file, in the VirtualHost section.
Alias /myapp-devel/dav/ /home/piquan/src/myapp/
<Location /myapp-devel/dav/>
Dav On
Options Indexes
Require user piquan
</Location>
And then in .htaccess:
AuthType Digest
AuthName "myapp DAV area"
AuthUserFile /home/piquan/.htdigest
(By the way: everything I put in my .htaccess, you can put in the regular Apache config.  I just prefer to use .htaccess for everything I can, since it lets me change options without restarting Apache.)

Finally, I set up my password file by using the htdigest shell command on my web server:
$ htdigest ~/.htdigest "myapp DAV area" piquan
(I already had a ~/.htdigest, and just was adding a new realm+username+password tuple.  If I didn't have an .htdigest, I would need to specify the -c option.)
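If you're wondering what htdigest actually writes, each line of the file is user:realm:hash, where the hash is the MD5 of "user:realm:password".  Here's a Python sketch that builds such a line; the function name and the example password are made up, and you should still use htdigest itself to manage the real file.

```python
import hashlib


def htdigest_line(user, realm, password):
    """Build one user:realm:hash line in the format of an htdigest file."""
    secret = "%s:%s:%s" % (user, realm, password)
    ha1 = hashlib.md5(secret.encode("utf-8")).hexdigest()
    return "%s:%s:%s" % (user, realm, ha1)


# The realm has to match the AuthName string exactly.
print(htdigest_line("piquan", "myapp DAV area", "not-my-real-password"))
```

Because the realm is baked into the hash, changing AuthName later means regenerating the entry.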

Now, the mod_dav docs recommend using completely different URLs for the WebDAV URL and the "regular" serving URL.  That's probably a little bit easier to configure.  As for me, I kinda wanted to keep my entire project under the same top-level directory (myapp-devel), so I made the DAV-enabled section a subtree.

Ok, there's one element left: permissions.  I needed the web server to be able to write to the serv directory (where Dashcode was to write its files), but I also wanted to be able to – as my normal user on the web server – edit those files (to edit the manifest, or for whatever other post-deploy changes I wanted to make).  Since WebDAV accesses files as user www, and I access them as user piquan, I needed to make serv writable by both.  But I didn't want to make it globally writable, either.

For this, I turned to FreeBSD's ACL support.  I'm not going to go into details on how to configure a filesystem or kernel for ACL support; there's guides for that online already. (If you don't like the FreeBSD Handbook's dry style, try the O'Reilly ONLamp article instead.)

I needed the serv directory itself to be writable by www and piquan, and also I needed any files within that either I or WebDAV create to be the same (unless explicitly changed).  This means configuring both an access ACL and a default ACL.
$ mkdir serv
$ setfacl -m u:piquan:rwx,u:www:rwx serv
# The next command is all on one line.
$ setfacl -d -m u:piquan:rwx,u:www:rwx,u::rwx,g::rx,o::rx,mask::rwx serv
After all this was done, I went back to Dashcode's "Run & Share" section, and added my web server as a new destination. Finally, it was time to publish!

At this point, everything I've talked about so far worked. But soon after that, I started having problems.  I'll tackle each of these in a separate post.