I've been reading more about Google App Engine over the past few days. I set up an account a long time ago, but hadn't really had much of an opportunity to try it out. So, I did the logical thing and walked through the quickstart tutorial. It was pretty good, and produced something better than your typical "Hello World" programming example. That's a good thing, since they wouldn't really be able to show off the more powerful features with such a simple example.
Recently I wrote a pretty simple (read: crappy) neural net in Java, and I was trying to figure out a way to move it to a server environment so I could get more scalability and power than my iMac offers. I thought about using AWS, but the costs involved are pretty steep, and I'm not just talking about money here: environment setup, getting an engine running, and so on. So, I decided to take a look at the App Engine setup and see whether I could actually build something there to run my node network.
The first roadblock I found is that App Engine does everything by URL requests. That could work, as I could just kick off the whole thing from a web front end, but it seems a bit hokey if I just want the network to be running all the time. I did find the ability to create cron jobs and put tasks on a queue, but I'll get to my analysis of that in a minute.
There are two big limitations in App Engine that prevent kicking off an engine from a URL request: a 30-second limit on generating a response, and no ability to start additional threads. The former wouldn't be a problem if you could do the latter. Obviously you don't want a web request to take a long, long time, so a 30-second limit makes perfect sense. In fact, that's a huge amount of time for a user to wait for a page to load or for some other interaction to get handled, so I'd argue no web application should take longer than that for a typical request/response. So this really isn't a problem per se, and I didn't want to go down that road anyway. I really just wanted the URL to kick off the engine and return saying "yep, kicked it off!"
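Here's a minimal sketch of what I mean, using the standard servlet API that App Engine's Java runtime supports. The class name and behavior are just my own illustration, not anything from the App Engine docs:

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical kick-off servlet. It responds well inside the 30-second
// window because all it does is acknowledge the request; the actual
// engine work has to happen somewhere else.
public class KickoffServlet extends HttpServlet {
    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Can't spawn a daemon thread here; App Engine forbids it.
        // So there's nothing to do but hand the work off somehow
        // (more on that below) and return immediately.
        resp.setContentType("text/plain");
        resp.getWriter().println("yep, kicked it off!");
    }
}
```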
The second issue is a bigger problem for typical parallel processing. If I can't kick off a daemon thread that can spawn its own set of worker threads, then the engine dies as soon as the request finishes and stays dead until another request comes in. So, I'm really limited to 30-second chunks of work, and those chunks can't easily kick off other chunks of work.
Or can they?
I then learned about cron jobs and task queues. Cron jobs are basically just scheduled tasks, named after the *nix program used to run commands on a schedule or at a specific time. The App Engine version of cron calls a URL at a particular time or on a recurring schedule, thereby kicking off a request. That sounds interesting, but only if I'm content with kicking off the engine every minute or so. Again, this wouldn't necessarily be a problem if the engine could run for more than 30 seconds at a time. However, URLs called by cron jobs are subject to the same limits as any other request: 30 seconds and no new threads. That's going to be a bit slow for my needs, and not really what cron should be used for in my opinion, not at every minute anyway.
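For the Java runtime, the schedule lives in a cron.xml file alongside the rest of the app's configuration. Something like this (the /runengine URL is a handler I'm inventing for illustration) would hit the engine once a minute:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
  <cron>
    <!-- Hypothetical handler URL; each hit is still an ordinary
         request, subject to the 30-second limit. -->
    <url>/runengine</url>
    <description>Kick the neural engine once a minute</description>
    <schedule>every 1 minutes</schedule>
  </cron>
</cronentries>
```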
The task queue holds more promise. Your application can add tasks to the queue, and the task queue engine processes them. Tasks are basically URL requests, just like cron jobs, so they live under the same limitations there, but depending on load, tasks can be handled at more like 5 or 10 per second (capped at 20 per second). This is more reasonable, since each request could take a second or two to process 1000 neurons or so and then go away. At the 20-task-per-second cap, that's 20,000 neurons handled per second, as opposed to the 10,000 or so per minute I'd get out of the cron approach. This is still limiting, but not quite as much. The kicker? There is a quota of only 100,000 tasks per day, which at that rate would run out in under an hour and a half. That's the limit on free accounts; paid accounts are still capped at a million, so about 14 hours of meaningful work unless it's throttled down, but then fewer neurons get handled at a time. The docs say this feature is experimental, so maybe they'll increase the quotas, but the limits are still too low to handle a meaningful number of neurons, especially as the network scales.
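This is roughly what a worker looks like with the Java task queue API, as I understand it from the docs (the feature is experimental, so the package and method names may shift; the handler URL, the batch parameter, and processBatch are all my own invention). The interesting part is that a task handler can enqueue the next task before it returns, which is how 30-second chunks can kick off other chunks of work after all:

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

// Hypothetical worker: process one batch of neurons per task, then
// enqueue the next batch. Each task is an ordinary request, so it
// still has to finish within the 30-second limit.
public class NeuronWorkerServlet extends HttpServlet {
    private static final int BATCH_SIZE = 1000;

    @Override
    public void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        int batch = Integer.parseInt(req.getParameter("batch"));

        // Stand-in for the real engine work: update ~1000 neurons,
        // taking a second or two.
        processBatch(batch * BATCH_SIZE, (batch + 1) * BATCH_SIZE);

        // Chain: put the next chunk of work on the queue before this
        // request ends. This is exactly what burns through the daily
        // task quota.
        Queue queue = QueueFactory.getDefaultQueue();
        queue.add(TaskOptions.Builder
                .withUrl("/neuronworker")
                .param("batch", String.valueOf(batch + 1)));
    }

    private void processBatch(int from, int to) {
        // Placeholder for the actual neural engine update.
    }
}
```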
Next, I will look into using MapReduce — possibly using AWS Elastic MapReduce — for handling some of the core neural engine requirements. From what I've read, I'm not sure it's the right parallelization technique for this, but it should make for some interesting reading.
I'm willing to pay money for the hardware use — it's the setup time that I'm trying to avoid. What we need is an online neural net engine as a service. Maybe that's what I'll end up building in the end.