1. Profiling Django applications – part 1

    I have been profiling a Django application in several different ways. Profiling is very useful for understanding where the bottlenecks are, why code behaves the way it does and where we can trim the fat.

    There are multiple places where we can profile Django, and many of the pointers and resources from Python at large can be used when profiling Django. Things like Hotshot, cProfile and timeit are available, and there are helper libraries around these tools.
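    As a minimal sketch of the cProfile route, you can wrap a call in a profiler and print the most expensive entries. The slow_view function and the profile_call helper are made-up names for illustration, not code from my application:

```python
import cProfile
import io
import pstats


def slow_view():
    """Hypothetical stand-in for a slow piece of application code."""
    return sum(i * i for i in range(100000))


def profile_call(func, limit=5):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func()
    finally:
        profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by cumulative time so the costliest call chains come first.
    stats.sort_stats("cumulative").print_stats(limit)
    return result, buf.getvalue()


result, report = profile_call(slow_view)
print(report)
```

    The same helper could be pointed at a view function, a queryset evaluation or anything else you suspect of being slow.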

    In this series of posts I am going to look at what we can do to measure the performance of our Django application.

    Logging.
    Firstly, if you are not using logging in your app then start here. Without logging you are seriously making life more difficult for yourself. The Python logging module is fine, and Django 1.3 onwards has the built-in ability to configure logging from the settings.py file. Before that you could get error emails by setting up the email subsystem and the ADMINS setting; each email gave you the stack trace of the error. That is better than nothing, and you will be notified when someone else hits an error on your site, but debug-level logs for development, error logs on exceptions and an exception-catching middleware that logs will help you a great deal more.
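    A rough sketch of what an exception-logging middleware could look like, written against the newer-style Django middleware interface. The class name and the logger name are made up for illustration:

```python
import logging

logger = logging.getLogger("myapp.exceptions")  # hypothetical logger name


class ExceptionLoggingMiddleware:
    """Log every unhandled view exception together with its traceback."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        return self.get_response(request)

    def process_exception(self, request, exception):
        # We are not inside an except block here, so pass the traceback
        # explicitly instead of relying on logger.exception().
        logger.error(
            "Unhandled exception on %s %s",
            getattr(request, "method", "?"),
            getattr(request, "path", "?"),
            exc_info=(type(exception), exception, exception.__traceback__),
        )
        return None  # let Django's normal error handling carry on
```

    Returning None means the middleware only observes the error; the usual error page and any admin emails still happen.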

    Logging daemons
    If you are logging in a production environment, a good first step is to use the tools that are already available on the operating system that you deploy to. This means using syslogd or rsyslog on BSD and Unix systems (unless you have very long log lines, which syslogd truncates). This will give you automated rollover, the ability to configure central logging daemons and a logging infrastructure that most sysadmins will understand.

    The Django settings.py file allows you to set up logging to the daemons, and as it uses the Python logging module you can log to several places at once if you need to. For instance, a common pattern I use is to email exceptions to the admins and log normally to syslog, but when an exception occurs also log to a separate file so that the stack trace doesn’t get truncated.
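    The pattern above might look roughly like this in settings.py. Treat it as a sketch under assumptions rather than my exact configuration: the "myapp" names, the format strings and the file path are placeholders, and the syslog socket path depends on the operating system:

```python
# settings.py (fragment): an illustrative sketch, not a drop-in config.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "syslog": {"format": "myapp[%(process)d]: %(levelname)s %(name)s %(message)s"},
        "verbose": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"},
    },
    "handlers": {
        # Normal logging goes to the local syslog daemon.
        "syslog": {
            "level": "INFO",
            "class": "logging.handlers.SysLogHandler",
            "address": "/dev/log",  # assumption: Linux socket path
            "formatter": "syslog",
        },
        # Errors also go to a file so long stack traces are not truncated.
        "error_file": {
            "level": "ERROR",
            "class": "logging.FileHandler",
            "filename": "/var/log/myapp/errors.log",  # hypothetical path
            "formatter": "verbose",
        },
        # Errors are emailed to everyone listed in ADMINS.
        "mail_admins": {
            "level": "ERROR",
            "class": "django.utils.log.AdminEmailHandler",
        },
    },
    "loggers": {
        "django.request": {
            "handlers": ["syslog", "error_file", "mail_admins"],
            "level": "INFO",
            "propagate": False,
        },
        "myapp": {  # hypothetical application logger
            "handlers": ["syslog", "error_file"],
            "level": "INFO",
        },
    },
}
```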

    Another reason to use a logging daemon and not just write to a file is that logging daemons can accept input from multiple sources. This really helps when you have multiple worker daemons on the same machine (which is most probably true if you are trying to take advantage of a multi-core system). Having multiple processes write to the same log file is not recommended with Python, as the logging module does not serialise access to a file between processes.

    External logging services.
    Moving on from here, if your site is in need of more logging capabilities, consider using the excellent Django Sentry, which came out of Disqus. It will log to a central database, giving you excellent tools to deep dive into your logs for analysis.

    Lastly, if you prefer a hosted solution you should consider New Relic. It costs money, but in my testing it has been excellent. Not only does it give you the ability to collect logging information, it also provides centralised profiling and in-depth information on the performance of your applications.

    By timc3 on the
    May 10th, 2012
  2. Disqus

    In the battle to stop the hacking of WordPress I have migrated the comments over to Disqus.

    Looking at the logs over the past few days it’s quite clear that there is something wrong with the way that WordPress handles incoming comments; probably all it takes is to buffer overflow the comment URL and the hacker can inject code.

    So I have disabled the comments in WordPress and moved over to Disqus, which was something that I had been wanting to do for a long time anyway. I want to migrate more and more away from WordPress until I get to the point of either using it as a static site generator or building the site with a real static site generator.

    By timc3 on the
    May 7th, 2012
  3. Firefox video performance

    There is not going to be much decided in this post; rather, I am looking to vent.

    After spending about 7 hours yesterday looking into performance problems with Firefox I can’t help but be dismayed that WebM and Ogg playback take up so much CPU time.

    In my tests it is currently using 100% of one of my cores, but what is worse is that it affects anything else going on at the same time, meaning that I have to be really careful running loops, creating JavaScript objects and generally doing anything else on the same page.

    The sooner they implement support for the operating system’s native video support the better.

    By timc3 on the
    May 1st, 2012
  4. Celery ghetto queue

    Currently wondering how the ghetto queue might work for smaller installations.

    I have done larger installations using RabbitMQ, Celery’s preferred message mechanism, but for smaller loads and services, running it straight from Django and Postgres might be a good idea.
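    For reference, a sketch of what the database-backed broker looked like in Celery 3.x-era settings, assuming kombu's Django transport is installed. I have not tried it yet at this point, and later Celery versions dropped this transport, so check the documentation for the version you run:

```python
# settings.py (fragment): assumes a Celery 3.x-era install with kombu's
# Django database transport available.
BROKER_URL = "django://"  # use the Django database as the message broker

INSTALLED_APPS = (
    # ... your existing apps go here ...
    "kombu.transport.django",  # provides the tables the database transport uses
)
```

    The transport's tables need to be created in the database before workers can start, and switching to RabbitMQ later should mostly be a matter of changing BROKER_URL.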

    I guess it maintains the flexibility of having an upgrade path but is less to install and maintain, thus being kinder to sysadmins and support staff.

    Going to try it I think.

    By timc3 on the
    April 27th, 2012
  5. Cantemo update

    Just a quick update to what is happening with Cantemo these days.

    Firstly, we have two main products in the marketplace at the moment, MediaBox and Portal DAM. Portal is our enterprise product, a Digital Asset Management system that can be tailored to suit just about any need or requirement. For instance, you might need to build:

    • A cloud-hosted Digital Asset Management system.
    • A system that has multiple workflows for multiple user groups.
    • A system that integrates with many different parts (think Business Process Management systems, transcoder farms, newsroom systems, NLEs, Digital Rights Management, playout servers) and can work as the controlling force for all of that.
    • A system that can scale to many hundreds of thousands of users and millions of assets with multiple storage solutions.

    We also have MediaBox, aimed at smaller workgroups in areas such as post-production and education.

    Common to both is the following:

    • Sold through a network of resellers and system integrators
    • Can be extended with apps such as annotation tools and usage reporting.
    • Has an open and unique API to build your own tools.
    • Can integrate with Apple Final Cut Pro, Adobe Premiere and other NLEs.

    We are showing these systems at NAB along with the apps and the integration with partners such as Object Matrix and Vidispine. If you would like an online demo, that can also be arranged.

    By timc3 on the
    April 19th, 2012
  6. Removing hacked WordPress files

    The other day someone kindly told me that this blog had been “hacked”. Actually what had happened was that someone had managed to inject PHP code into the WordPress theme files, the WordPress blog files and all the plugin files.

    What was particularly interesting was that it only showed up if you hadn’t visited the site before, making it harder to spot. In the header of each PHP file there was a PHP eval of a base64-encoded string which contained the redirect code.

    If you have a similar problem you either need to grep for every file containing base64-encoded PHP, or replace the main WordPress install, re-upload the theme and reinstall all the plugins. If you want an easy fix, sorry; next time consider using Fabric/Puppet/Chef and having a backup version of the site that you can deploy at the drop of a hat.

    Suffice to say that I am only really using WordPress because it’s been making my life easier, so I can concentrate more on Python, JavaScript, Go and Erlang efforts for work and the fun stuff, but I am seriously thinking about jumping platform. It also doesn’t help that the host of this blog is quite slow these days.
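    If you go the grep route, a small Python script can do the same job. This is only a sketch: the pattern is an assumption about the common eval(base64_decode(...)) shape of such injections, so a differently obfuscated payload would slip past it, and the function name is made up:

```python
import os
import re

# Assumed injection shape: eval(base64_decode("...")), with optional spaces.
SUSPICIOUS = re.compile(rb"eval\s*\(\s*base64_decode\s*\(", re.IGNORECASE)


def find_suspicious_php(root):
    """Return the sorted paths of .php files under root that match SUSPICIOUS."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(".php"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                if SUSPICIOUS.search(handle.read()):
                    hits.append(path)
    return hits if not hits else sorted(hits)


# Example usage (the path is a placeholder):
#     for path in find_suspicious_php("/var/www/blog"):
#         print(path)
```

    Some legitimate plugins also use eval with base64, so treat the output as a list of files to inspect rather than a list of files to delete.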

    By timc3 on the
    March 24th, 2012
  7. Postgres and psql for beginners

    Actually I am not sure whether I can put myself in the beginners pot, having run PostgreSQL for many years and built several products and services upon it. However, a round-up of psql for common use cases is needed, I think. So this is basically for that usual situation where you have logged in to the server that Postgres is running on and have access to the postgres user or have the rights to use psql.

    Start up the psql command line interface
    psql

    Meta commands
    Meta commands are excellent short commands processed by psql itself. Some are shortcuts to longer SQL statements that get executed, others are system commands such as change directory.

    List databases on the system

    \l
                                      List of databases
       Name    |  Owner   | Encoding |  Collation  |    Ctype    |   Access privileges
    -----------+----------+----------+-------------+-------------+-----------------------
     database1 | tim      | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
     postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
     template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
                                                                 : postgres=CTc/postgres
     template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
                                                                 : postgres=CTc/postgres
     database2 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
    (5 rows)

    List users on the system

    postgres=# \du
                List of roles
     Role name | Attributes  | Member of
    -----------+-------------+-----------
     tim       | Superuser   | {}
               : Create role
               : Create DB
     postgres  | Superuser   | {}
               : Create role
               : Create DB
     another   |             | {}

    List help

    \?

    Change to database called “databasename”

    \c databasename

    By timc3 on the
    February 1st, 2012
  8. Going back to Java applets for upload

    This is something that I didn’t think that I would be writing in 2012 – we are switching to a Java based applet for our uploader.

    The problem has been that browsers and their operating systems handle uploads differently. Really differently. Even using a project like plupload, you can’t polyfill over the cracks. There are new APIs coming out (call it HTML5 if you want, though that is not strictly true), but to handle backwards compatibility, including IE9, there needs to be a mechanism that makes it generic, and the only thing that does is Java.

    • Flash can be OK, but there are limitations in file size.
    • Javascript depends on the browser implementation.
    • Gears & Silverlight – not exactly something that I want to deploy.
    • Java – the cross platform way.
    • Depending on the browsers built in widgets – Not possible for large files.

    We need to transfer large files of over 2 GB, and some browsers will actually try to load the entire file into memory before transferring. Certain death of the browser on old machines.

    Java gives us chunking, it gives us a cross-platform implementation, it gives us an easy testing mechanism. 2012 – roll out your applets.
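    To illustrate why chunking matters, here is the idea sketched in Python rather than in the applet's Java. It is not our uploader code; the function name and the 4 MB default are arbitrary choices for the example. The file is read one fixed-size piece at a time, so memory use stays bounded no matter how large the file is:

```python
def iter_chunks(path, chunk_size=4 * 1024 * 1024):
    """Yield (offset, data) pairs, holding at most chunk_size bytes in memory."""
    offset = 0
    with open(path, "rb") as handle:
        while True:
            data = handle.read(chunk_size)
            if not data:
                break
            yield offset, data
            offset += len(data)


# Example usage, where send_chunk is a hypothetical upload function:
#     for offset, data in iter_chunks("/path/to/large/file.mov"):
#         send_chunk(offset, data)
```

    Sending the offset with each piece also gives the server what it needs to resume an interrupted transfer instead of starting again from zero.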

    By timc3 on the
    January 25th, 2012
  9. Javascript MVC frameworks with MVC server frameworks

    Really interesting article about Javascript MVC frameworks at CodeBrief.

    Personally and for work I am totally into backbone.js. It is changing the way that I work with Javascript in my web applications for the better by bringing greater structure and organisation to my code and having a common methodology in the way that I approach server-browser communications.

    The only problem that I have is that having an MVC stack in the browser is fine, but then we have the same on the server side with Rails (OK, not exactly MVC), Django, Node.js or what have you. We end up with MVCMVC or rather MCVCMV, and this is of course a pattern that causes code duplication if we are not careful.

    I might look into creating backbone.js models from Django at some point, but I am not sure that this is the correct route. Obviously the way that GWT and PyJamas work becomes more compelling. Well, for me GWT is not that compelling: I don’t really want to have to work on the frontend in Java, I don’t think it gives the control that one needs over the presentation, and I can’t stand Eclipse/NetBeans/xxxx IDEs as they are slow and cumbersome.

    Regarding the CodeBrief article, ember.js came out on top, and I can see the negatives of backbone.js in its somewhat boilerplate approach, though I think this can be minimised somewhat.

    Personally I think my goals going forward for JavaScript are:

    1. Find a decent testing environment that I am happy with
    2. Loading of modules should be optional, and I need to work on converting my existing application to load our backbone.js models and views intelligently
    3. Greater reuse or automatic generation of code

    I am intending to write about all of this on this blog in the near future as I come up with more conclusions.

    By timc3 on the
    January 21st, 2012
  10. The new web stack

    There is a really interesting article over on Ilya Grigorik’s blog titled . Be sure to read the comments if you are interested in how AOL built a similar system many moons ago.

    I have been thinking about next generation web stacks, with proper (M)VC in the browser, communication with the server using SPDY / WebSockets / SSE or similar over SSL, and then the topology of the stack in the background being event driven, and that article has some really nice points. Kinda interested in playing with 0MQ now and perhaps hooking it up to Go, Erlang or just plain old Gevent.

    Probably wouldn’t use a NoSQL backend because there is little reason most of the time.

    The Hacker News discussion is here: http://news.ycombinator.com/item?id=3481140

    Damn I feel old fashioned having this blog on Apache | PHP | MySQL. At least my work environment is Nginx | Gunicorn | Python | Postgresql

    By timc3 on the
    January 19th, 2012