How to Set an Empty Dictionary/Hash as Default Value for a PostgreSQL HSTORE Column

Do you have an HSTORE column in your PostgreSQL database that you don't want to be `null`, and that needs a default value? The syntax for this is a little irregular, so I'm posting it here for my own reference and yours:

my_column HSTORE DEFAULT '' NOT NULL

is the line in your `CREATE TABLE` command that you want.
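
If you're wondering why an empty string works as the default: hstore's text format is just a comma-separated list of "key"=>"value" pairs, so a string with no pairs in it is an empty hstore. Here is a rough sketch of that format in Python. Note that to_hstore_literal is a hypothetical helper written only for illustration; your database driver normally does this conversion for you.

```python
def to_hstore_literal(mapping):
    """Render a dict of strings in hstore's text format (illustrative helper only)."""

    def quote(text):
        # Inside double quotes, backslashes and double quotes must be escaped.
        return '"' + text.replace('\\', '\\\\').replace('"', '\\"') + '"'

    pairs = []
    for key, value in mapping.items():
        # hstore values may be NULL (written unquoted); keys may not.
        rendered = 'NULL' if value is None else quote(str(value))
        pairs.append(f'{quote(str(key))}=>{rendered}')
    return ', '.join(pairs)


empty_literal = to_hstore_literal({})
sample_literal = to_hstore_literal({'color': 'red', 'size': None})
print(repr(empty_literal))   # ''
print(sample_literal)        # "color"=>"red", "size"=>NULL
```

So an empty dictionary and `DEFAULT ''` describe the same value.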

In Ruby on Rails, using an ActiveRecord migration, you would use:

t.hstore :my_column, default: {}, null: false

In Python, using a SQLAlchemy migration, you would use:

sa.Column('my_column', HSTORE(), nullable=False, server_default=sa.text("''")),

Additionally, if you want your SQLAlchemy model object to initialize this column with said empty dictionary (instead of `None`), per this StackOverflow post you need to take a couple of extra steps in your model:

from sqlalchemy.dialects.postgresql import HSTORE
from sqlalchemy.ext.mutable import MutableDict

class Item(db.Model):
    my_column = db.Column(MutableDict.as_mutable(HSTORE), nullable=False, default={}, server_default='')

    def __init__(self, **kwargs):
        kwargs.setdefault('my_column', {})
        super(Item, self).__init__(**kwargs)
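
To see what that `__init__` override buys you without spinning up a database, here is a small sketch in plain Python. FakeBase is a hypothetical stand-in for the model base class (it only stores whatever keyword arguments it is given); it is not how SQLAlchemy is implemented.

```python
class FakeBase:
    """Hypothetical stand-in for a model base class: stores the kwargs it receives."""

    my_column = None  # reads as None until something assigns a value

    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            setattr(self, name, value)


class Item(FakeBase):
    def __init__(self, **kwargs):
        # The {} literal is evaluated on every call, so each instance gets its own dict.
        kwargs.setdefault('my_column', {})
        super().__init__(**kwargs)


plain = FakeBase()                         # no override: attribute is still None
first = Item()
second = Item()
first.my_column['color'] = 'red'           # must not leak into `second`
explicit = Item(my_column={'size': 'L'})   # an explicit value is left alone
```

Without the override the attribute stays `None` until something assigns it; with it, every new object starts with its own empty dictionary.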


The More You Know!

Fix Broken Beatport MP3 Downloads in Serato

It kinda be like that...


If you're experiencing a problem with Serato DJ, where MP3s purchased on Beatport cannot seem to save key, BPM, or track analyses and overviews, then perhaps a tool that I have written might help!

Check out MP3TagRebuilder, a simple Python script I wrote to address this issue with my own DJ library!

This tool addresses an issue I've been encountering somewhat frequently over the last few years, where my Beatport music purchases have a weird glitch in Serato where overviews and tag data won't save, even after using the "Analyze" feature. The only solution I have found, even after writing to Serato support, is to rebuild the MP3 files' ID3 tags destructively.

However, every program that I know of that does this ends up dropping important tags, such as Album Art, because none of them provide a direct pathway to simply destroying the ID3 tags and then rebuilding them with a new data structure; most of them only seem to support converting from ID3v2 to ID3v1 and back again. So I wrote my own!

If you are encountering this issue and are feeling bold enough to test my code on your own library (MAKE SURE YOU HAVE A BACKUP AND TEST IT!!!!), head on over to GitHub and check it out!

You will need a working Python environment and must be comfortable with a command prompt. Instructions for running this tool are included in README.md, and instructions for installing Python can be found here.

Get it from GitHub here: https://github.com/lyjia/mp3-tag-rebuilder



Surviving the Slashdot Effect: Caching Web Traffic with Rails and Cloudflare

So you've got a content-oriented website, maybe your own blog or something, and maybe you've (like me) decided to ignore the advice about using static site generators. You build your Rails site, and it is wonderful and beautiful and dynamic, and every page it replies with delights readers with your artisanally-crafted HTML. Maybe you've got some internal caching (Rails has you covered here), maybe it's all roundtrips to the database. But who cares! Your site is up and receiving traffic!

Then, suddenly, a storm hits. Congratulations! You've made it to the top of Reddit/Slashdot/Hacker News! You now have thousands, if not millions, of people beating down your door to read your content. But now your site is down! The link's comment thread is filling up with complaints about being hugged to death, and a few helpful souls are posting the archive.org equivalent of your link and siphoning away your traffic.

How do we fix this?

You could throw more compute resources at it -- think "scaling vertically/horizontally" -- which I'm sure your server/application host would ab$olutely love.

Or, you could install some sort of proxy cache in front of it. The traditional answer here is to use nginx or Varnish as a caching proxy, but if you use a content delivery network (such as Cloudflare) it may be better to use that CDN's caching features instead. (Some might recommend using both your own cache and your CDN's cache, but I wouldn't advise this because troubleshooting cache issues is already difficult enough, and having multiple layers only makes debugging even more confusing. If you do this, you should understand your web application thoroughly.)

Since this site is fronted by Cloudflare, I want to make use of its page cache: it's free and comes with the service!

However, setting this up is not as simple as it may first appear: in a default configuration, Rails doesn't permit caching (the Cache-Control headers it sends don't allow for it), and as a result, nearly every request you receive bypasses the cache and gets passed directly to the app server. This is a screenshot of my Cloudflare dashboard showing the percentage of page requests cached before I applied the fixes I describe here (those peaks top out at ~10%):

Uh.... that's not very good!


Now, you can set up rules in the Cloudflare dashboard to override Rails' requested caching behavior, but this does not solve the underlying root cause: Rails is requesting no caching, because the Cache-Control response header it sets explicitly forbids it:

Cache-Control: NO CACHE!


Setting the Correct Cache-Control Headers with Rails

NOTE: The directions given here apply to Ruby on Rails version 7, though expires_in and fresh_when have existed since at least version 2, and concerns have been available since version 4.

Luckily, Rails makes changing this behavior fairly simple. We don't even need to really dive into how Cache-Control works! (Though here is a good guide if you want to know.) You simply call the expires_in and/or fresh_when functions in your controller, supplying an expiry and ensuring that you set public to true. Like this:

expires_in 1.hour, public: true
# or
fresh_when(@article, public: true)

However, setting this for every route is both tedious and a pretty egregious violation of DRY. Instead, we can set as much as we can once and then propagate it through our application using either class inheritance or composition (via ActiveSupport's concerns feature). And while inheritance may be slightly easier, composition is a bit more modern and flexible; here we will be taking the latter approach.

To start, we will want to make a new concern and call it "Cacheable". The easiest way to do this is to simply go to the $RAILS_ROOT/app/controllers/concerns folder and create a new file, naming it cacheable.rb. In this file, we want to make one small action (called "set_cache_headers") and call expires_in within it. Here is a very basic and usable example, which also prevents page caching when a user is logged in:

# app/controllers/concerns/cacheable.rb
module Cacheable
  extend ActiveSupport::Concern

  included do
    # Pass a symbol so current_user is checked on each request, not once when the class loads
    before_action :set_cache_headers, unless: :current_user
  end

  def set_cache_headers
    expires_in 1.hour, public: true
  end

end

Then, for each controller whose content you wish to cache, simply add "include Cacheable" at the top right below the class declaration. Here is an example pulled directly from this site's code, for the controller that powers the "About" feature:

# app/controllers/static_controller.rb
class StaticController < ApplicationController
  include Cacheable
  def about
    @page_title = "About"
  end
end

Once this is done you will see that Cache-Control is indeed being set correctly:

Objective achieved!
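
For the curious, the difference between the two header values can be modeled in a few lines. This is a deliberately simplified sketch in Python of how a shared cache might read Cache-Control; the function names are hypothetical, and real CDNs apply their own additional rules on top of this. The two sample values are what Rails sends by default and what it sends after `expires_in 1.hour, public: true`.

```python
def parse_cache_control(header):
    """Split a Cache-Control value into a {directive: value-or-None} dict."""
    directives = {}
    for part in header.split(','):
        name, _, value = part.strip().lower().partition('=')
        name, value = name.strip(), value.strip()
        if name:
            directives[name] = value or None
    return directives


def shared_cache_may_store(header):
    """Simplified rule: explicitly public, with a positive lifetime, and not private/no-store."""
    directives = parse_cache_control(header)
    if 'private' in directives or 'no-store' in directives:
        return False
    lifetime = directives.get('s-maxage') or directives.get('max-age')
    has_lifetime = lifetime is not None and lifetime.isdigit() and int(lifetime) > 0
    return 'public' in directives and has_lifetime


rails_default = 'max-age=0, private, must-revalidate'
after_expires_in = 'max-age=3600, public'

print(shared_cache_may_store(rails_default))     # False
print(shared_cache_may_store(after_expires_in))  # True
```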


But! You are not finished yet! You may notice that while your 'cached' stats are going up, they aren't going up as much as one might think. This is because there is another component to page caching that we have not yet discussed: etags.

Setting the Correct Etag Headers with Rails

This is where things get a bit more tricky: Rails generates another header, called an Etag, that, in theory, is supposed to be unique for each page. (For the more technically inclined, you can think of an Etag as something like a SHA256 hash for your page.) But Rails, by default, makes this tag unique per request. Both your browser cache and your CDN cache read this header to determine whether a given request is a cache hit or cache miss, and so we will need to configure Rails to set it correctly, based on our rendered page content (or other context).

Enter fresh_when, which provides further direction to Rails on how to render the correct etag header. You provide it with an object that describes what the page renders -- generally the model instance for the given page (the Rails docs use @article in their examples) -- and it generates a hash that is used for the Etag header.
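
If the mechanics are fuzzy, here is a conceptual sketch of the idea in Python. It is not Rails' actual implementation: make_etag and respond are hypothetical names, and the choice of SHA-256 over a record's type, id, and last-updated timestamp is an assumption made for illustration. The point is only that the tag is derived from something stable, so an unchanged page yields the same tag and the server can answer "304 Not Modified" instead of rendering again.

```python
import hashlib


def make_etag(*parts):
    """Derive a weak ETag from stable facts about the page (illustrative scheme)."""
    key = '/'.join(str(part) for part in parts)
    digest = hashlib.sha256(key.encode('utf-8')).hexdigest()
    return f'W/"{digest[:32]}"'


def respond(if_none_match, etag, render):
    """Return (status, body); skip rendering when the client's copy is still current."""
    if if_none_match == etag:
        return 304, ''
    return 200, render()


original = make_etag('article', 42, '2024-01-01T00:00:00Z')
unchanged = make_etag('article', 42, '2024-01-01T00:00:00Z')
edited = make_etag('article', 42, '2024-02-15T09:30:00Z')

first_visit = respond(None, original, lambda: '<h1>Hello</h1>')
repeat_visit = respond(original, unchanged, lambda: '<h1>Hello</h1>')
after_edit = respond(original, edited, lambda: '<h1>Hello, edited</h1>')
```

Here `repeat_visit` comes back as a 304 with an empty body, while `first_visit` and `after_edit` both render the page.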

Using fresh_when with Dynamic Pages

For dynamic pages, such as the example described above and in the Rails docs, simply call fresh_when and pass it your model instance as its first parameter, inside the controller route. Like so:

# app/controllers/articles_controller.rb
def show_article
  fresh_when @article
end

When combined with the aforementioned Cacheable, this is sufficient to avoid page-caching in the case of logged-in users, as the expires_in directive is never called when current_user exists, and Rails reverts to its default, zero-expiry "private" cache behavior.

If you aren't using Cacheable as described above, you will need to consult the documentation, since fresh_when must then be given additional options (such as public: true) itself.

Using fresh_when with Static Pages

Static pages are a bit more tricky here, as the examples in the docs do not cover this circumstance. Once again by default the etag will always be different, so we need to pin it to something, yet appropriate data (namely, the path to the view and a timestamp for when it was last updated) isn't really available to our controller. And, like above, we don't want it to get mixed-up when the site is being viewed by a logged-in user vs. the anonymous public.

I haven't figured out a great solution here, but we can build a string using params and current_user and set that as the value of our etag, and it should work well enough for our purposes. But! You will need to manually purge the cache on your CDN when these pages are updated, or wait for them to expire. A short expiry (say, 15 minutes) is useful in this case.

(Note that if we can get information about the last-modified timestamp of the template to be rendered, we could include that data in the etag so it would naturally invalidate when it is updated, but I don't know of a non-hacky way to do this and some preliminary research in this area yields nothing.)

So, we craft another concern, this time called StaticCacheable, and include this in each controller serving static content. Once again, like Cacheable, this is a controller-level solution; if you need something per-action that is an exercise left up to you.

To make this concern, create a new file called static_cacheable.rb and save it to your $RAILS_ROOT/app/controllers/concerns folder, right next to cacheable.rb. Note that we will include a reference to Cacheable from within StaticCacheable, so that you only need to include StaticCacheable on your static controllers. In the action it defines we simply grab route information from params and feed that into fresh_when:

# app/controllers/concerns/static_cacheable.rb
module StaticCacheable
  extend ActiveSupport::Concern

  included do
    include Cacheable
    before_action :set_static_cache_headers
  end

  def set_static_cache_headers
    etag = "#{params[:static]}#{params[:about]}#{current_user}"
    fresh_when etag: etag
  end
end

Note that including current_user makes the Etag different for every request when a user is logged in. This is because interpolating current_user into the string implicitly calls .to_s on it, and that string representation will always vary: it includes current_user's internal object-id (NOT its database id), which changes with each request.
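
This effect is easy to reproduce outside Rails. The sketch below uses Python rather than Ruby purely for illustration, since its default object string also embeds a per-object address; User and etag_source are hypothetical names, and each User instance stands in for the fresh object a new request would load.

```python
class User:
    """Hypothetical user record; only db_id is stable from one request to the next."""

    def __init__(self, db_id):
        self.db_id = db_id


def etag_source(path, user):
    # Interpolation falls back to the default string form, which embeds the object's address.
    return f'{path}{user}'


# Two requests from the same person load two separate objects.
request_one = User(db_id=7)
request_two = User(db_id=7)

varies_per_request = etag_source('/about', request_one) != etag_source('/about', request_two)
stable_with_db_id = f'/about{request_one.db_id}' == f'/about{request_two.db_id}'
```

Both flags come out true: the interpolated object never matches across requests, whereas a string built from the database id would. For our purposes the varying version is what we want, since it keeps logged-in pages out of the cache.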

Finally!

Once your Cache-Control and Etag headers are under control and correctly set, and you have correctly configured your proxy service, your site should be well-equipped to handle large volumes of traffic without falling over. Hurray!

A Quick Note About Cloudflare and Implementing This

It's worth noting that Cloudflare seems to strip the Etag header when Cache-Control renders it useless, as is the case when Cache-Control is set to private. This may seem annoying, but it effectively bypasses your browser cache, presumably to ease troubleshooting.

You can still see the Etag header if you pull requests directly from your webserver (by its IP address), and it will also be visible during development. Unfortunately, it seems like you will have to rely on unit tests or Cloudflare's provided statistics to verify your cache strategy is working.

How To Set Your Windows Taskbar Application Icon with PyQt/PySide

If you are writing a Windows-only or cross-platform application using PyQt or PySide (version 6, as of this writing), you may discover that you need to change your application's icon, as it appears in the taskbar. By default, it is a boring, generic "window" icon, and, naturally, you will want to change it!

But the instructions for doing so aren't very clear. This StackOverflow post gets us started, but the given (and accepted) solutions seem to advise the unnecessary extra step of saving a copy of the icon as a Python variable itself!

There is a simpler way, and it involves just a few lines of code. Inside the __init__() function of your Qt window's code, add the following:

from pathlib import Path
from PySide6.QtGui import QPixmap, QIcon #substitute the equivalent pyQT line if you're using pyQT

# load window icon
path_to_icon = 'res/icon.png' #the path to your icon file, relative to the project root
pixmap = QPixmap()
pixmap.loadFromData( Path( path_to_icon ).read_bytes() )
appIcon = QIcon(pixmap)
self.setWindowIcon(appIcon)

And voilà! You now have a pretty icon in the taskbar for your program.
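
One thing to watch for: a relative path like 'res/icon.png' is resolved against the current working directory, so the icon can go missing if the program is launched from somewhere other than the project root. A small sketch of one way around that, using only pathlib, is below. resolve_resource is a hypothetical helper, and the example path is made up; in a real program you would pass `__file__` as the anchor.

```python
from pathlib import Path


def resolve_resource(relative_path, anchor_file):
    """Build an absolute path to a resource that lives next to the given source file."""
    return Path(anchor_file).resolve().parent / relative_path


# Made-up location for demonstration; in your own window class, pass __file__ instead.
icon = resolve_resource('res/icon.png', '/opt/example_app/main_window.py')
print(icon.is_absolute())  # True
```

The resulting Path can be handed to `read_bytes()` exactly as in the snippet above.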

Some Hamburger Helper for Turbo and Bulma

Cheesy!


If you use Bulma CSS, Turbo, and Stimulus in your web project, you may have noticed that the hamburger menu in the navbar doesn't work, and that you need to supply your own JavaScript to make it work. While this is explained in the docs -- Bulma doesn't include any JavaScript, after all -- the plain JS example provided doesn't work on websites using Turbo.

This occurs because, on a new page load, Turbo replaces the click handler's DOM element with new elements, even if that aspect of the page hasn't changed, without firing DOMContentLoaded (or any other startup event). Meaning, we cannot use DOMContentLoaded to set our click handlers!

So we need something else: instead of DOMContentLoaded we can use Stimulus to add a little "hamburger helper" to the project and get that menu working!

First, we want to create a new Stimulus controller, which we will call navbar_controller.js. In it, we will create one action, toggleMenu(), which will handle the click event and control the hamburger menu by toggling the is-active class on the hamburger HTML elements. In that file, add the following code:

import {Controller} from '@hotwired/stimulus';

// Navbar toggle
// Adapted from https://bulma.io/documentation/components/navbar/#navbar-menu

export default class extends Controller {

    toggleMenu() {

        // Get the target from the "data-target" attribute of the burger element
        const target = this.element.dataset.target;
        const $target = document.getElementById(target);

        // Toggle the "is-active" class on both the "navbar-burger" and the "navbar-menu"
        this.element.classList.toggle('is-active');
        $target.classList.toggle('is-active');

    }

}

Then, in your navbar HTML code, find the <a role="button" class="navbar-burger"> tag that comprises the hamburger button and add the following data attributes:
  • data-controller: set this to the name of your Navbar controller: "navbar"
  • data-action: use this to bind the click event to toggleMenu(): "click->navbar#toggleMenu"
  • data-target: set this to the id of your navbar <div> element, which should have a class of navbar-menu. On this site, the element's id is "ze-navbar", so that would be the value for this data attribute.

Once you've done that, your .navbar-burger HTML element should look something like this:

<a role="button" class="navbar-burger" aria-label="menu" aria-expanded="false" data-controller="navbar" data-action="click->navbar#toggleMenu" data-target="[your navbar id here]">
  <span aria-hidden="true"></span>
  <span aria-hidden="true"></span>
  <span aria-hidden="true"></span>
</a>

And that should be it! Reload the page and everything should work nicely.

I know this seems extremely basic, but if you are like me and still a bit overwhelmed by all the quirks and eccentricities of building a web app in Turbo, you might be thankful to have it all spelled out for you. I certainly would have liked that!

sui generis.

Lyjia's Blog
