Django compression middleware now supports Zstandard

I think I first learnt about Zstandard (zstd) from a blog post by Gregory Szorc. At some stage I saw that zstd is also registered in IANA’s HTTP content coding registry, and I tried to find out how much of the web ecosystem already supported it. At the time there was a patch for zstd support in Nginx, but nothing else, as I recall.

Things are not much better right now, but zstd has continued maturing and has been adopted for non-web use by many projects. I recently checked and found one HTTP client, wget2, that claims support for zstd. So I decided to add zstd support to Django compression middleware and to test it with wget2. That way I can be sure that at least one web client is able to consume what the middleware produces.

I released version 0.3.0 of Django compression middleware a few days ago with support for zstd. Since I don’t know of any browsers that support it yet, I don’t expect many people to be excited about this. There isn’t even information on the Can I use … website about zstd yet (github issue). However, I see this as my small contribution to the ecosystem.
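For anyone wanting to try it, enabling the middleware is a one-line change to Django settings. A sketch only; the dotted path below is my recollection of the project’s README, so verify it against the version you install:

```python
# settings.py (sketch): enable response compression for the whole site.
# The dotted path below is an assumption based on the project's README;
# verify it against the version you install.
MIDDLEWARE = [
    # Place it first so it can compress responses from everything below it.
    "compression_middleware.middleware.CompressionMiddleware",
    "django.middleware.security.SecurityMiddleware",
    # ... the rest of your middleware ...
]
```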

It is not clear that Zstandard will provide a massive win over the alternatives in all cases, but my testing on multiple HTML and JSON files suggests that it is mostly equal to or better than Brotli and almost always better than gzip with the defaults I currently use. “Better” here means a smaller payload produced in the same or less time.

Django compression middleware now supports zstd, Brotli and gzip.
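With several codings supported, the middleware has to pick one based on the client’s Accept-Encoding header. Here is a minimal, illustrative sketch of that kind of negotiation (not the middleware’s actual code; the function name and preference order are my own, and wildcard handling is omitted):

```python
# Illustrative sketch of Accept-Encoding negotiation, NOT the middleware's
# actual implementation. Prefers zstd, then Brotli, then gzip.

PREFERRED = ("zstd", "br", "gzip")  # server-side preference order

def choose_encoding(accept_encoding):
    """Return the first server-preferred coding the client accepts
    with a nonzero q-value, or None if there is no acceptable match."""
    qvalues = {}
    for token in accept_encoding.split(","):
        token = token.strip()
        if not token:
            continue
        coding, _, param = token.partition(";")
        q = 1.0
        param = param.strip()
        if param.startswith("q="):
            try:
                q = float(param[2:])
            except ValueError:
                q = 0.0  # malformed q-value: treat as not acceptable
        qvalues[coding.strip().lower()] = q
    for coding in PREFERRED:
        if qvalues.get(coding, 0.0) > 0:
            return coding
    return None

print(choose_encoding("gzip, deflate, br"))   # -> br
print(choose_encoding("zstd;q=0.9, gzip"))    # -> zstd
print(choose_encoding("gzip;q=0"))            # -> None
```

A real implementation also has to honour `*` and `identity;q=0`, but the shape of the decision is the same: intersect what the client accepts with what the server can produce, and pick by preference.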

Does the internet understand your language?

I heard an advertisement on the radio this morning in which a son is talking to his father about some commercial service. He points his dad to the web address and, since the ad is in Afrikaans and he spells the address in English, mentions “Die internet verstaan nie Afrikaans nie” (“The internet doesn’t understand Afrikaans”). I wonder why whoever wrote the ad felt that adding that bit to the copy would somehow improve it.

Of course, I mostly agree—the Internet doesn’t understand Afrikaans, but neither does it understand English or any other language. Maybe the organisation just feels a bit bad that they don’t have an Afrikaans presence on the web, or might not even know how easy it is to register another domain name as an alias to their main website.

On the other hand, software processing information on the web can do amazing things with it, in English, Afrikaans and other languages. I’m not trying to belittle the fact that technology support for languages is unequal, but domain names are just characters: you can type in whatever you want (ignoring for now the complexities of Internationalized Domain Names).

Working with language data is my bread and butter, so this was an unfortunate reminder of common perceptions about language and technology. I hope some listeners questioned it, or at least started thinking about how it could be changed.

My paper at OLC / DEASA

Yesterday I presented at the Open Learning Conference of the Distance Education Association of Southern Africa. The title of my paper was “Re-evaluation of multilingual terminology”. I tried to make the case that terminological resources can serve as more than reference resources, and I showed concrete examples of how they can also assist with conceptual modelling.

Ontology engineering is big business in the field of natural language processing, but I routinely still meet academics who think that terms with translations (maybe with definitions) are the highest goal we should strive for. My presentation was an attempt to provide a broadened vision.