All posts by Sascha

Freebloks 3D reaches 500’000 installs

The Android version of Freebloks 3D has now reached 500’000 installs over its lifetime and is usually the top result when searching for “Blokus” in Google Play! WOOP WOOP!!

Thank you guys so much! The game is still fully ad-free and fully featured and hasn’t sold out.

Freebloks 3D is available for free in the Google Play Store.

If you’d like to support development, a paid “donation” version is available as Freebloks VIP.

WordMix 2.0

WordMix 2.0 is out! This is an overdue major update to my little word game for Android and includes a major face lift that brings the game into the world of Material Design. To keep this simple, I dropped support for Android versions below 5; a version supporting Android 4 is still available via Google Play Store. However, to get the newest features you need a device running Android 5 or later.

Of course this update includes a number of bug fixes and general improvements, but here are the major changes:

Material Design

Isn’t it pretty? The first thing you may notice is the immersive full screen mode, now spanning the status bar and extending over the navigation buttons, if present.

The action bar has been replaced with on-screen controls to give more space to the background pictures and the board. Likewise, the time progress bar has given way to a simple timer. A dark button theme is easy on the eye and yet leaves the focus on the game itself.

The game contains more background images and some of the old images have been replaced with new ones. All backgrounds are included in the free version; there are no more “premium” backgrounds.

This is just the first step, other screens will follow.

Bottom sheet for contextual information

Tapping a word will now reveal a contextual bottom sheet with information about the word. For example, it shows the number of points for this word, whether the word is valid and whether it has been played before, and it may include actions provided by other apps.

If, for example, the Google Translate or Wikipedia apps are installed, they appear as actions that allow the player to quickly look up or translate a played word.

Under the hood this simply shows any app that handles the ACTION_PROCESS_TEXT intent, the same way this is integrated into Chrome.

The word suggestion feature now shows a list of possible words and their points and lets the user choose one, with of course the option to first translate or look up the word to play. The chosen word can be set immediately from the bottom sheet, which is very convenient.

New languages

Two new languages and dictionaries are now supported. Any feedback or suggestions for new languages are always welcome!

  • Czech
  • Polish

Available in Google Play and Amazon App Store

WordMix Pro in Google Play Store
WordMix in Google Play Store

WordMix is (c) Sascha Hlusiak 2012-2019.

iOS VPN ondemand probes and caching

In our company we push out a VPN configuration to iOS devices that contains on demand rules to only connect to the VPN when not in the company network, based on various criteria.

TL;DR: unexpected caching behaviour of NEOnDemandRuleDisconnect.probeURL

We discovered that iOS (12.3.1) appears to aggressively cache the response of the probeURL when using NEOnDemandRuleDisconnect on demand rules in a VPN configuration.

The response is cached in certain situations, causing the rule to match when it shouldn’t and not to match when it should. This cache was only reset on rebooting the device and not when switching networks. We did not even see new DNS requests for the probeURL when switching between networks, which was even more problematic as we use Split-DNS for the resource that was probed.

The VPN on demand rules were evaluated every time the network changed, but the probe wasn’t sent and the rules still matched on the cached response.

  • When returning a 200 OK, make sure to add Cache-Control headers!!
  • Do not return 301 Moved Permanently!!

Checking for an HTTP resource

The “probe” resource was only available from within the company network, controlled by Split-DNS.

  // disconnect if "probe" returns 200 OK
  let rule1 = NEOnDemandRuleDisconnect()
  rule1.interfaceTypeMatch = .wiFi
  rule1.probeURL = URL(string: "http://company.com/probe")
  rules.append(rule1)

  // otherwise connect
  let rule2 = NEOnDemandRuleConnect()
  rule2.interfaceTypeMatch = .wiFi
  rules.append(rule2)

The probe returned a 200 OK when queried within the company network. Publicly, the DNS entry pointed to an S3 bucket which did not contain the probed resource.

NEOnDemandRuleConnect and probeURL

We use NEOnDemandRule subclasses such as NEOnDemandRuleConnect and NEOnDemandRuleDisconnect, which are attached to a VPN configuration. The documentation of the probeURL attribute is fairly sparse:

var probeURL: URL?

An HTTP or HTTPS URL. If a request sent to this URL results in a HTTP 200 OK response and all of the other conditions in the rule match, then the rule matches. If this property is nil (the default), then an HTTP request does not factor into the rule match.

https://developer.apple.com/documentation/networkextension/neondemandrule/1405981-probeurl

Apple’s Configuration Profile reference adds:

A URL to probe. If this URL is successfully fetched (returning a 200 HTTP status code) without redirection, this rule matches.

Cache rules

With extensive testing and sniffing we discovered the following cache behaviour of iOS for probeURL:

  • 301 Moved Permanently is cached forever.
  • 200 OK is cached forever if it contains content plus ETag and Last-Modified headers, but no Cache-Control header.

Once a response like this has been seen, iOS will not send another DNS or HTTP query and will use the result until the device is rebooted.

In the case of the 301, the rule will never match (as iOS will not follow redirects). In the case of the 200 response, the rule will always match, no matter what network the device is currently on.

Unexpected caching behaviour

Obviously the result of an on-demand probe should never be cached beyond the network boundaries, as that renders the probe pointless.

Further, we expected to see at least DNS requests for the resource when switching networks, as we use Split-DNS. Assuming that a response is still valid when DNS returns a different IP address is surprising.

When a resource returns ETag and Last-Modified headers and no further Cache-Control headers, we expected clients to validate whether the resource has been changed (304 Not Modified) rather than it being cached without validation.

Cache-Control headers

Adding the following header to our 200 OK response caused iOS to always request the resource when processing the on demand probes:

Cache-Control: no-cache, no-store, must-revalidate, max-age=0
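
To illustrate what serving such a probe could look like, here is a minimal sketch using the JDK's built-in com.sun.net.httpserver.HttpServer. The class name ProbeServer, the /probe path and port 8080 are assumptions made up for this example and are not necessarily how the real probe is served; the only part taken from above is the header value.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical probe endpoint: answers 200 OK and forbids caching.
final class ProbeServer {
    // Header value from the post; everything else here is illustrative.
    static final String CACHE_CONTROL =
            "no-cache, no-store, must-revalidate, max-age=0";

    // Starts the server on the given port (0 picks any free port).
    static HttpServer start(int port) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
            server.createContext("/probe", exchange -> {
                byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
                // Without this header, iOS cached our 200 OK until reboot.
                exchange.getResponseHeaders().set("Cache-Control", CACHE_CONTROL);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        start(8080); // assumed port, adjust as needed
    }
}
```

Requesting /probe with curl -i should then show the Cache-Control header on every response. Note that the sketch deliberately never answers with a redirect, since a 301 was cached just as aggressively.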

Selector.select() returns 0 immediately without blocking

TL;DR

If you use non-blocking IO and set SelectionKey.interestOps(0), a Selector will still wake up on POLLHUP | POLLERR (e.g. connection reset by peer), but the JDK / Android SDK cannot surface this condition to the caller and select() returns 0 instead, potentially causing a busy loop with 100% CPU utilisation.

Details

I was recently investigating an issue with an Android Selector going into a busy loop where select() would return immediately without any keys being selected.

From the Android API documentation:

This method performs a blocking selection operation. It returns only after at least one channel is selected, this selector’s wakeup method is invoked, or the current thread is interrupted, whichever comes first.

I knew that neither the current thread was interrupted nor that the selector was woken up, yet the select() returned with an empty selected key set, consuming 100% CPU in the resulting busy loop.

There are a few reports of similar problems, e.g.

In particular that first link suggests that other implementations have encountered this when using the JDK and found workarounds like recreating the entire selector based on a heuristic.

After trying many things and digging through all related Android/Java/JNI classes I have finally found the cause of the spurious wake ups, which appears to be a bug in the JDK or an illegal, yet undocumented, use case when using
selectionKey.interestOps(0);

Program flow pseudo code

Our selector loop detects when channels are ready for reading but offloads the actual reading of the data to a secondary thread.

do {
  // block until any channel is ready
  int ret = selector.select();
  // for any key that is ready
  for (SelectionKey key : selector.selectedKeys()) {
    if (key.isReadable()) {
      // make sure we are not waking up
      // until reading has finished
      key.interestOps(0);

      // offload the reading to a worker
      // the worker will call
      // key.interestOps(OP_READ) when done
      workQueue.add(key.channel());
    }
  }

  // make sure all selected keys are cleared
  // because we've handled them all
  selector.selectedKeys().clear();
} while (true);

As illustrated, once a key has been detected to be ready, we clear its interest in all operations and offload the read work to a secondary thread. We do this by calling key.interestOps(0) and passing the channel to a worker thread. When the worker thread has completed the read, it registers interest in reads again with key.interestOps(OP_READ).

We observed a situation where select() would constantly return 0, with the selectedKeys set being empty, causing 100% CPU load.

Missing support for interestOps(0) in the JDK

The JDK / Android SDK promises that select() will only wake up if a key is ready, the selector is woken up or the thread is interrupted, but none of these things had happened. The documentation does not mention any special handling of interestOps(0), so it can be assumed that this is a valid operation to perform.

Under the hood, Java and the Android SDK use poll(2) to block until a file descriptor is ready for I/O. The pollfd struct takes an events field (which in our case is 0), and poll populates the revents field with the events that actually occurred on the file descriptor.

I found that poll(2) wakes up with revents being POLLHUP | POLLERR as a signal that the remote channel is closed. This is a valid case even when the registered events field is 0, and any read on such an fd would return -1.

The Android AbstractPollSelectorImpl.java however FILTERS the nioReadyOps() by the nioInterestOps():

sk.channel.translateAndSetReadyOps(rOps, sk);
if ((sk.nioReadyOps() & sk.nioInterestOps())!=0) {
  selectedKeys.add(sk);
  numKeysUpdated++;
}

So even if nioReadyOps() returned a value, it would be masked out by the nioInterestOps(). Unfortunately, translateAndSetReadyOps() in SocketChannelImpl.java ensures that even nioReadyOps() is 0, because it is set to intOps in case of an error:

if ((ops & (Net.POLLERR | Net.POLLHUP)) != 0) {
  newOps = intOps;
  sk.nioReadyOps(newOps);
  // No need to poll again in checkConnect,
  // the error will be detected there
  readyToConnect = true;
  return (newOps & ~oldOps) != 0;
}

Endless loop on interestOps(0) and POLLHUP | POLLERR

A selector will wake up if a registered channel has been disconnected even if interestOps has been set to 0!

However, there is no way to access this condition from userland because it has been masked out by the zero of interestOps. The selector returns an empty selectedKeys set instead, even though poll signalled a ready file descriptor.
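
The masking described above can be modelled in a few lines of plain Java. This is only a sketch of the logic in the two quoted snippets, not the actual JDK code: the class MaskingDemo and its methods are made up for illustration, and the POLLERR / POLLHUP constants are the Linux poll.h values.

```java
import java.nio.channels.SelectionKey;

// Hypothetical model of the ready-ops masking; not the real JDK classes.
final class MaskingDemo {
    // Linux <poll.h> values, used here only for illustration.
    static final int POLLERR = 0x0008;
    static final int POLLHUP = 0x0010;

    // Models the error branch of translateAndSetReadyOps():
    // on POLLERR / POLLHUP the ready ops are set to the interest ops.
    static int readyOpsFor(int revents, int interestOps) {
        if ((revents & (POLLERR | POLLHUP)) != 0) {
            return interestOps;
        }
        return 0; // translation of other events omitted in this sketch
    }

    // Models the selector's filter: a key is only added to selectedKeys
    // if its ready ops and interest ops overlap.
    static boolean isSelected(int revents, int interestOps) {
        return (readyOpsFor(revents, interestOps) & interestOps) != 0;
    }

    public static void main(String[] args) {
        int hangup = POLLHUP | POLLERR;
        // With OP_READ registered, the hang-up surfaces as a selected key.
        System.out.println(isSelected(hangup, SelectionKey.OP_READ)); // true
        // With interestOps(0), poll still woke up but no key is selected.
        System.out.println(isSelected(hangup, 0));                    // false
    }
}
```

With any non-zero interest set the hang-up reaches the caller as a selected key, where the subsequent read reports the error. With interestOps(0) both the ready ops and the mask are 0, so the wake-up is invisible and the loop spins.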