• Blog
  • Archive
  • Github
  • Research
  • About
  • RSS
  • Avy Interactive

    The excellent avy package for emacs provides an efficient mechanism for positioning the cursor in the current window. The avy workflow is:

    1. Active avy via one of several possible functions
    2. Type a few characters corresponding to the location on screen you want to navigate to
    3. Avy will highlight each occurrence of those characters with navigation hints
    4. Type the one or two letters corresponding to the desired destination

    Avy typically enables navigation to any position on screen in four or five keystrokes. There are variants to highlight candidates based on a single character (avy-goto-char), two characters (avy-goto-char-2), a character at the beginning of a word (avy-goto-word-1), or an arbitrary number of characters entered within a short time limit (avy-goto-char-timer).

    I find that the one and two character variants produce too many matches for me to easily visually identify my target. As a result, I have been using avy-goto-char-timer, which allows you to enter any number of characters; after a timeout following the final character, avy activates and highlights matches for selection.

    The avy-goto-char-timer function, unfortunately, does not provide a means for correcting typos. This minor annoyance has kept me from using avy as much as I would like for navigation within a buffer. I finally decided to fix it by introducing my own variant:

    (defun tr/avy-prompt (s)
      "Prompt for an avy target and activate avy with it."
      (interactive "MAvy: ")
      (avy-process (avy--regex-candidates s)))
    

    This function (which I bind to C-]) simply uses the minibuffer to prompt for a string and activates avy on that string. Since the string is entered in the minibuffer, all normal editing commands work on it. A future variant could potentially use completing-read to provide a filtered list of candidates (e.g., via selectrum).

    Comment emacs | December 20, 2020
  • Less Funny Type Errors Issue 2

    This is part two of the presumably unending type errors series. The previous entry involved an amusing type error. This time the error is a bit less amusing:

    ghc: panic! (the 'impossible' happened)
      (GHC version 8.8.3 for x86_64-unknown-linux):
            exprToType
      Call stack:
          CallStack (from HasCallStack):
            callStackDoc, called at compiler/utils/Outputable.hs:1159:37 in ghc:Outputable
            pprPanic, called at compiler/coreSyn/CoreSyn.hs:1997:28 in ghc:CoreSyn
    
    Please report this as a GHC bug:  https://www.haskell.org/ghc/reportabug

    Unfortunately, there was no additional context to help narrow down the problem. The file in question has a few hundred lines of code. My starting set of observations was:

    • The code compiles and works on GHC 8.6.5 (so can’t be too bad)
    • It used to compile and work under GHC 8.8.3 (so there were no compile errors)
    • There were some warnings (but they didn’t seem particularly relevant)

    I had vague memories of seeing this error before in another context and moving code around until the error went away. I don’t remember any details besides that approach to debugging being unsatisfying. Ultimately, I took a guess that:

    1. GHC was attempting to print a warning related to inferred types
    2. GHC has recently started warning about type family applications appearing as kinds

    There was one function in the file in question that had an explicitly quantified kind variable meeting those conditions. Adding an explicit (dependent) kind annotation to that variable caused the crash to go away.

    Comment haskell | May 19, 2020
  • Funny Type Errors Issue 1

    I find that I run into a large number of what I would consider Funny Type Errors while using GHC. This is the first post in what will presumably be a long series of posts on the topic.

    Today I was working on trying to improve the performance of a codebase that uses a large number of the type system extensions provided by modern GHC including -XGADTs and -XTypeInType. Seeing strange type errors in this code is expected on a daily basis. However, the code I was working in today was not doing anything too unusual. A reduced example of the surprising code is:

    {-# LANGUAGE GADTs #-}
    {-# LANGUAGE KindSignatures #-}
    {-# LANGUAGE TypeInType #-}
    {-# OPTIONS_GHC -Wall -Wcompat #-}
    import qualified Data.Functor.Const as C
    import           Data.Kind ( Type )
    
    data Pair (a :: k -> Type) (b :: k -> Type) where
      Pair :: a tp -> b tp -> Pair a b
    
    data MonoStruct =
      MonoStruct { item :: Pair (C.Const Int) (C.Const Int)
                 , otherData :: Int
                 }
    
    getFirst :: MonoStruct -> Int
    getFirst ms =
      let Pair a _b = item ms
      in C.getConst a
    
    update :: MonoStruct -> Int -> MonoStruct
    update ms f = ms { item = Pair (C.Const f) (C.Const f) }

    Naively, I expected this to compile. Instead, GHC (8.6.5 and 8.8.3) report the errors:

    ❯ ghci poly.hs
    GHCi, version 8.8.3: https://www.haskell.org/ghc/  :? for help
    [1 of 1] Compiling Main             ( poly.hs, interpreted )
    
    poly.hs:18:19: error:
        • Cannot use record selector ‘item’ as a function due to escaped type variables
          Probable fix: use pattern-matching syntax instead
        • In the expression: item ms
          In a pattern binding: Pair a _b = item ms
          In the expression: let Pair a _b = item ms in C.getConst a
       |
    18 |   let Pair a _b = item ms
       |                   ^^^^
    
    poly.hs:22:15: error:
        • Record update for insufficiently polymorphic field:
            item :: Pair (C.Const Int) (C.Const Int)
        • In the expression: ms {item = Pair (C.Const f) (C.Const f)}
          In an equation for ‘update’:
              update ms f = ms {item = Pair (C.Const f) (C.Const f)}
       |
    22 | update ms f = ms { item = Pair (C.Const f) (C.Const f) }
       |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    

    This is very strange: the MonoStruct type does not contain any polymorphic fields. In fact, I called it MonoStruct in this example because it looks entirely monomorphic. To dig a little deeper, we can ask ghci for some additional information:

    Prelude> :set -fprint-explicit-foralls
    Prelude> :set -fprint-explicit-kinds
    Prelude> :load poly.hs
    [1 of 1] Compiling Main             ( poly.hs, interpreted )
    Ok, one module loaded.
    *Main> :info MonoStruct
    data MonoStruct
      = forall {k}.
        MonoStruct {item :: Pair k (C.Const k Int) (C.Const k Int)}
       -- Defined at poly.hs:11:1
    

    This shows us what is really happening. Note the definition of Pair: it takes two data items that are indexed by a poly-kinded type parameter (k). This has an unfortunate interaction with the use of Const as the data elements, as Const is also kind polymorphic. This means that while the type of the Pair in MonoStruct looks monomorphic, the kind of the pair could change on update, which causes the error we see from ghci. As a concrete example, we could populate our Pair with Const Int Int or Const Int '[Natural]. Specializing the type put into the Pair removes the unwanted polymorphism:

    {-# LANGUAGE DataKinds #-}
    {-# LANGUAGE GADTs #-}
    {-# LANGUAGE KindSignatures #-}
    {-# LANGUAGE TypeInType #-}
    {-# OPTIONS_GHC -Wall -Wcompat #-}
    import           Data.Kind ( Type )
    
    data Tag = A | B
    
    data ConstInt (k :: Tag) = ConstInt { getConstInt :: Int }
    
    data Pair (a :: k -> Type) (b :: k -> Type) where
      Pair :: a tp -> b tp -> Pair a b
    
    data MonoStruct =
      MonoStruct { item :: Pair ConstInt ConstInt
                 }
    
    getFirst :: MonoStruct -> Int
    getFirst ms =
      case item ms of
        Pair a _b -> getConstInt a
    
    update :: MonoStruct -> Int -> MonoStruct
    update ms f = ms { item = Pair (ConstInt f) (ConstInt f) }
    Comment haskell | April 19, 2020
  • Metablog 2019

    Instead of writing about something productive, I am going to again write about writing. Or at least the infrastructure for writing. First, I decided to switch comments from Disqus to utteranc.es, which is a clever approach to hosting comments for statically-generated sites. For each post, the script creates a GitHub issue and adds each comment to the issue. There are no ads, and it is all around more civilized. Moreover, I appreciate the simplicity of the model. Not that I expect an excessive comment load.

    Next, I updated the version of the Hakyll static site generator from something very old to the latest as of the time of writing. I am very pleased to report that it did not require any code changes. I can only aspire to that level of stability in my own code. Despite not requiring any code changes, I took the opportunity to change one line of code to enable support for writing in Emacs org-mode instead of markdown.

    Comment haskell | February 20, 2019
  • Emacs Startup Optimization

    Recently, I had the urge to optimize the startup time of my emacs configuration. I managed to reduce my startup time from three seconds to under half a second. It was a fun exercise even though I don’t really start emacs that often. Even that minor reduction makes starting a new emacs instance (e.g., from mutt or in a terminal) much more pleasant. I could use the server functionality in emacs (along with emacsclient) to always connect to a single persistent instance, but I like the isolation of separate processes. I’ve had the server crash before bringing down all of my sessions, which is fairly annoying. Most of the startup time savings came from systematically employing lazy package loading through John Wiegley’s excellent use-package. Besides making it easy to properly load packages lazily, use-package also handles package installation and configuration.

    use-package encourages a configuration style that isolates the configuration for each package in a use-package form. Moreover, the configuration of each package is split into initialization steps taken at emacs startup time (via the :init section) and configuration steps executed after the package is loaded (via the :config section). Packages can also be loaded lazily, which significantly improves startup time. By default, use-package forms that bind keys or are associated with a specific file type are loaded lazily.

    After doing a bit of profiling with esup, which is an emacs startup profiler, I found a few surprising initialization steps that were taking up most of my startup time. The most expensive parts are difficult to optimize further: theme loading, font setting in GUI mode, and initialization of the package library. Aside from that, cc-mode was one of the most expensive things to load; moving more of the setup from :init to :config took care of that.

    There was at least one interesting and unexpected consequence of my use-package configuration that I did not initially appreciate. I decided to set

    (setq use-package-always-ensure t)

    This option tells use-package to automatically download and install packages that are not already installed. This is convenient to bootstrap a new emacs installation or to bring an existing one up to date with the latest changes to your emacs configuration. It has side effect of breaking when the package being set up is not in one of the emacs package repositories (i.e., local packages). This came up while setting up a major mode for netlogo. It eventually occurred to me that I could tell use-package to not try to install this package by setting :ensure nil in the use-package form:

    (use-package netlogo-mode
      :ensure nil
      :load-path "~/.config/local/emacs.d/lisp"
      :mode ("\\.nls$\\|\\.nlogo$" . netlogo-mode))

    In the process, I cleaned out old configurations I didn’t need anymore and enabled a few new packages including:

    • swiper (for sophisticated finding and replacing)
    • avy (for quick navigation)
    • switch-window (for faster movement between windows)

    It was fun to refresh my entire configuration. I should probably do that more often.

    Comment emacs | September 12, 2016
  • Emacs and Terminal Pasting

    Over the last few weeks, I’ve found myself having to copy and paste large chunks of text into emacs running in a terminal. This leads to some annoyance, as each character pasted triggers a keystroke. In particular, every newline triggers indentation via newline-and-indent. This is very annoying, as the indentation usually gets a bit messy and turns into an ugly staircase of text. Vim has a solution to this problem via a command :set paste, which turns off indentation while pasting. More generally, most terminals support what is known as bracketed paste to enter a large string of text as one virtual keystroke. I recently discovered the emacs equivalent: bracketed-paste mode on melpa. Now a long standing annoyance is fixed with a simple:

    (require 'bracketed-paste)
    (bracketed-paste-enable)
    Comment emacs, linux | September 3, 2016
  • Linux Messaging Clients

    I recently found myself looking at Linux instant messaging clients again. Sometimes I feel like some kind of Luddite for still using a desktop instant messaging client, but I just cannot bring myself to use a web application. I have used pidgin for as long as I can remember, but I have been looking for an alternative with better support for primarily keyboard interaction. My biggest complaints about pidgin center around odd focus issues that sometimes seem to require a mouse to fix.

    Unfortunately, none of the GUI clients seem to offer decent keybindings. That leaves console clients. I have tried finch, which isthe console frontend to libpurple (which is itself the library that provides all of the interesting functionality in pidgin). finch comes with an interesting console window manager built on top of ncurses called gnt. I found the default UI and keybindings to be maddening, even more than most normal GUI applications. That defeated the purpose of switching.

    gnt has a surprising feature: it supports alternative window manager layouts through plugins. I tried to use the one that mimics an irssi-style client. This interface superficially looked like irssi, but the interaction model was still annoying and required awkward focus changing commands. Moreover, it interacted badly with terminal resize events. Whenever I would move the terminal to another monitor with xmonad, the subsequent resize event would jumble the window positions. This caused awkward overlaps, which ruined the irssi-like illusion that the window manager plugin was trying to present. It turns out that the plugin really just carefully arranges windows to mimic the irssi interface, but only on application startup. It would not reflow the layout in response to changes.

    For now, I have settled on profanity. Profanity is a console client that also mimics the irssi interface. It is significantly more successful at doing so than finch. The only downside to profanity is that it only supports a single account. It is not a huge burden to use tmux to manage a few instances for different accounts, though I would not object to multi-account support. It is worth noting that profanity only supports XMPP, which is fine for my purposes. It seems to support all of the XMPP extensions that I need (and then some), including multi-user chats. I had to spend a bit of time configuring things properly, but I eventually managed to get it working well with tmux and my xmonad configuration. Most of this configuration was related to notifications. I use X window urgent hints to show activity notifications in my xmonad bar (taffybar). Profanity easily supports this mode of operation with the /flash configuration option. There was a bit of an issue with running it in a tmux session, though; tmux would set the urgent hint when there was any console activity at all. Annoyingly, this included clock updates in the profanity status bar. I had to disable that with the /time configuration option, which was unfortunately a new addition as of 0.4.7. Of course, the version shipped with Ubuntu 15.10 is 0.4.6.

    Overall, I am pleased with profanity so far. We shall see if I can stand it in a week.

    Comment linux | November 9, 2015
  • Unusual Emacs keybindings

    I recently ran into a strange problem with running emacs. A few days ago, my delete, page up, and page down keys stopped working, but only when emacs was run in the terminal. Whenever I hit one of those keys, emacs would complain that the region was empty. This was baffling for a while. Searching the internet turned up nothing, so I had to assume that it was something I had done. One further symptom manifest: when I used my M-[ keybinding, the bound function would execute and it inserted 3~ into the buffer. That was unusual.

    It turns out that binding M-[ to anything is a bad idea because Meta and Escape are the same in emacs. It also turns out that Escape-[ is the prefix for a few keycodes, including delete, page up, and page down. My delete key was being intercepted and invoking my keybinding. When the associated function was successful, the rest of the delete keycode was inserted into the buffer.

    Lesson learned: do not bind M-[.

    Comment emacs | February 13, 2015
  • TAGS for Haskell

    I periodically try to use TAGS to navigate round my code in emacs. When it works, it is very convenient. I have not been using them lately, partly because generating the tags and keeping them up-to-date has always been a bit fiddly. In the past, I have tried to get emacs to automatically regenerate my tags when I save changes to a file. I have had solutions that work to a greater or lesser extent, but they are always a bit unsatisfying. Inevitably, I end up having to write an extra shell script to generate the tags. Usually, I need a script because I often work in projects that span more than one repository or cabal project. The simple solutions I have seen for tag management all assume that all of your code lives under one project root.

    This time, given that I will probably require some kind of shell script anyway, I decided to try a more UNIX-y approach. My tag generation script now uses inotifywait (from inotify-tools) to watch for filesystem changes and rebuild my tags when necessary:

    #!/bin/bash
    
    DIRS="*/src"
    EVENTS="-e close_write -e move"
    
    inotifywait ${EVENTS} -mr ${DIRS} | while read event ; do
      hasktags -e -o TAGS ${DIRS}
    done;

    This says: whenever a file under src in my repository is closed (after some writes) or moved, run hasktags to rebuild the TAGS file. So far, it works remarkably well and emacs does not need to know anything about the process. inotifywait only works on Linux; I am still looking for a corresponding utility for OS X.

    While this approach produces a nice TAGS file, I still have a small problem with the find-tag function in emacs. This function lets you jump to the tag for the identifier under the cursor. The default function does not know anything about the qualified module imports typical of Haskell code, and includes the module qualifier in the suggested tag name. This does not work, because the TAGS file does not record the qualifier (and it cannot do so, since you can choose a different qualifier each time you import the module). You can always type in whatever name you want when using find-tag, but having the default chosen by find-tag would be ideal (and faster). To get around this, I stole and adapted a bit of code from clojure-mode that addresses this very same problem:

    (defun find-tag-without-module (next-p)
      (interactive "P")
      (find-tag (first (last (split-string (symbol-name (symbol-at-point)) "\\.")))
                next-p))
    
    (eval-after-load "haskell-mode"
      '(progn
         (define-key haskell-mode-map (kbd "M-.") 'find-tag-without-module)))

    So far, everything is working well.

    Comment haskell, emacs, linux | February 8, 2015
  • Arrays in Haskell

    Relative to many other languages, the humble array is underrepresented in Haskell. That may follow from the difficulty of fitting arrays into pure code; an entire array to update a single element rarely makes sense. There are some immutable data structures that look like arrays, but are not actually. Mutable arrays (in ST or IO) make more sense, so I want to write up a few notes about them.

    Haskell has two general purpose array libraries: array and vector. There are also some special-purpose libraries like repa, which enables efficient (and implicitly parallel) numeric computations. The vector package provides a flexible and efficient array abstraction, with support for fusing many operations. In contrast, the array package has a reputation as archaic and slightly odd; certainly, the API specified in the Haskell report is strange when compared with the API provided by vector, which more closely mirrors the APIs provided by containers and the other data structure libraries in the Haskell ecosystem. That is a fair criticism. However, if you do not need the features in vector, array is still worth a look.

    Why Array

    If you just need contiguous (and possibly unboxed) storage of values with Int (or Int-like) keys with support for O(1) reads and writes, array is still a very good choice. For this use case, array even has a significant advantage over vector: the index into the array can be any type implementing the Ix typeclass (for “indexable”). This means that you can define extra types (perhaps as a newtype around Int) as array indexes. I have found this to be immensely useful, as it prevents me from accidentally indexing an array with the wrong Int that I had lying around.

    Unboxed data

    Safe indexing and O(1) updates are undeniably useful. However, one of the real strengths of arrays (and vectors) is that they support unboxed data. In a boxed array, elements are stored via pointer. In an unboxed array, the elements are stored directly in the allocated array. This has a few nice implications:

    • it removes a layer of pointer indirection,
    • the elements are all stored contiguously, and
    • the garbage collector does not need to scan the elements.

    In short, if you can store your data in an unboxed format, it is usually worth it. There is one restriction: you cannot store unevaluated thunks in an unboxed container. Since that extra layer of indirection has been removed, all of the elements are always fully evaluated. In many cases, this is actually also a good thing.

    Both array and vector have a default set of instances allowing them to store most of the unboxable types in base. But what if we want to store our own types in unboxed arrays? The answer turns out to be a little complicated.

    Unboxed mutable vectors

    The vector package has a few classes that need to be implemented for your type T (in the case of mutable vectors):

    • Data.Vector.Generic.Mutable.MVector Data.Vector.Unboxed.Mutable.MVector T
    • Unbox T

    It really is not obvious how to implement them by hand. You need to root around in the implementation of vector to figure it out. Luckily, these instances can be derived with GeneralizedNewtypeDeriving. Or, they could, until ghc 7.8 made GeneralizedNewtypeDeriving safe. If you are willing to use Template Haskell, the vector-th-unbox package can generate the necessary instances.

    Unboxed mutable arrays

    The story for array is not much simpler. I could not find much documentation on this issue, so hopefully this post will shed some light on these darker corners of the library ecosystem. In order to store a type T in an unboxed mutable array, you only need one instance (at least for the IO variant):

    • MArray IOUArray T IO

    This is a bit problematic. First, it does not really fit the form we would expect for a (GeneralizedNewtypeDeriving) deriving clause. Normally, we would expect the T to come last, but MArray is just not defined that way. That is okay – we can work around that with StandaloneDeriving:

    {-# LANGUAGE GeneralizedNewtypeDeriving #-}
    {-# LANGUAGE MultiParamTypeClasses #-}
    {-# LANGUAGE StandaloneDeriving #-}
    
    import GHC.IO ( IO(..) )
    import qualified Data.Array.IO as IOA
    
    newtype T = T Int
    
    deriving instance IOA.MArray IOA.IOUArray T IO

    We have to import the constructor for IO these days, or GHC complains. It turns out that GHC still complains:

        Can't make a derived instance of ‘IOA.MArray IOA.IOUArray T IO’
          (even with cunning newtype deriving):
          cannot eta-reduce the representation type enough
        In the stand-alone deriving instance for
          ‘IOA.MArray IOA.IOUArray T IO’

    I confess that I do not really know what that means, aside from it just not working. Perhaps GHC does not know what parameter it is supposed to be deriving for. In any case, that leaves us with one option: we get to write the instance ourselves. Doing so is unpleasant, but at least it is fairly obvious from the existing instances defined in array. For our type T, which is a newtype around Int, the instance looks like the following:

    
    import qualified GHC.Base as B
    import GHC.ST ( ST(..) )
    import Control.Monad.ST ( stToIO )
    
    import qualified Data.Array.IO as IOA
    import qualified Data.Array.IO.Internals as IOA
    import qualified Data.Array.MArray as MA
    import qualified Data.Array.Base as BA
    
    instance MA.MArray (BA.STUArray s) T (ST s) where
      {-# INLINE getBounds #-}
      getBounds (BA.STUArray l u _ _) = return (l, u)
      {-# INLINE getNumElements #-}
      getNumElements (BA.STUArray _ _ n _) = return n
      {-# INLINE unsafeRead #-}
      unsafeRead (BA.STUArray _ _ _ marr#) (B.I# i#) = ST $ \s1# ->
        case B.readIntArray# marr# i# s1# of
          (# s2#, e# #) -> (# s2#, T (B.I# e#) #)
      {-# INLINE unsafeWrite #-}
      unsafeWrite (BA.STUArray _ _ _ marr#) (B.I# i#) (T (B.I# e#)) = ST $ \s1# ->
        case B.writeIntArray# marr# i# e# s1# of
          s2# -> (# s2#, () #)
      unsafeNewArray_ (l, u) = BA.unsafeNewArraySTUArray_ (l, u) BA.wORD_SCALE
    
    instance MA.MArray IOA.IOUArray T IO where
      {-# INLINE getBounds #-}
      getBounds (IOA.IOUArray arr) = stToIO $ BA.getBounds arr
      {-# INLINE getNumElements #-}
      getNumElements (IOA.IOUArray arr) = stToIO $ BA.getNumElements arr
      {-# INLINE unsafeRead #-}
      unsafeRead (IOA.IOUArray arr) i = stToIO $ BA.unsafeRead arr i
      {-# INLINE unsafeWrite #-}
      unsafeWrite (IOA.IOUArray arr) i e = stToIO $ BA.unsafeWrite arr i e
      unsafeNewArray_ bounds = stToIO (IOA.IOUArray <$> BA.unsafeNewArray_ bounds)

    Entirely unpleasant, but worth the effort. There are a few important things to note about these instances:

    • You need to replace calls to readIntArray/writeIntArray and the I# constructors with variants appropriate for your underlying type. If you have a type that is not a newtype around some existing type, you will need to write your own variants.
    • The wORD_SCALE function is defined in Data.Array.Base, along with a few others. You also need to choose the appropriate scale (which is basically the stride in the array) for your type. Again, you may need to define your own for custom types.
    • While it may look like a function you would never call, you still need to implement unsafeNewArray_. The default implementation of newArray, which you probably do not want to implement, calls it. The default implementation of unsafeNewArray_ fills the array elements with a call to error. This is fine for boxed arrays, but immediately bottoms out when it is evaluated to be put into an unboxed array. The call to unsafeNewArraySTUArray_ just allocates some memory and leaves its contents undefined in the traditional C sense.

    The substantive definition is for ST, with a trivial wrapper lifting it into IO. You probably want to keep all of the INLINE pragmas, as most of this code will get compiled away nicely.

    Conclusion

    Unboxed data can be a bit of a pain to work with in Haskell (especially with the recent changes to GeneralizedNewtypeDeriving), but it is often worth it when performance is important. I have tried to document the steps required to unbox custom types with the array library. I also argue that array is still a useful library, despite its quirky API.

    Comment haskell | January 24, 2015