Wishlist for XMPP

I've been using XMPP since ~2017, and onboarding people to it for just as long. In spite of the many setbacks facing XMPP - low hype, low funding, and the world's pervasive ignorance and apathy towards free software, privacy, and decentralization [1] - the XMPP community continues to improve the clients and the ecosystem in general.

[1] You might think it's a problem restricted to the unwashed masses, but no - even the tech community is afflicted. Even the free software community is afflicted, or they wouldn't all be using GitHub, Telegram, Discord, and Signal like lambs to the slaughter. Hence, "pervasive".

Here are some things I would like to see in the future. Some of them come from user feedback, given by both casual and technical users. Others come from my own experience in using XMPP and onboarding others.

Note that I don't use iOS, so despite the occasional points for iOS, my own experience is mostly limited to Android and desktop Linux.

Points of friction in onboarding

  1. Quicksy for iOS. (Monal is working on it.) Done as of 22nd August 2024! An iOS version of Quicksy based on Monal is now available on the App Store. https://fosstodon.org/@Monal/113004853520449161

    Quicksy is currently our best bet at onboarding the masses who don't use password managers, are prone to forgetting passwords, and consider adding contacts to be unnecessary labor.

    • Actually, we need to be able to create a Quicksy account from any client. Monocles Chat, for example, adds many oft-requested features to Conversations.
  2. Decentralized and private contact discovery that does not rely on phone numbers.

    Quicksy is centralized. Movim's idea of suggesting contacts on the same server is also centralized.

    An idea originating in the JoinJabber, Prav, and diasp.in communities is to suggest adding (or to automatically add) members of the private group chats you are in to your contacts. This can work quite well:

    1. Most people who onboard others usually invite them to one or more private groups.
    2. It also takes care of suggesting contacts from the correct "circles".

    Some claim that "members of a group chat can already add each other manually", but that's missing the point.

    1. Not suggesting contacts from private groups affects discoverability - it hides the action away behind additional steps. In my experience, most people don't add mutual contacts in private groups, even when they know them. Most people have never even opened the channel details screen. [2] Relatively dedicated users will find a way to add others, but the majority of people won't.

       [2] You have to realize the (low) significance Jabber holds for most people - in most cases, it's "a weird app I have to use because of this weird person who doesn't use WhatsApp." We must accept and adapt to this reality if we are to change it.

    2. Adding mutual contacts from group chats is manual labor which people neither want nor need to perform. I've tried getting people to manually add mutual contacts, and the response is usually, "This app [sic] is too much work."

      Software should always aim to reduce labor performed by users.
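
The suggestion idea above can be sketched roughly as follows. This is only a minimal illustration, not how any existing client does it: `suggest_contacts`, its parameters, and the example JIDs are all made up for the sketch, and it assumes the client can see the real JIDs of members in the private rooms.

```python
from typing import Dict, Iterable, List, Set


def suggest_contacts(
    own_jid: str,
    roster: Iterable[str],
    private_rooms: Dict[str, Set[str]],
) -> List[str]:
    """Return JIDs seen in the user's private rooms that are not in the roster.

    Suggestions are ordered by the number of private rooms shared with the
    user (most first), then alphabetically, so that people from several
    shared "circles" come first.
    """
    known = set(roster) | {own_jid}
    shared_rooms: Dict[str, int] = {}
    for members in private_rooms.values():
        if own_jid not in members:
            continue  # only consider rooms the user is actually a member of
        for jid in members - known:
            shared_rooms[jid] = shared_rooms.get(jid, 0) + 1
    return sorted(shared_rooms, key=lambda jid: (-shared_rooms[jid], jid))


# Hypothetical example data: room JID -> set of member JIDs
rooms = {
    "family@muc.example.org": {"me@example.org", "mum@example.org", "sis@example.org"},
    "hiking@muc.example.org": {"me@example.org", "sis@example.org", "ravi@example.org"},
    "other@muc.example.org": {"stranger@example.org"},
}
suggestions = suggest_contacts("me@example.org", ["mum@example.org"], rooms)
print(suggestions)  # ['sis@example.org', 'ravi@example.org']
```

A client could show this list as "people you may know" on first launch, or add the entries automatically if the user opts in.
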

  1. OMEMO should be enabled by default, which will decrease the number of onboarding steps for each contact.
  2. The first OMEMO message to/from a new contact often can't be decrypted.
  3. On a fresh install, all existing or discovered contacts should be added to the chat list. When a new contact is added, the contact should be added to the chat list.

    Most people see the empty chat list and don't know how to proceed.

    • Conversations and forks have a fairly obvious "start chat" floating action button (FAB), but new users don't seem to notice it.
    • The UI should reduce manual work - hitting the FAB creates additional work during the critical onboarding task of starting a chat.
    • The current UI (empty and manually-populated chat list) also makes their existing contacts less discoverable, by hiding them in the contacts/new chat screen.
    • In a similar vein, Conversations and/or some of its forks open the contacts menu when there are no chats in the chat list. But I think having a populated chat list on fresh installs is still better UX…

Conversations and forks

  1. Contacts frequently miss friend requests. Contact requests should be made more prominent, and should perhaps be made notifications.
  2. Power and data savers.
    1. The battery optimization dialogue can be missed, misunderstood, or dismissed too easily. It can be accessed again in "Manage accounts", but that's too well-hidden. [3] It should be a persistent notification, as well as a message in the chat list. To ensure that people read it, perhaps there should be a 3-second timer before the dialogue can be dismissed.

       [3] Despite using Conversations since 2017, I didn't discover this until this year. 🙂

    2. Lock the app to recent apps, in interfaces which support it.
    3. Exempt clients from OEM-specific power/data savers.

Features hindering use (post-onboarding)

General enhancements

These affect both casual and power users.

  1. Conversations and forks - jump to search result. (like Gajim)
  2. Better reply UI.
    1. Swipe to reply in mobile clients. (like Monocles)
    2. Make it visually different from quotes (includes showing the OP's name, time, etc). (like Gajim)
    3. Click/Tap to jump to original message. (like Gajim and Cheogram)
    4. Ability to reply to a message without quoting the whole thing.
      • Allow the user to trim down the quoted text to only the part that is actually relevant to the reply.
      • Clients should not include nested quotes when replying. Otherwise, the quoted text becomes progressively larger, which just adds noise to the chat.
      • Clients should avoid notifying users when their name is mentioned in a quote.
    5. Edit your message to change the message you replied to. Useful when you forget to select a message to reply to, or respond to the wrong message.
    6. Reply to files.

      [Image: tmp-1722805365175.jpg]

      Monocles Chat's UI for this is rather basic and messy, as it merely quotes the full download URL, which can be pretty long and not very informative. To figure out what file somebody is talking about, you have to go through the tedious process of copying the download URLs of possible files and checking them against the quoted URL.

      [Image: Screenshot from 2024-08-05 02-45-23.png]

      Gajim's UI is better (you can tap to jump to the file, and the reply is just the file name), but does not show the file preview, which might be preferable.

  3. Better call experience. (Observed in Conversations and forks.)
    1. If you are on an XMPP call and receive a PSTN call, the XMPP app immediately stops receiving your audio. You can hear your contact, but they can't hear you - you can't even tell them that you'll call them back later. You just have to hang up, or wait for the PSTN call to end.
    2. You must connect your Bluetooth headset before you answer a call. Once you answer it, the client does not route audio to the headset, even after the phone has connected to it.
    3. There's no way to switch from phone to Bluetooth device or vice versa during a call.
  4. Correct any message you sent - not just the last one, but the last N messages. Even if you sent it from one client, you should be able to correct it in another. Restarting a client shouldn't prevent you from editing your messages, either.

    Monocles has supported this since v1.7.11.

  5. End-to-end encryption should retain full history on new devices.

    Ideally this should be compatible with OMEMO, rather than requiring people to use a different encryption protocol. There's discussion about implementing client-to-client MAM to achieve this.

    If not, we should move to OpenPGP for XMPP (OX), if the good parts of OMEMO UX can be achieved with it.

  6. Message moderation. Many clients support it, with the notable exception of Conversations and Quicksy.

    Message moderation is necessary to provide a safe and friendly environment for new users. Nobody needs to see shock porn spam, and certainly not casual users.
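
For the reply wishes in item 2 above (trimming the quoted text and never nesting quotes), here is a minimal sketch of how a client might build the quoted part of a reply. It assumes quotes are plain lines starting with `>`, as in Message Styling; `build_quote` and its parameters are hypothetical names, not any client's actual code.

```python
from typing import Optional


def build_quote(body: str, selection: Optional[str] = None) -> str:
    """Build the quoted part of a reply.

    - If the user selected part of the message, only that part is quoted.
    - Lines that are themselves quotes (start with '>') are dropped,
      so quotes never nest and never grow with each reply.
    """
    source = selection if selection else body
    kept = [
        line for line in source.splitlines()
        if not line.lstrip().startswith(">")
    ]
    # Drop blank lines left at either end after removing nested quotes
    while kept and not kept[0].strip():
        kept.pop(0)
    while kept and not kept[-1].strip():
        kept.pop()
    return "\n".join(("> " + line) if line else ">" for line in kept)


original = (
    "> Are we meeting on Friday?\n"
    "Yes, 6 pm at the usual place.\n"
    "Bring the laptop."
)
print(build_quote(original))
print(build_quote(original, selection="Bring the laptop."))
```

The first call quotes only the two lines the sender actually wrote; the second quotes just the selected sentence.
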

Features for casual users

These are features which privacy-apathetic contacts cite as reasons for using WhatsApp rather than XMPP (even when they have the contact in question in their XMPP roster).

  1. Push is either broken on iOS, or is too easy to misconfigure. As a result, many of my Siskin and Monal contacts don't receive messages and calls until they open the app, which makes people less likely to use XMPP in general.
  2. Group AV calls in Conversations, Monocles Chat, and Gajim. (like Dino)
  3. Image galleries/grouped file transfer
    • When you open an image, you should be able to scroll between images in the same chat.

      Ideally this should be possible with any image viewer of the user's choice (I'd much prefer to use Fossify Gallery, which has all the features I want and which I've configured to my liking, rather than a built-in image viewer which will probably have fewer features; adding those features will just increase the burden on maintainers) - but from what I've heard, Android restrictions make this impossible.

    • Add caption to sent files/groups of files
  4. Message reactions. (like Gajim)
    • Monocles supports sending reactions, but does not yet have the expected reactions UI to display them.
  5. Live location sharing.
  6. WhatsApp-style "status". The groundwork is there with Movim's social media XEP(s) - other clients need to add support.
  7. Share contacts from within the app. A number of users have asked me how to share contacts from XMPP. They are used to being able to initiate sharing from within WhatsApp - they have no idea they can share contacts from any contact manager app.

    (Naturally, I can and do teach them how to share from the contacts app…but how many can we teach?)

  8. When Android users share something, the system recommends contacts to share with (in addition to applications). From what I can make out, XMPP clients don't hook into this functionality, which affects discoverability and creates additional work - first to find the entry for the XMPP app, and then to find the contact in the app.
  9. Better file transfer workflow.
    1. (like WhatsApp) If a video is larger than the upload file size limit, offer to split it into two or more files before sending.
    2. (like Element) Each time you share images or videos, ask the user whether to resize/compress them, or send them as-is. The last choice should be the default for the next prompt.
  10. Delete messages from other devices. (like Monocles Chat) Some users literally use WhatsApp as a personal notetaking app because of this functionality 🤔
  11. Polls. (Message reactions can work as a stopgap, so this is low on the list.)
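
The file-splitting wish in item 9 boils down to simple arithmetic, sketched below. This is only an illustration under assumptions: `plan_parts` is a hypothetical helper, the upload limit is assumed to be known to the client, and plain byte ranges of a video are not individually playable, so a real client would have to cut on media boundaries with a proper media library.

```python
from typing import List, Tuple


def plan_parts(file_size: int, max_upload_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that cover the file.

    Uses the smallest number of parts that fit under the limit and keeps
    the parts as equal in size as possible.
    """
    if file_size < 0 or max_upload_size <= 0:
        raise ValueError("file_size must be >= 0 and max_upload_size > 0")
    if file_size <= max_upload_size:
        return [(0, file_size)]  # fits as-is, no need to ask the user
    count = -(-file_size // max_upload_size)  # integer ceiling division
    base, extra = divmod(file_size, count)
    parts = []
    offset = 0
    for i in range(count):
        length = base + (1 if i < extra else 0)
        parts.append((offset, length))
        offset += length
    return parts


# A 250 MB file with a 100 MB limit needs three parts
print(plan_parts(250, 100))  # [(0, 84), (84, 83), (167, 83)]
```

The client would only show the "split into N files?" prompt when the returned list has more than one entry.
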

Features for power users

  1. Threads on desktop and other platforms. (like Cheogram, Monocles Chat)
    • Let users move their own messages from one thread to another
    • Let owners and moderators move anybody's messages from one thread to another
    • A "forum view" which shows a list of threads instead of messages. You must choose a thread before writing, or create a new one. (like Zulip)
  2. (like Matrix?) Public channels which users could request to join. Moderators could vet the users and allow/deny joining. Users could see the room subject and a limited amount of the room's messages to know what to expect.

    That keeps the channel discoverable (unlike private rooms), while also keeping out bots, spam, bridged users, etc.

    It would also allow for encrypted public rooms (!)

    According to singpolyma (of Cheogram and soprani.ca fame), "we even have the protocol for this, but [no] implementations yet" - https://xmpp.org/extensions/xep-0045.html#regapprove.

  3. Tagged (rather than hierarchical) organization of chats.

    Gajim has a concept of workspaces. Presumably, this feature exists to allow users to organize chats as they like…but it's hierarchical. Hierarchical organization is too inflexible to really allow expressive modeling of the real world. [4] Critically, if my contact is both a programmer and an OSM contributor, Gajim does not allow them to be present in both my "Programming" and "OSM" workspaces.

    [4] Also a common criticism of UNIX's hierarchical filesystems.

    I can imagine situations where the restrictions of hierarchical organization can be useful, e.g. rigidly-separated "work" and "home" workspaces. But whereas it's possible to emulate rigid hierarchies through tags, it's not possible to emulate tags through hierarchies.

  4. Showing join, part, and moderation actions (quiet, ban, kick). (like Gajim)

    Important for moderators managing large channels, and channel members in general.
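
To make the tags-versus-hierarchy point from item 3 concrete, here is a minimal sketch of a tagged model. The names (`chat_tags`, `chats_with_tag`, `is_rigid_hierarchy`) and the example JIDs are invented for illustration and don't correspond to any client's data model.

```python
from typing import Dict, Set

# Tagged model: each chat carries any number of tags.
chat_tags: Dict[str, Set[str]] = {
    "asha@example.org": {"Programming", "OSM"},  # in two "workspaces" at once
    "boss@example.org": {"Work"},
    "mum@example.org": {"Home"},
}


def chats_with_tag(tags: Dict[str, Set[str]], tag: str) -> Set[str]:
    """All chats carrying the given tag."""
    return {chat for chat, chat_set in tags.items() if tag in chat_set}


def is_rigid_hierarchy(tags: Dict[str, Set[str]]) -> bool:
    """True if every chat has exactly one tag, i.e. tags act like workspaces."""
    return all(len(chat_set) == 1 for chat_set in tags.values())


print(chats_with_tag(chat_tags, "Programming"))  # {'asha@example.org'}
print(chats_with_tag(chat_tags, "OSM"))          # {'asha@example.org'}
print(is_rigid_hierarchy(chat_tags))             # False
```

Restricting every chat to exactly one tag gives you workspace-like behaviour, which is the sense in which tags can emulate a rigid hierarchy; a one-workspace-per-chat model has no way to express the first entry.
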

Bugs

  1. Silent disconnection from channels (a.k.a. Schrödinger's Chat). Server restarts should be invisible to users.

    Gajim requires manual reconnection in some situations. Conversations and its forks still require manual exit/rejoin. Servers also need to be updated with the requisite configuration.

    Related - stream management for S2S connections (ejabberd issue).

    The MIX or MUC2 proposals aim to fix this issue.

  2. All possible OMEMO failure states should be clearly communicated to users.
  3. If I keep Gajim offline for even an hour or two (e.g. I suspend my laptop for a few hours of travel), it fails to decrypt messages received in the meantime. Thus, maintaining an unbroken set of logs in Gajim all but requires me to keep it online at all times, which defeats the whole purpose of offline message support.
  4. All clients should show the same names for contacts with phone numbers. (e.g. Quicksy and Prav users) Conversations and its forks show their usernames, but in Gajim I can't seem to search for these users by their usernames.
  5. Gajim displaying edited messages as separate messages is pretty confusing and annoying for users - not to mention inconsistent with other clients.

User interface

  1. Mobile clients should be more bottom-bar oriented. (like Monocles Chat)
  2. Message formatting toolbar (present in Monocles Chat) should be displayed by default.
  3. Mobile client context menus should use icons rather than text (like WhatsApp message long-press menu). Graphics can be recognized much more quickly than text. They're also useful for illiterate/low-literate users.

Other enhancements (not preventing use)

  1. Search keywords, e.g. from:<username> (like Gajim)
  2. Messages should be displayed without message styling syntax characters. (like Monocles Chat)
    • But really, we should replace Message Styling with XHTML-IM. Convert styling into XHTML-IM to satisfy the markup fans, if need be.
  3. More styling - bullet lists, headings, URLs
    • URL markup is often cited as a security hazard, because you can hide a harmful link behind innocuous link text. But isn't that easily handled by the application prompting the user when they open the link? e.g. "You're about to visit <full URL>. Proceed?"
  4. Format XMPP room URIs to show room title (like Monocles)
  5. I often need a way to quickly switch between two or more chats, both on Android and on the desktop. Clients could support breaking out one or more chats into a new window (especially on Android), or providing a keybinding to switch to the last-seen chat (especially on desktop).
  6. Client support for account/server migration, e.g. https://migrate.modernxmpp.org/
  7. Blocking messages from users, in such a way that clients still receive and log their messages, but hide them in the user interface. Users should be able to view/hide these messages, both one at a time and en masse.
  8. The ability to see past versions of corrected and moderated messages.
  9. Allow users to set per-chat/channel profile pictures. I may want to use an actual photograph of myself for some users, and a different profile picture for other users. Similar to per-channel nicknames, I want to be able to set different profile pictures to suit each channel.
  10. Dedicated UI for searching and joining IRC channels using Biboumi. (like Element?)

    This is low priority, because we should be focusing on modern protocols like XMPP and moving away from primitive protocols like IRC. Still, if XMPP clients provide a good IRC interface, they can be attractive to power users.
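
As an illustration of the search keywords wished for in item 1 of this list, here is a minimal sketch of parsing and applying a `from:` filter. It is not Gajim's implementation; `KNOWN_KEYS`, `parse_search`, and `matches` are hypothetical names, and matching the sender by substring is just one possible choice.

```python
from typing import Dict, List, Tuple

# Hypothetical set of supported keywords; a real client might add more.
KNOWN_KEYS = {"from"}


def parse_search(query: str) -> Tuple[Dict[str, str], List[str]]:
    """Split a query into keyword filters and plain search terms."""
    filters: Dict[str, str] = {}
    terms: List[str] = []
    for token in query.split():
        key, sep, value = token.partition(":")
        if sep and value and key.lower() in KNOWN_KEYS:
            filters[key.lower()] = value
        else:
            terms.append(token)  # unknown keys (e.g. URLs) stay plain terms
    return filters, terms


def matches(sender: str, body: str, filters: Dict[str, str], terms: List[str]) -> bool:
    """True if a message satisfies every filter and contains every term."""
    if "from" in filters and filters["from"].lower() not in sender.lower():
        return False
    return all(term.lower() in body.lower() for term in terms)


filters, terms = parse_search("from:alice server restart")
print(filters, terms)  # {'from': 'alice'} ['server', 'restart']
print(matches("alice@example.org", "Server restart at noon", filters, terms))  # True
```
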